Table 3 AUC results for each model on 5mC and 6mA detection in human, epigenetic mark detection in yeast, and 4mC detection in multiple species

From: Benchmarking DNA foundation models for genomic and genetic tasks

| Data | DNABERT-2 | NT-v2 | HyenaDNA | Caduceus-Ph | GROVER |
| --- | --- | --- | --- | --- | --- |
| Human 5mC | 0.685 | 0.7377 | 0.6843 | 0.783 | 0.7437 |
| Human 6mA | 0.7351 | 0.7508 | 0.7377 | 0.7731 | 0.7671 |
| Yeast H3 | 0.9137 | 0.8951 | 0.8996 | 0.9285 | 0.9056 |
| Yeast H3K14ac | 0.7597 | 0.7407 | 0.7067 | 0.7297 | 0.7301 |
| Yeast H3K36me3 | 0.7989 | 0.7849 | 0.7395 | 0.7662 | 0.7533 |
| Yeast H3K4me1 | 0.7306 | 0.7115 | 0.6994 | 0.7071 | 0.696 |
| Yeast H3K4me2 | 0.7078 | 0.6847 | 0.6854 | 0.6895 | 0.6939 |
| Yeast H3K4me3 | 0.6813 | 0.6603 | 0.6486 | 0.6595 | 0.668 |
| Yeast H3K79me3 | 0.8565 | 0.8436 | 0.8215 | 0.845 | 0.8427 |
| Yeast H3K9ac | 0.7922 | 0.7687 | 0.7555 | 0.7779 | 0.7692 |
| Yeast H4 | 0.9314 | 0.9104 | 0.8983 | 0.9304 | 0.908 |
| Yeast H4ac | 0.7473 | 0.7263 | 0.6979 | 0.7235 | 0.7175 |
| A. thaliana 4mC | 0.5994 | 0.6332 | 0.5941 | 0.6146 | 0.6026 |
| C. elegans 4mC | 0.5985 | 0.6487 | 0.5964 | 0.5964 | 0.6057 |
| D. melanogaster 4mC | 0.6147 | 0.6519 | 0.6096 | 0.6161 | 0.6167 |
| E. coli 4mC | 0.5492 | 0.6028 | 0.6105 | 0.6283 | 0.5851 |
| G. pickeringii 4mC | 0.5958 | 0.6302 | 0.6292 | 0.6348 | 0.6293 |
| G. subterraneus 4mC | 0.5802 | 0.6145 | 0.609 | 0.6079 | 0.6061 |

  1. AUCs were obtained using the mean token pooling method. Bolded values are higher than at least two other AUCs (p < 0.01). P-values were calculated using the one-sided DeLong’s test.
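
The footnote names mean token pooling and AUC without showing how either is computed. The following is a minimal sketch in plain Python, not the benchmark's actual pipeline. The function names `mean_token_pool` and `auc`, the use of an attention mask to skip padding tokens, and all toy inputs are assumptions made for illustration. The classifier trained on the pooled embeddings and the DeLong's test are not shown.

```python
from typing import List, Sequence


def mean_token_pool(token_embeddings: Sequence[Sequence[float]],
                    attention_mask: Sequence[int]) -> List[float]:
    """Average the embeddings of non-padding tokens into one sequence vector."""
    # Assumption: padding tokens are marked with 0 in the mask and are skipped.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative.

    Tied scores count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy inputs, made up for illustration (not data from the benchmark).
embeddings = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # 3 tokens, 2 dimensions
mask = [1, 1, 0]                                    # last token is padding
pooled = mean_token_pool(embeddings, mask)

toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC above is quadratic in the number of examples, which is fine for a small illustration; a rank-based implementation would be the usual choice for full evaluation sets.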