Table 1: The AUC results for binary sequence classification tasks on the human genome
From: Benchmarking DNA foundation models for genomic and genetic tasks
| Data | DNABERT-2 | NT-v2 | HyenaDNA | Caduceus-Ph | GROVER |
|---|---|---|---|---|---|
| DNase I Hypersensitive | 0.8666 | 0.8524 | 0.8295 | 0.8799 | 0.857 |
| Human TFBS 1 | 0.8382 | 0.8315 | 0.8301 | 0.8796 | 0.8618 |
| Human TFBS 2 | 0.821 | 0.809 | 0.8205 | 0.8687 | 0.8495 |
| Human TFBS 3 | 0.7896 | 0.7974 | 0.7875 | 0.8249 | 0.8158 |
| Human TFBS 4 | 0.726 | 0.7103 | 0.7149 | 0.7725 | 0.763 |
| Human TFBS 5 | 0.9204 | 0.9149 | 0.9159 | 0.9294 | 0.931 |
| Promoter GM12878 | 0.9856 | 0.9835 | 0.976 | 0.9865 | 0.9839 |
| Promoter HUVEC | 0.9903 | 0.987 | 0.9817 | 0.9896 | 0.9885 |
| Promoter Hela-S3 | 0.9886 | 0.9838 | 0.981 | 0.9871 | 0.9857 |
| Promoter NHEK | 0.9501 | 0.9323 | 0.9271 | 0.9567 | 0.9507 |
| Acceptor | 0.8969 | 0.7928 | 0.7946 | 0.8449 | 0.8041 |
| Coding | 0.9438 | 0.9289 | 0.9406 | 0.9735 | 0.9594 |
| Donor | 0.9056 | 0.8198 | 0.8128 | 0.8535 | 0.819 |
| Enhancer | 0.8717 | 0.8674 | 0.8339 | 0.8384 | 0.8554 |
| Enhancer Cohn | 0.8223 | 0.7894 | 0.7754 | 0.821 | 0.8161 |
| Enhancer Ensembl | 0.9369 | 0.9389 | 0.9356 | 0.9431 | 0.9382 |
| Open chromatin region | 0.7253 | 0.7183 | 0.7191 | 0.765 | 0.7455 |
| Promoter All 300 bps | 0.9426 | 0.9445 | 0.9394 | 0.9519 | 0.9402 |
| Promoter All 70 bps | 0.8311 | 0.8527 | 0.832 | 0.8748 | 0.8506 |
| Promoter NonTATA 251 bps | 0.9297 | 0.8905 | 0.928 | 0.9426 | 0.9395 |
| Promoter NonTATA 300 bps | 0.9765 | 0.9758 | 0.9662 | 0.9834 | 0.9728 |
| Promoter NonTATA 70 bps | 0.8531 | 0.8729 | 0.8516 | 0.8961 | 0.8704 |
| Promoter TATA 300 bps | 0.7646 | 0.7791 | 0.8077 | 0.76 | 0.78 |
| Promoter TATA 70 bps | 0.7781 | 0.7947 | 0.7827 | 0.8103 | 0.796 |
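
As a reading aid, the following minimal Python sketch shows one way to find the top-scoring model for each task from rows of Table 1. It is not part of the source article: the names `MODELS`, `AUC`, `best_model`, and `summarize` are hypothetical, and only five rows are copied in for brevity.

```python
# Column order follows the header of Table 1.
MODELS = ["DNABERT-2", "NT-v2", "HyenaDNA", "Caduceus-Ph", "GROVER"]

# A few rows copied from Table 1 (values are AUC). The remaining rows
# can be added in the same form. This dict is an illustrative
# assumption, not a data structure from the source article.
AUC = {
    "DNase I Hypersensitive": [0.8666, 0.8524, 0.8295, 0.8799, 0.857],
    "Acceptor": [0.8969, 0.7928, 0.7946, 0.8449, 0.8041],
    "Donor": [0.9056, 0.8198, 0.8128, 0.8535, 0.819],
    "Enhancer": [0.8717, 0.8674, 0.8339, 0.8384, 0.8554],
    "Promoter TATA 300 bps": [0.7646, 0.7791, 0.8077, 0.76, 0.78],
}


def best_model(scores):
    """Return (model, auc) for the highest AUC in one table row."""
    idx = max(range(len(scores)), key=scores.__getitem__)
    return MODELS[idx], scores[idx]


def summarize(table):
    """Map each task name to its top-scoring model and that model's AUC."""
    return {task: best_model(scores) for task, scores in table.items()}


if __name__ == "__main__":
    for task, (model, auc) in summarize(AUC).items():
        print(f"{task}: {model} ({auc:.4f})")
```

On these five rows, the sketch picks DNABERT-2 for Acceptor, Donor, and Enhancer, Caduceus-Ph for DNase I Hypersensitive, and HyenaDNA for Promoter TATA 300 bps, which matches a direct reading of the table.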