Table 1 Transformer-45k and SpliceAI-10k performance on splice junctions from all tissues in GTEx V8 and Icelandic blood samples

From: Transformers significantly improve splice site prediction

 

| Model | PR-AUC [95% CI] | Top-k accuracy [95% CI] |
| --- | --- | --- |
| Transformer-45k fine-tuned on RNA-Seq annotations | **0.834** [0.833, 0.835] | **0.744** (\(\frac{147,949}{198,984}\)) [0.742, 0.745] |
| SpliceAI-10k fine-tuned on RNA-Seq annotations | 0.832 [0.830, 0.833] | 0.741 (\(\frac{147,400}{198,984}\)) [0.739, 0.742] |
| SpliceAI-10k pre-trained weights | 0.820 [0.819, 0.821] | 0.732 (\(\frac{145,666}{198,984}\)) [0.731, 0.734] |
| SpliceAI-10k trained on ENSEMBL | 0.753 [0.751, 0.754] | 0.686 (\(\frac{136,550}{198,984}\)) [0.685, 0.688] |
| Transformer-45k trained on ENSEMBL | 0.750 [0.749, 0.752] | 0.691 (\(\frac{137,595}{198,984}\)) [0.690, 0.693] |

  1. Fine-tuning was done on the combined RNA-Seq splice sites, but results are shown only for the chromosomes left out during training, where the combined number of splice sites is 198,984. The performance metrics are PR-AUC and top-k accuracy. 95% confidence intervals (CIs) are shown in brackets, and the best score is displayed in bold.
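
The top-k accuracy column can be illustrated with a short Python sketch. It assumes the definition commonly used in splice site prediction, which this table does not spell out: k is set to the number of true splice sites, and the score is the fraction of the k highest-scoring positions that are true sites. The function name `top_k_accuracy` and the toy inputs are hypothetical. The second part only recomputes the ratios from the counts reported in the table.

```python
import numpy as np


def top_k_accuracy(scores, labels):
    """Fraction of the k highest-scoring positions that are true sites.

    Assumed definition: k equals the number of true sites in `labels`.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    k = int(labels.sum())
    if k == 0:
        raise ValueError("labels contain no true sites")
    # Indices of the k largest scores (stable sort keeps ties deterministic).
    top_idx = np.argsort(-scores, kind="stable")[:k]
    return float(labels[top_idx].sum()) / k


# Toy example: 2 true sites, so k = 2; one of the top-2 scores is a true site.
toy_accuracy = top_k_accuracy([0.9, 0.8, 0.1, 0.7], [1, 0, 0, 1])

# Counts from the table: correct predictions out of 198,984 splice sites.
TOTAL_SITES = 198_984
reported = {
    "Transformer-45k fine-tuned on RNA-Seq annotations": (147_949, 0.744),
    "SpliceAI-10k fine-tuned on RNA-Seq annotations": (147_400, 0.741),
    "SpliceAI-10k pre-trained weights": (145_666, 0.732),
    "SpliceAI-10k trained on ENSEMBL": (136_550, 0.686),
    "Transformer-45k trained on ENSEMBL": (137_595, 0.691),
}

# Recompute each ratio and round to the three decimals used in the table.
recomputed = {
    name: round(hits / TOTAL_SITES, 3) for name, (hits, _) in reported.items()
}

if __name__ == "__main__":
    for name, value in recomputed.items():
        print(f"{name}: {value:.3f}")
```

Each recomputed ratio agrees with the rounded top-k accuracy printed in the table. The sketch does not reproduce the confidence intervals, since the table does not state how they were obtained.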