Fig. 6: Comparison of full-length mRNA sequence traits prediction tasks.
From: mRNABERT: advancing mRNA sequence design with a universal language model and comprehensive dataset

A, B The orange bar charts show the two stability-related prediction tasks. C–F The blue bar charts show the four translation efficiency-related prediction tasks. In each chart, the four bars to the left of mRNABERT are models pre-trained on 5′UTR, CDS, and 3′UTR regions, and the five bars to the right are ncRNA pre-trained models. G Average performance across the six tasks plotted against the number of model parameters; the x-axis is logarithmic, with a break in the middle. All models were fine-tuned using the same code, data, and parameter settings to ensure a fair comparison. Performance is measured by R², and mRNABERT shows a significant lead over most models of comparable size.
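
The caption does not say which definition of R² was used. As a minimal sketch, assuming the standard coefficient of determination (1 − SS_res/SS_tot) and an unweighted mean over tasks for panel G, the score could be computed as follows. The function names `r_squared` and `average_r_squared` are hypothetical and do not come from the paper's code.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    y_bar = mean(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def average_r_squared(per_task):
    """Unweighted mean R^2 over tasks (assumed aggregation for panel G).

    per_task maps a task name to a (y_true, y_pred) pair.
    """
    return mean(r_squared(t, p) for t, p in per_task.values())
```

Under this definition a model that always predicts the mean of the targets scores 0 and a perfect model scores 1, so the bars can be compared across tasks with different target scales.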