Fig. 7: Mean ribosome loading, translation efficiency, and expression level prediction.
From: RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks

a To produce multiple proteins from a single mRNA strand, cells use multiple ribosomes. The mean ribosome load (MRL) metric, calculated from the 5' UTR sequence, measures the number of ribosomes translating an mRNA at a given time. The 5' UTR is also predictive of the mRNA translation efficiency (TE), which quantifies the rate of translation into proteins, and of the mRNA expression level (EL), which reflects the relative quantity of the mRNA transcript in the cell. b The 5' UTR representation produced by RiNALMo is forwarded to a one-dimensional convolutional ResNet, which outputs the MRL, TE, or EL prediction. c R² score for MRL prediction on the Random7600 and Human7600 datasets. d Average Spearman correlation coefficient across ten folds for TE prediction on human muscle tissue (Muscle), human prostate cancer cells (PC3), and human embryonic kidney 293T cells (HEK), together with the average across cell lines and tissue types. e Average Spearman correlation coefficient across ten folds for EL prediction on the same datasets as for the TE downstream task. In c, d, and e, FT denotes whether we trained the model ourselves or report results cited directly from the original papers, which use the same train/test dataset splits. The best result for each evaluation dataset is shown in bold.