Fig. 1: RiNALMo pre-training and applications.
From: RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks

In the pre-training stage, RiNALMo is trained on unlabeled RNA sequences from several databases using masked language modeling (MLM). To corrupt the input sequence, we randomly mask 15% of the tokens in the training sequence. Before being passed to the Transformer, an RNA sequence is tokenized and each token is turned into a 1280-dimensional vector by a learned input embedding module. The language model comprises 33 Transformer blocks, each consisting of a multi-head attention module and a feed-forward network. Once pre-trained, RiNALMo can be separately fine-tuned for various structural and functional downstream tasks, where its expressive output embeddings, consumed by task-specific prediction heads, significantly improve performance. In this work, we fine-tuned RiNALMo for secondary structure, multi-species splice-site, mean ribosome loading (MRL), translation efficiency (TE), and expression level (EL) prediction, as well as for ncRNA family classification.
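To make the pre-training setup concrete, the following is a minimal PyTorch sketch of the MLM objective and Transformer stack described above, not RiNALMo's actual implementation: the nucleotide vocabulary, the `tokenize` and `mask_tokens` helpers, and the head count are illustrative assumptions; only the 15% masking rate, the 1280-dimensional embeddings, and the 33 blocks come from the caption.

```python
import torch
import torch.nn as nn

# Illustrative nucleotide vocabulary; RiNALMo's real tokenizer may differ.
VOCAB = {"<pad>": 0, "<mask>": 1, "A": 2, "C": 3, "G": 4, "U": 5}
VOCAB_SIZE = len(VOCAB)
EMBED_DIM = 1280   # embedding size stated in the caption
NUM_LAYERS = 33    # number of Transformer blocks stated in the caption

def tokenize(seq: str) -> torch.Tensor:
    """Map an RNA string to a tensor of token ids (hypothetical helper)."""
    return torch.tensor([VOCAB[nt] for nt in seq], dtype=torch.long)

def mask_tokens(token_ids: torch.Tensor, mask_prob: float = 0.15):
    """Randomly mask ~15% of tokens for MLM (simplified: every selected
    position is replaced by <mask>). Returns corrupted input and labels,
    with -100 marking positions excluded from the loss."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < mask_prob
    labels[~mask] = -100                      # loss only on masked positions
    corrupted = token_ids.clone()
    corrupted[mask] = VOCAB["<mask>"]
    return corrupted, labels

class MLMEncoder(nn.Module):
    """Sketch of an MLM encoder: learned input embedding, a stack of
    Transformer blocks (multi-head attention + feed-forward), and an MLM head."""
    def __init__(self, vocab_size=VOCAB_SIZE, dim=EMBED_DIM,
                 layers=NUM_LAYERS, heads=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=layers)
        self.mlm_head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))   # contextual embeddings
        return self.mlm_head(hidden)                   # per-token vocabulary logits

# Usage: corrupt a sequence, predict the original tokens, compute the MLM loss.
ids = tokenize("AUGGCUACGU").unsqueeze(0)              # (batch=1, seq_len)
corrupted, labels = mask_tokens(ids)
model = MLMEncoder(layers=2)                           # fewer layers just for the demo
logits = model(corrupted)
loss = nn.functional.cross_entropy(logits.view(-1, VOCAB_SIZE),
                                   labels.view(-1), ignore_index=-100)
```

After pre-training with this objective, the per-token hidden states produced by the encoder are the output embeddings that downstream prediction heads build on during fine-tuning.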