  • Roadmap
  • Published:

Decoding the interactions and functions of non-coding RNA with artificial intelligence

Abstract

In addition to encoding proteins, mRNAs have context-specific regulatory roles that contribute to many cellular processes. However, uncovering new mRNA functions is constrained by limitations of traditional biochemical and computational methods. In this Roadmap, we highlight how artificial intelligence can transform our understanding of RNA biology by fostering collaborations between RNA biologists and computational scientists to drive innovation in this fundamental field of research. We discuss how non-coding regions of the mRNA, including introns and 5′ and 3′ untranslated regions, regulate the metabolism and interactomes of mRNA, and the current challenges in characterizing these regions. We further discuss large language models, which can be used to learn biologically meaningful RNA sequence representations. We also provide a detailed roadmap for integrating large language models with graph neural networks to harness publicly available sequencing and knowledge data. Adopting this roadmap will allow us to predict RNA interactions with diverse molecules and to model context-specific mRNA interactomes.

Fig. 1: Core principles defining the specificity and affinity of mRNA interactions.
Fig. 2: Defining global characteristics of the RNA interactome.
Fig. 3: Large language models for RNA sequence analysis.
Fig. 4: Roadmap for refined RNA sequence representation and comprehensive RNA interactome modelling.

References

  1. Robertson, M. P. & Joyce, G. F. The origins of the RNA world. Cold Spring Harb. Perspect. Biol. 4, a003608 (2012).

    Article  PubMed  PubMed Central  Google Scholar 

  2. Papastavrou, N., Horning, D. P. & Joyce, G. F. RNA-catalyzed evolution of catalytic RNA. Proc. Natl Acad. Sci. USA 121, e2321592121 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Pearce, B. K. D., Pudritz, R. E., Semenov, D. A. & Henning, T. K. Origin of the RNA world: the fate of nucleobases in warm little ponds. Proc. Natl Acad. Sci. USA 114, 11327–11332 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. Palcau, A. C. et al. CircPVT1: a pivotal circular node intersecting long non-coding-PVT1 and c-MYC oncogenic signals. Mol. Cancer 21, 33 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Mou, X., Liew, S. W. & Kwok, C. K. Identification and targeting of G-quadruplex structures in MALAT1 long non-coding RNA. Nucleic Acids Res. 50, 397–410 (2022).

    Article  CAS  PubMed  Google Scholar 

  6. Roden, C. & Gladfelter, A. S. RNA contributions to the form and function of biomolecular condensates. Nat. Rev. Mol. Cell Biol. 22, 183–195 (2021).

    Article  CAS  PubMed  Google Scholar 

  7. Morris, K. V. & Mattick, J. S. The rise of regulatory RNA. Nat. Rev. Genet. 15, 423–437 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Yang, H., Li, Q., Stroup, E. K., Wang, S. & Ji, Z. Widespread stable noncanonical peptides identified by integrated analyses of ribosome profiling and ORF features. Nat. Commun. 15, 1932 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Nussbacher, J. K., Tabet, R., Yeo, G. W. & Lagier-Tourenne, C. Disruption of RNA metabolism in neurological diseases and emerging therapeutic interventions. Neuron 102, 294–320 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Wang, W., van Niekerk, E., Willis, D. E. & Twiss, J. L. RNA transport and localized protein synthesis in neurological disorders and neural repair. Dev. Neurobiol. 67, 1166–1182 (2007).

    Article  CAS  PubMed  Google Scholar 

  11. Goodall, G. J. & Wickramasinghe, V. O. RNA in cancer. Nat. Rev. Cancer 21, 22–36 (2021).

    Article  CAS  PubMed  Google Scholar 

  12. Egger, G. & Arimondo, P. Drug Discovery in Cancer Epigenetics (Academic, 2015).

  13. Zhou, Y., Huang, T., Li, T. & Sun, J. RNA Modification in Human Cancers: Roles and Therapeutic Implications (Frontiers Media, 2022).

  14. Giangrande, P. H., de Franciscis, V. & Rossi, J. J. RNA Therapeutics: the Evolving Landscape of RNA Therapeutics (Academic, 2022).

  15. Ahmad, R. U. & Pathak, S. Unlocking the therapeutic potential of RNA: a comprehensive review of RNA-based therapy. Doctoral dissertation, BRAC University (2023).

  16. Lin, C. & Miles, W. O. Beyond CLIP: advances and opportunities to measure RBP–RNA and RNA–RNA interactions. Nucleic Acids Res. 47, 5490–5501 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Baralle, F. E. & Giudice, J. Alternative splicing as a regulator of development and tissue identity. Nat. Rev. Mol. Cell Biol. 18, 437–451 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Sciarrillo, R. et al. The role of alternative splicing in cancer: from oncogenesis to drug resistance. Drug Resist. Updat. 53, 100728 (2020).

    Article  PubMed  Google Scholar 

  19. Andreassi, C., Crerar, H. & Riccio, A. Post-transcriptional processing of mRNA in neurons: the vestiges of the RNA world drive transcriptome diversity. Front. Mol. Neurosci. 11, 304 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  20. Andreassi, C. & Riccio, A. To localize or not to localize: mRNA fate is in 3′UTR ends. Trends Cell Biol. 19, 465–474 (2009).

    Article  CAS  PubMed  Google Scholar 

  21. Mayr, C. Regulation by 3′-untranslated regions. Annu. Rev. Genet. 51, 171–194 (2017).

    Article  CAS  PubMed  Google Scholar 

  22. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  23. Hentze, M. W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 19, 327–341 (2018).

    Article  CAS  PubMed  Google Scholar 

  24. Zhang, R. & Su, B. Small but influential: the role of microRNAs on gene regulatory network and 3′UTR evolution. J. Genet. Genomics 36, 1–6 (2009).

    Article  PubMed  Google Scholar 

  25. Rajyaguru, P. & Parker, R. RGG motif proteins: modulators of mRNA functional states. Cell Cycle 11, 2594–2599 (2012).

    Article  CAS  PubMed  Google Scholar 

  26. Schwartz, J. C., Cech, T. R. & Parker, R. R. Biochemical properties and biological functions of FET proteins. Annu. Rev. Biochem. 84, 355–379 (2015).

    Article  CAS  PubMed  Google Scholar 

  27. Taliaferro, J. M. et al. Distal alternative last exons localize mRNAs to neural projections. Mol. Cell 61, 821–833 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Hinnebusch, A. G., Ivanov, I. P. & Sonenberg, N. Translational control by 5′-untranslated regions of eukaryotic mRNAs. Science 352, 1413–1416 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Wieder, N. et al. Differences in 5′ untranslated regions highlight the importance of translational regulation of dosage sensitive genes. Genome Biol. 25, 111 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Jia, L. et al. Decoding mRNA translatability and stability from the 5′ UTR. Nat. Struct. Mol. Biol. 27, 814–821 (2020).

    Article  CAS  PubMed  Google Scholar 

  31. Ryczek, N., Łyś, A. & Makałowska, I. The functional meaning of 5′UTR in protein-coding genes. Int. J. Mol. Sci. 24, 2976 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  32. Colliva, A. & Tongiorgi, E. Distinct role of 5′UTR sequences in dendritic trafficking of BDNF mRNA: additional mechanisms for the BDNF splice variants spatial code. Mol. Brain 14, 10 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Derti, A. et al. A quantitative atlas of polyadenylation in five mammals. Genome Res. 22, 1173–1183 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Hilgers, V. et al. Neural-specific elongation of 3′ UTRs during drosophila development. Proc. Natl Acad. Sci. USA 108, 15864–15869 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Miura, P., Shenker, S., Andreu-Agullo, C., Westholm, J. O. & Lai, E. C. Widespread and extensive lengthening of 3′ UTRs in the mammalian brain. Genome Res. 23, 812–825 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Andreassi, C. et al. Cytoplasmic cleavage of IMPA1 3′ UTR is necessary for maintaining axon integrity. Cell Rep. 34, 108778 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Tushev, G. et al. Alternative 3′ UTRs modify the localization, regulatory potential, stability, and plasticity of mRNAs in neuronal compartments. Neuron 98, 495–511.e6 (2018).

    Article  CAS  PubMed  Google Scholar 

  38. Kislauskis, E. H., Zhu, X. & Singer, R. H. Sequences responsible for intracellular localization of beta-actin messenger RNA also affect cell phenotype. J. Cell Biol. 127, 441–451 (1994).

    Article  CAS  PubMed  Google Scholar 

  39. Braunschweig, U. et al. Widespread intron retention in mammals functionally tunes transcriptomes. Genome Res. 24, 1774–1786 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Petric-Howe, M. et al. Physiological intron retaining transcripts in the cytoplasm abound during human motor neurogenesis. Genome Res. 32, 1808–1825 (2022).

    PubMed  PubMed Central  Google Scholar 

  41. Skalska, L., Beltran-Nebot, M., Ule, J. & Jenner, R. G. Regulatory feedback from nascent RNA to chromatin and transcription. Nat. Rev. Mol. Cell Biol. 18, 331–337 (2017).

    Article  CAS  PubMed  Google Scholar 

  42. Wong, J. J.-L., Au, A. Y. M., Ritchie, W. & Rasko, J. E. J. Intron retention in mRNA: no longer nonsense: known and putative roles of intron retention in normal and disease biology: known and putative roles of intron retention in normal and disease biology. Bioessays 38, 41–49 (2016).

    Article  PubMed  Google Scholar 

  43. Wong, J. J.-L. et al. Orchestrated intron retention regulates normal granulocyte differentiation. Cell 154, 583–595 (2013).

    Article  CAS  PubMed  Google Scholar 

  44. Ortiz, R. et al. Recruitment of Staufen2 enhances dendritic localization of an intron-containing CaMKIIα mRNA. Cell Rep. 20, 13–20 (2017).

    Article  CAS  PubMed  Google Scholar 

  45. Luisier, R. et al. Intron retention and nuclear loss of SFPQ are molecular hallmarks of ALS. Nat. Commun. 9, 2010 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  46. Ma, W. & Mayr, C. A membraneless organelle associated with the endoplasmic reticulum enables 3′UTR-mediated protein-protein interactions. Cell 175, 1492–1506.e19 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Horste, E. L. et al. Subcytoplasmic location of translation controls protein output. Mol. Cell 83, 4509–4523.e11 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Luo, Y. & Mayr, C. How the location of protein synthesis controls protein function. Biophys. J. 123, 309a (2024).

    Article  Google Scholar 

  49. Luo, Y. et al. mRNA interactions with disordered regions control protein activity. Preprint at bioRxiv https://doi.org/10.1101/2023.02.18.529068 (2023).

  50. Luisier, R., Andreassi, C., Fournier, L. & Riccio, A. The predicted RNA-binding protein regulome of axonal mRNAs. Genome Res. 33, 1497–1512 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  51. Gadir, N., Haim-Vilmovsky, L., Kraut-Cohen, J. & Gerst, J. E. Localization of mRNAs coding for mitochondrial proteins in the yeast Saccharomyces cerevisiae. RNA 17, 1551–1565 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Morita, M. et al. mTORC1 controls mitochondrial activity and biogenesis through 4E-BP-dependent translational regulation. Cell Metab. 18, 698–711 (2013).

    Article  CAS  PubMed  Google Scholar 

  53. Gandin, V. et al. nanoCAGE reveals 5′ UTR features that define specific modes of translation of functionally related MTOR-sensitive mRNAs. Genome Res. 26, 636–648 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  54. Bugler, B., Amalric, F. & Prats, H. Alternative initiation of translation determines cytoplasmic or nuclear localization of basic fibroblast growth factor. Mol. Cell. Biol. 11, 573–577 (1991).

    CAS  PubMed  PubMed Central  Google Scholar 

  55. Lee, I. et al. New class of microRNA targets containing simultaneous 5′-UTR and 3′-UTR interaction sites. Genome Res. 19, 1175–1183 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Jia, J., Yao, P., Arif, A. & Fox, P. L. Regulation and dysregulation of 3′UTR-mediated translational control. Curr. Opin. Genet. Dev. 23, 29–34 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Mattick, J. S. et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat. Rev. Mol. Cell Biol. 24, 430–447 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Kageyama, Y., Kondo, T. & Hashimoto, Y. Coding vs non-coding: translatability of short ORFs found in putative non-coding transcripts. Biochimie 93, 1981–1986 (2011).

    Article  CAS  PubMed  Google Scholar 

  59. Nam, J.-W., Choi, S.-W. & You, B.-H. Incredible RNA: dual functions of coding and noncoding. Mol. Cell 39, 367–374 (2016).

    Article  CAS  Google Scholar 

  60. Yang, Y. et al. Extensive translation of circular RNAs driven by N6-methyladenosine. Cell Res. 27, 626–641 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Rodriguez, C. M., Chun, S. Y., Mills, R. E. & Todd, P. K. Translation of upstream open reading frames in a model of neuronal differentiation. BMC Genomics 20, 391 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  62. Rodriguez, J. M. et al. Evidence for widespread translation of 5’ untranslated regions. Nucleic Acids Res. 52, 8112–8126 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Sudmant, P. H., Lee, H., Dominguez, D., Heiman, M. & Burge, C. B. Widespread accumulation of ribosome-associated isolated 3′ UTRs in neuronal cell populations of the aging brain. Cell Rep. 25, 2447–2456.e4 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  64. Osman, I., Tay, M. L.-I. & Pek, J. W. Stable intronic sequence RNAs (sisRNAs): a new layer of gene regulation. Cell. Mol. Life Sci. 73, 3507–3519 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Rasmussen, A. M. et al. Circular stable intronic RNAs possess distinct biological features and are deregulated in bladder cancer. NAR Cancer 5, zcad041 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  66. Chan, S. N. & Pek, J. W. Stable intronic sequence RNAs (sisRNAs): an expanding universe. Trends Biochem. Sci. 44, 258–272 (2019).

    Article  CAS  PubMed  Google Scholar 

  67. Talhouarne, G. J. S. & Gall, J. G. Lariat intronic RNAs in the cytoplasm of vertebrate cells. Proc. Natl Acad. Sci. USA 115, E7970–E7977 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  68. Wilson, T. J. & Lilley, D. RNA catalysis — is that it? RNA 21, 534–537 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  69. Cech, T. R. & Steitz, J. A. The noncoding RNA revolution — trashing old rules to forge new ones. Cell 157, 77–94 (2014).

    Article  CAS  PubMed  Google Scholar 

  70. Sebastián, D. et al. TP53INP2-dependent activation of muscle autophagy ameliorates sarcopenia and promotes healthy aging. Autophagy 20, 1815–1824 (2024).

    Article  PubMed  PubMed Central  Google Scholar 

  71. Crerar, H. et al. Regulation of NGF signaling by an axonal untranslated mRNA. Neuron 102, 553–563.e8 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  72. Valluy, J. et al. A coding-independent function of an alternative Ube3a transcript during neuronal development. Nat. Neurosci. 18, 666–673 (2015).

    Article  CAS  PubMed  Google Scholar 

  73. Lyford, G. L. et al. Arc, a growth factor and activity-regulated gene, encodes a novel cytoskeleton-associated protein that is enriched in neuronal dendrites. Neuron 14, 433–445 (1995).

    Article  CAS  PubMed  Google Scholar 

  74. Steward, O. & Worley, P. F. Selective targeting of newly synthesized Arc mRNA to active synapses requires NMDA receptor activation. Neuron 30, 227–240 (2001).

    Article  CAS  PubMed  Google Scholar 

  75. Ashley, J. et al. Retrovirus-like Gag protein Arc1 binds RNA and traffics across synaptic boutons. Cell 172, 262–274.e11 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  76. Pastuzyn, E. D. et al. The neuronal gene Arc encodes a repurposed retrotransposon Gag protein that mediates intercellular RNA transfer. Cell 172, 275–288.e18 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. O’Brien, K., Breyne, K., Ughetto, S., Laurent, L. C. & Breakefield, X. O. RNA delivery by extracellular vesicles in mammalian cells and its applications. Nat. Rev. Mol. Cell Biol. 21, 585–606 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  78. Kafida, M., Karela, M. & Giakountis, A. RNA-independent regulatory functions of lncRNA in complex disease. Cancers 16, 2728 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  79. Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  80. Singh, S., Shyamal, S. & Panda, A. C. Detecting RNA–RNA interactome. Wiley Interdiscip. Rev. RNA 13, e1715 (2022).

    Article  CAS  PubMed  Google Scholar 

  81. Guil, S. & Esteller, M. RNA–RNA interactions in gene regulation: the coding and noncoding players. Trends Biochem. Sci. 40, 248–256 (2015).

    Article  CAS  PubMed  Google Scholar 

  82. Yoon, J.-H., Abdelmohsen, K. & Gorospe, M. Functional interactions among microRNAs and long noncoding RNAs. Semin. Cell Dev. Biol. 34, 9–14 (2014).

    Article  CAS  PubMed  Google Scholar 

  83. Li, X. et al. GRID-seq reveals the global RNA–chromatin interactome. Nat. Biotechnol. 35, 940–950 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  84. Cao, X., Zhang, Y., Ding, Y. & Wan, Y. Identification of RNA structures and their roles in RNA functions. Nat. Rev. Mol. Cell Biol. 25, 784–801 (2024).

    Article  CAS  PubMed  Google Scholar 

  85. Doyle, M. & Kiebler, M. A. Mechanisms of dendritic mRNA transport and its role in synaptic tagging. EMBO J. 30, 3540–3552 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Mortimer, S. A., Kidwell, M. A. & Doudna, J. A. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 15, 469–479 (2014).

    Article  CAS  PubMed  Google Scholar 

  87. Sugimoto, Y. et al. hiCLIP reveals the in vivo atlas of mRNA secondary structures recognized by Staufen 1. Nature 519, 491–494 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  88. Wan, Y. et al. Landscape and variation of RNA secondary structure across the human transcriptome. Nature 505, 706–709 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  89. Leppek, K., Das, R. & Barna, M. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat. Rev. Mol. Cell Biol. 19, 158–174 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  90. Jacquet, K. et al. The TIP60 complex regulates bivalent chromatin recognition by 53BP1 through direct H4K20me binding and H2AK15 acetylation. Mol. Cell 62, 409–421 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Beaudoin, J.-D. et al. Analyses of mRNA structure dynamics identify embryonic gene regulatory programs. Nat. Struct. Mol. Biol. 25, 677–686 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  92. Wu, M.-Z. et al. Interplay between HDAC3 and WDR5 is essential for hypoxia-induced epithelial–mesenchymal transition. Mol. Cell 43, 811–822 (2011).

    Article  CAS  PubMed  Google Scholar 

  93. Paz, I. et al. RBPmap: a web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Res. 42, W361–W367 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  94. Xu, W., Biswas, J., Singer, R. H. & Rosbash, M. Targeted RNA editing: novel tools to study post-transcriptional regulation. Mol. Cell 82, 389–403 (2022).

    Article  CAS  PubMed  Google Scholar 

  95. Hu, X., Zou, Q., Yao, L. & Yang, X. Survey of the binding preferences of RNA-binding proteins to RNA editing events. Genome Biol. 23, 169 (2022).

    Article  PubMed  PubMed Central  Google Scholar 

  96. Medina-Munoz, H. C. et al. Expanded palette of RNA base editors for comprehensive RBP–RNA interactome studies. Nat. Commun. 15, 875 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  97. Seo, K. W. & Kleiner, R. E. Profiling dynamic RNA–protein interactions using small-molecule-induced RNA editing. Nat. Chem. Biol. 19, 1361–1371 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Baysal, B. E., Sharma, S., Hashemikhabir, S. & Janga, S. C. RNA editing in pathogenesis of cancer. Cancer Res. 77, 3733–3739 (2017).

    Article  CAS  PubMed  Google Scholar 

  99. Wassmer, E., Koppány, G., Hermes, M., Diederichs, S. & Caudron-Herger, M. Refining the pool of RNA-binding domains advances the classification and prediction of RNA-binding proteins. Nucleic Acids Res. 52, 7504–7522 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  100. Zigdon, I. et al. Beyond RNA binding domains: determinants of protein–RNA binding. RNA 30, 1620–1633 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  101. Lunde, B. M., Moore, C. & Varani, G. RNA-binding proteins: modular design for efficient function. Nat. Rev. Mol. Cell Biol. 8, 479–490 (2007).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  102. Gronland, G. R. & Ramos, A. The devil is in the domain: understanding protein recognition of multiple RNA targets. Biochem. Soc. Trans. 45, 1305–1311 (2017).

    Article  CAS  PubMed  Google Scholar 

  103. Corley, M., Burns, M. C. & Yeo, G. W. How RNA-binding proteins interact with RNA: molecules and mechanisms. Mol. Cell 78, 9–29 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  104. Street, L. A. et al. Large-scale map of RNA-binding protein interactomes across the mRNA life cycle. Mol. Cell 84, 3790–3809.e8 (2024).

    Article  CAS  PubMed  Google Scholar 

  105. He, S., Valkov, E., Cheloufi, S. & Murn, J. The nexus between RNA-binding proteins and their effectors. Nat. Rev. Genet. 24, 276–294 (2023).

    Article  CAS  PubMed  Google Scholar 

  106. Zhang, Y. et al. Structure, phosphorylation and U2AF65 binding of the N-terminal domain of splicing factor 1 during 3′-splice site recognition. Nucleic Acids Res. 41, 1343–1354 (2013).

    Article  CAS  PubMed  Google Scholar 

  107. Järvelin, A. I., Noerenberg, M., Davis, I. & Castello, A. The new (dis)order in RNA regulation. Cell Commun. Signal. 14, 9 (2016).

    Article  PubMed  PubMed Central  Google Scholar 

  108. Kato, M., Zhou, X. & McKnight, S. L. How do protein domains of low sequence complexity work? RNA 28, 3–15 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Stowell, J. A. W. et al. A low-complexity region in the YTH domain protein Mmi1 enhances RNA binding. J. Biol. Chem. 293, 9210–9222 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  110. Nicastro, G. et al. Direct m6A recognition by IMP1 underlays an alternative model of target selection for non-canonical methyl-readers. Nucleic Acids Res. 51, 8774–8786 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  111. Xu, C. et al. Structural basis for selective binding of m6A RNA by the YTHDC1 YTH domain. Nat. Chem. Biol. 10, 927–929 (2014).

    Article  CAS  PubMed  Google Scholar 

  112. Woods, C. T. et al. Comparative visualization of the RNA suboptimal conformational ensemble in vivo. Biophys. J. 113, 290–301 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  113. Liu, N. et al. N6-methyladenosine-dependent RNA structural switches regulate RNA–protein interactions. Nature 518, 560–564 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  114. Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).

    Article  CAS  PubMed  Google Scholar 

  115. Tang, J., Wang, X., Xiao, D., Liu, S. & Tao, Y. The chromatin-associated RNAs in gene regulation and cancer. Mol. Cancer 22, 27 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  116. Calandrelli, R. et al. Genome-wide analysis of the interplay between chromatin-associated RNA and 3D genome organization in human cells. Nat. Commun. 14, 6519 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Childs-Disney, J. L. et al. Targeting RNA structures with small molecules. Nat. Rev. Drug Discov. 21, 736–762 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  118. Balaratnam, S. et al. Investigating the NRAS 5′ UTR as a target for small molecules. Cell Chem. Biol. 30, 643–657.e8 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  119. Wu, L. et al. RNALocate v3.0: advancing the repository of RNA subcellular localization with dynamic analysis and prediction. Nucleic Acids Res. 53, D284–D292 (2025).

    Article  PubMed  Google Scholar 

  120. Rangaraju, V., tom Dieck, S. & Schuman, E. M. Local translation in neuronal compartments: how local is local? EMBO Rep. 18, 693–711 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  121. Bourke, A. M., Schwarz, A. & Schuman, E. M. De-centralizing the central dogma: mRNA translation in space and time. Mol. Cell 83, 452–468 (2023).

    Article  CAS  PubMed  Google Scholar 

  122. Spitale, R. C. & Incarnato, D. Probing the dynamic RNA structurome and its functions. Nat. Rev. Genet. 24, 178–196 (2023).

    Article  CAS  PubMed  Google Scholar 

  123. Goering, R., Arora, A., Pockalny, M. C. & Taliaferro, J. M. RNA localization mechanisms transcend cell morphology. eLife 12, e80040 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  124. Yang, L. et al. The challenges of investigating RNA function. Mol. Cell 84, 3567–3571 (2024).

    Article  CAS  PubMed  Google Scholar 

  125. Das, S., Vera, M., Gandin, V., Singer, R. H. & Tutucci, E. Intracellular mRNA transport and localized translation. Nat. Rev. Mol. Cell Biol. 22, 483–504 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  126. Andreassi, C. et al. An NGF-responsive element targets myo-inositol monophosphatase-1 mRNA to sympathetic neuron axons. Nat. Neurosci. 13, 291–301 (2010).

    Article  CAS  PubMed  Google Scholar 

  127. Jambor, H., Brunel, C. & Ephrussi, A. Dimerization of oskar 3′ UTRs promotes hitchhiking for RNA localization in the drosophila oocyte. RNA 17, 2049–2057 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  128. Will, T. J. et al. Deep sequencing and high-resolution imaging reveal compartment-specific localization of Bdnf mRNA in hippocampal neurons. Sci. Signal. 6, rs16 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  129. Sample, P. J. et al. Human 5′ UTR design and variant effect prediction from a massively parallel translation assay. Nat. Biotechnol. 37, 803–809 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  130. Davuluri, R. V., Suzuki, Y., Sugano, S. & Zhang, M. Q. CART classification of human 5′ UTR sequences. Genome Res. 10, 1807–1816 (2000).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  131. Karollus, A., Avsec, Ž. & Gagneur, J. Predicting mean ribosome load for 5′UTR of any length using deep learning. PLoS Comput. Biol. 17, e1008982 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  132. Cuperus, J. T., Groves, B. & Kuchina, A. Deep learning of the regulatory grammar of yeast 5′ untranslated regions from 500,000 random sequences. Genome 27, 2015–2024 (2017).

    CAS  Google Scholar 

  133. Gilliot, P.-A. & Gorochowski, T. E. Transfer learning for cross-context prediction of protein expression from 5′UTR sequence. Nucleic Acids Res. 52, e58 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  134. Wayment-Steele, H. K. et al. Deep learning models for predicting RNA degradation via dual crowdsourcing. Nat. Mach. Intell. 4, 1174–1184 (2022).

    Article  PubMed  PubMed Central  Google Scholar 

  135. He, S., Gao, B., Sabnis, R. & Sun, Q. RNAdegformer: accurate prediction of mRNA degradation at nucleotide resolution with deep learning. Brief. Bioinform. 24, bbac581 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  136. Garg, A., Singhal, N., Kumar, R. & Kumar, M. mRNALoc: a novel machine-learning based in-silico tool to predict mRNA subcellular localization. Nucleic Acids Res. 48, W239–W243 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  137. Zhang, Z.-Y. et al. Design powerful predictor for mRNA subcellular location prediction in Homo sapiens. Brief. Bioinform. 22, 526–535 (2021).

    Article  CAS  PubMed  Google Scholar 

  138. Samacoits, A. et al. A computational framework to study sub-cellular RNA localization. Nat. Commun. 9, 4584 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  139. Yan, Z., Lécuyer, E. & Blanchette, M. Prediction of mRNA subcellular localization using deep recurrent neural networks. Bioinformatics 35, i333–i342 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  140. Musleh, S., Islam, M. T., Qureshi, R., Alajez, N. M. & Alam, T. Correction: MSLP: mRNA subcellular localization predictor based on machine learning techniques. BMC Bioinformatics 24, 156 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  141. Wang, D. et al. DM3Loc: multi-label mRNA subcellular localization prediction and analysis based on multi-head self-attention mechanism. Nucleic Acids Res. 49, e46 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  142. Zeng, M. et al. DeepLncLoc: a deep learning framework for long non-coding RNA subcellular localization prediction based on subsequence embedding. Brief. Bioinform. 23, bbab360 (2022).

    Article  PubMed  Google Scholar 

  143. Yang, Y. et al. Deciphering 3′UTR mediated gene regulation using interpretable deep representation learning. Adv. Sci. 11, e2407013 (2024).

    Article  Google Scholar 

  144. Sato, K., Akiyama, M. & Sakakibara, Y. RNA secondary structure prediction using deep learning with thermodynamic integration. Nat. Commun. 12, 941 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  145. Fu, L. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res. 50, e14 (2022).

    Article  CAS  PubMed  Google Scholar 

  146. Singh, J., Hanson, J., Paliwal, K. & Zhou, Y. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat. Commun. 10, 5407 (2019).

  147. Sapoval, N. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat. Commun. 13, 1728 (2022).

  148. Weinberg, D. E. et al. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 14, 1787–1799 (2016).

  149. Qiu, X. Sequence similarity governs generalizability of de novo deep learning models for RNA secondary structure prediction. PLoS Comput. Biol. 19, e1011047 (2023).

  150. Flamm, C. et al. Caveats to deep learning approaches to RNA secondary structure prediction. Front. Bioinform. 2, 835422 (2022).

  151. Schlusser, N., González, A., Pandey, M. & Zavolan, M. Current limitations in predicting mRNA translation with deep learning models. Genome Biol. 25, 227 (2024).

  152. Tomasev, N. et al. Pushing the limits of self-supervised ResNets: can we outperform supervised learning without labels on ImageNet? In First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward (ICML, 2022).

  153. Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) Vol. 139 8748–8763 (PMLR, 2021).

  154. Waisberg, E. et al. GPT-4: a new era of artificial intelligence in medicine. Ir. J. Med. Sci. 192, 3197–3200 (2023).

  155. Zhang, Y. et al. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Res. 52, e3 (2024).

  156. Gong, T. & Bu, D. Language models enable zero-shot prediction of RNA secondary structures including pseudoknots. Preprint at bioRxiv https://doi.org/10.1101/2024.01.27.577533 (2024).

  157. Yin, W. et al. ERNIE-RNA: an RNA language model with structure-enhanced representations. Preprint at bioRxiv https://doi.org/10.1101/2024.03.17.585376 (2024).

  158. Wang, N. et al. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nat. Mach. Intell. 6, 548–557 (2024).

  159. Akiyama, M. & Sakakibara, Y. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genom. Bioinform. 4, lqac012 (2022).

  160. Chen, J. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. Preprint at bioRxiv https://doi.org/10.1101/2022.08.06.503062 (2022).

  161. Penić, R. J., Vlašić, T., Huber, R. G., Wan, Y. & Šikić, M. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. Preprint at https://doi.org/10.48550/arXiv.2403.00043 (2024).

  162. Yamada, K. & Hamada, M. Prediction of RNA–protein interactions using a nucleotide language model. Bioinform. Adv. 2, vbac023 (2022).

  163. Wang, X. et al. Uni-Rna: universal pre-trained models revolutionize RNA research. Preprint at bioRxiv https://doi.org/10.1101/2023.07.11.548588 (2023).

  164. Chen, K. et al. Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction. Brief. Bioinform. 25, bbae163 (2024).

  165. Sun, L. et al. Predicting dynamic cellular protein–RNA interactions by deep learning using in vivo RNA structures. Cell Res. 31, 495–516 (2021).

  166. Flynn, R. A. et al. Transcriptome-wide interrogation of RNA secondary structure in living cells with icSHAPE. Nat. Protoc. 11, 273–290 (2016).

  167. Rao, R., Meier, J., Sercu, T., Ovchinnikov, S. & Rives, A. Transformer protein language models are unsupervised structure learners. In Proceedings of the International Conference on Learning Representations (ICLR, 2021).

  168. Vig, J. et al. BERTology meets biology: interpreting attention in protein language models. In Proceedings of the International Conference on Learning Representations 2021 (ICLR, 2021).

  169. Ali, S. et al. Explainable artificial intelligence (XAI): what we know and what is left to attain trustworthy artificial intelligence. Inf. Fusion 99, 101805 (2023).

  170. Zhao, H. et al. Explainability for large language models: a survey. ACM Trans. Intell. Syst. Technol. 15, 20 (2024).

  171. Vu, M. H. et al. Linguistically inspired roadmap for building biologically reliable protein language models. Nat. Mach. Intell. 5, 485–496 (2023).

  172. Dalla-Torre, H. et al. The nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).

  173. Yang, Y. et al. Deciphering 3′UTR mediated gene regulation using interpretable deep representation learning. Adv. Sci. 11, e2407013 (2024).

  174. Ren, Y. et al. BEACON: benchmark for comprehensive RNA tasks and language models. Adv. Neural Inf. Process. Syst. 37, 92891–92921 (2024).

  175. Su, J. et al. SaProt: protein language modelling with structure-aware vocabulary. In Proceedings of the International Conference on Learning Representations (ICLR, 2024).

  176. Poli, M. et al. Hyena hierarchy: towards larger convolutional language models. ICML 202, 28043–28078 (2023).

  177. Beck, M. et al. xLSTM: extended long short-term memory. Adv. Neural Inf. Process. Syst. 37, 107547–107603 (2025).

  178. Dai, D. et al. DeepSeekMoE: towards ultimate expert specialization in mixture-of-experts language models. In Proc. 62nd Annual Meeting of the Association for Computational Linguistics (eds Ku, L.-W., Martins, A. & Srikumar, V.) Vol. 1: Long Papers, 1280–1297 (Association for Computational Linguistics, 2024).

  179. Peng, B., Quesnelle, J., Fan, H. & Shippole, E. YaRN: efficient context window extension of large language models. In The Twelfth International Conference on Learning Representations (2024).

  180. He, L. et al. Pre-training co-evolutionary protein representation via a pairwise masked language model. Preprint at https://doi.org/10.48550/arXiv.2110.15527 (2021).

  181. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  182. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  183. Zablocki, L. I. et al. Comprehensive benchmarking of large language models for RNA secondary structure prediction. Brief. Bioinform. 26, bbaf137 (2025).

  184. Hastings, J. Primer on ontologies. Methods Mol. Biol. 1446, 3–13 (2017).

  185. Cavalleri, E. et al. An ontology-based knowledge graph for representing interactions involving RNA molecules. Sci. Data 11, 906 (2024).

  186. Bepler, T. & Berger, B. Learning the protein language: evolution, structure, and function. Cell Syst. 12, 654–669.e3 (2021).

  187. Zhou, Z. et al. Enhancing efficiency of protein language models with minimal wet-lab data through few-shot learning. Nat. Commun. 15, 5566 (2024).

  188. Zhang, N. et al. OntoProtein: protein pretraining with gene ontology embedding. In Proceedings of the International Conference on Learning Representations 2022 (ICLR, 2022).

  189. Kulmanov, M. et al. Protein function prediction as approximate semantic entailment. Nat. Mach. Intell. 6, 220–228 (2024).

  190. Fiannaca, A., La Rosa, M., La Paglia, L., Gaglio, S. & Urso, A. GOWDL: gene ontology-driven wide and deep learning model for cell typing of scRNA-seq data. Brief. Bioinform. 24, bbad332 (2023).

  191. Yin, Q. & Chen, L. CellTICS: an explainable neural network for cell-type identification and interpretation based on single-cell RNA-seq data. Brief. Bioinform. 25, bbad449 (2023).

  192. He, Y. et al. LucaOne: generalized biological foundation model with unified nucleic acid and protein language. Preprint at bioRxiv https://doi.org/10.1101/2024.05.10.592927 (2024).

  193. Fradkin, P. et al. Orthrus: towards evolutionary and functional RNA foundation models. In NeurIPS 2024 Workshop on AI for New Drug Modalities (2024).

  194. Boyd, N. et al. ATOM-1: a foundation model for RNA structure and function built on chemical mapping data. Preprint at bioRxiv https://doi.org/10.1101/2023.12.13.571579 (2023).

  195. He, S. et al. Ribonanza: deep learning of RNA structure through dual crowdsourcing. Preprint at bioRxiv https://doi.org/10.1101/2024.02.24.581671 (2024).

  196. Garau-Luis, J. J. et al. Multi-modal transfer learning between biological foundation models. Adv. Neural Inf. Process. Syst. 37, 78431–78450 (2024).

  197. Jha, K., Karmakar, S. & Saha, S. Graph-BERT and language model-based framework for protein–protein interaction identification. Sci. Rep. 13, 5663 (2023).

  198. Zang, X., Zhao, X. & Tang, B. Hierarchical molecular graph self-supervised learning for property prediction. Commun. Chem. 6, 34 (2023).

  199. Birnbaum, F., Jain, S., Madry, A. & Keating, A. E. Jointly embedding protein structures and sequences through residue level alignment. PRX Life 2, 043013 (2024).

  200. Barua, A., Ahmed, M. U. & Begum, S. A systematic literature review on multimodal machine learning: applications, challenges, gaps and future directions. IEEE Access 11, 14804–14831 (2023).

  201. Li, S. & Tang, H. Multimodal alignment and fusion: a survey. Preprint at https://doi.org/10.48550/arXiv.2411.17040 (2024).

  202. Liang, P. P., Zadeh, A. & Morency, L.-P. Foundations & trends in multimodal machine learning: principles, challenges, and open questions. ACM Comput. Surv. 56, 1–42 (2024).

  203. Zhang, Z. et al. A systematic study of joint representation learning on protein sequences and structures. Preprint at https://doi.org/10.48550/arXiv.2303.06275 (2023).

  204. Ma, M., Ren, J., Zhao, L., Testuggine, D. & Peng, X. Are multimodal transformers robust to missing modality? In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition 18156–18165 (IEEE, 2022).

  205. Wang, H. et al. Multi-modal learning with missing modality via shared-specific feature modelling. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition 15878–15887 (IEEE, 2023).

  206. Flügel, S., Glauer, M., Mossakowski, T. & Neuhaus, F. A fuzzy loss for ontology classification. In International Conference on Neural-Symbolic Learning and Reasoning 101–118 (Springer Nature, 2024).

  207. Hawkins-Hooker, A., Kmec, J., Bent, O. & Duckworth, P. Likelihood-based fine-tuning of protein language models for few-shot fitness prediction and design. In ICML Workshop ML for Life and Material Science: From Theory to Industry Applications (2024).

  208. Vo, H. V. et al. Automatic data curation for self-supervised learning: a clustering-based approach. Trans. Mach. Learn. Res. (2024).

  209. Campanella, G., Vanderbilt, C. & Fuchs, T. Computational pathology at health system scale — self-supervised foundation models from billions of images. In AAAI 2024 Spring Symposium on Clinical Foundation Models (2024).

  210. Chu, Y. et al. A 5′ UTR language model for decoding untranslated regions of mRNA and function predictions. Nat. Mach. Intell. 6, 449–460 (2024).

  211. Wei, J., Chen, S., Zong, L., Gao, X. & Li, Y. Protein–RNA interaction prediction with deep learning: structure matters. Brief. Bioinform. 23, bbab540 (2021).

  212. Xia, Y., Xia, C.-Q., Pan, X. & Shen, H.-B. GraphBind: protein structural context embedded rules learned by hierarchical graph neural networks for recognizing nucleic-acid-binding residues. Nucleic Acids Res. 49, e51 (2021).

  213. Tieng, F. Y. F. et al. A Hitchhiker’s guide to RNA–RNA structure and interaction prediction tools. Brief. Bioinform. 25, bbad421 (2023).

  214. Fang, Y., Pan, X. & Shen, H.-B. Recent deep learning methodology development for RNA–RNA interaction prediction. Symmetry 14, 1302 (2022).

  215. Zhang, H. et al. ncRNAInter: a novel strategy based on graph neural network to discover interactions between lncRNA and miRNA. Brief. Bioinform. 23, bbac411 (2022).

  216. Li, Y.-C. et al. DeepCMI: a graph-based model for accurate prediction of circRNA–miRNA interactions with multiple information. Brief. Funct. Genomics 23, 276–285 (2024).

  217. Rasti, S. & Vogiatzis, C. A survey of computational methods in protein–protein interaction networks. Ann. Oper. Res. 276, 35–87 (2019).

  218. Hu, L., Wang, X., Huang, Y.-A., Hu, P. & You, Z.-H. A survey on computational models for predicting protein–protein interactions. Brief. Bioinform. 22, bbab036 (2021).

  219. Xu, M. et al. Graph neural networks for protein-protein interactions — a short survey. Preprint at https://doi.org/10.48550/arXiv.2404.10450 (2024).

  220. Gao, Z. et al. Hierarchical graph learning for protein–protein interaction. Nat. Commun. 14, 1093 (2023).

  221. Huang, K., Xiao, C., Glass, L. M., Zitnik, M. & Sun, J. SkipGNN: predicting molecular interactions with skip-graph networks. Sci. Rep. 10, 21092 (2020).

  222. Jha, K., Saha, S. & Singh, H. Prediction of protein-protein interaction using graph neural networks. Sci. Rep. 12, 8360 (2022).

  223. Nambiar, A. et al. Transforming the language of life: transformer neural networks for protein prediction tasks. In Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics https://doi.org/10.1145/3388440.34124 (2020).

  224. Wang, Y. et al. RPI-GGCN: prediction of RNA–protein interaction based on interpretability gated graph convolution neural network and co-regularized variational autoencoders. IEEE Trans. Neural Netw. Learn. Syst. 36, 7681–7695 (2024).

  225. Yu, B. et al. RPI-MDLStack: predicting RNA–protein interactions through deep learning with stacking strategy and LASSO. Appl. Soft Comput. 120, 108676 (2022).

  226. Wang, Y. et al. RPI-CapsuleGAN: predicting RNA–protein interactions through an interpretable generative adversarial capsule network. Pattern Recognit. 141, 109626 (2023).

  227. Zhou, J., Wang, X., Niu, R., Shang, X. & Wen, J. Predicting circRNA–miRNA interactions utilizing transformer-based RNA sequential learning and high-order proximity preserved embedding. iScience 27, 108592 (2024).

  228. Singh, R., Xu, J. & Berger, B. Struct2net: integrating structure into protein-protein interaction prediction. Pac. Symp. Biocomput. 2006, 403–414 (2006).

  229. Yang, F., Fan, K., Song, D. & Lin, H. Graph-based prediction of protein–protein interactions with attributed signed graph embedding. BMC Bioinformatics 21, 323 (2020).

  230. Yan, Z., Hamilton, W. L. & Blanchette, M. Graph neural representational learning of RNA secondary structures for predicting RNA–protein interactions. Bioinformatics 36, i276–i284 (2020).

  231. Huang, Y.-A. et al. Predicting lncRNA–miRNA interaction via graph convolution auto-encoder. Front. Genet. 10, 758 (2019).

  232. Zhao, C. et al. Graph embedding ensemble methods based on the heterogeneous network for lncRNA–miRNA interaction prediction. BMC Genomics 21, 867 (2020).

  233. Dutta, P. & Saha, S. Amalgamation of protein sequence, structure and textual information for improving protein–protein interaction identification. In Proc. 58th Annual Meeting of the Association for Computational Linguistics (eds Jurafsky, D. et al.) 6396–6407 (Association for Computational Linguistics, 2020).

  234. Luo, X., Wang, L., Hu, P. & Hu, L. Predicting protein–protein interactions using sequence and network information via variational graph autoencoder. IEEE/ACM Trans. Comput. Biol. Bioinform. 20, 3182–3194 (2023).

  235. Chen, L. et al. Graph optimal transport for cross-domain alignment. ICML 1542–1553 (2020).

  236. Bing, R. et al. Heterogeneous graph neural networks analysis: a survey of techniques, evaluations and applications. Artif. Intell. Rev. 56, 8003–8042 (2023).

  237. Li, D. et al. RNA–protein interaction prediction based on deep learning: a comprehensive survey. Preprint at https://doi.org/10.48550/arXiv.2410.00077 (2024).

  238. Ying, R., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32, 9240–9251 (2019).

  239. Lv, G., Hu, Z., Bi, Y. & Zhang, S. Learning unknown from correlations: graph neural network for inter-novel-protein interaction prediction. In Proc. Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21 (ed. Zhou, Z.-H.) 3677–3683 (International Joint Conferences on Artificial Intelligence Organization, 2021).

  240. Li, J. et al. Evaluating graph neural networks for link prediction: current pitfalls and new benchmarking. In Advances in Neural Information Processing Systems 36 (eds Oh, A. et al.) 3853–3866 (NeurIPS, 2023).

  241. Morris, C. et al. Position: future directions in the theory of graph machine learning. In Proc. 41st International Conference on Machine Learning (eds Salakhutdinov, R. et al.) Vol. 235, 36294–36307 (PMLR, 2024).

  242. Zhang, B. et al. The expressive power of graph neural networks: a survey. IEEE Trans. Knowl. Data Eng. 37, 1455–1474 (2025).

  243. Papamarkou, T. et al. Position: topological deep learning is the new frontier for relational learning. In Proc. 41st International Conference on Machine Learning (eds Salakhutdinov, R. et al.) Vol. 235, 39529–39555 (PMLR, 2024).

  244. Zheng, X. et al. Graph neural networks for graphs with heterophily: a survey. Preprint at https://doi.org/10.48550/arXiv.2202.07082 (2022).

  245. Chen, F., Cocaign-Bousquet, M., Girbal, L. & Nouaille, S. 5′UTR sequences influence protein levels in Escherichia coli by regulating translation initiation and mRNA stability. Front. Microbiol. 13, 1088941 (2022).

  246. Lytle, J. R., Yario, T. A. & Steitz, J. A. Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5′ UTR as in the 3′ UTR. Proc. Natl Acad. Sci. USA 104, 9667–9672 (2007).

  247. Zhu, H. et al. Dynamic characterization and interpretation for protein–RNA interactions across diverse cellular conditions using HDRNet. Nat. Commun. 14, 6824 (2023).

  248. Li, M. M. et al. Contextual AI models for single-cell protein biology. Nat. Methods 21, 1546–1557 (2024).

  249. Blakes, A. J. M. et al. A systematic analysis of splicing variants identifies new diagnoses in the 100,000 Genomes Project. Genome Med. 14, 1–11 (2022).

  250. Farh, K. K.-H. et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337–343 (2015).

  251. Whiffin, N. et al. Characterising the loss-of-function impact of 5′ untranslated region variants in 15,708 individuals. Nat. Commun. 11, 2523 (2020).

  252. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).

  253. Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009).

  254. GTEx Consortium et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).

  255. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

  256. Cui, Y. et al. Alternative polyadenylation transcriptome-wide association study identifies APA-linked susceptibility genes in brain disorders. Nat. Commun. 14, 583 (2023).

  257. Mai, J., Lu, M., Gao, Q., Zeng, J. & Xiao, J. Transcriptome-wide association studies: recent advances in methods, applications and available databases. Commun. Biol. 6, 899 (2023).

  258. Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).

  259. Fu, X.-D. & Ares, M. Jr Context-dependent control of alternative splicing by RNA-binding proteins. Nat. Rev. Genet. 15, 689–701 (2014).

  260. Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).

  261. Grønning, A. G. B. et al. DeepCLIP: predicting the effect of mutations on protein-RNA binding with deep learning. Nucleic Acids Res. 48, 7099–7118 (2020).

  262. Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034.e6 (2019).

  263. Cano-Gamez, E. & Trynka, G. From GWAS to function: using functional genomics to identify the mechanisms underlying complex diseases. Front. Genet. 11, 424 (2020).

  264. Klim, J. R. et al. ALS-implicated protein TDP-43 sustains levels of STMN2, a mediator of motor neuron growth and repair. Nat. Neurosci. 22, 167–179 (2019).

  265. Tyzack, G. E. et al. Widespread FUS mislocalization is a molecular hallmark of amyotrophic lateral sclerosis. Brain 142, 2572–2580 (2019).

  266. Ziff, O. J. et al. Nucleocytoplasmic mRNA redistribution accompanies RNA binding protein mislocalization in ALS motor neurons and is restored by VCP ATPase inhibition. Neuron 111, 3011–3027.e7 (2023).

  267. Wood, M. J. A., Talbot, K. & Bowerman, M. Spinal muscular atrophy: antisense oligonucleotide therapy opens the door to an integrated therapeutic landscape. Hum. Mol. Genet. 26, R151–R159 (2017).

  268. Raal, F. J. et al. Mipomersen, an apolipoprotein B synthesis inhibitor, for lowering of LDL cholesterol concentrations in patients with homozygous familial hypercholesterolaemia: a randomised, double-blind, placebo-controlled trial. Lancet 375, 998–1006 (2010).

  269. Mercuri, E. et al. Nusinersen versus sham control in later-onset spinal muscular atrophy. N. Engl. J. Med. 378, 625–635 (2018).

  270. Finkel, R. S. et al. Nusinersen versus sham control in infantile-onset spinal muscular atrophy. N. Engl. J. Med. 377, 1723–1732 (2017).

  271. Mendell, J. R. et al. Eteplirsen for the treatment of Duchenne muscular dystrophy. Ann. Neurol. 74, 637–647 (2013).

  272. Benson, M. D. et al. Inotersen treatment for patients with hereditary transthyretin amyloidosis. N. Engl. J. Med. 379, 22–31 (2018).

  273. Frank, D. E. et al. Increased dystrophin production with golodirsen in patients with Duchenne muscular dystrophy. Neurology 94, e2270–e2282 (2020).

  274. Witztum, J. L. et al. Volanesorsen and triglyceride levels in familial chylomicronemia syndrome. N. Engl. J. Med. 381, 531–542 (2019).

  275. Adams, D. et al. Patisiran, an RNAi therapeutic, for hereditary transthyretin amyloidosis. N. Engl. J. Med. 379, 11–21 (2018).

  276. Balwani, M. et al. Phase 3 trial of RNAi therapeutic givosiran for acute intermittent porphyria. N. Engl. J. Med. 382, 2289–2301 (2020).

  277. Garrelfs, S. F. et al. Lumasiran, an RNAi therapeutic for primary hyperoxaluria type 1. N. Engl. J. Med. 384, 1216–1226 (2021).

  278. Ray, K. K. et al. Two phase 3 trials of inclisiran in patients with elevated LDL cholesterol. N. Engl. J. Med. 382, 1507–1519 (2020).

  279. Clemens, P. R. et al. Long-term functional efficacy and safety of viltolarsen in patients with Duchenne muscular dystrophy. J. Neuromuscul. Dis. 9, 493–501 (2022).

  280. Wagner, K. R. et al. Safety, tolerability, and pharmacokinetics of casimersen in patients with Duchenne muscular dystrophy amenable to exon 45 skipping: a randomized, double-blind, placebo-controlled, dose-titration trial. Muscle Nerve 64, 285–292 (2021).

  281. Liu, A. et al. Nedosiran, a candidate siRNA drug for the treatment of primary hyperoxaluria: design, development, and clinical studies. ACS Pharmacol. Transl. Sci. 5, 1007–1016 (2022).

  282. Miller, T. M. et al. Trial of antisense oligonucleotide tofersen for ALS. N. Engl. J. Med. 387, 1099–1110 (2022).

  283. van Roon-Mom, W., Ferguson, C. & Aartsma-Rus, A. From failure to meet the clinical endpoint to US Food and Drug Administration approval: 15th antisense oligonucleotide therapy approved Qalsody (tofersen) for treatment of SOD1 mutated amyotrophic lateral sclerosis. Nucleic Acid. Ther. 33, 234–237 (2023).

  284. Korobeynikov, V. A., Lyashchenko, A. K., Blanco-Redondo, B., Jafar-Nejad, P. & Shneider, N. A. Antisense oligonucleotide silencing of FUS expression as a therapeutic approach in amyotrophic lateral sclerosis. Nat. Med. 28, 104–116 (2022).

  285. Musa, D. A., Raji, M. O., Sikiru, A. B., Aremu, K. H. & Aigboeghian, E. A. Promising RNA-based therapies for viral infections, genetic disorders and cancer. Acad. Mol. Biol. Genomics https://doi.org/10.20935/acadmolbiogen7329 (2024).

  286. Wang, Q. et al. Cell cycle regulation by alternative polyadenylation of CCND1. Sci. Rep. 8, 6824 (2018).

  287. Ng, K. P. et al. A common BIM deletion polymorphism mediates intrinsic resistance and inferior responses to tyrosine kinase inhibitors in cancer. Nat. Med. 18, 521–528 (2012).

  288. Sotillo, E. et al. Convergence of acquired mutations and alternative splicing of CD19 enables resistance to CART-19 immunotherapy. Cancer Discov. 5, 1282–1295 (2015).

  289. Sobczak, K. & Krzyzosiak, W. J. Structural determinants of BRCA1 translational regulation. J. Biol. Chem. 277, 17349–17358 (2002).

  290. Mayr, C. & Bartel, D. P. Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009).

  291. Mitschka, S. & Mayr, C. Context-specific regulation and function of mRNA alternative polyadenylation. Nat. Rev. Mol. Cell Biol. 23, 779–796 (2022).

  292. Liang, X.-H. et al. Translation efficiency of mRNAs is increased by antisense oligonucleotides targeting upstream open reading frames. Nat. Biotechnol. 34, 875–880 (2016).

  293. Liang, X.-H. et al. Antisense oligonucleotides targeting translation inhibitory elements in 5′ UTRs can selectively increase protein levels. Nucleic Acids Res. 45, 9528–9546 (2017).

  294. Zhao, Y., Oono, K., Takizawa, H. & Kotera, M. GenerRNA: a generative pre-trained language model for de novo RNA design. PLoS ONE 19, e0310814 (2024).

  295. Holdt, L. M., Kohlmaier, A. & Teupser, D. Circular RNAs as therapeutic agents and targets. Front. Physiol. 9, 1262 (2018).

  296. Touznik, A., Maruyama, R., Hosoki, K., Echigoya, Y. & Yokota, T. LNA/DNA mixmer-based antisense oligonucleotides correct alternative splicing of the SMN2 gene and restore SMN protein expression in type 1 SMA fibroblasts. Sci. Rep. 7, 3672 (2017).

  297. Roux, B. T., Lindsay, M. A. & Heward, J. A. Knockdown of nuclear-located enhancer RNAs and long ncRNAs using locked nucleic acid GapmeRs. Methods Mol. Biol. 1468, 11–18 (2017).

  298. Amodio, N. et al. Drugging the lncRNA MALAT1 via LNA gapmeR ASO inhibits gene expression of proteasome subunits and triggers anti-multiple myeloma activity. Leukemia 32, 1948–1957 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  299. Chen, R. et al. Engineering circular RNA for enhanced protein production. Nat. Biotechnol. 41, 262–272 (2023).

    Article  CAS  PubMed  Google Scholar 

  300. Zhang, G. et al. KGANSynergy: knowledge graph attention network for drug synergy prediction. Brief. Bioinform. 24, bbad167 (2023).

    Article  PubMed  Google Scholar 

  301. Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In Proceedings of the International Conference on Learning Representations (ICLR, 2023).

  302. Zheng, S. et al. Predicting equilibrium distributions for molecular systems with deep learning. Nat. Mach. Intell. 6, 558–567 (2024).

    Article  Google Scholar 

  303. Nguyen, E. et al. Sequence modelling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  304. Atz, K. et al. Prospective de novo drug design with deep interactome learning. Nat. Commun. 15, 1–18 (2024).

    Article  Google Scholar 

  305. Cacciarelli, D. & Kulahci, M. Active learning for data streams: a survey. Mach. Learn. https://doi.org/10.1007/s10994-023-06454-2 (2023).

  306. Fournier, Q. et al. Protein language models: is scaling necessary? Preprint at bioRxiv https://doi.org/10.1101/2024.09.23.614603 (2024).

  307. Outeiral, C. & Deane, C. M. Codon language embeddings provide strong signals for use in protein engineering. Nat. Mach. Intell. 6, 170–179 (2024).

    Article  Google Scholar 

  308. Naghipourfar, M. et al. A suite of foundation models captures the contextual interplay between codons. Preprint at bioRxiv https://doi.org/10.1101/2024.10.10.617568 (2024).

  309. Celaj, A. et al. An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. Preprint at bioRxiv https://doi.org/10.1101/2023.09.20.558508 (2023).

  310. Ren, Z. et al. CodonBERT: a BERT-based architecture tailored for codon optimization using the cross-attention mechanism. Bioinformatics 40, btae330 (2024).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  311. Boccaletto, P. et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic Acids Res. 46, D303–D307 (2018).

    Article  CAS  PubMed  Google Scholar 

  312. Varani, L. et al. The NMR structure of the 38 kDa U1A protein–PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nat. Struct. Biol. 7, 329–335 (2000).

    Article  CAS  PubMed  Google Scholar 

  313. Hennig, J. et al. Structural basis for the assembly of the Sxl–Unr translation regulatory complex. Nature 515, 287–290 (2014).

    Article  CAS  PubMed  Google Scholar 

  314. Chen, S.-J. RNA folding: conformational statistics, folding kinetics, and ion electrostatics. Annu. Rev. Biophys. 37, 197–214 (2008).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  315. Halvorsen, M., Martin, J. S., Broadaway, S. & Laederach, A. Disease-associated mutations that alter the RNA structural ensemble. PLoS Genet. 6, e1001074 (2010).

    Article  PubMed  PubMed Central  Google Scholar 

  316. Mortimer, S. A. & Weeks, K. M. Time-resolved RNA SHAPE chemistry: quantitative RNA structure analysis in one-second snapshots and at single-nucleotide resolution. Nat. Protoc. 4, 1413–1421 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  317. Griffiths-Jones, S., Bateman, A., Marshall, M., Khanna, A. & Eddy, S. R. Rfam: an RNA family database. Nucleic Acids Res. 31, 439–441 (2003).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  318. Kalvari, I. et al. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 49, D192–D200 (2021).

    Article  CAS  PubMed  Google Scholar 

  319. RNAcentral Consortium RNAcentral 2021: secondary structure integration, improved sequence search and new member databases. Nucleic Acids Res. 49, D212–D220 (2021).

    Article  Google Scholar 

  320. Harrison, P. W. et al. Ensembl 2024. Nucleic Acids Res. 52, D891–D899 (2024).

    Article  CAS  PubMed  Google Scholar 

  321. Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information in 2023. Nucleic Acids Res. 51, D29–D38 (2023).

    Article  CAS  PubMed  Google Scholar 

  322. Kenton, J. D. M.-W. & Chang, L. K. Bert: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1, 4171–4186 (2019).

    Google Scholar 

  323. Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2023).

    Article  CAS  PubMed  Google Scholar 

  324. Celledoni, E. et al. Structure-preserving deep learning. Eur. J. Appl. Math. 32, 888–936 (2021).

    Article  Google Scholar 

  325. Lai, P., Zhang, Z., Zhang, W., Fu, F. & Cui, B. Enhancing unsupervised sentence embeddings via knowledge-driven data augmentation and Gaussian-decayed contrastive learning. Preprint at https://doi.org/10.48550/arXiv.2409.12887 (2024).

  326. Glauer, M., Neuhaus, F., Mossakowski, T. & Hastings, J. Ontology pre-training for poison prediction. In German Conference on Artificial Intelligence (Künstliche Intelligenz) 31–45 (Springer Nature, 2023).

  327. Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).

    Article  Google Scholar 

  328. Elmarakeby, H. A. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  329. Bronstein, M. M., Bruna, J., Cohen, T. & Veličković, P. Geometric deep learning: grids, groups, graphs, geodesics, and gauges. Preprint at https://doi.org/10.48550/arXiv.2104.13478 (2021).

  330. Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 4–24 (2021).

    Article  PubMed  Google Scholar 

  331. Chami, I., Abu-El-Haija, S., Perozzi, B., Ré, C. & Murphy, K. Machine learning on graphs: a model and comprehensive taxonomy. J. Mach. Learn. Res. 23, 1–64 (2022).

    Google Scholar 

  332. Müller, L., Galkin, M., Morris, C. & Rampášek, L. Attending to graph transformers. Trans. Mach. Learn. Res. 2835–8856 (2024).

  333. Kipf, T. N. & Welling, M. Variational graph auto-encoders. In NIPS Workshop on Bayesian Deep Learning (2016).

  334. Yang, Z. et al. Understanding negative sampling in graph representation learning. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1666–1676 (ACM, 2020).

Download references

Acknowledgements

This work is supported by the Swiss National Science Foundation (310030_207907 to Z.M.X.; 215906 to C.T. and J.H.; 1000144 to C.V.C.; and 205121_207437 to L.V.P.). A.R. and M.D. are supported by a Wellcome Trust Investigator Award (217213/Z/19/Z), Y.W. holds a Non-Clinical Junior Research Fellowship from the Motor Neurone Disease Association (Wang/Oct23/2324-799), and R.P. holds a Lister Research Prize Fellowship.

Author information

Contributions

All authors contributed substantially to discussion of the content. All authors wrote the article. All authors reviewed the manuscript before submission.

Corresponding author

Correspondence to Raphaëlle Luisier.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Molecular Cell Biology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Attention maps

Matrix of attention coefficients, often used in the context of self-attention in Transformers. A large attention coefficient between two tokens means that one of them will weigh heavily in the updated representation of the other.

Contrastive learning

Machine learning technique in which a model learns to differentiate between similar and dissimilar pairs of data by bringing similar pairs closer in the representation space and pushing dissimilar pairs apart.
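
The idea of pulling similar pairs together and pushing dissimilar pairs apart can be sketched with a small loss function. The example below uses an InfoNCE-style loss over cosine similarities, which is one common formulation and is assumed here for illustration only; the function names, the temperature value and the toy vectors are hypothetical.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to its positive
    than to any negative, high otherwise."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-dimensional embeddings.
anchor = [1.0, 0.0]
similar = [0.9, 0.1]
dissimilar = [[0.0, 1.0], [-1.0, 0.0]]

# The similar vector is used as the positive: the loss is small.
loss_well_separated = info_nce(anchor, similar, dissimilar)
# An orthogonal vector is (wrongly) used as the positive: the loss is large.
loss_poorly_separated = info_nce(anchor, [0.0, 1.0], [similar, [-1.0, 0.0]])
```

Minimizing such a loss with respect to the parameters of an encoder is what moves similar pairs closer in the representation space.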

Embeddings

Learned, vectorial abstract representations of data.

Fine-tuning

A secondary training stage of foundation models, or of other models trained in multiple stages, usually performed on smaller, labelled datasets to specialize the model for a given task.

Foundation model

A model trained on a large quantity of data to be general purpose and suitable for a variety of predictive tasks for a given data modality, in contrast to task-specific models.

Graph neural networks

Deep learning models (abbreviated GNNs) dedicated to learning on graphs.

Heterogeneous

Graphs are homogeneous when their nodes and edges share comparable features regardless of entity or interaction type. If node or edge features originate from different spaces, the graph is heterogeneous.

Hierarchical training

A machine learning strategy in which a deep learning model, composed of a succession of components dedicated to the encoding of specific modalities or concepts, is trained to achieve a given final task. Each intermediate component can be associated with a specific objective function to optimize.

Knowledge graphs

Graph-structured representations of knowledge about a domain, typically comprising one or more ontologies associated with instance data of various types and their inter-relationships.

Knowledge-based weighting strategies

Strategies in machine learning in which weights are assigned to training data points based on their biological or domain-specific importance rather than treating all data equally. These strategies help models learn from scarce but crucial data, which is particularly beneficial for training with imbalanced datasets.
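
As a minimal sketch of how such weights can enter training, the example below computes a per-sample weighted binary cross-entropy. The specific loss, the function name `weighted_bce` and the weight values are assumptions chosen for illustration; in practice the weights would come from domain knowledge.

```python
import math


def weighted_bce(probs, labels, weights, eps=1e-12):
    """Weighted binary cross-entropy, normalized by the sum of weights."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


# Hypothetical predictions: two well-predicted negatives and one rare,
# poorly predicted positive.
probs = [0.1, 0.2, 0.3]
labels = [0, 0, 1]

uniform_loss = weighted_bce(probs, labels, weights=[1.0, 1.0, 1.0])
# Up-weight the rare positive (the factor 5.0 is an arbitrary illustration).
knowledge_loss = weighted_bce(probs, labels, weights=[1.0, 1.0, 5.0])
```

Because the rare positive now contributes more to the loss, errors on it produce larger parameter updates during training.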

Large language models

Deep learning models that are trained in a self-supervised way on sequences. They often feature stacked transformer layers, resulting in millions to billions of parameters.

Message passing algorithms

A computational method used in graph-based models, where each node exchanges information with other specific nodes in the graph. These nodes can be direct or indirect neighbours of the central node. This process is used in GNNs to iteratively update the representation of each node according to the structure of the graph and the learned representations of neighbouring nodes.
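
The iterative update described here can be sketched as follows. This is a deliberately simplified version in which each node averages its own feature vector with those of its direct neighbours; real GNN layers additionally apply learned transformations, which are omitted. The graph, the node names and the feature values are hypothetical.

```python
def message_passing_step(features, neighbours):
    """One round of message passing with mean aggregation: every node
    averages its own vector with the vectors of its direct neighbours."""
    updated = {}
    for node, vec in features.items():
        messages = [features[n] for n in neighbours[node]] + [vec]
        updated[node] = [sum(vals) / len(messages) for vals in zip(*messages)]
    return updated


# Hypothetical toy graph: one mRNA node linked to two RNA-binding proteins.
neighbours = {"mRNA": ["RBP1", "RBP2"], "RBP1": ["mRNA"], "RBP2": ["mRNA"]}
features = {"mRNA": [1.0, 0.0], "RBP1": [0.0, 1.0], "RBP2": [0.0, 0.0]}

one_hop = message_passing_step(features, neighbours)
two_hop = message_passing_step(one_hop, neighbours)
```

After one round, RBP2 has only received information from the mRNA node. After two rounds, its second feature becomes non-zero, which shows that information from RBP1, an indirect neighbour, has reached it through the mRNA node.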

Multimodal alignment

The use of multiple modalities that do not represent the same object but are related in some way.

Multimodal fusion

The process of combining multiple modalities representing the same object, for example, the sequence, structure and functional annotation of a given RNA sequence.

Multitask training

A machine learning strategy in which a model is trained to perform multiple tasks simultaneously. The model shares a common underlying architecture for all tasks, but has specific outputs tailored to each task.

Objective function

The function that is being optimized (minimized or maximized) during training, typically a loss function that represents the difference between what the model predicts and the ground truth.

Ontology

A knowledge representation structure in which explicit knowledge about a topic is organized into classes and relationships, each of which is given a definition and associated synonyms and other metadata.

Pretraining

Generally used in the context of foundation models or other transfer learning scenarios, it is the first training step that allows the model to build general-purpose representations. It is generally performed on large, often unlabelled datasets in a self-supervised manner.

Self-attention mechanisms

Flexible mechanisms for a transformer model to learn the relative context-specific relationships between input tokens. They calculate attention scores between all elements of the input and output weighted averages of the element representations using those scores.
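
A minimal sketch of this computation is given below, assuming the widely used scaled dot-product form of attention. For brevity, the token embeddings are used directly as queries, keys and values, whereas a real transformer would first apply learned linear projections; the function names and the toy embeddings are hypothetical.

```python
import math


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, attention_map)."""
    d = len(keys[0])
    attention_map = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        attention_map.append(softmax(scores))
    # Each output is the average of the values, weighted by one map row.
    outputs = [
        [sum(w * v[j] for w, v in zip(weights, values))
         for j in range(len(values[0]))]
        for weights in attention_map
    ]
    return outputs, attention_map


# Hypothetical 2-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attention_map = self_attention(tokens, tokens, tokens)
```

Each row of `attention_map` sums to 1 and holds the coefficients with which the other tokens contribute to the updated representation of one token.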

Self-supervised learning

A paradigm in machine learning in which a model learns the structure of a dataset from the dataset itself without additional explicit labels, for example, by being tasked to learn to predict masked or missing parts of the data given surrounding elements.
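
The masking task mentioned here can be sketched as a data-preparation step: some tokens are hidden, and the hidden tokens become the prediction targets. The function name `mask_tokens`, the `[MASK]` symbol, the 15% masking rate and the toy sequence are illustrative assumptions, not the settings of any particular model.

```python
import random


def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of tokens; return the masked input and a
    dictionary mapping each masked position to its original token."""
    rng = random.Random(seed)
    n_mask = min(len(tokens), max(1, round(mask_rate * len(tokens))))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = mask_token
    return masked, targets


# Hypothetical toy RNA sequence, tokenized per nucleotide.
sequence = list("AUGGCUAACG")
masked_input, targets = mask_tokens(sequence)
```

A model would then be trained to recover `targets` from `masked_input`, so the labels come from the data itself rather than from external annotation.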

Semantic loss function

A type of loss function used in machine learning models to ensure that the predictions made by the model adhere to specific domain knowledge or logical constraints, in addition to minimizing the prediction error.

Supervised deep learning

A machine learning paradigm in which an input object (for example, a sequence), together with a desired output value (for example, the type of sequence), are used to train a predictive model capable of inferring outputs of this type for new inputs.

Tokenization

The process of splitting a sequence into defined subunits, called ‘tokens’. In RNA, sequences are often split into nucleotides or into overlapping k-mers.
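
As a minimal illustration of these two schemes, the Python sketch below splits an RNA sequence into single nucleotides and into overlapping k-mers. The function names, the example sequence and the choice of k = 3 with a stride of 1 are illustrative assumptions and are not taken from any specific RNA language model.

```python
def tokenize_nucleotides(seq: str) -> list[str]:
    """Split a sequence into single-nucleotide tokens."""
    return list(seq.upper())


def tokenize_kmers(seq: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a sequence into k-mers; a stride smaller than k gives
    overlapping tokens."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


example = "AUGGCUA"  # hypothetical toy sequence
nucleotide_tokens = tokenize_nucleotides(example)
kmer_tokens = tokenize_kmers(example, k=3)
# kmer_tokens == ['AUG', 'UGG', 'GGC', 'GCU', 'CUA']
```

With a stride of 1, a sequence of length n yields n - k + 1 overlapping k-mers, so neighbouring tokens share k - 1 nucleotides.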

Transfer learning

A paradigm in machine learning in which a model is trained in different stages, with the learned parameters from earlier stages in the training being retained or ‘transferred’ into later stages, where they benefit downstream predictions.

Transformer architecture

A popular and powerful deep learning architecture that is characterized by an attention mechanism and positional encodings that allow complex contextual relationships to be learned. In its standard form, the architecture is made up of transformer blocks, each combining a self-attention layer and a position-wise feed-forward (linear transformation) layer, with a residual connection and layer normalization around each of the two.

Vectorized representations

Abstract, numerical representation of data in the form of a vector of (usually) continuous values.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jung, V., Vincent-Cuaz, C., Tumescheit, C. et al. Decoding the interactions and functions of non-coding RNA with artificial intelligence. Nat Rev Mol Cell Biol 26, 797–818 (2025). https://doi.org/10.1038/s41580-025-00857-w

  • DOI: https://doi.org/10.1038/s41580-025-00857-w
