Nature Communications
SamplingDesign: RNA design via continuous optimization with coupled variables and Monte-Carlo sampling
  • Article
  • Open access
  • Published: 20 February 2026

  • Wei Yu Tang 1,2 (ORCID: orcid.org/0009-0008-1141-9479),
  • Ning Dai 1,
  • Tianshuo Zhou 1 (ORCID: orcid.org/0009-0008-4804-0825),
  • David H. Mathews 3,4,5 (ORCID: orcid.org/0000-0002-2907-6557) &
  • Liang Huang 1,6 (ORCID: orcid.org/0000-0001-6444-7045)

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Computer science
  • Machine learning
  • Structural biology

Abstract

RNA design aims to find a sequence that can fold into a target secondary structure. It can create artificial RNA molecules for specific functions, with wide applications in medicine. It is computationally challenging due to two levels of combinatorial explosion: the exponentially large design space and the exponentially many competing structures per design. Popular methods such as local search cannot keep up with these combinatorial explosions. We instead employ two techniques from machine learning, continuous optimization and Monte-Carlo sampling. We start from a distribution over all valid sequences, and use gradient descent to improve the expectation of an arbitrary objective function. We define novel coupled-variable distributions to model the correlation between nucleotides. We then use sampling to approximate the objective, estimate the gradient, and select the final candidate. Our work consistently outperforms state-of-the-art methods in key metrics including Boltzmann probability and ensemble defect, especially on long and hard-to-design structures.
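The optimization loop outlined in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the real objectives (Boltzmann probability, ensemble defect) require an RNA folding engine, so a hypothetical `toy_cost` stands in for them here, and the gradient is estimated with a standard score-function (REINFORCE-style) estimator, which is an assumption about how "sampling to estimate the gradient" might be realized. All function and variable names are illustrative. Each base pair is treated as one coupled variable over the six pairings AU, UA, CG, GC, GU, UG, so the two paired nucleotides are drawn jointly, while each unpaired position has its own four-way variable; all variables are softmax-parameterized.

```python
import math
import random

PAIR_CHOICES = ["AU", "UA", "CG", "GC", "GU", "UG"]  # coupled variable: one draw fixes both ends
UNPAIRED_CHOICES = ["A", "C", "G", "U"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def parse_structure(structure):
    """Turn a dot-bracket string into variables: ((i, j), choices) or ((i,), choices)."""
    stack, variables = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            variables.append(((stack.pop(), i), PAIR_CHOICES))
        else:
            variables.append(((i,), UNPAIRED_CHOICES))
    return variables

def toy_cost(seq, variables):
    """Hypothetical objective (lower is better): penalize non-GC pairs and non-A unpaired bases."""
    cost = 0
    for positions, _ in variables:
        letters = "".join(seq[p] for p in positions)
        if len(positions) == 2:
            cost += letters not in ("CG", "GC")
        else:
            cost += letters != "A"
    return cost

def optimize(structure, steps=100, samples=50, lr=0.5, seed=0):
    rng = random.Random(seed)
    variables = parse_structure(structure)
    logits = [[0.0] * len(choices) for _, choices in variables]  # start from uniform
    best_seq, best_cost = None, float("inf")
    for _ in range(steps):
        probs = [softmax(l) for l in logits]
        batch = []
        for _ in range(samples):
            # Sample one choice per variable, then assemble the sequence.
            picks = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
            seq = [""] * len(structure)
            for (positions, choices), k in zip(variables, picks):
                for pos, letter in zip(positions, choices[k]):
                    seq[pos] = letter
            seq = "".join(seq)
            cost = toy_cost(seq, variables)
            batch.append((picks, cost))
            if cost < best_cost:  # keep the best sample seen as the final candidate
                best_seq, best_cost = seq, cost
        baseline = sum(c for _, c in batch) / samples  # variance reduction
        for v, p in enumerate(probs):
            grad = [0.0] * len(p)
            for picks, cost in batch:
                for k in range(len(p)):
                    # d log p(x) / d logit_k = 1[x == k] - p_k
                    grad[k] += (cost - baseline) * ((picks[v] == k) - p[k]) / samples
            logits[v] = [l - lr * g for l, g in zip(logits[v], grad)]
    return best_seq, best_cost
```

Calling `optimize("((...))")` returns the best sampled sequence together with its toy cost. Because each pair is sampled as a single coupled variable, every sampled sequence is automatically compatible with the target structure, which is the point of modeling paired nucleotides jointly rather than independently.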

Data availability

The designed sequences generated in this study are provided in the Supplementary Information/Source Data file. Source data are provided with this paper.

Code availability

The SamplingDesign source code is available at https://github.com/weiyutang1010/SamplingDesign under the Apache 2.0 license. The specific version of the code associated with this publication is archived in Zenodo and is accessible via https://doi.org/10.5281/zenodo.17674022 (ref. 48).

References

  1. Eddy, S. R. Non-coding RNA genes and the modern RNA world. Nature Reviews Genetics 2, 919–929 (2001).

  2. Doudna, J. A. & Cech, T. R. The chemical repertoire of natural ribozymes. Nature 418, 222–228 (2002).

  3. Bachellerie, J. P., Cavaillé, J. & Hüttenhofer, A. The expanding snoRNA world. Biochimie 84, 775–790 (2002).

  4. Zhou, T. et al. RNA design via structure-aware multifrontier ensemble optimization. Bioinformatics 39, i563–i571 (2023).

  5. Portela, F. An unexpectedly effective Monte Carlo technique for the RNA inverse folding problem. BioRxiv (2018).

  6. Hofacker, I. L. et al. Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125, 167–188 (1994).

  7. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  8. Andronescu, M., Fejes, A. P., Hutter, F., Hoos, H. H. & Condon, A. A New Algorithm for RNA Secondary Structure Design. Journal of Molecular Biology 336, 607–624 (2004).

  9. Busch, A. & Backofen, R. INFO-RNA – a fast approach to inverse RNA folding. Bioinformatics 22, 1823–1831 (2006).

  10. Bellaousov, S., Kayedkhordeh, M., Peterson, R. J. & Mathews, D. H. Accelerated RNA secondary structure design using preselected sequences for helices and loops. RNA 24, 1555–1567 (2018).

  11. Garcia-Martin, J. A., Clote, P. & Dotu, I. RNAiFOLD: a constraint programming algorithm for RNA inverse folding and molecular design. J. Bioinfo. Comp. Bio. 11 (2013).

  12. Zadeh, J. N. et al. NUPACK: Analysis and design of nucleic acid systems. J. Comp. Chem. 32, 170–173 (2011).

  13. Dotu, I. et al. Complete RNA inverse folding: computational design of functional hammerhead ribozymes. Nucleic Acids Research 42, 11752–11762 (2014).

  14. Yamagami, R., Kayedkhordeh, M., Mathews, D. H. & Bevilacqua, P. C. Design of highly active double-pseudoknotted ribozymes: a combined computational and experimental study. Nucleic Acids Research 47, 29–42 (2018).

  15. Schwab, R., Ossowski, S., Riester, M., Warthmann, N. & Weigel, D. Highly specific gene silencing by artificial microRNAs in Arabidopsis. The Plant Cell 18, 1121–1133 (2006).

  16. Hamada, M. In silico approaches to RNA aptamer design. Biochimie 145, 8–14 (2018).

  17. Bauer, G. & Suess, B. Engineered riboswitches as novel tools in molecular biology. Journal of biotechnology 124, 4–11 (2006).

  18. Findeiß, S., Etzel, M., Will, S., Mörl, M. & Stadler, P. F. Design of artificial riboswitches as biosensors. Sensors 17, 1990 (2017).

  19. Norn, C. et al. Protein sequence design by conformational landscape optimization. Proceedings of the National Academy of Sciences 118, e2017228118 (2021).

  20. Bonnet, É, Rzazewski, P. & Sikora, F. Designing RNA secondary structures is hard. J. Comp. Bio 27, 302–316 (2020).

  21. Matthies, M., Krueger, R., Torda, A. & Ward, M. Differentiable Partition Function Calculation for RNA. Nucleic Acids Research (2023).

  22. Dai, N., Zhou, T., Tang, W. Y., Mathews, D. H. & Huang, L. EnsembleDesign: messenger RNA design minimizing ensemble free energy via probabilistic lattice parsing. Proceedings of ISMB (2025).

  23. Yang, X., Yoshizoe, K., Taneda, A. & Tsuda, K. RNA inverse folding using Monte Carlo tree search. BMC bioinformatics 18, 468 (2017).

  24. Churkin, A. et al. Design of RNAs: comparing programs for inverse RNA folding. Briefings in bioinformatics 19, 350–358 (2018).

  25. Mittal, A., Turner, D. H. & Mathews, D. H. NNDB: An Expanded Database of Nearest Neighbor Parameters for Predicting Stability of Nucleic Acid Secondary Structures. Journal of Molecular Biology 436 (2024).

  26. Zhou, T., Tang, W. Y., Mathews, D. H. & Huang, L. Undesignable RNA Structure Identification via Rival Structure Generation and Structure Decomposition. Proceedings of RECOMB (2024).

  27. Zhou, T., Tang, W. Y., Mathews, D. H. & Huang, L. Scalable Identification of Minimum Undesignable RNA Motifs on Loop-Pair Graphs. Proceedings of RECOMB (2024).

  28. Ward, M., Courtney, E. & Rivas, E. Fitness functions for RNA structure design. Nucleic Acids Research 51, e40 (2023).

  29. Lafferty, J., McCallum, A. & Pereira, F. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. Proceedings of ICML (2001).

  30. Vaswani, A. et al. Attention is all you need. Proceedings of NIPS (2017).

  31. Kahn, H. & Harris, T. E. Estimation of particle transmission by random sampling. NBS. applied math series 12, 27–30 (1951).

  32. Zhang, H. et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature 621, 396–403 (2023).

  33. Wayment-Steele, H. K. et al. Theoretical basis for stabilizing messenger RNA through secondary structure design. Nucleic Acids Research 49, 10604–10617 (2021).

  34. Runge, F., Franke, J., Fertmann, D., Backofen, R. & Hutter, F. Partial RNA design. Bioinformatics 40, i437–i445 (2024).

  35. Deigan, K. E., Li, T. W., Mathews, D. H. & Weeks, K. M. Accurate SHAPE-directed RNA structure determination. Proceedings of the National Academy of Sciences 106, 97–102 (2009).

  36. Zadeh, J. N., Wolfe, B. R. & Pierce, N. A. Nucleic acid sequence design via efficient ensemble defect optimization. Journal of computational chemistry 32, 439–452 (2011).

  37. Zhang, H., Zhang, L., Mathews, D. H. & Huang, L. LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities. Bioinformatics 36 (2020).

  38. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 (2014).

  39. Nesterov, Y. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). Dokl. Akad. Nauk. SSSR 269, 543 (1983).

  40. Huang, L. et al. LinearFold: linear-time approximate RNA folding by 5’-to-3’ dynamic programming and beam search. Bioinformatics 35, i295–i304 (2019).

  41. Taneda, A. MODENA: a multi-objective RNA inverse folding. Adv. Appl. Bioinform. Chem. 4, 1 (2011).

  42. Eastman, P., Shi, J., Ramsundar, B. & Pande, V. S. Solving the RNA design problem with reinforcement learning. PLoS Comp. Bio. 14, e1006176 (2018).

  43. Runge, F., Stoll, D., Falkner, S. & Hutter, F. Learning to design RNA. arXiv:1812.11951 (2018).

  44. Anderson-Lee, J. et al. Principles for predicting RNA secondary structure design difficulty. J. Mol. Bio. 428, 748–757 (2016).

  45. Koodli, R. V. et al. Redesigning the EteRNA100 for the Vienna 2 folding engine. BioRxiv 2021–08 (2021).

  46. Adamczyk, B., Antczak, M. & Szachniuk, M. RNAsolo: a repository of cleaned PDB-derived RNA 3D structures. Bioinformatics 38, 3668–3670 (2022).

  47. Badura, J., Rybarczyk, A. & Zok, T. Comprehensive datasets for RNA design, machine learning, and beyond. Scientific Reports 15, 21417 (2025).

  48. Tang, W. Y., Dai, N., Zhou, T., Mathews, D. H. & Huang, L. SamplingDesign: RNA Design via Continuous Optimization with Coupled Variables and Monte-Carlo Sampling (2025). https://doi.org/10.5281/zenodo.17674022.

Acknowledgements

This work was supported in part by NSF grants 2009071 (L.H.) and 2330737 (L.H. and D.H.M.).

Author information

Authors and Affiliations

  1. School of EECS, Oregon State University, Corvallis, OR, USA

    Wei Yu Tang, Ning Dai, Tianshuo Zhou & Liang Huang

  2. Dept. of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA

    Wei Yu Tang

  3. Dept. of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY, USA

    David H. Mathews

  4. Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA

    David H. Mathews

  5. Dept. of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, USA

    David H. Mathews

  6. Dept. of Biochemistry & Biophysics, Oregon State University, Corvallis, OR, USA

    Liang Huang

Contributions

L.H. conceived and directed the project, and developed the sampling framework. W.Y.T. implemented the whole system, and analyzed and visualized the results. N.D. suggested the coupled variable for pairs, guided the softmax parameterization, and ran the baseline (ref. 21). T.Z. implemented the projected gradient descent and contributed to data analysis, especially for the SAMFEO baseline. D.H.M. suggested the coupled variable for mismatches, and guided the data analysis and visualizations. All authors wrote the manuscript.

Corresponding author

Correspondence to Liang Huang.

Ethics declarations

Competing interests

The authors have no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Reporting Summary

Transparent Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Tang, W.Y., Dai, N., Zhou, T. et al. SamplingDesign: RNA design via continuous optimization with coupled variables and Monte-Carlo sampling. Nat Commun (2026). https://doi.org/10.1038/s41467-025-67901-3

  • Received: 18 May 2025

  • Accepted: 11 December 2025

  • Published: 20 February 2026

  • DOI: https://doi.org/10.1038/s41467-025-67901-3

Nature Communications (Nat Commun)

ISSN 2041-1723 (online)

© 2026 Springer Nature Limited
