Abstract
Despite the popularity of computer-aided study and design of RNA molecules, little is known about the accuracy of commonly used structure modeling packages in tasks sensitive to ensemble properties of RNA. Here, we demonstrate that the EternaBench dataset, a set of more than 20,000 synthetic RNA constructs designed on the RNA design platform Eterna, provides incisive discriminative power in evaluating current packages in ensemble-oriented structure prediction tasks. We find that CONTRAfold and RNAsoft, packages with parameters derived through statistical learning, achieve consistently higher accuracy than more widely used packages in their standard settings, which derive parameters primarily from thermodynamic experiments. We hypothesized that training a multitask model with the varied data types in EternaBench might improve inference on ensemble-based prediction tasks. Indeed, the resulting model, named EternaFold, demonstrated improved performance that generalizes to diverse external datasets including complete messenger RNAs, viral genomes probed in human cells and synthetic designs modeling mRNA vaccines.
Data availability
All datasets used here for evaluation are available at https://www.github.com/eternagame/EternaBench. The original Cloud Lab datasets are available at the RNA Mapping Database (ref. 28) under accession IDs ETERNA_R00_0000 (round 00), ETERNA_R69_0000 (round 01), ETERNA_R70_0000 (round 02), ETERNA_R71_0000 (round 03), ETERNA_R72_0000 (round 04), ETERNA_R73_0000 (round 05), ETERNA_R74_0000 (round 06), ETERNA_R75_0000 (round 07), ETERNA_R76_0000 (round 08), ETERNA_R77_0002 (round 09), ETERNA_R78_0001 (round 10), ETERNA_R79_0001 (round 11), ETERNA_R80_0001 (round 12), ETERNA_R81_0001 (round 13), ETERNA_R82_0001 (round 14), ETERNA_R83_0003 (round 15), ETERNA_R84_0000 (round 16), ETERNA_R85_0000 (round 17), ETERNA_R86_0000 (round 18), ETERNA_R87_0001 (round 19), ETERNA_R89_0000 (round 20), ETERNA_R91_0000 (round 21), ETERNA_R92_0000 (round 22) and ETERNA_R94_0000 (round 23). A list of RMDB accession IDs or URLs corresponding to the data used for benchmarking SHAPE-guided folding is in Supplementary Table 12. Source data are provided with this paper.
Code availability
The datasets used here for evaluation, as well as scripts and Python notebooks for reproducing the filtered datasets and the chemical mapping and riboswitch affinity calculations described here, are available at https://www.github.com/eternagame/EternaBench. The code for training EternaFold is available at https://www.github.com/eternagame/EternaFold. A server to run EternaFold is available at https://eternafold.eternagame.org/. The EternaFold code is derived from the CONTRAfold-SE (ref. 36) codebase, which is derived from the CONTRAfold (ref. 11) codebase.
References
Amaral, P. P., Dinger, M. E., Mercer, T. R. & Mattick, J. S. The eukaryotic genome as an RNA machine. Science 319, 1787–1789 (2008).
Singh, V., Braddick, D. & Dhar, P. K. Exploring the potential of genome editing CRISPR-Cas9 technology. Gene 599, 1–18 (2017).
Jaffrey, S. R. RNA-based fluorescent biosensors for detecting metabolites in vitro and in living cells. Adv. Pharm. 82, 187–203 (2018).
Kramps, T. & Elbers, K. Introduction to RNA Vaccines. In: Kramps, T., Elbers, K. (eds) RNA Vaccines. Methods Mol. Biol. Vol. 1499, 1–11 (2017).
Zuker, M. & Stiegler, P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9, 133–148 (1981).
Lorenz, R. et al. ViennaRNA package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
Zadeh, J. N. et al. NUPACK: analysis and design of nucleic acid systems. J. Comput. Chem. 32, 170–173 (2011).
Reuter, J. S. & Mathews, D. H. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinf. 11, 129 (2010).
Xia, T. et al. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry 37, 14719–14735 (1998).
Andronescu, M., Condon, A., Hoos, H. H., Mathews, D. H. & Murphy, K. P. Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics 23, i19–i28 (2007).
Do, C. B., Woods, D. A. & Batzoglou, S. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics 22, e90–e98 (2006).
Sloma, M. F. & Mathews, D. H. Base pair probability estimates improve the prediction accuracy of RNA non-canonical base pairs. PLoS Comput. Biol. 13, e1005827 (2017).
Rezaur Rahman Chowdhury, F.A., Zhang, H. & Huang, L. Learning to fold RNAs in linear time. Preprint at bioRxiv, 852871 (2019).
Akiyama, M., Sato, K. & Sakakibara, Y. A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model. J. Bioinform Comput Biol. 16, 1840025 (2018).
Singh, J., Hanson, J., Paliwal, K. & Zhou, Y. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat. Commun. 10, 5407 (2019).
Puton, T., Kozlowski, L. P., Rother, K. M. & Bujnicki, J. M. CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction. Nucleic Acids Res. 41, 4307–4323 (2013).
Wayment-Steele, H., Wu, M., Gotrik, M. & Das, R. Evaluating riboswitch optimality. Methods Enzymol. 623, 417–450 (2019).
Berens, C. & Suess, B. Riboswitch engineering–making the all-important second and third steps. Curr. Opin. Biotechnol. 31, 10–15 (2015).
Mauger, D. M. et al. mRNA structure regulates protein expression through changes in functional half-life. Proc. Natl Acad. Sci. USA 116, 24075–24083 (2019).
Watters, K. E. & Lucks, J. B. Mapping RNA structure in vitro with SHAPE chemistry and next-generation sequencing (SHAPE-Seq). Methods Mol. Biol. 1490, 135–162 (2016).
Wilkinson, K. A., Merino, E. J. & Weeks, K. M. Selective 2’-hydroxyl acylation analyzed by primer extension (SHAPE): quantitative RNA structure analysis at single nucleotide resolution. Nat. Protoc. 1, 1610–1616 (2006).
Tian, S. & Das, R. RNA structure through multidimensional chemical mapping. Q. Rev. Biophys. 49, e7 (2016).
Denny, S. K. et al. High-throughput investigation of diverse junction elements in RNA tertiary folding. Cell 174, 377–390 e320 (2018).
Buenrostro, J. D. et al. Quantitative analysis of RNA-protein interactions on a massively parallel array reveals biophysical and evolutionary landscapes. Nat. Biotechnol. 32, 562–568 (2014).
Lee, J. et al. RNA design rules from a massive open laboratory. Proc. Natl Acad. Sci. USA 111, 2122–2127 (2014).
Delli Ponti, R., Marti, S., Armaos, A. & Tartaglia, G. G. A high-throughput approach to profile RNA structure. Nucleic Acids Res. 45, e35 (2017).
Eddy, S. R. Analysis of conserved RNA secondary structure in transcriptomes and genomes. Annu. Rev. Biophys. 43, 433–456 (2014).
Cordero, P., Lucks, J. B. & Das, R. An RNA mapping database for curating RNA structure mapping experiments. Bioinformatics 28, 3006–3008 (2012).
Wellington-Oguri, R. et al. Evidence of an unusual Poly(A) RNA signature detected by high-throughput chemical mapping. Biochemistry 59, 2041–2046 (2020).
Anderson-Lee, J. et al. Principles for predicting RNA secondary structure design difficulty. J. Mol. Biol. 428, 748–757 (2016).
Beisel, C. L. & Smolke, C. D. Design principles for riboswitch function. PLoS Comput. Biol. 5, e1000363 (2009).
Breaker, R. R. Prospects for riboswitch discovery and analysis. Mol. Cell 43, 867–879 (2011).
Andreasson, J. O. L. et al. Crowdsourced RNA design discovers diverse, reversible, efficient, self-contained molecular switches. Proc. Natl Acad. Sci. USA 119, e2112979119 (2022).
Wu, M. J., Andreasson, J. O. L., Kladwang, W., Greenleaf, W. & Das, R. Automated design of diverse stand-alone riboswitches. ACS Synth. Biol. 8, 1838–1846 (2019).
Andronescu, M., Condon, A., Hoos, H. H., Mathews, D. H. & Murphy, K. P. Computational approaches for RNA energy parameter estimation. RNA 16, 2304–2318 (2010).
Foo, C.-S. & Pop, C. Learning RNA secondary structure (only) from structure probing data. Preprint at bioRxiv, 152629 (2017).
Andronescu, M., Bereg, V., Hoos, H. H. & Condon, A. RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinf. 9, 340 (2008).
Sloma, M. F. & Mathews, D. H. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures. RNA 22, 1808–1818 (2016).
Watters, K. E. et al. Probing of RNA structures in a positive sense RNA virus reveals selection pressures for structural elements. Nucleic Acids Res. 46, 2573–2584 (2018).
Watts, J. M. et al. Architecture and secondary structure of an entire HIV-1 RNA genome. Nature 460, 711–716 (2009).
Kutchko, K. M. et al. Structural divergence creates new functional features in alphavirus genomes. Nucleic Acids Res. 46, 3657–3670 (2018).
Siegfried, N. A., Busan, S., Rice, G. M., Nelson, J. A. & Weeks, K. M. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nat. Methods 11, 959–965 (2014).
Dadonaite, B. et al. The structure of the influenza A virus genome. Nat. Microbiol 4, 1781–1789 (2019).
Simon, L. M. et al. In vivo analysis of influenza A mRNA secondary structures identifies critical regulatory motifs. Nucleic Acids Res. 47, 7003–7017 (2019).
Huber, R. G. et al. Structure mapping of dengue and Zika viruses reveals functional long-range interactions. Nat. Commun. 10, 1408 (2019).
Huston, N. C. et al. Comprehensive in vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms. Mol. Cell 81, 584–598 e585 (2021).
Manfredonia, I. et al. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements. Nucleic Acids Res. 48, 12436–12452 (2020).
Sun, L. et al. In vivo structural characterization of the SARS-CoV-2 RNA genome identifies host proteins vulnerable to repurposed drugs. Cell 184, 1865–1883 e1820 (2021).
Lavender, C. A., Gorelick, R. J. & Weeks, K. M. Structure-based alignment and consensus secondary structures for three HIV-related RNA genomes. PLoS Comput. Biol. 11, e1004230 (2015).
Deigan, K. E., Li, T. W., Mathews, D. H. & Weeks, K. M. Accurate SHAPE-directed RNA structure determination. Proc. Natl Acad. Sci. USA 106, 97–102 (2009).
McGinnis, J. L. & Weeks, K. M. Ribosome RNA assembly intermediates visualized in living cells. Biochemistry 53, 3237–3247 (2014).
Leppek, K. et al. Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics. Nat. Commun. 13, 1536 (2022).
Sun, L. et al. RNA structure maps across mammalian cellular compartments. Nat. Struct. Mol. Biol. 26, 322–330 (2019).
Becker, W. R. et al. Quantitative high-throughput tests of ubiquitous RNA secondary structure prediction algorithms via RNA/protein binding. Preprint at bioRxiv, 571588 (2019).
Rouskin, S., Zubradt, M., Washietl, S., Kellis, M. & Weissman, J. S. Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505, 701–705 (2014).
Morandi, E. et al. Genome-scale deconvolution of RNA structure ensembles. Nat. Methods 18, 249–252 (2021).
Hajdin, C. E. et al. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots. Proc. Natl Acad. Sci. USA 110, 5498–5503 (2013).
Zarringhalam, K., Meyer, M. M., Dotu, I., Chuang, J. H. & Clote, P. Integrating chemical footprinting data into RNA secondary structure prediction. PLoS ONE 7, e45160 (2012).
Sato, K., Akiyama, M. & Sakakibara, Y. RNA secondary structure prediction using deep learning with thermodynamic integration. Nat. Commun. 12, 941 (2021).
Chen, X., Li, Y., Umarov, R., Gao, X. & Song, L. RNA secondary structure prediction by learning unrolled algorithms. In Proceedings of the 8th International Conference on Learning Representations (2020).
Ward, M., Datta, A., Wise, M. & Mathews, D. H. Advanced multi-loop algorithms for RNA secondary structure prediction reveal that the simplest model is best. Nucleic Acids Res. 45, 8541–8550 (2017).
Zhao, B. S., Roundtree, I. A. & He, C. Post-transcriptional gene regulation by mRNA modifications. Nat. Rev. Mol. Cell Biol. 18, 31–42 (2017).
Rinnenthal, J. et al. Mapping the landscape of RNA dynamics with NMR spectroscopy. Acc. Chem. Res. 44, 1292–1301 (2011).
Kappel, K. et al. Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures. Nat. Methods 17, 699–707 (2020).
McCaskill, J. S. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29, 1105–1119 (1990).
Washietl, S., Hofacker, I. L., Stadler, P. F. & Kellis, M. RNA folding with soft constraints: reconciliation of probing data and thermodynamic secondary structure prediction. Nucleic Acids Res. 40, 4261–4272 (2012).
Deng, F., Ledda, M., Vaziri, S. & Aviran, S. Data-directed RNA secondary structure prediction using probabilistic modeling. RNA 22, 1109–1119 (2016).
Cordero, P. & Das, R. Rich RNA structure landscapes revealed by mutate-and-map analysis. PLoS Comput. Biol. 11, e1004473 (2015).
Xu, Y. et al. Hoogsteen base pairs increase the susceptibility of double-stranded DNA to cytotoxic damage. J. Biol. Chem. 295, 15933–15947 (2020).
Kladwang, W. et al. Standardization of RNA chemical mapping experiments. Biochemistry 53, 3063–3065 (2014).
Seetin, M. G., Kladwang, W., Bida, J. P. & Das, R. Massively parallel RNA chemical mapping with a reduced bias MAP-seq protocol. Methods Mol. Biol. 1086, 95–117 (2014).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Kladwang, W. et al. Anomalous reverse transcription through chemical modifications in polyadenosine stretches. Biochemistry 59, 2154–2170 (2020).
Zhang, H., Zhang, L., Mathews, D. H. & Huang, L. LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities. Bioinformatics 36, i258–i267 (2020).
Zou, G. Y. Toward using confidence intervals to compare correlations. Psychol. Methods 12, 399–413 (2007).
Diedenhofen, B. & Musch, J. cocor: a comprehensive solution for the statistical comparison of correlations. PLoS ONE 10, e0121945 (2015).
Acknowledgements
We thank members of the Das and Barna laboratories (Stanford University), C. Pop and C.-S. Foo for useful discussions. We thank I. Jarmoskaite, V.V. Topkar, R. Rangan and J. Townley for helpful comments on the manuscript. Calculations and model training were performed on the Stanford Sherlock cluster. We acknowledge funding from the National Science Foundation (GRFP to H.K.W.S.), the National Institutes of Health (grant no. R35 GM122579 to R.D.) and gifts to the Eterna OpenVaccine project from donors listed in Supplementary Table 13.
Author information
Contributions
H.K.W.S. and R.D. designed the EternaBench benchmark approach and EternaFold multitask training method. H.K.W.S. prepared the EternaBench datasets, performed analyses and implemented and trained the EternaFold model. H.K.W.S. and R.D. wrote the manuscript. W.K. designed methods, acquired data for high-throughput chemical mapping experiments and reviewed the manuscript. A.I.S. performed data analyses and visualizations. W.K., J.L., A.T. and R.D. designed and implemented the Eterna Cloud Lab initiative. A.B. generated SHAPE and DMS data for RNAs of known structure used in SHAPE-directed folding benchmarking. Eterna participants created online design projects, provided RNA solutions and reviewed the manuscript (Supplementary Table 3).
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Hashim Al-Hashimi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Extended analysis of package rankings based on Eterna Cloud Lab chemical mapping data.
a) Pearson correlation of all package options tested on Cloud Lab Round 1, which was also a holdout test set for EternaFold training studies. Mean ± SEM of Pearson correlation calculated via bootstrapping, n = 1088 independent constructs. b) ViennaRNA 2, NUPACK 1999, and RNAstructure show maximum Pearson correlation to chemical mapping data at 60 °C, 40 °C, and 60 °C respectively for Eterna Cloud Lab Round 1. Mean ± SEM of Pearson correlation calculated via bootstrapping, n = 1088 independent constructs. c) Ranking across Cloud Lab dataset rounds using Spearman rank correlation (compare to Fig. 1e, f). Error bars represent 95% confidence interval of the mean obtained over 1000 iterations of bootstrapping over 24 independent experiments, n = 12,711 independent constructs total. d) (Top) Mean Pearson correlations, calculated over each project (as opposed to each dataset), compared to sequence metrics of the Cloud Lab projects. Of the metrics compared, signal-to-noise ratio correlated most strongly with the mean Pearson correlation. (Bottom) Z-score of CONTRAfold-2, calculated over each project, compared to sequence metrics of the Cloud Lab projects.
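The "mean ± SEM of Pearson correlation calculated via bootstrapping" reported in panels a and b can be illustrated with a minimal sketch. It assumes paired lists of measured reactivities and predicted p(unpaired) values and resamples those pairs with replacement; the function names, the resampling unit and the default of 1000 iterations are illustrative assumptions, not the EternaBench implementation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def bootstrap_pearson(x, y, n_boot=1000, seed=0):
    """Return (mean, standard error) of Pearson r over bootstrap resamples.

    Pairs (x[i], y[i]) are resampled with replacement (an assumption made
    for this sketch). The standard deviation of the resampled correlations
    is the bootstrap estimate of the standard error.
    """
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    mean = sum(stats) / n_boot
    sem = math.sqrt(sum((s - mean) ** 2 for s in stats) / (n_boot - 1))
    return mean, sem
```

Calling `bootstrap_pearson(reactivity, p_unpaired)` on one dataset would give one bar and its error bar in a plot of this kind; fixing `seed` keeps the estimate reproducible.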
Extended Data Fig. 2 Example chemical mapping predictions from all package options tested.
Example heatmaps of all package options tested for the ‘Aires’ project (compare to Fig. 1c).
Extended Data Fig. 3 Summary statistics for EternaBench datasets before and after performing CD-HIT filtering.
a) Distributions of sequence properties for chemical mapping data (n = 38,846 before filtering and n = 12,711 independent constructs after filtering, collected across 24 experiments), and b) riboswitch constructs (n = 19,016 independent constructs before filtering and n = 7,228 independent constructs after filtering, collected in 12 experiments). Dataset statistics of EternaBench train and test experimental rounds for (c) Chemical Mapping (Train set: n = 3,476 independent constructs collected over 6 experiments. Test set: n = 1,492 independent constructs collected over 18 experiments) and (d) Riboswitch data (Train set: n = 2,508 independent constructs collected over 3 experiments. Test set: n = 4,018 independent constructs collected over 9 experiments). For all subplots: center dot, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range.
Extended Data Fig. 4 Overview of all Cloud Labs data.
Example reactivity and p(unpaired) heatmaps from example packages for all 24 Cloud Lab rounds. Data have been filtered to exclude nucleotides with reactivity equal to zero or less.
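The filtering step described above (excluding nucleotides with reactivity equal to zero or less) can be sketched as follows. The function name and the paired-list layout are assumptions made for illustration only.

```python
def drop_nonpositive(reactivity, p_unpaired):
    """Keep only positions whose measured reactivity is strictly positive.

    Both lists are indexed by nucleotide position; the matching p(unpaired)
    value is dropped together with each excluded reactivity.
    """
    if len(reactivity) != len(p_unpaired):
        raise ValueError("inputs must have the same length")
    kept = [(r, p) for r, p in zip(reactivity, p_unpaired) if r > 0]
    return [r for r, _ in kept], [p for _, p in kept]
```

Filtering both lists together keeps each reactivity aligned with its prediction, so the remaining values can be compared position by position.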
Extended Data Fig. 5 Extended analysis of package rankings based on riboswitch activity predictions.
a) Example set of states for a riboswitch that toggles binding of the fluorescent MS2 protein as an output, controlled by binding the small molecule FMN. The equilibrium constant for forming the MS2 aptamer in the absence of ligand, \(K_{MS2}^{ - lig}\), is estimated using the probability of forming the closing base pair for all packages. b) Riboswitch Z-scores stratified by input ligand type. Error bars represent standard error on Z-score as calculated by bootstrapping from 6402, 440, and 386 constructs collected over 8, 2, and 2 experiments, respectively. c) Overall ranking of \(K_{MS2}^{ - lig}\) calculations using the Spearman correlation (no linear assumption, compare to Fig. 2b). Evaluating the Pearson correlation of package calculations for (d) \(K_{MS2}^{ + lig}\) as well as (e) riboswitch activation ratio results in a similar ranking. In c, d and e, error bars represent 95% confidence interval of the mean obtained over 1000 iterations of bootstrapping across datasets, n = 7,228 independent constructs collected over 12 experiments.
Extended Data Fig. 6 Example riboswitch predictions from all package options tested.
Scatterplots for all options tested for Ribologic dataset. Black solid line indicates line of best fit.
Extended Data Fig. 7 Example riboswitch predictions across all datasets.
Scatterplots for representative packages on all riboswitch datasets. Black solid line indicates line of best fit.
Extended Data Fig. 8 Effect of window size and Levenshtein distance filtering for independent chemical mapping test set.
a) Calculating p(unpaired) using sliding windows of size 300, 600, and 1200 does not change the overall ranking obtained across datasets (compare to Fig. 4b, which was calculated for window size 900; n = 31 datasets for all). b) Package ranking is also consistent for a redundancy cutoff of 40% (n = 16 datasets included after filtering based on a 40% cutoff by windowed Levenshtein distance). Error bars in a and b represent the 95% confidence interval for the mean Z-score as calculated by bootstrapping across the respective number of datasets.
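Two operations named in this caption, splitting a long sequence into sliding windows and comparing sequences by Levenshtein distance, can be sketched as below. The step size, the handling of the final window and the normalization by the longer sequence length are assumptions for illustration; the caption does not specify them, and the actual procedure is in the EternaBench repository.

```python
def sliding_windows(length, window, step):
    """Return (start, end) index pairs covering a sequence of given length.

    Windows advance by `step`; a final window is added flush with the end
    of the sequence if the regular grid would leave a tail uncovered.
    """
    if length <= window:
        return [(0, length)]
    starts = list(range(0, length - window + 1, step))
    if starts[-1] + window < length:
        starts.append(length - window)
    return [(s, s + window) for s in starts]


def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def normalized_distance(a, b):
    """Levenshtein distance divided by the longer length (0 = identical)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0
```

For example, `sliding_windows(1000, 300, 300)` yields four windows, the last of which is shifted back so that it ends at position 1000. A redundancy filter could then compare window subsequences pairwise with `normalized_distance` and drop those falling on the redundant side of a chosen cutoff.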
Extended Data Fig. 9 Extended data corresponding to EternaFold development and test set evaluation.
a) Comparing Vienna, CONTRAfold, and EternaFold predictions in predicting free energy of PUM binding. i) Replication of ddG_exp for both PUM WT and mutant binding from (Becker et al., 2019). The same calculation in Vienna 2 at 37 °C shows lower root-mean-square error (RMSE) (ii), but higher RMSE at 60 °C (iii). CONTRAfold 2 shows no improvement over Vienna at 37 °C (iv), but EternaFold shows modest improvement over both (v). b) Package performance for the S-Processed test set is qualitatively similar to results on the ArchiveII-NR test set (compare to Fig. 3b). Error bars represent 95% confidence interval of the mean calculated with 1000 iterations of bootstrapping over n = 6 independent datasets, which contain 974 independent constructs total. c) Evaluating SHAPE- and DMS-directed folding. Error bars represent 95% confidence interval of the mean calculated with 1000 iterations of bootstrapping over n = 5 independent datasets of RNAs with known secondary structures, which contain 47 constructs total. d) Potentials learned from EternaFold training and used in SHAPE-directed structure prediction.
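The RMSE used in panel a to compare predicted and experimental binding free energies is the standard root-mean-square error. A minimal sketch, with illustrative argument names that are not taken from the paper's code:

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and measurements."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

A lower value means the predicted free energies lie closer to the measured ones, in the same units as the inputs.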
Extended Data Fig. 10 Extended data corresponding to predicting riboswitch affinity in the presence of small molecule ligands.
a) \(\log K_{MS2}^{ - lig}\) and \(\log K_{MS2}^{ + lig}\) values of riboswitches included in filtered datasets. Black starred datapoint indicates reference value used for \(K_{obs}^{ref}\). b) Estimates for the RiboLogic FMN dataset for \(\log K_{MS2}^{ + lig}\) in all package options able to make estimates with constrained-partition functions.
Supplementary information
Supplementary Table 1
Supplementary Tables 1–14.
Source data
Source Data Fig. 1
Raw source data in reactivity heatmap, raw project analysis source data, raw source P(unpaired) calculations, raw project analysis source data, raw z-scores and significant data across datasets.
Source Data Fig. 2
Raw source Kfold values and calculations, raw z-scores and significant data across datasets.
Source Data Fig. 3
Raw z-scores and significant data across datasets.
Source Data Fig. 4
Raw z-scores and significant data across datasets and raw windowed correlation traces.
Source Data Extended Data Fig. 1
Raw z-scores and significant data across datasets and raw project analysis source data.
Source Data Extended Data Fig. 2
Raw source P(unpaired) calculations.
Source Data Extended Data Fig. 3
EternaBench Chemical mapping summary statistics, EternaBench riboswitch summary statistics, chemical mapping train/test split statistics and riboswitch train/test split statistics.
Source Data Extended Data Fig. 5
Raw source Kfold values and calculations.
Source Data Extended Data Fig. 6
Raw source Kfold values and calculations.
Source Data Extended Data Fig. 7
Raw z-scores and significant data across datasets.
Source Data Extended Data Fig. 8
Raw z-scores and significant data across datasets.
Source Data Extended Data Fig. 9
Pumilio protein sequences and Kfold calculations, Raw STRAND test set structure prediction metrics, SHAPE-directed folding z-scores and significant data across datasets.
Source Data Extended Data Fig. 10
Raw source Kfold values and calculations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wayment-Steele, H.K., Kladwang, W., Strom, A.I. et al. RNA secondary structure packages evaluated and improved by high-throughput experiments. Nat Methods 19, 1234–1242 (2022). https://doi.org/10.1038/s41592-022-01605-0
DOI: https://doi.org/10.1038/s41592-022-01605-0