Abstract
RNA structure heterogeneity is a major challenge when querying RNA structures with chemical probing. We introduce DRACO, an algorithm for the deconvolution of coexisting RNA conformations from mutational profiling experiments. Analysis of the SARS-CoV-2 genome using dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq) and DRACO identifies multiple regions that fold into two mutually exclusive conformations, including a conserved structural switch in the 3′ untranslated region. This work may open the way to dissecting the heterogeneity of the RNA structurome.
Data availability
Sequencing data have been deposited to the Gene Expression Omnibus (GEO) database under the accession GSE158052. Additional processed files are available at http://www.incarnatolab.com/datasets/DRACO_Morandi_2021.php. Source data are provided with this paper.
Code availability
The source code of DRACO is freely available from GitHub under the GPLv3 license (https://github.com/dincarnato/draco). A complete list of the software used for data analysis is available from the Nature Research Reporting Summary.
References
Incarnato, D. & Oliviero, S. The RNA epistructurome: uncovering RNA function by studying structure and post-transcriptional modifications. Trends Biotechnol. 35, 318–333 (2017).
Strobel, E. J., Yu, A. M. & Lucks, J. B. High-throughput determination of RNA structures. Nat. Rev. Genet. 19, 615–634 (2018).
Siegfried, N. A., Busan, S., Rice, G. M., Nelson, J. A. E. & Weeks, K. M. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nat. Methods 11, 959–965 (2014).
Zubradt, M. et al. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods 14, 75–82 (2017).
Tomezsko, P. J. et al. Determination of RNA structural diversity and its role in HIV-1 RNA splicing. Nature 582, 438–442 (2020).
Homan, P. J. et al. Single-molecule correlated chemical probing of RNA. Proc. Natl Acad. Sci. USA 111, 13858–13863 (2014).
Zhang, Y. et al. A stress response that monitors and regulates mRNA structure is central to cold shock adaptation. Mol. Cell 70, 274–286.e7 (2018).
Giuliodori, A. M. et al. The cspA mRNA is a thermosensor that modulates translation of the cold-shock protein CspA. Mol. Cell 37, 21–33 (2010).
Manfredonia, I. et al. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements. Nucleic Acids Res. 48, 12436–12452 (2020).
Lan, T. C. T. et al. Structure of the full SARS-CoV-2 RNA genome in infected cells. Preprint at bioRxiv https://doi.org/10.1101/2020.06.29.178343 (2020).
Ziv, O. et al. The short- and long-range RNA-RNA interactome of SARS-CoV-2. Mol. Cell 80, 1067–1077.e5 (2020).
Ziv, O. et al. COMRADES determines in vivo RNA structures and interactions. Nat. Methods 15, 785–788 (2018).
Incarnato, D., Morandi, E., Simon, L. M. & Oliviero, S. RNA Framework: an all-in-one toolkit for the analysis of RNA structures and post-transcriptional modifications. Nucleic Acids Res. 46, e97 (2018).
Simon, L. M. et al. In vivo analysis of influenza A mRNA secondary structures identifies critical regulatory motifs. Nucleic Acids Res. 47, 7003–7017 (2019).
Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2014).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).
Darty, K., Denise, A. & Ponty, Y. VARNA: interactive drawing and editing of the RNA secondary structure. Bioinformatics 25, 1974–1975 (2009).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Pickett, B. E. et al. ViPR: an open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 40, D593–D598 (2012).
Lauber, C. et al. The footprint of genome architecture in the largest genome expansion in RNA viruses. PLoS Pathog. 9, e1003500 (2013).
Rivas, E., Clements, J. & Eddy, S. R. A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs. Nat. Methods 14, 45–48 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Acknowledgements
D.I. was supported by the Dutch Research Council (Netherlands Organisation for Scientific Research, NWO) as part of the research programme NWO Open Competitie ENW-XS (project number OCENW.XS3.044), and by the Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen. S.O. was supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC), grant AIRC IG 2017 Id. 20240 and PRIN 2017. M.J.H. was supported by the Leiden University Fund (LUF), the Bontius Foundation, and donations from the crowdfunding initiative ‘wake up to corona’.
Author information
Contributions
E.M. and D.I. conceived the project; I.M., L.M.S. and F.A. carried out the wet-lab work; M.J.H. carried out SARS-CoV-2 manipulations; E.M. and D.I. designed and implemented the DRACO algorithm; E.M. and D.I. carried out bioinformatics, structure modeling and data analysis; D.I. and S.O. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Methods thanks Walter N. Moss and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Overview of the DRACO algorithm.
By default, a window of a size equal to 90% of the median read length is slid along the transcript, in 5% increments. For each window, a mutation map is generated using only the reads covering the entire window. Bases that are mutated with respect to the reference are assigned a value of 1, while unmutated bases are assigned a value of 0. From this map, a graph is built in which each vertex is a base of the transcript, and the edge connecting two vertices is weighted proportionally to the number of reads in which the two bases have been observed to co-mutate. Starting from the adjacency matrix of the graph, the normalized Laplacian matrix is calculated and used for spectral deconvolution. A null model is derived by repeating the same procedure after shuffling the mutations in the original mutation map. Comparing the distances between consecutive eigenvalues (eigengaps) for the experimental data with those of the null model identifies the number of informative eigengaps, which corresponds to the number of coexisting RNA conformations (clusters). Once the number of clusters has been defined, fuzzy clustering is performed using a custom graph cut approach that weights vertices according to their affinity to each cluster. This analysis is repeated across the whole transcript. Consecutive windows showing a compatible number of clusters are merged. Then, reads are reassigned to their respective clusters, allowing the deconvolution of the cluster reactivity profiles and relative abundances.
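The graph construction and spectral steps in this legend can be sketched as follows. This is a minimal illustration in Python with NumPy, not DRACO's actual implementation (the source code is available from the GitHub repository listed under Code availability). The function names, the use of raw co-mutation counts as edge weights, and the within-read shuffling used for the null model are all assumptions made for the example; the legend does not specify these details.

```python
import numpy as np

def comutation_adjacency(mut_map):
    """Build a base-by-base adjacency matrix from a binary mutation map.

    mut_map has one row per read and one column per base of the window
    (1 = mutated, 0 = not mutated). Entry (i, j) of the result counts the
    reads in which bases i and j are both mutated (assumed edge weight).
    """
    m = np.asarray(mut_map, dtype=float)
    adj = m.T @ m
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian: L = I - D^(-1/2) A D^(-1/2)."""
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    # Isolated vertices get a zero row and column.
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def eigengaps(mut_map):
    """Distances between consecutive (ascending) Laplacian eigenvalues."""
    lap = normalized_laplacian(comutation_adjacency(mut_map))
    return np.diff(np.linalg.eigvalsh(lap))

def shuffled_map(mut_map, rng):
    """Hypothetical null model: permute the mutations within each read."""
    m = np.asarray(mut_map)
    return np.stack([rng.permutation(row) for row in m])

# Toy window: two reads that mutate disjoint pairs of bases.
toy = np.array([[1, 1, 0, 0],
                [0, 0, 1, 1]])
gaps = eigengaps(toy)
null_gaps = eigengaps(shuffled_map(toy, np.random.default_rng(0)))
```

In this toy window the graph has two disconnected components, so the Laplacian has two zero eigenvalues and the only nonzero eigengap is the second one. How the experimental eigengaps are compared with those of the null model to call the number of clusters is not described in the legend and is left out of the sketch.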
Extended Data Fig. 2 In silico validation of DRACO.
a, Maximum number of conformations detected for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 1 to 4 conformations, at a coverage of 5,000X and a read length of 150 nt. Data are presented as mean values ± SD of the 10 sets. The individual data points, representing the mean of each set, are shown. b, Box-plot of median Pearson correlation coefficients (PCC) of reconstructed reactivity profiles for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 1 to 4 conformations, at a coverage of 5,000X and a read length of 150 nt. When DRACO detected more than one window with different numbers of clusters, only the largest window, spanning >50% of the RNA length, was considered. Boxes span the 25th to the 75th percentile. The center represents the median. Whiskers span from the 25th percentile minus 1.5 times the interquartile range (IQR) to the 75th percentile plus 1.5 times the IQR. Data points falling outside of this range represent outliers and are reported as dots. c, Violin plot depicting the distribution of expected versus reconstructed conformation abundances for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 2 conformations with varying relative abundances, at a coverage of 5,000X and with a read length of 150 nt. When multiple windows were detected, only the largest window was considered. The Pearson correlation is indicated in the bottom-right corner of each plot. Whiskers span the 25th to the 75th percentile. The central dot represents the median.
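The box-plot conventions described for panel b can be made concrete with a short calculation. The following Python sketch is only an illustration: the function name box_stats and the example values are hypothetical, and NumPy's default linear interpolation of percentiles is assumed, which may differ from the plotting software used for the figure.

```python
import numpy as np

def box_stats(values):
    """Quartiles, whisker limits at 1.5 x IQR, and outliers for one box."""
    v = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr  # lower whisker limit
    upper = q3 + 1.5 * iqr  # upper whisker limit
    outliers = v[(v < lower) | (v > upper)]
    return {"q1": q1, "median": median, "q3": q3,
            "lower": lower, "upper": upper, "outliers": outliers}

# Hypothetical median PCC values for 10 simulated sets.
example = box_stats([0.97, 0.96, 0.98, 0.95, 0.97, 0.99, 0.96, 0.98, 0.97, 0.70])
```

Values falling outside the interval between the two limits are the ones that would be drawn as individual dots.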
Extended Data Fig. 3 Validation of DRACO on in silico-merged in vitro-generated profiles.
a, DRACO-deconvoluted profiles for cspA RNA folded and probed in vitro at either 37 °C or 10 °C (from Zhang et al., 2018), pooled at different percentages. The percentage of pooling is indicated next to each reconstructed profile. b, Heatmap of Pearson correlation coefficient for DRACO-deconvoluted profiles for each pool, compared to the expected profiles at 37 °C or 10 °C.
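The comparison summarized in the heatmap of panel b amounts to a matrix of Pearson correlation coefficients between each deconvoluted profile and each expected profile. A minimal Python sketch follows; the function names are hypothetical, and skipping positions with missing reactivity (NaN) in either profile is an assumption rather than a documented step of the analysis.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation over positions where both profiles have a value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    keep = ~(np.isnan(a) | np.isnan(b))  # assumed handling of missing bases
    return float(np.corrcoef(a[keep], b[keep])[0, 1])

def pcc_matrix(deconvoluted, expected):
    """Rows: deconvoluted profiles; columns: expected profiles."""
    return np.array([[pearson(d, e) for e in expected] for d in deconvoluted])

# Toy reactivity profiles (arbitrary numbers, for illustration only).
expected = [[0.1, 0.9, 0.2, np.nan, 0.8],
            [0.9, 0.1, 0.8, np.nan, 0.2]]
deconvoluted = [[0.2, 1.8, 0.4, 0.5, 1.6],
                [0.8, 0.2, 0.9, 0.5, 0.1]]
matrix = pcc_matrix(deconvoluted, expected)
```

Each row of the resulting matrix corresponds to one reconstructed conformation, so a well-resolved pool would show one high value per row.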
Supplementary information
Supplementary Information
Supplementary Figs. 1–19 and Notes 1 and 2.
Source data
Source Data Fig. 1
Maximum number of detected conformations, Pearson correlations for reconstructed profiles and estimated conformation abundances for simulated data
Source Data Fig. 2
Normalized reactivity values for DRACO-deconvoluted cspA and add conformations
Source Data Fig. 3
Relative abundance and normalized reactivity values for DRACO-deconvoluted SARS-CoV-2 3′ UTR conformations
About this article
Cite this article
Morandi, E., Manfredonia, I., Simon, L.M. et al. Genome-scale deconvolution of RNA structure ensembles. Nat Methods 18, 249–252 (2021). https://doi.org/10.1038/s41592-021-01075-w
DOI: https://doi.org/10.1038/s41592-021-01075-w