  • Brief Communication

Genome-scale deconvolution of RNA structure ensembles

Abstract

RNA structure heterogeneity is a major challenge when querying RNA structures with chemical probing. We introduce DRACO, an algorithm for the deconvolution of coexisting RNA conformations from mutational profiling experiments. Analysis of the SARS-CoV-2 genome using dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq) and DRACO identifies multiple regions that fold into two mutually exclusive conformations, including a conserved structural switch in the 3′ untranslated region. This work may open the way to dissecting the heterogeneity of the RNA structurome.

Fig. 1: In silico validation of DRACO.
Fig. 2: In vitro validation of DRACO.
Fig. 3: A conserved structural switch in the 3′ UTR of SARS-CoV-2.

Data availability

Sequencing data have been deposited to the Gene Expression Omnibus (GEO) database under the accession GSE158052. Additional processed files are available at http://www.incarnatolab.com/datasets/DRACO_Morandi_2021.php. Source data are provided with this paper.

Code availability

The source code of DRACO is freely available from GitHub under the GPLv3 license (https://github.com/dincarnato/draco). A complete list of the software used for data analysis is available from the Nature Research Reporting Summary.

References

  1. Incarnato, D. & Oliviero, S. The RNA epistructurome: uncovering RNA function by studying structure and post-transcriptional modifications. Trends Biotechnol. 35, 318–333 (2017).

  2. Strobel, E. J., Yu, A. M. & Lucks, J. B. High-throughput determination of RNA structures. Nat. Rev. Genet. 19, 615–634 (2018).

  3. Siegfried, N. A., Busan, S., Rice, G. M., Nelson, J. A. E. & Weeks, K. M. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nat. Methods 11, 959–965 (2014).

  4. Zubradt, M. et al. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods 14, 75–82 (2017).

  5. Tomezsko, P. J. et al. Determination of RNA structural diversity and its role in HIV-1 RNA splicing. Nature 582, 438–442 (2020).

  6. Homan, P. J. et al. Single-molecule correlated chemical probing of RNA. Proc. Natl Acad. Sci. USA 111, 13858–13863 (2014).

  7. Zhang, Y. et al. A stress response that monitors and regulates mRNA structure is central to cold shock adaptation. Mol. Cell 70, 274–286.e7 (2018).

  8. Giuliodori, A. M. et al. The cspA mRNA is a thermosensor that modulates translation of the cold-shock protein CspA. Mol. Cell 37, 21–33 (2010).

  9. Manfredonia, I. et al. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements. Nucleic Acids Res. 48, 12436–12452 (2020).

  10. Lan, T. C. T. et al. Structure of the full SARS-CoV-2 RNA genome in infected cells. Preprint at bioRxiv https://doi.org/10.1101/2020.06.29.178343 (2020).

  11. Ziv, O. et al. The short- and long-range RNA-RNA interactome of SARS-CoV-2. Mol. Cell 80, 1067–1077.e5 (2020).

  12. Ziv, O. et al. COMRADES determines in vivo RNA structures and interactions. Nat. Methods 15, 785–788 (2018).

  13. Incarnato, D., Morandi, E., Simon, L. M. & Oliviero, S. RNA Framework: an all-in-one toolkit for the analysis of RNA structures and post-transcriptional modifications. Nucleic Acids Res. 46, e97 (2018).

  14. Simon, L. M. et al. In vivo analysis of influenza A mRNA secondary structures identifies critical regulatory motifs. Nucleic Acids Res. 47, 7003–7017 (2019).

  15. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  16. Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2014).

  17. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).

  18. Darty, K., Denise, A. & Ponty, Y. VARNA: interactive drawing and editing of the RNA secondary structure. Bioinformatics 25, 1974–1975 (2009).

  19. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  20. Pickett, B. E. et al. ViPR: an open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 40, D593–D598 (2012).

  21. Lauber, C. et al. The footprint of genome architecture in the largest genome expansion in RNA viruses. PLoS Pathog. 9, e1003500 (2013).

  22. Rivas, E., Clements, J. & Eddy, S. R. A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs. Nat. Methods 14, 45–48 (2017).

  23. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

Acknowledgements

D.I. was supported by the Dutch Research Council (Netherlands Organisation for Scientific Research, NWO) as part of the research programme NWO Open Competitie ENW-XS (project number OCENW.XS3.044), and by the Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen. S.O. was supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC), grant AIRC IG 2017 Id. 20240 and PRIN 2017. M.J.H. was supported by the Leiden University Fund (LUF), the Bontius Foundation, and donations from the crowdfunding initiative ‘wake up to corona’.

Author information

Contributions

E.M. and D.I. conceived the project; I.M., L.M.S. and F.A. carried out the wet-lab work; M.J.H. carried out SARS-CoV-2 manipulations; E.M. and D.I. designed and implemented the DRACO algorithm; E.M. and D.I. carried out bioinformatics, structure modeling and data analysis; D.I. and S.O. wrote the manuscript.

Corresponding authors

Correspondence to Salvatore Oliviero or Danny Incarnato.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Methods thanks Walter N. Moss and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overview of the DRACO algorithm.

By default, a window of a size equal to 90% of the median read length is slid along the transcript, in 5% increments. For each window, a mutation map is generated using only the reads covering the entire window. Bases that are mutated with respect to the reference are assigned a value of 1, while unmutated bases are assigned a value of 0. From this map, a graph is built in which each vertex is a base of the transcript, and each edge connecting two vertices is weighted proportionally to the number of reads in which the two connected bases have been observed to co-mutate. Starting from the adjacency matrix of the graph, the normalized Laplacian matrix is calculated and used for spectral deconvolution. A null model is derived by repeating the same procedure after shuffling the mutations in the original mutation map. Comparing the distances between consecutive eigenvalues (eigengaps) for the experimental data with those of the null model identifies the number of informative eigengaps, which corresponds to the number of coexisting RNA conformations (clusters). Once the number of clusters has been defined, fuzzy clustering is performed using a custom graph cut approach that weights vertices in accordance with their affinity to each cluster. This analysis is repeated across the whole transcript. Consecutive windows showing a compatible number of clusters are merged. Then, reads are re-assigned to the respective cluster, allowing the deconvolution of the cluster reactivity profiles and relative abundances.
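The graph construction and eigengap comparison described above can be illustrated with a minimal Python sketch. This is not DRACO's implementation: the function names, the toy data, the per-read shuffling scheme used for the null model, and the use of the largest leading eigengap as a stand-in for the comparison against the null model are all assumptions made for illustration. The fuzzy graph cut step is not covered.

```python
import numpy as np


def comutation_adjacency(mutation_map):
    """Entry (i, j) counts the reads in which bases i and j are both mutated."""
    m = np.asarray(mutation_map, dtype=float)
    adjacency = m.T @ m
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    return adjacency


def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian, L = I - D^(-1/2) A D^(-1/2)."""
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0  # bases that never co-mutate keep a diagonal of 1
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    scaled = inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    return np.eye(len(degree)) - scaled


def leading_eigengaps(mutation_map, n_gaps=4):
    """Distances between the smallest consecutive Laplacian eigenvalues."""
    laplacian = normalized_laplacian(comutation_adjacency(mutation_map))
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    return np.diff(eigenvalues[: n_gaps + 1])


def shuffle_within_reads(mutation_map, rng):
    """Assumed null model: permute mutation positions independently per read."""
    return np.array([rng.permutation(read) for read in np.asarray(mutation_map)])


rng = np.random.default_rng(0)
n_reads, n_bases = 500, 20
# Toy data with an exaggerated mutation rate: conformation A mutates only
# bases 0-9, conformation B mutates only bases 10-19.
reads_a = np.zeros((n_reads, n_bases), dtype=int)
reads_a[:, :10] = rng.random((n_reads, 10)) < 0.3
reads_b = np.zeros((n_reads, n_bases), dtype=int)
reads_b[:, 10:] = rng.random((n_reads, 10)) < 0.3
mutation_map = np.vstack([reads_a, reads_b])

observed_gaps = leading_eigengaps(mutation_map)
null_gaps = leading_eigengaps(shuffle_within_reads(mutation_map, rng))

# Simplification: take the position of the largest leading eigengap as the
# cluster count, instead of a statistical comparison with the null model.
n_observed = int(np.argmax(observed_gaps)) + 1
n_null = int(np.argmax(null_gaps)) + 1
print(n_observed, n_null)
```

In this toy case the two sets of reads never share a mutated base, so the co-mutation graph splits into two components and the informative gap sits after the second eigenvalue. Shuffling destroys that block structure, leaving a single component in the null model.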

Extended Data Fig. 2 In silico validation of DRACO.

a, Maximum number of conformations detected for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 1 to 4 conformations, at a coverage of 5,000X and a read length of 150 nt. Data are presented as mean values ± SD of the 10 sets. The individual data points, representing the mean of each set, are shown. b, Box-plot of median Pearson correlation coefficients (PCC) of reconstructed reactivity profiles for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 1 to 4 conformations, at a coverage of 5,000X and a read length of 150 nt. When DRACO detected more than one window with different numbers of clusters, only the largest window, spanning >50% of the RNA length, was considered. Boxes span the 25th to the 75th percentile. The center represents the median. Whiskers span from 1.5 times the IQR below the 25th percentile to 1.5 times the IQR above the 75th percentile. Data points falling outside of this range represent outliers and are reported as dots. c, Violin plot depicting the distribution of expected versus reconstructed conformation abundances for 10 sets of 100 simulated RNAs, with length ranging from 600 to 1,500 nt, expected to form 2 conformations with varying relative abundances, at a coverage of 5,000X and with a read length of 150 nt. When multiple windows were detected, only the largest window was considered. The Pearson correlation is indicated in the bottom-right corner of each plot. Whiskers span the 25th to the 75th percentile. The central dot represents the median.
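The two statistics used in panel b, the Pearson correlation between an expected and a reconstructed profile and the whisker limits of the box-plot, can be sketched as follows. The function names are hypothetical, and the percentile interpolation is NumPy's default, which may differ from the plotting software the authors used.

```python
import numpy as np


def pearson(expected, reconstructed):
    """Pearson correlation coefficient between two reactivity profiles."""
    return float(np.corrcoef(expected, reconstructed)[0, 1])


def whisker_limits(values):
    """Limits at 1.5 x IQR below the 25th and above the 75th percentile."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return float(q1 - 1.5 * iqr), float(q3 + 1.5 * iqr)


def outliers(values):
    """Values falling outside the whisker limits, reported as dots in the plot."""
    low, high = whisker_limits(values)
    return [v for v in values if v < low or v > high]


print(whisker_limits(list(range(1, 10))))
print(outliers(list(range(1, 10)) + [20]))
```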

Extended Data Fig. 3 Validation of DRACO on in silico-merged in vitro-generated profiles.

a, DRACO-deconvoluted profiles for cspA RNA folded and probed in vitro at either 37 °C or 10 °C (from Zhang et al., 2018), pooled at different percentages. The percentage of pooling is indicated next to each reconstructed profile. b, Heatmap of Pearson correlation coefficient for DRACO-deconvoluted profiles for each pool, compared to the expected profiles at 37 °C or 10 °C.
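Pooling reads from two experiments at a chosen percentage, as in panel a, can be sketched as below. The caption does not describe how the pooling was performed, so the sampling scheme (drawing reads without replacement from each read set) and all names are assumptions for illustration only.

```python
import numpy as np


def pool_reads(reads_a, reads_b, fraction_a, total, rng):
    """Draw `total` reads, a fraction `fraction_a` of them from `reads_a`."""
    n_a = int(round(fraction_a * total))
    n_b = total - n_a
    pick_a = rng.choice(len(reads_a), size=n_a, replace=False)
    pick_b = rng.choice(len(reads_b), size=n_b, replace=False)
    pooled = np.vstack([reads_a[pick_a], reads_b[pick_b]])
    return pooled[rng.permutation(total)]  # mix the two sources


rng = np.random.default_rng(1)
# Toy mutation maps (rows are reads, columns are bases): reads from condition A
# are mutated only at bases 0-1, reads from condition B only at bases 2-3.
reads_a = np.tile([1, 1, 0, 0], (1000, 1))
reads_b = np.tile([0, 0, 1, 1], (1000, 1))

pooled = pool_reads(reads_a, reads_b, fraction_a=0.25, total=400, rng=rng)
per_base_frequency = pooled.mean(axis=0)
print(per_base_frequency)
```

With these toy inputs the per-base mutation frequency of the pool directly reflects the mixing proportion, which is the quantity a deconvolution method would then try to recover as the relative abundance of each conformation.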

Supplementary information

Supplementary Information

Supplementary Figs. 1–19 and Notes 1 and 2.

Reporting Summary

Source data

Source Data Fig. 1

Maximum number of detected conformations, Pearson correlations for reconstructed profiles and estimated conformation abundances for simulated data

Source Data Fig. 2

Normalized reactivity values for DRACO-deconvoluted cspA and add conformations

Source Data Fig. 3

Relative abundance and normalized reactivity values for DRACO-deconvoluted SARS-CoV-2 3′ UTR conformations

About this article

Cite this article

Morandi, E., Manfredonia, I., Simon, L.M. et al. Genome-scale deconvolution of RNA structure ensembles. Nat Methods 18, 249–252 (2021). https://doi.org/10.1038/s41592-021-01075-w

  • DOI: https://doi.org/10.1038/s41592-021-01075-w
