Naturally ornate RNA-only complexes revealed by cryo-EM

Kretsch, Rachael C.; Wu, Yuan; Shabalina, Svetlana A.; Lee, Hyunbin; Nye, Grace; Koonin, Eugene V.; Gao, Alex; Chiu, Wah; Das, Rhiju

doi:10.1038/s41586-025-09073-0


Article
Open access
Published: 06 May 2025


Nature volume 643, pages 1135–1142 (2025)


Abstract

The structures of natural RNAs remain poorly characterized and may hold numerous surprises^1,2,3,4. Here we report three-dimensional structures of three large ornate bacterial RNAs using cryo-electron microscopy (cryo-EM). GOLLD (Giant, Ornate, Lake- and Lactobacillales-Derived), ROOL (Rumen-Originating, Ornate, Large) and OLE (Ornate Large Extremophilic) RNAs form homo-oligomeric complexes whose stoichiometries are retained at lower concentrations than measured in cells. OLE RNA forms a dimeric complex with long co-axial pipes spanning two monomers. Both GOLLD and ROOL form distinct RNA-only multimeric nanocages with diameters larger than the ribosome, each empty except for a disordered loop. Extensive intramolecular and intermolecular A-minor interactions, kissing loops, an unusual A–A helix and other interactions stabilize the three complexes. Sequence covariation analysis of these large RNAs reveals evolutionary conservation of intermolecular interactions, supporting the biological importance of large, ornate RNA quaternary structures that can assemble without any involvement of proteins.

Main

The importance of non-coding RNAs (ncRNAs) across biology is increasingly recognized, but only a small number have been functionally characterized, with studies revealing sophisticated catalytic and sensory functions in some cases^5,6,7,8,9. Bacteria, archaea and their viruses are thought to possess numerous diverse and complex ncRNAs, but most of these have not been thoroughly studied^1,2,3,4. Furthermore, there is a conspicuous shortage of data on the 3D structures of RNA molecules. Out of more than 4,000 RNA classes in the RNA Families (RFAM) database 15.0, only 143 have experimentally resolved tertiary structures¹⁰. For many of the remaining cases, it appears likely that structural characterization will depend on reconstitution of the RNA with small molecule, protein or nucleic acid partners, which are unknown in most cases.

The Breaker laboratory and collaborators have previously described three classes of bacterial and phage RNAs for which covariance analysis of genomic and metagenomic sequences revealed secondary structures that were so extensive and elaborate that ‘ornate’, ‘giant’ or ‘large’ were included in their names: GOLLD RNA³, ROOL RNA^1,11 (concomitantly reported in ref. ¹²) and OLE RNA². The functions of these three classes of large RNAs remain poorly understood.

Here, using cryo-EM, we show that OLE, ROOL and GOLLD all form atomically ordered 3D structures. Unexpectedly, the three structures are stabilized not by proteins but by other copies of the same RNA molecule in ornate quaternary assemblies with many intermolecular bridges, a phenomenon that has not previously been observed for natural RNA molecules¹³.

OLE forms an RNA-only dimer

OLE is a class of large RNAs with an ornate secondary structure that is conserved throughout evolution². OLE is found mainly in extremophilic bacteria, and experimental characterization in Halalkalibacterium halodurans has demonstrated its involvement in integrating energy availability, metal ion homeostasis and drug treatment to mediate cellular adaptation, although the underlying molecular mechanisms remain unknown^2,14,15,16. Cellular localization to the membrane, binding to at least six protein partners^15,17,18,19,20,21,22 and evidence of alternative secondary structures¹⁷ suggested that OLE was unlikely to form a well-defined RNA-only 3D structure. However, our study showed that the 577-nucleotide (nt) OLE RNA from Clostridium acetobutylicum^2,23 formed distinct, compact particles that were clearly visible in cryo-EM images (Fig. 1a). Furthermore, a 2.9 Å resolution 3D map of a dimeric OLE RNA could be reconstructed with two-fold imposed symmetry (Extended Data Fig. 1). A model of each chain was built for 308 nt in the OLE 5′ region, with Q-scores²⁴ exceeding the expected score at this resolution (Fig. 1b, Extended Data Fig. 1e and Supplementary Video 1).

Our OLE dimer map shows that it is organized as a series of parallel A-form helices, resembling a bundle of pipes. The exterior ends of these pipes from each chain are interconnected into a five-way junction, with a secondary structure that agrees with the previously proposed one for the observed domain with stems P3 to P9.3 (refs. ^2,15) (Fig. 1c; hereafter, paired stems, hairpin loops and joining linkers are designated ‘P’, ‘L’ and ‘J’, respectively, following conventional RNA nomenclature). An unusual but highly conserved symmetric interaction comprising four A–A base pairs between two chains (L4, Fig. 1d), intermolecular base pairing and stacking interactions connecting L5, L6 and L7 (Fig. 1e), and a kissing loop (L9.3, Fig. 1f) ‘weld’ the pipes together in the middle of the complex. We denote these intermolecular interactions ‘bridges’ B1–B3, as used in ribosome nomenclature²⁵. A detailed list of intramolecular motifs and intermolecular interactions is presented in Supplementary Tables 1 and 2, respectively. Beyond the 5′ region, other conserved parts of OLE were not resolved in the structure, suggesting flexibility.

Surprisingly, regions of OLE that were previously thought to adopt alternative structures upon protein binding are clearly resolved and solvent-accessible, suggesting that proteins may bind the OLE dimer in the pre-formed RNA conformation that we observed here. Our cryo-EM data show that these proteins are not required for the folding of the 5′ domain of OLE, and the RNA structure itself may have a crucial role in organizing these proteins. In particular, the protein OapC was previously hypothesized to bind a kink turn between J4a/5 and J5/6, and binding of OapC was thought to alter secondary structure, in particular increasing protection of J6/7 from in-line hydrolysis¹⁷. Our OLE dimer structure supports formation of a kink turn^26,27 in J4a/5 at the base of P5, but this kink turn is formed with J6/7, not J5/6 (Fig. 1g). The previously observed protection of J6/7 may therefore be explained by direct binding to the protein, and not by a rearrangement of secondary structure. In addition, although the internal loop of the P6 stem differs from the previously proposed one, it exposes residues 163–165, which were proposed to bind the protein RpsU¹⁵. A163 is flipped out of the helix and docks into a pocket created by P5, P6, P7 and the dimer interface. This OLE dimer pocket is reminiscent of the pocket RpsU occupies in the ribosome, supporting the previous hypothesis that OLE could sequester RpsU¹⁵.

ROOL assembles into an ordered nanocage

ROOL is a class of RNAs that is encoded in a wide variety of bacterial prophages and phages, often near tRNA islands^1,11,12. The predicted secondary structure is highly complex with multiple pseudoknots, but no protein binding partners have been identified, leading to the hypothesis that ROOL may function as an RNA-only complex¹. Although no function has been described for ROOL, it has been shown to be as abundant as 16S ribosomal RNA, but non-essential, in at least one strain of Ligilactobacillus salivarius¹².

The 659-nt ROOL env-120, discovered in cow rumen^1,28, produces visually clear, symmetric particles in cryo-EM micrographs (Fig. 2a and Extended Data Fig. 2). The 3.1 Å reconstructed map reveals a closed, hollow nanocage structure that comprises 8 chains with dihedral symmetry and a diameter of approximately 280 Å, larger than the maximal dimension of a bacterial ribosome (approximately 250 Å) (Fig. 2b and Supplementary Video 2). Each chain has a secondary structure that is consistent with the stems P1 to P19 proposed previously by covariation analysis¹², including the pseudoknot P10 (Fig. 2c). Atomic models for each chain can be built with a good match to the map density as shown by the Q-scores²⁴ (Extended Data Fig. 2e). Our model shows intramolecular tertiary interactions (Fig. 2d–i), which scaffold the flat monomer structure (Fig. 2f–i), including a set of non-canonical base pairs and stacking interactions that connect loops L3a and J6/7 (Fig. 2f), an A-minor interaction between L3c and P5 (Fig. 2g), an additional pseudoknot P13 (adjacent to P5 and P10, Fig. 2h), a complex set of non-canonical pairs between nucleotides that are already in stems P1, P2 and P3b (Fig. 2i), and other motifs (Supplementary Table 1).

**Fig. 2: Atomically ordered structure of ROOL homo-octamer.**

The ROOL quaternary complex is an octameric nanocage with top and bottom half-shells, each formed by 4 chains, hereafter labelled chains 1–4 and 1′–4′. Within a half-shell, each chain forms 8 bridges with its neighbours, 4 on each side, labelled B1–B4 (Fig. 2j–o). Starting from the top, the loop of stem P7 forms an isolated base pair with a bulged-out base in stem P7 of the next chain (B3, Fig. 2l). This ‘daisy chain’ of interacting stem-loops forms an inner circle on the top of the half-shell approximately 36 Å in diameter (Fig. 2b). The P7 stem is not always conserved, but a second circle of RNA (Fig. 2b), involving a quaternary kissing loop (B4, Fig. 2m), is highly conserved in evolution, and was identified as a tertiary interaction by previous covariation analysis¹. An A-minor interaction (B2, Fig. 2k) and a novel quaternary kissing loop (B1, Fig. 2j) further glue together the chains in the half-shell. Between a novel intramolecular tertiary interaction (Fig. 2g) and the intermolecular kissing loop B1 (Fig. 2j), we identified a disordered region that appears to be located inside the nanocage, based on the position of flanking regions (Fig. 2d and Extended Data Fig. 2d,e). This region was previously identified as a linker with little to no sequence or structural similarity across homologues¹.

As opposed to a simple dimer such as the OLE interface, where each chain interacts with a single partner, in the ROOL complex, each chain reaches over and interacts with two chains in the other half-shell. These interactions favour the full cage assembly, as opposed to isolated dimers. B5 and B6 are quaternary interactions in which the same sequences from different chains interact via adenosine stacking and Watson–Crick–Franklin base pairing, respectively (Fig. 2n,o). An additional interaction between the internal loop J17/18 of chain 1, previously proposed to form a pseudoknot with the flank of the linker region, and P19 of chain 1′ seems plausible given their proximity, but that region was not well-resolved in our structure.

GOLLD assembles into a distinct nanocage

GOLLD RNAs are the largest among the three RNA classes analysed here, with many members exceeding 800 nucleotides in length^3,11,29. GOLLD, similar to ROOL, is a molecule of unknown function encoded in bacterial prophages and phages, often near tRNA islands, but with sequences and secondary structures that are distinct from those of ROOL^3,11,29. GOLLD expression has been shown to increase during the lysis of bacterial cells infected by phage¹¹. Unlike ROOL, the predicted secondary structures of GOLLD RNAs consist of a universally conserved 3′ region and a less conserved 5′ region³.

The GOLLD env-38 RNA, first identified in a marine metagenomic sample downstream of Met-tRNA^3,30, produces visually striking flower-like particles in cryo-EM micrographs (Fig. 3a and Extended Data Fig. 3). The 3D reconstruction at 3.0 Å resolution shows that GOLLD forms a nanocage, similar to the one formed by ROOL, but larger. The GOLLD structure is a closed 14-mer with D₇ quaternary symmetry, with a diameter of 380 Å and a completely empty interior except for a disordered loop (Fig. 3b,c and Supplementary Video 3). Models for each of the 14 chains were built with Q-score²⁴ in accordance with the map resolution (Extended Data Fig. 3g). As with ROOL, kissing loops and A-minor interactions underlie the tertiary and quaternary structure of GOLLD in addition to other motifs (Fig. 3d–t and Supplementary Tables 1 and 2) but the specific interactions are distinct. Beyond confirming the accuracy of the previously predicted secondary structure with stems P1–P27 (Fig. 3d), the tertiary structure of GOLLD reveals prominent interactions, including A-minor interactions involving adenosines at the P3–P4–P5 junction (Fig. 3g), an A-minor interaction between adenosines in L26 and stem P14 (Fig. 3h) and a loop L22 that forms a pseudoknot with the nearby linker J17/22, in addition to an A-minor interaction with that pseudoknot (Fig. 3i). Furthermore, loop L27 brings together six regions by forming base pairs with stem P23 and linker J24/26 as well as base-backbone interactions with two additional stems, P18 and P22, and linker J17/18 (Fig. 3j). Similar to ROOL, the variable linker within each chain is not resolved, but the positions of immediate flanking sequences in the 5′ and 3′ regions indicate that the linker resides in the interior of the cage (Fig. 3f and Extended Data Fig. 3f,g). Globally, the cryo-EM structure shows that the 5′ region and the 3′ region form separate domains in the 3D structure (Fig. 3c). 
This separation could explain why the 3′ and 5′ domains are divergent in GOLLD, whereas, in ROOL, the 5′ and 3′ regions are intertwined and hence have to co-evolve to maintain the tertiary and quaternary structure.

**Fig. 3: Atomically ordered structure of GOLLD homo-14-mer.**

The 5′ domains of GOLLD form the cap of each half-shell of the nanocage. Within the cap of each half-shell, each of 7 monomers forms 8 quaternary bridges to other chains (4 on each side), including kissing loops, A-minor interactions and other interactions (B2–B5, Fig. 3e,l–o). B2 (Fig. 3l) closely resembles the daisy chain of interacting stem-loops from ROOL, except that the distance between the pairs of interacting residues is reduced from 9 nt to 4 nt. This compensates for the increased number of chains in GOLLD, resulting in an inner circle of roughly the same diameter as that of ROOL. In GOLLD, the only non-interacting loop with a conserved sequence, L11a (sequence GAAA), points towards this inner circle. The 3′ regions complete the half-shell below this 5′ cap through two interactions: an A-minor interaction (B6, Fig. 3p) and a kissing loop between L20 and J13/15, which was previously identified by covariation analysis³ and is shown here to be an intermolecular bridge (B7, Fig. 3q). Only a single intermolecular A-minor interaction, B1, glues the 3′ and 5′ regions from different chains together (Fig. 3k).

Finally, similar to the ROOL nanocage, the two half-shells come together with each chain in the top half-shell interacting with two chains in the bottom half-shell. In the GOLLD nanocage, these interactions consist of two self-interacting kissing-loop interactions (B8 and B9, Fig. 3r,s) and an A-minor interaction (B10, Fig. 3t) involving 3′ regions from different chains.

Biological relevance of homo-multimers

Symmetric multimers are common among proteins and rationally designed RNA molecules^26,31,32,33, but observations of natural RNA multimers are rare. When observed, natural RNA homomeric interactions typically involve a single contact¹³. Further, with the notable exceptions of viruses, such as HIV and other retroviruses³⁴ and the Φ29 bacteriophage^35,36,37, the biological relevance of RNA homomeric complexes has not been demonstrated, leaving open the possibility that they form only at high RNA concentrations and extreme ionic conditions or in the context of the specific constructs chosen for in vitro structural characterization. By contrast, several lines of evidence support GOLLD, ROOL and OLE forming multimers in their biological contexts.

First, concomitant with the same set of cryo-EM studies presented above, we resolved a 2.9 Å resolution map of another large RNA molecule, the raiA motif from C. acetobutylicum^23,38, as a pure monomer (Extended Data Figs. 4 and 5 and Supplementary Text 1), refuting the possibility that any large RNA would form a multimer in our experimental conditions. Independent studies have also resolved the raiA motif as a monomer^39,40. Additionally, we characterized the 343-nt HNH endonuclease-associated RNA and open reading frame (HEARO)³⁹ from Limnospira maxima, which is known to form a defined RNA structure that is involved in DNA nickase activity when bound to the protein IsrB⁴¹. Unlike the 5′ region of OLE, which is also known to bind proteins, the HEARO RNA was disordered in the absence of the protein (Extended Data Fig. 6), suggesting that multimer formation of protein-binding RNAs is not an artefact of cryo-EM experimental conditions.

Second, mass photometry, which gives high precision estimates of molecular weight but requires molecular binding to surfaces, confirms the stoichiometry of GOLLD, ROOL and OLE to be 14, 8 and 2, respectively, at RNA concentrations as low as 12.5 nM (Extended Data Fig. 7). This concentration is three orders of magnitude lower than the concentrations in our cryo-EM experiments and corresponds to a population of only around ten RNA molecules in a bacterial cell, substantially lower than what is expected from observed expression levels¹.
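The correspondence between 12.5 nM and roughly ten molecules per cell can be checked with a short calculation. The following is a minimal sketch in Python; the cell volume of 1 fl is an assumed, order-of-magnitude value for a bacterial cell and is not stated in the text, and the function name is hypothetical.

```python
# Convert a molar concentration into a copy number per cell.
# ASSUMPTION: a bacterial cell volume of ~1 fl (1e-15 l); this value is
# illustrative and is not taken from the article.

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def molecules_per_cell(conc_nM: float, cell_volume_fl: float = 1.0) -> float:
    """Return the number of molecules in one cell at the given concentration."""
    moles_per_litre = conc_nM * 1e-9
    litres = cell_volume_fl * 1e-15
    return moles_per_litre * litres * AVOGADRO


if __name__ == "__main__":
    copies = molecules_per_cell(12.5)
    print(f"12.5 nM in a 1 fl cell: about {copies:.1f} molecules")
```

Under this assumed volume, 12.5 nM corresponds to fewer than ten copies per cell, in line with the estimate of around ten molecules given above; a larger assumed cell volume scales the count proportionally.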

Third, using dynamic light scattering (DLS; Extended Data Fig. 7), we confirmed that both ROOL and GOLLD primarily form thermostable multimers, with no detectable fraction of monomers, at temperatures up to 55 °C and concentrations as low as 110 nM.

Fourth, for all three structures, each chain contains three or more conserved inter-subunit contacts, indicative of intricate arrangements that suggest selection pressure during the evolution of these RNAs.

Fifth, using comparative analysis of both sequences and secondary structures, we detected evolutionary conservation of structural elements and, in particular, the sites of intermolecular interactions supporting RNA homo-oligomerization (Supplementary Files 1–3 and Supplementary Table 3). Comparative analysis of OLE, ROOL and GOLLD showed that, although the sequences of these RNAs are not highly conserved, all intramolecular stems exhibit extensive base pairing supported by covariation analysis, including stems whose loops are involved in intermolecular bridges (Extended Data Fig. 8a–c, Supplementary Text 2 and Supplementary Table 3). The A positions in the OLE non-canonical A–A base-pair stem bridge B1 and the GOLLD A-minor interaction bridge B6 are highly conserved (Figs. 1d and 3p and Extended Data Fig. 8d,e). The intermolecular base pairs between ROOL J6/8 and L8 (bridge B4, Fig. 2m) were detected as a prominent, conserved quaternary interaction in prior covariation analysis⁴⁰ (Extended Data Fig. 8b and Supplementary Table 3). In other bridges, we observed intermolecular symmetric kissing loops that had base pairs between the same loop from two different chains: nucleotides 315–318 in chain A and B of OLE (B3) and nucleotides 656–657 from chain 1 and 7′ of GOLLD (B8). Apparent covariance at immediately adjacent nucleotides in these loop sequences supports intermolecular base pairs because base pairing of adjacent nucleotides within the same chain is stereochemically precluded (Extended Data Fig. 8f,g). OLE L9.3 and GOLLD L21a were each found to covary in this manner, switching an internal tetranucleotide between palindromes GGCC to GAUC or AGCU and an internal dinucleotide between GC and CG, supporting bridges OLE B3 and GOLLD B8, respectively (Extended Data Fig. 8f,g). 
The other symmetric kissing loops in our structures, GOLLD L21b (bridge B9) and ROOL L18 (bridge B6), were highly conserved across the variants for which the loops could be confidently aligned (Extended Data Fig. 8g,h), precluding similar covariance analysis but consistent with the importance of the observed intermolecular interactions.
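The covariation argument for symmetric kissing loops relies on the loop sequence remaining palindromic in the nucleic acid sense, that is, equal to its own reverse complement, so that the same loop from two chains can pair with itself. A minimal sketch of that check, assuming Watson–Crick pairs only (G–U wobble pairs are ignored) and using hypothetical function names:

```python
# Check whether an RNA sequence is self-complementary (a "palindrome" in the
# nucleic acid sense), i.e. identical to its own reverse complement.
# ASSUMPTION: only Watson-Crick pairs (A-U, G-C) are considered.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_self_complementary(seq: str) -> bool:
    """True if two copies of seq can pair with each other in antiparallel."""
    return reverse_complement(seq) == seq.upper()


if __name__ == "__main__":
    # Variants named in the text for OLE L9.3 and GOLLD L21a, plus the
    # non-interacting GOLLD loop L11a (GAAA) as a contrasting example.
    for motif in ["GGCC", "GAUC", "AGCU", "GC", "CG", "GAAA"]:
        print(motif, is_self_complementary(motif))
```

Each of the variants reported for OLE L9.3 (GGCC, GAUC, AGCU) and GOLLD L21a (GC, CG) passes this check, whereas GAAA does not, which is the pattern expected if the loops pair with the same loop in a second chain.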

Discussion

Together, our cryo-EM data, biophysical experiments and evolutionary analyses show that GOLLD, ROOL and OLE each form not only ornate secondary structures but also symmetric quaternary assemblies stabilized by many intermolecular contacts. While this Article was being revised, a publication appeared reporting similar cryo-EM structures, supporting the reproducibility of cryo-EM⁴². These structures and their complex network of RNA structure motifs offer a rich source of data for RNA structure prediction and design efforts. OLE forms a dimer shaped like a bundle of pipes and exposes structured binding pockets for protein partners such as the membrane-associated OLE-associated protein A (OapA). After superimposing an OapA dimer to each P4a site (OapA is known to bind OLE in a 2:1 ratio^41,42,43), we note that the RNA could induce the formation of an OapA tetramer. OapA is a membrane protein, and the tetramer is reminiscent of the double-stranded RNA transporter SID-1^43,44,45, suggesting that it may be able to accommodate RNA elements, such as the 3′ region of OLE, which was not resolved here (Fig. 4a). In contrast to OLE, and despite unrelated sequences and distinct secondary and tertiary structures, GOLLD and ROOL both form nanocages, suggesting that their function might involve encapsulating their internal disordered linkers and/or other molecules, analogous to proteinaceous microcompartments that are common in bacteria and archaea⁴⁶. Although not large enough to enclose entire DNA genomes of their parent phages, these cages might contain macromolecules of significant size (Fig. 4b,c), such as phage-encoded tRNAs, which are sometimes present in the GOLLD linker region, bacterial ribosomes, which have been shown to bind GOLLD in pull-down assays⁴⁵, metabolites, or stress response proteins. It remains to be determined whether nanocage formation is a common feature among large natural RNAs.

**Fig. 4: Structure-guided hypotheses for homo-oligomeric RNAs.**

Methods

In vitro RNA synthesis

DNA templates containing the RNA sequence of interest prepended with the T7 promoter (see Supplementary Table 4 for sequences) were ordered as gBlocks from IDT. Primers designed to amplify these sequences (see Supplementary Table 4 for sequences) were also from IDT. PCR amplification was carried out with NEBNext Ultra II Q5 Master Mix (NEB M0544S) using 10 ng of template per reaction. The thermocycler settings were: 98 °C for 30 s, then 35 cycles of 98 °C for 10 s, 55 °C for 30 s, then 72 °C for 30 s, and a final step of 72 °C for 5 min. The PCR products were then column purified using the QIAquick PCR Purification Kit (Qiagen 28104) and run on a 2% E-Gel agarose gel (Thermo Scientific A42135) to check DNA quality. DNA concentration was measured using a NanoDrop. Purified DNA templates shorter than 515 bp were in vitro transcribed using the TranscriptAid T7 High Yield Transcription Kit (Thermo Scientific K0441) with 6 μl of DNA template per reaction. Purified DNA templates longer than 515 bp were in vitro transcribed using the MEGAscript T7 Transcription Kit (Thermo Scientific AM1334) with 6–8 μl of DNA template per reaction. These in vitro transcription reactions were incubated at 37 °C for 6 h, then held at 4 °C before DNase treatment. The RNA was then purified using the RNA Clean & Concentrator-25 Kit (Zymo Research R1017) and eluted in 30 μl water. The concentration of purified RNA was measured using a NanoDrop, and the quality was checked using the Agilent 2100 Bioanalyzer (Nano RNA Assay, run by the Stanford PAN Facility; Bioanalyzer 2100 Expert B.02.11.SI824), as shown in Extended Data Fig. 7a.

RNA folding

For all subsequent experiments, RNA was re-folded using the same basic protocol. RNA concentrations used and any other modifications to this standard protocol are mentioned in each section. RNA was denatured (90 °C for 3 min, room temperature for 10 min) in 50 mM Na-HEPES pH 8.0. RNA was then folded with 10 mM MgCl₂ at 50 °C for 20 min, and cooled to room temperature for at least 10 min before taking measurements.

Mass photometry

Mass photometry data were collected using the Refeyn TwoMP, using AcquireMP version 2024-R1.1 and DiscoverMP version 2024-R1 to obtain histogram data. For the OLE data, coated glass slides from the MP Sample Preparation Pack (MP-CON-21014) were used; for the ROOL and GOLLD data, Mass Glass UC slides (MP-CON-41001) were used after coating with poly-l-lysine. The instrument was focused using droplet-dilution. Data were collected for 1 min using the large image size. The contrast data were calibrated to nucleotide length using the Millennium RNA Markers (Thermo Scientific AM7150). Gaussians were fitted by the automatic analysis in DiscoverMP. The resulting data and plotting code can be found in the accompanying GitHub repository.

Mass photometry data were not reliable for the raiA motif. raiA motif RNA was folded at 1 µM following the standard procedure above. Two dilutions were attempted on the stage: 15 µl buffer:2 µl sample for a final concentration of 118 nM, and 18 µl buffer:2 µl of 10×-diluted sample for a final concentration of 10 nM. Nucleic acid samples are known to produce noisy low-mass peaks⁴⁸ (communication with the company Refeyn) that are not present in buffer alone. For this reason, the raiA motif (205 nt) is below the recommended minimal size for mass photometry. Indeed, when we attempted to collect data on the raiA motif, we observed noise peaks in both binding and unbinding regimes that spanned the size of the raiA motif monomer but were smaller than any multimer, indicating unreliable results (data not shown).

OLE was folded at 0.25 µM and was diluted to 12.5 nM on the stage. OLE was folded in various buffers, all including 50 mM Na-HEPES pH 8.0, with other components added at the time when MgCl₂ is added in the standard protocol: (1) nothing added; (2) 1 mM MgCl₂; (3) 10 mM MgCl₂, standard; (4) 100 mM MgCl₂; (5) 10 mM MgCl₂ and 1% ethanol; (6) 10 mM MgCl₂ and 5% ethanol; (7) 0 mM MgCl₂ and 200 mM KCl; (8) 10 mM MgCl₂ and 200 mM KCl; (9) 0 mM MgCl₂ and 200 mM NaCl; and (10) 10 mM MgCl₂ and 200 mM NaCl. Buffers with MnCl₂ were attempted but the manganese saturated the detector.

ROOL and GOLLD were folded at 1 µM. The samples were diluted 10× prior to taking data. On the stage the samples were further diluted (10 µl buffer:10 µl sample) for a final concentration of 50 nM.
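The final on-stage concentrations quoted in this section follow from simple volume-ratio dilutions. A minimal sketch of the arithmetic, with a hypothetical helper name and the volumes taken from the text:

```python
# Final concentration after mixing a sample with buffer (volume-ratio dilution).
# The helper name is illustrative; volumes and stock concentrations come from
# the mass photometry protocol described above.


def diluted_conc_nM(stock_nM: float, sample_ul: float, buffer_ul: float) -> float:
    """Concentration after adding buffer_ul of buffer to sample_ul of sample."""
    return stock_nM * sample_ul / (sample_ul + buffer_ul)


if __name__ == "__main__":
    # raiA motif folded at 1 uM: 2 ul sample into 15 ul buffer.
    print(round(diluted_conc_nM(1000.0, 2.0, 15.0), 1))
    # raiA motif pre-diluted 10x, then 2 ul into 18 ul buffer.
    print(diluted_conc_nM(1000.0 / 10, 2.0, 18.0))
    # ROOL and GOLLD folded at 1 uM, diluted 10x, then mixed 1:1 on the stage.
    print(diluted_conc_nM(1000.0 / 10, 10.0, 10.0))
```

The same relation implies a 20-fold overall dilution for OLE (0.25 µM to 12.5 nM); the individual volumes used for that step are not given in the text.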

DLS of RNA nanocages

RNA was folded at 30 ng µl⁻¹ using the standard folding protocol. DLS traces were collected using the Prometheus Panta. Two replicates (2 capillaries of 10 µl volume, NanoTemper PR-C002) for each RNA were obtained. DLS data of 10× 5 s acquisitions per capillary with laser power 100% were obtained using PR.PantaControl v.1.8.0. The auto-correlation function was calculated and size distribution was fitted using default parameters in PR.PantaAnalysis v.1.8.0. The resulting size distribution tables and plotting code can be found in the accompanying GitHub repository.
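To relate the mass concentration used for DLS (30 ng µl⁻¹) to a molar concentration, one can use a common rule-of-thumb for the molar mass of single-stranded RNA. The sketch below assumes the approximation of 320.5 g mol⁻¹ per nucleotide plus 159 g mol⁻¹ for the 5′ triphosphate; this approximation and the function names are assumptions for illustration, and exact values depend on sequence composition.

```python
# Convert an RNA mass concentration (ng/ul) to a molar concentration (nM).
# ASSUMPTION: molar mass ~ length * 320.5 + 159 g/mol, a widely used
# approximation for in vitro transcribed single-stranded RNA.


def rna_molar_mass(length_nt: int) -> float:
    """Approximate molar mass (g/mol) of a single-stranded RNA."""
    return length_nt * 320.5 + 159.0


def ng_per_ul_to_nM(conc_ng_per_ul: float, length_nt: int) -> float:
    """1 ng/ul equals 1e-3 g/l; divide by molar mass and scale to nM."""
    grams_per_litre = conc_ng_per_ul * 1e-3
    return grams_per_litre / rna_molar_mass(length_nt) * 1e9


if __name__ == "__main__":
    # ROOL env-120 is 659 nt; 800 nt is an illustrative length for a GOLLD RNA
    # (the text states only that many GOLLD RNAs exceed 800 nt).
    for name, length in [("ROOL (659 nt)", 659), ("illustrative 800 nt RNA", 800)]:
        print(name, round(ng_per_ul_to_nM(30.0, length), 1), "nM")
```

Under this approximation, 30 ng µl⁻¹ corresponds to roughly 140 nM for a 659-nt RNA and somewhat less for longer RNAs, which is the same order as the 110 nM quoted for the DLS measurements.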

Cryo-electron microscopy grid preparation

For all samples, the RNA was frozen using a VitroBot Mark IV, using no. 542 filter paper and Quantifoil 1.2/1.3 200 mesh copper grids which were glow discharged for 30 s at 15 mA. GOLLD was folded at 8 µM, using the standard folding conditions except, after the 50 °C incubation, the temperature was lowered to 37 °C at a rate of 0.1 °C s⁻¹, held at 37 °C for 2 min, and then reduced to room temperature at a rate of 0.1 °C s⁻¹. To increase concentration of GOLLD in the ice, 4 cycles of applying 2 µl of sample and blotting for 3 s were performed before plunging. ROOL was folded at 9.1 µM with the standard folding protocol. The grid was coated with 2 µl of 100 mM NaCl which was blotted for 3 s. Then, 2 µl sample was immediately applied to the grid and blotted for 3 s before plunging into liquid ethane. OLE and raiA motif RNA were frozen with the standard folding protocol at 20 µM and 15 µM respectively; 2 µl of sample was applied to the grid, followed by 3 s blot and plunge into liquid ethane.

Cryo-electron microscopy data collection

All datasets were collected on Titan Krios G3 microscopes using a 50 μm C2 aperture, a 100 μm objective aperture and EPU software (v.3.5). The OLE dataset was collected using a Falcon 4 camera with a 10 eV slit on a Selectris energy filter, whereas the other datasets were collected using a K3 camera with a 20 eV slit on a Bio Quantum energy filter. Additional information on dose, magnification, and data collected for each RNA can be found in Extended Data Table 1.

Cryo-electron microscopy data processing

Data were processed live using CryoSparc (v.4.5.3)⁴⁹ and then further refined, including non-uniform refinement⁵⁰. For OLE and the raiA motif, per-particle motion correction was performed⁵¹. For all datasets, symmetry was not applied until the final refinement stages. For OLE, C₂ symmetry was applied. For ROOL and GOLLD, D₄ and D₇ symmetry were applied, respectively, followed by symmetry expansion of the particles and local refinement of one asymmetric unit. Finally, for GOLLD and ROOL, subdomains of one asymmetric unit were locally refined; composite maps and half-maps were created for one asymmetric unit and then composited to the full symmetry using phenix.combine_focused_maps (v.1.21)⁵². Local resolution was estimated using CryoSparc. See Extended Data Figs. 1–5 for more details on processing pipelines.
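The point groups applied here determine both the number of chains in each complex and the factor by which symmetry expansion multiplies the particle count: a cyclic group Cₙ has n operations and a dihedral group Dₙ has 2n. A minimal sketch of that bookkeeping, with a hypothetical helper name and an illustrative particle count:

```python
# Order of the cyclic (Cn) and dihedral (Dn) point groups, which equals the
# number of symmetry-related copies of the asymmetric unit.
import re


def point_group_order(symbol: str) -> int:
    """Return the number of operations in a Cn or Dn point group."""
    match = re.fullmatch(r"([CD])(\d+)", symbol)
    if match is None:
        raise ValueError(f"unsupported point group: {symbol!r}")
    kind, n = match.group(1), int(match.group(2))
    if n < 1:
        raise ValueError("n must be at least 1")
    return n if kind == "C" else 2 * n


if __name__ == "__main__":
    for rna, group in [("OLE", "C2"), ("ROOL", "D4"), ("GOLLD", "D7")]:
        print(rna, group, point_group_order(group))
    # Symmetry expansion yields one copy of each particle per group operation;
    # 10,000 is an illustrative particle count, not a value from the article.
    print(10_000 * point_group_order("D7"))
```

The resulting orders (2, 8 and 14) match the dimer, octamer and 14-mer stoichiometries reported for OLE, ROOL and GOLLD.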

Modelling

Maps were sharpened using phenix.auto_sharpen with half-maps (v.1.21). Initial models for a monomer were obtained from ModelAngelo (Relion-5.0)⁵³; because current versions of ModelAngelo cannot be run on a pure RNA structure, EMDB-17659 was added to the corner of the map, the corresponding protein sequence (Protein Data Bank (PDB): 8PHE) was provided, and protein residues were subsequently deleted from the model. The modelled RNA chains were manually combined by tracing the RNA sequences, adding and mutating residues when necessary (in particular, C to U mutations were commonly required). Manual model correction and refinement was accomplished in Coot (version 0.9.8)⁵⁴. Manual refinement of the monomer was performed using Isolde and Coot⁵⁴. Symmetry was then applied to the model; thereafter, refinement was performed asymmetrically owing to limitations in refinement programs. Intermolecular interactions were analysed by hand and corrected using Isolde⁵⁵ and Coot⁵⁴. DRRAFTER⁵⁶ (Rosetta 3.10 (2020.42)) was used to fill in low-resolution areas. For symmetric kissing loops, these models were selected, fitted into the map and refined more symmetrically by hand using Isolde⁵⁵. Final refinement was first run through phenix.real_space_refine⁵⁷ followed by piecewise corrections using ERRASER2⁵⁸ (Rosetta 3.10 (2020.42)), followed by manual refinement in Coot⁵⁴ and Isolde⁵⁵ when necessary. A protocol was created to enable refinement on the large ROOL and GOLLD complexes. First, the monomer was refined in ~30 sections, with the model and map split prior to running ERRASER2. These were then stitched together and regions encompassing the stitch sites were further refined. Finally, problematic regions of the monomer were refined further. Symmetry was applied to the monomer and the interaction sites were refined in parallel until interactions were sufficiently realistic with only minor clashes. 
Throughout, split points were manually edited if they caused minimization errors or to include interaction residues. The following ERRASER2 command was used, repeating if not yet converged:

$ERRASER -s $PDB -edensity:mapfile $MAP -edensity::mapreso $RESOLUTION -score:weights stepwise/rna/rna_res_level_energy7beta.wts -set_weights elec_dens_fast 40 cart_bonded 5.0 linear_chainbreak 10.0 chainbreak 10.0 fa_rep 1.5 fa_intra_rep 0.5 rna_torsion 10 suiteness_bonus 5 rna_sugar_close 10 -rmsd_screen 3.0 -mute core.scoring.CartesianBondedEnergy core.scoring.electron_density.xray_scattering -rounds 3 -fasta $FASTA -cryoem_scatterers -rna:erraser:fixed_res $FIXED

Validation metrics were calculated using Phenix, including phenix.rna_validate^59,60,61. ChimeraX (version 1.8)⁶² was used to calculate Q-scores²⁴ and to prepare all visuals.

Base pairing and base stacking were identified using Rosetta rna_motif⁶³. Kink turns and ribose zippers were identified using DSSR (v.1.9.9)⁶⁴ with the “--k-turns” flag. Z-anchors were labelled by aligning every 5-nt range of each structure to a representative Z-anchor (4E8Q residues 108–111) and manually inspecting each region that had a root mean squared deviation (rmsd) < 4 Å. Secondary structure was drawn using RiboDraw with manual manipulation⁶³. For visualizing a hypothetical OLE RNA–protein complex, AlphaFold 3 (server version)⁴⁷ was used to predict: (1) an OapA dimer with an OapC monomer; and (2) RpsU, using the sequences in Supplementary Table 4. The OapA dimer was fitted into the proposed RNA site manually. The predicted OapC was close to its presumed binding site but clashed with RNA, and its position was therefore manually adjusted. RpsU was also placed manually in its proposed binding site. C₂ symmetry was then applied to visualize the full complex.
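
Applying C₂ symmetry amounts to rotating one copy of the model by 180° about the two-fold axis. The Python sketch below shows that operation on plain coordinate tuples; the function name and the example axis are hypothetical, and in the actual workflow this step would normally be done inside a modelling program rather than by hand.

```python
import math


def apply_c2(coords, axis_point, axis_dir):
    """Rotate points by 180 degrees about the axis through axis_point along axis_dir."""
    norm = math.sqrt(sum(a * a for a in axis_dir))
    if norm == 0:
        raise ValueError("axis direction must be non-zero")
    n = [a / norm for a in axis_dir]
    mates = []
    for p in coords:
        # Vector from a point on the axis to the atom
        v = [p[i] - axis_point[i] for i in range(3)]
        dot = sum(v[i] * n[i] for i in range(3))
        # A 180-degree turn keeps the component along the axis
        # and flips the sign of the perpendicular component
        mates.append(tuple(axis_point[i] + 2 * dot * n[i] - v[i] for i in range(3)))
    return mates


# Hypothetical atom position and a two-fold axis along z through the origin
mate = apply_c2([(1.0, 2.0, 3.0)], axis_point=(0.0, 0.0, 0.0), axis_dir=(0.0, 0.0, 1.0))
```

Applying the function twice returns the original coordinates, which is a quick sanity check that the axis was specified correctly.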

Bioinformatic analysis

Bacterial genomes were downloaded from the National Center for Biotechnology Information (NCBI) Genome database in February 2024 (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/). GenBank records for phage genomes were downloaded in March 2024 (https://millardlab.org/bacteriophage-genomics/phage-genomes-march-2024/). Sequence profiles of GOLLD, ROOL and OLE were downloaded from the Rfam database (ftp.ebi.ac.uk/pub/databases/Rfam/) in March 2024. A custom sequence profile of the raiA motif was built using the reported alignment⁶³. To retrieve ncRNAs from genome sequences, cmsearch (Infernal 1.1.5)⁶⁵ was run using these sequence profiles with a cutoff value of 10⁻⁵. The overall procedure yielded the following numbers of nonredundant ncRNA sequences: GOLLD, 806; ROOL, 1,596; OLE, 8,585; raiA motif, 4,875.
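
As an illustration of the filtering step, the following Python sketch keeps hits from cmsearch tabular output that pass a 10⁻⁵ threshold. It rests on two assumptions that are not stated in the text: that the cutoff is an E-value, and that hits were written in Infernal's default `--tblout` layout, in which the E-value is the 16th whitespace-delimited field. The example rows are made up for demonstration and are not real search results.

```python
def filter_tblout(lines, max_evalue=1e-5):
    """Return hits with E-value <= max_evalue from cmsearch --tblout text lines."""
    hits = []
    for line in lines:
        # Skip blank lines and the '#' header and footer lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 17:
            continue  # skip malformed rows
        evalue = float(fields[15])  # assumed position of the E-value column
        if evalue <= max_evalue:
            hits.append({
                "target": fields[0],
                "query": fields[2],
                "start": int(fields[7]),
                "end": int(fields[8]),
                "strand": fields[9],
                "evalue": evalue,
            })
    return hits


# Made-up rows in the assumed layout (not real hits)
example = [
    "#target name accession query name accession mdl ...",
    "contig_1 - GOLLD - cm 1 800 1200 2010 + no 1 0.45 0.0 512.3 1.2e-150 ! hypothetical hit",
    "contig_2 - GOLLD - cm 1 800 500 90 - no 1 0.40 0.0 12.1 0.02 ? hypothetical weak hit",
]
kept = filter_tblout(example)
```

Removing redundant sequences, which the text also mentions, would be a separate step after this filter.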

The Infernal software⁶⁵ (v.1.1.2) was used to compare candidate RNA structures against Rfam models (cmscan), build and calibrate new covariance models (cmbuild, cmcalibrate) for separate clusters of RNAs, and perform structure-informed homology searches (cmsearch). Comparative analysis and multiple alignments for isolated RNA candidates were conducted using cmalign⁶⁵ and MUSCLE (v.5)⁶⁶ with pairwise comparisons refined using the OWEN program⁶⁷. Evolutionary history was inferred via the Maximum Likelihood method with different models in MEGA X⁶⁸. Evolutionary analysis of compensatory substitutions in isolated clusters was performed by DecipherSSC⁶⁹.

RNAalifold (from ViennaRNA 2.7.0)⁷⁰ was applied to computationally fold multiple RNA alignments, and Afold/Hybrid^71,72 were used to predict locally folded secondary structures or hybrid duplex elements within clusters. Covariation analysis was performed with R-scape (v.1.2.3)⁷³, which annotates multiple structural alignments of RNAs using statistically significant covariations (E-value < 0.05) as base-pairing constraints.
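
To give a feel for what a covariation signal between two alignment columns looks like, the Python sketch below computes plain mutual information for a column pair. This is a simplified stand-in chosen for illustration: it is not the statistic or the significance test implemented in R-scape, and the toy alignment is invented.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j, gap_chars="-."):
    """Mutual information (bits) between columns i and j, ignoring gapped rows."""
    pairs = [
        (seq[i].upper(), seq[j].upper())
        for seq in alignment
        if seq[i] not in gap_chars and seq[j] not in gap_chars
    ]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


# Invented alignment: columns 0 and 3 change together, columns 1 and 2 are invariant
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
covarying = column_mutual_information(toy_alignment, 0, 3)
invariant = column_mutual_information(toy_alignment, 1, 2)
```

Columns that change together score high, whereas invariant columns score zero even though they may be perfectly paired, which is one reason covariation tests need a dedicated statistical treatment.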

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The cryo-EM micrographs and particles, cryo-EM maps and model coordinates have been made available on the Electron Microscopy Public Image Archive (EMPIAR), Electron Microscopy Data Bank (EMDB) and Protein Data Bank (PDB), respectively (raiA motif: EMPIAR-12706, EMD-48162 and 9ELY; OLE: EMPIAR-12707, EMD-48163 and 9MCW; ROOL: EMPIAR-12708, EMD-48179 and 9MDS; GOLLD: EMPIAR-12709, EMD-48214 and 9MEE). Bioanalyzer, DLS and mass photometry data are presented in Supplementary Data 2.

Code availability

Custom scripts can be found at https://github.com/DasLab/RNA_multimer_2024.

References

1. Weinberg, Z. et al. Detection of 224 candidate structured RNAs by comparative analysis of specific subsets of intergenic regions. Nucleic Acids Res. 45, 10811–10823 (2017).
2. Puerta-Fernandez, E., Barrick, J. E., Roth, A. & Breaker, R. R. Identification of a large noncoding RNA in extremophilic eubacteria. Proc. Natl Acad. Sci. USA 103, 19490–19495 (2006).
3. Weinberg, Z., Perreault, J., Meyer, M. M. & Breaker, R. R. Exceptional structured noncoding RNAs revealed by bacterial metagenome analysis. Nature 462, 656–659 (2009).
4. Narunsky, A. et al. The discovery of novel noncoding RNAs in 50 bacterial genomes. Nucleic Acids Res. 52, 5152–5165 (2024).
5. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
6. Nemeth, K., Bayraktar, R., Ferracin, M. & Calin, G. A. Non-coding RNAs in disease: from mechanisms to therapeutics. Nat. Rev. Genet. 25, 211–232 (2024).
7. Eddy, S. R. Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2, 919–929 (2001).
8. Stav, S. et al. Genome-wide discovery of structured noncoding RNAs in bacteria. BMC Microbiol. 19, 66 (2019).
9. Chen, Y. et al. Hovlinc is a recently evolved class of ribozyme found in human lncRNA. Nat. Chem. Biol. 17, 601–607 (2021).
10. Ontiveros-Palacios, N. et al. Rfam 15: RNA families database in 2025. Nucleic Acids Res. 53, D258–D267 (2025).
11. Chen, A. G. Functional Investigation of Ribozymes and Ribozyme Candidates in Viruses, Bacteria and Eukaryotes. PhD dissertation, Yale University (2014).
12. Cousin, F. J. et al. A long and abundant non-coding RNA in Lactobacillus salivarius. Microb. Genomics 3, e000126 (2017).
13. Jones, C. P. & Ferré-D’Amaré, A. R. RNA quaternary structure and global symmetry. Trends Biochem. Sci. 40, 211–220 (2015).
14. Lyon, S. E., Wencker, F. D. R., Fernando, C. M., Harris, K. A. & Breaker, R. R. Disruption of the bacterial OLE RNP complex impairs growth on alternative carbon sources. PNAS Nexus 3, gae075 (2024).
15. Breaker, R. R., Harris, K. A., Lyon, S. E., Wencker, F. D. R. & Fernando, C. M. Evidence that OLE RNA is a component of a major stress-responsive ribonucleoprotein particle in extremophilic bacteria. Mol. Microbiol. 120, 324–340 (2023).
16. Wallace, J. G., Zhou, Z. & Breaker, R. R. OLE RNA protects extremophilic bacteria from alcohol toxicity. Nucleic Acids Res. 40, 6898–6907 (2012).
17. Lyon, S. E., Harris, K. A., Odzer, N. B., Wilkins, S. G. & Breaker, R. R. Ornate, large, extremophilic (OLE) RNA forms a kink turn necessary for OapC protein recognition and RNA function. J. Biol. Chem. 298, 102674 (2022).
18. Widner, D. L., Harris, K. A., Corey, L. & Breaker, R. R. Bacillus halodurans OapB forms a high-affinity complex with the P13 region of the noncoding RNA OLE. J. Biol. Chem. 295, 9326–9334 (2020).
19. Yang, Y., Harris, K. A., Widner, D. L. & Breaker, R. R. Structure of a bacterial OapB protein with its OLE RNA target gives insights into the architecture of the OLE ribonucleoprotein complex. Proc. Natl Acad. Sci. USA 118, e2020393118 (2021).
20. Fernando, C. M. & Breaker, R. R. Bioinformatic prediction of proteins relevant to functions of the bacterial OLE ribonucleoprotein complex. mSphere 9, e0015924 (2024).
21. Harris, K. A., Zhou, Z., Peters, M. L., Wilkins, S. G. & Breaker, R. R. A second RNA-binding protein is essential for ethanol tolerance provided by the bacterial OLE ribonucleoprotein complex. Proc. Natl Acad. Sci. USA 115, E6319–E6328 (2018).
22. Block, K. F., Puerta-Fernandez, E., Wallace, J. G. & Breaker, R. R. Association of OLE RNA with bacterial membranes via an RNA-protein interaction. Mol. Microbiol. 79, 21–34 (2011).
23. Nölling, J. et al. Genome sequence and comparative analysis of the solvent-producing bacterium Clostridium acetobutylicum. J. Bacteriol. 183, 4823–4838 (2001).
24. Pintilie, G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat. Methods 17, 328–334 (2020).
25. Frank, J. et al. A model of the translational apparatus based on a three-dimensional reconstruction of the Escherichia coli ribosome. Biochem. Cell Biol. 73, 757–765 (1995).
26. Huang, L. & Lilley, D. M. J. A quasi-cyclic RNA nano-scale molecular object constructed using kink turns. Nanoscale 8, 15189–15195 (2016).
27. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221 (2001).
28. Hess, M. et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331, 463–467 (2011).
29. Morgado, S., Antunes, D., Caffarena, E. & Vicente, A. C. The rare lncRNA GOLLD is widespread and structurally conserved among Mycobacterium tRNA arrays. RNA Biol. 17, 1001–1008 (2020).
30. Yooseph, S. et al. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol. 5, e16 (2007).
31. Hao, C. et al. Construction of RNA nanocages by re-engineering the packaging RNA of Phi29 bacteriophage. Nat. Commun. 5, 3890 (2014).
32. Liu, D., Thélot, F. A., Piccirilli, J. A., Liao, M. & Yin, P. Sub-3-Å cryo-EM structure of RNA enabled by engineered homomeric self-assembly. Nat. Methods 19, 576–585 (2022).
33. Jaeger, L. & Chworos, A. The architectonics of programmable RNA and DNA nanostructures. Curr. Opin. Struct. Biol. 16, 531–543 (2006).
34. Dubois, N., Marquet, R., Paillart, J.-C. & Bernacchi, S. Retroviral RNA dimerization: From structure to functions. Front. Microbiol. 9, 527 (2018).
35. Ding, F. et al. Structure and assembly of the essential RNA ring component of a viral DNA packaging motor. Proc. Natl Acad. Sci. USA 108, 7357–7362 (2011).
36. Simpson, A. A. et al. Structure of the bacteriophage phi29 DNA packaging motor. Nature 408, 745–750 (2000).
37. Xu, J., Wang, D., Gui, M. & Xiang, Y. Structural assembly of the tailed bacteriophage ϕ29. Nat. Commun. 10, 2366 (2019).
38. Soares, L. W., King, C. G., Fernando, C. M., Roth, A. & Breaker, R. R. Genetic disruption of the bacterial raiA motif noncoding RNA causes defects in sporulation and aggregation. Proc. Natl Acad. Sci. USA 121, e2318008121 (2024).
39. Badepally, N. G., de Moura, T. R., Purta, E., Baulin, E. F. & Bujnicki, J. M. Cryo-EM structure of raiA ncRNA from Clostridium reveals a new RNA 3D fold. J. Mol. Biol. 436, 168833 (2024).
40. Haack, D. B. et al. Scaffold-enabled high-resolution cryo-EM structure determination of RNA. Nat. Commun. 16, 880 (2025).
41. Hirano, S. et al. Structure of the OMEGA nickase IsrB in complex with ωRNA and target DNA. Nature 610, 575–581 (2022).
42. Wang, L. et al. Cryo-EM reveals mechanisms of natural RNA multivalency. Science 388, 545–550 (2025).
43. Zheng, L. et al. Cryo-EM structures of human SID-1 transmembrane family proteins and implications for their low-pH-dependent RNA transport activity. Cell Res. 34, 80–83 (2024).
44. Hirano, Y. et al. Cryo-EM analysis reveals human SID-1 transmembrane family member 1 dynamics underlying lipid hydrolytic activity. Commun. Biol. 7, 664 (2024).
45. Qian, D. et al. Structural insight into the human SID1 transmembrane family member 2 reveals its lipid hydrolytic activity. Nat. Commun. 14, 3568 (2023).
46. Greening, C. & Lithgow, T. Formation and function of bacterial organelles. Nat. Rev. Microbiol. 18, 677–689 (2020).
47. Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
48. Li, Y., Struwe, W. B. & Kukura, P. Single molecule mass photometry of nucleic acids. Nucleic Acids Res. 48, e97 (2020).
49. Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
50. Punjani, A., Zhang, H. & Fleet, D. J. Non-uniform refinement: adaptive regularization improves single-particle cryo-EM reconstruction. Nat. Methods 17, 1214–1221 (2020).
51. Rubinstein, J. L. & Brubaker, M. A. Alignment of cryo-EM movies of individual particles by optimization of image translations. J. Struct. Biol. 192, 188–195 (2015).
52. Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D 75, 861–877 (2019).
53. Jamali, K. et al. Automated model building and protein identification in cryo-EM maps. Nature 628, 450–457 (2024).
54. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of coot. Acta Crystallogr. D 66, 486–501 (2010).
55. Croll, T. I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. D 74, 519–530 (2018).
56. Kappel, K. et al. De novo computational RNA modeling into cryo-EM maps of large ribonucleoprotein complexes. Nat. Methods 15, 947–954 (2018).
57. Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. D 74, 531–544 (2018).
58. Chou, F.-C., Sripakdeevong, P., Dibrov, S. M., Hermann, T. & Das, R. Correcting pervasive errors in RNA crystallography through enumerative structure prediction. Nat. Methods 10, 74–76 (2013).
59. Williams, C. J. et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
60. Afonine, P. V. et al. New tools for the analysis and validation of cryo-EM maps and atomic models. Acta Crystallogr. D 74, 814–840 (2018).
61. Richardson, J. S. et al. RNA backbone: consensus all-angle conformers and modular string nomenclature (an RNA Ontology Consortium contribution). RNA 14, 465–481 (2008).
62. Meng, E. C. et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 32, e4792 (2023).
63. Das, R. & Watkins, A. M. RiboDraw: semiautomated two-dimensional drawing of RNA tertiary structure diagrams. NAR Genomics Bioinform. 3, lqab091 (2021).
64. Lu, X.-J., Bussemaker, H. J. & Olson, W. K. DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res. 43, e142 (2015).
65. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
66. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
67. Ogurtsov, A. Y., Roytberg, M. A., Shabalina, S. A. & Kondrashov, A. S. OWEN: aligning long collinear regions of genomes. Bioinformatics 18, 1703–1704 (2002).
68. Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
69. Spiridonov, A. N. & Shabalina, S. A. Deciphering structural selective constraints: a comparative evolutionary analysis of RNA hairpin structures. In Computational Methods in Systems Biology: 22nd International Conference Proceedings (eds Gori, R. et al.) 196–208 (Springer, 2024).
70. Bernhart, S. H., Hofacker, I. L., Will, S., Gruber, A. R. & Stadler, P. F. RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics 9, 474 (2008).
71. Ogurtsov, A. Y., Shabalina, S. A., Kondrashov, A. S. & Roytberg, M. A. Analysis of internal loops within the RNA secondary structure in almost quadratic time. Bioinformatics 22, 1317–1324 (2006).
72. Kondrashov, A. S. & Shabalina, S. A. Classification of common conserved sequences in mammalian intergenic regions. Hum. Mol. Genet. 11, 669–674 (2002).
73. Rivas, E., Clements, J. & Eddy, S. R. A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs. Nat. Methods 14, 45–48 (2017).
74. Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
75. Young, G. et al. Quantitative mass imaging of single biological macromolecules. Science 360, 423–427 (2018).


Acknowledgements

The authors thank Stanford Research Computing Center, SLAC Shared Scientific Data Facility, the Stanford-SLAC Cryo-EM Center, the Stanford Chem-H Macromolecular Structure Knowledge Center and the Stanford Biochemistry administrative staff for resources and support that contributed to this research. This work was supported by Stanford Bio-X (Bowes Graduate Student Fellowship to R.C.K.), the National Institutes of Health (R35 GM122579 to R.D. and Common Fund Transformative High-Resolution Cryo-Electron Microscopy program U24 GM129541 to W.C.), Howard Hughes Medical Institute (HHMI) (to R.D.), the National Science Foundation (Grant No. 2330652 to R.D. and W.C.), and the G. Harold & Leila Y. Mathers Foundation (MF-2303-04116 to A.G.). S.A.S. and E.V.K. report funding from the Intramural Research Program of the National Institutes of Health of the United States of America (National Library of Medicine). This article is subject to HHMI’s Open Access to Publications policy. HHMI lab heads have previously granted a nonexclusive CC BY 4.0 license to the public and a sublicensable license to HHMI in their research articles. Pursuant to those licenses, the author-accepted manuscript of this article can be made freely available under a CC BY 4.0 license immediately upon publication.

Author information

Authors and Affiliations

Biophysics Program, Stanford University, Stanford, CA, USA
Rachael C. Kretsch, Alex Gao, Wah Chiu & Rhiju Das
Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
Yuan Wu & Rhiju Das
Computational Biology Branch, Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Svetlana A. Shabalina & Eugene V. Koonin
Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
Hyunbin Lee, Alex Gao & Rhiju Das
Division of CryoEM and Bioimaging, SSRL–SLAC National Accelerator Laboratory, Menlo Park, CA, USA
Grace Nye & Wah Chiu
Department of Bioengineering and James Clark Center, Stanford University, Stanford, CA, USA
Wah Chiu
Department of Microbiology and Immunology, Stanford University, Stanford, CA, USA
Wah Chiu

Contributions

R.C.K., W.C., and R.D. conceptualized and designed the study. R.C.K. and G.N. selected sequences for the study. Y.W. and R.C.K. performed in vitro RNA transcription and collected DLS and mass photometry data. R.C.K. froze cryo-EM grids, collected and processed cryo-EM data, and modelled the cryo-EM maps with advice from W.C. and R.D. G.N. screened cryo-EM grids. H.L. and A.G. generated sequence alignments and S.A.S., R.C.K., R.D. and E.V.K. analysed these sequences for covariation. R.C.K., S.A.S. and R.D. prepared the manuscript with input from all authors.

Corresponding authors

Correspondence to Wah Chiu or Rhiju Das.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Anna Pyle, Jane Richardson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Cryo-EM data processing workflow for OLE dimer.

(a-e) OLE resolves into a high-resolution dimer, even in the absence of protein. (a) Data processing flowchart for the OLE dimer. (b) Fourier shell correlation (FSC) plot for final refinement of OLE dimer. (c) Plot of particle number against the reciprocal squared resolution for OLE dimer. The B-factor was calculated as twice the linearly fitted slope⁷⁴. (d) Local resolution of the OLE dimer on the cryo-EM map (top) and the molecular model (bottom). (e) Resolvability of the built model of the OLE dimer as measured by Q-score. The black line is the mean across all chains, with the maximum and minimum values depicted in light grey (N = 2 chains). The expected Q-score at this resolution²⁴ is labeled with a blue dotted line.
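
The B-factor estimate mentioned in panel (c) can be sketched in a few lines of Python. The sketch assumes the usual form of this plot, in which the natural logarithm of particle number is fitted against 1/d² (d being the resolution in Å); the caption does not state the logarithm base, so that choice is an assumption. The data series is synthetic and constructed to have a known answer, not taken from the paper.

```python
import math


def b_factor_from_series(particle_counts, resolutions_angstrom):
    """Estimate a B-factor (in A^2) as twice the fitted slope of ln(N) versus 1/d^2."""
    if len(particle_counts) != len(resolutions_angstrom) or len(particle_counts) < 2:
        raise ValueError("need at least two matched (count, resolution) points")
    x = [1.0 / (d * d) for d in resolutions_angstrom]
    y = [math.log(n) for n in particle_counts]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("resolutions must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # ordinary least-squares slope
    return 2.0 * slope


# Synthetic series built so that ln(N) = 5 + 50 / d^2, i.e. a true B-factor of 100 A^2
resolutions = [4.0, 3.5, 3.2, 3.0]
counts = [math.exp(5.0 + 50.0 / (d * d)) for d in resolutions]
estimated_b = b_factor_from_series(counts, resolutions)
```

With real data, each point would come from refining a random subset of particles and recording the resolution reached.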

Extended Data Fig. 2 Cryo-EM data processing workflow for ROOL nanocage complex.

(a) Data processing flowchart. (b) Fourier shell correlation (FSC) plots of the single subunit local refinement. (c) Plot of particle number against the reciprocal squared resolution for the single subunit local refinement. The B-factor was calculated as twice the linearly fitted slope⁷⁴. (d) Local resolution on the cryo-EM map (right) and the molecular model (left). (e) Resolvability of the built model as measured by Q-score. The black line is the mean across all chains, with the maximum and minimum values depicted in light grey (N = 8 chains). The expected Q-score at this resolution²⁴ is labeled with a blue dotted line.

Extended Data Fig. 3 Cryo-EM data processing workflow for GOLLD nanocage complex.

(a) Data processing flowchart. (b-c) Fourier shell correlation (FSC) plots for the local refinement of the 5′ and 3′ domains respectively. (d-e) Plots of particle number against the reciprocal squared resolution for the local refinement of the 5′ and 3′ domains respectively. The B-factor was calculated as twice the linearly fitted slope⁷⁴. (f) Local resolution on the cryo-EM map (left) and the molecular model (right). (g) Resolvability of the built model as measured by Q-score. The black line is the mean across all chains, with the maximum and minimum values depicted in light grey (N = 14 chains). The expected Q-score at this resolution²⁴ is labeled with a blue dotted line.

Extended Data Fig. 4 Tertiary structure of raiA RNA motif.

(a) Global view of the tertiary structure of the raiA motif and the 2.9 Å cryo-EM map, coloured as labeled in the secondary structure (b). (c-f) Select tertiary interactions. Descriptions can be found in Supplementary Text 1. The sharpened cryo-EM map is displayed at the following contours: (a) 8 σ; (c,e,f) 16 σ; (d) 20 σ.

Extended Data Fig. 5 Cryo-EM data processing workflow for raiA motif.

(a) Data processing flowchart. (b) Representative micrograph (10,825 micrographs total) and 2D class averages. (c) Fourier shell correlation (FSC) plot. (d) Plot of particle number against the reciprocal squared resolution. The B-factor was calculated as twice the linearly fitted slope⁷⁴. (e) Local resolution on the cryo-EM map (top) and the molecular model (bottom). (f) Resolvability of the built model as measured by Q-score. The expected Q-score of an RNA model at this resolution is labeled with a blue dotted line.

Extended Data Fig. 6 Cryo-EM data of HEARO RNA without protein shows disorder.

HEARO did not resolve into a high resolution structure, despite similar amount and quality of data as OLE-dimer. (a) The representative micrograph (8,294 micrographs total) of HEARO shows clear particles. (b) Select 2D class averages show that HEARO is forming RNA helices, but they have diverse orientations and are blurred, suggesting high flexibility. (c) 3D reconstructions of HEARO, overlaid with the known structure of this RNA in the OMEGA nickase complex bound to protein IsrB (PDB: 8DMB⁴¹), show RNA of a similar fold to the complexed RNA. Multiple conformations are reconstructed, but with poorly resolved features, suggesting that HEARO may not form an atomically ordered structure when not in complex with its partner proteins.

Extended Data Fig. 7 Evidence of multimer formation of GOLLD, ROOL, and OLE at biologically relevant concentrations.

(a) Agilent Bioanalyzer traces demonstrate the purity of the samples. The second peak for OLE is a common artifact of poor sample denaturation in Bioanalyzer traces. The pure monomeric reading in mass photometry (b) shows that this peak is likely not a covalently linked dimer. (b) Mass of GOLLD, ROOL, and OLE complexes as obtained from mass photometry at 50 nM, 50 nM, and 12.5 nM respectively. The data are shown as a histogram of particle count density, normalized per sample, where dark is many counts and white is none. Total particle counts are shown above the graph. (c) Hydrodynamic radius of GOLLD and ROOL complexes as derived from dynamic light scattering at 110 nM and 140 nM respectively. The data are plotted as relative population density, normalized by density per sample, with dark representing highly populated radius values. The temperature of the sample was raised from 25 °C to 75 °C and dynamic light scattering traces were obtained every 10 °C, showing complex melting into monomers at 65 °C and aggregation at high temperatures. (d) Representative ratiometric image for all mass photometry data (1 frame from a 60 s collection at 331 Hz). (e) Mass photometry data of OLE in different buffer conditions demonstrate that OLE can dimerize at low RNA concentration, at low magnesium concentration, and in the absence of magnesium with sufficient monovalent cations. (f) The mass photometry data are summarized by counting the number of hits in the monomer, dimer, and high stoichiometry peaks. The absolute ratio of monomer:dimer is accurate as assessed in (g-h). (g) Mass photometry traces of mixtures of ROOL and GOLLD, ratiometric image examples can be found in. (h) Summary of the mixture results, with the known complex ratio plotted against the ratio reported by mass photometry. There is agreement, but with a slight bias towards higher counts for the smaller species, ROOL, opposite to the previously observed trend⁷⁵.
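
The peak counting summarized in panel (f) could be done along the lines of the Python sketch below, which assigns each mass photometry event to a monomer, dimer or higher-stoichiometry bin by its ratio to the monomer mass. The tolerance window, the monomer mass of 200 kDa and the event list are all hypothetical values chosen for illustration; the window boundaries actually used are not given in the caption.

```python
def count_stoichiometry(masses_kda, monomer_kda, tolerance=0.25):
    """Count events near 1x and 2x the monomer mass, plus anything heavier.

    An event is 'monomer' or 'dimer' if its mass ratio is within `tolerance`
    of 1 or 2; heavier events are 'higher'; everything else is 'unassigned'.
    """
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    counts = {"monomer": 0, "dimer": 0, "higher": 0, "unassigned": 0}
    for mass in masses_kda:
        ratio = mass / monomer_kda
        if abs(ratio - 1.0) <= tolerance:
            counts["monomer"] += 1
        elif abs(ratio - 2.0) <= tolerance:
            counts["dimer"] += 1
        elif ratio > 2.0 + tolerance:
            counts["higher"] += 1
        else:
            counts["unassigned"] += 1
    return counts


# Hypothetical event masses (kDa) for an RNA with an assumed 200 kDa monomer
events = [200.0, 210.0, 395.0, 410.0, 650.0, 90.0]
summary = count_stoichiometry(events, monomer_kda=200.0)
```

The monomer:dimer ratio then follows directly from `summary["monomer"]` and `summary["dimer"]`.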

Extended Data Fig. 8 Comparative and covariation sequence analysis of homo-oligomer forming RNAs.

(a-c) Distributions of covariation scores in multiple sequence alignments of (a) OLE, (b) ROOL, and (c) GOLLD sequences with select stems labeled. Dot size is proportional to the covariation score. Consensus base pairs are depicted in blue; consensus base pairs that show significant covariation, in green; other pairs with significant covariation that are not part of the consensus secondary structure but are compatible with it, in orange; and other significant pairs, in black. Positions are relative to the original input alignment (before any gapped column is removed). (d-h) Examples of multiple alignments and sequence identity profiles of selected stable hairpins with highly conserved loops that are involved in intermolecular interactions. Nucleotides involved in intermolecular interactions are labeled as in main Figs. 1, 2 and 3 for the RNAs OLE (B1, B3), ROOL (B6), and GOLLD (B6, B8) respectively, and highlighted with an orange box. The coloring scheme used to highlight the mutational pattern with respect to the secondary structure (folding) can be found next to (d). If one predicted base pair is formed by several different combinations of nucleotides, consistent or compensatory mutations have taken place; this is indicated by different colors. Pale colors indicate that a base pair cannot be formed in some sequences of the alignment. The sequence variants for the examples were selected from the closest branches of the evolutionary trees built from the multiple sequence alignments used for the covariation analysis.

Extended Data Table 1 Cryo-EM experiments on four large non-coding RNAs

Supplementary information

Supplementary Information

This file contains Supplementary Text 1 and 2, Supplementary Tables 1, 2 and 4, and additional references.

Reporting Summary

Supplementary Table 3

Summary of the nucleotide–nucleotide covariations identified in the OLE, ROOL and GOLLD alignments.

Supplementary Data 1

This zipped folder contains Supplementary Files 1–3 in Stockholm format. Supplementary File 1: The multiple sequence alignment of OLE. Supplementary File 2: The multiple sequence alignment of ROOL. Supplementary File 3: The multiple sequence alignment of GOLLD.

Supplementary Data 2

Source Data for Extended Data Fig. 7. The Bioanalyzer tables for OLE, ROOL, GOLLD and raiA contain the intensity for the ladder used and the sample. The band location, in nucleotide units, was calculated by linearly fitting the inverse migration time of the 7 highest intensity peaks to the known length of the ladder components (25, 200, 500, 1,000, 2,000, 4,000 and 6,000 nt), as is standard. The DLS table shows the average intensity across ten acquisitions and two replicates. Each column represents the DLS trace at a given temperature: data were collected at 25 °C and the temperature was increased in steps of 10 °C until 75 °C. The mass photometry table lists every event recorded for each sample. The size of the RNA was calibrated using Millennium RNA Ladder (Ambion AM7150).

Peer Review File

Supplementary Video 1

Structure of OLE dimer. The overall topology of the OLE dimer is displayed with the regions in Fig. 1 highlighted.

Supplementary Video 2

Structure of ROOL nanocage. The overall topology of the ROOL nanocage is displayed with the regions in Fig. 2 highlighted.

Supplementary Video 3

Structure of GOLLD nanocage. The overall topology of the GOLLD nanocage is displayed with the regions in Fig. 3 highlighted.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kretsch, R.C., Wu, Y., Shabalina, S.A. et al. Naturally ornate RNA-only complexes revealed by cryo-EM. Nature 643, 1135–1142 (2025). https://doi.org/10.1038/s41586-025-09073-0

Received: 08 December 2024
Accepted: 25 April 2025
Published: 06 May 2025
Issue date: 24 July 2025
DOI: https://doi.org/10.1038/s41586-025-09073-0
