Abstract
Diversity-generating retroelements (DGRs) create massive protein sequence variation (up to 10^30; ref. 1) in ecologically diverse microorganisms. A recent survey identified around 31,000 DGRs from more than 1,500 bacterial and archaeal genera, constituting more than 90 environment types (ref. 2). DGRs are especially enriched in the human gut microbiome (refs. 2,3) and nano-sized microorganisms that seem to comprise most microbial life and maintain DGRs despite reduced genomes (refs. 4,5). DGRs are also implicated in the emergence of multicellularity (refs. 6,7). Variation occurs during reverse transcription of a protein-encoding RNA template coupled to misincorporation at adenosines. In the prototypical Bordetella bacteriophage DGR, the template must be surrounded by upstream and downstream RNA segments for complementary DNA synthesis to be carried out by a complex of the DGR reverse transcriptase bRT and associated protein Avd. The function of the surrounding RNA was unknown. Here we show through cryogenic electron microscopy that this RNA envelops bRT and lies over the barrel-shaped Avd, forming an intimate ribonucleoprotein. An abundance of essential interactions in the ribonucleoprotein precisely position an RNA homoduplex in the bRT active site for initiation of reverse transcription. Our results explain how the surrounding RNA primes complementary DNA synthesis, promotes processivity, terminates polymerization and strictly limits mutagenesis to specific proteins through mechanisms that are probably conserved in DGRs belonging to distant taxa.
Data availability
The atomic coordinates and sharpened cryo-EM and half maps of bRT–Avd/RNAΔ98 in Active G:dCTP conformation (Extended Data Table 1; Complex 1) have been deposited in the PDB (https://www.rcsb.org/structure/) with code 8UB7 and in the EMDB (https://www.ebi.ac.uk/emdb/) with code EMD-42077, respectively; Active A:Empty conformation (Complex 2) with codes PDB 8UBB and EMD-42081, respectively; Active G:Empty conformation (Complex 3) with codes PDB 8UB9 and EMD-42079, respectively; Resting conformation (Complex 4) with codes PDB 8UBC and EMD-42082, respectively; Resting conformation (Complex 5) with codes PDB 8UBE and EMD-42084, respectively; Resting conformation (Complex 6) with codes PDB 8UBF and EMD-42085, respectively; Pre-Active 1 conformation (Complex 7) with codes PDB 8UBA and EMD-42080; Pre-Active 1 conformation (Complex 8) with codes PDB 8UB8 and EMD-42078; and Pre-Active 2 conformation (Complex 9) with codes PDB 8UBD and EMD-42083. Coordinates for Methylobacterium extorquens formaldehyde activating enzyme (PDB 1Y60), RNA-dependent RNA polymerase (PDB 3OL9), group II intron RT GsI-IIC (PDB 6AR1) and Avd (PDB 4DWL) are available in the PDB with the codes indicated.
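For readers who want to retrieve the deposited entries programmatically, the accession codes listed above can be collected into a small lookup table. The following is a minimal sketch, not part of the original work: the names `DEPOSITIONS` and `entry_urls` are hypothetical, and the entry URLs assume that each archive serves an entry page at its base address (as quoted above) followed by the accession code.

```python
# Accession codes copied from the Data availability statement.
# Key: complex number in Extended Data Table 1.
# Value: (conformation, PDB code, EMDB code).
DEPOSITIONS = {
    1: ("Active G:dCTP", "8UB7", "EMD-42077"),
    2: ("Active A:Empty", "8UBB", "EMD-42081"),
    3: ("Active G:Empty", "8UB9", "EMD-42079"),
    4: ("Resting", "8UBC", "EMD-42082"),
    5: ("Resting", "8UBE", "EMD-42084"),
    6: ("Resting", "8UBF", "EMD-42085"),
    7: ("Pre-Active 1", "8UBA", "EMD-42080"),
    8: ("Pre-Active 1", "8UB8", "EMD-42078"),
    9: ("Pre-Active 2", "8UBD", "EMD-42083"),
}

# Base addresses as given in the statement; appending the accession code
# to form an entry URL is an assumption of this sketch.
PDB_BASE = "https://www.rcsb.org/structure/"
EMDB_BASE = "https://www.ebi.ac.uk/emdb/"


def entry_urls(complex_number):
    """Return (PDB URL, EMDB URL) for one complex number (hypothetical helper)."""
    _conformation, pdb_code, emdb_code = DEPOSITIONS[complex_number]
    return PDB_BASE + pdb_code, EMDB_BASE + emdb_code


if __name__ == "__main__":
    for number, (conformation, _pdb, _emdb) in sorted(DEPOSITIONS.items()):
        pdb_url, emdb_url = entry_urls(number)
        print(f"Complex {number} ({conformation}): {pdb_url} | {emdb_url}")
```

Keeping the codes in one table makes it easy to check that each of the nine complexes maps to a distinct PDB and EMDB entry before downloading anything.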
References
Wu, L. et al. Diversity-generating retroelements: natural variation, classification and evolution inferred from a large-scale genomic survey. Nucleic Acids Res. 46, 11–24 (2018).
Roux, S. et al. Ecology and molecular targets of hypermutation in the global microbiome. Nat. Commun. 12, 3076 (2021).
Nayfach, S. et al. Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome. Nat. Microbiol. 6, 960–970 (2021).
Paul, B. G. et al. Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea. Nat. Microbiol. 2, 17045 (2017).
Brown, C. T. et al. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature 523, 208–211 (2015).
Dore, H. et al. Targeted hypermutation of putative antigen sensors in multicellular bacteria. Proc. Natl Acad. Sci. USA 121, e2316469121 (2024).
Kaur, G., Burroughs, A. M., Iyer, L. M. & Aravind, L. Highly regulated, diversifying NTP-dependent biological conflict systems with implications for the emergence of multicellularity. eLife 9, e52696 (2020).
Liu, M. et al. Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage. Science 295, 2091–2094 (2002).
McMahon, S. A. et al. The C-type lectin fold as an evolutionary solution for massive sequence variation. Nat. Struct. Mol. Biol. 12, 886–892 (2005).
Handa, S. et al. Template-assisted synthesis of adenine-mutagenized cDNA by a retroelement protein complex. Nucleic Acids Res. 46, 9711–9725 (2018).
Doulatov, S. et al. Tropism switching in Bordetella bacteriophage defines a family of diversity-generating retroelements. Nature 431, 476–481 (2004).
Paul, B. G. et al. Targeted diversity generation by intraterrestrial archaea and archaeal viruses. Nat. Commun. 6, 6585 (2015).
Alayyoubi, M. et al. Structure of the essential diversity-generating retroelement protein bAvd and its functionally important interaction with reverse transcriptase. Structure 21, 266–276 (2013).
Handa, S., Reyna, A., Wiryaman, T. & Ghosh, P. Determinants of adenine-mutagenesis in diversity-generating retroelements. Nucleic Acids Res. 49, 1033–1045 (2021).
Naorem, S. S. et al. DGR mutagenic transposition occurs via hypermutagenic reverse transcription primed by nicked template RNA. Proc. Natl Acad. Sci. USA 114, E10187–E10195 (2017).
Inouye, S., Hsu, M. Y., Xu, A. & Inouye, M. Highly specific recognition of primer RNA structures for 2′-OH priming reaction by bacterial reverse transcriptases. J. Biol. Chem. 274, 31236–31244 (1999).
Guo, H. et al. Diversity-generating retroelement homing regenerates target sequences for repeated rounds of codon rewriting and protein diversification. Mol. Cell 31, 813–823 (2008).
Pintilie, G. & Chiu, W. Validation, analysis and annotation of cryo-EM structures. Acta Crystallogr. D Struct. Biol. 77, 1142–1152 (2021).
Zhao, C. & Pyle, A. M. Crystal structures of a group II intron maturase reveal a missing link in spliceosome evolution. Nat. Struct. Mol. Biol. 23, 558–565 (2016).
Stamos, J. L., Lentzsch, A. M. & Lambowitz, A. M. Structure of a thermostable group II intron reverse transcriptase with template-primer and its functional and evolutionary implications. Mol. Cell 68, 926–939.e924 (2017).
Haack, D. B. et al. Cryo-EM structures of a group II intron reverse splicing into DNA. Cell 178, 612–623.e612 (2019).
Blocker, F. J. et al. Domain structure and three-dimensional model of a group II intron-encoded reverse transcriptase. RNA 11, 14–28 (2005).
Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA 98, 4899–4903 (2001).
Torabi, S. F. et al. RNA stabilization by a poly(A) tail 3′-end binding pocket and other modes of poly(A)-RNA interaction. Science 371, eabe6523 (2021).
Danaee, P. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res. 46, 5381–5394 (2018).
Dunkle, J. A. et al. Structures of the bacterial ribosome in classical and hybrid states of tRNA binding. Science 332, 981–984 (2011).
Chung, K. et al. Structures of a mobile intron retroelement poised to attack its structured DNA target. Science 378, 627–634 (2022).
Wilkinson, M. E., Frangieh, C. J., Macrae, R. K. & Zhang, F. Structure of the R2 non-LTR retrotransposon initiating target-primed reverse transcription. Science 380, 301–308 (2023).
Deng, P. et al. Structural RNA components supervise the sequential DNA cleavage in R2 retrotransposon. Cell 186, 2865–2879.e2820 (2023).
Thawani, A., Ariza, A. J. F., Nogales, E. & Collins, K. Template and target-site recognition by human LINE-1 in retrotransposition. Nature 626, 186–193 (2024).
Larsen, K. P. et al. Architecture of an HIV-1 reverse transcriptase initiation complex. Nature 557, 118–122 (2018).
Das, K., Martinez, S. E., DeStefano, J. J. & Arnold, E. Structure of HIV-1 RT/dsRNA initiation complex prior to nucleotide incorporation. Proc. Natl Acad. Sci. USA 116, 7308–7313 (2019).
Liu, B. et al. Structure of active human telomerase with telomere shelterin protein TPP1. Nature 604, 578–583 (2022).
Zhao, C., Liu, F. & Pyle, A. M. An ultraprocessive, accurate reverse transcriptase encoded by a metazoan group II intron. RNA 24, 183–195 (2018).
Hsiou, Y. et al. Structure of unliganded HIV-1 reverse transcriptase at 2.7 Å resolution: implications of conformational changes for polymerization and inhibition mechanisms. Structure 4, 853–860 (1996).
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
Reuter, J. S. & Mathews, D. H. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinf. 11, 129 (2010).
Magnus, M., Boniecki, M. J., Dawson, W. & Bujnicki, J. M. SimRNAweb: a web server for RNA 3D structure modeling with optional restraints. Nucleic Acids Res. 44, W315–W319 (2016).
Popenda, M. et al. Automated 3D structure composition for large RNAs. Nucleic Acids Res. 40, e112 (2012).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
Yamashita, K., Palmer, C. M., Burnley, T. & Murshudov, G. N. Cryo-EM single-particle structure refinement and map calculation using Servalcat. Acta Crystallogr. D Struct. Biol. 77, 1282–1291 (2021).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
Zok, T. et al. RNApdbee 2.0: multifunctional tool for RNA structure annotation. Nucleic Acids Res. 46, W30–W35 (2018).
Pintilie, G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat. Methods 17, 328–334 (2020).
Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Meng, E. C. et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 32, e4792 (2023).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Busan, S. & Weeks, K. M. Accurate detection of chemical modifications in RNA by mutational profiling (MaP) with ShapeMapper 2. RNA 24, 143–148 (2018).
Sato, K., Akiyama, M. & Sakakibara, Y. RNA secondary structure prediction using deep learning with thermodynamic integration. Nat. Commun. 12, 941 (2021).
Acknowledgements
We thank R. Baker and A. Leschziner for initial cryo-EM characterization of complexes, T. Baker for computational resources and J. Meyers for data collection at Pacific Northwest Center for Cryo-EM (PNCC). Use of PNCC was supported by NIH grant no. U24GM129547. This work was supported by the National Institutes of Health, grant nos. R01 GM132720 (P.G.), R01 AI163327 (G.G.) and R01 GM033050-35 (T.B.). B.G.P. was supported by the Gordon and Betty Moore Foundation, the G. Unger Vetlesen Foundation and the Owens Family Foundation.
Author information
Authors and Affiliations
Contributions
S.H. and P.G. conceptualized the project. S.H., T.B., J.C., B.G.P. and P.G. carried out the investigation. S.H., T.B., B.G.P. and P.G. carried out visualization. P.G. and G.G. acquired funding. P.G. administered and supervised the project. S.H., T.B., B.G.P. and P.G. wrote the original draft and S.H., T.B., G.G., B.G.P. and P.G. reviewed and edited the draft.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 bRT-(Avd-Fae):RNAΔ98.
a. Schematic of bRT, Avd, Avd-Fae and DGR RNA and RNAΔ98. b. Synthesis of cDNA by bRT-Avd or bRT-(Avd-Fae) from DGR RNA or RNAΔ98. Reactions were incubated with dNTPs, including [α-32P]dCTP, for 2 h at 37 °C. Products were treated with RNase and resolved by 8% denaturing polyacrylamide gel electrophoresis (PAGE). Lane M corresponds to radiolabeled, single-stranded DNA molecular mass markers, denoted by nucleotide (nt) length. The panels are from the same gel in which irrelevant data were deleted (marked by line). For gel source data, see Supplementary Fig. 1a. The gel is representative of two independent replicates. c. Frequency distribution of deoxynucleotides incorporated (thymidine, purple) or misincorporated (adenosine, orange; cytosine, red; guanosine, green) by bRT-Avd (T 88.9%, A 1.8%, C 8.7%, G 0.6%) or bRT-Avd-Fae (T 90.0%, A 0.9%, C 8.7%, G 0.4%) at TRG117A in RNAΔ98. TRG117A has a lower misincorporation frequency than other adenosines in TR (ref. 14). d. Two views of local resolution map of bRT-Avd without RNAΔ98 shown (left, in same orientation as Fig. 2a). Resolution scale in Å below.
Extended Data Fig. 2 RNP Interactions and Active Conformations.
a. bRT (red) superposed with the group II intron RT GsI-IIC (blue) in ‘right-hand’ view of polymerases. Subdomains of the RTs are indicated. GsI-IIC has a D domain that is not present in bRT or other DGR RTs. b. Interaction between bRT E40 and Avd R79 and R83. Avd R79A and R83A do not bind bRT and cause loss of function (ref. 13). Coloring as in Fig. 2a. c. Interaction of bRT NTE with TRG117 and Avd. Interacting amino acids in bonds representation (carbon yellow for bRT, salmon for Avd, and cyan for TRG117; oxygen red; nitrogen blue; and phosphorus orange). d. Superposition of Active G:dCTP (blue, Complex 1 in Extended Data Table 1), Active A:Empty (red, Complex 2), and Active G:Empty RNPs (green, Complex 3). Root mean square deviation of 1.6 Å for 10,218 atoms between Active G:dCTP and Active A:Empty RNPs, and 0.7 Å for 10,265 atoms between Active G:dCTP and Active G:Empty RNPs. e. Model of bRT-Avd-Fae:RNAΔ98 complex. Fae (red, PDB 1Y60) was placed at the wide end of the Avd barrel for visual purposes. No interpretable density exists for Fae. Coloring as in Fig. 1d. Dotted line indicates potential location of intact TR.
Extended Data Fig. 3 DGR RNA Binding and cPRT Base Pairing.
a-d. Electrophoretic mobility shift assay (EMSA) of (a) DGR RNA, (b) DGR RNA Δavd368-379, (c) DGR RNA ΔSp96-140, and (d) non-DGR RNA with varying concentrations of bRT-Avd (as indicated at top of each panel). RNA was resolved by 4% native PAGE. Arrowheads indicate position of shifted bands. To the side of each gel is shown LC/MS-MS analysis of shifted EMSA bands. Lower quantities of RNA at higher bRT-Avd concentrations in the non-DGR RNA sample likely indicate non-specific aggregation of bRT-Avd with RNA. For gel source data, see Supplementary Fig. 1b–e. Representative gel from two independent replicates is shown. e. Schematic of (1, WT) 12 potential bps in the avd-Sp duplex; (2) substitutions in Sp that disrupt the 12 bps; (3) complementary substitutions in avd that restore the 12 bps; (4) substitutions in Sp that disrupt six Watson-Crick bps; and (5) complementary substitutions in avd that restore the six Watson-Crick bps. Solid bars are Watson-Crick base pairs, and open circles wobble base pairs. f. Top, cDNA synthesis by bRT-Avd with (1) wild-type or (2-5) mutant DGR RNA. The lane numbers correspond to the numbering in panel e. Products were treated with RNase and resolved by 8% denaturing polyacrylamide gel electrophoresis (PAGE). Lane M corresponds to radiolabeled, single-stranded DNA molecular mass markers, denoted by nucleotide (nt) length. Arrowheads indicate the positions of ~90- and ~120-nt cDNAs. A representative gel from three independent replicates is shown. Bottom, input RNA for cDNA synthesis resolved by 6% denaturing PAGE. Lane M corresponds to DNA molecular mass markers, denoted by nt length. DGR RNA runs at a position higher than the 300 nt marker, likely due to retention of secondary structure in the RNA. The numbers correspond to the numbering of sequences depicted in panel e. A representative gel from three independent replicates is shown. For gel source data, see Supplementary Fig. 1f.
Extended Data Fig. 4 Mutations in DGR RNA.
a. Top, cDNAs synthesized by bRT-Avd from wild-type or mutated DGR RNAs. Products were treated with RNase and resolved by 8% denaturing polyacrylamide gel electrophoresis (PAGE). Lane M corresponds to radiolabeled, single-stranded DNA molecular mass markers, denoted by nucleotide (nt) length. Arrowheads correspond to cDNAs. Bottom, input RNAs for cDNA synthesis resolved by 6% denaturing PAGE. Lane M corresponds to DNA molecular mass markers, denoted by nt length. A representative gel from three independent replicates shown. An irrelevant lane is blanked out from the gels. For gel source data, see Supplementary Fig. 1g. b. cPRT activity of mutant DGR RNASp(Δ71-78) relative to wild-type DGR RNA. Data are from experiments that were repeated three independent times, and means and standard deviations shown. c and d. cDNAs synthesized by bRT-Avd from wild-type or DGR RNAs mutated in the Thumb ring (c) or bRT-binding RNA (d). Products were treated with RNase and resolved by 8% denaturing polyacrylamide gel electrophoresis (PAGE). Lane M corresponds to radiolabeled, single-stranded DNA molecular mass markers, denoted by nt length. Arrowheads correspond to cDNAs. Bottom of each panel shows input RNAs for cDNA synthesis resolved by 6% denaturing PAGE. Lane M corresponds to DNA molecular mass markers, denoted by nt length. A representative gel from three independent replicates shown. For gel source data for panels c and d, see Supplementary Fig. 1h,i, respectively.
Extended Data Fig. 5 Flipped-out bases in Avd-binding loop.
a. Binding of flipped-out base SpU3 to a crevice between Avd1 and Avd2. b. Binding of flipped-out base SpU7 to a crevice between Avd3 and Avd4.
Extended Data Fig. 6 RNP Alternative Conformations.
a. bRT-Avd:RNAΔ98 in Resting conformation from Complex 4 (Extended Data Table 1). Coloring here and in other panels as in Fig. 1d. b. Resting conformation from Complex 5. c. Resting conformation from Complex 6. d. Pre-Active 1 conformation from Complex 7. e. Pre-Active 1 conformation from Complex 8. f. Pre-Active 2 conformation from Complex 9.
Extended Data Fig. 7 Pre-Active Conformation.
a. Schematic of continuous Sp strand in the cPRT Stop and TR:Sp duplex in the Pre-Active conformation (Extended Data Table 1, Complex 7), depicted as in Fig. 2c. b. Interactions of SpG57 and C58 with TRG117 and avdG380, respectively. bRT amino acids that contact SpG57 are shown. Coloring as in Fig. 2a for carbon; phosphorus orange, oxygen red, and nitrogen blue.
Extended Data Fig. 8 Conservation of RNA elements.
a-c. Sequence alignment consensus profiles for putative (left) cPRT Stop duplex and (right) Avd-binding stem loop for (a) Bacillota, (b) Pseudomonadota, and (c) Cyanobacteriota DGRs. The number of genomes in each alignment is indicated in parentheses. d. RNA structural features of the BΦ DGR predicted by MxFold2 (ref. 52) as compared to data from the cryo-EM structure. The sequence upstream (i.e., avd) and downstream (i.e., Sp) of TR are in green and orange, respectively.
Supplementary information
Supplementary Information
Supplementary Figs. 1–14.
Supplementary Table 1
Conservation of RNA structural elements.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Handa, S., Biswas, T., Chakraborty, J. et al. RNA control of reverse transcription in a diversity-generating retroelement. Nature 638, 1122–1129 (2025). https://doi.org/10.1038/s41586-024-08405-w
DOI: https://doi.org/10.1038/s41586-024-08405-w