  • Analysis

Metagenomic characterization of viruses and mobile genetic elements associated with the DPANN archaeal superphylum

Abstract

The archaeal superphylum DPANN (an acronym formed from the initials of the first five phyla discovered: Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanohaloarchaeota and Nanoarchaeota) is a group of ultrasmall symbionts able to survive in extreme ecosystems. The diversity and dynamics between DPANN archaea and their virome remain largely unknown. Here we use a metagenomic clustered regularly interspaced short palindromic repeats (CRISPR) screening approach to identify 97 globally distributed, non-redundant viruses and unclassified mobile genetic elements predicted to infect hosts across 8 DPANN phyla, including 7 viral groups not previously characterized. Genomic analysis suggests a diversity of viral morphologies including head-tailed, tailless icosahedral and spindle-shaped viruses with the potential to establish lytic, chronic or lysogenic infections. We also find evidence of a virally encoded Cas12f1 protein (probably originating from uncultured DPANN archaea) and a mini-CRISPR array, which could play a role in modulating host metabolism. Many metagenomes have virus-to-host ratios >10, indicating that DPANN viruses play an important role in controlling host populations. Overall, our study illuminates the underexplored diversity, functional repertoires and host interactions of the DPANN virome.
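The idea behind CRISPR-based host prediction can be illustrated with a toy example: a spacer stored in a host CRISPR array that matches a region of a viral or MGE contig suggests a past encounter between the two. The sketch below is not the authors' pipeline. It assumes exact matching on either strand, and all sequences, identifiers and function names are hypothetical. Real analyses typically use alignment tools that tolerate a few mismatches.

```python
# Toy spacer-to-contig matching. All names and sequences are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_spacer_hits(spacers: dict[str, str], contigs: dict[str, str]) -> list[tuple[str, str]]:
    """Report (spacer_id, contig_id) pairs where the spacer occurs exactly
    in the contig on the forward or the reverse strand."""
    hits = []
    for spacer_id, spacer in spacers.items():
        forward = spacer.upper()
        reverse = reverse_complement(forward)
        for contig_id, contig in contigs.items():
            sequence = contig.upper()
            if forward in sequence or reverse in sequence:
                hits.append((spacer_id, contig_id))
    return hits


# Hypothetical spacers extracted from two host CRISPR arrays.
spacers = {
    "host_A_spacer_1": "ATGCGTACCGTTAGC",
    "host_B_spacer_1": "GGGTTTAAACCCGGG",
}

# Hypothetical metagenomic contigs: one carries a forward-strand match,
# one a reverse-strand match, and one has no match.
contigs = {
    "contig_1": "TTTT" + "ATGCGTACCGTTAGC" + "AAAA",
    "contig_2": "CCCC" + reverse_complement("GGGTTTAAACCCGGG") + "TTTT",
    "contig_3": "ACGTACGTACGTACGTACGT",
}

hits = find_spacer_hits(spacers, contigs)
for spacer_id, contig_id in hits:
    print(f"{spacer_id} matches {contig_id}")
```

Exact substring search keeps the example short; it would miss protospacers that have accumulated mutations, which is one reason alignment-based matching is usually preferred.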

Fig. 1: Overview of CRISPR–Cas systems in DPANN archaea.
Fig. 2: Genetic arrangements of CRISPR–Cas systems in DPANN archaea.
Fig. 3: Expansive DPANN viral diversity and distribution.
Fig. 4: Genome-based taxonomic assignment of DPANN viruses.
Fig. 5: Complete genomic maps of the seven groups of novel DPANN archaeal viruses.
Fig. 6: An atypical V-F1 CRISPR–Cas system harboured by Houyivirus.

Data availability

All the data analysed in this study are publicly available. All the DPANN archaeal genomes are available in the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank/), the Ocean Microbiomics Database (https://microbiomics.io/ocean/) and the Genomes from Earth’s Microbiomes catalogue (https://portal.nersc.gov/GEM/). The identified direct repeat sequences and 3,662 DPANN genomes with completeness ≥50% and contamination <10% are deposited in Zenodo (https://doi.org/10.5281/zenodo.10926453; ref. 114). The genome sequences of DPANN viruses and unclassified MGEs identified in this study are available in the IMG/VR database (https://img.jgi.doe.gov/vr) and in our collected DPANN archaeal genomes (IMG/VR and NCBI accession numbers are listed in Supplementary Table 4), or are deposited in Zenodo (https://doi.org/10.5281/zenodo.11004436; ref. 115). All the metagenomic raw reads used in this study for assembly and abundance profiling are available in the NCBI SRA (https://www.ncbi.nlm.nih.gov/sra/), with the accession numbers listed in Supplementary Table 10. The relevant sample attributes (for example, locations and ecosystem types) were obtained from IMG/VR or NCBI BioSample (https://www.ncbi.nlm.nih.gov/biosample/). The databases used in this study (UniRef30, PfamA 35.0, NCBI CDD v3.19, PDB70_June_2023, PHROG v4, SCOPe70 v2.08 and UniProt-SwissProt-viral70_Nov_2021) are publicly available (https://wwwuser.gwdguser.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/). Source data are provided with this paper.
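The genome quality thresholds stated above (completeness ≥50% and contamination <10%) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the record type, field names and example values are hypothetical, and the quality estimates would in practice come from a dedicated genome quality tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeQuality:
    """Hypothetical record holding quality estimates for one genome."""
    genome_id: str
    completeness: float   # percent, 0 to 100
    contamination: float  # percent


def passes_quality(genome: GenomeQuality,
                   min_completeness: float = 50.0,
                   max_contamination: float = 10.0) -> bool:
    """Keep genomes with completeness >= 50% and contamination < 10%."""
    return (genome.completeness >= min_completeness
            and genome.contamination < max_contamination)


# Hypothetical example values chosen to exercise both boundaries.
genomes = [
    GenomeQuality("mag_001", 92.3, 1.2),   # passes
    GenomeQuality("mag_002", 50.0, 9.9),   # passes (completeness bound is inclusive)
    GenomeQuality("mag_003", 49.9, 0.5),   # fails completeness
    GenomeQuality("mag_004", 75.0, 10.0),  # fails (contamination bound is strict)
]

kept = [g.genome_id for g in genomes if passes_quality(g)]
print(kept)
```

Note the asymmetry in the comparison operators: the completeness bound is inclusive while the contamination bound is strict, matching the thresholds as written.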

References

  1. Dombrowski, N., Lee, J.-H., Williams, T. A., Offre, P. & Spang, A. Genomic diversity, lifestyles and evolutionary origins of DPANN archaea. FEMS Microbiol. Lett. 366, fnz008 (2019).

  2. He, C. et al. Genome-resolved metagenomics reveals site-specific diversity of episymbiotic CPR bacteria and DPANN archaea in groundwater ecosystems. Nat. Microbiol. 6, 354–365 (2021).

  3. Sakai, H. D. et al. Insight into the symbiotic lifestyle of DPANN archaea revealed by cultivation and genome analyses. Proc. Natl Acad. Sci. USA 119, e2115449119 (2022).

  4. Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).

  5. Hug, L. A. et al. A new view of the tree of life. Nat. Microbiol. 1, 16048 (2016).

  6. Dombrowski, N. et al. Undinarchaeota illuminate DPANN phylogeny and the impact of gene transfer on archaeal evolution. Nat. Commun. 11, 3939 (2020).

  7. Rinke, C. et al. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat. Microbiol. 6, 946–959 (2021).

  8. Wurch, L. et al. Genomics-informed isolation and characterization of a symbiotic Nanoarchaeota system from a terrestrial geothermal environment. Nat. Commun. 7, 12115 (2016).

  9. Castelle, C. J. et al. Biosynthetic capacity, metabolic variety and unusual biology in the CPR and DPANN radiations. Nat. Rev. Microbiol. 16, 629–645 (2018).

  10. Baker, B. J. et al. Enigmatic, ultrasmall, uncultivated Archaea. Proc. Natl Acad. Sci. USA 107, 8806–8811 (2010).

  11. Huber, H. et al. A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont. Nature 417, 63–67 (2002).

  12. Vigneron, A., Cruaud, P., Lovejoy, C. & Vincent, W. F. Genomic evidence of functional diversity in DPANN archaea, from oxic species to anoxic vampiristic consortia. ISME Commun. 2, 4 (2022).

  13. Chevallereau, A., Pons, B. J., van Houte, S. & Westra, E. R. Interactions between bacterial and phage communities in natural environments. Nat. Rev. Microbiol. 20, 49–62 (2022).

  14. Zimmerman, A. E. et al. Metabolic and biogeochemical consequences of viral infection in aquatic ecosystems. Nat. Rev. Microbiol. 18, 21–34 (2020).

  15. Krupovic, M., Dolja, V. V. & Koonin, E. V. The LUCA and its complex virome. Nat. Rev. Microbiol. 18, 661–670 (2020).

  16. Prangishvili, D. et al. The enigmatic archaeal virosphere. Nat. Rev. Microbiol. 15, 724–739 (2017).

  17. Krupovic, M., Cvirkaite-Krupovic, V., Iranzo, J., Prangishvili, D. & Koonin, E. V. Viruses of archaea: structural, functional, environmental and evolutionary genomics. Virus Res. 244, 181–193 (2018).

  18. Rambo, I. M., Langwig, M. V., Leão, P., De Anda, V. & Baker, B. J. Genomes of six viruses that infect Asgard archaea from deep-sea sediments. Nat. Microbiol. 7, 953–961 (2022).

  19. Medvedeva, S. et al. Three families of Asgard archaeal viruses identified in metagenome-assembled genomes. Nat. Microbiol. 7, 962–973 (2022).

  20. Laso-Pérez, R. et al. Evolutionary diversification of methanotrophic ANME-1 archaea and their expansive virome. Nat. Microbiol. 8, 231–245 (2023).

  21. Medvedeva, S., Borrel, G., Krupovic, M. & Gribaldo, S. A compendium of viruses from methanogenic archaea reveals their diversity and adaptations to the gut environment. Nat. Microbiol. 8, 2170–2182 (2023).

  22. Burstein, D. et al. New CRISPR–Cas systems from uncultivated microbes. Nature 542, 237–241 (2017).

  23. Esser, S. P. et al. A predicted CRISPR-mediated symbiosis between uncultivated archaea. Nat. Microbiol. 8, 1619–1633 (2023).

  24. Rahlff, J. et al. Lytic archaeal viruses infect abundant primary producers in Earth’s crust. Nat. Commun. 12, 4642 (2021).

  25. Li, Y.-X. et al. Deciphering symbiotic interactions of “Candidatus Aenigmarchaeota” with inferred horizontal gene transfers and co-occurrence networks. mSystems 6, e0060621 (2021).

  26. Comolli, L. R., Baker, B. J., Downing, K. H., Siegerist, C. E. & Banfield, J. F. Three-dimensional analysis of the structure and ecology of a novel, ultra-small archaeon. ISME J. 3, 159–167 (2009).

  27. Comolli, L. R. & Banfield, J. F. Inter-species interconnections in acid mine drainage microbial communities. Front. Microbiol. 5, 367 (2014).

  28. Banas, I. et al. Spatio-functional organization in virocells of small uncultivated archaea from the deep biosphere. ISME J. 17, 1789–1792 (2023).

  29. Samson, J. E., Magadán, A. H., Sabri, M. & Moineau, S. Revenge of the phages: defeating bacterial defences. Nat. Rev. Microbiol. 11, 675–687 (2013).

  30. Paoli, L. et al. Biosynthetic potential of the global ocean microbiome. Nature 607, 111–118 (2022).

  31. Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021).

  32. Harrington, L. B. et al. Programmed DNA destruction by miniature CRISPR–Cas14 enzymes. Science 362, 839–842 (2018).

  33. Makarova, K. S. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).

  34. Bernheim, A. & Sorek, R. The pan-immune system of bacteria: antiviral defence as a community resource. Nat. Rev. Microbiol. 18, 113–119 (2020).

  35. Jurėnas, D., Fraikin, N., Goormaghtigh, F. & Van Melderen, L. Biology and evolution of bacterial toxin–antitoxin systems. Nat. Rev. Microbiol. 20, 335–350 (2022).

  36. Chopin, M.-C., Chopin, A. & Bidnenko, E. Phage abortive infection in lactococci: variations on a theme. Curr. Opin. Microbiol. 8, 473–479 (2005).

  37. Munson-McGee, J. H., Rooney, C. & Young, M. J. An uncultivated virus infecting a nanoarchaeal parasite in the hot springs of Yellowstone National Park. J. Virol. 94, e01213–19 (2020).

  38. Li, Z. et al. Deep sea sediments associated with cold seeps are a subsurface reservoir of viral diversity. ISME J. 15, 2366–2378 (2021).

  39. Martínez-García, M., Santos, F., Moreno-Paz, M., Parro, V. & Antón, J. Unveiling viral–host interactions within the ‘microbial dark matter’. Nat. Commun. 5, 4542 (2014).

  40. Liu, J., Jaffe, A. L., Chen, L., Bor, B. & Banfield, J. F. Host translation machinery is not a barrier to phages that interact with both CPR and non-CPR bacteria. mBio 14, e0176623 (2023).

  41. Liu, Y. et al. Diversity, taxonomy, and evolution of archaeal viruses of the class Caudoviricetes. PLoS Biol. 19, e3001442 (2021).

  42. Zhou, Y. et al. Diverse viruses of marine archaea discovered using metagenomics. Environ. Microbiol. 25, 367–382 (2023).

  43. Mavrich, T. N. & Hatfull, G. F. Bacteriophage evolution differs by host, lifestyle and genome. Nat. Microbiol. 2, 17112 (2017).

  44. Liu, X. et al. Insights into the ecology, evolution, and metabolism of the widespread Woesearchaeotal lineages. Microbiome 6, 102 (2018).

  45. La Cono, V. et al. Symbiosis between nanohaloarchaeon and haloarchaeon is based on utilization of different polysaccharides. Proc. Natl Acad. Sci. USA 117, 20223–20234 (2020).

  46. Aulitto, M. et al. Genomics, transcriptomics, and proteomics of SSV1 and related fusellovirus: a minireview. Viruses 14, 2082 (2022).

  47. Wang, F. et al. Spindle-shaped archaeal viruses evolved from rod-shaped ancestors to package a larger genome. Cell 185, 1297–1307.e11 (2022).

  48. Tamarit, D. et al. A closed Candidatus Odinarchaeum chromosome exposes Asgard archaeal viruses. Nat. Microbiol. 7, 948–952 (2022).

  49. Yutin, N., Bäckström, D., Ettema, T. J. G., Krupovic, M. & Koonin, E. V. Vast diversity of prokaryotic virus genomes encoding double jelly-roll major capsid proteins uncovered by genomic and metagenomic sequence analysis. Virol. J. 15, 67 (2018).

  50. Roine, E. et al. New, closely related haloarchaeal viral elements with different nucleic acid types. J. Virol. 84, 3682–3689 (2010).

  51. Kazlauskas, D., Krupovic, M. & Venclovas, Č. The logic of DNA replication in double-stranded DNA viruses: insights from global analysis of viral genomes. Nucleic Acids Res. 44, 4551–4564 (2016).

  52. Kazlauskas, D., Krupovic, M., Guglielmini, J., Forterre, P. & Venclovas, Č. Diversity and evolution of B-family DNA polymerases. Nucleic Acids Res. 48, 10142–10156 (2020).

  53. Barry, E. R. & Bell, S. D. DNA replication in the Archaea. Microbiol. Mol. Biol. Rev. 70, 876–887 (2006).

  54. Madru, C. et al. Structural basis for the increased processivity of D-family DNA polymerases in complex with PCNA. Nat. Commun. 11, 1591 (2020).

  55. Guilliam, T. A., Keen, B. A., Brissett, N. C. & Doherty, A. J. Primase–polymerases are a functionally diverse superfamily of replication and repair enzymes. Nucleic Acids Res. 43, 6651–6664 (2015).

  56. Fillat, M. F. The FUR (ferric uptake regulator) superfamily: diversity and versatility of key transcriptional regulators. Arch. Biochem. Biophys. 546, 41–52 (2014).

  57. du Penhoat, C. H. et al. The NMR solution structure of the 30S ribosomal protein S27e encoded in gene RS27_ARCFU of Archaeoglobus fulgidis reveals a novel protein fold. Protein Sci. 13, 1407–1416 (2004).

  58. Schmitt, E. et al. Recent advances in Archaeal translation initiation. Front. Microbiol. 11, 584152 (2020).

  59. Al-Shayeb, B. et al. Clades of huge phages from across Earth’s ecosystems. Nature 578, 425–431 (2020).

  60. Jeudy, S. et al. The DNA methylation landscape of giant viruses. Nat. Commun. 11, 2657 (2020).

  61. Murphy, J., Mahony, J., Ainsworth, S., Nauta, A. & van Sinderen, D. Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl. Environ. Microbiol. 79, 7547–7555 (2013).

  62. Agarkova, I. V., Dunigan, D. D. & Van Etten, J. L. Virion-associated restriction endonucleases of chloroviruses. J. Virol. 80, 8114–8123 (2006).

  63. Bellas, C. M., Schroeder, D. C., Edwards, A., Barker, G. & Anesio, A. M. Flexible genes establish widespread bacteriophage pan-genomes in cryoconite hole ecosystems. Nat. Commun. 11, 4403 (2020).

  64. Goodrich-Blair, H. & Shub, D. A. Beyond homing: competition between intron endonucleases confers a selective advantage on flanking genetic markers. Cell 84, 211–221 (1996).

  65. Markine-Goriaynoff, N. et al. Glycosyltransferases encoded by viruses. J. Gen. Virol. 85, 2741–2754 (2004).

  66. Pausch, P. et al. CRISPR–CasΦ from huge phages is a hypercompact genome editor. Science 369, 333–337 (2020).

  67. Medvedeva, S. et al. Virus-borne mini-CRISPR arrays are involved in interviral conflicts. Nat. Commun. 10, 5204 (2019).

  68. Faure, G. et al. CRISPR–Cas in mobile genetic elements: counter-defence and beyond. Nat. Rev. Microbiol. 17, 513–525 (2019).

  69. Al-Shayeb, B. et al. Diverse virus-encoded CRISPR–Cas systems include streamlined genome editors. Cell 185, 4574–4586.e16 (2022).

  70. Danovaro, R. et al. Virus-mediated archaeal hecatomb in the deep seafloor. Sci. Adv. 2, e1600492 (2016).

  71. Roux, S., Hallam, S. J., Woyke, T. & Sullivan, M. B. Viral dark matter and virus–host interactions resolved from publicly available microbial genomes. eLife 4, e08490 (2015).

  72. Kim, J.-G. et al. Spindle-shaped viruses infect marine ammonia-oxidizing thaumarchaea. Proc. Natl Acad. Sci. USA 116, 15645–15650 (2019).

  73. Quemin, E. R. J. et al. Eukaryotic-like virus budding in Archaea. mBio 7, e01439-16 (2016).

  74. Dellas, N., Snyder, J. C., Bolduc, B. & Young, M. J. Archaeal viruses: diversity, replication, and structure. Annu. Rev. Virol. 1, 399–426 (2014).

  75. Tesson, F. et al. Systematic and quantitative view of the antiviral arsenal of prokaryotes. Nat. Commun. 13, 2561 (2022).

  76. Baker, B. J. et al. Diversity, ecology and evolution of Archaea. Nat. Microbiol. 5, 887–900 (2020).

  77. Paez-Espino, D. et al. Uncovering Earth’s virome. Nature 536, 425–430 (2016).

  78. Chklovski, A., Parks, D. H., Woodcroft, B. J. & Tyson, G. W. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat. Methods 20, 1203–1212 (2023).

  79. Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics 38, 5315–5316 (2022).

  80. Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 50, D785–D794 (2022).

  81. Bland, C. et al. CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8, 209 (2007).

  82. Couvin, D. et al. CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins. Nucleic Acids Res. 46, W246–W251 (2018).

  83. Russel, J., Pinilla-Redondo, R., Mayo-Muñoz, D., Shah, S. A. & Sørensen, S. J. CRISPRCasTyper: automated identification, annotation, and classification of CRISPR–Cas loci. CRISPR J. 3, 462–469 (2020).

  84. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).

  85. Abby, S. S., Néron, B., Ménager, H., Touchon, M. & Rocha, E. P. C. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR–Cas systems. PLoS ONE 9, e110726 (2014).

  86. Camargo, A. P. et al. IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata. Nucleic Acids Res. 51, D733–D743 (2023).

  87. Guo, J. et al. VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses. Microbiome 9, 37 (2021).

  88. Camargo, A. P. et al. Identification of mobile genetic elements with geNomad. Nat. Biotechnol. 42, 1303–1312 (2023).

  89. Li, D., Liu, C.-M., Luo, R., Sadakane, K. & Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).

  90. Steinegger, M. et al. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 20, 473 (2019).

  91. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).

  92. Marchler-Bauer, A. et al. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 43, D222–D226 (2015).

  93. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  94. Terzian, P. et al. PHROG: families of prokaryotic virus proteins clustered using remote homology. NAR Genom. Bioinform. 3, lqab067 (2021).

  95. Chandonia, J.-M. et al. SCOPe: improvements to the structural classification of proteins—extended database to facilitate variant interpretation and machine learning. Nucleic Acids Res. 50, D553–D559 (2022).

  96. The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).

  97. Bin Jang, H. et al. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nat. Biotechnol. 37, 632–639 (2019).

  98. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  99. Gilchrist, C. L. M. & Chooi, Y.-H. clinker & clustermap.js: automatic generation of gene cluster comparison figures. Bioinformatics 37, 2473–2475 (2021).

  100. Nishimura, Y. et al. ViPTree: the viral proteomic tree server. Bioinformatics 33, 2379–2380 (2017).

  101. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  102. Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).

  103. Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).

  104. Frickey, T. & Lupas, A. CLANS: a Java application for visualizing protein families based on pairwise similarity. Bioinformatics 20, 3702–3704 (2004).

  105. Madeira, F. et al. Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Res. 50, W276–W279 (2022).

  106. Waterhouse, A. M., Procter, J. B., Martin, D. M. A., Clamp, M. & Barton, G. J. Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009).

  107. Dobson, L., Reményi, I. & Tusnády, G. E. CCTOP: a Consensus Constrained TOPology prediction web server. Nucleic Acids Res. 43, W408–W412 (2015).

  108. Camargo, A. P. et al. IMG/PR: a database of plasmids from genomes and metagenomes with rich annotations and metadata. Nucleic Acids Res. 52, D164–D173 (2023).

  109. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).

  110. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

  111. Katoh, K. & Standley, D. M. MAFFT Multiple Sequence Alignment Software Version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  112. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  113. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  114. Wu, Z., Liu, S. & Ni, J. DPANN archaeal genomes and direct repeat sequences. Zenodo https://doi.org/10.5281/zenodo.13731608 (2024).

  115. Wu, Z., Liu, S. & Ni, J. DPANN viruses and unclassified MGEs assembled from metagenomes. Zenodo https://doi.org/10.5281/zenodo.13731609 (2024).

Acknowledgements

This study was supported by the National Natural Science Foundation of China under grant numbers 423B2703 (Z.W.), U2240205 (J.N.), 92047303 (J.N.) and 51721006 (J.N.).

Author information

Contributions

J.N. designed the research. Z.W. conducted the bioinformatic and statistical analyses. Z.W. wrote the paper with help from S.L. and J.N. All the authors read and approved the final paper.

Corresponding author

Correspondence to Jinren Ni.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Microbiology thanks Janina Rahlff, Christian Rinke and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Genomic comparison of seven groups of novel DPANN viruses.

Homologous genes are displayed in the same color, and the percentage of sequence identity at the protein level is indicated by different shades of grey (see scale at the bottom). Complete viral genomes are indicated by blue IDs on the left. Annotated genes related to viral hallmarks are denoted by arrows with red borders.

Extended Data Fig. 2 Maximum-likelihood phylogenetic tree of MCPs encoded by archaeal head-tailed viruses.

Black dots on the branches denote bootstrap support values > 50%. AlphaFold2-predicted MCP structural models are presented on the right.

Extended Data Fig. 3 Multiple sequence alignment of the MCPs of Kirinvirus, Sulfolobus spindle-shaped virus 1, and His 1 virus.

Multiple sequence alignment was generated using Clustal Omega and visualized using Jalview.

Extended Data Fig. 4 Sequence similarity network of prokaryotic virus DJR MCPs.

Protein sequences were clustered based on pairwise sequence similarity using CLANS. Each node in the network represents a prokaryotic virus DJR MCP sequence, and edges connect similar sequences with a CLANS p-value ≤ 0.0001. Distinct groups of DJR MCPs are shown by different colors. The structural model of Ditingvirus DJR MCP was predicted by AlphaFold2, while others were derived from the PDB database. The virus name (Ditingvirus) or PDB accession numbers are provided in parentheses.
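To illustrate how a similarity threshold shapes such a network, the sketch below connects sequences whose pairwise p-value is at or below the cutoff of 0.0001 and then reports the connected components. This is only an approximation of the idea: CLANS arranges sequences with a force-directed layout rather than by computing connected components, and the sequence identifiers, p-values and function names here are hypothetical.

```python
# Toy similarity network: edges are kept when the pairwise p-value <= threshold.
# All identifiers and p-values below are hypothetical.

def build_network(pairwise_pvalues: dict[tuple[str, str], float],
                  threshold: float = 1e-4) -> dict[str, set[str]]:
    """Return an undirected adjacency map with edges for p-values <= threshold."""
    adjacency: dict[str, set[str]] = {}
    for (a, b), p_value in pairwise_pvalues.items():
        # Register both nodes even if the pair does not pass the cutoff.
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if p_value <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into connected components using an iterative depth-first search."""
    seen: set[str] = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(sorted(component))
    return components


pairwise_pvalues = {
    ("mcp_a", "mcp_b"): 1e-10,  # similar: edge kept
    ("mcp_b", "mcp_c"): 1e-4,   # exactly at the cutoff: edge kept
    ("mcp_c", "mcp_d"): 0.03,   # too dissimilar: no edge
    ("mcp_d", "mcp_e"): 1e-6,   # similar: edge kept
}

groups = connected_components(build_network(pairwise_pvalues))
print(groups)
```

Tightening the threshold removes edges and splits groups, while loosening it merges them, which is why the cutoff is reported alongside the figure.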

Extended Data Fig. 5 Maximum-likelihood phylogenetic tree of ribosomal proteins S27e.

Black dots on the branches denote bootstrap support values > 70%.

Extended Data Fig. 6 Protein alignment of HvCas12f1, Cas14a.1, Cas14a.2, and Cas14a.3.

The secondary structure of HvCas12f1 is indicated above the sequences. The key residues of the RuvC domain are marked with triangles below the sequences. Multiple sequence alignment was generated using Clustal Omega and visualized using ESPript3.

Extended Data Fig. 7 Abundance pattern of DPANN viruses and their hosts.

a, Boxplots showing relative abundances of DPANN viruses and their hosts in metagenomic data. For each boxplot, the central line indicates the median, the box spans the interquartile range (25th to 75th percentile) and the whiskers extend to 1.5 times the interquartile range. The differences in relative abundances were assessed using the paired two-sided Wilcoxon test (P = 5.244 × 10⁻⁸). b, Average virus-to-host ratios (VHRs) for DPANN viruses in multiple metagenomes. The VHR in each sample was calculated as the ratio of the relative abundances of viral and host genomes. Detailed information is provided in Supplementary Table 9.
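The VHR calculation described in panel b can be sketched as follows. This is a minimal illustration, assuming one virus-host pair with hypothetical relative abundances in three hypothetical samples; it does not reproduce the statistical test in panel a or the authors' abundance profiling.

```python
from statistics import mean


def virus_to_host_ratio(viral_abundance: float, host_abundance: float) -> float:
    """VHR for one sample: viral relative abundance divided by host relative abundance."""
    if host_abundance <= 0:
        raise ValueError("host abundance must be positive to compute a ratio")
    return viral_abundance / host_abundance


# Hypothetical (viral, host) relative abundances per sample, in percent.
samples = {
    "sample_1": (0.50, 0.02),
    "sample_2": (0.12, 0.04),
    "sample_3": (0.03, 0.06),
}

# Per-sample VHRs for the virus-host pair.
vhrs = {name: virus_to_host_ratio(virus, host)
        for name, (virus, host) in samples.items()}

# Average VHR across the samples in which the pair was detected.
average_vhr = mean(vhrs.values())

# Samples in which the virus outnumbers its host more than tenfold.
high_vhr_samples = [name for name, ratio in vhrs.items() if ratio > 10]

print(vhrs, average_vhr, high_vhr_samples)
```

Because both abundances are relative measures from the same metagenome, the ratio is unitless; samples where the host is undetected have to be excluded or handled separately, which the sketch signals by raising an error.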

Extended Data Fig. 8 A conceptual map for viral proliferation and host interaction mechanisms of novel DPANN viruses.

Seven DPANN viral groups are differentiated by distinct colored polygons or ellipses. Figure created with BioRender.com.

Supplementary information

Supplementary Information

Supplementary Figs. 1–4.

Reporting Summary

Supplementary Tables

Supplementary Table 1: Number of DPANN archaeal genomes with completeness ≥50% and contamination <10%.

Supplementary Table 2: Complete or near-complete CRISPR–Cas systems in DPANN archaea.

Supplementary Table 3: Other defence systems found in DPANN archaea using DefenseFinder.

Supplementary Table 4: List of viruses and unclassified MGEs associated with DPANN archaea identified in this study.

Supplementary Table 5: vCONTACT2 network analysis result of DPANN MGEs and NCBI Refseq prokaryotic viruses. In vCONTACT2, P values are estimated using a one-sided Mann–Whitney U-test.

Supplementary Table 6: Taxonomic proposal for DPANN archaeal viral species with complete genomes.

Supplementary Table 7: Detailed functional annotation results for representative viruses and circular unclassified MGEs associated with DPANN archaea.

Supplementary Table 8: Statistics of the overall functional annotation of DPANN viruses and unclassified MGEs.

Supplementary Table 9: Detailed information of DPANN virus-to-host ratios.

Supplementary Table 10: List of NCBI metagenomic data used in this study.

About this article

Cite this article

Wu, Z., Liu, S. & Ni, J. Metagenomic characterization of viruses and mobile genetic elements associated with the DPANN archaeal superphylum. Nat Microbiol 9, 3362–3375 (2024). https://doi.org/10.1038/s41564-024-01839-y

  • DOI: https://doi.org/10.1038/s41564-024-01839-y
