Gut microbiome evolution from infancy to 8 years of age

Abstract

The human gut microbiome is most dynamic in early life. Although sweeping changes in taxonomic architecture are well described, it remains unknown how, and to what extent, individual strains colonize and persist and how selective pressures define their genomic architecture. In this study, we combined shotgun sequencing of 1,203 stool samples from 26 mothers and their twins (52 infants), sampled from childbirth to 8 years after birth, with culture-enhanced, deep short-read and long-read stool sequencing from a subset of 10 twin pairs (20 infants) to define transmission, persistence and evolutionary trajectories of gut species from infancy to middle childhood. We constructed 3,995 strain-resolved metagenome-assembled genomes across 399 taxa, and we found that 27.4% persist within individuals. We identified 726 strains shared within families, with Bacteroidales, Oscillospiraceae and Lachnospiraceae, but not Bifidobacteriaceae, vertically transferred. Lastly, we identified weaning as a critical inflection point that accelerates bacterial mutation rates and separates functional profiles of genes accruing mutations.


Fig. 1: Microbiome dynamics from birth through middle childhood.
Fig. 2: Pre-weaning exposure to breastmilk impacts transition through microbiome enterotypes.
Fig. 3: 3,995 RGs are recovered from 10 twin pairs and their mothers.
Fig. 4: Strain persistence, diversity and co-occurrence from birth through middle childhood.
Fig. 5: Intra-family strain-sharing events vary by phylogeny.
Fig. 6: The weaning period is a mutation-generating hotspot that triggers a shift in mutated gene functions.

Data availability

All genomic data generated in this work are publicly available at the NCBI under BioProject ID PRJNA1060349. All individual-specific and sample-specific metadata are available in Supplementary Tables 1 and 2, and all NCBI RefSeq and Type Strain representative genomes are identified in Supplementary Table 3. Generated raw data are available in Supplementary Tables 4–12.

Code availability

All tools and R packages used for this analysis are publicly available and fully described in the Methods section. Detailed code used for data visualization and analysis is available at https://github.com/sanjsawhney/twins_diet.


Acknowledgements

This work was supported, in part, by awards from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD; grant R01HD092414, principal investigators (PIs): P.I.T. and G.D.) of the National Institutes of Health (NIH); the National Institute of Allergy and Infectious Diseases (NIAID; grant R01AI155893, PI: G.D.) of the NIH; the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK; grant 5P30 DK052574, PI: P.I.T.) of the NIH to the Biobank Core of the Washington University Digestive Disease Research Core Center; and the Children’s Discovery Institute (grant MD-FR-2013-292, PI: B.B.W.). S.S. is supported by the NIH-funded Training Programs in Cellular & Molecular Biology (grant T32GM007067, PI: H. True-Krob; National Institute of General Medical Sciences). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. We thank past and present members of the Dantas laboratory, specifically D. J. Schwartz and A. D’Souza, for helpful scientific discussions and staff from the Edison Family Center for Genome Sciences & Systems Biology, including E. Martin, B. Koebbe, J. Hoisington-López, M. Crosby and B. Dee, for technical and administrative support in high-throughput sequencing and computing.

Author information

Contributions

Conceptualization: B.B.W., P.I.T. and G.D.; sample curation: B.B.W. and P.I.T.; sample preparation: R.T., A.T., C.H.-M. and M.N.; methodology, investigation, data curation and analyses: S.S.S., R.T., A.T. and B.M.; figure preparation and writing of the manuscript: S.S.S.; reviewing and editing of the manuscript: all authors. S.S.S., R.T. and A.T. contributed equally.

Corresponding author

Correspondence to Gautam Dantas.

Ethics declarations

Competing interests

P.I.T. is a holder of equity in, a consultant to and a member of the Scientific Advisory Board of MediBeacon, Inc., which is developing a technology to non-invasively measure intestinal permeability in humans. P.I.T. is a co-inventor on patents assigned to MediBeacon (US patents 11,285,223 and 11,285,224, titled ‘Compositions and methods for assessing gut function’, and US patent application 2022-0326255, titled ‘Methods of monitoring mucosal healing’), which might earn royalties if the technology is commercialized. P.I.T. receives compensation for his roles as Chair, Scientific Advisory Board of the AGA Center for Microbiome Research and Education, and consultant to Temple University on waterborne enteric infections. He is a member of the Data Safety Monitoring Board of Inmunova, which is developing an immune biologic targeting Shiga toxin–producing E. coli infections, for which he receives no compensation, except for reimbursement of expenses. P.I.T. receives royalties from UpToDate from two sections on intestinal E. coli infections. G.D. is a consultant to and a member of the Scientific Advisory Board of Pluton Biosciences, which is developing methods for discovering environmental microbes for commercial applications. G.D. has consulted for SNIPR Technologies, Ltd. in the last 5 years but not presently. S.S. has consulted for Hypha Life Sciences and BioGenerator Ventures in the last 5 years but not presently. The authors declare no other competing interests.

Peer review

Peer review information

Nature Medicine thanks Joseph Neu, Danielle Lemay and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Saheli Sadanand and Alison Farrell, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Timeline of samples from infants and mothers in this study.

The 26 families are given identifiers between 06 and 48. Twins are distinguished by -1 and -2 following the family identifier, and mothers are distinguished with a C0 before the family identifier. Light blue fill identifies samples that underwent shallow shotgun sequencing. Black fill identifies samples that underwent ‘extended sequencing’, that is, deep shotgun sequencing, pooled long-read sequencing and culture-enhanced sequencing. Four infants (20-2, 29-1, 39-2 and 43-1) were extensively sampled for extended sequencing. Blue diamonds identify the weaning timepoint for all children.

Extended Data Fig. 2 Selected species prevalence over time.

Percent of infant cohort at each timepoint that carries a species within Clostridium, Eubacterium, Bacteroides, Streptococcus, Bifidobacterium, Ruminococcus, Veillonella, Roseburia, or Blautia. Measures of center are a smoothed conditional mean (local polynomial regression), with 95% CI shown in gray.

Extended Data Fig. 3 Maternal postpartum microbiome dynamics.

(A) Bray-Curtis dissimilarities between stool samples from different mothers (“Between”) and stool samples from the same mother over time (“Within”) (n = 87 maternal stool metagenomes, P = 2.2e-16, two-tailed Wilcoxon test). Boxes represent IQR with median line, whiskers extend to 1.5xIQR. (B) Principal coordinate analysis of maternal samples plotted by Bray-Curtis dissimilarity. Samples are colored by months postpartum. Samples from the same mother are connected. (C) Bray-Curtis dissimilarity between mother-infant dyads over time. Distance of child metagenomes to the average microbiome profile for the respective mother (solid orange), to the most recent preceding stool sample from the same mother (dashed orange), and to unrelated mothers at the same timepoint (solid navy blue), plotted over time. Measures of center are a smoothed conditional mean (local polynomial regression), with 95% CI shown in gray.
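For readers unfamiliar with the metric used in these panels, the following is a minimal sketch of the standard Bray-Curtis dissimilarity between two abundance profiles. It is an illustration only and not the authors' analysis code; the function name `bray_curtis` and the dictionary-based input format are assumptions made for this example.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile is a dict mapping a taxon name to a non-negative
    abundance (hypothetical input format chosen for this sketch).
    Returns 0.0 for identical profiles and 1.0 for profiles that
    share no taxa.
    """
    taxa = set(profile_a) | set(profile_b)
    # Sum of the lesser abundance for every taxon (zero if absent from one).
    shared = sum(min(profile_a.get(t, 0.0), profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.values()) + sum(profile_b.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    return 1.0 - 2.0 * shared / total
```

A "Within" comparison would apply this to two samples from the same mother, and a "Between" comparison to samples from different mothers.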

Extended Data Fig. 4 Cohort distributions by antibiotic exposure and birth mode.

(A) Percent of pre-weaning months with at least one antibiotic exposure is plotted for each infant, binned by cohort. No significant differences between cohorts are observed (FDR-corrected Kruskal-Wallis test). Median lines are displayed. (B) Total infants delivered via C-section and vaginal birth are plotted and binned by cohort. No significant differences in birth mode ratio are observed (P = 0.71, Chi-square, d.f. = 2). (C) Gestational age at birth of infants, binned by cohort. No significant differences between cohorts are observed (FDR-corrected Kruskal-Wallis test).

Extended Data Fig. 5 Enterotype distribution and composition.

(A) Enterotype distribution across infant samples. Principal coordinate analysis plotting Bray-Curtis dissimilarity between all infant samples (n = 1,099 stool), with enterotype indicated by color. (B) Percent pre-weaning exposure to breastmilk of infant stool in adult enterotypes, segmented by microbiome development phase. Percents are defined as the number of pre-weaning months with any breastmilk feeding, divided by total pre-weaning months. Boxes represent IQR with median line, whiskers extend to 1.5xIQR, and * and *** correspond to q < 0.05 and q < 0.001, respectively (FDR-corrected one-way Mann-Whitney test).
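The percent definition in panel (B) can be sketched as a short calculation. This is a hedged illustration of the stated ratio (pre-weaning months with any breastmilk feeding divided by total pre-weaning months); the function name and the list-of-booleans input are assumptions for the example, not the study's data format.

```python
def percent_breastmilk_exposure(monthly_breastmilk):
    """Percent of pre-weaning months with any breastmilk feeding.

    `monthly_breastmilk` is a sequence with one boolean per pre-weaning
    month (hypothetical input format): True if the infant received any
    breastmilk during that month.
    """
    months = list(monthly_breastmilk)
    if not months:
        raise ValueError("at least one pre-weaning month is required")
    exposed = sum(1 for fed in months if fed)
    return 100.0 * exposed / len(months)
```

For example, an infant with breastmilk feeding in two of four pre-weaning months would score 50%.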

Extended Data Fig. 6 Evaluating MAG assembly approaches.

Summary statistics for (A) total high- (HQ) and medium-quality (MQ) post-DAS Tool MAGs (Putative Genomes) per stool, (B) average contigs per MAG, and (C) average N50 per MAG. HQ is defined as completeness ≥ 90%, contamination ≤ 5%, strain heterogeneity ≤ 0.5%, and MQ is defined as completeness ≥ 50%, contamination ≤ 5%. Boxes represent IQR with median line, and whiskers extend to 1.5xIQR. For each panel, six approaches were considered against 10 Putative Genomes: (1) timepoint-filtered (TF) long-read meta-assemblies, (2) timepoint-specific (TS) short-read meta-assemblies, (3) metaSPAdes merged with filtered long-read meta-assemblies, (4) metaSPAdes–nanopore merged with filtered long-read meta-assemblies, (5) unfiltered long-read meta-assemblies merged with timepoint-specific metaSPAdes, and (6) OPERA-MS.
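The HQ and MQ thresholds stated above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name, the argument names and the "LQ" label for genomes meeting neither definition are hypothetical, and the inputs are assumed to be percentages such as those reported by a genome quality tool.

```python
def mag_quality(completeness, contamination, strain_heterogeneity):
    """Classify a MAG using the thresholds given in the caption.

    HQ: completeness >= 90%, contamination <= 5%, strain heterogeneity <= 0.5%.
    MQ: completeness >= 50%, contamination <= 5%.
    "LQ" is a hypothetical label for anything meeting neither definition.
    """
    if contamination <= 5.0:
        if completeness >= 90.0 and strain_heterogeneity <= 0.5:
            return "HQ"
        if completeness >= 50.0:
            return "MQ"
    return "LQ"
```

Under these definitions, a nearly complete genome with strain heterogeneity above 0.5% still qualifies as MQ, because the MQ definition places no limit on strain heterogeneity.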

Extended Data Fig. 7 Putative and Reconstructed Genome capture by time.

(A) Putative Genome count per sample increases until approximately 3 years of life (YOL) before stabilizing (n = 177 stools from 20 infants). (B) Timepoints 5 and 6, covering the 3 YOL and 7-8 YOL infant samples, harbor more Putative Genomes than the early-life timepoints, taken within the first 2 YOL (n = 16 infants, 6 stools per infant). (C) Putative Genome count per timepoint is stable longitudinally (n = 10 mothers, 3-4 stools per mother). (D) Comparison of Putative Genome counts per individual before and after quality filtering (n = 10 mothers, 20 infants). (E) Number of quality-filtered Putative Genomes per dRep secondary cluster during MAG dereplication. (F) Comparison of MAG counts for each sample before and after dereplication. For (A) and (F), measures of center are a smoothed conditional mean (LOESS local polynomial regression), with 95% CI shown in gray. For (B) and (C), boxes represent IQR with median line, and whiskers extend to 1.5xIQR.

Extended Data Fig. 8 Generalized mutation rates of persisting RGs in mothers and infants.

Aggregate breadth- and genome size-adjusted popSNPs per persisting RG plotted by years since seeding with (A) linear regression or (B) local regression trendline. Shaded region represents 95% CI.

Extended Data Fig. 9 Mutation rates and mutated gene profiles binned by sub-cohort and period relative to weaning.

(A-C) Breadth- and genome size-adjusted popSNP counts by years since seeding (A) for all persisting RGs (n = 255 breastfed-origin persisting MAGs, 359 formula-origin persisting MAGs; P = 0.912, two-tailed Mann-Whitney test), (B) for persisting RGs seeded pre-weaning (n = 125 breastfed-origin persisting MAGs, 192 formula-origin persisting MAGs; P = 0.754, two-tailed Mann-Whitney test), or (C) for persisting RGs seeded post-weaning (n = 130 breastfed-origin persisting MAGs, 167 formula-origin persisting MAGs; P = 0.965, two-tailed Mann-Whitney test), each binned by pre-weaning diet. (D) Mutation rates per persisting RG, binned by seeding with respect to weaning. Non-mutating RGs from each cohort are excluded prior to statistical analysis (n = 225 pre-weaning persisting MAGs, 221 post-weaning persisting MAGs; P < 0.0001, two-tailed Mann-Whitney test). (E) For each RG that persists through weaning, mutation rate calculated from seeding to the last timepoint (TP) before weaning, and mutation rate calculated between the immediate timepoints flanking weaning. RGs that did not accrue any mutations pre-weaning are excluded prior to statistical analysis (n = 45 persisting MAGs, P = 0.0292, two-tailed Wilcoxon matched-pairs test). (F) Pairwise Bray-Curtis dissimilarity between pre-weaning and post-weaning distributions relative to that of each mother, binned by intra- vs. inter-family comparisons (n = 4 infants, 4 mothers; P = 0.63 same vs. different family, pre-weaning, P = 0.72 same vs. different family, post-weaning, two-tailed Kruskal-Wallis test). Boxes represent IQR with median line, and whiskers extend to 1.5xIQR.

Supplementary information

Supplementary Information

Supplementary File 1

Reporting Summary

Supplementary Tables 1–12

Supplementary Table 1 Patient metadata.
Supplementary Table 2 Sample metadata.
Supplementary Table 3 List of all NCBI RefSeq and Type Strain RGs, along with CheckM and QUAST quality statistics.
Supplementary Table 4 Assembly quality statistics for all PGs and RGs.
Supplementary Table 5 Raw within-individual inStrain Compare output.
Supplementary Table 6 Within-individual persistence data for non-transient RGs.
Supplementary Table 7 Raw within-family inStrain Compare output.
Supplementary Table 8 Within-family tracking data for shared RGs.
Supplementary Table 9 SNP tracking for all within-individual RGs.
Supplementary Table 10 SNP tracking for all within-individual RGs, parsed.
Supplementary Table 11 Mutation rates per RG.
Supplementary Table 12 RG ORFs accruing mutations during within-individual persistence.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Sawhney, S.S., Thänert, R., Thänert, A. et al. Gut microbiome evolution from infancy to 8 years of age. Nat Med 31, 2004–2015 (2025). https://doi.org/10.1038/s41591-025-03610-0
