Abstract
The human gut microbiome is most dynamic in early life. Although sweeping changes in taxonomic architecture are well described, it remains unknown how, and to what extent, individual strains colonize and persist and how selective pressures define their genomic architecture. In this study, we combined shotgun sequencing of 1,203 stool samples from 26 mothers and their twins (52 infants), sampled from childbirth to 8 years after birth, with culture-enhanced, deep short-read and long-read stool sequencing from a subset of 10 twins (20 infants) to define transmission, persistence and evolutionary trajectories of gut species from infancy to middle childhood. We constructed 3,995 strain-resolved metagenome-assembled genomes across 399 taxa, and we found that 27.4% persist within individuals. We identified 726 strains shared within families, with Bacteroidales, Oscillospiraceae and Lachnospiraceae, but not Bifidobacteriaceae, vertically transferred. Lastly, we identified weaning as a critical inflection point that accelerates bacterial mutation rates and separates functional profiles of genes accruing mutations.
Data availability
All genomic data generated in this work are publicly available at the NCBI under BioProject ID PRJNA1060349. All individual-specific and sample-specific metadata are available in Supplementary Tables 1 and 2, and all NCBI RefSeq and Type Strain representative genomes are identified in Supplementary Table 3. Generated raw data are available in Supplementary Tables 4–12.
Code availability
All tools and R packages used for this analysis are publicly available and fully described in the Methods section. Detailed code used for data visualization and analysis is available at https://github.com/sanjsawhney/twins_diet.
Acknowledgements
This work was supported, in part, by awards from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD; grant R01HD092414, principal investigators (PIs): P.I.T. and G.D.) of the National Institutes of Health (NIH); the National Institute of Allergy and Infectious Diseases (NIAID; grant R01AI155893, PI: G.D.) of the NIH; the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK; grant 5P30 DK052574, PI: P.I.T.) of the NIH to the Biobank Core of the Washington University Digestive Disease Research Core Center; and the Children’s Discovery Institute (grant MD-FR-2013-292, PI: B.B.W.). S.S. is supported by the NIH-funded Training Programs in Cellular & Molecular Biology (grant T32GM007067, PI: H. True-Krob; National Institute of General Medical Sciences). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. We thank past and present members of the Dantas laboratory, specifically D. J. Schwartz and A. D’Souza, for helpful scientific discussions and staff from the Edison Family Center for Genome Sciences & Systems Biology, including E. Martin, B. Koebbe, J. Hoisington-López, M. Crosby and B. Dee, for technical and administrative support in high-throughput sequencing and computing.
Author information
Contributions
Conceptualization: B.B.W., P.I.T. and G.D.; sample curation: B.B.W. and P.I.T.; sample preparation: R.T., A.T., C.H.-M. and M.N.; methodology, investigation, data curation and analyses: S.S.S., R.T., A.T. and B.M.; figure preparation and writing of the manuscript: S.S.S.; reviewing and editing of the manuscript: all authors. S.S.S., R.T. and A.T. contributed equally.
Ethics declarations
Competing interests
P.I.T. is a holder of equity in, a consultant to and a member of the Scientific Advisory Board of MediBeacon, Inc., which is developing a technology to non-invasively measure intestinal permeability in humans. P.I.T. is a co-inventor on patents assigned to MediBeacon (US patents 11,285,223 and 11,285,224, titled ‘Compositions and methods for assessing gut function’, and US patent application 2022-0326255, titled ‘Methods of monitoring mucosal healing’), which might earn royalties if the technology is commercialized. P.I.T. receives compensation for his roles as Chair, Scientific Advisory Board of the AGA Center for Microbiome Research and Education, and consultant to Temple University on waterborne enteric infections. He is a member of the Data Safety Monitoring Board of Inmunova, which is developing an immune biologic targeting Shiga toxin–producing E. coli infections, for which he receives no compensation, except for reimbursement of expenses. P.I.T. receives royalties from UpToDate from two sections on intestinal E. coli infections. G.D. is a consultant to and a member of the Scientific Advisory Board of Pluton Biosciences, which is developing methods for discovering environmental microbes for commercial applications. G.D. has consulted for SNIPR Technologies, Ltd. in the last 5 years but not presently. S.S. has consulted for Hypha Life Sciences and BioGenerator Ventures in the last 5 years but not presently. The authors declare no other competing interests.
Peer review
Peer review information
Nature Medicine thanks Joseph Neu, Danielle Lemay and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Saheli Sadanand and Alison Farrell, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Timeline of samples from infants and mothers in this study.
The 26 families are given identifiers between 06 and 48. Twins are distinguished by -1 and -2 following the family identifier, and mothers are distinguished with a C0 before the family identifier. Light blue fill identifies samples that underwent shallow shotgun sequencing. Black fill identifies samples that underwent ‘extended sequencing’, that is, deep shotgun sequencing, pooled long-read sequencing and culture-enhanced sequencing. Four infants (20-2, 29-1, 39-2 and 43-1) were extensively sampled for extended sequencing. Blue diamonds identify the weaning timepoint for all children.
Extended Data Fig. 2 Selected species prevalence over time.
Percent of infant cohort at each timepoint that carries a species within Clostridium, Eubacterium, Bacteroides, Streptococcus, Bifidobacterium, Ruminococcus, Veillonella, Roseburia, or Blautia. Measures of center are a smoothed conditional mean (local polynomial regression), with 95% CI shown in gray.
Extended Data Fig. 3 Maternal postpartum microbiome dynamics.
(A) Bray-Curtis dissimilarities between stool samples from different mothers (“Between”) and between stool samples from the same mother over time (“Within”) (n = 87 maternal stool metagenomes, P = 2.2e-16, two-tailed Wilcoxon test). Boxes represent IQR with median line, whiskers extend to 1.5xIQR. (B) Principal coordinate analysis of maternal samples based on Bray-Curtis dissimilarity. Samples are colored by months postpartum. Samples from the same mother are connected. (C) Bray-Curtis dissimilarity between mother-infant dyads over time. Distance of child metagenomes to the average microbiome profile of the respective mother (solid orange), to the most recent preceding stool sample from the same mother (dashed orange), and to unrelated mothers at the same timepoint (solid navy blue), plotted over time. Measures of center are a smoothed conditional mean (local polynomial regression), with 95% CI shown in gray.
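As an illustration of the within- versus between-mother comparison in panel (A), the following is a minimal Python sketch of the standard Bray-Curtis dissimilarity, computed as the sum of absolute abundance differences divided by the sum of all abundances. It is not the authors' code (their analysis used R packages described in the Methods); the function names, sample identifiers and toy abundance vectors are hypothetical.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def within_between(samples):
    """Split all pairwise dissimilarities by whether both samples share a mother.

    samples: list of (mother_id, abundance_vector) tuples.
    Returns (within, between) lists of dissimilarities.
    """
    within, between = [], []
    for (id1, v1), (id2, v2) in combinations(samples, 2):
        d = bray_curtis(v1, v2)
        (within if id1 == id2 else between).append(d)
    return within, between


# Hypothetical toy profiles: two samples from mother "M1", one from "M2".
samples = [
    ("M1", [0.5, 0.5, 0.0]),
    ("M1", [0.6, 0.4, 0.0]),
    ("M2", [0.0, 0.2, 0.8]),
]
within, between = within_between(samples)
```

In this toy input the single within-mother pair is far more similar than either between-mother pair, which is the pattern the panel tests for with a Wilcoxon test.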
Extended Data Fig. 4 Cohort distributions by antibiotic exposure and birth mode.
(A) Percent of pre-weaning months with at least one antibiotic exposure is plotted for each infant, binned by cohort. No significant differences between cohorts are observed (FDR-corrected Kruskal-Wallis test). Median lines are displayed. (B) Total infants delivered via C-section and vaginal birth are plotted and binned by cohort. No significant differences in birth mode ratio are observed (P = 0.71, Chi-square, d.f. = 2). (C) Gestational age at birth of infants, binned by cohort. No significant differences between cohorts are observed (FDR-corrected Kruskal-Wallis test).
Extended Data Fig. 5 Enterotype distribution and composition.
(A) Enterotype distribution across infant samples. Principal coordinate analysis plotting Bray-Curtis dissimilarity between all infant samples (n = 1,099 stool), with enterotype indicated by color. (B) Percent pre-weaning exposure to breastmilk of infant stool in adult enterotypes, segmented by microbiome development phase. Percentages are defined as the number of pre-weaning months with any breastmilk feeding, divided by total pre-weaning months. Boxes represent IQR with median line, whiskers extend to 1.5xIQR, and * and *** correspond to q < 0.05 and q < 0.001, respectively (FDR-corrected one-way Mann-Whitney test).
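The percentage in panel (B) is a simple ratio, sketched below in Python under the assumption that feeding is recorded as one yes/no value per pre-weaning month. The function name and input layout are hypothetical and are not taken from the authors' code.

```python
def percent_breastmilk_exposure(monthly_breastmilk):
    """Percent of pre-weaning months with any breastmilk feeding.

    monthly_breastmilk: list of booleans, one per pre-weaning month
    (True if the infant received any breastmilk that month).
    """
    if not monthly_breastmilk:
        raise ValueError("at least one pre-weaning month is required")
    months_with_breastmilk = sum(1 for fed in monthly_breastmilk if fed)
    return 100.0 * months_with_breastmilk / len(monthly_breastmilk)


# Hypothetical infant: breastmilk in 3 of 6 pre-weaning months.
example = percent_breastmilk_exposure([True, True, True, False, False, False])
```

The same ratio, with antibiotic exposure in place of breastmilk feeding, would describe the per-infant percentage plotted in Extended Data Fig. 4A.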
Extended Data Fig. 6 Evaluating MAG assembly approaches.
Summary statistics for (A) total high- (HQ) and medium-quality (MQ) post-DAS Tool MAGs (Putative Genomes) per stool, (B) average contigs per MAG, and (C) average N50 per MAG. HQ is defined as completeness ≥ 90%, contamination ≤ 5%, strain heterogeneity ≤ 0.5%, and MQ is defined as completeness ≥ 50%, contamination ≤ 5%. Boxes represent IQR with median line, and whiskers extend to 1.5xIQR. For each panel, six approaches were considered against 10 Putative Genomes: (1) timepoint-filtered (TF) long-read meta-assemblies, (2) timepoint-specific (TS) short-read meta-assemblies, (3) metaSPAdes merged with filtered long-read meta-assemblies, (4) metaSPAdes–nanopore merged with filtered long-read meta-assemblies, (5) unfiltered long-read meta-assemblies merged with timepoint-specific metaSPAdes, and (6) OPERA-MS.
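The HQ and MQ thresholds stated in this caption can be written as a small classifier. This Python sketch assumes the three inputs are percentages as reported by a tool such as CheckM; the function name and the "LQ" label for genomes meeting neither definition are assumptions added for illustration.

```python
def classify_mag(completeness, contamination, strain_heterogeneity):
    """Classify a MAG using the thresholds given in the caption.

    HQ: completeness >= 90%, contamination <= 5%, strain heterogeneity <= 0.5%.
    MQ: completeness >= 50%, contamination <= 5%.
    Anything else is labeled "LQ" here (a hypothetical label, not from the caption).
    """
    if contamination > 5.0:
        return "LQ"
    if completeness >= 90.0 and strain_heterogeneity <= 0.5:
        return "HQ"
    if completeness >= 50.0:
        return "MQ"
    return "LQ"


# Hypothetical quality values for three genomes.
labels = [
    classify_mag(95.0, 2.0, 0.1),   # meets all HQ criteria
    classify_mag(95.0, 2.0, 10.0),  # fails the strain heterogeneity criterion
    classify_mag(40.0, 1.0, 0.0),   # too incomplete for MQ
]
```

Because the MQ definition in the caption has no strain heterogeneity criterion, a nearly complete genome that fails only that criterion is treated as MQ in this sketch; that reading is an interpretation of the caption.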
Extended Data Fig. 7 Putative and Reconstructed Genome capture by time.
(A) Putative Genome count per sample increases until approximately 3 YOL before stabilizing (n = 177 stools from 20 infants). (B) Timepoints 5 and 6, covering the 3 YOL and 7-8 YOL infant samples, harbor more Putative Genomes than the early-life timepoints, taken within the first 2 YOL (n = 16 infants, 6 stools per infant). (C) Putative Genome count per timepoint is stable longitudinally (n = 10 mothers, 3-4 stools per mother). (D) Comparison of Putative Genome counts per individual before and after quality filtering (n = 10 mothers, 20 infants). (E) Number of quality-filtered Putative Genomes per dRep secondary cluster during MAG dereplication. (F) Comparison of MAG counts for each sample before and after dereplication. For (A) and (F), measures of center are a smoothed conditional mean (LOESS local polynomial regression), with 95% CI shown in gray. For (B) and (C), boxes represent IQR with median line, and whiskers extend to 1.5xIQR.
Extended Data Fig. 8 Generalized mutation rates of persisting RGs in mothers and infants.
Aggregate breadth- and genome size-adjusted popSNPs per persisting RG plotted by years since seeding with (A) linear regression or (B) local regression trendline. Shaded region represents 95% CI.
Extended Data Fig. 9 Mutation rates and mutated gene profiles binned by sub-cohort and period relative to weaning.
(A-C) Breadth- and genome size-adjusted popSNP counts by years since seeding (A) for all persisting RGs (n = 255 breastfed-origin persisting MAGs, 359 formula-origin persisting MAGs; P = 0.912, two-tailed Mann-Whitney test), (B) for persisting RGs seeded pre-weaning (n = 125 breastfed-origin persisting MAGs, 192 formula-origin persisting MAGs; P = 0.754, two-tailed Mann-Whitney test), or (C) for persisting RGs seeded post-weaning (n = 130 breastfed-origin persisting MAGs, 167 formula-origin persisting MAGs; P = 0.965, two-tailed Mann-Whitney test), each binned by pre-weaning diet. (D) Mutation rates per persisting RG, binned by seeding with respect to weaning. Non-mutating RGs from each cohort are excluded prior to statistical analysis (n = 225 pre-weaning persisting MAGs, 221 post-weaning persisting MAGs; P < 0.0001, two-tailed Mann-Whitney test). (E) For each RG that persists through weaning, mutation rate calculated from seeding to the last timepoint (TP) before weaning, and mutation rate calculated between the immediate timepoints flanking weaning. RGs that did not accrue any mutations pre-weaning are excluded prior to statistical analysis (n = 45 persisting MAGs, P = 0.0292, two-tailed Wilcoxon matched-pairs test). (F) Pairwise Bray-Curtis dissimilarity between pre-weaning and post-weaning distributions relative to that of each mother, binned by intra- vs. inter-family comparisons (n = 4 infants, 4 mothers; P = 0.63 same vs. different family, pre-weaning, P = 0.72 same vs. different family, post-weaning, two-tailed Kruskal-Wallis test). Boxes represent IQR with median line, and whiskers extend to 1.5xIQR.
Supplementary information
Supplementary File 1
Supplementary Tables 1–12
Supplementary Table 1 Patient metadata.
Supplementary Table 2 Sample metadata.
Supplementary Table 3 List of all NCBI RefSeq and Type Strain RGs, along with CheckM and QUAST quality statistics.
Supplementary Table 4 Assembly quality statistics for all PGs and RGs.
Supplementary Table 5 Raw within-individual inStrain compare output.
Supplementary Table 6 Within-individual persistence data for non-transient RGs.
Supplementary Table 7 Raw within-family inStrain compare output.
Supplementary Table 8 Within-family tracking data for shared RGs.
Supplementary Table 9 SNP tracking for all within-individual RGs.
Supplementary Table 10 SNP tracking for all within-individual RGs, parsed.
Supplementary Table 11 Mutation rates per RG.
Supplementary Table 12 RG ORFs accruing mutations during within-individual persistence.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sawhney, S.S., Thänert, R., Thänert, A. et al. Gut microbiome evolution from infancy to 8 years of age. Nat Med 31, 2004–2015 (2025). https://doi.org/10.1038/s41591-025-03610-0
DOI: https://doi.org/10.1038/s41591-025-03610-0