Introduction

The establishment of the infant gut microbiota begins immediately after birth, closely linked with the development of the host’s immune system, metabolism, and intestinal homoeostasis. The process of colonisation occurs from multiple origins1 and varies according to the infant’s age2. Previous studies have reported that the infant gut microbiota development during the first year of life is critical for later risk of diseases and host health3. At birth, prenatal conditions, host genetics, and mode of delivery influence the development of the gut microbiome4,5. Later, antibiotic treatment, mode of feeding, the start of weaning, geographical location and other environmental exposures will be associated with the maturation of the gut microbiota6,7.

In parallel with the development of the early-life microbiota, the infant gut assembles its own resistome, which encompasses all the antibiotic resistance genes (ARGs) within a microbial community8. Thus, the development of the infant resistome is linked to the dynamics of the microbial composition, contributing to the long-term dynamics and stabilisation of the resistome into adulthood. In the early stages of life, acquired antibiotic resistance can result from vertical transmission from the mother at birth or during lactation, and/or from maternal, intra-partum, and infant antibiotic treatment9. Moreover, mobile genetic elements (MGEs), such as most plasmids, can transfer ARGs between members of the gut microbial community in a process known as horizontal gene transfer. The same factors that influence the gut microbiome composition can drive the spread and selection of resistance genes within the developing gut microbiota10. The types of ARGs that dominate in infancy typically confer resistance to common antibiotics such as beta-lactams, tetracyclines, macrolides, aminoglycosides, and quinolones11, as well as to drugs to which infants are not exposed12. Previous studies have indicated that infants have a higher relative abundance of these genes compared to older children13,14,15.

As microbes evolve resistance mechanisms, previously effective treatments become obsolete, increasing clinical complications, mortality, and healthcare costs16. The problem is especially concerning during infancy, a time when early exposure to antibiotic resistance can have lasting health effects. With approximately 214,000 neonatal deaths annually linked to sepsis from ARG-carrying pathogens17, the importance of understanding how the resistome develops with age is clear. Yet, how various factors impact the resistome’s development during this crucial period of time is less explored.

Our research aims to delineate the assembly of the infant gut resistome during the first year of life and to identify the factors that shape its dynamics through the analysis of metagenomic data from infants and their mothers. By mapping out the resistome landscape within the infant gut and the dynamics within their microbial hosts, our work seeks to provide future strategies to ameliorate the spread of antimicrobial resistance in early life.

Results

Dynamics of the gut resistome during the first year of life

Overall, 265 samples were subjected to metagenomic sequencing and further resistome screening by using assembly-free and assembly-based approaches (Supplementary Data 1). The assembly-free approach allowed us to quantify (based on CPM) the occurrence and abundance of ARG in the different samples. ARGs were found in all samples collected. In total, 208 different ARGs were identified in infants and 134 in the mothers, and 122 ARGs (35.67% of the total ARGs) were shared among them. Infants shared more ARGs with their own mothers than with unrelated mothers at 1 month postpartum (Supplementary Fig. 1). Mother-infant dyads shared ARGs conferring resistance to lincosamides, macrolides, streptogramin B, tetracycline, aminoglycosides, beta-lactams and amphenicol (Fig. 1A), but unique ARGs were found in mothers and infants. The proportion of shared ARGs is lower than the proportion of unique ARG in mothers and infants (Fig. 1A). Tetracycline ARGs were the most abundant in infant faecal samples, with tet(S/M) being the most abundant and prevalent (93.4% of samples), followed by tetB, tetA, tetW and tetO (over 75% of samples). Macrolide ARGs were the second most abundant overall, with msrD and mefA present in more than 95% of the samples. ermB, categorised as a Macrolide-lincosamide-streptogramin B (MLSB) resistance gene, was prevalent in 80.6% of the infant samples, followed by ermX and ermT (over 60% of the infant samples). The aminoglycoside resistance gene aph(3’)-IIIa was found in 61.7% of the infant faecal samples, and blaTEM was identified as the most abundant beta-lactam resistance gene (55.1% of the samples) (Supplementary Data 2).
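The reported share of 35.67% appears consistent with dividing the 122 shared ARGs by the sum of the infant and mother totals (208 + 134). This reading is inferred from the numbers rather than stated in the text, so the short sketch below simply shows the arithmetic under that assumption, next to the alternative of using the union of distinct ARGs as the denominator.

```python
# Counts taken from the paragraph above.
infant_args = 208   # distinct ARGs detected in infants
mother_args = 134   # distinct ARGs detected in mothers
shared_args = 122   # ARGs detected in both groups


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Assumed denominator: infant and mother counts added together.
shared_of_summed = pct(shared_args, infant_args + mother_args)

# Alternative denominator: the union of distinct ARGs across both groups.
union_args = infant_args + mother_args - shared_args
shared_of_union = pct(shared_args, union_args)

print(shared_of_summed, union_args, shared_of_union)
```

Under the first denominator the value matches the figure quoted in the text; the union-based value is given only for comparison.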

Fig. 1: Mother-infant’s gut resistome during the first year of life.

A Abundance of assembled ARGs in mother-infant pairs at 1 month postpartum (N = 39 complete dyads), classified as shared, unique in mothers and unique in infants, and grouped by antibiotic class. B Antibiotic resistance load in the gut of infants during the first year of life and of their mothers, expressed as total CPM of ARGs per infant in each timepoint. ARG load was the summed abundance of all ARGs in a given sample. Significance values for ARG load were calculated using the pairwise rank-sum Wilcoxon test (Mann-Whitney U), two-sided. C Alpha diversity of the mother-infant gut resistomes based on a two-sided Kruskal-Wallis test on the Shannon index. D Composition of the infant resistome through time, coloured by the infant’s age. Two-sided PERMANOVA on Jaccard distances was applied to assess differences between timepoints (p < 0.05 for all comparisons). E Composition of the resistome at the antibiotic class level of mother-infant pairs. ARGs have been grouped according to the antibiotic(s) to which they confer resistance. All samples represent independent biological replicates, derived from individual participants. Sample sizes per group are as follows: day 7 infants (N = 58), 1-month-old infants (N = 58), 6-month-old infants (N = 56), 12-month-old infants (N = 55), and mothers (N = 39). Box plots show the median (centre line), the first and third quartiles (bounds of the box). Whiskers are chosen to show the 1.5 of the IQR. Asterisks indicate statistical significance: *p  <  0.05; **p  <  0.01; ***p  <  0.001; ****p  <  0.0001. FPA: Folate Pathway Antagonist. Source data are provided as a Source Data file.

The total load of ARGs in the infant gut decreased significantly during the first year of life. The ARG load in 7-day-old infants was significantly higher than at later time points and also higher than the mothers’ load (p < 0.01). However, infants at 6 and 12 months of age appeared to have a lower ARG load than mothers (p < 0.025) (Fig. 1B). Resistome diversity increased over time and tended to resemble that of the mothers (p < 0.01) (Fig. 1C). The infant resistome composition also changed significantly over time (for all timepoints p < 0.05) (Fig. 1D) towards resembling the mothers’ resistome (Fig. 1E).
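As a minimal sketch of the two per-sample summaries used here, the code below computes ARG load as the summed CPM of all ARGs in a sample (as defined in the Fig. 1 legend) and the Shannon index from the same abundances. The function names and the CPM values are hypothetical and serve only as illustration; they are not data or code from the study.

```python
import math


def arg_load(cpm_by_gene):
    """Total ARG load of one sample: the sum of CPM over all ARGs."""
    return sum(cpm_by_gene.values())


def shannon_index(cpm_by_gene):
    """Shannon index H = -sum(p * ln p) over ARGs with non-zero abundance."""
    total = sum(cpm_by_gene.values())
    if total == 0:
        return 0.0
    h = 0.0
    for cpm in cpm_by_gene.values():
        if cpm > 0:
            p = cpm / total
            h -= p * math.log(p)
    return h


# Hypothetical CPM values, for illustration only (not data from the study).
day7_sample = {"tet(W)": 600.0, "erm(B)": 300.0, "blaTEM": 100.0}
month12_sample = {"tet(W)": 50.0, "erm(B)": 50.0, "tet(O)": 50.0, "ant(6)": 50.0}

for name, sample in [("day 7", day7_sample), ("12 months", month12_sample)]:
    print(name, arg_load(sample), round(shannon_index(sample), 3))
```

The toy samples are built to mimic the reported direction of change: a higher load with lower diversity early on, and a lower load with higher diversity at 12 months.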

Antibiotic resistance classes whose abundance differed significantly with child age were identified using DESeq2 (Supplementary Fig. 2). Briefly, ARGs conferring resistance to aminoglycosides started to increase from 1 month of age, while beta-lactam, macrolide and MLSB ARGs decreased through time. Tetracycline resistance genes increased during the first month of life, but then diminished towards 12 months of life.

We subsequently used a random forest (RF) model to identify time-specific indicators in the resistome (Supplementary Fig. 3). This decision was based on the GLM analysis, which revealed a negative relationship between time and ARG load (p < 0.001), indicating that as the infant’s age increased, the ARG load decreased. The developing gut resistome was linearly related to infant age. RF, with a ten-fold cross-validation analysis (R2 = 65.24%), established that ten ARGs and five variables (type of diet, exclusive breastfeeding, type of breastfeeding, duration of exclusive breastfeeding and duration of lactation in days) are the most predictive of age. A tetracycline resistance gene, tet(40) (present in 34.5% of the samples), was ranked as the most important variable among the ARGs differentiating infant age, followed by other genes of the same antibiotic class, tet(32), tet(W), tet(O), tet(A) and tet(S), as well as the blaZ, ant(6), lsa(A), and erm(B) genes, which confer resistance to beta-lactams, aminoglycosides, lincosamides and macrolides, respectively.

Influence of the microbiome on the resistome acquisition

The investigation of the microbial composition by using assembly-free approaches showed that the microbiome diversity of the infants increased significantly over time, even though it remained lower than the mothers’ (p < 0.05) (Fig. 2A), which was in concordance with the results observed for the resistome. The microbial composition in infants at 7 days of age differed significantly from that at older ages (p < 0.05) (Fig. 2B and Supplementary Fig. 4).

Fig. 2: Infant’s gut microbiome during the first year of life and bacterial hosts of resistances.

A Alpha diversity of the mother-infant gut resistomes based on the Kruskal-Wallis test on the Shannon index. Box plots show the median (centre line), the first and third quartiles (bounds of the box). Whiskers are chosen to show the 1.5 of the IQR. All samples represent independent biological replicates, derived from individual participants. Sample sizes per group are as follows: day 7 infants (N = 58), 1-month-old infants (N = 58), 6-month-old infants (N = 56), 12-month-old infants (N = 55), and mothers (N = 39). Asterisks indicate statistical significance: *p  <  0.05; **p  <  0.01; ***p  <  0.001; ****p  <  0.0001. B Microbial composition at the species level of the infant gut over the first year of life. Samples are coloured according to infants’ age. Two-sided PERMANOVA on Bray-Curtis distances was applied to assess differences between timepoints. p = 0.05 for 7 days compared to older timepoints. C Microbial hosts of resistance during the first year of life. ARG-carrying contigs were detected within the bins. D Antibiotic resistance and their microbial hosts during the first year of life. E Relative abundance (%) of genera carrying ARGs and number of copies of these genes associated with a plasmid. FPA: Folate Pathway Antagonist; MLSB: Macrolide-Lincosamide-Streptogramin B. Source data are provided as a Source Data file.

Moreover, an assembly-based approach was used to infer the microbial taxa associated with the ARGs identified. Overall, Bacillota was the main phylum carrying ARGs, and Staphylococcus, Streptococcus, Enterococcus, and Klebsiella were the main ARG-carrying genera, particularly at 7 days of life. Staphylococcus aureus, S. epidermidis, and K. pneumoniae abundances decreased during the first year of life, while Streptococcus salivarius, S. parasanguinis, S. lactarius, E. faecium and E. faecalis remained constant over time (Fig. 2C). Escherichia coli, Clostridium perfringens and Bacteroides fragilis were among the species with the greatest impact on the infants’ resistome and appeared to remain stable during the first year of life. The species of Bifidobacterium that carry resistances increased over time, with B. longum and B. breve being the most representative (Fig. 2C).

Tetracycline and macrolide ARGs were harboured by many different genera, but Streptococcus was the main host of ARGs conferring resistance to those antibiotics (accounting for 28% and 42% of all tetracycline and macrolide ARGs, respectively). Similarly, aminoglycoside ARGs were mainly carried by Staphylococcus (23.39% of the total number of aminoglycoside ARGs), Escherichia (35.5%) and Enterococcus spp. (19.02%) (Fig. 2D and Supplementary Data 3). Many of these genes were also associated with plasmids: aph(3”)-Ib and blaTEM were mainly associated with plasmids of Escherichia spp., and blaZ was mainly associated with plasmids in Staphylococcus spp. (Fig. 2E). Compared to the other genera, Klebsiella was the main carrier of resistance to folate pathway antagonist (FPA, 66.66%), amphenicol (75.32%) and quinolone (95.08%) antibiotics, and copies of the fosA and oqxA genes in Klebsiella were identified in plasmids (Fig. 2E). Moreover, fosfomycin resistance was only associated with Klebsiella and Staphylococcus, and these two genera were also the main carriers of beta-lactam resistances (24.42% and 35.48%, respectively) (Fig. 2D and Supplementary Data 3). Macrolide ARGs were mainly carried by Streptococcus, with mefA being the most representative non-mobile gene, but also by Staphylococcus, Enterococcus and some Bifidobacterium (Fig. 2D, E). In addition, Bifidobacterium, Enterococcus, and Bacteroides species were the main carriers of tetracycline, macrolide, streptogramin B and lincosamide resistances (Fig. 2D and Supplementary Data 3).

High Bifidobacterium abundance is associated with a lower resistome load and diversity during the first year of life

The microbial composition of the infants’ gut throughout time was clustered into two distinct groups according to hierarchical clustering based on k-means (Supplementary Fig. 5), and further confirmed with PAM clustering (Supplementary Fig. 6). The strength of association between the PAM and the k-means clustering was 0.77 (Cramér’s V). The overall microbiome composition was statistically different between the two groups during the first year of life (p < 0.001) (Supplementary Fig. 7a). Strikingly, Bifidobacterium abundance appeared to differ greatly between the two groups at all the investigated timepoints, an observation that was further confirmed statistically (p < 0.0001) (Fig. 3A). Moreover, the microbial diversity was statistically higher in the group of infants with a lower abundance of Bifidobacterium (p < 0.05 in all timepoints) (Fig. 3A). Hence, we named the groups according to the relative abundance of Bifidobacterium as “High” (> 70%) or “Low” (< 30%), and further investigated the role of species belonging to this genus in shaping the infants’ gut resistome. During the first year of life, 42.25% of infants remained in the same Bifidobacterium abundance-based group. Stability within the groups was 65.96% from 7 days to 1 month, 67.39% from 1 to 6 months, and 58.33% from 6 months to 1 year (Fig. 3B).
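To make the two agreement measures in this paragraph concrete, the sketch below computes Cramér’s V between two cluster labelings of the same samples and the fraction of infants that stay in the same group between two timepoints. It is a plain illustration without continuity or bias correction, so it is not necessarily the exact procedure used in the study; the label vectors are hypothetical.

```python
import math
from collections import Counter


def cramers_v(labels_a, labels_b):
    """Cramér's V between two categorical labelings of the same samples."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")
    joint = Counter(zip(labels_a, labels_b))
    row = Counter(labels_a)
    col = Counter(labels_b)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0
    return math.sqrt(chi2 / (n * k))


def stability(groups_t1, groups_t2):
    """Fraction of infants assigned to the same group at both timepoints."""
    same = sum(1 for g1, g2 in zip(groups_t1, groups_t2) if g1 == g2)
    return same / len(groups_t1)


# Hypothetical cluster assignments for ten samples (not study data).
kmeans_labels = ["High"] * 5 + ["Low"] * 5
pam_labels = ["High"] * 4 + ["Low"] * 6   # one sample assigned differently

print(round(cramers_v(kmeans_labels, pam_labels), 3))
print(stability(["High", "High", "Low", "Low"], ["High", "Low", "Low", "Low"]))
```

A value of 1 means the two clusterings agree perfectly, while values near 0 indicate no association; the stability fraction is the kind of quantity reported per transition in Fig. 3B.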

Fig. 3: Bifidobacterium abundance drives microbial composition.

A Genus relative abundance over time, depending on the Bifidobacterium abundance-based clusters. Two distinct clusters (optimal number of clusters k  =  2) were identified via k-means clustering. Mann-Whitney U test revealed that statistically significant differences in relative abundance were found in the Bifidobacterium genus, in all timepoints, so clusters were named as High- and Low-Bifidobacterium. Upper boxplots represent alpha diversity of the mother-infant gut resistomes based on a two-sided Kruskal-Wallis test on the Shannon index. Box plots show the median (centre line), the first and third quartiles (bounds of the box). Whiskers are chosen to show the 1.5 of the IQR. Sample sizes per group and timepoint were as follows: day 7: High-Bifidobacterium N  =  23, Low-Bifidobacterium N  =  35; 1 month: High N  =  42, Low N  =  16; 6 months: High N  =  31, Low N  =  25; 12 months: High N  =  19, Low N  =  36. Asterisks denote statistical significance: *p  <  0.05; **p  <  0.01; ***p  <  0.001; ****p  <  0.0001. B Stability of the hierarchical clustering based on Bifidobacterium spp. abundance over the first year of life. C Relative abundance of Bifidobacterium species in each cluster during the first year of life. Only species with statistically different abundances at any time point are shown (Mann-Whitney U test, *p  <  0.05). Source data are provided as a Source Data file.

Among the species of Bifidobacterium, B. longum and B. bifidum were statistically more abundant in High-Bifidobacterium infants during the entire first year of life (p < 0.05) (Fig. 3C). Similarly, B. breve was also more abundant in High-Bifidobacterium infants at 7 days, 1 month, and 1 year of age (p < 0.01). Indeed, B. breve seems to first appear in Low-Bifidobacterium infants at 6 months of age. On the other hand, Low-Bifidobacterium infants seem to be highly colonised by other taxa, including E. coli during the first year of life (p < 0.05 in all timepoints), Clostridium perfringens and Klebsiella pneumoniae during the first 6 months of life, S. aureus during the first month of life and E. faecalis at 7 days of age (Supplementary Fig. 8).

Based on the premise that vaginal deliveries and exclusive breastfeeding have been associated with Bifidobacterium spp. dominance in the infant gut18, we assessed whether these two factors could drive the clustering of the samples into the High- or Low-Bifidobacterium groups. Our analysis confirmed that the groups of samples based on Bifidobacterium abundance remained significantly different, independently of the mode of delivery and lactation (p < 0.05 in all timepoints) (Supplementary Fig. 7b and c).

Considering the effect on the microbiome, we assessed whether Bifidobacterium abundance was associated with the resistome of the infants. Low-Bifidobacterium infants have a significantly higher ARG load than High-Bifidobacterium ones (for all timepoints p < 0.001) (Fig. 4A), and the composition of the resistome is also significantly different (for all timepoints p = 0.001). Overall, most of the classes of antibiotic resistance are more abundant in Low-Bifidobacterium infants than in High-Bifidobacterium infants during the first month of life (Fig. 4B). Moreover, the mobilome was also influenced by the Bifidobacterium abundance in the infants’ gut. Low-Bifidobacterium infants had more copies of ARGs associated with plasmids than High-Bifidobacterium infants, including genes conferring resistance to tetracycline, beta-lactam, amphenicol, and MLSB antibiotics (Fig. 4C).

Fig. 4: Bifidobacterium abundance drives resistance load and composition.

A Antibiotic resistance load in the gut of infants during the first year of life, depending on their Bifidobacterium abundance (cluster High or Low), expressed as total CPM of ARGs per infant in each timepoint. ARG load was the summed abundance of all ARGs in a given sample. Significance values for ARG load were calculated using the pairwise rank-sum Wilcoxon test (Mann-Whitney U), two-sided. Box plots show the median (centre line), the first and third quartiles (bounds of the box). Whiskers are chosen to show the 1.5 of the IQR. All samples represent independent biological replicates, derived from individual participants. Sample sizes per group and timepoint were as follows: day 7: High-Bifidobacterium N  =  23, Low-Bifidobacterium N  =  35; 1 month: High N  =  42, Low N  =  16; 6 months: High N  =  31, Low N  =  25; 12 months: High N  =  19, Low N  =  36. Asterisks denote statistical significance: *p  <  0.05; **p  <  0.01; ***p  <  0.001; ****p  <  0.0001. B Antibiotic resistance profiles in each group. ARGs have been grouped according to the resistance they confer. C Number of copies of ARGs associated with a plasmid in High- and Low-Bifidobacterium grouped infants. FPA: Folate Pathway Antagonist; MLSB: Macrolide-Lincosamide-Streptogramin B. Source data are provided as a Source Data file.

Another RF model was built to identify indicators of the Bifidobacterium clusters in the resistome (Supplementary Fig. 9). The GLM analysis showed a significant negative association between high Bifidobacterium cluster infants and ARG load, compared to the low cluster group (p < 0.001). RF classified infants into high and low Bifidobacterium clusters with an accuracy of 82.35%, while the ROC curve (pROC) showed an AUC value of 0.90 using only resistance genes as predictive variables. The macrolide resistance gene erm(B) and the tetracycline resistance gene tet(S), which were also present in the infant-age model, were ranked as the most important variables. Other genes of high importance in the model were tet(32), erm(X), tet(B), OqxB and mef(A).
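For readers less familiar with the two performance measures quoted above, the following sketch shows how accuracy and AUC can be computed from true cluster labels and classifier scores. It uses the rank-based definition of AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half). The labels, scores and the 0.5 threshold are hypothetical; this is not the study's model or data.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def roc_auc(true_labels, scores, positive="High"):
    """Rank-based AUC: P(score of a positive > score of a negative)."""
    pos = [s for t, s in zip(true_labels, scores) if t == positive]
    neg = [s for t, s in zip(true_labels, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical true clusters and predicted probabilities of "High".
truth = ["High", "High", "High", "Low", "Low", "Low"]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

# Assumed decision threshold of 0.5 to turn scores into labels.
predicted = ["High" if s >= 0.5 else "Low" for s in scores]

print(round(accuracy(truth, predicted), 4), round(roc_auc(truth, scores), 4))
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two values can differ as they do in the text.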

Association of the resistome with perinatal factors

Our next goal was to assess the ways in which the distribution of ARGs in the infant gut was shaped by various environmental exposures. At 7 days of age, mode of delivery and Bifidobacterium abundance-based clustering (referred to as “kmeans_bifis”) are key factors driving the antibiotic resistance composition, although their association is not significant. At 1 month of age, exclusive breastfeeding significantly influenced the infants’ resistome (R2 = 0.119, p = 0.001), and the type of diet (exclusive breastfeeding, mixed or formula feeding) appeared to also have a slight influence (R2 = 0.143, p = 0.08). “kmeans_bifis” is the next most influential factor. At 6 months of age, the consumption of antibiotics has a higher impact on the infants’ resistome, but is not statistically significant, and exclusive breastfeeding loses its importance as most of the infants start weaning. Finally, the presence of domestic pets in the household of 1-year-old infants seemed to have a great impact on their resistome as well (p = 0.05) (Fig. 5A).

Fig. 5: The influence of environmental factors in shaping the infant resistome during the first year of life.

A Environmental factors driving infant’s resistome, ranked by effect size (‘envfit’ R2, false discovery rate (FDR)-corrected two-sided test, p < 0.05). B Antibiotic resistance load in the gut of infants of 7 days of age, depending on their mode of delivery (N = 40 vaginal, N = 26 C-section), expressed as total CPM of ARGs per infant in each timepoint. C Principal Coordinate Analysis (PCOA) of the composition of the resistome depending on the mode of delivery (vaginal or C-section) during the first 7 days and 1 month of life, based on Jaccard distances. Two-sided PERMANOVA on Jaccard distances was applied to assess differences between groups. D Antibiotic resistance load in the gut of infants of 1 month of age, depending on their diet (N = 40 breastfeeding, N = 4 formula feeding, N = 14 mixed), at each timepoint during the first year of life, expressed as total CPM of ARGs per infant in each timepoint. E Antibiotic resistance load in the gut of infants during the first month of life, depending on exclusive breastfeeding (N = 40 yes and N = 18 no), at each timepoint during the first year of life, expressed as total CPM of ARGs per infant in each timepoint. F Tendency of antibiotic resistance during the first year of life according to breastfeeding during the first month of life. The trajectory of the total load of ARGs was studied considering if the infant received exclusive breastfeeding during the first month of life. G ARGs differentially abundant due to breastfeeding in 6-month-old infants compared to non-breastfed infants, calculated using DESeq2 with the p-values attained by the Wald test corrected for multiple testing using the Benjamini-Hochberg method. H Antibiotic resistance load in the gut of infants during the first year of life, depending on the mode of delivery and if they were exclusively breastfed, at each timepoint during the first year of life, expressed as total CPM of ARGs per infant in each timepoint. 
I Multivariable association between factors and resistome composition using MaAsLin. Antibiotic resistance load was the summed abundance of all ARGs in a given sample. Box plots show the median (centre line), the first and third quartiles (bounds of the box). Whiskers are chosen to show the 1.5 of the IQR. Significance values for ARG load were calculated using the pairwise rank-sum Wilcoxon test (Mann-Whitney U), two-sided. Asterisks denote statistical significance: *p  <  0.05; **p  <  0.01; ***p  <  0.001; ****p  <  0.0001. kmeans_bifis: Bifidobacterium abundance-based clustering; AB_BB: infant’s antibiotic consumption; BF: breastfeeding; nBF_C: not exclusively breastfed infants born by C-section; nBF_V: not exclusively breastfed infants born vaginally; BF_C: exclusively breastfed infants born by C-section; BF_V: exclusively breastfed infants born vaginally. Source data are provided as a Source Data file.

Further analyses of the effect of these factors on resistome diversity, composition and abundance were performed. As reported above, Bifidobacterium abundance influences resistome load, diversity and composition during the first year of life. Regarding mode of delivery, C-section-born infants had a higher ARG load at 7 days of age than those born vaginally (Fig. 5B). The mode of delivery showed a direct effect on the resistome composition of 7-day-old and 1-month-old infants (p < 0.05) (Fig. 5C).
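The compositional comparisons by mode of delivery rest on Jaccard distances between resistomes (Fig. 5C). As a minimal sketch of that distance only, the code below compares two samples by the presence or absence of ARGs; the PERMANOVA test built on top of these distances is not reproduced here, and the gene sets are hypothetical.

```python
def jaccard_distance(args_a, args_b):
    """Jaccard distance between two samples based on ARG presence/absence."""
    a, b = set(args_a), set(args_b)
    union = a | b
    if not union:
        return 0.0   # two empty resistomes are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical ARG sets for two 7-day-old infants (not study data).
vaginal_infant = {"tet(W)", "erm(B)", "mef(A)"}
csection_infant = {"tet(W)", "erm(B)", "blaTEM", "fosA"}

# Two shared genes out of five distinct genes in total.
print(round(jaccard_distance(vaginal_infant, csection_infant), 2))
```

Because the distance ignores abundance, it reflects which genes are present rather than how abundant they are, which complements the CPM-based ARG load.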

Our data suggest that any contact with human milk during the first 6 months of life reduces the ARG load of the infant’s resistome (Supplementary Fig. 10). At 7 days of age, breastfed infants had a lower abundance of ARGs than formula-fed infants. At 1 month of age, some infants received a combination of formula and breastfeeding (mixed), and in those cases their ARG load was higher than that of infants who remained exclusively breastfed (p < 0.01). Formula-fed infants at 1 month of age showed higher ARG abundances than breastfed infants, although the differences were not significant. At 6 months of age, most of the infants had stopped breastfeeding and started weaning. In those cases, their ARG load was higher than that of infants who were still breastfed (p < 0.05). At 12 months of age, none of the infants were receiving exclusive breastfeeding (Fig. 5D). In addition, we observed that exclusive breastfeeding directly influences the resistome during the first month of life, as infants that are not exclusively fed with human milk, and receive either formula only or a combination of formula and breastfeeding, have a higher ARG load than those exclusively breastfed (p = 0.003) (Fig. 5E). The ARG composition is also different at 1 month of age (p = 0.012). Exclusive breastfeeding during the first month of life accelerates the decrease of antibiotic resistance in the infant gut during that time and ensures a lower abundance of ARGs at 6 months (Fig. 5F). Early termination of breastfeeding (before 6 months according to WHO recommendations19) was associated with an enrichment of several ARGs in infants (p < 0.05) (Fig. 5G). The enriched genes included genes conferring resistance to beta-lactams, amphenicol, quinolones, FPA, macrolides, and tetracyclines, among others.

The mode of delivery clearly influenced the ARG load of 7-day-old infants, as C-section-born infants had a higher ARG load than vaginally born infants (p < 0.01). However, exclusive breastfeeding until 1 month of age reduces the ARG load in infants born by C-section (Fig. 5H). Among vaginally delivered infants, those exclusively breastfed showed a lower ARG load than those not exclusively breastfed (p < 0.05). Hence, we observe that the effects that a C-section procedure has on ARG load during the first month of life can be erased if the infant is exclusively breastfed.

Being born by C-section and not exclusively breastfed correlates positively with genes conferring resistance to lincosamide and streptogramin A, glycopeptides and amphenicol + quinolone + FPA (Fig. 5I). C-section procedures correlate positively with fosfomycin, lincosamide, quinolone and aminoglycoside resistances, while vaginal deliveries are positively correlated with tetracycline, macrolide and lincosamide resistances. The lack of exclusive breastfeeding is also associated with many resistances, such as those to macrolides, lincosamides and streptogramins. Fosfomycin, pseudomonic acid and steroid antibacterial resistances are negatively correlated with infant age, while lincosamide, tetracycline and quinolone resistances increase over time. However, genes conferring resistance to macrolides, lincosamides and beta-lactams are more associated with 7-day-old infants. Finally, a low abundance of Bifidobacterium is associated with lincosamide, tetracycline, amphenicol, beta-lactam and quinolone resistance (Fig. 5I).

In addition, an RF with a cross-validation analysis (accuracy = 66.18%, AUC = 0.79) established that seven ARGs are the most predictive of the type of lactation (Supplementary Fig. 11). The beta-lactam resistance gene blaZ was ranked as the most important variable among the ARGs differentiating between maternal and artificial lactation, followed by mph(C), msr(A), tet(O), tet(32) and tet(W).

Discussion

Our study highlights the crucial role of Bifidobacterium and breastfeeding in shaping the infant gut resistome. We demonstrate that a high abundance of Bifidobacterium is associated with a more beneficial microbial composition and a lower ARG load throughout the first year of life, and this high abundance is independent of mode of delivery or feeding, highlighting an alternative way for a lower resistome burden. While the impact of C-section delivery on the resistome is evident at 7 days of life, exclusive breastfeeding becomes a key determinant by one month of age, significantly reducing ARG burden. Moreover, early breastfeeding cessation correlates with a higher ARG load, underscoring its long-term influence on microbial resilience. Notably, this is the first study to identify exclusive breastfeeding as a strategy to counteract the resistome-altering effects of C-section birth, reinforcing its protective role. Our findings position breastfeeding as a critical, modifiable factor in fostering a healthier microbiome and mitigating antibiotic resistance in early life.

In this study we used over 265 faecal metagenomes from 66 mother-infant pairs to explore the characteristics and dynamics of the intestinal resistome during the first year of life, and how the microbiome and external factors influence the infant resistome and its dynamics. We identified ARGs in all the samples investigated, even though the type of resistance, load and associated taxa varied greatly across the different timepoints.

The developing gut resistome was linearly associated with infant age, suggesting a dynamic trajectory of resistance over time as described in other studies, such as that of Xu et al.20. Over time, an infant’s resistome tends to become more similar to that of its mother. Although mother-infant dyads at one month postpartum share common ARGs and exhibit similar ARG loads, their overall resistome composition remains distinct. Therefore, the extent of maternal gut influence on shaping the infant’s resistome is still unclear, and further evidence is needed to better understand this relationship. Tetracycline was the most abundant resistance class across all ages in the children’s samples, although infants are no longer exposed to this antibiotic21. The second most abundant resistance was against macrolides, which includes genes encoding multiresistance (such as MLSB), as previously reported12,20,22,23. The use of macrolides, which remain an important antibiotic class in paediatric care24, has been associated with an increase in resistance to this class of antibiotic, with long-lasting effects on the composition of the infant gut microbiota25,26.

The resistome development during the first year of life occurs in parallel with the gut microbiota development and tends to resemble the mother’s (adult) resistome. ARGs in neonates have been most frequently associated with Bacteroides, Clostridioides, Escherichia, and Staphylococcus27,28,29,30, with enterobacteria being particularly abundant in early life and important ARG carriers12,31. In addition to those, we have also found Enterococcus, Streptococcus and Klebsiella to be the main genera responsible for carrying a higher variety of antibiotic resistance in our cohort.

In terms of the potential mobilisation of the resistome, we analysed the presence of plasmids carrying ARGs in our samples. We observed a high presence of plasmid-borne ARGs conferring resistance to beta-lactams and aminoglycosides, which were strongly associated with Escherichia. Although it has recently been reported that Bacteroides are the main hosts for the infant gut plasmidome32, the overrepresentation of plasmidic genomic sequences from enterobacteria, such as Escherichia and Klebsiella, might hinder the identification of the exact bacterial host in metagenomic studies. Similarly, most plasmidic contigs were assigned de novo without the occurrence of a plasmidic replicon that might shed light on the family and characteristics of such plasmids, including their mobilisation ability. Despite these limitations, we aimed to infer the ratio of ARGs that occur in potentially mobilisable elements and their potential host, due to the key role they play in AMR spread and dissemination.

Our primary finding was the association of Bifidobacterium with the modulation of the early-life microbiota and with resistome abundance and profiles. High Bifidobacterium levels have been previously associated with low ARG abundance33. In a previous study, we investigated the resistome of the same cohort by using qPCR34. Here, we confirm and expand our previous results using an in-depth metagenomic sequencing approach on a larger dataset: a high relative abundance of Bifidobacterium in the infant gut drives a specific microbial composition and a reduced ARG load during the first year of life.

The study conducted by Taft et al.33 agrees with our findings, as they also found that High-Bifidobacterium infants had lower levels of Enterobacteriaceae than Low-Bifidobacterium infants, suggesting that the abundance of Bifidobacterium may suppress AMR-carrying taxa. The inhibitory effect of Bifidobacterium on the growth of other commensals and pathogens has been described before35. We observed that a low abundance of Bifidobacterium is associated with a higher load of ARGs conferring resistance to lincosamides, tetracyclines, amphenicols, beta-lactams and quinolones. However, it is important to highlight that some species of Bifidobacterium carry resistance (mainly against tetracyclines)36, and that this resistance can increase over time. Indeed, the RF model indicated that seven ARGs, associated with macrolide resistance (erm(B), erm(X), mef(A)), tetracycline resistance (tet(S), tet(32), tet(B)) and beta-lactam resistance (blaTEM), were the most important variables in predicting the Bifidobacterium cluster, some of which also had an important influence in the infant-age prediction model.

Shao et al.37 previously reported that one-month-old infants with high Bifidobacterium bifidum colonisation carried fewer ARGs than those colonised by Enterococcus faecalis, which exhibited elevated extended-spectrum beta-lactamase (ESBL) levels. Likewise, Li et al.12 identified Escherichia coli as a key determinant of the microbiome in one-year-old infants, with Proteobacteria contributing the most diverse and abundant ARGs. In line with these findings, our study found that infants with low Bifidobacterium levels were predominantly colonised by E. coli and E. faecium, both carrying a higher proportion of ARGs throughout the first year. In addition, our longitudinal analysis, spanning from seven days to one year of age, provided further insights into these microbial patterns.

Historically, the infant microbiome is dominated by Bifidobacterium and associated with a variety of beneficial health effects38,39,40. While we have observed the effect of Bifidobacterium abundance on resistome load and composition in all timepoints during the first year of life, the Bangladeshi infants from Taft et al.33 lose the association between early-life Bifidobacterium levels and antimicrobial resistance at 2 years of age. This suggests the important role of Bifidobacterium on the gut microbiota development only during the first year of life, as gut microbiota will tend to evolve towards an adult-like microbiota, where Bifidobacterium is less abundant. The dominance of the infant gut by Bifidobacterium is linked to the consumption of human milk oligosaccharides (HMO), the third most abundant component in human milk41. The essential role of breastfeeding in shaping a healthy infant gut microbiome by supporting the growth of beneficial microbes has been studied before18. Interestingly, our study found that the clustering based on Bifidobacterium abundance is not influenced by the mode of delivery or breastfeeding practices, with infants from diverse birth and feeding modes converging in the high-Bifidobacterium group. This underscores the unique role of Bifidobacterium in shaping the infant microbiome and resistome, which we have treated as an independent variable in our analysis. However, the interplay of host genetics and other factors that may drive the early-life abundance of Bifidobacterium remains to be explored. While breastfeeding did not show a strong association with Bifidobacterium abundance in our cohort overall, previous studies have demonstrated that, when Bifidobacterium is present, breastfeeding typically supports its growth41,42,43. This highlights the complex relationship between feeding practices and microbiota composition, which may vary depending on other individual or environmental factors.

In this vein, we also provide insights on the benefits of breastfeeding on resistome acquisition and maintenance. Exclusive breastfeeding during the first month of life accelerates the decrease of antibiotic resistance in the infant gut during that time, and ensures a lower abundance of ARGs at 6 months. We also provide more evidence that early termination of breastfeeding, recommended for at least 6 months by the WHO19, is associated with an enrichment of ARGs. Based on our study and previous work44, breastfeeding for at least 6 months may decrease the prevalence of Gammaproteobacteria while simultaneously increasing the presence of bifidobacteria, contributing to lowering the load of antibiotic resistance genes in the infant gut. Previous findings suggest that HMO-supported bifidobacteria growth could mediate these associations33,34,45.

Nevertheless, the benefits of breastfeeding on shaping an infant’s resistome are not observed until the first month of age. Before that time, the mode of delivery exerts the greatest influence in modulating the infant resistome. The WHO suggests that women who undergo a C-section should receive a single dose of penicillin or a first-generation cephalosporin as a preventive measure, instead of other types of antibiotics46. Consequently, all infants are currently exposed to beta-lactam antepartum via the umbilical cord, as these antibiotics rapidly cross the placenta. In concordance with previous studies22,31,47,48, C-section born infants have a higher abundance of ARGs than vaginal-born infants, and the composition of their resistome is different. This difference is likely driven primarily by the altered colonisation patterns typical of C-section births, including reduced transmission of maternal microbes and increased acquisition of hospital-associated taxa such as Staphylococcus, Enterococcus, and Klebsiella, which are commonly associated with higher ARG carriage48,49. Therefore, the enrichment of ARGs in C-section infants may result from a combination of altered microbial succession, environmental exposure, and, potentially, transient antibiotic pressure. Notably, these differences diminish by one month of age, when breastfeeding becomes a dominant factor influencing resistome composition.

Guo et al.50 reported that breastfeeding restores the gut microbiota of C-section infants. In their study based on 16S rRNA gene amplicon sequencing, they observed that breastfeeding did not compensate for the lack of Bifidobacterium spp. in children born by C-section: children born by C-section and receiving breastfeeding had fewer bifidobacteria than formula-fed vaginally-born children. A recent study conducted by Sinha et al. showed that feeding mode is by far the most defining factor for microbial composition and functionality in early life in C-section born infants51. In our study, we extend these results to the infant’s gut resistome, observing that exclusive breastfeeding until 1 month of age reduces the ARG load of infants born by C-section. The positive contribution of human milk to gut microbiota development52,53 may explain why breastfeeding could restore the delayed microbiome development in C-section born infants, extending to a reduction in antibiotic resistance.

Jokela et al.48 described how the resistome in early life is primarily shaped by the natural development of the gut microbiota, with Bifidobacterium abundance correlating negatively, and Bacteroides positively, with ARG load. In our study, we confirm the association between Bifidobacterium abundance and a decreased ARG load, but also focus specifically on feeding practices and their influence on ARG acquisition. The inclusion of earlier sampling timepoints, at 7 days and 1 month, allowed us to uncover that exclusive breastfeeding during the first month of life is associated with a rapid decrease in ARG abundance and leads to a significantly lower resistome burden by six months. Pärnänen et al.54, in a cross-sectional study of 1-month-old preterm infants, identified higher ARG loads in formula-fed compared to breastfed infants. Our longitudinal study in full-term infants aligns with and builds upon these observations, demonstrating how exclusive breastfeeding during the first month not only lowers ARG burden but also mitigates the impact of caesarean-associated antibiotic exposure.

At 6 months of age, as the infants are more exposed to external microbes, the influence of environmental factors on their microbiome increases. This period typically coincides with the onset of weaning, during which most infants either discontinue breastfeeding or combine it with the introduction of solid foods and other liquids. At this stage, the direct effects of breastfeeding may become less pronounced due to the influence of multiple external factors. Nonetheless, the significance of the so-called “window of opportunity” for shaping the infant resistome cannot be overstated. Any contact with breastfeeding during this period offers substantial benefits to the infant microbiome and resistome, with both immediate and long-term effects. Moreover, lack of breastfeeding has been observed to be the most influential variable in plasmid assemblage in early life, which has a related impact on resistome dynamics32.

The establishment of a well-balanced microbiome during early life is critical for ensuring the controlled development of the resistome. Our evidence suggests that breastfeeding may help to modulate potential microbial dysbiosis associated with C-section births, a procedure that is sometimes unavoidable. These findings underscore the value of breastfeeding as a scalable and cost-effective intervention to address antimicrobial resistance, complementing its well-established health benefits55. In high-income nations, where fewer than half of infants are breastfed at six months of age55, promoting breastfeeding practices has the potential to decrease early-life antibiotic usage and enhance the gut’s ability to resist colonisation by AMR pathogens, which are becoming increasingly common in community environments. Establishing evidence that breastfeeding or components of human milk offer protection against colonisation or infection with AMR bacteria would provide strong support for intensifying efforts to advocate for breastfeeding through policy initiatives.

Methods

Study design and volunteers

A total of 66 healthy, full-term mother-infant pairs from the MAMI cohort56 were selected for inclusion based on sample availability and completeness of clinical and anthropometric records. Maternal-infant data, including gestational age, neonate gender, birth weight, place and mode of delivery, type of lactation, and antibiotic exposure of the infant, were collected. Infant length and weight were registered at birth and at 1, 6, and 12 months, and the data were used to calculate BMI Z-scores. These anthropometric measures, standardised by age and sex, were computed using WHO Anthro software (https://www.who.int/childgrowth/software/en/).

From the longitudinal sampling performed in the MAMI cohort, infant faecal samples from 7 days and 1, 6 and 12 months were included in this analysis (N = 265). One faecal sample was collected 1 month postpartum from each mother. All participants received oral and written information about the study, and written consent was obtained. Ethical approval for the study was obtained from the Ethics Committee for Clinical Research of the healthcare centres involved in the recruitment, including the Hospital La Fe, Hospital Clinico Universitario, Parc de Salut MAR, and CSIC (Consejo Superior de Investigaciones Científicas) [ClinicalTrials.gov NCT03552939]. The clinical characteristics of the study participants can be found in Supplementary Table 1.

Biosamples, DNA extraction and sequencing

A total of 265 mother-infant faecal samples were collected at home by the parents, who had previously been trained by clinical personnel in the health care centres where they were enrolled. Briefly, faecal samples were deposited in the sterile containers provided and immediately kept at −20 °C before final storage at −80 °C until further analysis. DNA was isolated from the faecal samples (50–100 mg) using the Master-Pure DNA extraction kit (Epicentre, Madison, WI, USA) following the recommended protocol with some additional steps as described in Selma-Royo et al.57. Briefly, a pre-treatment with lysozyme and mutanolysin was included, as well as a cell disruption step with 3 µm diameter glass beads in a FastPrep 24-5G Homogeniser (MP Biomedicals). After extraction, the DNA Purification Kit (Macherey-Nagel, Duren, Germany) was used for DNA purification.

Shotgun sequencing libraries were constructed using the NexteraXT DNA Library Preparation Kit (Illumina) and sequenced on the Illumina HiSeq2500 platform with a target depth of 5 Gb/sample (Supplementary Data 1).

Bioinformatic analysis of the metagenomic sequencing data and resistome and mobilome investigation

The quality of the raw sequencing data was first evaluated by using FastQC v0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and MultiQC v1.1158, and reads were then trimmed by using FastP v0.23.259, excluding reads shorter than 75 bp or with a mean Phred quality score below 25. Residual adaptors and barcodes were also removed by using FastP. Residual PhiX and human reads were discarded by using Bowtie2 v2.4.260. The efficiency of the quality control (QC) over the dataset was again evaluated with FastQC and MultiQC, and the reads passing QC were used for downstream analysis following two main approaches: assembly-free and assembly-based.
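The read-level filter described above (minimum length 75 bp, minimum mean Phred score 25) can be illustrated with a minimal Python sketch. This is not how FastP is implemented or invoked; the function names, the Phred+33 encoding and the treatment of boundary values (a read of exactly 75 bp or a mean of exactly 25 is kept) are assumptions made for illustration.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


def passes_qc(sequence: str, quality: str,
              min_length: int = 75, min_mean_phred: float = 25.0) -> bool:
    """Return True if a read meets both thresholds named in the text.

    Hypothetical helper: boundary values are treated as passing.
    """
    if len(sequence) < min_length:
        return False
    return mean_phred(quality) >= min_mean_phred


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '5' encodes Phred 20 under Phred+33.
    print(passes_qc("A" * 100, "I" * 100))  # long, high-quality read
    print(passes_qc("A" * 100, "5" * 100))  # long, low-quality read
    print(passes_qc("A" * 60, "I" * 60))    # short read
```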

In the assembly-free approach, the QC reads were subjected to MetaPhlAn 461 in order to perform a microbial profiling of the different samples by using the CHOCOPhlAn_202403 database. The resistome profiling per sample was assessed by aligning the QC reads against the ResFinder database62 (downloaded on 18th June 2024) as follows: (i) the ARGs contained in the ResFinder database were clustered at 90% identity and coverage by using CD-HIT63 in order to prevent spurious alignments of the short sequencing reads to conserved regions from multiple closely-related ARGs; (ii) the QC reads were aligned against the clustered ResFinder database by using Bowtie2 v2.4.2, enabling the “--end-to-end” and “--very-sensitive” flags; (iii) alignments were polished with featureCounts64, considering as positive hits only those reads that aligned just once against the clustered ResFinder database; and (iv) the obtained counts matrix was processed to calculate the counts per million reads (CPMs), adding a “bacterial marker” modification according to the formula:

$${CPM}=\frac{({ARG}\times {Ratio\; of\; bacterial\; assigned\; reads})}{{Total\; number\; of\; reads}}\times {10}^{6}$$
(1)

where “CPM” is the counts per million reads value for each ARG, “ARG” is the number of reads assigned to that ARG in the counts matrix, and “Ratio of bacterial assigned reads” is the proportion of reads per sample assigned to the kingdom “Bacteria” according to MetaPhlAn 4, so that the dataset is normalised to consider only those reads derived from bacteria.
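As a worked illustration of Eq. (1), the following Python sketch applies the formula exactly as written to a counts dictionary for a single sample. The function name, the input layout and the example numbers are hypothetical and do not come from the study data.

```python
from typing import Dict


def bacterial_cpm(arg_counts: Dict[str, int], total_reads: int,
                  bacterial_ratio: float) -> Dict[str, float]:
    """Counts per million with the 'bacterial marker' modification of Eq. (1).

    arg_counts      reads assigned to each ARG cluster (hypothetical input layout)
    total_reads     total number of QC reads in the sample
    bacterial_ratio proportion of reads assigned to Bacteria, between 0 and 1
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0.0 <= bacterial_ratio <= 1.0:
        raise ValueError("bacterial_ratio must lie between 0 and 1")
    return {
        gene: count * bacterial_ratio / total_reads * 1e6
        for gene, count in arg_counts.items()
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    counts = {"tet(W)": 200, "erm(B)": 50}
    for gene, cpm in bacterial_cpm(counts, 10_000_000, 0.8).items():
        print(gene, round(cpm, 3))
```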

In the assembly-based approach, the QCed metagenomes were assembled by using SPAdes v3.15.565, enabling the --meta option and using 96 threads, 1024 GB RAM per assembly and k-mer lengths of 21, 33, 55, 77, 99 and 121. The resulting FASTA files containing the metagenomic contigs were imported into TORMES v1.3.166 with some modifications to automate the downstream bioinformatic analysis as follows: first, contigs shorter than 500 bp were excluded, and quality control of the assemblies was performed by using QUAST v5.0.267 and by aligning the QC reads of each sample against its own assembly with Bowtie2, in order to assess the proportion of reads that were finally included in the assembly. The taxonomy of each contig was assessed by using kraken2 v2.0.868 with the --use-names option and the PlusPFP database (downloaded on 12th January 2024). Taxonomic nomenclature was refined by using Taxonkit v0.16.069 (taxdump NCBI files downloaded on 6th June 2024). Bacterial and archaeal coding sequences (CDS) were predicted and translated into amino acid sequences using Prodigal v2.6.370, enabling the --only_gene_prediction flag in TORMES. Contigs with fewer than 3 CDS were excluded from downstream analysis. Resistome profiling was performed by aligning the assemblies against the ResFinder database (downloaded on 18th June 2024) by using BLASTN v2.14.1+71. Alignments below 80% sequence similarity and coverage were excluded. Plasmid occurrence in the metagenome was evaluated by using geNomad v1.8.172 and by aligning the assemblies against the PlasmidFinder database73 (downloaded on 27th March 2024) by using BLASTN and excluding alignments below 95% sequence similarity and coverage. ARGs occurring in contigs that were classified as “plasmid” with geNomad and/or that harboured a plasmidic replicon according to PlasmidFinder were considered to be plasmidic.
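The rule for calling an ARG plasmidic can be sketched in Python as follows. This is only an illustration of the decision logic stated above, not the TORMES code: the `ArgHit` record, the function name and the assumption that the two contig sets have already been derived (geNomad "plasmid" calls, and PlasmidFinder replicon hits already filtered at 95%) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass
class ArgHit:
    """One ResFinder alignment on a contig (hypothetical record layout)."""
    contig: str
    gene: str
    identity: float  # percent sequence similarity
    coverage: float  # percent of the ARG covered


def plasmidic_args(hits: Iterable[ArgHit],
                   genomad_plasmid_contigs: Set[str],
                   replicon_contigs: Set[str],
                   min_identity: float = 80.0,
                   min_coverage: float = 80.0) -> Dict[Tuple[str, str], bool]:
    """Map (contig, gene) to True if the ARG sits on a putative plasmid contig.

    Hits below either threshold are dropped. A contig counts as plasmidic if
    it is in either set, which mirrors the "and/or" rule in the text.
    """
    plasmid_contigs = genomad_plasmid_contigs | replicon_contigs
    return {
        (hit.contig, hit.gene): hit.contig in plasmid_contigs
        for hit in hits
        if hit.identity >= min_identity and hit.coverage >= min_coverage
    }


if __name__ == "__main__":
    example_hits = [
        ArgHit("contig_1", "blaTEM", 99.5, 100.0),
        ArgHit("contig_2", "tet(W)", 97.0, 95.0),
        ArgHit("contig_3", "erm(B)", 75.0, 90.0),  # below identity threshold
    ]
    print(plasmidic_args(example_hits, {"contig_1"}, set()))
```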

Statistical analyses and data visualisation

All statistical analyses were performed in the R environment v. 3.6.0, and visualisation was performed by using ggplot274. Normality of the data was evaluated with Shapiro–Wilk tests. Categorical variables are expressed as the number of positive cases and percentage (%). Normally distributed variables are presented as mean ± standard deviation (SD), while non-normally distributed variables are presented as median and interquartile range [IQR]. Pearson’s chi-square test or Fisher’s exact test was used for categorical variables, and the Mann–Whitney U test was used for continuous variables, as appropriate, for calculating statistical significance. All p-values calculated in the manuscript were adjusted using the Benjamini-Hochberg (BH) method.
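For readers unfamiliar with the BH adjustment, a minimal Python sketch of the standard step-up procedure is given below. It is intended to follow the usual definition of BH-adjusted p-values (each p-value is multiplied by n/rank, then monotonicity is enforced from the largest p-value downwards); the analyses in this study were run in R, and the function name here is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank 1 is the smallest p-value
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative p-values only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```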

We used the vegan package75 for diversity analyses (α- and β-diversity). The Shannon diversity of ARGs or bacterial species was used as a measure of within-individual diversity (α-diversity). β-diversity was calculated using the function ‘vegdist’ of the vegan package and visualised with Principal Coordinate Analysis (PCoA); the Bray–Curtis dissimilarity index was used for microbiota data and the Jaccard index for resistome data. Permutational multivariate analysis of variance (PERMANOVA) was performed using the function ‘adonis’ of the vegan package with 9999 permutations to determine the statistical significance of composition differences. Differentially abundant features between time points were identified using DESeq276, and the p-values obtained by the Wald test were corrected for multiple testing using the Benjamini-Hochberg method.
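The three indices named above can be written out in a few lines of Python to make their definitions concrete. This is a sketch, not the vegan implementation: the Shannon index uses the natural logarithm, and the Jaccard function shown is the presence/absence form, which is an assumption because the text does not state whether the binary or the quantitative variant was used.

```python
import math
from typing import Sequence


def shannon(counts: Sequence[float]) -> float:
    """Shannon diversity H' = -sum(p * ln p) over non-zero features."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def jaccard_binary(a: Sequence[float], b: Sequence[float]) -> float:
    """Jaccard dissimilarity on presence/absence (assumed variant)."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


if __name__ == "__main__":
    # Illustrative ARG abundance profiles for two samples.
    sample_1 = [10, 0, 5, 1]
    sample_2 = [4, 3, 0, 1]
    print(shannon(sample_1), bray_curtis(sample_1, sample_2),
          jaccard_binary(sample_1, sample_2))
```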

Differences in the abundance of a given ARG or taxa were compared using the Wilcoxon test (for two groups) or the Kruskal-Wallis test (for three or more groups). Pairwise comparisons were carried out by pairwise Wilcoxon tests. Differences in the relative abundance of taxa between the two clusters were evaluated using the Mann-Whitney U test (Wilcoxon rank-sum test for 2 levels). In all of these tests, the p-value was adjusted for multiple comparisons by the FDR/Bonferroni correction.

The main separation of the microbiota in groups or clusters at each time point was performed with an unsupervised non-linear algorithm of k-means partitioning, and confirmed with the k-medoids clustering method, Partitioning Around Medoids (PAM). Both methods require specifying the number of clusters k to be generated, and we used the elbow and average silhouette method from factoextra R package77 to estimate the optimal k value. Next, clustering of the microbial data was performed with the stats R core package.
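The average silhouette criterion used to choose k can be sketched as follows. This pure-Python function scores a given cluster assignment with Euclidean distances; it is not the factoextra implementation, and the function name, the singleton convention (width 0) and the toy one-dimensional data are assumptions made for illustration. In practice the score would be computed for each candidate k and the k with the highest average width retained.

```python
import math
from typing import Dict, List, Sequence


def mean_silhouette(points: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> float:
    """Average silhouette width of a clustering (Euclidean distance)."""
    clusters: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # singleton cluster: width set to 0 by convention
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


if __name__ == "__main__":
    # Toy data: two well-separated groups of abundance values.
    data = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print(mean_silhouette(data, [0, 0, 1, 1]))  # well-separated assignment
    print(mean_silhouette(data, [0, 1, 0, 1]))  # poor assignment
```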

To further explore patterns in community composition and identify potential associations with environmental factors, we employed an unconstrained ordination approach. Unconstrained ordination is particularly valuable in this context as it allows for the identification of intrinsic patterns in ARG distributions without being restricted to pre-specified explanatory variables. Subsequently, we used the envfit function from the vegan R package to fit environmental variables onto the unconstrained ordination space.

MaAsLin278 was used to build general linear models to efficiently determine the multivariate association between relevant factors (high R2 in envfit analyses) and the resistome while accounting for potentially confounding covariates. In this case, the analysis with MaAsLin2 was longitudinal, so the first variable added was time (7 days, 1, 6 and 12 months, with 7 days as the reference). The subsequent variables were added according to the strength of each variable observed in the other analyses and reported previously: the Bifidobacterium abundance-based k-means clustering (High vs Low), the mode of delivery (vaginal vs C-section), exclusive breastfeeding (yes vs. no), antibiotic consumption (yes vs. no), and the presence of domestic pets in the household (yes vs. no). No further variables or variable groups were added. The command runs a multivariable regression model to test for the association between ARG abundance (or groups of resistance) and the above-mentioned variables as fixed effects. Finally, the input data were log-transformed.

To assess the statistical impact of clinical variables on infant resistome, a generalised linear model (GLM) was estimated using the stats R core package. The best model was selected using the Akaike Information Criterion (AIC), following a forward selection approach and choosing the one with the lowest AIC value. In addition, collinearity among explanatory variables was reviewed using the Variance Inflation Factor (VIF) with the car R package79. The influence of fixed effects was calculated via Nagelkerke’s \({R}^{2}\). The final model was as follows, and the response \({YResistance}\) represented the observed value of CPMs of total ARGs:

$${YResistance}={\beta }_{1}\left({Infant\; age}\right)+{\beta }_{2}\left({Bifidobacterium\; level}\right)$$

For the prediction of the variables infant age, lactation type and Bifidobacterium cluster from the abundance in CPMs and the other variables present, the Random Forest (RF) machine learning algorithm was applied using the randomForest R package80. We split the infant-based data in a 70:30 ratio (training set:test set). The number of predictors to be used in the RF model was determined by 10-fold cross-validation over 100 iterations using the caret R package81. The number of trees, variables for each split (mtry) and the total number of variables (node size) to grow the forest were set at 500, 18 and 18, respectively, for infant age; 500, 8 and 8 for Bifidobacterium cluster; and 500, 7 and 7 for lactation type. In addition, to evaluate model performance by comparing model predictions with the actual values of the test set, the caret R package was used to obtain the confusion matrices. To examine in more detail the relationship between the sensitivity and specificity of the RF models obtained, a ROC curve analysis was performed on the categorical variables Bifidobacterium cluster (high, low) and lactation type (maternal, artificial) using the pROC R package82.
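The evaluation steps described above (a 70:30 split followed by sensitivity and specificity taken from the confusion matrix) can be illustrated with a short Python sketch. It does not reproduce the randomForest, caret or pROC workflow; the function names, the fixed seed and the example labels are hypothetical, and the model-fitting step itself is omitted.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(items: Sequence[T], train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle with a fixed seed and split into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(actual: Sequence[str], predicted: Sequence[str],
                            positive: str) -> Tuple[float, float]:
    """Sensitivity and specificity for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    train, test = train_test_split(range(10))
    print(len(train), len(test))
    # Illustrative labels for the Bifidobacterium cluster (high vs low).
    truth = ["high", "high", "low", "low", "high"]
    guess = ["high", "low", "low", "high", "high"]
    print(sensitivity_specificity(truth, guess, positive="high"))
```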

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.