Abstract
Breastfeeding is inversely associated with cardiometabolic disease incidence in prospective studies; however, the metabolic pathways underlying these associations remain largely unknown. Here, we derive a plasma metabolomic score of lifetime total duration of breastfeeding using elastic net regularized regression in the Nurses’ Health Studies (n = 4349) and replicate it in the Women’s Health Initiative (n = 2088). Data include 181 untargeted plasma metabolites profiled by liquid chromatography mass spectrometry using blood samples collected in mid-life, and self-reported lifetime total duration of breastfeeding. We then examine the associations between the metabolite-based breastfeeding score and risk of type 2 diabetes (T2D) and cardiovascular disease (CVD) using multivariable Cox regression models and replicate them in two external cohorts. The metabolite-based breastfeeding score comprises 5 metabolites (i.e., C54:2 triglyceride, C56:2 triglyceride, C56:3 triglyceride, cotinine, indole-3-propionate) and shows a modest but statistically significant correlation with lifetime total duration of breastfeeding. The metabolite-based breastfeeding score is significantly inversely associated with T2D incidence (HR = 0.76, 95%CI = 0.71-0.82) and with CVD incidence (HR = 0.88, 95%CI = 0.84-0.93) independent of T2D and CVD risk factors. We identify plasma metabolite profiles in mid-life associated with breastfeeding duration, which are also linked to CVD and T2D risk.
Introduction
Several lines of evidence suggest that breastfeeding is prospectively associated with lower risk of cardiometabolic disease both in women and their offspring1,2,3,4. However, the mechanisms underlying the associations remain to be elucidated. Prior studies suggest that breastfeeding leads to reduced weight gain and lower midlife adiposity5,6,7,8. Furthermore, two studies in women with a history of gestational diabetes mellitus (GDM) found that blood phosphatidylcholines (PC) and branched-chain amino acids (BCAA) were inversely associated with breastfeeding duration9,10. These metabolites have been linked to the pathophysiology of T2D in previous studies (PC to lower risk and BCAA to higher risk). Another recent study in 350 women with GDM with 8 years of follow-up reported that higher intensity of breastfeeding was associated with lower glycerolipid (TAGs/DAGs) and higher phospho- and sphingolipid levels 6-9 weeks postpartum, but these changes were transient and only robust among women with GDM who did not subsequently develop T2D11. However, inference from previous studies is limited by small sample sizes, short follow-up periods from the last breastfeeding episode, and a focus on populations with GDM.
Metabolites are the downstream products of cellular activities regulated by the genome and modified by environmental factors, representing the confluence of genetic regulation, local environment, and enzymatic activity12, and have proven valuable in identifying perturbed metabolic pathways in disease pathophysiology13,14,15,16. While some metabolites are more transient, we and others have reported the long-term within-person stability of blood metabolites17,18,19,20, supporting the use of metabolomic biomarkers to elucidate disease pathophysiology and the biological links underlying exposure-risk associations. Accordingly, studies have investigated prospective associations between plasma metabolomic profiles in blood samples collected years prior to disease onset and the risk of developing chronic diseases21,22. Additionally, recent studies have investigated blood metabolomic profiles related to health behaviors, such as healthy lifestyle and dietary patterns, which have been further linked with risks of CVD and T2D23,24,25. However, no studies to date have systematically examined the associations between breastfeeding-related metabolites and future risk of T2D or CVD.
We hypothesized that associations of breastfeeding with cardiometabolic risk are mediated through changes in systemic metabolic pathways and homeostasis, specifically higher levels of lysophosphatidylcholines and lower levels of branched-chain amino acids, which have been associated with breastfeeding among women with GDM and involved in T2D development9. To test this hypothesis, we investigated associations of total duration of breastfeeding across all pregnancies with plasma metabolites using high-throughput metabolomics in two well-characterized cohorts of middle-aged to older women (Nurses’ Health Studies, NHS & NHSII), and derived a metabolite-based breastfeeding score and replicated this score in an independent cohort of postmenopausal women (Women’s Health Initiative, WHI) (Fig. 1). Subsequently, we investigated the prospective associations of metabolite-based breastfeeding score with the risk of T2D and CVD (as a composite endpoint of coronary heart disease, coronary revascularization, and stroke). In addition, we externally replicated the associations of this metabolite-based breastfeeding score with cardiometabolic disease in the PREvención con DIeta MEDiterránea (PREDIMED) trial and the WHI cohort.
Results
Baseline characteristics for 159,684 NHS and NHSII participants included in the analysis of the association between self-reported breastfeeding and T2D and CVD are shown in the Supplementary Table S2, and the corresponding association analyses are presented in Supplementary Tables S3-4. Average age at baseline was 52.2 years in NHS and 35.8 years in NHSII. We documented 16,601 incident T2D cases over a period of 4,139,843 person-years and 15,488 incident CVD cases over a period of 4,197,137 person-years of follow-up in the NHS and NHSII. The baseline participant characteristics of the four cohorts included in the analyses for the associations between metabolite-based breastfeeding score and incident T2D and CVD are presented in Table 1. Average age at blood draw was 57.1 years in NHS, 44.8 years in NHSII, 67.0 years in WHI, and 67.8 years in PREDIMED. Participants in the NHS/NHSII were younger than participants in the other cohorts with 18% in the NHS and 78% in NHSII being premenopausal, while both WHI and PREDIMED consisted only of postmenopausal women. The distribution of women on postmenopausal hormones was similar in NHS and WHI, but only 2% of postmenopausal women in PREDIMED were on hormones at the time of blood collection. We documented 797 T2D and 613 CVD events in the NHS over a follow-up period of 20 and 24 years, respectively, 129 T2D and 81 CVD events in the NHSII over a period of 22 and 20 years, respectively, 1043 CVD events in the WHI and 138 T2D and 91 CVD events in the PREDIMED over a period of 3.8 and 4.7 years, respectively.
Self-reported breastfeeding and risk of T2D, CVD
Longer lifetime total duration of self-reported breastfeeding was inversely associated with T2D (12+ vs 0 months, pooled adjusted HR [aHR] 0.85 (95%CI 0.82-0.89), p-trend <0.0001) and CVD (12+ vs 0 months, pooled aHR 0.93 (95%CI 0.88-0.97), p-trend = 0.009) risks in the fully adjusted Cox models. Additional adjustment for AHEI or socioeconomic status (SES) did not attenuate the estimates (Supplementary Tables S3-4).
Metabolite-based breastfeeding score
Characteristics of the NHS and NHSII (n = 4349) and WHI (n = 2088) participants included in the analyses of metabolomic score derivation and replication are presented in Supplementary Table S5. Average age at blood collection was 56.6 years in NHS, 44.4 years in NHSII, and 67.0 years in WHI, with average total lifetime duration of breastfeeding being 13.6 months in NHS/NHSII and 1.9 months in WHI among those who ever breastfed. Out of 181 metabolites, 5 metabolites were selected for the metabolite-based breastfeeding score by elastic net regression (with mean squared error = 1.39): a group of 3 highly correlated triacylglycerols (TAGs C54:2, C56:2, C56:3), cotinine, and indole-3-propionate. The triacylglycerols and cotinine were inversely associated with longer self-reported duration of breastfeeding, while indole-3-propionate was positively associated (Fig. 2, panels D–E), and the metabolites correlating with longer duration of breastfeeding were associated with lower risks of T2D and CVD (Fig. 2E). In addition, when we examined the internal correlations between these 5 metabolites, indole-3-propionate was inversely correlated with the other 4 metabolites in the NHS/NHSII, WHI and PREDIMED (Fig. 2, panels A–C). In a sensitivity analysis, we recalculated the metabolite-based breastfeeding score coefficients among control participants and observed similar beta coefficients for the selected metabolites.
Correlations among the 5 metabolites of the metabolomic score in the Nurses’ Health Studies (A), Women’s Health Initiative (B) and PREDIMED (C). Weights from elastic net regression for the metabolomic score (D) and associations of metabolomic score components with history of breastfeeding and incidence of CVD and T2D (E) in the Nurses’ Health Studies. Selection of the metabolomic score was based on 181 metabolites measured on C8-positive, HILIC-positive and HILIC-negative platforms among 4349 parous participants from the NHS and NHSII cohorts using elastic net regression. Breastfeeding (BF) beta coefficients were estimated using linear regression from a mutually adjusted model (all 5 metabolites entered in the same model) and represent the associations with longer self-reported breastfeeding; Cox proportional hazards regression models were used to estimate the beta coefficients of the associations between the 5 selected metabolites and incident CVD and T2D. The models were adjusted for the following covariables at blood draw: age, fasting status, cohort (NHS, NHSII), endpoint and case-control status from the pooled 13 nested case-control studies. All statistical tests were two-sided, and did not adjust for multiple comparisons for this analysis. *p < 0.05, **p < 0.01, ***p < 0.001 after FDR-correction. Abbreviations: CVD, cardiovascular disease; NHS, Nurses’ Health Study; PREDIMED, Prevención con Dieta Mediterránea; T2D, type 2 diabetes. Source data are provided as Source Data file.
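To illustrate how elastic net regression can both shrink and select predictors, the following is a minimal sketch of a coordinate-descent elastic net on toy data. It is not the authors' implementation: the function names, the penalty parameterization (`lam`, `alpha`), and the toy data are assumptions made for illustration, and the real analysis used 181 metabolites with tuning choices not described here.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent minimizing
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    Assumes the columns of X and y are already centered (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Partial residual: fit from every predictor except j.
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return beta


# Hypothetical toy data: y depends only on the first column;
# the second column is orthogonal to the first and carries no signal.
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]
weights = elastic_net(X, y, lam=0.5, alpha=0.5)
print(weights)
```

In this toy case the uninformative predictor receives a weight of exactly zero while the informative one is shrunk below its unpenalized value of 2, which is the behavior that lets a small set of metabolites be selected from a larger panel.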
The metabolite-based breastfeeding score showed a modest but statistically significant and externally validated correlation with self-reported breastfeeding duration (r = 0.10, p < 0.0001 in the training set, r = 0.12, p < 0.0001 in the testing set, and r = 0.04, p = 0.046 in the external validation sets). When stratified by menopausal status and postmenopausal hormone use at blood draw, the score correlated with self-reported breastfeeding duration only among premenopausal women and postmenopausal women not on hormones (premenopausal women: r = 0.11, p < 0.0001, postmenopausal women not on hormones: r = 0.06, p = 0.05, as opposed to postmenopausal women on hormones: r = 0.01, p = 0.76). The same trend was observed in the WHI (postmenopausal women not on hormones: r = 0.05, p = 0.04, postmenopausal women on hormones: r = 0.03, p = 0.39). In a sensitivity analysis, we recalculated the metabolite-based breastfeeding score coefficients among control participants. The metabolite-based breastfeeding score developed including both cases and controls showed a correlation coefficient of 0.86 with the metabolite-based breastfeeding score developed including controls only in NHS/NHSII and 0.92 in WHI (Supplementary Fig. 1).
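As a sketch of how a metabolite-based score of this kind can be computed and compared with self-reported duration, the example below standardizes each metabolite, forms a weighted sum, and computes a Pearson correlation. The weights, metabolite names, and data are hypothetical placeholders, not the published coefficients, and the source does not state which correlation estimator was used.

```python
import math
import statistics


def zscores(values):
    """Standardize a list of metabolite levels to mean 0, SD 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def metabolite_score(standardized, weights):
    """Weighted sum of standardized metabolite levels for one participant."""
    return sum(weights[name] * standardized[name] for name in weights)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical weights and data for four participants (illustration only).
weights = {"tag": -0.5, "ipa": 0.5}
raw = {"tag": [4.0, 3.0, 2.0, 1.0], "ipa": [1.0, 2.0, 3.0, 4.0]}
months_breastfed = [0.0, 6.0, 12.0, 18.0]

standardized = {name: zscores(levels) for name, levels in raw.items()}
scores = [
    metabolite_score({name: standardized[name][i] for name in weights}, weights)
    for i in range(len(months_breastfed))
]
r = pearson_r(scores, months_breastfed)
print(scores, r)
```

The toy data are constructed to be perfectly linear, so the correlation here is far stronger than the modest correlations reported above.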
Metabolite-based breastfeeding score and T2D and CVD risk
In multivariable analyses adjusting for age, fasting status, race, age at 1st birth, pre-pregnancy BMI, family history of CVD or T2D, smoking status, alcohol intake, physical activity, parity, menopausal status and postmenopausal hormone and aspirin use at blood draw, the metabolite-based breastfeeding score was significantly inversely associated with incident T2D risk in all individual cohorts: in NHS (adjusted HR [aHR]=0.83[95%CI 0.76-0.90]), NHSII (aHR=0.55[0.46-0.65]) and PREDIMED (aHR=0.77[95%CI 0.61-0.99]; Fig. 3). When we meta-analyzed the estimates from the individual cohorts, 1 SD increase in the metabolite-based breastfeeding score was associated with 24% lower T2D risk (aHR=0.76[95%CI 0.71-0.82]). Significant heterogeneity across the 3 cohorts was observed with Cochran’s Q = 16.8, p = 0.0002, tau2 = 0.04. Additional adjustment for AHEI (aHR=0.77[95%CI 0.71-0.83]) or BMI at blood collection (aHR=0.84[95%CI 0.78-0.91]) did not change the associations with incident T2D risk in NHS/NHSII. Since cotinine is a metabolite of nicotine and therefore positively associated with cigarette smoking26, we conducted a sensitivity analysis examining the association between metabolite-based breastfeeding score excluding cotinine and T2D risk in NHS/NHSII and observed a significant inverse association with aHR=0.78 [95%CI 0.72, 0.84].
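The pooling and heterogeneity statistics above can be approximated from the cohort-specific estimates. The sketch below applies fixed-effect inverse-variance pooling and Cochran's Q to the three fully adjusted T2D hazard ratios quoted in the text. Two assumptions apply: standard errors are back-calculated from the rounded 95% confidence intervals, and a fixed-effect model is used because the pooling model is not stated here. The output should therefore land close to, but not exactly on, the reported values.

```python
import math


def log_hr_and_se(hr, lower, upper):
    """Log hazard ratio and its standard error, recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return math.log(hr), se


def pool_fixed_effect(estimates):
    """Inverse-variance pooled HR, its 95% CI, and Cochran's Q."""
    logs = [log_hr_and_se(*e) for e in estimates]
    weights = [1.0 / se ** 2 for _, se in logs]
    pooled = sum(w * b for w, (b, _) in zip(weights, logs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, logs))
    ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
    return math.exp(pooled), ci, q


# (HR, lower, upper) per 1 SD of the score, as quoted in the text.
t2d_estimates = [
    (0.83, 0.76, 0.90),  # NHS
    (0.55, 0.46, 0.65),  # NHSII
    (0.77, 0.61, 0.99),  # PREDIMED
]
pooled_hr, pooled_ci, cochran_q = pool_fixed_effect(t2d_estimates)
percent_lower_risk = (1.0 - pooled_hr) * 100.0
print(pooled_hr, pooled_ci, cochran_q, percent_lower_risk)
```

The last line converts a hazard ratio into a percent difference in risk, the same arithmetic that turns a hazard ratio of 0.76 into "24% lower risk".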
A Associations with T2D were assessed in 3 cohorts (n total = 4720): Nurses’ Health Study (n = 2404), Nurses’ Health Study II (n = 1772), and PREDIMED (n = 544) using Cox proportional hazards models. B Associations with CVD were assessed in 4 cohorts (n total = 6792): Nurses’ Health Study (n = 2404), Nurses’ Health Study II (n = 1772), Women’s Health Initiative (n = 2088) and PREDIMED (n = 528) using Cox proportional hazards models. The forest plots display the hazard ratios (dot) and 95% confidence intervals (line). 181 HILIC-positive, C8-positive and HILIC-negative metabolites were used in the elastic net regression to select 5 metabolites for the metabolomic score of breastfeeding in the derivation cohort (NHS + NHSII). Model 1 was adjusted for age and fasting status and stratified by endpoint and case-control status. Model 2 was additionally adjusted for race, age at 1st birth, pre-pregnancy BMI, family history of CVD/T2D, smoking status, alcohol intake, physical activity, parity, menopausal status and postmenopausal hormone and aspirin use at blood draw. All statistical tests were two-sided. Heterogeneity statistics for (A) T2D meta-analysis: model 1: Cochran’s Q = 25.8, p < 0.0001, tau2 = 0.05; model 2: Cochran’s Q = 16.8, p = 0.0002, tau2 = 0.04; for (B) CVD meta-analysis: model 1: Cochran’s Q = 5.19, p = 0.16, tau2 = 0.002; model 2: Cochran’s Q = 5.21, p = 0.16, tau2 = 0.003. Abbreviations: BMI, body mass index; CVD, cardiovascular disease; NHS, Nurses’ Health Study; PREDIMED, Prevención con Dieta Mediterránea; T2D, type 2 diabetes; WHI, Women’s Health Initiative. Source data are provided as Source Data file.
With respect to incident CVD risk, the metabolite-based breastfeeding score was suggestive of a slight inverse association in NHS (aHR=0.96[95%CI 0.87-1.05]), NHSII (aHR=0.95[95%CI 0.72-1.25]), WHI (aHR=0.84[95%CI 0.78-0.90]), and PREDIMED (aHR=0.86[95%CI 0.67-1.10]). The meta-analysis of the four cohorts resulted in a significant inverse association, with 1 SD increase in the metabolite-based breastfeeding score associated with a 12% lower CVD risk (aHR=0.88[95%CI 0.84-0.93]). No significant heterogeneity was observed across the 4 cohorts used for CVD meta-analysis (Cochran’s Q = 5.21, p = 0.16, tau2 = 0.003; p-het=0.10 and 0.18 for model 1 and 2 respectively). Additional adjustment for AHEI score (aHR=0.97[95%CI 0.89-1.06]) or BMI at blood collection (aHR=0.99[95%CI 0.90-1.08]) in the NHS/NHSII attenuated the association. Excluding cotinine from the metabolite-based breastfeeding score also attenuated the association (aHR=0.99[95%CI 0.91-1.08]).
Since women with a history of adverse pregnancy outcomes, such as GDM, preeclampsia, preterm birth, and pregnancy-induced hypertension, are at increased risk of T2D and CVD, we investigated the associations between the metabolite-based breastfeeding score and risk of T2D and CVD among this high-risk population in NHS/NHSII. Among the 2681 women who had a history of adverse pregnancy outcomes at time of blood draw, the metabolite-based breastfeeding score was significantly inversely associated with T2D risk (aHR=0.53[95%CI 0.44-0.64]) but not associated with CVD risk (aHR=0.94[95%CI 0.70-1.25]).
Discussion
Leveraging data from three large cohorts of > 6000 parous women, we identified and validated a metabolomic signature in plasma samples collected in mid-life or later life that was associated with longer lifetime duration of breastfeeding. Importantly, using the prospective data from the four large cohorts, we unraveled inverse associations of this metabolomic signature with future risks of T2D and CVD even after the adjustment for known cardiometabolic risk factors. The metabolomic signature highlights the potential metabolic pathways through which breastfeeding is associated with T2D and CVD in US and Spanish populations of women.
While there are multiple metabolomics studies published on T2D and CVD27,28,29, there is a paucity of metabolomics studies in breastfeeding. One study reported that breastfeeding for 3+ months among women with GDM was associated with changes in the metabolic profile that have been linked to the early pathogenesis of T2D, such as a higher total lysophosphatidylcholine/total phosphatidylcholine ratio and lower leucine and total branched-chain amino acid concentrations9. Zhang et al. observed a shift away from the glycerolipid towards the phospho- and sphingolipid metabolism pathways among women with a history of GDM and greater exclusivity of breastfeeding, and proposed this shift as a potential mechanism underlying the metabolic benefits of breastfeeding in mothers11. In the same vein, we report inverse associations of self-reported breastfeeding with the TAG components of the metabolomic score in our study. We also observed a stronger inverse association between the metabolite-based breastfeeding score and T2D risk among women with a history of adverse pregnancy outcomes, suggesting that breastfeeding may have a differential, possibly more beneficial, effect in this high-risk population compared to women with average risk of T2D.
According to a recent systematic review leveraging evidence of metabolomics involvement in T2D, higher levels of indolepropionate, a tryptophan metabolite, were associated with lower T2D risk, after pooling hazard ratios from 8 prospective cohorts (pooled RR1-SD 0.82 95%CI 0.74-0.92)27. A conceivable explanation for the purported protective effect of indolepropionate against the onset of T2D may lie in two main factors. Firstly, its ability to influence the secretion of incretin hormones from enteroendocrine L cells, notably glucagon-like peptide (GLP)−1, which are known to be pivotal in the development of T2D30,31. Secondly, indole-3-propionic acid demonstrates significant antioxidative stress capabilities, indicating a potential role in safeguarding β-cells against damage caused by metabolic and oxidative stress, and potentially mitigating amyloid accumulation31. Recent prospective and cross-sectional studies have also reported inverse associations between indole-3-propionate and CVD32,33. One possible explanation could be that indole-3-propionate modulates the pregnane X receptor (PXR), a xenobiotic-activated nuclear receptor present in vascular endothelium, which asserts an anti-inflammatory effect and can induce vasodilation34.
Positive associations have been observed between the triglycerides included in our metabolomics score (TG 54:2 and 56:3) and incident T2D risk (pooled RR1-SD 1.42 [0.74-0.92] and 1.22 [1.08-1.39], respectively)27. Emerging evidence supports the link between triglycerides with low double-bond content and low carbon number, i.e., saturated and monounsaturated acyl chains, and increased risk of T2D35, possibly due to the differential response of TAGs with low vs high double-bond content to insulin activity and sensitivity, both acutely and over time. In the Framingham Heart Study, TAGs of lower double-bond content decreased in response to insulin action and were elevated in the setting of insulin resistance35. Similarly, in the context of CVD, TAGs with low double-bond content and shorter chain length, including TAG 54:2, were most consistently associated with higher CVD risk36,37, suggesting that the significance of specific TAG species in the context of cardiometabolic diseases might have been underestimated in previous research due to an undue emphasis on total triglyceride levels. Cotinine, a byproduct of nicotine metabolism, was selected into the metabolomic signature by elastic net regression, likely due to the strong negative relationship between breastfeeding and smoking habits. However, removing cotinine from the metabolite-based breastfeeding score did not significantly change the results for T2D risk but attenuated the association for CVD risk.
Cotinine, the primary metabolite of nicotine, is widely recognized as a biomarker for smoking38. However, a previous study showed that never smokers can also be exposed to cotinine through dietary sources such as potatoes, tomatoes, and eggplant39, suggesting that cotinine is influenced not only by smoking but also by diet. In our study, the association between the metabolite-based breastfeeding score and T2D and CVD risks remained significant even after adjusting for smoking status, further supporting the role of cotinine beyond smoking exposure. Additionally, a sensitivity analysis excluding cotinine from the breastfeeding score yielded results consistent with our main analysis, reinforcing the robustness of our main findings.
The plasma metabolome serves as a reflection of the overall metabolic balance influenced by various factors such as diet40, genetic variabilities41, the microbiome42, and health status43. It is possible that there may be misclassification in the self-reported breastfeeding. However, such misclassification is likely to be non-differential and while the metabolome can be changed by breastfeeding, it is expected to be independent of reporting errors. Furthermore, we observed similar positive correlations in the validation dataset. As we observed similar effect estimates additionally adjusting for BMI at blood collection, the observed associations between the metabolite-based breastfeeding score and risk of T2D and CVD seem to operate beyond adiposity at time of metabolomic profiling. It is possible that hormonal changes due to breastfeeding may have in part led to consequent changes in the metabolic pathways resulting in reduced risk of T2D and CVD. Breastfeeding leads to elevation in prolactin levels44, which has been associated with decreased risk of T2D45. Circulating prolactin levels have wide effects on glucose metabolism, inversely correlated with triglycerides and positively correlated with HDL-cholesterols46. Low prolactin levels have been associated with higher insulin resistance and beta-cell dysfunction46,47,48. Further mechanistic research is warranted to elucidate the underlying biological mechanisms linking breastfeeding and risks of T2D and CVD.
Our study has several strengths. First, leveraging multiple US cohorts and one Spanish cohort with comprehensive covariate data and long-term follow-up enhanced the geographic diversity of the sample, increasing the generalizability and validity of the results. Second, CVD endpoints were adjudicated by dedicated committees, strengthening the validity of this critical outcome. Third, the methods used to perform the metabolomic analysis have been shown to be reproducible17 and the metabolomics data were generated at the same laboratory for all cohorts. We were also able to replicate our metabolite-based breastfeeding score and its subsequent associations with T2D and CVD risk in external replication cohorts. Another strength of our analysis was the consideration of the metabolite-based breastfeeding score and its correlation with self-reported breastfeeding stratified by menopausal status at blood collection. However, several limitations merit consideration. Firstly, this analysis was limited as we focused solely on the 181 named metabolites with HMDB IDs from 3 untargeted metabolomics platforms (HILIC-positive, HILIC-negative, and C8-positive platforms), which were measured in > 4000 parous women in NHS/NHSII, to maximize sample size. It is possible that other metabolites mediate the observed associations between lifetime total breastfeeding duration and risks of T2D and CVD, which, however, does not undermine the validity of our findings. Future structural annotation of currently unidentified peaks may help identify new biomarkers associated with breastfeeding. Secondly, the elastic net regression assumed linear relationships between breastfeeding and metabolites, overlooking potential nonlinear relationships or interactions (product terms) between metabolites. Despite the robust performance of the metabolomic signature, incorporating additional metabolites and advanced machine learning techniques that consider nonlinear relationships or interactions could enhance the approach.
We examined the plasma metabolic profile at a single time point and acknowledge that, given the dynamic nature of the human metabolome, repeated metabolomic assessments using blood samples collected at multiple timepoints may better inform the changes in metabolite profiles related to breastfeeding duration. However, we previously reported that the majority of the measured metabolites exhibit reasonable within-person stability over short and long periods of time17,18. Specifically, the 1-2 year and 10-year ICCs for the metabolites selected into the score were > 0.4, similar to that of plasma cholesterol (10-year ICC = 0.39), a well-established CVD risk marker. Additionally, we acknowledge that plasma metabolites may not represent the metabolite profiles in local tissue but rather system-wide profiles, which are more relevant to T2D and CVD development. Although we observed statistically significant correlations between the metabolite-based breastfeeding score and self-reported breastfeeding duration, the correlations were weak, and the significance is likely driven by the large sample size. However, we were able to replicate the significant positive correlation in an independent dataset (WHI), supporting the validity of the developed metabolite-based breastfeeding score. The model satisfied monotonicity and showed no evidence of gross misspecification, supporting the modeling of breastfeeding duration as a continuous variable. However, we selected metabolites that are predictive of breastfeeding duration modeled as a continuous variable, and therefore it is possible that metabolomic biomarkers that are non-linearly associated with breastfeeding duration were not selected. Due to the observational design, causality could not be firmly established. While a strength of this study is its use of data from multiple cohorts with long-term follow-up, the metabolite-based breastfeeding score was developed using blood samples collected more than two decades ago.
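The point that a weak correlation can reach statistical significance mainly because of sample size can be illustrated numerically. The sketch below uses the Fisher z approximation for a two-sided test of a Pearson correlation; this is an assumed, generic test chosen for illustration, and the test actually used in the study is not stated here. The smaller sample size of 100 is hypothetical.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 via the Fisher z transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Same correlation, two sample sizes: a hypothetical small study
# and a sample the size of the NHS/NHSII derivation set.
p_small = correlation_p_value(0.10, 100)
p_large = correlation_p_value(0.10, 4349)
print(p_small, p_large)
```

With r held at 0.10, the p-value is well above 0.05 in the small sample and far below 0.0001 in the large one, so the p-value alone says little about the strength of the relationship.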
Given the potential change in lifestyle patterns over the past decades, which may influence blood metabolite levels, the metabolite-based breastfeeding score presented in our study may not be generalizable to more recent cohorts. To examine health outcomes after long-term follow-up, there is always the dilemma that exposures were assessed decades prior to the outcome of interest. However, the underlying biological mechanisms and links should not change over time. Thus, it remains critical to investigate the long-term impacts of exposures, and our study provides unique data presenting the potential biological link between breastfeeding and future T2D and CVD development. We acknowledge that we were not able to investigate breastfeeding intensity as we did not have detailed information on exclusive breastfeeding for all the participants. Average total duration of breastfeeding was fairly short in our study population, and therefore we were only able to examine 12 months or more as the highest exposure category, which may limit the generalizability of our findings and might underestimate the true effect of breastfeeding on cardiometabolic conditions. The majority of the study participants were white, limiting the generalizability of the study. Although cross-population reproducibility of the signature was assessed, further validation in racially and ethnically more diverse populations and exploration of associations with other chronic diseases are warranted. We acknowledge that the datasets used for the current analyses differ in study population, including geographic location, study design, and detailed definition of the outcome; in particular, the CVD outcome definitions differed across studies. However, a strength of our study is that we observed a similar direction of associations in these different datasets, which supports the reproducibility and validity of our findings.
Since we only had blood metabolomics data at one timepoint, we were not able to account for potential time-varying effects of blood metabolite levels, which could differ between those who are at risk and those who are not at risk of T2D or CVD. As our data include participants who were 51 years old on average at blood collection, with blood samples collected > 10 years after their breastfeeding exposure, we were able to investigate the potential systemic long-term impact of breastfeeding and its subsequent associations with T2D/CVD risk. We were not able to examine the breastfeeding-associated plasma metabolite signature in a younger cohort with blood samples collected more proximal to the last breastfeeding exposure, as this was beyond the scope of this study. Studies in younger cohorts are warranted to understand the short-term systemic impact of breastfeeding. We acknowledge the potential limitation of imputing missing values below the limit of detection with half the minimum value for that metabolite. Lastly, individuals who give birth to a surviving infant for whom to provide breastmilk, who are healthy enough to breastfeed, and who have social support to breastfeed may differ fundamentally from those who do not and/or cannot breastfeed. We were unable to control for these important factors given the available data but acknowledge this limitation.
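The half-minimum imputation mentioned above can be sketched as follows. This is a minimal illustration, assuming values below the limit of detection are recorded as `None`; the function name and data layout are hypothetical and do not reflect the study's actual pipeline.

```python
def impute_half_minimum(values):
    """Replace missing (below-LOD) values with half the smallest observed value.

    `values` holds the measurements of one metabolite across participants,
    with None marking values below the limit of detection (an assumption).
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing was detected, so there is no minimum to anchor the imputation.
        return list(values)
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in values]


levels = [4.0, None, 2.0, 8.0, None]
imputed = impute_half_minimum(levels)
print(imputed)
```

Because every imputed participant receives the same value, the approach compresses variability at the low end of the distribution, which is one reason it is listed as a limitation.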
In summary, based on consistent findings across multiple independent cohorts, our study demonstrates that longer lifetime total breastfeeding duration is associated with a metabolite-based breastfeeding score consisting of plasma metabolites measured during mid-life, including C54:2 triglyceride, C56:2 triglyceride, C56:3 triglyceride, cotinine, and indole-3-propionate. The metabolite-based breastfeeding score was associated with lower risk of T2D and CVD in women beyond adiposity at time of metabolomic profiling. Further investigation of the underlying biological pathways of the constituent metabolites will deepen our understanding of the biological mechanisms linking breastfeeding to cardiometabolic health.
Methods
Study populations
Primary analyses were performed in the prospective Nurses’ Health Studies (NHS/NHSII). The NHS was initiated in 1976 enrolling 121,700 female nurses aged 30-55 years49. The NHSII was established in 1989 enrolling 116,429 female US nurses aged 25–42 years50. In both cohorts, mailed questionnaires were administered biennially to assess reproductive and lifestyle factors and health status, with follow-up rates > 90%. Blood samples were collected from 32,826 NHS participants between 1989–1990 and 29,611 NHSII participants between 1996–1999 using standard protocols51,52. Within the NHS/NHSII, 13 nested case-control studies were previously conducted for blood metabolomic profiling (Supplementary Table S1)25,53. After excluding participants who are nulliparous, missing metabolomics data and breastfeeding status, a total of 4349 parous participants remained for the derivation of metabolite-based breastfeeding score. For the prospective T2D and CVD risk analyses, the baseline was set as the respective blood draw date for each participant. For these prospective analyses, participants with prevalent T2D, CVD, or cancer at baseline were additionally excluded, leaving a total of 2404 participants in the NHS and 1772 in the NHSII. We also conducted a subgroup analysis restricting to those with history of adverse pregnancy event (i.e., gestational hypertension, preeclampsia, GDM; n = 2681) and examined the association between metabolite-based breastfeeding score and T2D and CVD risk. The study protocol was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required. The present study used anonymized data that were originally collected and therefore considered non-human research by IRBs.
External replication was performed for T2D risk in a nested case-cohort study within the Prevención con Dieta Mediterránea (PREDIMED)54 study, and for CVD risk in a nested case-control study of coronary heart disease within the Women’s Health Initiative (WHI)53 and in a nested case-cohort study within the PREDIMED study.
The WHI study enrolled 161,808 U.S. postmenopausal women aged 50 to 79 years from 1993 to 1998 in an observational study (WHI-OS) or one or more of three randomized controlled trials55,56. Participants completed baseline socio-demographic, diet, lifestyle, and medical history questionnaires. Plasma samples were collected at enrollment using EDTA tubes, processed immediately, and stored in −70 °C freezers53. The current study utilized data from an ancillary 1:1 matched coronary heart disease case-control study nested within the WHI, which included 2306 participants with blood samples53. Women with a history of CVD and cancer at baseline, without metabolomic profiling, who were never pregnant or with missing breastfeeding status were excluded. A total of 2088 participants were included in the WHI replication analyses. The protocol was approved by the Fred Hutchinson Cancer Research Center Institutional Review Board, Seattle, WA. Written informed consent was obtained from all participants.
PREDIMED, a multicenter randomized controlled trial among individuals at high cardiovascular risk, was carried out in Spain from 2003 to 2010 and examined the effects of the traditional Mediterranean diet on the primary prevention of CVD, with T2D as a secondary outcome. The primary outcomes have been published elsewhere54,57. Fasting plasma EDTA samples were collected at baseline, processed at each recruiting center no later than 2 h after collection, and stored in −80 °C freezers58. The current study included 528 women for the prospective CVD analyses and 544 women for the T2D analyses from two nested case–cohort studies (CVD and T2D outcomes) with metabolomics profiling. The IRB of Hospital Clinic (Barcelona, Spain) approved the study protocol, and written informed consent was obtained from all participants. The flow chart of included studies (primary and validation) and the analysis approach are presented in Fig. 1.
Plasma metabolite profiling
Plasma metabolomic profiling for NHS, NHSII, WHI, and PREDIMED was performed in the same laboratory at the Broad Institute of MIT and Harvard (Cambridge, MA, USA) using a liquid chromatography-mass spectrometry (LC-MS) platform as described elsewhere17,25,53. We excluded metabolites that were unstable due to delayed processing17. Metabolite values that were missing because they fell below the limit of detection were imputed with half the minimum value for that metabolite, separately within each case-control study. We used the inverse-normal transformation within each case-control study to correct for batch effects and to scale metabolites to the same range. The final number of named metabolites available for all 4349 participants, measured on the HILIC-positive, HILIC-negative, and C8-positive platforms in NHS/NHSII, was 181.
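The two preprocessing steps described above can be illustrated with a minimal sketch. It assumes a rank-based inverse-normal transformation with the Blom offset (c = 3/8) and average ranks for ties; neither choice is stated in the text, and the function names and the `None` coding for values below the limit of detection are hypothetical.

```python
from statistics import NormalDist


def impute_half_min(values):
    """Replace missing values (None) with half the minimum observed value."""
    observed = [v for v in values if v is not None]
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in values]


def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse-normal transformation.

    The Blom offset c = 3/8 is an assumption; tied values share their
    average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


def preprocess_metabolite(values_by_study):
    """Apply both steps separately within each nested case-control study."""
    return {
        study: inverse_normal_transform(impute_half_min(values))
        for study, values in values_by_study.items()
    }
```

Running both steps within each sub-study, rather than on the pooled data, is what allows the transformation to absorb between-batch differences in scale and location.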
Assessment of lifetime total breastfeeding duration
In the NHS, breastfeeding history was assessed once, in 1986, when most of the women had completed their reproductive lifespan. Women were asked to report the lifetime total duration of breastfeeding for all pregnancies as a categorical variable: “cannot remember” (considered as missing breastfeeding information), “did not breastfeed”, “<1”, “1–3”, “4–6”, “7–11”, “12–17”, “18–23”, “24–35”, “36–47”, and “≥48 months”. Participants in the NHSII reported their breastfeeding duration in three follow-up questionnaires (1993, 1997, and 2003), using the same categories as in the NHS. The NHSII questionnaire in 1993 asked about lifetime breastfeeding history. The 1997 questionnaire asked for detailed information about breastfeeding history for each birth59. In 2003, women who reported pregnancies subsequent to 1997 were asked to provide breastfeeding duration in a supplementary questionnaire. We used the breastfeeding history data from the questionnaire cycle that most closely preceded the blood collection. To allow harmonization with the NHS data, in NHSII we calculated the cumulative breastfeeding duration by summing the breastfeeding durations reported for each birth prior to blood collection. We used the following categories of cumulative lifetime breastfeeding duration: 0 months, 1–6 months, 7–11 months, and 12+ months. Previous studies have demonstrated that both self-reported breastfeeding initiation and duration are highly reliable60,61. In the WHI, women who reported at enrollment having at least one live birth and who were not missing information on ever breastfeeding were included. Women were asked “Thinking about all the children you breastfed, how many months total did you breastfeed?” Responses were recorded as a categorical variable indicating a cumulative lifetime duration of breastfeeding: 0, 1–6 months, 7–12 months, and 13+ months.
In PREDIMED, no data on lifetime total breastfeeding duration were available.
Ascertainment of type 2 diabetes (T2D)
In the NHS, a supplementary questionnaire was mailed to women who reported physician-diagnosed diabetes on a baseline or any biennial questionnaire62. A validation study demonstrated a high level of confirmation (98%) of self-reported T2D63. In accordance with the National Diabetes Data Group64, diagnosed cases were required to meet one of the following criteria: a) an elevated glucose concentration (i.e., fasting plasma glucose ≥7.8 mmol/L, random plasma glucose ≥11.1 mmol/L, or plasma glucose ≥11.1 mmol/L at 2 or more hours after an oral glucose load) and at least one symptom related to diabetes (i.e., excessive thirst, polyuria, weight loss, or hunger); b) in the absence of symptoms, at least two elevated glucose concentrations on different occasions; or c) treatment with insulin or an oral or other hypoglycemic medication. For cases of T2D identified after 1998, the revised American Diabetes Association criteria were applied using the fasting glucose cutoff of 7.0 mmol/L65.
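The NHS confirmation criteria above amount to a simple decision rule, sketched here for illustration only. The function name and arguments are hypothetical, each listed glucose value is assumed to come from a different occasion, and the only change applied to cases identified after 1998 is the lower fasting cutoff mentioned in the text.

```python
def is_confirmed_t2d(diagnosis_year, fasting=(), random_glucose=(),
                     ogtt_2h=(), has_symptoms=False, on_treatment=False):
    """Return True if a self-reported case meets criterion a), b), or c).

    All glucose values are in mmol/L. Assumption: each value was measured
    on a different occasion.
    """
    # Fasting cutoff: 7.8 mmol/L (National Diabetes Data Group), lowered to
    # 7.0 mmol/L for cases identified after 1998.
    fasting_cutoff = 7.0 if diagnosis_year > 1998 else 7.8
    n_elevated = (
        sum(g >= fasting_cutoff for g in fasting)
        + sum(g >= 11.1 for g in random_glucose)
        + sum(g >= 11.1 for g in ogtt_2h)
    )
    if on_treatment:          # criterion c): hypoglycemic medication
        return True
    if has_symptoms:          # criterion a): elevated glucose plus a symptom
        return n_elevated >= 1
    return n_elevated >= 2    # criterion b): two elevated values, no symptoms
```

For example, a symptomatic participant with a single fasting glucose of 7.5 mmol/L would not be confirmed under the earlier cutoff but would be confirmed if identified after 1998.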
In the PREDIMED trial, new diagnoses of T2D during follow-up were adjudicated by the Clinical Endpoint Committee (blinded to the intervention group). The American Diabetes Association criteria, namely, two confirmations of fasting plasma glucose ≥7.0 mmol/L or 2 h plasma glucose ≥11.1 mmol/L after a 75 g oral glucose load, were used to adjudicate cases.
Ascertainment of cardiovascular disease (CVD)
In the NHS and NHSII, CVD was defined as a combined endpoint of non-fatal or fatal myocardial infarction (MI), stroke, coronary artery bypass graft surgery (CABG), or percutaneous coronary intervention (PCI). When a participant (or a family member of a deceased participant) reported an incident event, permission was obtained for physicians who were blinded to the participant’s risk factor status to examine the medical records. For each endpoint, the month and year of diagnosis were recorded as the diagnosis date. Non-fatal events were confirmed through review of medical records. MI was confirmed according to the WHO criteria66 on the basis of symptoms and diagnostic electrocardiogram changes or elevated cardiac enzymes. Strokes were confirmed according to the National Survey of Stroke criteria67 as a neurological deficit with sudden or rapid onset that persisted for > 24 h or until death. Deaths were identified by reports of families, the U.S. postal system, or using death certificates obtained from state vital statistics departments and the National Death Index and confirmed through review of medical records or autopsy reports. Follow-up for deaths was > 98% complete68.
In WHI, incident coronary heart disease (CHD) was defined as MI or death attributable to coronary heart disease53,69. CHD outcomes were adjudicated by physicians who reviewed the medical history, electrocardiogram readings, and the results of cardiac enzyme/troponin determinations. The controls were frequency matched on 5-year age group, race/ethnicity, hysterectomy status, and 2-year enrollment window. Women included in the WHI dataset were drawn from a prior nested case-control study of plasma metabolomics and incident CHD, and all were free of CVD at study baseline53.
In PREDIMED, CVD was defined as a composite of MI, stroke, or cardiovascular death. Every year, four information sources were utilized by study physicians who were blinded to the intervention status to identify incident CVD cases: follow-up contacts with participants, family doctor contacts, an annual examination of medical records, and consultation of the National Death Index. A central Event Ascertainment Committee that was blinded received anonymized data and made the adjudication of the events.
Assessment of covariates
Information on potential risk factors, including medical, demographic, and reproductive histories, lifestyle practices, and body weight, was collected and updated through NHS and NHSII biennial questionnaires. The exposure assessed closest prior to the blood draw was used in analyses. Parity was defined as the number of pregnancies lasting > 6 months and was updated through follow-up questionnaires. As a surrogate measure of pre-pregnancy body mass index (BMI), BMI at age 18 years was calculated as self-reported weight at age 18 (kg) divided by the square of height (m²). Data on pre-pregnancy BMI were not available in WHI and PREDIMED. In PREDIMED, family history of T2D and further details on parity and age at first birth were not available. Physical activity in metabolic equivalent (MET) hours was calculated from participants’ reported average weekly time spent on activities over the past year70. The Alternate Healthy Eating Index (AHEI) included 11 components and was calculated from the Food Frequency Questionnaire, as described elsewhere71. Cigarette smoking status was self-reported as current, past, or never. Alcohol intake was self-reported as total grams of alcohol per day. Self-reported menopausal status was collected at blood collection: premenopausal, postmenopausal without hormone therapy, postmenopausal with hormone therapy use, and unknown.
Statistical analysis
This study aimed to comprehensively examine the relationship between plasma metabolites and breastfeeding duration, as well as its long-term impact on CVD and T2D risk. First, we developed a metabolite-based breastfeeding score in NHS/NHSII and replicated this result in WHI. We then assessed the association between the breastfeeding score and CVD/T2D risk in the NHS/NHSII and replicated these results in two independent cohorts, WHI and PREDIMED.
First, in NHS/NHSII, we examined the associations between self-reported breastfeeding and risk of T2D59 and CVD72, adjusting for a priori confounders that were adjusted for in previous publications on breastfeeding and T2D or CVD4,59,72,73. These analyses included all women with available self-reported breastfeeding status (n = 159,684), irrespective of whether they had plasma metabolomic data. Cox regression models were used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs). Model 1 adjusted for age. Model 2 additionally adjusted for race (white participants vs people of color), family history of T2D or CVD (yes/no), pre-pregnancy BMI (≤25, 26–30, 31+), parity (1, 2, 3+), age at first birth (<25, 25–29, 30–34, 35+), and time-varying smoking status (never, past, current), physical activity, alcohol intake, menopausal status, and post-menopausal hormone use. Model 3 additionally adjusted for AHEI dietary score. Model 4 was model 2 additionally adjusted for SES score. The SES score included 9 variables: median household income, median home value, percent with a college degree, percent white people, percent Black people, percent of foreign-born residents, percent of families receiving interest or dividends, percent of occupied housing units, and percent unemployed74.
To derive a metabolite-based breastfeeding score, individual metabolite values were transformed to probit scores using the inverse normal transformation. Then, the NHS/NHSII dataset, which includes both incident cases and controls from 13 nested case-control studies within the NHS/NHSII (Supplementary Table S1), was split into training and testing (for internal validation) sets in a 70%:30% ratio. Elastic net regression was used to select the breastfeeding-specific metabolites in the NHS/NHSII training set within a 10-fold cross-validation framework, with lifetime cumulative breastfeeding duration (0 months, 1–6 months, 7–11 months, 12+ months) treated as a continuous linear outcome; the resulting model was then applied to the NHS/NHSII testing set to calculate the metabolite-based breastfeeding score. The 10-fold cross-validation (CV) approach was used to select the optimal λ (lambda) tuning parameter, to minimize overfitting, and to determine the optimal beta coefficients in the linear regression75,76. The metabolite-based breastfeeding score was calculated as the weighted sum of the selected metabolites, with weights equal to the elastic net regression coefficients. To identify a disease-agnostic breastfeeding score and to avoid overadjustment, we did not adjust for T2D and CVD risk factors in the score development. However, these were accounted for when examining the associations of the metabolite-based breastfeeding score with risk of CVD and T2D. As a sensitivity analysis, we performed LASSO and ridge regressions to select breastfeeding-specific metabolites following the same analytical framework as for elastic net regression. Pairwise correlation across metabolites was assessed using Pearson correlation within each cohort. We examined the correlations between the individual metabolites selected for the metabolite-based breastfeeding score by elastic net in each of the three cohorts to evaluate the consistency of correlation patterns across cohorts.
We also examined the correlations between the metabolite-based breastfeeding score and the self-reported lifetime total duration of breastfeeding using the Pearson correlation coefficient in the training, testing, and external validation (WHI) sets.
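The score derivation and its correlation check can be sketched as follows. This is an illustrative pure-Python coordinate-descent elastic net, not the software used for the analyses; the glmnet-style objective, the default mixing parameter `alpha = 0.5`, the iteration count, and all function names are assumptions. The outcome `y` is assumed to be the breastfeeding category coded as consecutive integers.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator that yields sparse (lasso-type) weights."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def fit_elastic_net(X, y, lam, alpha=0.5, n_iter=100):
    """Minimize (1/2n)*RSS + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residuals while all weights are zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam * (1.0 - alpha)
            if denom == 0.0:
                continue  # constant column under a pure lasso penalty
            # Covariance of metabolite j with the partial residual.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * alpha) / denom
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_bj
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_means))
    return intercept, beta


def cv_select_lambda(X, y, lambdas, alpha=0.5, k=10, seed=0):
    """Pick the lambda with the lowest k-fold cross-validated squared error."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    best_lam, best_mse = None, float("inf")
    for lam in lambdas:
        sq_err = 0.0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            b0, beta = fit_elastic_net([X[i] for i in train],
                                       [y[i] for i in train], lam, alpha)
            for i in fold:
                pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
                sq_err += (y[i] - pred) ** 2
        mse = sq_err / len(idx)
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam


def metabolite_score(X, beta):
    """Weighted sum of metabolites; zero weights drop unselected metabolites."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5
```

In this sketch the weights fitted on the training set would be passed unchanged to `metabolite_score` for the testing and external sets, and `pearson` would then compare the resulting score with the reported breastfeeding category. The intercept is omitted from the score because it does not affect correlations or a subsequently standardized score.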
We then examined the associations between the metabolite-based breastfeeding score and incident T2D and CVD by multivariable Cox regression in NHS/NHSII. The score was standardized to a z-score to interpret the associations per 1-SD unit increment. The Cox regressions were stratified by case-control status in the original sub-study and adjusted for age and fasting status (model 1). Model 2 was additionally adjusted for age at first childbirth, pre-pregnancy BMI (continuous), race (white participants or people of color), smoking status (current, past, or never), physical activity (continuous), alcohol intake (continuous), family history of disease (CVD for the CVD endpoint and T2D for the T2D endpoint), parity, menopausal status, and postmenopausal hormone therapy and aspirin use at baseline. Additionally, we performed sensitivity analyses adjusting the main model for the Alternate Healthy Eating Index (AHEI) score (continuous) and BMI at blood collection. As metabolites have been shown to be affected by postmenopausal hormone use77, we 1) derived the metabolomic score among postmenopausal women only to examine the robustness of the metabolite selection, and 2) stratified the correlation analyses between the metabolomic score and self-reported duration of breastfeeding by menopausal status (premenopausal women, postmenopausal women on hormones, postmenopausal women not on hormones). The person-time for each participant was calculated from the blood collection date (baseline) until the date of CVD/T2D diagnosis or the end of follow-up (CVD: June 2022 in the NHS and June 2019 in the NHSII; T2D: June 2018 in the NHS and June 2021 in the NHSII), whichever came first. We conducted external replication of the association between the metabolite-based breastfeeding score, calculated using the weights from the NHS/NHSII, and T2D risk in PREDIMED and CVD risk in WHI and PREDIMED. Subsequently, we conducted a random-effects meta-analysis of the risk estimates across the 4 studies (NHS, NHSII, WHI, PREDIMED).
Between-study heterogeneity was explored using the τ² and I² statistics. Analyses were performed using R version 4.2.0, SAS 9.4 for UNIX (SAS Institute Inc), and Stata v16.0. All statistical tests were two-sided. To account for multiple comparisons, we used the Benjamini-Hochberg procedure to control the false discovery rate (FDR) at < 0.05.
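For readers unfamiliar with these procedures, the pooling, heterogeneity, and multiple-testing steps can be sketched as below. The DerSimonian-Laird estimator of τ² is an assumption, since the text does not name the estimator used, and the function names are hypothetical. Study-level inputs are hazard ratios with the standard errors of their logarithms.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Standard error of log(HR) recovered from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z95)


def random_effects_meta(hazard_ratios, standard_errors):
    """DerSimonian-Laird random-effects pooling with tau^2 and I^2 (%)."""
    y = [math.log(hr) for hr in hazard_ratios]
    w = [1.0 / se ** 2 for se in standard_errors]
    k = len(y)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - Z95 * se_pooled),
               math.exp(pooled + Z95 * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans marking tests rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * fdr:
            max_rank = rank  # largest rank whose p-value passes its threshold
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected
```

When the study estimates agree closely, Q falls below its degrees of freedom, both τ² and I² are truncated at zero, and the random-effects estimate coincides with the fixed-effect estimate.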
Further information on the research design is available in the GATHER checklist linked with this manuscript.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The data cannot be shared publicly because of participant confidentiality and privacy concerns. Data are available upon written request. According to standard controlled access procedures, applications to use NHS/NHSII resources will be reviewed by our External Collaborations Committee to verify that the proposed use maintains the protection of the privacy of participants and the confidentiality of the data. Investigators wishing to use NHS/NHSII data are asked to submit a brief description of the proposed project (contact email: nhsaccess@channing.harvard.edu). Investigators can expect initial responses within 4 weeks of request submission. Further information is available at https://www.nurseshealthstudy.org/researchers. Source data supporting all our findings (Figs. 2, 3) are provided with this publication as a Source Data file.
References
Parikh, N. I. et al. Reproductive risk factors and coronary heart disease in the Women’s Health Initiative observational study. Circulation 133, 2149–2158 (2016).
Tschiderer, L. et al. Breastfeeding is associated with a reduced maternal cardiovascular risk: systematic review and meta-analysis involving data from 8 studies and 1,192,700 parous women. J. Am. Heart Assoc. 11, e022746 (2022).
Pinho-Gomes, A. C., Morelli, G., Jones, A. & Woodward, M. Association of lactation with maternal risk of type 2 diabetes: A systematic review and meta-analysis of observational studies. Diab. Obes. Metab. 23, 1902–1916 (2021).
Birukov, A. et al. Lifetime duration of breastfeeding and cardiovascular risk in women with type 2 diabetes or a history of gestational diabetes: findings from two large prospective cohorts. Diabetes Care https://doi.org/10.2337/dc23-1494 (2024).
Kirkegaard, H. et al. Maternal weight change from prepregnancy to 7 years postpartum-the influence of behavioral factors. Obes. (Silver Spring) 23, 870–878 (2015).
Baker, J. L. et al. Breastfeeding reduces postpartum weight retention. Am. J. Clin. Nutr. 88, 1543–1551 (2008).
Dewey, K. G., Heinig, M. J. & Nommsen, L. A. Maternal weight-loss patterns during prolonged lactation. Am. J. Clin. Nutr. 58, 162–166 (1993).
Bobrow, K. L., Quigley, M. A., Green, J., Reeves, G. K. & Beral, V. Persistent effects of women’s parity and breastfeeding patterns on their body mass index: results from the Million Women Study. Int J. Obes. (Lond.) 37, 712–717 (2013).
Much, D. et al. Lactation is associated with altered metabolomic signatures in women with gestational diabetes. Diabetologia 59, 2193–2202 (2016).
Harville, E. W. et al. Branched-chain amino acids, history of gestational diabetes, and breastfeeding: The Bogalusa Heart Study. Nutr. Metab. Cardiovasc Dis. 30, 2077–2084 (2020).
Zhang, Z. et al. Intensive lactation among women with recent gestational diabetes significantly alters the early postpartum circulating lipid profile: the SWIFT study. BMC Med 19, 241 (2021).
Gerszten, R. E. & Wang, T. J. The search for new cardiovascular biomarkers. Nature 451, 949–952 (2008).
Carneiro, G., Radcenco, A. L., Evaristo, J. & Monnerat, G. Novel strategies for clinical investigation and biomarker discovery. a guide to applied metabolomics. Horm. Mol. Biol. Clin. Investig. https://doi.org/10.1515/hmbci-2018-0045 (2019).
Johnson, C. H., Ivanisevic, J. & Siuzdak, G. Metabolomics: beyond biomarkers and towards mechanisms. Nat. Rev. Mol. Cell Biol. 17, 451–459 (2016).
Siddiqui, M. A., Pandey, S., Azim, A., Sinha, N. & Siddiqui, M. H. Metabolomics: An emerging potential approach to decipher critical illnesses. Biophysical Chem. 267, 106462 (2020).
Sinclair, K. & Dudley, E. Metabolomics and biomarker discovery. Adv. Exp. Med Biol. 1140, 613–633 (2019).
Townsend, M. K. et al. Reproducibility of metabolomic profiles among men and women in 2 large cohort studies. Clin. Chem. 59, 1657–1667 (2013).
Zeleznik, O. A. et al. Intrapersonal stability of plasma metabolomic profiles over 10 years among women. Metabolites https://doi.org/10.3390/metabo12050372 (2022).
Carayol, M. et al. Reliability of serum metabolites over a two-year period: a targeted metabolomic approach in fasting and non-fasting samples from EPIC. PLoS One 10, e0135437 (2015).
Sampson, J. N. et al. Metabolomics in epidemiology: sources of variability in metabolite measurements and implications. Cancer Epidemiol. Biomark. Prev. 22, 631–640 (2013).
Eliassen, A. H. et al. Plasma carotenoids and risk of breast cancer over 20 y of follow-up. Am. J. Clin. Nutr. 101, 1197–1205 (2015).
Zeleznik, O. A. et al. A prospective analysis of circulating plasma metabolites associated with ovarian cancer risk. Cancer Res 80, 1357–1367 (2020).
Lee, D. H. et al. The metabolic potential of inflammatory and insulinaemic dietary patterns and risk of type 2 diabetes. Diabetologia 67, 88–101 (2024).
Rios, S. et al. Plasma metabolite profiles associated with the World Cancer Research Fund/American Institute for Cancer Research lifestyle score and future risk of cardiovascular disease and type 2 diabetes. Cardiovasc Diabetol. 22, 252 (2023).
Wang, F. et al. Plasma metabolite profiles related to plant-based diets and the risk of type 2 diabetes. Diabetologia 65, 1119–1132 (2022).
Oviedo, A. et al. Evaluation of the Tobacco Heating System 2.2. Part 6: 90-day OECD 413 rat inhalation study with systems toxicology endpoints demonstrates reduced exposure effects of a mentholated version compared with mentholated and non-mentholated cigarette smoke. Regul. Toxicol. Pharm. 81, S93–S122 (2016).
Morze, J. et al. Metabolomics and type 2 diabetes risk: an updated systematic review and meta-analysis of prospective cohort studies. Diab. Care 45, 1013–1024 (2022).
Guasch-Ferre, M. et al. Metabolomics in prediabetes and diabetes: a systematic review and meta-analysis. Diab. Care 39, 833–846 (2016).
Ruiz-Canela, M. et al. Comprehensive metabolomic profiling and incident cardiovascular disease: a systematic review. J. Am. Heart Assoc. https://doi.org/10.1161/JAHA.117.005705 (2017).
Chimerel, C. et al. Bacterial metabolite indole modulates incretin secretion from intestinal enteroendocrine L cells. Cell Rep. 9, 1202–1208 (2014).
de Mello, V. D. et al. Indolepropionic acid and novel lipid metabolites are associated with a lower risk of type 2 diabetes in the Finnish Diabetes Prevention Study. Sci. Rep. 7, 46337 (2017).
Teunis, C. J. et al. Tryptophan metabolites and incident cardiovascular disease: The EPIC-Norfolk prospective population study. Atherosclerosis 387, 117344 (2023).
Luo, K. et al. Tryptophan metabolism, gut microbiota, and carotid artery plaque in women with and without HIV infection. AIDS (Lond., Engl.) 38, 223–233 (2024).
Pulakazhi Venu, V. K. et al. The pregnane X receptor and its microbiota-derived ligand indole 3-propionic acid regulate endothelium-dependent vasodilation. Am. J. Physiol. Endocrinol. Metab. 317, E350–E361 (2019).
Rhee, E. P. et al. Lipid profiling identifies a triacylglycerol signature of insulin resistance and improves diabetes prediction in humans. J. Clin. Invest 121, 1402–1411 (2011).
Stegemann, C. et al. Lipidomics profiling and risk of cardiovascular disease in the prospective population-based Bruneck study. Circulation 129, 1821–1831 (2014).
Toledo, E. et al. Plasma lipidomic profiles and cardiovascular events in a randomized intervention trial with the Mediterranean diet. Am. J. Clin. Nutr. 106, 973–983 (2017).
Connor Gorber, S., Schofield-Hurwitz, S., Hardt, J., Levasseur, G. & Tremblay, M. The accuracy of self-reported smoking: a systematic review of the relationship between self-reported and cotinine-assessed smoking status. Nicotine Tob. Res 11, 12–24 (2009).
Domino, E. F., Hornbach, E. & Demana, T. The nicotine content of common vegetables. N. Engl. J. Med 329, 437 (1993).
Esko, T. et al. Metabolomic profiles as reliable biomarkers of dietary composition. Am. J. Clin. Nutr. 105, 547–554 (2017).
Illig, T. et al. A genome-wide perspective of genetic variation in human metabolism. Nat. Genet. 42, 137–141 (2010).
Shoaie, S. et al. Quantifying diet-induced metabolic changes of the human gut microbiome. Cell Metab. 22, 320–331 (2015).
Newgard, C. B. Metabolomics and metabolic diseases: where do we stand? Cell Metab. 25, 43–56 (2017).
Bunner, D. L., Vanderlaan, E. F. & Vanderlaan, W. P. Prolactin levels in nursing mothers. Am. J. Obstet. Gynecol. 131, 250–252 (1978).
Li, J. et al. Circulating prolactin concentrations and risk of type 2 diabetes in US women. Diabetologia 61, 2549–2560 (2018).
Yang, H. et al. Prolactin is associated with insulin resistance and beta-cell dysfunction in infertile women with polycystic ovary syndrome. Front Endocrinol. (Lausanne) 12, 571229 (2021).
Wang, T. et al. Circulating prolactin associates with diabetes and impaired glucose regulation: a population-based study. Diab. Care 36, 1974–1980 (2013).
Much, D., Beyerlein, A., Roßbauer, M., Hummel, S. & Ziegler, A. G. Beneficial effects of breastfeeding in women with gestational diabetes mellitus. Mol. Metab. 3, 284–292 (2014).
Liu, S. et al. A prospective study of whole-grain intake and risk of type 2 diabetes mellitus in US women. Am. J. Public Health 90, 1409–1415 (2000).
van Dam, R. M., Willett, W. C., Manson, J. E. & Hu, F. B. Coffee, caffeine, and risk of type 2 diabetes: a prospective cohort study in younger and middle-aged U.S. women. Diab. Care 29, 398–403 (2006).
Hankinson, S. E. et al. Plasma sex steroid hormone levels and risk of breast cancer in postmenopausal women. J. Natl. Cancer Inst. 90, 1292–1299 (1998).
Tworoger, S. S. et al. A 20-year prospective study of plasma prolactin as a risk marker of breast cancer development. Cancer Res 73, 4810–4819 (2013).
Paynter, N. P. et al. Metabolic predictors of incident coronary heart disease in women. Circulation 137, 841–853 (2018).
Estruch, R. et al. Primary prevention of cardiovascular disease with a mediterranean diet supplemented with extra-virgin olive oil or nuts. N. Engl. J. Med 378, e34 (2018).
Vaughan, L. et al. The rationale, design, and baseline characteristics of the Women’s Health Initiative Memory Study of Younger Women (WHIMS-Y). Brain Res 1514, 3–11 (2013).
The Women’s Health Initiative Study Group Design of the Women’s Health Initiative clinical trial and observational study. Control Clin. Trials 19, 61–109 (1998).
Salas-Salvadó, J. et al. Prevention of diabetes with Mediterranean diets: a subgroup analysis of a randomized trial. Ann. Intern Med 160, 1–10 (2014).
Wang, D. D. et al. Plasma ceramides, mediterranean diet, and incident cardiovascular disease in the PREDIMED Trial (Prevención con Dieta Mediterranea). Circulation 135, 2028–2040 (2017).
Stuebe, A. M., Rich-Edwards, J. W., Willett, W. C., Manson, J. E. & Michels, K. B. Duration of lactation and incidence of type 2 diabetes. JAMA 294, 2601–2610 (2005).
Li, R., Scanlon, K. S. & Serdula, M. K. The validity and reliability of maternal recall of breastfeeding practice. Nutr. Rev. 63, 103–110 (2005).
Tomeo, C. A. et al. Reproducibility and validity of maternal recall of pregnancy-related events. Epidemiology 10, 774–777 (1999).
Hu, F. B. et al. Physical activity and television watching in relation to risk for type 2 diabetes mellitus in men. Arch. Intern Med 161, 1542–1548 (2001).
Manson, J. E. et al. Physical activity and incidence of non-insulin-dependent diabetes mellitus in women. Lancet 338, 774–778 (1991).
National Diabetes Data Group Classification and diagnosis of diabetes mellitus and other categories of glucose intolerance. Diabetes 28, 1039–1057 (1979).
American Diabetes Association Report of the expert committee on the diagnosis and classification of diabetes mellitus. Diab. Care 20, 1183–1197 (1997).
Mendis, S. et al. World Health Organization definition of myocardial infarction: 2008-09 revision. Int J. Epidemiol. 40, 139–146 (2011).
Walker, A. E., Robins, M. & Weinfeld, F. D. The National Survey of Stroke. clinical findings. Stroke 12, I13–I44 (1981).
Rich-Edwards, J. W., Corsano, K. A. & Stampfer, M. J. Test of the national death index and equifax nationwide death search. Am. J. Epidemiol. 140, 1016–1019 (1994).
Curb, J. D. et al. Outcomes ascertainment and adjudication methods in the Women’s Health Initiative. Ann. Epidemiol. 13, S122–S128 (2003).
Zhang, C., Solomon, C. G., Manson, J. E. & Hu, F. B. A prospective study of pregravid physical activity and sedentary behaviors in relation to the risk for gestational diabetes mellitus. Arch. Intern Med 166, 543–548 (2006).
Chiuve, S. E. et al. Alternative dietary indices both strongly predict risk of chronic disease. J. Nutr. 142, 1009–1018 (2012).
Stuebe, A. M. et al. Duration of lactation and incidence of myocardial infarction in middle to late adulthood. Am. J. Obstet. Gynecol. 200, 138 e131–138 e138 (2009).
Ley, S. H. et al. Lactation Duration and long-term risk for incident type 2 diabetes in women with a history of gestational diabetes mellitus. Diab. Care 43, 793–798 (2020).
DeVille, N. V. et al. Neighborhood socioeconomic status and mortality in the nurses’ health study (NHS) and the nurses’ health study II (NHSII). Environ. Epidemiol. 7, e235 (2023).
Wang, F. et al. Plasma metabolomic profiles associated with mortality and longevity in a prospective analysis of 13,512 individuals. Nat. Commun. 14, 5744 (2023).
Simon, N., Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for cox’s proportional hazards model via coordinate descent. J. Stat. Softw. 39, 1–13 (2011).
Balasubramanian, R. et al. Metabolomic effects of hormone therapy and associations with coronary heart disease among postmenopausal women. Circ. Genom. Precis Med 13, e002977 (2020).
Acknowledgements
We would like to acknowledge the Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, as home of the Nurses’ Health Studies. This work was supported by National Institutes of Health R03CA259659 (NS), UM1 CA186107, R01CA49449, R01HL034594, U01CA176726, R01CA67262. The WHI program is funded by the National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health, U.S. Department of Health and Human Services through contracts 75N92021D00001, 75N92021D00002, 75N92021D00003, 75N92021D00004, 75N92021D00005. AB was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft) individual fellowship (no. BI 2427/1-1). MG-F is supported by Novo Nordisk Fonden Research grant NNF18CC0034900. A-JT is supported by the Canadian Institutes of Health Research. RS is funded by the NHLBI K23 HL159331. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
Conceptualization, methodology: A.B., O.A.Z., N.S. Data analysis: A.B., N.L., C.R. Study design, supervision: C.Z., O.A.Z., N.S. Writing–original draft: A.B. Writing–review and editing: A.B., N.L., J.M., C.R., F.W., Z.S.A., A.J.T., M.G.F., S.H.L., J.E.M., R.G.S., B.H., A.H.S., R.B., M.A.M., K.M.R., F.B.H., C.Z., O.A.Z., N.S.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Kathleen Rasmussen, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Birukov, A., Lin, N., Mongiovi, J. et al. Plasma metabolomic signature of breastfeeding and risk of cardiometabolic diseases. Nat Commun 16, 11124 (2025). https://doi.org/10.1038/s41467-025-65977-5
DOI: https://doi.org/10.1038/s41467-025-65977-5





