Main

Aging causes organ dysfunction, chronic disease and death. Although interventions such as caloric restriction, young plasma and rapamycin extend healthspan in model organisms1, potentially slowing aging, their human efficacy is unclear due to limited molecular understanding and assessment tools. Molecular biomarkers of human biological age linked to health and disease are needed.

Recent studies show that human organs age at different rates2,3,4,5,6, as seen in animals7,8,9, necessitating organ-specific biological age measures. Prior estimates of organ age used clinical metrics and blood biochemistry, magnetic resonance imaging (MRI), DNA methylation or plasma proteins2,3,4,5,6. However, their reproducibility across cohorts and over longitudinal visits, their sensitivity to organ-specific diseases and environmental factors and their associations with incident disease and mortality independent of each other and established aging biomarkers are unclear. Furthermore, it is unclear which organs are key to longevity in humans.

Plasma proteomics, enabled by clinical accessibility of blood and advancing technologies, is ideal for gaining molecular insights into organ health and estimating organ-specific biological age. Building on our prior SomaScan-based study (5,678 individuals, 5,000 proteins)4, where we introduced machine learning models to estimate organ age, in the present study we tested our approach on 44,498 individuals in the UK Biobank (UKB) (age 40–70 years), using an orthogonal proteomics platform (Olink, ~3,000 proteins, 1,823 protein name overlap with SomaScan). These data were previously generated by the UKB Pharma Proteomics Project (UKB-PPP) and have been characterized in detail10.

The larger UKB cohort and expanded longitudinal phenotyping enabled us to explore organ age associations with a wider range of diseases (for example, chronic kidney disease, chronic obstructive pulmonary disease (COPD), heart failure and dementias) and focus on future disease incidence, which is a better proxy for biological age than disease prevalence. We also assessed sensitivity to environmental factors, including lifestyle, socioeconomic factors and medication use. Lastly, we performed a nuanced examination of mortality risk, comparing organ age estimates with established aging biomarkers, including PhenoAge and estimated glomerular filtration rate, and we tested the associations of extreme organ youth and extreme age acceleration with longevity and early mortality.

Results

Plasma protein-derived organ age estimates in the UKB

To derive estimates of organ-specific biological age from the plasma proteome, we (1) identified plasma proteins likely derived from a specific organ (Supplementary Tables 1 and 2, Methods and Supplementary Fig. 1); (2) trained a machine learning model to predict chronological age based on the levels of identified organ-enriched proteins (Supplementary Tables 3 and 4 and Extended Data Fig. 1a); and (3) calculated the age gap based on each person’s predicted age (the relative predicted age compared to individuals of the same chronological age) (Methods and Fig. 1a). The age gap provides a measure of relative biological age compared to same-aged peers.
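Step (3) can be illustrated with a minimal sketch. It assumes, as a simplification not specified in the text above, that the age gap is the residual of predicted age after a least-squares line is fit against chronological age, and that gaps are then z-scored. The function names and toy ages are hypothetical.

```python
from statistics import mean, pstdev


def age_gaps(chron_age, pred_age):
    """Residual of predicted age after a least-squares fit on chronological age.

    Assumption: a simple linear fit stands in for whatever adjustment the
    Methods actually use to compare individuals with same-aged peers.
    """
    mx, my = mean(chron_age), mean(pred_age)
    sxx = sum((x - mx) ** 2 for x in chron_age)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chron_age, pred_age))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(chron_age, pred_age)]


def z_score(values):
    """Scale values to mean 0 and (population) standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Toy data (hypothetical): chronological ages and model-predicted ages.
chron = [45, 50, 55, 60, 65, 70]
pred = [48, 49, 58, 59, 68, 66]
z_gaps = z_score(age_gaps(chron, pred))
```

A positive z-scored gap marks an individual whose predicted age is older than expected for same-aged peers; a negative gap marks the opposite.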

Fig. 1: Plasma protein-derived organ age estimates in the UKB.

a, Study design to estimate organ-specific biological age from plasma proteomics data in the UKB. A protein was called ‘organ-enriched’ if the gene encoding the protein was expressed at least four-fold higher in one organ compared to any other organ in the GTEx organ bulk RNA sequencing atlas. Organ-enriched protein sets were used to train LASSO chronological age predictors. Samples from 11 of 22 centers (n = 23,140) were used for training, and the remaining samples (n = 21,358) were used for testing. An ‘organismal’ model, which was trained on the levels of non-organ-specific (organ-shared) proteins, and a ‘conventional’ model, which was trained on all proteins from the Olink assay, were also developed and assessed. Model age gaps were calculated and then z-score normalized per organ to allow for direct comparisons across organs. Age gaps were characterized and tested for associations with disease risk, modifiable lifestyle choices and mortality risk. b, Pairwise correlation of organ age gaps from all samples. Inset: the distribution of all pairwise correlations, with the dotted line representing the mean. c, A LASSO regression model was used to predict conventional age based on organ ages and organismal age. Organismal, brain and artery ages were sufficient to predict conventional age with r2 = 0.97. Relative weights are shown as a pie chart. d, Extreme agers were defined by a 1.5-s.d. increase or decrease in at least one age gap. The mean organ age gaps of extremely youthful brain agers and accelerated multi-organ agers are shown. Graphics in a and d created with BioRender.com.
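The four-fold enrichment rule described in a can be sketched as follows. This is an illustrative reading of the rule, not the authors' pipeline; the function name and the expression values are hypothetical and are not taken from GTEx.

```python
def organ_enriched(expression, fold=4.0):
    """Return the organ whose expression is at least `fold` times that of
    every other organ, or None if no organ meets the rule.

    `expression` maps organ name -> bulk expression level for one gene and
    must contain at least two organs.
    """
    ranked = sorted(expression.items(), key=lambda kv: kv[1], reverse=True)
    (top_organ, top), (_, runner_up) = ranked[0], ranked[1]
    # Comparing against the second-highest organ is enough: if the top organ
    # clears it by `fold`, it clears all lower-expressing organs as well.
    if top > 0 and top >= fold * runner_up:
        return top_organ
    return None


# Toy expression profiles (hypothetical values).
print(organ_enriched({"brain": 80.0, "liver": 10.0, "heart": 5.0}))
print(organ_enriched({"brain": 30.0, "liver": 10.0, "heart": 5.0}))
```

Genes for which the function returns None would, under this reading, fall into the organ-shared set used by the organismal model.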

We performed these three steps for each of 11 major organs, including adipose tissue, artery, brain, heart, immune tissue, intestine, kidney, liver, lung, muscle and pancreas. We focused our analyses on these organs due to the availability of relevant age-related phenotype data in the UKB. To compare organ age gaps to organ-agnostic measures of biological age, we also derived age gaps from an ‘organismal’ aging model trained on non-organ-specific proteins (80% of all proteins) and a ‘conventional’ aging model trained on all proteins on the Olink assay. We confirmed that the top proteins in the conventional aging model overlapped with a previous proteomic aging model developed on the UKB dataset11.

Data from 11 of 22 plasma collection centers were used for model training and the remaining 11 for testing (Fig. 1a; see Methods and Supplementary Figs. 2 and 3 for details on protein quality control, imputation and modeling). Model performance was highly stable across train and test centers (Extended Data Fig. 2a,b and Supplementary Table 4). Age gaps were z-scored per aging model to allow for direct comparison between organs in downstream analyses (Fig. 1a). We observed some sex differences between organ age gaps, with males having older kidneys, immune systems and intestines and females having older adipose tissue, arteries and hearts (Extended Data Fig. 2c,d).

After deriving organ age gaps, we first determined the uniqueness of each measure. If organs age at different rates, then the age gap of one organ should be independent from the age gap of another. Pairwise correlations between age gaps showed that organ age gaps were only weakly correlated (mean r = 0.21; Fig. 1b and Supplementary Table 5), confirming previous studies3,4,6. Organ age gaps differed from the conventional age gap, which strongly correlated with the organismal age gap (r = 0.87) due to majority protein overlap. Organismal, brain and artery ages explained 97% of conventional age variance, with organismal age contributing 74% (Fig. 1c).

We previously identified extreme organ agers who displayed especially fast aging in a single organ or in many organs4. We searched for extreme agers in the UKB by identifying individuals with age gaps above or below 1.5 s.d. from the population average in any organ (approximately the top and bottom 6–7% of each distribution). Here we found not only fast (≥1.5 z-age gap) organ agers but also slow (≤–1.5 z-age gap) organ agers (1–2% prevalence each, 33% total; Fig. 1d and Extended Data Fig. 2e). Differences between studies may stem from different sample sizes, proteomics platforms and thresholds. We also identified multi-organ agers who had two or more extreme organ age gaps (26% of samples; Extended Data Fig. 2e). Fourteen percent of samples were ambiguous, with both positive and negative extreme age gaps, and the remaining 27% of samples were defined as normal agers (Extended Data Fig. 2e). Multi-organ agers were significantly older than normal and single-organ agers, suggesting that aged organs accumulate over time (Extended Data Fig. 2f).
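The grouping described above can be sketched as a small classifier. It assumes that the input holds only the z-scored organ age gaps considered for extreme ager status; the function name and the output labels are hypothetical.

```python
def classify_ager(z_gaps, threshold=1.5):
    """Assign one individual to an extreme ager group.

    `z_gaps` maps organ name -> z-scored age gap. Labels are illustrative.
    """
    aged = [organ for organ, z in z_gaps.items() if z >= threshold]
    youthful = [organ for organ, z in z_gaps.items() if z <= -threshold]
    if not aged and not youthful:
        return "normal"
    if aged and youthful:
        # Extreme gaps in both directions.
        return "ambiguous"
    extreme = aged or youthful
    direction = "aged" if aged else "youthful"
    if len(extreme) == 1:
        return f"{extreme[0]} {direction}"
    return f"multi-organ {direction} ({len(extreme)})"


print(classify_ager({"brain": -1.8, "heart": 0.2, "lung": 0.4}))
print(classify_ager({"brain": 1.6, "heart": 2.0, "lung": 0.1}))
```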

To assess longitudinal stability of organ age gaps, we analyzed 1,176 individuals from the COVID-19 repeat imaging study with plasma proteomics (earlier version of Olink, 1,459 proteins) from 2–3 visits over 1–15 years (visits: baseline, Instance 2 and Instance 3) (Methods). Using organ aging models trained on the 1,459 proteins shared across Olink assay versions (correlated r ≥ 0.8 with 3,000-protein models), we found moderate to strong correlations (mean r = 0.6) between baseline and Instance 2 visit age gaps, indicating relative stability over approximately 9 years, with potential dynamic changes due to lifestyle, disease or technical variability (Extended Data Fig. 3a).

Examining longitudinal stability of extreme ager status, we found that baseline extreme agers were 3–22 times more likely to remain extreme agers in the same organ at Instance 2 (Extended Data Fig. 3b), but 68% lost this status. Analyzing age gap changes in 0.5-s.d. bins, 76% of baseline extreme agers maintained directional stability at Instance 2 and 72% at Instance 3, supporting relative stability (Extended Data Fig. 3c–e). Additional studies with larger cohorts, denser sampling and higher-coverage proteomics are needed to clarify technical versus biological stability.

We next sought to compare our UKB Olink organ aging models with our previously developed SomaScan organ aging models. We generated Olink plasma proteomics data from 1,636 samples pooled across the Stanford Alzheimer’s Disease Research Center (ADRC) and the Stanford Aging and Memory Study (SAMS)12,13. We previously generated SomaScan plasma proteomics data from 601 of these samples for our prior study4. This allowed us to directly compare organ age gaps derived across different proteomics platforms.

We tested UKB-trained organ aging models on Stanford Olink data. Of note, five organ aging model proteins—ANGPTL7, EBI3.IL27, GZMB, PGLYRP1 and ADIPOQ—had missing values in all Stanford samples (Extended Data Fig. 4a). EBI3.IL27 and ADIPOQ had relatively strong weights in the liver and adipose aging models, respectively (Extended Data Fig. 4a) and, therefore, reduced liver and adipose age prediction accuracy in the Stanford samples (Extended Data Fig. 4b). Overall, UKB-trained organ aging models showed similar age prediction accuracies across UKB-train, UKB-test and Stanford-test healthy control samples, demonstrating robustness of the models (Extended Data Fig. 4b).

We next compared organ age gaps between platforms. We observed moderate-strong correlations between the two platforms (r = 0.3–0.8; Extended Data Fig. 4c) with heart and kidney aging models displaying the most concordant age gaps across platforms (r = 0.81 and r = 0.82, respectively). The overall moderate correlations were expected given the differences in proteins measured per platform (1,823 proteins overlap, 1,093 Olink-specific, 3,156 SomaScan-specific) and the previously reported wide distribution of correlations for the overlapping proteins (by name) across the two platforms14 (Extended Data Fig. 4d). Even if the same protein by name is measured, the two platforms may quantify different forms of the same proteins (that is, isoform, posttranslational modification), which may not necessarily correlate with each other.

This suggested that each platform may provide complementary information in estimating organ age. Gene Ontology enrichment analyses showed that Olink brain aging proteins were enriched for perineuronal nets and glial differentiation, whereas SomaScan emphasized postsynaptic assembly; conversely, immune aging pathways were similar across platforms (Extended Data Fig. 4e). Both brain models were associated with prevalent Alzheimer’s disease (Extended Data Fig. 4f), capturing distinct aging aspects. These findings show that organ-specific biological age estimates, likely related to organ health, can be derived from plasma proteomic data in the UKB.

Organ age estimates predict future age-related disease

For an estimate of biological age to be informative, it must robustly associate with the physiological state of the organ or individual and, consequently, with age-related health and disease outcomes. Hence, we sought to determine whether organ age gaps could predict future diseases in their respective organs. We tested the associations between all 13 z-scored age gaps and 15 incident age-related diseases (2–17-year follow-up) using Cox proportional hazard regression, while adjusting for age and sex. Following Benjamini–Hochberg correction for multiple hypothesis testing, we identified 176 positive and four negative significant associations out of 195 tests (Extended Data Fig. 5a and Supplementary Table 6).
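For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal stdlib sketch of the q-value computation is given here. It is a generic implementation of the standard step-up procedure, not the authors' code, and the P values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each q value is
    # the minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Illustrative P values (hypothetical).
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q < 0.05 for q in q_values]
```

In practice, an established implementation such as the `fdr_bh` method of `statsmodels.stats.multitest.multipletests` would normally be preferred.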

We discovered highly significant associations between heart aging and atrial fibrillation (s.d. increase in heart age gap, hazard ratio (HR) = 1.75, q < 1 × 10−250) and heart failure (s.d. increase in heart age gap, HR = 1.83, q = 8.35 × 10−231), pancreas aging and kidney aging with chronic kidney disease (s.d. increase in pancreas age gap, HR = 1.80, q = 3.36 × 10−247; s.d. increase in kidney age gap HR = 1.66, q = 2.85 × 10−228), brain aging with Alzheimer’s disease (s.d. increase in brain age gap, HR = 1.80, q = 1.21×10−67) and lung aging with COPD (s.d. increase in lung age gap, HR = 1.39, q = 6.82 × 10−49). Liver aging was associated with chronic liver disease (s.d. increase in liver age gap, HR = 1.20, q = 3.87 × 10−13), albeit the strength of the association was modest and similar compared to other organs. Notably, organ-specific age gaps consistently exhibited higher HRs than conventional age gaps across all diseases (Extended Data Fig. 5a).

The widespread significant associations between organ aging (176/195) and disease underscore the systemic nature of aging. To disentangle these systemic effects and identify organ age associations with disease independent of aging in other organs, we examined multivariate Cox models that included all organ age gaps (excluding ‘conventional’ due to collinearity with ‘organismal’, r = 0.87) as covariates. After accounting for the aging of other organs, we found that organ age gap associations with disease became much more organ specific (Fig. 2a and Supplementary Table 7). For example, incident heart failure was associated with all 13 baseline age gaps individually but only seven age gaps independently. Counterintuitively, we observed that youthful arteries were associated with increased risk for many diseases, including type 2 diabetes (s.d. increase in artery age gap, HR = 0.68, q = 2.11 × 10−64), COPD (s.d. increase in artery age gap, HR = 0.85, q = 8.85 × 10−8), chronic liver disease (s.d. increase in artery age gap, HR = 0.87, q = 7.58 × 10−5) and others, highlighting the complexity of aging’s link with disease. Notably, Alzheimer’s disease was highly specific to brain aging, both individually (s.d. increase in brain age gap, HR = 1.80, q = 1.21 × 10−67) and independently of other organs (s.d. increase in brain age gap, HR = 1.91, q = 2.37 × 10−67).

Fig. 2: Organ age estimates predict future age-related disease.

a, Cox proportional hazard regression was used to test the association between age gaps and future disease risk, adjusted for age-at-blood-draw, sex and other age gaps. The heatmap is color coded by age gap log(HR). Heatmap columns are ordered by the Gini coefficient of age gaps per disease. b, Body plots showing logHR values for type 2 diabetes, atrial fibrillation and Alzheimer’s disease from the heatmap in a. c, Cox proportional hazard regression was used to test the association between extreme brain youth/aging and future Alzheimer’s disease risk, adjusted for age-at-blood-draw, sex and APOE4 and APOE2 genotype (n = 37,766). Points show age gap HRs, and error bars show 95% confidence intervals. d, Cumulative incidence curves with 95% confidence intervals showing onset of Alzheimer’s disease over time when comparing individuals with extremely aged brains, individuals with extremely youthful brains and normal brain agers. Age gap HRs and 95% confidence intervals from c are shown. e, Box plot visualization of brain age gap versus APOE genotype (n = 43,326). The box bounds are the Q1, median and Q3; the whiskers show Q1 − 1.5× the interquartile range (IQR) and Q3 + 1.5× the IQR. *P < 0.05, **P < 0.01, ***P < 0.001 from standard two-sided t-test. f, Plasma-based brain age gap at baseline versus MRI-based brain age gap at Instance 2 (several years after baseline visit). Correlation and P value are shown. g, Bar plot displaying the top 20 protein coefficients in the brain aging model. h, Pie chart displaying the proportion of brain aging proteins assigned to each brain cell type based on single-cell RNA sequencing. Cell type was assigned based on cell type with the maximum expression of a given gene. Oligo, oligodendrocytes; OPC, oligodendrocyte precursor cell. Body graphic in b created with BioRender.com.

We also investigated the associations between extreme organ ager status and disease risk, independent of age, sex and other extreme organs (Extended Data Fig. 5b and Supplementary Table 8). Multi-organ agers 5–7 and 8+ were not included due to limited sample size for most diseases. Individuals with 2–4 aged organs had significantly increased risk for every disease that we examined. Individuals with extremely aged brains, hearts, kidneys or lungs also showed broad increased risk of disease. Conversely, individuals with 2–4 extremely youthful organs were protected from many diseases, including chronic kidney disease, osteoarthritis and COPD. Among youthful organ profiles, the brain, immune system and intestine were nominally protective for at least four diseases (nominal P < 0.10). A youthful brain was especially protective of Alzheimer’s disease (extreme brain youth only, HR = 0.28, P = 0.076, q = 0.24) and other dementias, and a youthful intestine was especially protective of diabetes (extreme intestine youth only, HR = 0.30, q = 3.63 × 10−3; Extended Data Fig. 5b).

Given the brain’s strong and unique associations with Alzheimer’s disease, we further examined all individuals with extremely youthful or aged brains—regardless of single or multi-organ ager status—and their associations with Alzheimer’s disease risk, controlling for age, sex as well as APOE genotype, the strongest genetic risk factor of Alzheimer’s disease. Interestingly, we found that extreme brain aging conferred a 3.1-fold increase in Alzheimer’s disease risk (HR = 3.11, P = 1.41 × 10−28), and extreme brain youth conferred a 74% reduced risk of Alzheimer’s disease (HR = 0.26, P = 4.37 × 10−4), independent of age, sex, APOE4 and APOE2 (Fig. 2c,d). Consequently, individuals with aged brains exhibited a 13.5-fold increased risk of developing Alzheimer’s disease compared to those with youthful brains. Remarkably, when comparing the effects of brain age to APOE genotype, having an aged brain posed a risk similar to carrying one copy of APOE4, whereas having a youthful brain provided protection similar to carrying two copies of APOE2 (Fig. 2c). Brain age was very weakly correlated with APOE genotype (Fig. 2e). Over the 17-year follow-up period, 120 of 2,628 individuals (4.56%) with aged brains developed Alzheimer’s disease, whereas only seven of 1,998 individuals (0.35%) with youthful brains developed the disease.
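The crude figures in this paragraph can be reproduced with simple arithmetic. The sketch below uses only the counts and HRs quoted above; the crude risk ratio it computes is unadjusted and is therefore not expected to equal the covariate-adjusted Cox estimates.

```python
# Counts quoted in the text: Alzheimer's disease cases / individuals.
aged_cases, aged_total = 120, 2628
young_cases, young_total = 7, 1998

# Unadjusted cumulative incidence over follow-up in each group.
aged_incidence = aged_cases / aged_total
young_incidence = young_cases / young_total

# Crude (unadjusted) risk ratio between the two groups.
crude_risk_ratio = aged_incidence / young_incidence

# An HR of 0.26 corresponds to a 74% lower hazard relative to the reference.
risk_reduction = 1 - 0.26

print(f"aged brains: {aged_incidence:.2%}, youthful brains: {young_incidence:.2%}")
print(f"crude risk ratio: {crude_risk_ratio:.1f}, hazard reduction: {risk_reduction:.0%}")
```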

The specificity of the association between brain aging and dementia led us to investigate whether organ age gaps were associated with brain volume based on MRI data from follow-up visits (Extended Data Fig. 5c). We found that the brain age gap at baseline visit was uniquely associated with increased volume of the ventricles and decreased volume of cortical regions at Instance 2 (~9 years later on average).

Brain MRI data were previously used to derive estimates of biological brain age15,16. To compare our plasma proteomic brain age estimate with an MRI-based brain age estimate, we trained an MRI brain aging model using 479 FreeSurfer brain volumetric phenotypes from 45,574 individuals from Instance 2. MRI brain age was correlated with chronological age (r = 0.81, mean absolute error (MAE) = 3.65; Extended Data Fig. 5d), similar to previous reports15,16. An s.d. increase in the MRI brain age gap conferred a 3.2-times increased risk of future Alzheimer’s disease (HR = 3.21, P = 2.55 × 10−36), confirming strong relevance to functional brain aging. Interestingly, however, MRI-based and plasma-based brain age gaps were only weakly correlated with each other (r = 0.18, P = 2.50 × 10−30; Fig. 2f), suggesting that each captures unique components of brain aging. Although this weak correlation is likely due, in part, to the approximately 9-year duration between MRI and plasma collection, a recent study showed that plasma brevican (BCAN) and glial fibrillary acidic protein (GFAP), but not other brain aging proteins identified in our study, were associated with MRI brain age17, suggesting that biological differences also contribute to the weak correlation. We speculate that MRI brain age captures global cell loss, whereas plasma brain age captures some of these volumetric changes as well as molecular alterations related to cell states and interactions.

We also assessed organ age gap associations with disease progression, by regressing age gaps against years since diagnosis, for individuals who were diagnosed with disease before blood draw. We found that many organ age gaps increased throughout chronic kidney disease progression (Extended Data Fig. 5e). Interestingly, the brain age gap was not associated with dementia progression (Extended Data Fig. 5f), suggesting that it captures age-related changes that may predispose individuals to dementia but does not reflect changes occurring after disease onset.

We next sought to gain further insights into organ aging by examining aging model proteins and their weights (Extended Data Fig. 1a). Aging models trained on the whole baseline sample versus only the randomly selected (86%) baseline samples resulted in essentially equivalent aging models and weights (Supplementary Fig. 4). The most strongly weighted protein in the brain aging model was neurofilament light chain (NEFL; Fig. 2g), which increases with age and is a clinical biomarker of axon degeneration that is often measured in clinical trials for Alzheimer’s disease18,19 and was recently approved as a surrogate endpoint for a clinical trial to treat superoxide dismutase 1 amyotrophic lateral sclerosis (SOD1 ALS)20,21. Our data suggest that it, combined with other proteins, may also be a viable surrogate endpoint for brain aging and risk for dementia in the general population. Other highly weighted brain aging proteins include myelin oligodendrocyte glycoprotein (MOG), a component of the outer surface of myelin sheaths, and GFAP, a marker of reactive astrocytes, which both increased with age, as well as BCAN, a brain extracellular matrix component produced by oligodendrocyte precursor cells, and protein tyrosine phosphatase receptor type R (PTPRR), which both decreased with age (Fig. 2g). Plasma NEFL, GFAP and BCAN were previously highlighted as predictors of future dementia risk22. Using the permutation feature importance for biological aging (FIBA) algorithm4, we found that all of the top seven most highly weighted brain aging proteins contributed to the prediction of Alzheimer’s disease risk, showing the importance of leveraging the information from many brain-derived proteins to understand brain aging and disease risk (Extended Data Fig. 6a).

We then sought to determine which cell types these proteins were likely derived from by analyzing public human brain single-cell RNA sequencing data23 (Extended Data Fig. 6b). Interestingly, we found that approximately half of brain aging proteins were specific to the oligodendrocyte lineage, with the rest expressed mostly in neurons and then astrocytes (Fig. 2h), pointing to white matter as a key aging region, as suggested by human brain MRI and mouse brain RNA sequencing studies24,25.
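The cell type assignment rule (each gene is assigned to the cell type with its maximum expression) can be sketched as follows. The gene names and expression values are placeholders rather than data from the study, and the function names are hypothetical.

```python
from collections import Counter


def assign_cell_type(expr_by_cell_type):
    """Return the cell type with the highest expression for one gene."""
    return max(expr_by_cell_type, key=expr_by_cell_type.get)


def cell_type_proportions(genes):
    """Proportion of genes assigned to each cell type.

    `genes` maps gene name -> {cell type -> mean expression}.
    """
    counts = Counter(assign_cell_type(expr) for expr in genes.values())
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Placeholder genes with made-up expression values.
toy_genes = {
    "GENE_A": {"Oligo": 9.0, "OPC": 1.0, "Neuron": 0.5, "Astrocyte": 0.2},
    "GENE_B": {"Oligo": 0.3, "OPC": 4.0, "Neuron": 0.1, "Astrocyte": 0.4},
    "GENE_C": {"Oligo": 0.2, "OPC": 0.1, "Neuron": 6.0, "Astrocyte": 0.3},
    "GENE_D": {"Oligo": 0.1, "OPC": 0.2, "Neuron": 0.4, "Astrocyte": 5.0},
}
proportions = cell_type_proportions(toy_genes)
```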

Lung aging was explained primarily by lysosome-associated membrane glycoprotein 3 (LAMP3), a protein expressed specifically in type II alveolar stem cells; secretoglobin family 1A member 1 (SCGB1A1), also known as club cell secretory protein (CCSP), a marker of club cells; and C-C motif chemokine ligand 18 (CCL18), a cytokine expressed by alveolar macrophages, potentially reflective of stem cell dysfunction and inflammation in the lung with age. Heart aging was explained primarily by N-terminal pro-B-type natriuretic peptide (NT-proBNP), a vasodilating hormone that increases in response to heart damage, whereas kidney aging was explained by renin (REN), a protein involved in blood pressure regulation. Both NT-proBNP and REN were previously identified as key heart and kidney aging proteins, respectively, based on SomaScan plasma proteomics data4 and are well-established biomarkers of heart and kidney function. All aging model protein weights are provided in Supplementary Table 3, and the top 20 proteins for each model are shown in Extended Data Fig. 1a. Together, these data show that plasma protein-derived organ age estimates are linked with age-related organ diseases and can reveal insights into the aging biology of their respective organs.

Organ age estimates are sensitive to modifiable lifestyle choices

We next explored whether biological age estimates grounded in physiological states of organ function are sensitive to changes in lifestyle. We tested the associations between all 13 z-scored age gaps and 18 lifestyle factors (that is, diet, alcohol, smoking, exercise and insomnia) and socioeconomic factors (that is, education and Townsend Deprivation Index), adjusted for each other as well as age and sex, using linear regression (Supplementary Table 9). We found 69 positive and 57 negative significant associations (q < 0.05) after correcting for multiple hypothesis testing. In line with their known health associations, smoking, alcohol, processed meat intake, the Townsend Deprivation Index and insomnia were associated with age acceleration across several organs, whereas vigorous exercise, oily fish consumption, poultry consumption and higher education were associated with youthful organs (Fig. 3a).

Fig. 3: Organ age estimates are sensitive to modifiable lifestyle factors.

a, Linear regression was used to determine the association between age gaps and modifiable lifestyle factors while accounting for all other lifestyle factors as well as age and sex. The heatmap is color coded by signed log10(q). Only significant (q < 0.05) values are indicated by color coding. b, Linear regression was used to determine the association between age gaps and drugs/supplements intake while accounting for age and sex. Only drugs with significant (q < 0.05) youthful associations in at least two organs are shown. Heatmap is color coded by signed log10(q value). Only significant (q < 0.05) values are indicated by color coding. c, Multivariate linear regression was used to determine the association between age gaps versus early menopause and estrogen treatment independent of each other. Bar plot showing signed log10(P value) for menopause and estrogen covariates is shown. d, Box plot visualization of immune age gaps in individuals stratified by menopause status and estrogen treatment (n = 47). The box bounds are the Q1, median and Q3; the whiskers show Q1 − 1.5× the interquartile range (IQR) and Q3 + 1.5× the IQR. Q, quartile.

We also tested the associations between all 13 z-scored age gaps and consumption of 137 drugs/supplements (n cases ≥ 100; Supplementary Table 10). After multiple hypothesis test correction, we found six products (Premarin, ibuprofen, glucosamine, cod liver oil, multivitamins and vitamin C) that were significantly (q < 0.05) associated with youth in at least two organs (Fig. 3b). Ibuprofen, glucosamine, cod liver oil, multivitamins and vitamin C products were associated with youth primarily in the kidneys, brain and pancreas (Fig. 3b).

Premarin is a conjugated estrogen medication typically prescribed to women experiencing postmenopausal symptoms, and estrogen medication was recently linked to reduced mortality risk in the UKB26. Thus, we wondered whether estrogen medications may extend longevity by preventing menopause-induced accelerated aging of organs and disease risk27. Among 47 women with normal, early or premature menopause treated with estrogen, earlier menopause was associated with accelerated aging across most organs (Fig. 3c), whereas estrogen treatment correlated with youthful immune, liver and artery profiles (Fig. 3d). Although cross-sectional analyses should be interpreted with caution, these findings underscore the sensitivity of plasma protein-derived organ age estimates to environmental factors and their potential for evaluating the effects of interventions.

Brain and immune system youth predict longevity

We next sought to determine whether organ age estimates were associated with future mortality. We tested associations between organ age gaps and all-cause mortality risk, adjusting for age and sex, over a 2–17-year follow-up using Cox proportional hazard regression (Supplementary Table 11). All organs showed significant associations, with a 20–60% increased mortality risk per s.d. increase in age gaps (Fig. 4a), similar to findings from SomaScan-based organ aging models4. These associations were robust to adjustment with blood cystatin C, a marker of kidney filtration rate, and PhenoAge, an established blood biochemistry/cell-count-based biological age estimate (whose age gap has a mortality risk HR of 1.38 in the UKB), suggesting that organ age estimates provide independent information not captured by existing clinical biomarkers. Surprisingly, brain aging was most strongly predictive (s.d. increase in brain age gap, HR = 1.59, P = 2.16 × 10−293; Fig. 4a), suggesting that the brain may be a central regulator of lifespan in humans similar to findings in animal models (worms, flies and mice)28,29,30. Indeed, individuals with aged brains had increased risk for several diseases beyond dementia, including COPD and heart failure (Extended Data Fig. 5a,b), consistent with previous studies showing that the brain regulates systemic inflammation31,32,33,34,35.

Fig. 4: Accrual of aged organs progressively increases mortality risk, whereas brain and immune system youth is associated with longevity.

a, Bar plot showing results from Cox proportional hazard regression analyses, testing the associations between each age gap and future all-cause mortality risk, controlling for labeled covariates (n = 21,775). Bars show HRs, and error bars show 95% confidence intervals. PhenoAge age gap HR adjusted for age and sex (HR = 1.38) is shown as a dashed line for reference. b, Concordance indices of various LASSO-regularized Cox proportional hazard models trained to predict mortality risk. Performance across train and test centers is shown. Covariates for each model, in addition to age-at-blood-draw and sex, are labeled on the x axis. c, Model coefficients are shown for the combined model (OrganAge + PhenoAge + CysC) from b. d, Forest plot showing results from Cox proportional hazard regression, testing the associations between extreme ager status (‘+’ refers to aged; ‘−’ refers to youthful) and future all-cause mortality risk, controlling for age-at-blood-draw and sex. Only significant (P < 0.05) associations are shown. Points show extreme ager HRs; error bars show 95% confidence intervals; and the number on the right shows the number of events out of the total sample size. e, Kaplan–Meier curves with 95% confidence intervals showing survival over a 17-year follow-up for normal agers, multi-organ agers (with 2–4, 5–7 or 8+ aged organs) and individuals with a youthful brain or immune system (brain– or immune–). f,g, Gene Ontology pathway enrichment analyses from the top 10 brain (f) and immune (g) aging proteins, as determined from the mortality risk FIBA score (Methods). CysC, cystatin C.

To test whether organ age estimates provided additional predictive power, beyond cystatin C and PhenoAge, we compared concordance indices of mortality risk Cox models that included cystatin C, PhenoAge, organ ages or combinations, with age and sex as covariates. Organ ages alone performed similarly to the combined model, outperforming cystatin C and PhenoAge, indicating that they capture additional predictive information (Fig. 4b). Brain age, conventional age, PhenoAge and sex were key predictors in the combined model (Fig. 4c). Application of FIBA to understand contributions of brain and conventional aging model proteins on mortality risk highlighted BCAN, NEFL and PTPRR from the brain as well as ectodysplasin A2 receptor (EDA2R, organismal protein), chemokine C-X-C motif ligand 17 (CXCL17, organismal protein) and elastin (ELN, artery protein) from the conventional aging model as important proteins (Extended Data Fig. 6c–f).
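For readers unfamiliar with the concordance index used to compare these models, the following is a minimal sketch of Harrell's C on toy data. It is an illustration under stated assumptions, not the analysis code: the function name, the toy values and the simplified handling of tied event times (such pairs are skipped) are all hypothetical choices.

```python
from itertools import combinations

def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject who
    died earlier also has the higher predicted risk (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed death.
        # Simplification (assumption): pairs with equal times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: follow-up in years, 1 = died, 0 = censored, and a model risk score.
times = [2, 5, 7, 10, 12]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.6, 0.7, 0.3, 0.1]
c_index = concordance_index(times, events, risk)  # 7 of 8 comparable pairs
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why small differences between models in Fig. 4b can still be meaningful.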

Although each organ age gap was associated with risk of death, we wondered whether the accrual of aged organs would be increasingly detrimental. Hence, we tested whether single-organ extreme agers and multi-organ agers (bins of 2–4, 5–7 and 8+ organs) had an increased risk of death compared to normal agers, while adjusting for age and sex (Supplementary Table 12). Interestingly, we found that, whereas having a single aged organ (brain, lung, intestine, heart, immune, kidney, liver or pancreas) conferred a 1.5–3-fold increased risk of death, having 2–4, 5–7 and 8+ extremely aged organs conferred a 2.3-fold, 4.5-fold and 8.3-fold (P = 2.05 × 10⁻⁷⁰, P = 3.86 × 10⁻¹⁰⁶ and P = 8.30 × 10⁻¹²⁷) increased risk of death, respectively (Fig. 4d,e). Notably, over 60% of individuals with 8+ extremely aged organs at blood draw died within 15 years (Fig. 4e).

We then sought to determine whether youthful organ profiles were associated with longevity (Supplementary Table 12). We found that individuals with youthful-appearing arteries had increased mortality risk, and those with multi-organ youth had no difference in mortality risk compared to normal agers (Fig. 4d). Using FIBA, we found that artery protein thrombospondin 2 (THBS2), a protein that decreases with age but has a positive association with mortality risk, was responsible for the nonlinear association between the artery age gap and mortality risk (Extended Data Fig. 6e). Why individuals with broad multi-organ youth are not protected is unclear, although this may be due to limited sample size.

Notably, though, individuals with youthful brains (HR = 0.60, P = 7.49 × 10⁻³) and immune systems (HR = 0.58, P = 7.34 × 10⁻³) had significantly reduced mortality risk (Fig. 4d), similar to their unique protective associations with disease (Extended Data Fig. 5b). We, therefore, assessed individuals with both youthful brains and immune systems and found that this group was most strongly protected from mortality by effect size (HR = 0.44, P = 0.042; Fig. 4d,e), underscoring the benefits of having a resilient brain and immune system. Over 17 years, 792 of 10,000 (7.92%) normal agers died, whereas only six of 160 individuals (3.8%) with youthful brains and immune systems died.

To probe the underlying mechanisms implicated in brain-related and immune-system-related longevity, we performed Gene Ontology enrichment analyses of the top 10 brain and immune aging model proteins based on mortality risk FIBA scores (Fig. 4f,g and Extended Data Fig. 6h,i). Selection of brain-specific or immune-system-specific plasma proteins as background for the enrichment test did not result in significant associations, so we used all genes as background. Perineuronal net was the most enriched brain pathway (BCAN, PTPRZ1 and NCAN), and secretory granule was the most enriched immune system pathway. Intriguingly, neuroinflammation was another enriched immune system pathway, composed of matrix metalloproteinase 9 (MMP9), a regulator of perineuronal net composition, as well as tumor necrosis factor receptor superfamily member 1B (TNFRSF1B) and integrin alpha M (ITGAM), also known as CD11B. Youthful brain agers had higher levels of BCAN and NCAN, and youthful immune agers had lower levels of MMP9 (Extended Data Fig. 6h,i), suggesting that global preservation of brain extracellular matrix, partly due to reduced degradation by peripheral inflammatory factors, and reduced chronic inflammation, partly regulated by the brain, could be crucial for promoting longevity. Together, these data reveal that plasma protein-derived organ age estimates improve upon existing aging biomarkers for mortality risk prediction, the accrual of aged organs progressively increases mortality risk, and a youthful brain and immune system are key to longevity.

Discussion

Our findings based on plasma proteomics data (~3,000 proteins) from nearly 45,000 individuals in the UKB establish plasma protein-derived organ age estimates as robust indicators of organ age, health and disease/mortality risk beyond gold standard clinical aging biomarkers and reveal key proteins implicated in the aging process. Furthermore, we show that organ age estimates are stable across train/test centers and longitudinal visits and are cross-sectionally associated with modifiable lifestyle choice, socioeconomic status and medications and, thus, lay the foundation for human experiments testing the effects of novel longevity interventions on the biological age of organs at the individual level.

We note key considerations and limitations to inform future work. Although our organ enrichment classification based on bulk RNA sequencing atlases yielded robust results, confirming the true protein sources remains challenging; high-resolution gene expression maps including information on alternative splicing and changes with age and disease could strengthen confidence. Like many studies, we relied on cross-sectional age gaps, which only approximate aging rate histories. Longitudinal proteomics data, integrated with deep environmental and genetic phenotyping, are essential to accurately measure aging rates, distinguish baseline differences and uncover their causes16,36,37,38. Such data could also clarify the sequence of organ aging at individual and population levels. Although we observed that multi-organ aging increases with age, suggesting cumulative organ decline, the specific order of organ aging remains undefined. Surprisingly, individuals with highly youthful organ profiles (except for brain and immune system) were not protected from mortality, despite assumptions of benefit. This aligns with studies showing U-shaped associations between clinical biomarkers (for example, body mass index and platelet count) and mortality, where extremely youthful phenotypes are associated with elevated risk39,40,41, warranting further exploration in aging biomarker research. Although organ age estimates independently predict mortality beyond clinical biochemistry biomarkers, their added clinical value over established disease-specific biomarkers requires disease-by-disease investigation42. Finally, as UKB participants are predominantly of European ancestry, organ age estimates may need recalibration for diverse genetic and environmental contexts, necessitating broader studies.

In the present study, we found evidence that the brain and immune system may be central regulators of aging and longevity in humans, as aged brains are most strongly predictive of earlier mortality, and youthful brains and immune systems are uniquely predictive of longevity. After all, the brain regulates numerous critical age-related functions throughout the body, including circadian rhythm, blood pressure, energy homeostasis and stress response, via the neuroendocrine and autonomic nervous systems, and chronic inflammation has been heavily implicated in aging1,43. Intriguingly, recent studies show bidirectional communication between the brain and immune system in aging and disease-relevant contexts, such as chronic stress, atherosclerosis and infection31,32,33,34,35,44. These observations suggest that accelerated aging or maintenance of youth in the brain and immune system likely has broad age-related effects across the body, although additional studies are needed to test this hypothesis more rigorously in humans.

Regarding the molecular alterations that occur with brain aging, we found many unexpected age-associated and disease-associated changes in oligodendrocyte lineage and extracellular matrix proteins, implying extensive changes beyond neuroinflammation and neurodegeneration. Indeed, myelin degeneration and defective remyelination with mouse aging cause cognitive deficits45 and aggravate Alzheimer’s disease pathology46, and microglia homeostatically modulate the perisynaptic matrix47,48. Moreover, the observation that white matter regions exhibit the most pronounced shifts in aging microglial transcriptomes25, coupled with the enrichment of genetic risk variants for neurodegenerative diseases in microglial and oligodendrocyte genes49, underscores a potential link between oligodendrocyte aging and age-related neuroinflammation and their relevance to neurodegeneration. Future studies exploring these multicellular and extracellular matrix interactions in the aging brain and their interactions with the periphery may reveal key insights into human health and longevity.

Methods

UKB

Participants and proteomics

The UKB is a population-based prospective cohort with omics and phenotypic data collected on approximately 500,000 participants, aged 40–69 years at recruitment, between 2006 and 2010. A subset of participants had subsequent visits after the baseline assessment: 20,337 participants had a first repeat assessment (Instance 1; 2012–2013); approximately 85,000 had a first MRI imaging visit (Instance 2; from 2014 to ongoing); and approximately 9,000 had a second MRI imaging visit (Instance 3; from 2019 to ongoing). Details on available phenotypes can be found at https://biobank.ndph.ox.ac.uk/showcase/, and all participants provided informed consent.

The UKB-PPP consortium generated Olink Explore 3072 proteomics data from blood plasma samples collected from 54,219 UKB participants. These data consisted of 46,595 randomly selected samples from the baseline visit and 6,376 additional baseline visit samples selectively prioritized by the UKB-PPP (samples presumably enriched for rare diseases and polymorphisms). Additional Olink proteomics data (using an earlier version of the platform, ~1,500 proteins) were generated from 1,268 participants who participated in the COVID-19 repeat imaging study. These data included samples from the baseline visit, Instance 2 (imaging visit 2014+) and Instance 3 (imaging visit 2019+). Additional details on the proteomics data are provided in Sun et al.10 and at https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/PPP_Phase_1_QC_dataset_companion_doc.pdf.

UKB data were analyzed under application number 45420.

Clinical phenotypes

We defined participants’ last known age as either age at death or the difference between the latest date available in International Classification of Diseases (ICD)-9, ICD-10, operating procedure, cancer registry or UKB assessment visit fields and birth date; this age corresponds to censoring in the following survival analyses. Additionally, we define the age at onset of several groups of diseases based on a combination of information in ICD-10, ICD-9, operating procedure and cancer registry. The following disease groups were defined, as in ref. 50: Type 2 diabetes (E11), Ischemic heart disease (I20 to I25), Cerebrovascular disease (I60 to I69), Chronic liver disease (K70 and K73 to K76), Chronic kidney disease (N18), All-cause dementia (A81, F00 to F03, F05, F10, G30, G31 and I67), Vascular dementia (F01 and I67), Alzheimer’s disease (F00 and G30), Parkinson’s disease and parkinsonism (G20 to G22), Rheumatoid arthritis (M05 and M06), Macular degeneration (H35), Osteoporosis (M80 and M81), Osteoarthritis (M15 to M19), Prevalent hypertension (I10 to I13 and I15), Colorectal cancer (C18 to C20), Lung cancer (C33 and C34), Esophageal cancer (C15), Liver cancer (C22), Pancreatic cancer (C25), Brain cancer (C71), Leukemia (C91 to C95), Non-Hodgkin lymphoma (C82 to C86), Breast cancer (C50), Ovarian cancer (C56 and C57) and Prostate cancer (C61). For heart-related diseases, the group definitions were based on ref. 51: Heart failure (ICD-9: 4254, 4280, 4281 and 4289; ICD-10: I110, I130, I132, I255, I420, I428, I429, I500, I501 and I509) and Atrial fibrillation or flutter (ICD-9: 4273; ICD-10: I480, I481, I482, I483, I484 and I489; OPCS-4: K571, K621, K622, K623, K624, X501 and X502).
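As an illustration of how such code-range definitions can be turned into an age at onset, the sketch below maps ICD-10 codes to a few of the groups listed above and takes the earliest matching record. It is a simplified, hypothetical example: the function names and toy records are invented, only ICD-10 records are handled (not ICD-9, operating procedure or cancer registry fields), and age is approximated as days divided by 365.25.

```python
import re
from datetime import date

# A subset of the groups above, as inclusive ranges of 3-character ICD-10 categories.
DISEASE_GROUPS = {
    "Type 2 diabetes": [("E11", "E11")],
    "Ischemic heart disease": [("I20", "I25")],
    "Chronic liver disease": [("K70", "K70"), ("K73", "K76")],
    "Chronic kidney disease": [("N18", "N18")],
    "Alzheimer's disease": [("F00", "F00"), ("G30", "G30")],
}

def icd10_category(code):
    """Return the 3-character category of a code, e.g. 'I251' -> 'I25'."""
    match = re.match(r"[A-Z]\d{2}", code.strip().upper())
    return match.group(0) if match else None

def groups_for(code):
    """List every disease group whose category range contains the code."""
    cat = icd10_category(code)
    if cat is None:
        return []
    # Fixed-length categories compare correctly as strings ('I20' <= 'I21' <= 'I25').
    return [name for name, ranges in DISEASE_GROUPS.items()
            if any(lo <= cat <= hi for lo, hi in ranges)]

def age_at_onset(records, birth_date, group):
    """Age in years at the earliest record belonging to `group`, else None."""
    dates = [d for code, d in records if group in groups_for(code)]
    return (min(dates) - birth_date).days / 365.25 if dates else None

# Toy participant: (ICD-10 code, diagnosis date) pairs.
records = [("I251", date(2015, 6, 1)), ("I209", date(2012, 3, 15)),
           ("K760", date(2018, 1, 2))]
birth = date(1950, 3, 15)
onset = age_at_onset(records, birth, "Ischemic heart disease")  # about 62 years
```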

The following lifestyle and socioeconomic factors were assessed: alcohol intake frequency, smoking status, number of days per week of moderate physical activity 10+ minutes, number of days per week of vigorous physical activity 10+ minutes, Townsend Deprivation Index, sleeplessness or insomnia, age at completed full-time education, oily fish intake, beef intake, cooked vegetable intake, tea intake, pork intake, processed meat intake, salad raw vegetable intake, bread intake, fresh fruit intake, non-oil fish intake and poultry intake. Medications reported in the verbal interview (https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100075) were also analyzed with a minimum of 100 participants per medication. Brain MRI-derived phenotypes extracted with FreeSurfer version 6 (https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=110 and https://biobank.ctsu.ox.ac.uk/crystal/refer.cgi?id=1977) were analyzed, and age at MRI as well as estimated total intracranial volume were regressed out.

Proteomics quality control and imputation

The post-UKB-PPP quality control baseline visit data consist of 2,923 protein measurements from 53,018 samples. We performed additional quality control steps (Supplementary Fig. 2a) in the following order: 8,182 samples with more than 1,000 protein missing values were removed; seven proteins with missing values in over 10% of samples were removed; and 338 samples with discordant reported sex and genetic sex were removed. This resulted in a post-quality control dataset consisting of 44,498 samples with 2,916 protein measurements.
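The order of these filters matters, because the per-protein missingness in the second step is computed only on samples that survive the first. A minimal sketch of the three steps on a toy matrix follows; the function name, argument names and toy data are hypothetical, and the sample threshold is lowered from 1,000 to 2 so that the toy example has something to remove.

```python
import numpy as np

def quality_control(X, reported_sex, genetic_sex,
                    max_missing_per_sample=1000,
                    max_missing_frac_per_protein=0.10):
    """Apply the three filters in the stated order to a samples x proteins array."""
    X = np.asarray(X, dtype=float)
    sample_idx = np.arange(X.shape[0])
    protein_idx = np.arange(X.shape[1])

    # 1) Remove samples with too many missing protein values.
    keep = np.isnan(X).sum(axis=1) <= max_missing_per_sample
    X, sample_idx = X[keep], sample_idx[keep]

    # 2) Remove proteins missing in more than 10% of the remaining samples.
    keep = np.isnan(X).mean(axis=0) <= max_missing_frac_per_protein
    X, protein_idx = X[:, keep], protein_idx[keep]

    # 3) Remove samples whose reported and genetic sex disagree.
    keep = np.asarray(reported_sex)[sample_idx] == np.asarray(genetic_sex)[sample_idx]
    return X[keep], sample_idx[keep], protein_idx

nan = np.nan
toy = [[1, 2, 3, 4], [nan, nan, nan, 4], [1, nan, 3, 4],
       [1, 2, 3, 4], [1, nan, 3, 4], [1, 2, 3, 4]]
reported = ["F", "M", "M", "F", "F", "M"]
genetic = ["F", "M", "M", "M", "F", "M"]
X_qc, kept_samples, kept_proteins = quality_control(
    toy, reported, genetic, max_missing_per_sample=2)
```

The reported counts are consistent with this sequence: 53,018 − 8,182 − 338 = 44,498 samples and 2,923 − 7 = 2,916 proteins.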

We then performed missing value imputation of the proteomics data (Supplementary Fig. 2b) with the following steps. First, we split the data into train and test, with each split comprising 11 randomly selected centers (train centers: 11002, 11005, 11006, 11007, 11008, 11009, 11013, 11014, 11017, 11018 and 11023; test centers: 10003, 11001, 11003, 11004, 11010, 11011, 11012, 11016, 11020, 11021 and 11022). Protein values were z-score normalized based on the means and standard deviations of protein values in the train split. We trained a k-nearest neighbors imputer using scikit-learn’s KNNImputer function with the number of neighbors (k) set to the square root of the sample size of the train split (k = 152). We evaluated the imputer on a subset of 5,591 samples (2,967 train and 2,624 test) with zero original missing values. Specifically, we randomly inserted missing values into this ‘ground truth’ subsample at a rate equivalent to the missing value rate in the whole post-quality control dataset (3%). We then performed imputation on this subsample to calculate the error between imputed values and original ground truth values. We confirmed robust imputation with a total mean absolute error (MAE) of 0.57 (Supplementary Fig. 2c,d), which was consistent across both train and test. This is a relatively small error, considering that the z-scored data span a range of approximately 5 units (z-scores within ±2.5 cover about 99% of the distribution).
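The imputation and its masking-based evaluation can be sketched on synthetic data as follows. This is a hedged illustration rather than the study pipeline: the synthetic matrix, its dimensions and all variable names are assumptions, and only the general recipe (train-split z-scoring, KNNImputer with k equal to the square root of the train size, 3% random masking, MAE on masked entries) is taken from the text.

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)

# Synthetic stand-in for protein data: correlated columns from 3 latent factors.
n_train, n_test, n_prot = 400, 200, 20
latent = rng.normal(size=(n_train + n_test, 3))
loadings = rng.normal(size=(3, n_prot))
X = latent @ loadings + 0.3 * rng.normal(size=(n_train + n_test, n_prot))
X_train, X_test = X[:n_train], X[n_train:]

# z-score both splits using train-split means and standard deviations only.
mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
Z_train, Z_test = (X_train - mu) / sd, (X_test - mu) / sd

# k is the square root of the train sample size (20 here; 152 in the study).
k = int(round(np.sqrt(n_train)))
imputer = KNNImputer(n_neighbors=k).fit(Z_train)

# Hide 3% of fully observed test entries, impute them, and score the error.
mask = rng.random(Z_test.shape) < 0.03
Z_masked = Z_test.copy()
Z_masked[mask] = np.nan
Z_imputed = imputer.transform(Z_masked)
mae = np.abs(Z_imputed[mask] - Z_test[mask]).mean()
```

Because the error is computed on z-scored values, it can be read directly in units of standard deviations of each protein.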

Stanford

Participants

Stanford ADRC

Samples were acquired through the National Institute on Aging–funded Stanford ADRC. The Stanford ADRC cohort is a longitudinal observational study of individuals with clinical dementia and age-matched and sex-matched individuals without dementia. All healthy control participants were deemed cognitively unimpaired during a clinical consensus conference that included board-certified neurologists and neuropsychologists. Cognitively impaired participants underwent Clinical Dementia Rating and standardized neurological and neuropsychological assessments to determine cognitive and diagnostic status, including procedures of the National Alzheimer’s Coordinating Center (https://naccdata.org/). Cognitive status was determined in a clinical consensus conference that included neurologists and neuropsychologists. All participants were free from acute infectious diseases and in good physical condition.

SAMS

SAMS is an ongoing longitudinal study of healthy aging. Blood and cerebrospinal fluid (CSF) collection and processing were done by the same team and using the same protocol as in the Stanford ADRC. Neurological and neuropsychological assessments were performed by the same team and using the same protocol as in the Stanford ADRC. All SAMS participants had Clinical Dementia Rating = 0 and a neuropsychological test score within the normal range, and all SAMS participants were deemed cognitively unimpaired during a clinical consensus conference that included neurologists and neuropsychologists.

Sample preparation and proteomics

Plasma and CSF collection, processing and storage for all Stanford cohorts were performed using a single standard operating procedure. All studies were approved by the institutional review board of Stanford University, and written informed consent or assent was obtained from all participants or their legally authorized representative.

Blood collection and processing were done according to a rigorous standardized protocol to minimize variation associated with blood draw and blood processing. In brief, approximately 10 cc of whole blood was collected in four vacutainer EDTA tubes (Becton Dickinson) and spun at 1,800g for 10 minutes to separate out plasma, leaving 1 cm of plasma above the buffy coat and taking care not to disturb the buffy coat to circumvent cell contamination. Plasma was aliquoted into polypropylene tubes and stored at −80 °C. Plasma processing times averaged approximately 1 hour from the time of the blood draw to the time of freezing and storage. All blood draws were done in the morning to minimize the impact of circadian rhythm on protein concentrations. CSF was collected via lumbar puncture using a 20–22-gauge spinal needle that was inserted in the L4–L5 or L5–S1 interspace. CSF samples were immediately centrifuged at 500g for 10 minutes, aliquoted in polypropylene tubes and stored at −80 °C.

Olink Explore 3072 plasma proteomics data from 601 samples with matched SomaScan plasma proteomics data were analyzed in this study.

Computational Analyses

Identification of organ-enriched plasma proteins

We used the same methods that we developed in our previous study4 to identify putative organ-enriched plasma proteins. In brief, we identified organ-enriched genes: genes that were expressed at least four times higher in a single organ compared to any other organ based on human tissue bulk RNA sequencing data from the Genotype-Tissue Expression (GTEx) atlas (Supplementary Fig. 1a–d). We refer to our previous study for details4. Our classification of organ-enriched genes is provided in Supplementary Table 1. We mapped these genes to proteins measured by Olink to identify organ-enriched plasma proteins in the UKB dataset (Supplementary Table 2). Non-organ-enriched plasma proteins were called ‘organismal’ proteins.
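The four-fold rule can be expressed compactly: a gene is organ-enriched when its expression in the top organ is at least four times its expression in the runner-up organ. The sketch below applies this to a toy matrix; the function name, the toy expression values and the handling of all-zero genes are assumptions, and details of the actual pipeline (for example, how tissues are grouped into organs) are in the previous study.

```python
import numpy as np

def classify_genes(expr, organs, fold=4.0):
    """Label each gene with its enriched organ, or 'organismal' otherwise.

    expr is a genes x organs matrix of expression values (for example, TPM).
    """
    labels = []
    for row in np.asarray(expr, dtype=float):
        order = np.argsort(row)
        top, runner_up = row[order[-1]], row[order[-2]]
        # Being >= fold x the runner-up implies >= fold x every other organ.
        if top > 0 and top >= fold * runner_up:
            labels.append(organs[order[-1]])
        else:
            labels.append("organismal")
    return labels

organs = ["brain", "liver", "kidney"]
expr = [[40, 5, 2],   # brain is 8x the runner-up
        [10, 9, 1],   # no organ dominates
        [0, 0, 0],    # not expressed
        [3, 1, 12]]   # kidney is exactly 4x the runner-up
labels = classify_genes(expr, organs)
```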

We compared our transcriptomics-based organ enrichment classification to proteomics-based organ enrichment from Jiang et al.52. Forty-eight of the 2,916 plasma proteins in the UKB dataset were not detected in Jiang et al. tissue proteomics data. Of the detected proteins, 80% of those we called ‘organ-enriched’ were also called organ-enriched or organ-specific in the same organ in Jiang et al. (Supplementary Fig. 1e), suggesting strong concordance between RNA and protein-based enrichment classification. Twelve percent of our organ-enriched plasma proteins were not enriched in Jiang et al., and 8% were enriched in a different organ (Supplementary Fig. 1e). Most of the discordant proteins (20%) were secreted proteins produced by the liver; we called these proteins ‘liver-enriched’ based on RNA, but, based on Jiang et al. tissue proteomics, they were called either artery-enriched or non-enriched (Supplementary Fig. 1f,g). This highlights a key distinction between organ enrichment classification based on tissue transcriptomics versus proteomics: transcriptomics captures the source of the protein, whereas proteomics captures its destination. Although understanding inter-organ communication by studying the source versus destination of proteins is an interesting area for future study, for our current study, our goal was to determine the putative organ source of plasma proteins to infer organ age.

Organ age estimation and age gap calculation

We used LASSO regression models to build chronological age predictors (also known as aging models) to estimate biological age. Aging models were trained and tested on the post-quality control imputed baseline visit proteomics data, using the same train–test split as the imputation. The LassoCV function from the scikit-learn53 Python package was used to identify the optimal lambda parameter value using five-fold cross-validation. The lambda value that achieved 95% of the performance of the highest-performing lambda value was applied to scikit-learn’s Lasso estimator to derive sparse aging models. Each organ aging model was trained using a distinct set of organ-enriched plasma proteins. An organismal aging model was trained using non-organ-enriched proteins. A conventional aging model was trained using all proteins. Age gaps were calculated as the residual of predicted age linearly regressed against actual age. Age gaps were z-scored per aging model to normalize for differences in age prediction accuracy. z-scored age gaps were used for all analyses. Extreme agers were defined as individuals with an age gap z-score greater than 1.5 or a z-score less than −1.5 in a given aging model. Conventional age gaps were not included in the extreme ager analyses due to their high similarity to organismal age gaps (Fig. 1b,c). All aging model weights are provided in Supplementary Table 3.
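The modeling steps in this paragraph can be sketched end to end on synthetic data. Several choices below are assumptions made for illustration: performance is taken to be cross-validated R², the 95% rule is read as choosing the largest (sparsest) penalty whose R² is at least 95% of the best, and the residual regression and z-scoring are done within the evaluated sample set. scikit-learn calls the lambda penalty `alpha`.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.default_rng(0)

# Synthetic cohort: 30 "proteins", of which the first 5 drift with age.
n, p = 600, 30
age = rng.uniform(40, 70, size=n)
X = rng.normal(size=(n, p))
X[:, :5] += 0.08 * (age[:, None] - 55)
train, test = np.arange(n) < 400, np.arange(n) >= 400

# Five-fold cross-validation over scikit-learn's default penalty path.
cv = LassoCV(cv=5).fit(X[train], age[train])
# Mean CV R^2 per candidate penalty (mse_path_ is n_alphas x n_folds).
r2 = 1.0 - cv.mse_path_.mean(axis=1) / age[train].var()
# Assumed 95% rule: sparsest model retaining >= 95% of the best CV R^2.
alpha = cv.alphas_[r2 >= 0.95 * r2.max()].max()
model = Lasso(alpha=alpha).fit(X[train], age[train])

def age_gaps(predicted, actual):
    """z-scored residuals of predicted age regressed on actual age."""
    slope, intercept = np.polyfit(actual, predicted, 1)
    resid = predicted - (slope * actual + intercept)
    return resid / resid.std()

z = age_gaps(model.predict(X[test]), age[test])
# Extreme agers: '+' for aged (z > 1.5), '-' for youthful (z < -1.5).
status = np.where(z > 1.5, "+", np.where(z < -1.5, "-", "normal"))
```

Regressing predicted on actual age before taking residuals removes the dependence of the gap on chronological age, so the age gap is uncorrelated with age by construction.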

In addition to aging models trained on only proteins, we also assessed aging models trained on proteins and sex as well as aging models trained separately per sex. Pairwise correlations between age gaps produced by these different modeling frameworks showed very similar age gaps between the different modeling frameworks (Supplementary Fig. 3a). Age gap associations with mortality were highly similar across frameworks (Supplementary Fig. 3b). Extreme ager associations with mortality were similar, but sex-specific models showed no significant associations between extreme youth and longevity, although brain and immune youth trended toward significance (Supplementary Fig. 3c). For simplicity and robustness, we focused on aging models trained on only proteins for all primary analyses in this paper.

We also assessed aging models trained on only the randomly selected participants (86%) in the dataset. As mentioned previously, 14% of the samples were prioritized by the UKB-PPP to study specific diseases of interest. Aging models trained on the whole dataset and aging models trained on only the randomly selected subset produced nearly equivalent age gaps (r = 1.00; Supplementary Fig. 4a). Age gap associations with incident disease and mortality were also nearly equivalent when examining either all participants or only randomly selected participants (r = 0.99 and r = 0.99; Supplementary Fig. 4b,c). Given these consistencies, we analyzed the entire dataset in this paper.

Longitudinal age gap analyses

Longitudinal age gap analyses (Extended Data Fig. 3) required use of plasma proteomics data collected across multiple visits (baseline, Instance 2 and Instance 3) from the same individual. However, proteomics data from post-baseline samples were obtained from an earlier version of the Olink assay with 1,463 proteins and were not compatible with models trained on the approximately 3,000-protein platform. Therefore, for longitudinal age gap analyses, we trained a distinct set of organ aging models using the subset of 1,463 proteins that were measured across all visits.

Four proteins with missing values in over 10% of samples were removed, leaving 1,459 proteins for model development. For model training, missing values for baseline samples were k-nearest neighbor imputed as described above. The 1,500-protein-based aging models were trained on 44,406 baseline samples from individuals who did not have follow-up proteomics data. Samples from individuals with longitudinal proteomics data were not included in model training to prevent model training/evaluation contamination. These 1,500-protein-based aging models were LASSO regression models trained to predict the predicted age from the 3,000-protein-based organ aging models. To maximize the number of testable samples with unimputed data, we aimed to maximize sparsity of the models. The lambda value that achieved 90% of the performance of the highest-performing lambda value was used, followed by recursive feature elimination using scikit-learn’s RFECV function. Liver and muscle 1,500-based aging models were removed due to low correlation (r < 0.8) with 3,000-based aging models.
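The idea of training a reduced-panel model to reproduce the output of a full-panel model, and then discarding it if agreement is poor, can be sketched as below. This is a toy illustration under several assumptions: the data are synthetic, the "full-panel prediction" is a made-up linear function, the LASSO penalty is fixed at an arbitrary value rather than chosen by the 90% rule, and agreement is checked on a simple held-out slice.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# 12 "proteins" on the full panel; only the first 6 exist on the reduced panel.
n = 300
X_full = rng.normal(size=(n, 12))
# Hypothetical full-panel predicted age; protein 8 is absent from the reduced panel.
full_pred = 55 + X_full[:, [0, 1, 2, 8]] @ np.array([3.0, 2.0, 2.0, 1.0])
X_sub = X_full[:, :6]
train, test = slice(0, 200), slice(200, n)

# Target is the full model's prediction, not chronological age.
# RFECV drops the least important protein one at a time, scored by 5-fold CV.
selector = RFECV(Lasso(alpha=0.05), step=1, cv=5)
selector.fit(X_sub[train], full_pred[train])

# Keep the reduced model only if it tracks the full model closely.
r = np.corrcoef(selector.predict(X_sub[test]), full_pred[test])[0, 1]
keep_model = r >= 0.8
```

Here `selector.support_` marks the proteins retained after elimination; a sparser model means more longitudinal samples have complete, unimputed data for every required protein.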

Models were tested and evaluated on longitudinal data from 1,176 unique individuals who had non-missing values for all remaining aging model proteins (880 baseline, 843 Instance 2 and 786 Instance 3 samples). The mean number of years between Instance 2 and baseline was 9.1 years (s.d. = 1.8), and the mean number of years between Instance 3 and Instance 2 was 3.3 years (s.d. = 1.6).

MRI-based brain age estimation

We trained an MRI-based brain aging model using 479 FreeSurfer brain volumetric phenotypes from 45,574 individuals from Instance 2 (Extended Data Fig. 5d). This model was trained using the same pipeline as used for the organ aging models (LassoCV, five-fold cross-validation, lambda value with 95% performance). MRI brain age gaps from Instance 2 were correlated with plasma brain age gaps from baseline.

Statistical analyses

Cox proportional hazard regression (CoxPHFitter function from the lifelines54 Python package) was used to assess the associations between organ age gaps and future disease or mortality risk. Linear regression (OLS function from the statsmodels55 Python package) was used to assess the associations between organ age gaps and environmental factors recorded at the time of blood draw. All Cox and linear regression models included age and sex as additional covariates. Multiple hypothesis testing correction was applied, when appropriate, using the Benjamini–Hochberg method, and the significance threshold was a 5% false discovery rate. Corrected P values are referred to in this paper as q values. Gene Ontology pathway enrichment analyses were performed using g:Profiler56.
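For reference, the Benjamini–Hochberg adjustment that converts P values into q values can be written in a few lines. The sketch below is a generic implementation of the standard procedure with hypothetical names and toy P values; it is not the code used in the study.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values (q values)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest P value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest P value.
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(q_sorted, 0.0, 1.0)
    return q

pvals = [0.001, 0.02, 0.03, 0.8]
q = benjamini_hochberg(pvals)   # 0.004, 0.04, 0.04, 0.8
significant = q < 0.05          # 5% false discovery rate threshold
```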

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.