Abstract
In this prospective cohort study of 1545 participants aged 80 years and older from the China Hainan Centenarian Cohort Study, we investigated the independent and joint associations of modifiable risk factors and genetic predisposition with life expectancy. A weighted modifiable risk factor score (MRFS) based on 11 factors and a polygenic risk score (PRS) for longevity were constructed. A favorable modifiable risk factor profile (low MRFS) was associated with a 40.7% lower death risk (HR 0.593, 95%CI 0.505–0.696) compared with high MRFS. Genetic predisposition to longer lifespan (high PRS) conferred a 13.0% lower risk (HR 0.870, 95%CI 0.768–0.986). Participants with both low MRFS and high PRS had the lowest mortality (HR 0.544, 95%CI 0.432–0.686), with a borderline significant multiplicative interaction (P = 0.040). Life expectancy gains from a low MRFS were more pronounced in those with high PRS (6.92 years at age 80) than low PRS (5.35 years). Among the oldest-old Han Chinese, favorable modifiable risk profiles and genetic predisposition independently and jointly contribute to substantially longer life expectancy. Importantly, an unfavorable modifiable profile may largely negate genetic longevity benefits, emphasizing the critical role of managing these factors even in advanced age and irrespective of genetic inheritance.

Introduction
Global life expectancy has exhibited a sustained upward trajectory over centuries1, with China’s average lifespan reaching 79 years in 2024 (ref. 2). Demographic data from the Seventh National Population Census indicate that China’s population aged 80 years and older currently stands at 35.8 million3, with projections suggesting this number will exceed 70 million by 2035 (ref. 4). Concurrently, the centenarian population has demonstrated substantial growth. The pursuit of healthy aging and extended longevity has consequently emerged as a significant public health priority and subject of increasing societal concern.
Twin studies estimate the heritability of longevity to be ~25% in the general population5, increasing to about 40% among individuals surviving beyond 85 years6, suggesting a more prominent genetic role at advanced ages. Genome-wide association studies (GWAS) have identified several genetic loci associated with longevity across diverse populations7,8,9. However, individual variants typically confer very small effect sizes and explain only a limited fraction of heritability, while their replicability across different populations remains an ongoing challenge. Polygenic risk scores (PRS), which aggregate the effects of multiple genetic loci, offer a more robust tool to measure genetic predisposition for a longer lifespan10,11.
Beyond genetic determinants, modifiable risk factors are well-established drivers of premature death across the life course12,13,14,15. Emerging research also highlights the roles of epigenetic mechanisms and dynamic gene-environment interactions in shaping aging trajectories16,17. Nevertheless, whether these associations persist in the oldest-old, in whom genetic predisposition may be more prominent, and how modifiable and genetic factors jointly affect life expectancy remain largely unexplored. The extent to which modifiable risk factors may attenuate genetic predisposition to longevity in advanced age represents a critical knowledge gap. Elucidating this interplay is crucial for developing targeted strategies to promote healthy aging in the very late stages of life18.
Therefore, using the China Hainan Centenarian Cohort Study (CHCCS), the largest single-center cohort study of centenarians, this study aims to construct a PRS capturing genetic susceptibility to extreme longevity, assess the influence of a comprehensive modifiable risk factor score (MRFS), and investigate their interactive effects on human lifespan in this exceptional population.
Results
Study population and follow-up
During a median follow-up of 4.04 years (IQR 1.77–5.58 years), a total of 1020 deaths were identified among 1545 participants from CHCCS. The baseline characteristics of study participants by vital status are presented in Supplementary Table 4, and characteristics according to MRFS and PRS category are shown in Table 1.
MRFS and mortality
The MRFS was normally distributed (Fig. 1). Participants with a favorable modifiable risk factor profile (low MRFS) were more likely to be younger, male, married, and non-manual workers (P < 0.05). A higher MRFS was significantly associated with a higher risk of death in a linear dose-response manner (P overall < 0.001; P nonlinear = 0.609; P for trend < 0.001). The HR of death for individuals in the low category was 0.595 (95% CI 0.507–0.698) compared with those in the high category. The associations did not change with further adjustment for PRS (Table 2). The same pattern of associations was observed in the sensitivity analysis using the unweighted MRFS (Supplementary Table 5).
Fig. 1: A Q-Q plot of modifiable risk factor score. B Distribution of polygenic risk score by longevity status. C Restricted cubic spline of modifiable risk factor score. D Restricted cubic spline of polygenic risk score.
PRS and mortality
The distribution of PRS by longevity status is shown in Fig. 1. There was no correlation between MRFS and PRS (Spearman r = 0.063, P = 0.14); the scatterplot is shown in Supplementary Fig. 9. Participants with a genetic predisposition to a longer lifespan (high PRS) were more likely to be younger and married (P < 0.05). The risk of death decreased across the range of PRS (P overall < 0.001; P nonlinear = 0.134; P for trend < 0.001). Compared with individuals in the low PRS category, those in the high category had a lower risk of death (HR 0.877, 95% CI 0.775–0.993). The associations remained significant after additional adjustment for MRFS (Table 2). The associations were robust in analyses using PRS constructed with different P value thresholds (Supplementary Table 6).
Joint effects of PRS and MRFS
In the analysis of joint categories for PRS and MRFS, the risk of death decreased with reduced MRFS and elevated PRS (Fig. 2). Specifically, the mortality rate per 1000 person-years ranged from 288.72 (95% CI 253.00–327.66) to 111.92 (95% CI 93.43–132.70) across the joint categories. In particular, compared with individuals with a genetic propensity for a shorter lifespan (low PRS) and an unfavorable modifiable risk factor profile, those with a genetic propensity for a longer lifespan (high PRS) and a favorable modifiable risk factor profile had a 45.6% lower risk of death (HR 0.544, 95% CI 0.432–0.686). In contrast, individuals with an unfavorable modifiable risk factor profile did not have a significantly lower risk of death even when they had a genetic propensity for a longer lifespan (HR 1.015, 95% CI 0.835–1.235).
Fig. 2: Mortality rates (per 1000 person-years) with 95% confidence intervals (CIs) were estimated using Poisson regression. The model was adjusted for age, sex, 3 principal components of ancestry, marriage, work, and history of CVD, stroke, and cancer.
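As a rough illustration of how a rate per 1000 person-years and its interval arise under a Poisson assumption, the sketch below computes a crude (unadjusted) rate with a log-scale Wald interval. It is not the adjusted Poisson regression used in the study, and the death and person-year counts are hypothetical.

```python
import math

def mortality_rate_per_1000(deaths: int, person_years: float, z: float = 1.96):
    """Crude rate per 1000 person-years with a log-scale Wald interval."""
    rate = deaths / person_years * 1000.0
    # Under a Poisson model, the standard error of log(rate) is 1 / sqrt(deaths).
    half_width = z / math.sqrt(deaths)
    return rate, rate * math.exp(-half_width), rate * math.exp(half_width)

# Hypothetical counts, not taken from the study.
rate, lower, upper = mortality_rate_per_1000(deaths=120, person_years=600.0)
print(f"{rate:.2f} ({lower:.2f} to {upper:.2f}) per 1000 person-years")
```

Because the interval is built on the log scale, it is asymmetric around the point estimate, which matches the shape of the intervals reported for the joint categories.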
Stratified analysis by PRS
In the analysis stratified by PRS category, the cumulative survival probability during follow-up was higher in participants with a favorable modifiable risk factor profile than in participants with an unfavorable modifiable risk factor profile (P log rank < 0.001) (Supplementary Fig. 10). Cox regression confirmed that a low MRFS was associated with a lower risk of death across genetic groups (Fig. 2). Participants in the low MRFS group had a significantly lower risk of death than those in the high MRFS group, with an adjusted HR of 0.649 (95% CI 0.517–0.815) in participants with a low PRS and 0.563 (95% CI 0.447–0.708) in participants with a high PRS. In addition, a multiplicative interaction between the MRFS and the PRS was detected (P interaction = 0.040).
Sensitivity analyses
The observed associations remained robust and stable in a series of sensitivity analyses: (1) using a Cox regression model with lifespan as the timescale (Supplementary Table 7); (2) using the unweighted MRFS (Supplementary Table 8); (3) using the PRS constructed with a P value threshold of 1 × 10⁻⁴ (Supplementary Table 9); (4) excluding deaths in the first year after the baseline survey (Supplementary Table 10); and (5) stratifying by age group (Supplementary Table 11), sex (Supplementary Table 12), marital status (Supplementary Table 13), work (Supplementary Table 14), and history of CVD, stroke, or cancer (Supplementary Table 15). For comparison with previous studies, we further explored the association between the risk of death and the components of the MRFS, namely socioeconomic and psychosocial, lifestyle, and metabolic risk factors. Consistent results were observed in the main, joint, and stratified analyses (Supplementary Tables 16–18).
Life expectancy at age 80
Life expectancy at 80 years was 18.31 (95% CI 14.48–25.26) years for participants with a genetic propensity for a longer lifespan and a favorable modifiable risk factor profile, and 11.39 (95% CI 9.40–14.60) years for participants with a genetic propensity for a shorter lifespan and an unfavorable modifiable risk factor profile, a mean difference of 6.92 years (Fig. 3). A favorable modifiable risk factor profile had a strong effect on remaining life expectancy regardless of PRS group. Among the centenarians, those with a favorable modifiable risk factor profile gained 1.52 years or 1.96 years of life in the groups with a genetic propensity for a longer or shorter lifespan, respectively (Supplementary Fig. 11).
Fig. 3: A Remaining life expectancy at different ages. B Years of life gained at different ages. PRS polygenic risk score, MRF modifiable risk factor.
Discussion
To our knowledge, this represents the first investigation into the joint associations between modifiable risk factors and genetic predisposition with mortality risk in the oldest old. Our findings demonstrate that a favorable modifiable risk profile was associated with a 40.7% lower risk of death compared to an unfavorable profile, and this association persisted within each genetic risk category. Participants with both a high genetic predisposition for longevity and favorable modifiable risk factor profile exhibited the lowest mortality risk (HR 0.544 95% CI 0.432–0.686) compared to those with an unfavorable profile and low genetic predisposition. Notably, the survival advantage associated with a high genetic predisposition appeared less pronounced when combined with an unfavorable modifiable risk profile. A favorable modifiable risk profile was associated with a gain in remaining life expectancy of 5.35 years at age 80 in the low PRS group, and 6.92 years in the high PRS group. These results suggest that maintaining a favorable risk profile is associated with substantial benefits for remaining lifespan, irrespective of genetic susceptibility, even at very advanced ages.
Our findings deliver a powerful and optimistic message: it is never too late to intervene. By studying individuals who have already survived to 80+ years, our cohort represents a selected population resilient to earlier-life mortality risks19. This selection may lead to an underestimation of the true effect sizes of both modifiable risk factors and genetic predisposition, as their strongest impacts on mortality likely occurred at younger ages. Consequently, the associations we observed likely reflect the residual effects of these factors in late life. This makes the observed strong association of modifiable risk factors with survival even in this resilient group particularly noteworthy, suggesting their relevance extends into advanced age. The significant gains in life expectancy (up to nearly 7 years) associated with favorable modifiable risk factors even among individuals aged 80 years and older robustly challenge the notion of a finite “window” for effective intervention. This study provides some of the most compelling evidence to date that proactive management of modifiable risk factors remains critically effective well into advanced age, thereby extending the potential for public health action across the entire life course20. This insight is crucial for reframing global aging strategies. In alignment with the World Health Organization’s Decade of Healthy Aging21 and the United Nations’ Sustainable Development Goals22, which emphasize well-being at all ages, our research underscores that aging populations are not merely a demographic challenge to be managed, but a demographic opportunity to be realized. Policy and healthcare systems must evolve to integrate evidence-based, personalized preventive care for the oldest-old, moving beyond a sole focus on disease treatment to actively promoting health and functional capacity. 
By doing so, we can not only help individuals add more years to their lives but, more importantly, add more life to those years, contributing to more sustainable and inclusive societies.
Our findings align with and extend previous epidemiological evidence establishing the critical importance of modifiable risk factors in mortality. The Prospective Urban Rural Epidemiology (PURE) study attributed ~75% of mortality to modifiable risk factors, with population-attributable fractions ranging from 65% in high-income to 80% in low-income countries13. Similarly, research in Chinese populations has demonstrated that modifiable factors account for 56–64.8% of population-attributable mortality risk15,23. Furthermore, Wang et al. showed that the contribution of 12 modifiable risk factors, including lifestyle, socioeconomic and psychosocial, and metabolic risk factors, to deaths differed by age17. These findings are further supported by UK Biobank data showing significant proportions of mortality associated with modifiable factors across age groups (PAF of 56.53% in middle-age; 49.78% in quinquagenarian; 42.45% in the elderly)16. Our results strengthen this evidence base by demonstrating robust associations between modifiable risk factors and mortality reduction (HR 0.593, 95% CI 0.505–0.696) even in the oldest-old population. Collectively, these findings underscore the imperative for comprehensive modifiable risk factor management through multilevel interventions targeting individual education, community support, and policy initiatives to achieve substantial reductions in premature mortality.
The complementary role of genetic factors in longevity has gained increasing recognition, though previous studies have primarily examined lifestyle-genetic interactions using limited risk factor assessments24,25,26. One study recruited 9633 Chinese older adults and constructed a healthy lifestyle score from smoking, alcohol consumption, physical activity, and healthy diet27. Bian et al. used 6 lifestyle factors to construct a healthy lifestyle score based on 35,742 adults from UK Biobank28. Zhang et al. constructed the Life’s Essential 8 index (CVH, including 4 behavioral and 4 metabolic factors) based on UK Biobank29. However, the joint relationship of a broader set of modifiable risk factors and genetic risk with lifespan has not been examined. Our study advances this field by systematically incorporating socioeconomic, psychosocial, lifestyle, and metabolic factors into a comprehensive modifiable risk factor assessment. A common public perception is that genetic risk assessments are deterministic, fostering a sense of futility regarding one’s capacity to extend lifespan7,30. Nevertheless, the majority of deaths arise from the progressive accumulation of genetic and epigenetic alterations over the life course31,32, which may be influenced by modifiable risk factors. In the present study, we observed a significant multiplicative interaction between MRFS and PRS in relation to the risk of death, highlighting the necessity of controlling modifiable risk factors, especially among those with a genetic predisposition to a longer lifespan. These results challenge deterministic interpretations of genetic risk and support a model wherein modifiable factors substantially modulate genetic predisposition through potentially reversible mechanisms33.
Several biological mechanisms could be postulated. First, the SNPs included in our PRS implicate diverse longevity-related pathways. For example, variants near PPP1R12B (1q32.1) have been linked to parental lifespan and cardiovascular health, potentially influencing vascular smooth muscle contractility and blood pressure regulation34. Variants in ZNF521 (18q11.2) and ZNF385D (3p24.3) suggest roles in transcriptional regulation of developmental and cellular senescence pathways. Second, individuals with a high genetic predisposition for longevity might possess more robust cellular maintenance systems (e.g., enhanced DNA repair capacity, autophagy efficiency, or oxidative stress resistance) conferred by the cumulative protective effects of these variants35. An unfavorable modifiable risk profile could impose a chronic physiological burden (e.g., elevated inflammation or oxidative stress) that overwhelms these inherent protective systems, thereby attenuating their survival advantage36. Third, specific candidate pathways mediating this gene-environment interplay include inflammatory signaling (e.g., IL-6, TNF-α), metabolic regulation (e.g., insulin/IGF-1 signaling), and cardiovascular resilience, all of which are influenced by both the genetic variants comprising our PRS and lifestyle factors37. However, these mechanisms remain speculative. Larger, independent cohorts with integrated multi-omics approaches are required to confirm and elucidate the molecular mechanisms that mediate the joint effects of genetic predisposition and modifiable risk factors on mortality among the oldest-old.
From a public health perspective, the quantification of potential life expectancy or lifespan gained provides crucial evidence for intervention development. One study estimated that at age 40, adherence to 7–8 healthy lifestyle factors was associated with a life expectancy of 46.8 years (95% CI 45.6–48.1) for men and 51.3 years (95% CI 50.0–52.6) for women38. Similarly, the CLHLS study reported that at age 65, a healthy lifestyle was linked to a gain of 3.84 years (95% CI 3.05–4.64) among those with low genetic risk and 4.35 years (95% CI 3.70–5.06) among those with high genetic risk27. Our findings further indicate that improving modifiable risk factors from high to low levels may prolong life expectancy by ~5.95 years and 5.35 years at age 80 for high- and low-genetic-risk individuals, respectively. Even at age 100, the corresponding gains were 1.66 and 1.52 years. Importantly, this evidence strongly supports the value of modifying risk factors even at advanced ages, demonstrating that beneficial changes remain impactful throughout later life. Adopting a holistic strategy that addresses multiple modifiable risks simultaneously is likely to maximize the potential for achieving exceptional longevity39.
The principal strengths of our study include its prospective design, large centenarian cohort, and the comprehensive genotyping and phenotyping data for innovative examination of genetic-modifiable risk interactions. Our modifiable risk factor score incorporates socioeconomic, psychosocial, lifestyle, and metabolic factors, providing a more holistic risk assessment than previous instruments. The extended age range to the oldest old and mortality recorded for the majority of participants enable unique insights into longevity determinants in the most advanced age groups.
Several limitations warrant consideration. First, the PRS was derived from an internal genome-wide association study within a subset of our cohort, and the MRFS was weighted using coefficients estimated from the same population. This approach may introduce overfitting and potentially inflate effect estimates. Although internal cross-validation of the PRS and sensitivity analyses were performed, the reported associations may still be subject to optimistic bias. Second, the reliance on baseline self-reported measures may introduce misclassification bias, potentially leading to either overestimation or underestimation of the true associations. Third, despite adjusting for major chronic diseases at baseline and excluding participants with a history of disease (CVD, stroke, or cancer), we cannot completely rule out reverse causality; and although numerous potential confounders were adjusted for, residual confounding by unmeasured factors such as healthcare access may persist. Fourth, the recruitment of participants from a single Chinese province may limit generalizability to other populations, necessitating validation in more diverse ethnic and geographical cohorts. Finally, the limited number of cause-specific mortality events prevented investigation of cause-specific mortality associations.
Our study reveals that both genetic and modifiable risk factors were independently associated with lifespan in the oldest-old. Notably, unfavorable modifiable risk factors significantly attenuated the benefit of a genetic predisposition to a longer lifespan. Public health policies for controlling modifiable risk factors remain the paramount strategy for further extending life expectancy, regardless of an individual’s genetic endowment.
Methods
Study design, participants and assessment
CHCCS is a prospective, community-based cohort study designed to investigate determinants of healthy aging and longevity40. The present analysis utilizes data from the prospective follow-up of this cohort. Between June 2014 and December 2016, potential participants were identified across Hainan Province, China. The target population comprised two age groups: 1) all identified living centenarians (aged 100 years and above), and 2) a community-based sample of oldest-old adults (aged 80 to 99 years). The latter group was established using a stratified random sampling approach based on gender, age, and geographical distribution to approximate the provincial oldest-old population structure. Participants with incomplete identity documentation, unwillingness to comply with study protocols, or age verification discrepancies were excluded. The final cohort comprised 1800 oldest-old individuals, including 1002 centenarians and 798 oldest old. Written informed consent was provided by all participants prior to enrollment. The study protocol received approval from the Ethics Committee of the Hainan Hospital of the Chinese People’s Liberation Army General Hospital (301hn11-2016-01). All procedures conformed to institutional ethical guidelines and the principles of the Declaration of Helsinki.
Baseline assessments included a comprehensive, face-to-face interviewer-administered questionnaire, standardized clinical evaluations, and collection of biological specimens. Trained staff collected data on sociodemographic, lifestyle, medical history, and mental health. Clinical measures included blood pressure and anthropometry. Fasting venous blood samples were collected, transported under cold chain conditions, and processed on the same day for biochemical and genomic analyses.
Genotyping, quality control and imputation
Genotyping was performed on 1608 participants using the Infinium Asian Screening Array (ASA) BeadChip (Illumina, 2017), a specialized microarray designed for large-scale genetic and pharmacogenomic investigations in East and Southeast Asian populations. This array interrogates 659,184 markers simultaneously across 24 samples and allows for the customization of up to 50,000 bead types.
Sample-level quality control was conducted according to the following criteria: (1) exclusion of three individuals with a genotype call rate below 98%; (2) removal of two individuals due to sex discrepancies; (3) exclusion of 15 individuals exhibiting heterozygosity rates beyond ±3 SD; and (4) removal of two outliers identified via principal component analysis (PCA), defined as those deviating beyond ±6 SD from the central population cluster. After applying these filters, 1586 samples were retained for subsequent analysis.
Variant-level quality control was carried out using the following exclusion thresholds: (1) 12,441 single-nucleotide polymorphisms (SNPs) were removed due to a call rate below 98%; (2) 188,289 SNPs were excluded for exhibiting a minor allele frequency (MAF) below 1%; and (3) 1,064 SNPs were discarded for significant deviation from Hardy–Weinberg equilibrium (HWE; p-value < 1 × 10⁻⁴). Following this filtering, 469,137 out of 704,415 variants remained.
Genotype imputation was performed using the 1000 Genomes Project East Asian reference panel (Phase 3, October 2014 release; n = 2504). Pre-phasing was executed with SHAPEIT v2.5, and imputation was conducted using IMPUTE2 v2.3.1. Only SNPs with an imputation quality score (R²) > 0.9 were retained. Additional post-imputation filtering was applied to exclude SNPs with a missing rate > 5%, MAF < 0.01, or HWE p-value < 1 × 10⁻⁴, and to remove individuals exhibiting heterozygosity rates beyond ±3 SD or relatedness (PI_HAT > 0.1875) as determined by identity-by-descent (IBD) analysis.
After excluding individuals who had no genetic data, failed genetic quality control, or had missing phenotype data, a final dataset comprising 2,831,063 SNPs from 1545 individuals was included in the main analysis. A detailed flowchart outlining the quality control procedures and participant selection is provided in Supplementary Fig. 1.
Modifiable risk factors score
A weighted MRFS was constructed based on 11 modifiable risk factors encompassing socioeconomic and psychosocial factors (education and depression), behavioral factors (smoking, alcohol consumption, physical activity, dietary patterns, and sleep duration), and metabolic factors (BMI, diabetes, hypertension, and dyslipidemia). The selection and definition of these factors were based on extensive epidemiological evidence supporting their potential causal influence on longevity and mortality13,16,17, as well as their modifiable nature and availability within the CHCCS dataset. Each factor was dichotomized and assigned a binary value (risk present vs. risk absent). The “risk” category was defined by conditions or behaviors considered amenable to modification toward a more favorable health profile. Operational definitions for each risk factor are detailed in Supplemental Table 1.
An unweighted MRFS was first constructed by summing the number of risk factors present, yielding a range from 0 to 11. Nonlinear associations of GDS score, sleep duration, and BMI with mortality were observed (Supplementary Fig. 2). Subsequently, a weighted modifiable risk factor score was derived based on the β coefficients obtained from a multivariable Cox proportional hazards model, adjusted for age, sex, marital status, occupation, the first three genetic principal components, and a history of cardiovascular disease, stroke, or cancer. The full set of coefficients is provided in Supplemental Table 2. The weighted MRFS was computed by summing the product of each risk indicator and its respective β coefficient. Higher scores indicated a less favorable risk profile.
\({\rm{MRFS}}=\sum _{i=1}^{11}{{\rm{\beta }}}_{i}{{\rm{x}}}_{i}\)
where \({{\rm{x}}}_{i}\) represents each modifiable risk factor (1 = unhealthy, 0 = healthy) and \({{\rm{\beta }}}_{i}\) represents the corresponding β coefficient. Participants were classified into low (lowest tertile), medium (middle tertile), and high (highest tertile) risk groups based on the resulting score distribution.
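The weighted score and tertile grouping described above can be sketched as follows. This is a minimal illustration rather than the study code: the three β values and the six participants are hypothetical (the study used 11 factors with coefficients from Supplemental Table 2), and the exact handling of values lying on a tertile boundary is an assumption.

```python
import statistics

def weighted_mrfs(indicators, betas):
    """Sum of beta_i * x_i over binary risk indicators (1 = risk present)."""
    return sum(b * x for b, x in zip(betas, indicators))

def tertile_group(score, cut1, cut2):
    """Assign low / medium / high using the two tertile cut points."""
    if score < cut1:
        return "low"
    if score < cut2:
        return "medium"
    return "high"

# Hypothetical coefficients for three factors, for illustration only.
betas = [0.30, 0.10, 0.20]
participants = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1], [0, 1, 0]]

scores = [weighted_mrfs(p, betas) for p in participants]
cut1, cut2 = statistics.quantiles(scores, n=3)  # two cut points for tertiles
groups = [tertile_group(s, cut1, cut2) for s in scores]
print(list(zip(scores, groups)))
```

Setting every β to 1 reduces the same function to the unweighted score, which is simply the count of risk factors present.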
Polygenic risk score
Attempt to replicate published variants
We first attempted to construct PRS using external, published summary statistics. Two primary sources were used: 1) summary statistics from the largest published GWAS meta-analysis for longevity7, based on survival to the 90th and 99th percentiles; and 2) a set of 12 longevity-associated SNPs reported from the Chinese Longitudinal Healthy Longevity Survey8. The discriminatory ability of PRS derived from these external sources between centenarians and younger controls within our cohort was poor (AUC < 0.55, Supplementary Fig. 3).
Possible reasons for this lack of performance may be as follows:
(1) Population Specificity: The majority of previously reported longevity-associated variants were discovered in populations of European ancestry. Genetic architectures and allele frequencies may differ in our Han Chinese cohort, leading to reduced predictive power. (2) Age-Effect Attenuation: The protective effect of certain genetic variants might be most influential for surviving to middle or early-old age. In an “oldest-old” cohort (all participants aged 80 years and older), individuals with low genetic predisposition may have already been filtered out by mortality, reducing the discriminative power of the PRS among these “survivors.” (3) Cohort Specificity: Our cohort is from Hainan, a renowned longevity region with unique environmental and potentially genetic factors, which might not be fully captured by the existing PRS. (4) Limited Variant Set: The external PRS was constructed from a limited set of previously reported GWAS-significant variants. A more comprehensive PRS incorporating a larger number of variants (even with sub-threshold significance) or using advanced methods might improve performance.
Internal genome-wide association study for longevity
Given the limited transferability of published scores, we performed an internal discovery GWAS to identify genetic variants associated with extreme longevity in our specific population. A nested case-control analysis was conducted using a sub-sample of 1099 individuals: 922 centenarians (cases, aged 100+) and 177 controls who had died before reaching age 100 (baseline age 80–99). Individuals who were alive but not yet 100 at the last follow-up (n = 446) were excluded from this discovery analysis to ensure a clear phenotypic contrast (Supplementary Fig. 1).
Association testing was performed using logistic regression under an additive genetic model in PLINK v1.9, adjusting for age, sex, and the first three genetic principal components. P-values and odds ratios (ORs) with 95% confidence intervals (CIs) were generated for each variant. The genomic control inflation factor (λgc) was 1.034, within the acceptable range of 0.95–1.05, indicating minimal residual population stratification and adequate control of confounding. Quantile–quantile and Manhattan plots were generated using the R package “CMplot”. The Manhattan plot revealed 85 loci associated with longevity at the suggestive significance threshold (p < 1×10⁻⁵), indicating polygenic influences on a longer lifespan (Supplementary Fig. 4).
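For readers unfamiliar with the genomic control inflation factor, the sketch below shows one common way to compute it from association P-values: convert each P-value to a 1-degree-of-freedom chi-square statistic and divide the median by the expected median under the null (about 0.455). This is an illustrative calculation using only the Python standard library, not the pipeline used in the study, and the input P-values are synthetic.

```python
import statistics
from statistics import NormalDist

# Median of the chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Return lambda_GC from two-sided P-values, each in the interval (0, 1]."""
    std_normal = NormalDist()
    # A 1-df chi-square statistic is the square of the corresponding z-score.
    chi2 = [std_normal.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN

# Synthetic, evenly spaced P-values mimic a well-calibrated null distribution.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_p)
print(round(lambda_null, 3))
```

Values close to 1 indicate that the bulk of test statistics behave as expected under the null, which is how the reported value of 1.034 is interpreted.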
Functional annotation and enrichment analysis
Loci reaching the suggestive significance threshold (p < 1×10⁻⁵) in the GWAS were annotated using the SNP2GENE function in FUMA41. Linkage disequilibrium (LD) clumping (window size = 250 kb, r² < 0.1) revealed 13 independent genomic risk loci. These variants demonstrated significant enrichment in 3’ UTR regions, suggesting potential roles in gene expression regulation. Through positional mapping (10-kb window), we identified 18 candidate genes: PPP1R12B, KBTBD8, LPHN3, PKD2, SLC7A7, AL135998.1, MRPL52, SHISA9, ABR, CTD-2535L24.2, ZIM2, PEG3, HIRA, C22orf39, UFD1L, CDC45, and CLDN5. Notably, PEG3 and ZIM2 are located within the 19q13.32 chromosomal region containing the APOE gene cluster, a well-established longevity-associated locus (Supplementary Fig. 5).
Functional enrichment analysis was performed using the GENE2FUNC procedure against GTEx v8 gene expression sets across 54 tissues. While no results survived multiple testing correction, we observed a non-significant trend toward elevated expression of candidate genes in digestive and cerebellar tissues. Further examination of positional gene sets and transcription factor targets revealed significant enrichment (FDR-corrected p < 0.05) for genes associated with 22q11.2 deletion syndrome, suggesting potential links between longevity mechanisms and embryonic developmental processes. Additionally, several genes showed strong regulation by core cell cycle transcription factors, including E2F, indicating that cellular proliferation, DNA replication, and repair mechanisms may contribute substantially to lifespan determination (Supplementary Fig. 6).
Polygenic risk score
The summary statistics from the internal GWAS described above served as the base data for constructing a cohort-specific PRS. We used PRSice-2 with the clumping and thresholding (C + T) method42. Variants were filtered for imputation quality (INFO score > 0.9). Linkage disequilibrium (LD) clumping was applied (window size = 250 kb, r² < 0.1) to retain independent variants. A range of p-value thresholds (1 × 10⁻⁵ to 1) was evaluated. Analyses were restricted to autosomal SNPs and adjusted for sex and the first three principal components.
Eleven genetic variants meeting the suggestive significance threshold (p < 1×10⁻⁵) were selected for the final PRS (Supplementary Table 3).
The score was calculated for each individual using the formula:
\({{PRS}}_{j}={\sum }_{i}{S}_{i}\times {G}_{{ij}}\)
where \({S}_{i}\) is the effect size of SNP \(i\) and \({G}_{{ij}}\) is the number of effect alleles observed for SNP \(i\) in sample \(j\). The primary PRS was calculated for all 1545 participants. Participants were stratified at the median PRS value into a low-PRS group (≤ median; higher genetic risk, that is, weaker genetic predisposition to longevity) and a high-PRS group (> median; lower genetic risk).
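The weighted-sum score and the median split can be sketched as follows. This is a minimal Python illustration: the three SNP weights, the participant identifiers, and the genotypes are hypothetical, and PRSice-2 may apply additional scaling that is not shown here.

```python
from statistics import median

def polygenic_score(effect_sizes, allele_counts):
    """Weighted sum over SNPs: sum_i S_i * G_ij for one participant j."""
    return sum(s * g for s, g in zip(effect_sizes, allele_counts))

# Hypothetical effect sizes (log odds ratios) for three SNPs.
weights = [0.20, -0.10, 0.35]
# Hypothetical effect-allele counts (0, 1, or 2) per participant.
genotypes = {"p1": [0, 1, 2], "p2": [2, 2, 0], "p3": [1, 0, 1], "p4": [0, 0, 0]}

scores = {pid: polygenic_score(weights, g) for pid, g in genotypes.items()}
cut = median(scores.values())
# Scores above the median form the high-PRS group; the rest form the low-PRS group.
groups = {pid: ("high PRS" if s > cut else "low PRS") for pid, s in scores.items()}
```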
Internal validation
To mitigate concerns of overfitting and the “winner’s curse”, we performed 10-fold cross-validation within the discovery sample (n = 1099). The cohort was randomly split into 10 folds. In each iteration, GWAS was performed on 9 folds to derive SNP weights, which were then used to calculate the PRS in the held-out 10th fold. The process was repeated across all folds. The average AUC for the cross-validated PRS for predicting the centenarian vs. control status was 0.62 (95% CI: 0.58–0.66), indicating modest but stable discriminative ability (Supplementary Fig. 7). This result suggests that while the score is cohort-specific, the overfitting bias is likely limited.
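Two building blocks of such a cross-validation can be sketched in Python: a random partition of participants into folds, and the AUC computed as the probability that a randomly chosen case has a higher score than a randomly chosen control. The function names and the seed are hypothetical, and the per-fold GWAS that re-estimates the SNP weights is not shown.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def auc(case_scores, control_scores):
    """Share of case-control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(case_scores) * len(control_scores))

# Ten folds over a sample of 1099 participants.
folds = k_fold_indices(1099, 10)

# Hypothetical held-out scores for one fold.
example_auc = auc([3.0, 2.0], [1.0, 2.0])
```

In a full analysis, the AUC would be computed once per held-out fold and then averaged across the ten folds.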
Covariates
Covariates were selected based on established associations with longevity in the existing literature, including: age (modeled as a continuous variable in years), sex (male/female), marital status (married/others), occupational category (manual/non-manual labor), the first three genetic principal components (to account for population stratification), and a history of cardiovascular disease (CVD), stroke, or cancer. Missing data were handled using distinct strategies by variable type. For categorical variables, a separate “missing” indicator category was created. Missing values for continuous variables were imputed using sex-specific means.
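The two missing-data strategies can be illustrated with a minimal Python sketch. The records, field names, and category label below are hypothetical and serve only to show the mechanics.

```python
from statistics import mean

def impute_sex_specific_mean(records, field):
    """Fill None in a continuous field with the mean observed within the same sex."""
    means = {}
    for sex in {r["sex"] for r in records}:
        observed = [r[field] for r in records if r["sex"] == sex and r[field] is not None]
        means[sex] = mean(observed)  # raises if a sex has no observed values
    return [
        {**r, field: means[r["sex"]] if r[field] is None else r[field]}
        for r in records
    ]

def add_missing_category(records, field, label="missing"):
    """Recode None in a categorical field as its own explicit category."""
    return [{**r, field: label if r[field] is None else r[field]} for r in records]

# Hypothetical records with gaps in one continuous and one categorical variable.
rows = [
    {"sex": "F", "age": 101.0, "occupation": "manual"},
    {"sex": "F", "age": None, "occupation": None},
    {"sex": "F", "age": 95.0, "occupation": "non-manual"},
    {"sex": "M", "age": 88.0, "occupation": "manual"},
    {"sex": "M", "age": None, "occupation": "manual"},
]
clean = add_missing_category(impute_sex_specific_mean(rows, "age"), "occupation")
```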
Outcomes
Mortality status was determined through linkage with the official death registry maintained by the Public Security Department and corroborated by records from the Hainan Provincial Committee on Aging. Vital status was further verified via telephone follow-up through March 31, 2023. Person-years of follow-up were calculated for each participant from the date of study enrollment until the date of death, the date of the last successful follow-up contact, or the end of the study period, whichever occurred first. Survival time was defined as the duration from the baseline assessment until the occurrence of death or censoring.
Life expectancy and 95% CIs were estimated using flexible parametric survival models of the Royston-Parmar type on the log cumulative hazard scale, with 3 degrees of freedom used for the baseline restricted cubic spline function and follow-up time as the timescale. Years of life gained or lost were calculated as differences in the area under the survival curves.
The calculation of years of life gained (difference in average life expectancy) involved a three-step process:
First, flexible parametric survival models with follow-up time as the timescale were applied to estimate life expectancy and the corresponding confidence intervals43.
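In the spline-based survival model of Royston and Parmar44, a transformation \(g(S(t,z))\) of the survival function is modelled as a natural cubic spline function of log time \(x=\log (t)\) plus linear effects of covariates \(z\):
\(g(S(t,z))=s(x,\gamma )+{\beta }^{T}z\)
where \(s(x,\gamma )\) is the spline function with coefficients \(\gamma\) and \(\beta\) is the vector of covariate effects. The proportional hazards model defines \(g\) as the log cumulative hazard:
\(g(S(t,z))=\log (-\log (S(t,z)))=\log (H(t,z))\)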
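To make the log cumulative hazard formulation concrete, the Python sketch below evaluates the simplest case, in which the spline has no internal knots and is therefore linear in log time (a Weibull model). The parameter values are hypothetical; the study's models used 3 degrees of freedom for the spline.

```python
import math

def survival(t, z, gamma0, gamma1, beta):
    """S(t, z) when log H(t, z) = gamma0 + gamma1 * log(t) + beta * z."""
    log_cum_hazard = gamma0 + gamma1 * math.log(t) + beta * z
    # Invert log(-log S) = log H to recover the survival probability.
    return math.exp(-math.exp(log_cum_hazard))

# Hypothetical parameters: z = 1 marks a group with a lower hazard (beta < 0).
s_ref = survival(5.0, 0, gamma0=-3.0, gamma1=1.2, beta=-0.5)
s_grp = survival(5.0, 1, gamma0=-3.0, gamma1=1.2, beta=-0.5)
# Under proportional hazards, the cumulative hazard ratio equals exp(beta) at every t.
hazard_ratio = math.log(s_grp) / math.log(s_ref)
```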
Second, remaining life expectancy was estimated as the area under the survival curve up to 120 years old, conditional on survival to each age from 80 to 100 years (1-year intervals); survival curves were predicted for each individual and averaged over individuals:
\(e(a)=E(T-a\mid T > a)=\frac{1}{S(a)}{\int }_{a}^{\infty }(t-a)f(t){dt}=\frac{1}{S(a)}{\int }_{a}^{\infty }S(t){dt}\)
where \(e\left(a\right)\) denotes remaining life expectancy starting from age \(a\); \(T\) denotes the lifespan random variable of an individual; \(a\) denotes the starting (current) age; \(T > a\) is the condition that the individual has survived to age \(a\); \(f(t)\) denotes the mortality density function, that is, the instantaneous probability density of death at time \(t\); and \(S(t)\) denotes the survival function, that is, the probability of surviving to time \(t\), where \(S\left(t\right)=P(T > t)\). In practice, the integral was truncated at age 120.
Third, years of life gained were calculated as the difference between the areas under two survival curves. All the analyses were conducted using R package “flexsurv”45.
\({{YLG}}_{i}(a)=\frac{1}{{S}_{i}(a)}{\int }_{a}^{120}{S}_{i}(t){dt}-\frac{1}{{S}_{{ref}}(a)}{\int }_{a}^{120}{S}_{{ref}}(t){dt}\)
where \({S}_{i}\left(a\right)\) and \({S}_{{ref}}\left(a\right)\) represent the survival probabilities at age \(a\) of participants in category \(i\) and in the reference category (high MRFS and low PRS), respectively. By applying the formula of integration by parts in yearly intervals, we estimated the years of life gained from age 80 up to age 100. The diagnostic plots for the flexible parametric survival models demonstrate adequate model fit (Supplementary Fig. 8).
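The area-under-the-curve calculation can be sketched in Python as follows. This is a minimal illustration that assumes the trapezoidal rule for the yearly-interval integration and uses hypothetical Gompertz survival curves in place of the fitted flexible parametric models.

```python
import math

def remaining_life_expectancy(surv, start_age, max_age=120):
    """Trapezoidal area under S(t) / S(start_age) in 1-year steps up to max_age."""
    s0 = surv(start_age)
    area = 0.0
    for age in range(start_age, max_age):
        area += 0.5 * (surv(age) + surv(age + 1)) / s0
    return area

def gompertz(rate_at_80, growth):
    """Hypothetical Gompertz survival curve on the age scale with S(80) = 1."""
    def surv(age):
        return math.exp(-(rate_at_80 / growth) * (math.exp(growth * (age - 80)) - 1.0))
    return surv

# Hypothetical curves: the favourable group has a lower mortality rate at age 80.
reference = gompertz(0.10, 0.08)
favourable = gompertz(0.06, 0.08)

le_ref = remaining_life_expectancy(reference, 80)
le_fav = remaining_life_expectancy(favourable, 80)
years_gained = le_fav - le_ref
```

Repeating the calculation with `start_age` set to 81, 82, and so on gives the years of life gained conditional on survival to each later age.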
Statistical analysis
Baseline characteristics were summarized using frequencies and proportions (%) for categorical variables and means ± SD for continuous variables. Mortality rates (per 1000 person-years) with 95% confidence intervals (CIs) were estimated using Poisson regression. Associations between MRFS and PRS were evaluated using Spearman’s rank correlation.
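For a single group, an intercept-only Poisson model with log person-years as the offset has a closed-form solution: the rate is deaths divided by person-years, and the standard error of the log rate is one over the square root of the number of deaths. The minimal Python sketch below applies this with hypothetical counts and a Wald interval on the log scale.

```python
import math

def mortality_rate_per_1000(deaths, person_years, z=1.96):
    """Rate per 1000 person-years with a Wald 95% CI computed on the log scale."""
    rate = deaths / person_years * 1000.0
    se_log = 1.0 / math.sqrt(deaths)  # SE of the log rate under a Poisson model
    return rate, rate * math.exp(-z * se_log), rate * math.exp(z * se_log)

# Hypothetical counts: 100 deaths over 2000 person-years.
rate, lower, upper = mortality_rate_per_1000(deaths=100, person_years=2000.0)
```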
Associations of MRFS and PRS with all-cause mortality were assessed using Cox proportional hazards models, with follow-up time as the time scale and vital status (dead/alive) as the outcome. MRFS was analyzed both as a categorical variable (using the high-risk group as reference) and as a continuous variable (to test for linear trend). Similarly, PRS was modeled as a categorical variable (with the low-PRS, higher genetic risk group as reference) and continuously. The primary model was adjusted for age, sex, marital status, occupation, history of cardiovascular disease, stroke, or cancer, and the first three genetic principal components. Fully adjusted models included mutual adjustment for MRFS and PRS.
Potential non-linear relationships were examined using restricted cubic splines with three knots. Combined effects of MRFS and PRS were evaluated by creating a composite variable, with the group having high MRFS and low PRS serving as the reference. Multiplicative interaction was tested using likelihood ratio tests comparing models with and without an interaction term. Additive interaction was assessed using the relative excess risk due to interaction (RERI), attributable proportion (AP), and synergy index (SI). To control for multiple comparisons, we applied a Bonferroni correction to primary pre-specified analyses.
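The three additive interaction measures follow standard formulas based on the hazard ratios of the joint-exposure strata relative to a doubly unexposed reference. The Python sketch below uses hypothetical hazard ratios and assumes both exposures are coded so that they increase risk.

```python
def additive_interaction(hr10, hr01, hr11):
    """Return (RERI, AP, SI) from stratum hazard ratios versus the doubly unexposed group.

    hr10: exposed to factor A only; hr01: exposed to factor B only; hr11: exposed to both.
    """
    reri = hr11 - hr10 - hr01 + 1.0                     # relative excess risk due to interaction
    ap = reri / hr11                                    # attributable proportion
    si = (hr11 - 1.0) / ((hr10 - 1.0) + (hr01 - 1.0))   # synergy index
    return reri, ap, si

# Hypothetical hazard ratios, not study estimates.
reri, ap, si = additive_interaction(hr10=2.0, hr01=3.0, hr11=5.0)
```

A RERI and AP of 0 and an SI of 1 correspond to no additive interaction.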
Sensitivity analyses included: (1) using lifespan (chronological age at follow-up) as the time scale in Cox models; (2) using an unweighted count of modifiable risk factors instead of the weighted MRFS; (3) constructing PRS using alternative p-value thresholds (10⁻⁴, 10⁻³, 10⁻², 0.05); (4) excluding deaths occurring within the first year after baseline; and (5) conducting stratified analyses by age (80–99 and ≥100 years), sex (female and male), marital status (married and other), occupation (manual and non-manual), and history of disease (yes and no).
The proportional hazards assumption was verified using Schoenfeld residuals (global P > 0.05). All tests were two-sided, with P < 0.05 considered statistically significant for uncorrected analyses and P < 0.0167 (0.05/3) for primary analyses after Bonferroni correction. Analyses were performed using R version 4.4.2.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to privacy or ethical restrictions, but are available from the corresponding author on reasonable request.
References
United Nations Department of Economic and Social Affairs, Population Division. World Population Prospects 2024: Summary of Results. Report No. UN DESA/POP/2024/TR/NO. 9 (2024).
National Health Commission of the People’s Republic of China. China’s average life expectancy rises to 79, <https://en.nhc.gov.cn/2025-09/12/c_86518.htm> (2025).
National Bureau of Statistics of China. Report of The Seventh National Population Census (Beijing, 2021).
China Development Research Foundation. China Development Report 2020: Trends and Policies of China’s Population Aging (Beijing, 2020).
van den Berg, N., Beekman, M., Smith, K. R., Janssens, A. & Slagboom, P. E. Historical demography and longevity genetics: back to the future. Ageing Res. Rev. 38, 28–39 (2017).
McDaid, A. F. et al. Bayesian association scan reveals loci associated with human lifespan and linked biomarkers. Nat. Commun. 8, 15842 (2017).
Deelen, J. et al. A meta-analysis of genome-wide association studies identifies multiple longevity genes. Nat. Commun. 10, 3669 (2019).
Liu, X. et al. Integrated genetic analyses revealed novel human longevity loci and reduced risks of multiple diseases in a cohort study of 15,651 Chinese individuals. Aging Cell 20, e13323 (2021).
Bae, H. et al. A genome-wide association study of 2304 extreme longevity cases identifies novel longevity variants. Int. J. Mol. Sci. 24. https://doi.org/10.3390/ijms24010116 (2022).
Tesi, N. et al. Polygenic risk score of longevity predicts longer survival across an age continuum. J. Gerontol. A Biol. Sci. Med. Sci. 76, 750–759 (2021).
Revelas, M. et al. High polygenic risk score for exceptional longevity is associated with a healthy metabolic profile. Geroscience 45, 399–413 (2023).
Magnussen, C. et al. Global effect of modifiable risk factors on cardiovascular disease and mortality. N. Engl. J. Med. 389, 1273–1285 (2023).
Yusuf, S. et al. Modifiable risk factors, cardiovascular disease, and mortality in 155 722 individuals from 21 high-income, middle-income, and low-income countries (PURE): a prospective cohort study. Lancet 395, 795–808 (2020).
Joseph, P. et al. Cardiovascular disease, mortality, and their associations with modifiable risk factors in a multi-national South Asia cohort: a PURE substudy. Eur. Heart J. 43, 2831–2840 (2022).
Li, S. et al. Modifiable risk factors associated with cardiovascular disease and mortality in China: a PURE substudy. Eur. Heart J. 43, 2852–2863 (2022).
Tian, F. et al. Ranking age-specific modifiable risk factors for cardiovascular disease and mortality: evidence from a population-based longitudinal study. EClinicalMedicine 64, 102230 (2023).
Wang, T. et al. Age-specific modifiable risk factor profiles for cardiovascular disease and all-cause mortality: a nationwide, population-based, prospective cohort study. Lancet Reg. Health West Pac. 17, 100277 (2021).
Chen, X. et al. The path to healthy ageing in China: a Peking University-Lancet Commission. Lancet 400, 1967–2006 (2022).
Burgess, S. & Zhou, A. Genetic predictors of traits in elderly subjects: risk of survival bias and reverse causation. Eur. Heart J. 45, 2155–2157 (2024).
Cocchi, C., Zazzara, M. B., Levati, E., Calvani, R. & Onder, G. How to promote healthy aging across the life cycle. Eur. J. Intern. Med. 135, 5–13 (2025).
World Health Organization (WHO). Decade of Healthy Ageing: Plan of Action (2020).
United Nations. United Nations Sustainable Development Goals (SDGs).
Lu, J. et al. Association of social determinants, lifestyle, and metabolic factors with mortality in Chinese adults: a nationwide 10-year prospective cohort study. Cell Rep. Med. 5, 101656 (2024).
Zhu, Y. et al. Ambient air pollution, lifestyle, and genetic predisposition on all-cause and cause-specific mortality: a prospective cohort study. Sci. Total Environ. 933, 173120 (2024).
Look AHEAD Research Group. Prospective association of a genetic risk score and lifestyle intervention with cardiovascular morbidity and mortality among individuals with type 2 diabetes: the Look AHEAD randomised controlled trial. Diabetologia 58, 1803–1813 (2015).
Walter, S. et al. Genetic, physiological, and lifestyle predictors of mortality in the general population. Am. J. Public Health 102, e3–e10 (2012).
Wang, J. et al. Healthy lifestyle in late-life, longevity genes, and life expectancy among older adults: a 20-year, population-based, prospective cohort study. Lancet Healthy Longev. 4, e535–e543 (2023).
Bian, Z. et al. Genetic predisposition, modifiable lifestyles, and their joint effects on human lifespan: evidence from multiple cohort studies. BMJ Evid. Based Med. 29, 255–263 (2024).
Zhang, J. et al. Relation of Life’s Essential 8 to the genetic predisposition for cardiovascular outcomes and all-cause mortality: results from a national prospective cohort. Eur. J. Prev. Cardiol. 30, 1676–1685 (2023).
Argentieri, M. A. et al. Integrating the environmental and genetic architectures of aging and mortality. Nat. Med. 31, 1016–1025 (2025).
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194–1217 (2013).
Sens, D. et al. Genetics-driven risk predictions leveraging the Mendelian randomization framework. Genome Res. 34, 1276–1285 (2024).
Zhang, Y. et al. PPP1R12B inhibits cell proliferation by inducing G0/G1 phase arrest via PAK2/β-catenin axis in hepatocellular carcinoma. Front. Cell Dev. Biol. 13, 1621705 (2025).
Lin, J. R. et al. Rare genetic coding variants associated with human longevity and protection against age-related diseases. Nat. Aging 1, 783–794 (2021).
Xu, X., Pang, Y. & Fan, X. Mitochondria in oxidative stress, inflammation and aging: from mechanisms to therapeutic advances. Signal Transduct. Target Ther. 10, 190 (2025).
García, I. E. & Contreras, J. E. Decoding connexin hemichannels: structure, function, and regulatory mechanisms. Annu. Rev. Physiol. https://doi.org/10.1146/annurev-physiol-050724-010008 (2025).
Sakaniwa, R. et al. Impact of modifiable healthy lifestyle adoption on lifetime gain from middle to older age. Age Ageing 51. https://doi.org/10.1093/ageing/afac080 (2022).
López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. Hallmarks of aging: an expanding universe. Cell 186, 243–278 (2023).
He, Y. et al. Cohort profile: the China Hainan centenarian cohort study (CHCCS). Int. J. Epidemiol. 47, 694–695h (2018).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Choi, S. W. & O’Reilly, P. F. PRSice-2: polygenic risk score software for biobank-scale data. Gigascience 8. https://doi.org/10.1093/gigascience/giz082 (2019).
Dehbi, H. M., Royston, P. & Hackshaw, A. Life expectancy difference and life expectancy ratio: two measures of treatment effects in randomised trials with non-proportional hazards. BMJ 357, j2250 (2017).
Royston, P. & Parmar, M. K. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat. Med. 21, 2175–2197 (2002).
Jackson, C. H. flexsurv: a platform for parametric survival modeling in R. J. Stat. Softw. 70. https://doi.org/10.18637/jss.v070.i08 (2016).
Acknowledgements
The most important acknowledgment is to the staff and participants from the China Hainan Centenarian Cohort Study (CHCCS). This work was supported by the National Natural Science Foundation of China (82574176, 82173589 & 82173590), National Key Research and Development Program of China (2022YFC2503605), National Science and Technology Major Project of China (2023ZD0500901), Beijing Natural Science Foundation (7252181), and Capital’s Funds for Health Improvement and Research (2024-2G-5033 & 2022-2G-5031). The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.
Author information
Contributions
Miao Liu and Yao He had full access to all data in this study and take responsibility for the integrity of the data and the accuracy of the data analysis. Concept and design: Shimin Chen, Ke Han. Acquisition, analysis, or interpretation of data: All authors. Drafting of the manuscript: Shimin Chen, Shengshu Wang. Critical revision of the manuscript for important intellectual content: Shanshan Yang, Xiaoxin Ye, Yanhao Wan, Jianhua Wang, Yao He, Miao Liu. Statistical analysis: Shimin Chen, Ke Han, Shengshu Wang, Yueting Shi, Wenchang Wang, Shengyan Du. Obtained funding: Yao He, Miao Liu. Administrative, technical, or material support: Ke Han, Shengshu Wang, Jianhua Wang. Supervision: Miao Liu, Yao He.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, S., Han, K., Wang, S. et al. Modifiable risk factors attenuated longevity genetic predisposition on life expectancy in the oldest old. npj Aging 12, 66 (2026). https://doi.org/10.1038/s41514-026-00393-7