Introduction

Cardiovascular disease (CVD) remains the leading cause of premature death globally, with coronary heart disease (CHD) accounting for the largest share of disability-adjusted life years among non-communicable diseases1,2. Although the associations between several well-established modifiable risk factors—such as elevated blood pressure, dyslipidemia, obesity, smoking, and diabetes—and CHD have been well documented, the decline in age-standardized CHD incidence has plateaued in recent years1,3,4,5. Alarmingly, the CHD burden is now increasing among younger adults (aged 20–54 years), highlighting the urgent need for more effective preventive strategies targeting modifiable risk factors in this demographic1,5,6,7,8.

As the range of exposures linked to cardiovascular outcomes continues to expand9, traditional hypothesis-driven studies, while informative, are limited by their constrained focus, susceptibility to inflated effect sizes and type I errors, and potential for selective reporting10,11. The exposure-wide association study (EWAS) provides a hypothesis-free framework to systematically evaluate numerous modifiable risk factors simultaneously12. By capturing the combined effects across multiple domains, EWAS can reduce bias, validate known associations, and identify novel risk factors with greater precision10,13,14,15,16,17,18. This approach has been successfully applied to studies of depression10, dementia13, diabetes14, and CVD15,16,17. Notably, the use of composite risk scores within the EWAS framework enables the quantification of cumulative exposure burden, allowing for a more accurate estimation of their contribution to CHD risk.

Observed associations between modifiable factors and CHD may arise from non-causal mechanisms, including residual confounding and reverse causation9. Integrating Mendelian randomization (MR) into the EWAS framework can strengthen causal inference and help prioritize factors for intervention19. Both genetic and environmental factors contribute to the risk of early-onset CHD20, and prior studies have demonstrated interactions between lifestyle, local environment, and genetic predisposition21,22. However, little is known about how modifiable factors from other domains interact with genetic risk. Assessing interactions on an additive scale may help identify individuals who would benefit most from targeted risk factors modification23.

For the first time, we conducted an EWAS of CHD using phenotypic and genomic data from nearly 400,000 participants in the UK Biobank, with a particular focus on early-onset cases. We assessed the joint effects of modifiable factors across eight domains by constructing composite weighted scores and examined their additive interactions with polygenic risk scores (PRS) for CHD events24. Finally, we estimated domain-specific and overall population attributable fractions (PAFs) to quantify the potential impact of preventive interventions, especially among younger populations where early risk modification may yield the most significant long-term benefits (Fig. 1).

Fig. 1: Overview of the analytic design.
Fig. 1: Overview of the analytic design.The alternative text for this image may have been generated using AI.
Full size image

Analytical procedure to identify modifiable risk factors associated with incident coronary heart disease in the UK Biobank. PRS polygenic risk scores, RERI relative excess risk due to interaction, AP attributable proportion due to interaction, SI synergy index.

Results

This study included 394,579 participants (56.1% females; Supplementary Data 4), with a mean (SD) age of 56.1 (8.1) years at recruitment. After 13.9 (2.3) years of follow-up, 27,592 participants were diagnosed with CHD, with a mean (SD) age of diagnosis at 68.4 (7.6) years. This study identified 155 out of 213 modifiable factors associated with CHD (Fig. 2 and Supplementary Data 5). These factors were categorized into eight domains: socioeconomic status (SES, 7 factors), lifestyles (32 factors), early life factors (2 factors), psychosocial factors (7 factors), local environment (8 factors), blood assays (37 factors), physical measures (10 factors), and health and medical history (52 factors). Among these, 33 factors were identified as protective factors, and 122 were identified as risk factors. According to the variable importance rankings from the random forest model, the top 10 predictors were: diabetes mellitus [hazard ratio (HR) = 2.04, 95% confidence interval (CI): 1.96, 2.13, p < 2.74 × 10−265], Cystatin C (HR = 1.18, 95% CI: 1.17, 1.19, p < 2.35 × 10−04), long-standing illness, disability or infirmity (HR = 1.73, 95% CI: 1.69, 1.78, p < 2.35 × 10−04), Overall health rating (HR = 1.42, 95% CI: 1.36, 1.47, p = 2.87 × 10−67 for “good”, HR = 2.37, 95% CI: 2.27, 2.48, p < 2.35 × 10−04 for “fair,” HR = 4.01, 95% CI: 3.79, 4.25, p < 2.35 × 10−04 for “poor”; all compared with excellent), systolic blood pressure (HR = 1.20, 95% CI: 1.18, 1.22, p = 8.90 × 10−118), testosterone (HR = 0.84, 95% CI: 0.81, 0.88, p = 8.43 × 10−16), glycated hemoglobin (HR = 1.12, 95% CI :1.12, 1.13, p < 2.35 × 10−04), leg fat percentage (HR = 2.33, 95% CI: 2.23, 2.44, p = 4.23 × 10−300), attendance/disability/mobility allowance (HR = 2.11, 95% CI: 2.03, 2.20, p = 1.15 × 10−275), and HDL cholesterol (HR = 0.72, 95% CI: 0.70, 0.73, p = 2.30 × 10−252).

Fig. 2: Association between 213 modifiable factors and incident coronary heart disease.
Fig. 2: Association between 213 modifiable factors and incident coronary heart disease.The alternative text for this image may have been generated using AI.
Full size image

A The x-axis represents the hazard ratio value, and the y-axis represents statistical significance (that is, the 30th root of the p value). The dashed line represents the threshold after multiple testing correction (Bonferroni correction, p value < 2.35 × 10−04). The point size represents minimal depth, which was calculated by simultaneously including 213 modifiable factors and covariates in a random forest model with Cox proportional hazards regression to rank the importance of the variables; smaller minimal depth indicates greater variable importance. A set of the highest-risk factors was annotated. B The points represent hazard ratios, and the horizontal lines represent the corresponding 95% confidence interval. The rank indicates the importance of the variable in order of minimal depth. Hazard ratios were calculated using Cox proportional hazards regression analysis, adjusted for baseline age, sex, ethnic background, family history of cardiovascular disease, and assessment center (N = 394,579). Two-sided Z-tests were used to assess statistical significance. ATHI average total household income before tax, HFD/SP hands-free device/speakerphone use with mobile phone, VIT B9 Vitamin B9, HbA1c hemoglobin A1c, HDL high-density lipoprotein, LDL low-density lipoprotein, IGF-1 Insulin-like growth factor 1, MET metabolic equivalent of task, SHBG sex hormone-binding globulin. Source data are provided as a Source data file (Supplementary Data 5).

Further analyses stratified by age, sex, genetic risk for CHD, and follow-up duration revealed that >100 factors remained stably associated with CHD in most subgroups, except for the group with >10 years of follow-up, although the specific associations varied across subgroups (Fig. 3, Supplementary Data 6 and Supplementary Figs. 19). The number of modifiable factors associated with CHD increased with age, yet the effects of most factors decreased. Moreover, older populations showed greater sensitivity to local environmental factors. Variable importance rankings from the random forest model showed broadly consistent results across subgroups, with factors in health and medical history, blood assays, and physical measures consistently ranking high in importance, while the importance of factors in other domains diminished with age.

Fig. 3: Summary heat map for significant factors in EWAS analysis across the subgroups.
Fig. 3: Summary heat map for significant factors in EWAS analysis across the subgroups.The alternative text for this image may have been generated using AI.
Full size image

Models were estimated using Cox proportional hazards regression, adjusted for baseline age, sex, ethnic background, family history of cardiovascular disease, and assessment center (N = 394,579). Two-sided Z-tests were used to assess statistical significance. The color of cells indicates the effect sizes between each risk factor and incident coronary heart disease. Asterisks in cells represent significant associations after correction for multiple testing (Bonferroni-corrected, p < 2.35 × 10−04). ATHI average total household income before tax, HFD/SP hands-free device/speakerphone use with mobile phone, VIT B9 Vitamin B9, HbA1c hemoglobin A1c, HDL high-density lipoprotein, LDL low-density lipoprotein, IGF-1 Insulin-like growth factor 1, MET metabolic equivalent of task, SHBG sex hormone-binding globulin. Source data are provided as a Source data file (Supplementary Data 6).

Inverse variance-weighted MR was employed as the primary analysis after removing outliers (Fig. 4, Supplementary Data 7, and Supplementary Figs. 1114), given the observed evidence of heterogeneity (p < 0.05 for the Q test) and horizontal pleiotropy (p < 0.05 for the Egger intercept). By the threshold of p < 0.05, 68 factors showed associations with CHD. At a more stringent threshold of p < 2.35 × 10−04, 31 factors demonstrated causal relationships with CHD, including 22 factors exhibiting positive associations and 9 with negative associations. These factors spanned six domains: blood assays (13 factors, e.g., triglycerides), early life factors (2 factors, e.g., comparative body size at age 10), health and medical history (5 factors, e.g., self-reported hypertension), lifestyle behaviors (3 factors, e.g., alcohol intake frequency), physical measures (6 factors, e.g., waist circumference), and SES (2 factors, e.g., educational attainment). Results from alternative methodologies are presented in Supplementary Data 7.

Fig. 4: Mendelian randomization estimates of factors in relation to coronary heart disease risk.
Fig. 4: Mendelian randomization estimates of factors in relation to coronary heart disease risk.The alternative text for this image may have been generated using AI.
Full size image

Causal estimates were generated using the inverse variance-weighted method after removing outliers. The dashed line represents the threshold after multiple testing correction (Bonferroni correction, p value < 2.35 × 10−04). Two-sided statistical tests were performed. Dots represent ln (odds ratios). A set of the highest-risk factors was annotated. Source data are provided as a Source data file (Supplementary Data 7).

Compared to favorable profiles (Table 1), both unfavorable early life and psychosocial factors, as well as intermediate or unfavorable profiles across SES, lifestyle, local environment, blood assays, physical measurements, and health and medical history, were significantly associated with higher CHD risk. A significant increasing trend in CHD risk was observed across all eight domains (p for trend <0.001). Stratified analysis by age (<50 years, 50–60 years, and ≥60 years) revealed that the adverse effects of intermediate and unfavorable profiles across SES, lifestyles, early life factors, psychosocial factors, blood assays, as well as health and medical history diminished with increasing age (Fig. 5 and Supplementary Data 8). Additionally, a multiplicative interaction between age and the weighted score for health and medical history demonstrated that the effect of having a moderate or unfavorable health and medical history score on CHD risk gradually diminished with increasing age (p for interaction <0.05, Supplementary Data 8).

Fig. 5: Risk of incident coronary heart disease according to categories of eight domains within each age group.
Fig. 5: Risk of incident coronary heart disease according to categories of eight domains within each age group.The alternative text for this image may have been generated using AI.
Full size image

The favorable profile was set as a reference in each domain. The associations were estimated using a Cox proportional hazards regression model that included all eight domains, mutually adjusted, and with adjustments for baseline age, sex, ethnic background, family history of cardiovascular disease, and assessment center. The points represent hazard ratios (HR), and the horizontal lines represent the corresponding 95% confidence interval (CI). Two-sided statistical tests were used. N number of individuals at risk, n number of CHD cases. Source data are provided as a Source data file (Supplementary Data 8).

Table 1 Risk of incident coronary heart disease according to the categories of the eight domains

Meanwhile, participants were divided into nine groups based on tertiles of PRS and weighted scores for each domain, respectively. As shown in Fig. 6, Supplementary Data 9 and Supplementary Figs. 1517, individuals with the highest genetic risk scores and unfavorable weighted scores had substantially increased CHD risks compared with those with the lowest genetic risk scores and favorable weighted scores: 134% higher for SES, 157% for lifestyles, 94% for early life factors, 97% for psychosocial factors, 96% for local environment, 192% for blood assays, 162% for physical measures, and 240% for health and medical history. Statistically significant interactions were observed between weighted scores for lifestyles, blood assays, and health and medical history with PRS, which diminished with age (Supplementary Data 10).

Fig. 6: Risk of incident coronary heart disease according to categories of eight domains within each genetic risk category.
Fig. 6: Risk of incident coronary heart disease according to categories of eight domains within each genetic risk category.The alternative text for this image may have been generated using AI.
Full size image

Participants were categorized into nine groups based on the tertiles of the PRS, combined with domain-weighted scores. Hazard ratios were generated using a Cox model that included all eight domains, mutually adjusted, and with adjustments for baseline age, sex, ethnic background, family history of cardiovascular disease, and assessment center. The points represent hazard ratios (HR), and the horizontal lines represent the corresponding 95% confidence interval (CI). Two-sided statistical tests were used. The additive interaction results of the PRS and the weighted scores for each domain are available in Supplementary Data 9 and 10. N number of individuals at risk, n number of CHD cases. Source data are provided as a Source data file.

When shifting all unfavorable profiles to intermediate and favorable ones (Model 1), PAF estimation suggested that 39.66% of CHD cases could be prevented (Table 2), increasing to 61.89% when shifting all factors to the favorable tertile. In the conservative assessment (Model 1), the estimated maximum preventive effect was attributed to health and medical history, resulting in an 11.37% reduction in CHD incidence. The contributions of remaining domains were described as follows: blood assays (9.54%), lifestyles (5.97%), physical measures (5.70%), SES (3.70%), psychosocial factors (2.06%), early life factors (0.99%), and local environment (0.33%). In the context of a feasible implementation of comprehensive elimination of these factors, health and medical history would still account for the highest proportion of prevented CHD cases (15.75%), followed by blood assays (13.87%) and lifestyles (10.01%). Notably, implementing preventive measures in younger populations (under the age of 50), including blood indicator monitoring, healthy lifestyles, and disease prevention, could yield substantial benefits, potentially preventing 49.34–70.81% of CHD cases (Supplementary Data 1113). Two sensitivity analyses were conducted, calculating PAFs by including only factors significantly associated with CHD (p < 0.05), and further adjusting for PRS when estimating PAFs. Both approaches yielded results consistent with the main findings (Supplementary Data 14 and 15). Additionally, considering the potential temporal ordering among different domains, PAFs were also calculated using models without mutual adjustment between domains, suggesting the possibility of potential prevention of an even larger proportion (59.77–77.65%) of CHD cases (Supplementary Data 16).

Table 2 Weighted and unweighted PAFs for eight domains

Discussion

Despite substantial progress in CHD prevention in recent decades, the overall decline in CHD burden has plateaued, accompanied by a concerning rise in incidence among younger populations1,25. Here, we systematically investigated 155 modifiable risk factors across eight domains. In addition to well-established factors such as obesity, systolic blood pressure, non-high-density lipoprotein cholesterol, smoking, and diabetes, several additional contributors, including socioeconomic indicators (e.g., income, disability benefits), infections, and sex hormones, were also identified3,4,15,17. Our findings demonstrate that addressing a broader range of risk factors beyond traditional biomedical targets is essential for reducing health disparities and preventing early-onset CHD effectively.

A wide range of modifiable factors was associated with CHD risk across multiple domains. Health and medical history contributed the most to CHD incidence (PAF: 11.37–15.75%), particularly endocrine and metabolic disorders such as diabetes, obesity, and thyroid disease26,27,28. Other disease categories, including genitourinary, neurological, and psychiatric conditions, were also strongly linked to CHD, potentially through mechanisms such as metabolic dysregulation, vascular dysfunction, and autonomic imbalance29,30,31. Lifestyle factors—such as smoking, physical inactivity, poor sleep, and unhealthy dietary patterns—were associated with a 5.97–10.01% reduction in CHD risk when improved32,33. Additionally, biomarkers (e.g., cystatin C, testosterone), physical function (e.g., lung function, grip strength), and environmental exposures (e.g., air pollution) were essential contributors, likely operating through inflammatory, oxidative stress, and cardiometabolic pathways34,35. SES, psychosocial factors, and early-life exposures together accounted for 6.75–11.05% of the preventable CHD burden. These upstream determinants influence health behaviors, healthcare access, and biological vulnerability throughout the life course36,37,38.

We also incorporated MR within the framework of EWAS. However, MR identified only 31 variables that were causally associated with CHD, in contrast to the larger number of associations observed in EWAS. Several factors may explain this discrepancy. First, some instrumental variables lacked sufficient strength, which may have limited causal inference and made it difficult to rule out reverse causation39. Second, MR estimates can indicate the lifelong impact of genetic variation on disease, while longitudinal studies cover shorter timeframes, potentially leading to underestimation of results11. Third, due to the lack of suitable genetic instruments, several exposures, such as natural environment percentage, water percentage, and magnesium concentration, could not be robustly assessed within the MR framework. Because suitable instrumental variables for certain exposures were unavailable in the UKB, the present study utilized exposure genome-wide association studies (GWAS) data from the FinnGen consortium as an alternative. Potential bias arising from sample overlap between the exposure GWAS and the outcome dataset should be considered when interpreting corresponding results40. Our further examination of the interaction between genetic and modifiable factors revealed an additive interaction between the PRS for CHD and the weighted scores of the eight domains, with lifestyle, blood assays, as well as health and medical history being the most predominant ones. This addresses a limitation of previous studies, which have primarily focused on the interaction between lifestyles41, local environment42, and genetic risk. Notably, there was a gradual reduction in the effect size of this interaction with age, which may be related to survival bias, as individuals with high genetic risk and unfavorable lifestyles may be more susceptible to both CVD and premature mortality43.

In this study, we found that the strength of association between multiple modifiable factors and CHD varied significantly with age, showing an overall decreasing trend as age increased. This observation is biologically plausible and aligns with findings from prior studies44. Our analysis further extended and quantified this phenomenon from multiple perspectives. First, using an EWAS framework, we systematically compared the exposure profiles and their explanatory power for CHD risk across different age groups. Although older individuals were exposed to more complex and diverse risk factors, the independent explanatory effects of each domain appeared to be attenuated. Second, age-specific comparisons of PAFs revealed distinct priorities for preventive interventions across the lifespan. In younger populations, lifestyle factors emerged as one of the main contributors to CHD risk, underscoring substantial opportunities for lifestyle-based prevention strategies. Conversely, among older adults, the contribution of lifestyle diminished, shifting the emphasis towards integrated management of chronic diseases, particularly within the context of multimorbidity, and the preservation of functional capacity. Furthermore, we observed that the additive interaction between genetic susceptibility and environmental exposures was more pronounced in younger individuals, suggesting that gene-environment synergy in early life may play a more substantial role in CHD development22,45. Taken together, our findings highlight the considerable heterogeneity in CHD risk profiles across different age groups, underscoring the importance of developing age-specific, stratified prevention strategies from a life course perspective. For individuals with high genetic risk, targeted interventions should focus on the accumulation of modifiable risk factors. Our study not only quantified the PAFs for each domain and overall modifiable burden, but also ranked individual factors by their importance. These results suggest that effective prevention should prioritize key factors within the most influential domains to achieve precision and efficiency in reducing the burden of CHD, especially in efforts to curb the rising trend of early-onset CHD among younger and middle-aged populations.

Using a data-driven EWAS framework and large-scale data from nearly 400,000 participants, this study systematically integrated environmental, behavioral, phenotypic, and genetic risk factors. It comprehensively evaluated the PAFs of modifiable risk factors, explored their additive interactions with PRS, and examined age-related trends in these interactions. These findings provide valuable insights into the multifactorial pathogenesis of CHD and offer practical guidance for developing age-specific precision prevention strategies. However, several limitations should be acknowledged. First, the generalizability of our findings is limited by volunteer bias and the age range of UK Biobank participants (40–69 years), highlighting the need to study younger populations. Second, although most modifiable factors were objectively measured or derived from validated tools, some domains (e.g., early life factors) were sparsely represented, which may underestimate their contribution. Third, the use of Bonferroni correction, while conservative, may have masked true associations. Residual confounding may also remain despite adjustments for collinearity and domain-specific factors. Fourth, this study did not systematically assess the impact of interactions between modifiable factors or gene-environment interactions on the estimated attributable risks. Multimorbidity was only partially reflected within the “health and medical history” domain, without detailed analysis of specific disease combinations or their synergistic effects. Fifth, the PAF may overestimate the reduction in risk when multiple overlapping factors are considered. Therefore, we applied conservative models and weighting procedures, as well as multiple sensitivity analyses, to reduce potential bias. However, the estimates should still be interpreted with caution. Finally, as PAFs are population-specific, our estimates may not generalize to other countries or ethnic groups, although many UKB-based risk estimates align with those from more representative cohorts46,47. Therefore, further validation in diverse populations is needed to ensure the effectiveness and appropriateness of prevention strategies.

In summary, our comprehensive analysis demonstrates that increased risk of CHD is associated with SES, lifestyles, early life factors, psychosocial factors, local environment, blood assays, physical measures, as well as health and medical history. Genetic risk amplifies the association between modifiable factors and CHD, particularly among younger populations. Importantly, approximately 40%–62% of CHD cases may be preventable through modification of identified risk factors. The proportion of CHD cases attributable to modifiable factors declines with age, from 70.81% in individuals under 50 years to 60.20% in those aged 60 years and older. These findings underscore the need to integrate a more comprehensive range of modifiable factors into public health strategies to reduce the incidence of CHD, particularly by mitigating the accumulation of modifiable factors during early life stages. A comprehensive approach may substantially reduce the overall CHD burden and help reverse the rising trend of early-onset CHD.

Method

Study design

We employed Cox proportional hazards regression models in an EWAS framework to examine the association between 213 modifiable factors and CHD. Two-sample MR analysis was then conducted for each of the 213 factors to identify causal relationships between modifiable factors and CHD. Subsequently, weighted standardized scores were constructed to assess the combined effect of risk factors within each domain, with an exploration of variations across different age groups (<50, 50–60, and ≥60 years). Additionally, we assessed additive interactions between weighted standardized scores and genetic risk on CHD. Finally, we estimated PAFs and weighted PAFs for each domain.

Study population and CHD diagnosis

We drew our study population from the UKB study, acquiring baseline data on 502,164 participants between 2006 and 2010 who were followed up until the earliest occurrence of a first diagnosis of CHD, death, loss to follow-up, or the last available information (up to 1 September 2023). CHD was identified using relevant International Classification of Diseases codes (I20, I21, I22, I23, I24, and I25) from the Health Outcomes dataset in UKB, comprising primary care cases, hospital records, death registrations, and algorithmically defined outcomes. We excluded participants with pre-existing CHD or stroke at baseline, those with >15% missing data regarding the included variables, and individuals with missing outcomes or covariates. Finally, 394,579 participants (aged 40–69 years at recruitment, including both men and women) who provided informed consent were included in the present study; participation in the UK Biobank is entirely voluntary and without financial compensation. The North West Multi-center Research Ethics Committee approved this study. The analysis was conducted under UKB application number 199688. We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cohort studies.

Modifiable factors

After excluding variables with >20% missing data and those classified as unmodifiable (https://ukbiobank.dnanexus.com/panx/projects), 213 modifiable factors, either measured or derived at baseline, were included for subsequent analysis. These factors spanned eight domains, involving SES (e.g., household income and accommodation type), lifestyle (e.g., physical activity), early life factors (e.g., breastfeeding), psychosocial factors (e.g., neuroticism score), local environment (e.g., distance to major roads), blood assays (e.g., platelet count), physical measurements (e.g., blood pressure), and medical history (e.g., overall health rating). Data cleaning and processing for these variables are detailed in Supplementary Methods and Supplementary Data 13. We employed multiple imputation using chained equations, generating five imputed datasets over ten iterations. For each variable, an imputation method was automatically assigned according to its data type, incorporating the 10 most relevant predictors, with age and sex included by default48.

PRS of CHD

A subset of 337,151 individuals from the UKB, known as the White British Unrelated (WBU) group, was created by Bycroft et al.49 to include individuals of British European ancestry while excluding closely related participants. This WBU subset, referred to as the “training subset,” was used to generate summary statistics from GWAS and, subsequently, meta-analyzed with external GWAS datasets to develop the Enhanced PRS set. Further details can be found in a previous publication50. Using this approach, the PRS for CHD was derived for over 480,000 individuals in the UKB (field ID: 26227).

Statistical analyses

We performed the EWAS using Cox proportional hazard regression models to test the association between each of the 213 baseline exposures and newly diagnosed CHD, using Bonferroni-corrected significance thresholds (0.05 divided by 213 tests, resulting in 2.35 × 10−4). We made further adjustments for age, sex, ethnic background, family history of CVD, and assessment center. An interaction term with the time of follow-up was added in case of violated proportional hazards assumption (p  <  0.0001 via a test using Schoenfeld’s residuals)51. We conducted stratified analyses based on the baseline age, sex, PRS, and time of follow-up, identifying significance by the Bonferroni correction method. We employed random forest models combined with Cox proportional hazards regression to rank the importance of 213 modifiable variables by calculating minimal depth—a measure of the depth at which a variable splits—to prioritize modifiable factors that may serve as effective targets for intervention52.

To explore the causal relationship between modifiable factors and CHD, we incorporated a two-sample MR into the EWAS analytical framework. We utilized GWAS data for CHD (phenocode: I9_IHD) from the FinnGen database, including 75,592 cases and 378,141 controls. We prioritized GWAS data from the FinnGen database, where medical history-related factors were available. For the remaining variables, data from the MRC IEU OpenGWAS database (https://gwas.mrcieu.ac.uk/) were sourced to access publicly available UKB-based summary statistics. Supplementary Data 13 summarizes the GWAS IDs for each variable in detail. We identified significant single-nucleotide polymorphisms (SNPs) for instrument variables (p < 5 × 10−08). For traits with ≤3 SNPs after outlier removal, we either relaxed the threshold (p < 5 × 10−06) or excluded the trait. Inverse variance weighting was chosen as the primary method, with outliers removed. Subsequently, potential heterogeneity and horizontal pleiotropy were assessed using Cochran’s Q test, MR Egger intercept, and MR-PRESSO global test. Moreover, other methods for MR, such as weighted median and MR Egger regression, were applied to ensure robust conclusions.

Following previous methodology11, weighted standardized scores for each domain were also calculated by reversing the coding for protective factors (HR < 1) identified in the EWAS, normalizing continuous variables, and dummy-coding categorical variables. The weighted standardized scores were determined based on adjustments for variables within the same domain using Cox proportional hazards regression models, accounting for age, sex, ethnic background, family history of CVD, and assessment center. These scores were obtained by multiplying the raw value of each variable by its corresponding β coefficient and dividing by the total sum of β coefficients. Higher weighted scores indicated greater exposure to risk factors. According to these scores, three tertiles were established, including unfavorable (higher risk), moderate, and favorable (lower risk). Then, the Cox proportional hazards regression model was also used to assess the association between domain-specific weighted scores and CHD risk, with covariate adjustment as described above. Subsequently, we conducted stratified analysis by age (<50, 50–60, and ≥60 years) to explore the age-related heterogeneity in the relationship between domain-specific scores and CHD. To construct the CHD genetic risk score, we classified participants into nine groups based on tertiles of the PRS combined with the domain-weighted scores. We also examined CHD risk across different combinations of genetic risk and modifiable factors using a Cox proportional hazards regression model, with the group having low genetic risk and a favorable weighted score as the reference. Additionally, we assessed the additive interaction effect between genetic risk and weighted scores across eight domains of CHD risk.

Finally, the PAF was calculated for each domain, representing the proportion of reduction in specific disease risk that would result from replacing a given risk factor with a more favorable profile. We employed two models: Model 1 eliminated the most adverse one-third of modifiable factors by integrating intermediate and unfavorable profiles across the eight domains to produce a more conservative outcome; Model 2 eliminated modifiable factors by 2/3 to estimate the proportion of coronary events that could be reduced, indicating a more complete elimination of unfavorable factors. The PAFs for each domain were calculated using the “graphPAF” package, employing a Cox proportional hazards regression model with adjustments for sex, age, assessment center, ethnic background, and family history of CVD53. To account for correlations among the eight domains, the common factor variance was calculated through principal component analysis to estimate the weights of each PAF. These weights were then used to calculate both the combination-weighted and individual-weighted PAFs, accounting for the coexistence of risk factors within individuals, which resulted in reduced overestimation of PAFs caused by factor interactions54. Additionally, we conducted three sensitivity analyses to assess the robustness of the PAF estimates as follows: (1) analysis restricted to factors associated significantly with CHD (p < 0.05); (2) additional adjustment for the PRS; and (3) omitting mutual adjustment between domains.

All P values were two-sided, and analyses were conducted using R v4.3.2 (R Core Team, 2023).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.