Abstract
Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide, and accurately predicting individual risk is critical for prevention. Here we aimed to integrate unmodifiable risk factors, such as age and genetics, with modifiable risk factors, such as clinical and biometric measurements, into a meta-prediction framework that produces actionable and personalized risk estimates. In the initial development of the model, ~2,000 predictive features were considered, including demographic data, lifestyle factors, physical measurements, laboratory tests, medication usage, diagnoses and genetics. To power our meta-prediction approach, we stratified the UK Biobank into two primary cohorts: first, a prevalent CAD cohort used to train predictive models for cross-sectional prediction at baseline and prospective estimation of contributing risk factor levels and diagnoses (baseline models) and, second, an incident CAD cohort using, in part, these baseline models as meta-features to train a final CAD incident risk prediction model. The resultant 10-year incident CAD risk model, composed of 15 derived meta-features with multiple embedded polygenic risk scores, achieves an area under the curve of 0.84. In an independent test cohort from the All of Us research program, this model achieved an area under the curve of 0.81 for predicting 10-year incident CAD risk, outperforming standard clinical scores and previously developed integrative models. Moreover, this framework enables the generation of individualized risk reduction profiles by quantifying the potential impact of standard clinical interventions. Notably, genetic risk influences the extent to which these interventions reduce overall CAD risk, allowing for tailored prevention strategies.
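The two-stage design described above can be illustrated with a toy sketch. It is not the authors' pipeline: the synthetic data, the single hypothetical risk factor ("ldl"), the linear baseline model and the gradient-descent logistic model are all assumptions chosen to keep the example self-contained. A baseline model is fit in one cohort, and its prediction is then used as a meta-feature in a second cohort's incident-risk model.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic example is reproducible


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_logistic(rows, labels, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent; the last weight is the bias."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, row)) + weights[-1])
            for j, v in enumerate(row):
                grad[j] += (p - y) * v
            grad[-1] += p - y
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Stage 1 (stand-in for the prevalent cohort): a baseline model that predicts
# a hypothetical risk-factor level ("ldl") from a polygenic score.
prs_a = [random.gauss(0, 1) for _ in range(1000)]
ldl_a = [0.5 * g + random.gauss(0, 1) for g in prs_a]
slope, intercept = fit_linear(prs_a, ldl_a)

# Stage 2 (stand-in for the incident cohort): the baseline model's prediction
# becomes a meta-feature of the incident-risk model, alongside standardized age.
rows_b, events_b = [], []
for _ in range(1000):
    age = random.gauss(0, 1)
    prs = random.gauss(0, 1)
    meta_ldl = slope * prs + intercept  # meta-feature from stage 1
    true_risk = sigmoid(-2.0 + 1.0 * age + 1.5 * prs)  # synthetic ground truth
    rows_b.append([age, meta_ldl])
    events_b.append(1 if random.random() < true_risk else 0)

weights = fit_logistic(rows_b, events_b)
risks = [
    sigmoid(sum(w * v for w, v in zip(weights, row)) + weights[-1])
    for row in rows_b
]
```

In this sketch the incident model never sees the raw polygenic score; the genetic signal reaches it only through the embedded baseline prediction, which is the sense in which a meta-feature can carry an embedded polygenic risk score.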
Data availability
All data are available from the UKBB (ref. 55; https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access) and the All of Us research program (ref. 74; https://workbench.researchallofus.org/login) to researchers from universities and other institutions with genuine research inquiries following institutional review board and biobank approval. This research has been conducted using the UKBB resource under application number 41999 and the All of Us v7 Curated Data Repository (R2022Q4R9 and C2022Q4R9 versions).
Code availability
The machine learning code used to generate the meta-predictions is available via GitHub at http://github.com/TorkamaniLab/CAD_meta_prediction.
Change history
19 August 2025
A Correction to this paper has been published: https://doi.org/10.1038/s41591-025-03925-y
References
Damask, A. et al. Patients with high genome-wide polygenic risk scores for coronary artery disease may receive greater clinical benefit from alirocumab treatment in the ODYSSEY OUTCOMES trial. Circulation https://doi.org/10.1161/CIRCULATIONAHA.119.044434 (2020).
Marston, N. A. et al. Predicting benefit from evolocumab therapy in patients with atherosclerotic disease using a genetic risk score. Circulation https://doi.org/10.1161/CIRCULATIONAHA.119.043805 (2020).
Mega, J. L. et al. Genetic risk, coronary heart disease events, and the clinical benefit of statin therapy: an analysis of primary and secondary prevention trials. Lancet 385, 2264–2271 (2015).
Natarajan, P. et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 135, 2091–2101 (2017).
Bolli, A., Di Domenico, P., Pastorino, R., Busby, G. B. & Bottà, G. Risk of coronary artery disease conferred by low-density lipoprotein cholesterol depends on polygenic background. Circulation 143, 1452–1454 (2021).
Ye, Y. et al. Interactions between enhanced polygenic risk scores and lifestyle for cardiovascular disease, diabetes, and lipid levels. Circ. Genom. Precis. Med. 14, e003128 (2021).
Muse, E. D. et al. Impact of polygenic risk communication: an observational mobile application-based coronary artery disease study. NPJ Digit. Med. 5, 30 (2022).
Hollands, G. J. et al. The impact of communicating genetic risks of disease on risk-reducing health behaviour: systematic review with meta-analysis. BMJ 352, i1102 (2016).
Widén, E. et al. How communicating polygenic and clinical risk for atherosclerotic cardiovascular disease impacts health behavior: an observational follow-up study. Circ. Genom. Precis. Med. 15, e003459 (2022).
Knowles, J. W. et al. Impact of a genetic risk score for coronary artery disease on reducing cardiovascular risk: a pilot randomized controlled study. Front. Cardiovasc. Med. 4, 53 (2017).
Maamari, D. J. et al. Clinical implementation of combined monogenic and polygenic risk disclosure for coronary artery disease. JACC Adv. 1, 100068 (2022).
Roberts, M. C., Khoury, M. J. & Mensah, G. A. Perspective: The clinical use of polygenic risk scores: race, ethnicity, and health disparities. Ethn. Dis. 29, 513–516 (2019).
Lewis, A. C. F. & Green, R. C. Polygenic risk scores in the clinic: new perspectives needed on familiar ethical issues. Genome Med. 13, 14 (2021).
Martens, F. K., Tonk, E. C. M. & Janssens, A. C. J. W. Evaluation of polygenic risk models using multiple performance measures: a critical assessment of discordant results. Genet. Med. 21, 391–397 (2019).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Khan, S. S. et al. Coronary artery calcium score and polygenic risk score for the prediction of coronary heart disease events. JAMA 329, 1768–1777 (2023).
Mosley, J. D. et al. Predictive accuracy of a polygenic risk score compared with a clinical risk score for incident coronary heart disease. JAMA 323, 627–635 (2020).
Wünnemann, F. et al. Validation of genome-wide polygenic risk scores for coronary artery disease in French Canadians. Circ. Genom. Precis. Med. 12, e002481 (2019).
Murthy, V. L. et al. Polygenic risk, fitness, and obesity in the Coronary Artery Risk Development in Young Adults (CARDIA) study. JAMA Cardiol. 5, 263–271 (2020).
Wells, Q. S. et al. Polygenic risk score to identify subclinical coronary heart disease risk in young adults. Circ. Genom. Precis. Med. 14, e003341 (2021).
Marston, N. A. et al. Predictive utility of a coronary artery disease polygenic risk score in primary prevention. JAMA Cardiol. 8, 130–137 (2023).
Isgut, M., Sun, J., Quyyumi, A. A. & Gibson, G. Highly elevated polygenic risk scores are better predictors of myocardial infarction risk early in life than later. Genome Med. 13, 13 (2021).
Mars, N. et al. Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat. Med. 26, 549–557 (2020).
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Elliott, J. et al. Predictive accuracy of a polygenic risk score-enhanced prediction model vs a clinical risk score for coronary artery disease. JAMA 323, 636–645 (2020).
Aragam, K. G. et al. Limitations of contemporary guidelines for managing patients at high genetic risk of coronary artery disease. J. Am. Coll. Cardiol. 75, 2769–2780 (2020).
Aragam, K. G. & Natarajan, P. Polygenic scores to assess atherosclerotic cardiovascular disease risk: clinical perspectives and basic implications. Circ. Res. 126, 1159–1177 (2020).
Sun, L. et al. Polygenic risk scores in cardiovascular risk prediction: a cohort study and modelling analyses. PLoS Med. 18, e1003498 (2021).
Riveros-Mckay, F. et al. Integrated polygenic tool substantially enhances coronary artery disease prediction. Circ. Genom. Precis. Med. 14, e003304 (2021).
Hindy, G. et al. Abstract 16565: Integration of a genome-wide polygenic score with ACC/AHA pooled cohorts equation in prediction of coronary artery disease events in >285,000 participants. Circulation 140, abstr. 16565 (2019).
Inouye, M. et al. Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention. J. Am. Coll. Cardiol. 72, 1883–1893 (2018).
Ntalla, I. et al. Genetic risk score for coronary disease identifies predispositions to cardiovascular and noncardiovascular diseases. J. Am. Coll. Cardiol. 73, 2932–2942 (2019).
Abraham, G. et al. Genomic risk score offers predictive performance comparable to clinical risk factors for ischaemic stroke. Nat. Commun. 10, 5819 (2019).
Lin, J. et al. Integration of biomarker polygenic risk score improves prediction of coronary heart disease. JACC Basic Transl. Sci. 8, 1489–1499 (2023).
Vassy, J. L. et al. Cardiovascular disease risk assessment using traditional risk factors and polygenic risk scores in the Million Veteran Program. JAMA Cardiol. 8, 564–574 (2023).
Agrawal, S. et al. Selection of 51 predictors from 13,782 candidate multimodal features using machine learning improves coronary artery disease prediction. Patterns 2, 100364 (2021).
Patel, A. P. et al. A multi-ancestry polygenic risk score improves risk prediction for coronary artery disease. Nat. Med. https://doi.org/10.1038/s41591-023-02429-x (2023).
Torkamani, A., Andersen, K. G., Steinhubl, S. R. & Topol, E. J. High-definition medicine. Cell 170, 828–843 (2017).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).
Gola, D., Erdmann, J., Müller-Myhsok, B., Schunkert, H. & König, I. R. Polygenic risk scores outperform machine learning methods in predicting coronary artery disease status. Genet. Epidemiol. 44, 125–138 (2020).
Xu, Y. et al. A machine learning model for disease risk prediction by integrating genetic and non-genetic factors. In Proc. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (eds Adjeroh, D. et al.) 868–871 (IEEE, 2022).
Gibson, G. On the utilization of polygenic risk scores for therapeutic targeting. PLoS Genet. 15, e1008060 (2019).
Goddard, K. A. B., Lee, K., Buchanan, A. H., Powell, B. C. & Hunter, J. E. Establishing the medical actionability of genomic variants. Annu. Rev. Genomics Hum. Genet. 23, 173–192 (2022).
Wang, Y., Tsuo, K., Kanai, M., Neale, B. M. & Martin, A. R. Challenges and opportunities for developing more generalizable polygenic risk scores. Annu. Rev. Biomed. Data Sci. 5, 293–320 (2022).
Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
Wray, N. R., Wijmenga, C., Sullivan, P. F., Yang, J. & Visscher, P. M. Common disease is more complex than implied by the core gene omnigenic model. Cell https://doi.org/10.1016/j.cell.2018.05.051 (2018).
Mathieson, I. The omnigenic model and polygenic prediction of complex traits. Am. J. Hum. Genet. 108, 1558–1563 (2021).
Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet. 54, 1803–1815 (2022).
Tcheandjieu, C. et al. Large-scale genome-wide association study of coronary artery disease in genetically diverse populations. Nat. Med. 28, 1679–1692 (2022).
You, J. et al. Development of machine learning-based models to predict 10-year risk of cardiovascular disease: a prospective cohort study. Stroke Vasc. Neurol. 8, 475–485 (2023).
Zeitouni, M. et al. Performance of guideline recommendations for prevention of myocardial infarction in young adults. J. Am. Coll. Cardiol. 76, 653–664 (2020).
De Filippis, A. P. et al. Risk score overestimation: the impact of individual cardiovascular risk factors and preventive therapies on the performance of the American Heart Association–American College of Cardiology–Atherosclerotic Cardiovascular Disease risk score in a modern multi-ethnic cohort. Eur. Heart J. 38, 598–608 (2017).
Livingstone, S. et al. Effect of competing mortality risks on predictive performance of the QRISK3 cardiovascular risk prediction tool in older people and those with comorbidity: external validation population cohort study. Lancet Healthy Longev. 2, e352–e361 (2021).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Patel, A. P., Wang, M., Kartoun, U., Ng, K. & Khera, A. V. Quantifying and understanding the higher risk of atherosclerotic cardiovascular disease among South Asian individuals: results from the UK Biobank prospective cohort study. Circulation 144, 410–422 (2021).
Goff, D. C. et al. 2013 ACC/AHA guideline on the assessment of cardiovascular risk: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. Circulation 129, 49–73 (2014).
Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study. BMJ 357, j2099 (2017).
CatBoost Encoder Category Encoders 2.6.3 documentation; https://contrib.scikit-learn.org/category_encoders/catboost.html
Wilson, S. miceRanger: multiple imputation by chained equations with random forests. R version 4.0.0 https://cran.r-project.org/package=miceRanger (2021).
Khan, S. S. et al. Novel prediction equations for absolute risk assessment of total cardiovascular disease incorporating cardiovascular–kidney–metabolic health: a scientific statement from the American Heart Association. Circulation https://doi.org/10.1161/CIR.0000000000001191 (2023).
Arnett, D. K. et al. 2019 ACC/AHA guideline on the primary prevention of cardiovascular disease: executive summary: a report of the American College of Cardiology/American Heart Association Task Force on clinical practice guidelines. Circulation 140, e563–e595 (2019).
Chen, S. F. et al. Genotype imputation and variability in polygenic risk score estimation. Genome Med. 12, 100 (2020).
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
Nielsen, J. B. et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat. Genet. https://doi.org/10.1038/s41588-018-0171-3 (2018).
Evangelou, E. et al. Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nat. Genet. 50, 1412–1425 (2018).
Lam, M. et al. Comparative genetic architectures of schizophrenia in East Asian and European populations. Nat. Genet. 51, 1670–1678 (2019).
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
Conti, D. V. et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat. Genet. 53, 65–75 (2021).
Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
Lambert, S. A. et al. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nat. Genet. 53, 420–425 (2021).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).
The All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).
Khattab, A., Chen, S.-F., Wineinger, N. & Torkamani, A. AoUPRS: a cost-effective and versatile PRS calculator for the All of Us program. Preprint at bioRxiv https://doi.org/10.1101/2024.07.11.603165 (2024).
Norgeot, B. et al. Minimum information about clinical artificial intelligence modeling: the MI-CLAIM checklist. Nat. Med. https://doi.org/10.1038/s41591-020-1041-y (2020).
Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann. Intern. Med. 162, 55–63 (2015).
Acknowledgements
We thank J. C. Ducom, L. Dong and the Scripps High-Performance Computing service for their support. We also thank E. Topol for his comments on this paper. This work is supported by R01HG010881 to A.T. as well as grant UM1TR004407. We recognize that some of the factors labeled unmodifiable in this paper may be modifiable in some circumstances.
Author information
Contributions
Concept and design: A.T. Acquisition, analysis or interpretation of data: A.T., S.-F.C., S.E.L., H.J.S., C.H., J.-F.C. and N.E.W. Drafting of the paper: A.T., S.-F.C., J.-B.P. and A.K. Critical revision of the paper for important intellectual content: A.T., J.-B.P. and E.D.M.
Ethics declarations
Competing interests
A.T. declares that he is a cofounder and equity shareholder of GeneXwell Inc. A.T. is an advisor to InsideTracker. The other authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Michael Basson, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Feature importance and SHAP summary for 10-year prospective CAD risk prediction in the UK Biobank.
This composite figure provides two views of the importance of the final 50 features used in prospective 10-year CAD meta-prediction for the incident CAD cohort (n = 33,419). The left panel displays a bar plot ranking the features in order of importance as quantified by the mean absolute SHAP value. The length of each bar represents the magnitude of a feature’s importance. The right panel presents a SHAP summary plot with each point representing the feature’s SHAP value at an individual level. Color coding represents the feature value (red for high, blue for low). Positive values correspond to the prediction contribution to the positive class, and negative values correspond to the prediction contribution to the negative class. Abbreviations: Dx: Diagnosis; MUF: modifiable and unmodifiable factors; UF: unmodifiable factors.
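The ranking in the left panel can be reproduced from a matrix of per-individual SHAP values. The sketch below assumes such a matrix is already available; the feature names and numbers are made up for illustration and are not taken from the model.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across individuals.

    shap_rows: one list of SHAP values per individual, aligned with feature_names.
    Returns (feature, importance) pairs, most important first.
    """
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values for three individuals and three features.
feature_names = ["age", "prs_cad", "ldl"]
shap_rows = [
    [0.30, -0.10, 0.05],
    [-0.20, 0.40, -0.05],
    [0.10, -0.25, 0.02],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
```

Taking the absolute value before averaging matters: a feature that pushes risk up for some individuals and down for others would otherwise appear unimportant because the signed contributions cancel.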
Extended Data Fig. 2 Evaluating the calibration and predictive value of feature categories for the meta-prediction model in the UK Biobank.
This figure evaluates the performance of the meta-prediction model compared to existing clinical risk scores within the test set of the incident CAD cohort (n = 33,419), highlighting both the model’s calibration and the predictive contribution of distinct feature categories. a. Calibration plot showing the predicted vs. observed 10-year CAD risk by decile. Brier scores are provided as a summary calibration measure. Conventional risk scores were re-calibrated using the same approach within the UK Biobank training cohort. b. Predictive performance of individual feature categories and their combinations in the meta-prediction model. The left panel illustrates the Area Under the Receiver Operating Characteristic (AUROC) for various models. The right panel shows the Area Under the Precision-Recall Curve (AUPRC). The models compared include the full meta-prediction model and partial models restricted to a single feature class or a combination of feature classes, including 15 meta-features; 22 polygenic risk scores (PRSs) and 13 modifiable risk factors (MFs); 13 MFs; sex and age with 12 PRSs; 12 PRSs; and sex and age alone. Abbreviation: MF: Modifiable factors.
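For readers unfamiliar with the two calibration summaries in panel a, the following sketch computes a Brier score and a predicted-versus-observed table by risk bin. The equal-size binning and the toy data are assumptions; the caption does not specify how the deciles were formed.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_by_bin(probs, outcomes, bins=10):
    """Sort by predicted risk, split into equal-size bins (deciles by default),
    and return (mean predicted risk, observed event rate) for each bin."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    table = []
    for b in range(bins):
        chunk = pairs[b * n // bins:(b + 1) * n // bins]
        if not chunk:
            continue  # fewer participants than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, observed))
    return table


# Toy data: 10 participants predicted at 5% risk (1 event) and
# 10 predicted at 30% risk (3 events).
probs = [0.05] * 10 + [0.30] * 10
outcomes = [0] * 9 + [1] + [0] * 7 + [1] * 3
score = brier_score(probs, outcomes)
table = calibration_by_bin(probs, outcomes, bins=2)
```

In the toy data the higher-risk bin is well calibrated (30% predicted, 30% observed) while the lower-risk bin underestimates risk (5% predicted, 10% observed), which is the kind of discrepancy a calibration plot makes visible.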
Extended Data Fig. 3 Comparative performance of CAD risk prediction models in the UK Biobank.
This figure compares the predictive performance of our final prospective 10-year CAD meta-prediction model to established clinical risk scores, including PCE, QRISK3, PREVENT, and previous polygenic score benchmarks including GPSCAD, metaGRSCAD, Aragam2022, Tcheandjieu2022, and GPSMult, as well as ML models including ML4HEN-COX and UKBCRP using the incident CAD cohort (n = 33,419). For each model, the left panel depicts a scatter plot illustrating the incidence of CAD events across percentile bins of predicted risk, showcasing the predictive density of each model at 10 years of follow-up, while the right panel displays the 10-year cumulative risk trajectories for each risk prediction model, highlighting the ability of each model to stratify risk across time. Shaded regions in the right panel represent 95% confidence intervals (CI). In all cases, meta-prediction significantly outperforms prior approaches.
Extended Data Fig. 4 SHAP summary plots for meta-features in the final model in the UK Biobank.
This figure presents a comprehensive collection of SHAP summary plots for each of the 15 meta-features integrated into the meta-prediction for the incident CAD cohort (n = 33,419). Each subplot provides insight into the contribution of individual baseline features towards the prediction of each meta-feature. The color coding indicates each feature value (red = high, blue = low), with the SHAP value on the y-axis reflecting the impact on the model’s output (positive = contribution to the positive class, negative = contribution to the negative class). Abbreviations: Dx: Diagnosis; MUF: modifiable and unmodifiable factors; UF: unmodifiable factors.
Extended Data Fig. 5 External validation of the streamlined meta-prediction.
Comparative test accuracy of the streamlined meta-prediction model in the UKBB (AUROC = 0.81) versus AoU external validation; shaded regions show bootstrapped 95% confidence intervals (CI). a. Performance in the larger AoU validation cohort (AUROC = 0.81), further stratified by self-reported European (EUR), African (AFR) and Hispanic (HIS) groups: AoU-EUR (AUROC = 0.81), AoU-AFR (AUROC = 0.79) and AoU-HIS (AUROC = 0.84), respectively. b. Performance in the AoU sub-cohort with complete phenotypes (AUROC = 0.78). Additional risk scores tested in AoU include PCE (AUC = 0.72), QRISK3 (AUC = 0.73) and PREVENT (AUC = 0.73). Abbreviations: AoU: All of Us research program; AUC: area under curve; UKBB: UK Biobank.
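A bootstrapped confidence interval for an AUROC can be obtained along the following lines. This is a minimal sketch on synthetic scores: the percentile method, the number of resamples and the pairwise AUROC computation are assumptions, since the caption does not describe the exact resampling scheme.

```python
import random


def auroc(scores, labels):
    """Probability that a random case scores higher than a random control
    (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(scores, labels, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample participants with replacement,
    recompute the AUROC, and read off the alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUROC needs both cases and controls
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return low, high


# Synthetic example: 60 cases score higher on average than 140 controls.
rng = random.Random(1)
labels = [1] * 60 + [0] * 140
scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]
point = auroc(scores, labels)
ci_low, ci_high = bootstrap_auroc_ci(scores, labels)
```

The pairwise AUROC is quadratic in cohort size, so for cohorts of the size reported here a rank-based implementation would be the practical choice; the resampling logic stays the same.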
Extended Data Fig. 6 SHAP explanation of streamlined meta-prediction in UK Biobank and All of Us research program.
Both panels display all 50 features contributing to the streamlined meta-prediction. The vertical axis orders each feature by its overall importance to risk prediction. Each point represents a participant and is color-coded according to the feature’s direction of contribution to the individual’s risk prediction (red increased risk, blue decreased risk). The value associated with each point on the x-axis represents the magnitude of its contribution to the individual’s risk prediction. The left panel presents the results from the test set of UK Biobank (n = 33,419). The right panel presents the results from the external validation set of All of Us research program (n = 198,424). Abbreviations: Dx: Diagnosis; MUF: modifiable and unmodifiable factors; UF: unmodifiable factors.
Extended Data Fig. 7 Overview of generalizable genetic meta-prediction model in the UK Biobank.
Using the incident CAD cohort (n = 33,419), we show: a. Calibration plot of the generalizable genetic meta-prediction model for predicted vs observed risk by decile within the test set of the incident CAD cohort (Brier score of generalizable genetic meta-prediction: 0.073). b. Cumulative risk curve of CAD (%) development over the 10-year follow-up period stratified by percentile of predicted risk. c. Incidence rates of CAD observed across the test cohort, stratified by percentile of predicted risk. Data are presented as mean values ± SD. d. and e. Comparative test accuracy (n = 33,419) for the generalizable genetic meta-prediction model (AUROC = 0.80; AUPRC = 0.28) versus other standard clinical and research risk scores, including PCE (AUROC = 0.73; AUPRC = 0.20), QRISK3 (AUROC = 0.74; AUPRC = 0.22), PREVENT (AUROC = 0.72; AUPRC = 0.19) and GPSCAD (AUROC = 0.73; AUPRC = 0.20). Abbreviations: AUC: Area under curve; CAD: coronary artery disease; EHR: electronic health records; PCE: pooled cohort equations.
Extended Data Fig. 8 Feature importance and SHAP summary for 10-year prospective CAD risk prediction of generalizable genetic model in the UK Biobank.
This composite figure provides two views of the importance of the final 50 features used in prospective 10-year CAD meta-prediction of generalizable genetic model for the incident CAD cohort (n = 33,419). The left panel displays a bar plot ranking the features in order of importance as quantified by the mean absolute SHAP value. The length of each bar represents the magnitude of a feature’s importance. The right panel presents a SHAP summary plot with each point representing the feature’s SHAP value at an individual level. Color coding represents the feature value (red for high, blue for low). Positive values correspond to the prediction contribution to the positive class, and negative values correspond to the prediction contribution to the negative class. Abbreviations: Dx: Diagnosis; MUF: modifiable and unmodifiable factors; UF: unmodifiable factors.
Extended Data Fig. 9 Feature distribution pre- and post-imputation in the UK Biobank.
This figure demonstrates the preservation of variable distributions pre- and post-imputation, stratified by sex. For numeric features, density plots are presented, with bolded edges representing the distribution pre-imputation and light edges post-imputation. For categorical features, two sets of stacked histograms are presented side by side for comparison: on the left are the original distributions pre-imputation, and on the right are the distributions post-imputation of missing data.
Supplementary information
Supplementary Tables
Supplementary Tables 1–27.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, SF., Lee, S.E., Sadaei, H.J. et al. Meta-prediction of coronary artery disease risk. Nat Med 31, 2277–2288 (2025). https://doi.org/10.1038/s41591-025-03648-0
DOI: https://doi.org/10.1038/s41591-025-03648-0