Skip to main content

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • Letter
  • Published:

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Abstract

Cardiometabolic diseases are frequently polygenic in architecture, comprising a large number of risk alleles with small effects spread across the genome1,2,3. Polygenic scores (PGS) aggregate these into a metric representing an individual’s genetic predisposition to disease. PGS have shown promise for early risk prediction4,5,6,7 and there is an open question as to whether PGS can also be used to understand disease biology8. Here, we demonstrate that cardiometabolic disease PGS can be used to elucidate the proteins underlying disease pathogenesis. In 3,087 healthy individuals, we found that PGS for coronary artery disease, type 2 diabetes, chronic kidney disease and ischaemic stroke are associated with the levels of 49 plasma proteins. Associations were polygenic in architecture, largely independent of cis and trans protein quantitative trait loci and present for proteins without quantitative trait loci. Over a follow-up of 7.7 years, 28 of these proteins associated with future myocardial infarction or type 2 diabetes events, 16 of which were mediators between polygenic risk and incident disease. Twelve of these were druggable targets with therapeutic potential. Our results demonstrate the potential for PGS to uncover causal disease biology and targets with therapeutic potential, including those that may be missed by approaches utilizing information at a single locus.

This is a preview of subscription content, access via your institution

Access options

Buy this article

Prices may be subject to local taxes which are calculated during checkout

Fig. 1: Proteins associated with polygenic risk for cardiometabolic disease.
Fig. 2: PGS-associated proteins influence 7.7-year risk of CAD and T2D.

Similar content being viewed by others

Data availability

All data used in this study are publicly available or deposited in a public repository. The INTERVAL cohort data are available via the European Genome-phenome Archive with study accession no. EGAS00001002555. Dataset access is subject to approval by an independent data access committee since the data contain potentially identifying and sensitive patient information. Response times from the data access committee are typically within 1 week. All other data used in this study are publicly available without restriction. The PGS used in this study are available to download through the Polygenic Score Catalog (https://www.pgscatalog.org/) with accession nos PGS000727 (AF), PGS000018 (CAD), PGS000728 (CKD), PGS000039 (IS) and PGS000729 (T2D). The GWAS summary statistics used to generate new PGS for CKD, T2D and AF in this study are available to download through the GWAS Catalog (https://www.ebi.ac.uk/gwas/) with study accession nos GCST008065 (for the CKD GWAS published by Wuttke et al.14), GCST007517 (for the T2D GWAS published by Mahajan et al.15) and GCST006414 (for the AF GWAS published by Nielsen et al.13). The additional GWAS summary statistics used for Mendelian randomization analysis are also available through the GWAS Catalog with study accession nos GCST004787 (for the CAD GWAS published by Nelson et al.17), GCST006906 (for the IS GWAS published by Malik et al.16) and GCST007518 (for the T2D GWAS adjusted for BMI published by Mahajan et al.15). Full pQTL summary statistics published by Sun et al.26 for all SomaLogic SOMAscan aptamers are available to download from https://www.phpc.cam.ac.uk/ceu/proteins/. The DrugBank database is publicly available to download at https://www.drugbank.ca/releases/latest. Summary statistics for all statistical tests are available in Supplementary Data 3; the additional cis-pQTLs mapped in this study are provided in Supplementary Data 4.

Code availability

The code used to generate the results of this study, along with a detailed list of software and versions, is available on GitHub (https://github.com/sritchie73/cardiometabolic_prs_plasma_proteome/), which is permanently archived by Zenodo70 at https://doi.org/10.5281/zenodo.4762747.

References

  1. Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).

    Article  CAS  PubMed  Google Scholar 

  2. Loh, P.-R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. Lambert, S. A., Abraham, G. & Inouye, M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 28, R133–R142 (2019).

    Article  CAS  PubMed  Google Scholar 

  5. Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018).

    Article  CAS  PubMed  Google Scholar 

  6. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. McCarthy, M. I. & Mahajan, A. The value of genetic risk scores in precision medicine for diabetes. Expert Rev. Precis. Med. Drug Dev. 3, 279–281 (2018).

    Article  Google Scholar 

  8. International Common Disease Alliance Recommendations and White Paper v.1.0 (ICDA Organizing Committee and Working Groups, 2020); https://drive.google.com/file/d/16SVJ5lbneN9hB9E03PZMhpescAN527HO/view

  9. Erdmann, J., Kessler, T., Munoz Venegas, L. & Schunkert, H. A decade of genome-wide association studies for coronary artery disease: the challenges ahead. Cardiovasc. Res. 114, 1241–1257 (2018).

    CAS  PubMed  Google Scholar 

  10. Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Zheng, J. et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 52, 1122–1131 (2020).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  13. Nielsen, J. B. et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat. Genet. 50, 1234–1239 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  14. Wuttke, M. et al. A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat. Genet. 51, 957–972 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Mahajan, A. et al. Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes. Nat. Genet. 50, 559–571 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  16. Malik, R. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. 50, 524–537 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Nelson, C. P. et al. Association analyses based on false discovery rate implicate new loci for coronary artery disease. Nat. Genet. 49, 1385–1391 (2017).

    Article  CAS  PubMed  Google Scholar 

  18. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Sinnott-Armstrong, N., Naqvi, S., Rivas, M. & Pritchard, J. K. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. eLife 10, e58615 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034.e6 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. Sun, L. et al. Polygenic risk scores in cardiovascular risk prediction: a cohort study and modelling analyses. PLoS Med. 18, e1003498 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  22. Finan, C. et al. The druggable genome and support for target identification and validation in drug development. Sci. Transl. Med. 9, eaag1166 (2017).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  23. Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).

    Article  PubMed  CAS  Google Scholar 

  24. Moore, C. et al. The INTERVAL trial to determine whether intervals between blood donations can be safely and acceptably decreased to optimise blood supply: study protocol for a randomised controlled trial. Trials 15, 363 (2014).

    Article  PubMed  PubMed Central  Google Scholar 

  25. Di Angelantonio, E. et al. Efficiency and safety of varying the frequency of whole blood donation (INTERVAL): a randomised trial of 45 000 donors. Lancet 390, 2360–2371 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  26. Sun, B. B. et al. Genomic atlas of the human plasma proteome. Nature 558, 73–79 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Jovanović, D., Krstivojević, P., Obradović, I., Durdević, V. & Dukanović, L. Serum cystatin C and beta2-microglobulin as markers of glomerular filtration rate. Ren. Fail. 25, 123–133 (2003).

    Article  PubMed  CAS  Google Scholar 

  28. van Poelje, P. D., Dang, Q. & Erion, M. D. Fructose-1,6-bisphosphatase as a therapeutic target for type 2 diabetes. Drug Discov. Today Ther. Strateg. 4, 103–109 (2007).

    Article  Google Scholar 

  29. Holmes, M. V. & Davey Smith, G. Can Mendelian randomization shift into reverse gear? Clin. Chem. 65, 363–366 (2019).

    Article  CAS  PubMed  Google Scholar 

  30. Imai, K., Keele, L. & Tingley, D. A general approach to causal mediation analysis. Psychol. Methods 15, 309–334 (2010).

    Article  PubMed  Google Scholar 

  31. Imai, K., Keele, L., Tingley, D. & Yamamoto, T. Unpacking the black box of causality: learning about causal mechanisms from experimental and observational studies. Am. Polit. Sci. Rev. 105, 765–789 (2011).

    Article  Google Scholar 

  32. Hernán, M. A. A definition of causal effect for epidemiological research. J. Epidemiol. Community Health 58, 265–271 (2004).

    Article  PubMed  PubMed Central  Google Scholar 

  33. Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Ding, E. L. et al. Sex hormone-binding globulin and risk of type 2 diabetes in women and men. N. Engl. J. Med. 361, 1152–1163 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Gudmundsdottir, V. et al. Circulating protein signatures and causal candidates for type 2 diabetes. Diabetes 69, 1843–1853 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082 (2018).

    Article  CAS  PubMed  Google Scholar 

  37. VanderWeele, T. J. Mediation analysis: a practitioner’s guide. Annu. Rev. Public Health 37, 17–32 (2016).

    Article  PubMed  Google Scholar 

  38. Russo, V. C., Azar, W. J., Yau, S. W., Sabin, M. A. & Werther, G. A. IGFBP-2: the dark horse in metabolism and cancer. Cytokine Growth Factor Rev. 26, 329–346 (2015).

    Article  CAS  PubMed  Google Scholar 

  39. Assefa, B. et al. Insulin-like growth factor (IGF) binding protein-2, independently of IGF-1, induces GLUT-4 translocation and glucose uptake in 3T3-L1 adipocytes. Oxid. Med. Cell. Longev. 2017, 3035184 (2017).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  40. Wheatcroft, S. B. et al. IGF-binding protein-2 protects against the development of obesity and insulin resistance. Diabetes 56, 285–294 (2007).

    Article  CAS  PubMed  Google Scholar 

  41. Hedbacker, K. et al. Antidiabetic effects of IGFBP2, a leptin-regulated gene. Cell Metab. 11, 11–22 (2010).

    Article  CAS  PubMed  Google Scholar 

  42. Rajpathak, S. N. et al. Insulin-like growth factor axis and risk of type 2 diabetes in women. Diabetes 61, 2248–2254 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Wittenbecher, C. et al. Insulin-like growth factor binding protein 2 (IGFBP-2) and the risk of developing type 2 diabetes. Diabetes 68, 188–197 (2019).

    Article  CAS  PubMed  Google Scholar 

  44. Yin, S.-J., Chou, C.-F., Lai, C.-L., Lee, S.-L. & Han, C.-L. Human class IV alcohol dehydrogenase: kinetic mechanism, functional roles and medical relevance. Chem. Biol. Interact. 143–144, 219–227 (2003).

    Article  PubMed  CAS  Google Scholar 

  45. Kim, M. Y. et al. Identification of a zeta-crystallin (quinone reductase)-like 1 gene (CRYZL1) mapped to human chromosome 21q22.1. Genomics 57, 156–159 (1999).

    Article  CAS  PubMed  Google Scholar 

  46. International Statistical Classification of Diseases and Related Health Problems: Instruction Manual (World Health Organization, 2004).

  47. Kuan, V. et al. A chronological map of 308 physical and mental health conditions from 4 million individuals in the English National Health Service. Lancet Digit. Health 1, e63–e77 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  48. Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429.e19 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  49. Inouye, M. et al. Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention. J. Am. Coll. Cardiol. 72, 1883–1893 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  50. Abraham, G. et al. Genomic risk score offers predictive performance comparable to clinical risk factors for ischaemic stroke. Nat. Commun. 10, 5819 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Nikpay, M. et al. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Lambert, S. A. et al. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nature 53, 420–425 (2021).

    CAS  Google Scholar 

  53. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  54. Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283–285 (2016).

    Article  CAS  PubMed  Google Scholar 

  55. Peterson, C. B., Bogomolov, M., Benjamini, Y. & Sabatti, C. TreeQTL: hierarchical error control for eQTL findings. Bioinformatics 32, 2556–2558 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Huang, Q. Q., Ritchie, S. C., Brozynska, M. & Inouye, M. Power, false discovery rate and winner’s curse in eQTL studies. Nucleic Acids Res. 46, e133 (2018).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  57. Atkinson, M. A., Eisenbarth, G. S. & Michels, A. W. Type 1 diabetes. Lancet 383, 69–82 (2014).

    Article  PubMed  Google Scholar 

  58. Vansteelandt, S., Bekaert, M. & Lange, T. Imputation strategies for the estimation of natural direct and indirect effects. Epidemiol. Methods 1, 130–158 (2012).

    Article  Google Scholar 

  59. Steen, J., Loeys, T., Moerkerke, B. & Vansteelandt, S. medflex: an R package for flexible mediation analysis using natural effect models. J. Stat. Softw. 76, 1–46 (2017).

    Article  Google Scholar 

  60. Liang, K.-Y. & Zeger, S. L. Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22 (1986).

    Article  Google Scholar 

  61. Yu, Q., Fan, Y. & Wu, X. General multiple mediation analysis with an application to explore racial disparities in breast cancer survival. J. Biom. Biostat. 5, 1–9 (2014).

    CAS  Google Scholar 

  62. Yu, Q. & Li, B. mma: an R package for mediation analysis with multiple mediators. J. Open Res. Softw. 5, 11 (2017).

    Article  Google Scholar 

  63. Burgess, S., Butterworth, A. & Thompson, S. G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 37, 658–665 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  64. Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314 (2016).

    Article  PubMed  PubMed Central  Google Scholar 

  65. Hartwig, F. P., Davey Smith, G. & Bowden, J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  66. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  67. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  68. Yavorska, O. O. & Burgess, S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int. J. Epidemiol. 46, 1734–1739 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  69. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  70. Ritchie, S. sritchie73/cardiometabolic_prs_plasma_proteome: Nature revisions round 3. Zenodo https://doi.org/10.5281/zenodo.4762747 (2021).

  71. Rasmussen, K. L. Plasma levels of apolipoprotein E, APOE genotype and risk of dementia and ischemic heart disease: a review. Atherosclerosis 255, 145–155 (2016).

    Article  CAS  PubMed  Google Scholar 

  72. Sofat, R. et al. Circulating apolipoprotein E concentration and cardiovascular disease risk: meta-analysis of results from three studies. PLoS Med. 13, e1002146 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  73. Nikpay, M., Soubeyrand, S., Tahmasbi, R. & McPherson, R. Multiomics screening identifies molecular biomarkers causally associated with the risk of coronary artery disease. Circ. Genom. Precis. Med. 13, e002876 (2020).

    Article  CAS  PubMed  Google Scholar 

  74. Ruttmann, E. et al. γ-Glutamyltransferase as a risk factor for cardiovascular disease mortality: an epidemiological investigation in a cohort of 163 944 Austrian adults. Circulation 112, 2130–2137 (2005).

    Article  CAS  PubMed  Google Scholar 

  75. Lee, D. S. et al. Gamma glutamyl transferase and metabolic syndrome, cardiovascular disease, and mortality risk: the Framingham Heart Study. Arterioscler. Thromb. Vasc. Biol. 27, 127–133 (2007).

    Article  CAS  PubMed  Google Scholar 

  76. Kojima, Y. et al. Progranulin expression in advanced human atherosclerotic plaque. Atherosclerosis 206, 102–108 (2009).

    Article  CAS  PubMed  Google Scholar 

  77. Pugeat, M. et al. Interrelations between sex hormone-binding globulin (SHBG), plasma lipoproteins and cardiovascular risk. J. Steroid Biochem. Mol. Biol. 53, 567–572 (1995).

    Article  CAS  PubMed  Google Scholar 

  78. Sutton-Tyrrell, K. et al. Sex-hormone-binding globulin and the free androgen index are related to cardiovascular risk factors in multiethnic premenopausal and perimenopausal women enrolled in the Study of Women Across the Nation (SWAN). Circulation 111, 1242–1249 (2005).

    Article  CAS  PubMed  Google Scholar 

  79. Liu, P. Y., Death, A. K. & Handelsman, D. J. Androgens and cardiovascular disease. Endocr. Rev. 24, 313–340 (2003).

    Article  CAS  PubMed  Google Scholar 

  80. Li, G.-S. et al. Do the mutations of C1GALT1C1 gene play important roles in the genetic susceptibility to Chinese IgA nephropathy? BMC Med. Genet. 10, 101 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Yoshida, T. et al. Association of gene polymorphisms with chronic kidney disease in high- or low-risk subjects defined by conventional risk factors. Int. J. Mol. Med. 23, 785–792 (2009).

    CAS  PubMed  Google Scholar 

  82. Foster, M. C., Yang, Q., Hwang, S.-J., Hoffmann, U. & Fox, C. S. Heritability and genome-wide association analysis of renal sinus fat accumulation in the Framingham Heart Study. BMC Med. Genet. 12, 148 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  83. Madsen, T. E. et al. Circulating SHBG (sex hormone-binding globulin) and risk of ischemic stroke: findings from the WHI. Stroke 51, 1257–1264 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  84. Baumeier, C. et al. Caloric restriction and intermittent fasting alter hepatic lipid droplet proteome and diacylglycerol species and prevent diabetes in NZO mice. Biochim. Biophys. Acta 1851, 566–576 (2015).

    Article  CAS  PubMed  Google Scholar 

  85. Ngo, D. et al. Proteomic profiling reveals novel biomarkers and pathways in type 2 diabetes risk. JCI Insight 6, e144392 (2021).

    Article  PubMed Central  Google Scholar 

  86. Spranger, J. et al. Adiponectin and protection against type 2 diabetes mellitus. Lancet 361, 226–228 (2003).

    Article  CAS  PubMed  Google Scholar 

  87. Lau, W., Andrew, T. & Maniatis, N. High-resolution genetic maps identify multiple type 2 diabetes loci at regulatory hotspots in African Americans and Europeans. Am. J. Hum. Genet. 100, 803–816 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  88. Suckale, J. & Solimena, M. The insulin secretory granule as a signaling hub. Trends Endocrinol. Metab. 21, 599–609 (2010).

    Article  CAS  PubMed  Google Scholar 

  89. Kim-Muller, J. Y. et al. Aldehyde dehydrogenase 1a3 defines a subset of failing pancreatic β cells in diabetic mice. Nat. Commun. 7, 12631 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  90. Voight, B. F. et al. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Guevara-Aguirre, J. et al. Growth hormone receptor deficiency is associated with a major reduction in pro-aging signaling, cancer, and diabetes in humans. Sci. Transl. Med. 3, 70ra13 (2011).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  92. Rajwani, A. et al. Increasing circulating IGFBP1 levels improves insulin sensitivity, promotes nitric oxide production, lowers blood pressure, and protects against atherosclerosis. Diabetes 61, 915–924 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  93. Xu, S., Lind, L., Zhao, L., Lindahl, B. & Venge, P. Plasma prolylcarboxypeptidase (angiotensinase C) is increased in obesity and diabetes mellitus and related to cardiovascular dysfunction. Clin. Chem. 58, 1110–1115 (2012).

    Article  CAS  PubMed  Google Scholar 

  94. Grarup, N., Sandholt, C. H., Hansen, T. & Pedersen, O. Genetic susceptibility to type 2 diabetes and obesity: from genome-wide association studies to rare variants and beyond. Diabetologia 57, 1528–1541 (2014).

    Article  CAS  PubMed  Google Scholar 

  95. Dwinovan, J., Colella, A. D., Chegeni, N., Chataway, T. K. & Sokoya, E. M. Proteomic analysis reveals downregulation of housekeeping proteins in the diabetic vascular proteome. Acta Diabetol. 54, 171–190 (2017).

    Article  CAS  PubMed  Google Scholar 

  96. Lopez, P. H. et al. Mice lacking sialyltransferase ST3Gal-II develop late-onset obesity and insulin resistance. Glycobiology 27, 129–139 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  97. Kato, N. Insights into the genetic basis of type 2 diabetes. J. Diabetes Investig. 4, 233–244 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Levey, A. S. & Coresh, J. Chronic kidney disease. Lancet 379, 165–180 (2012).

    Article  PubMed  Google Scholar 

  99. Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612 (2009).

    Article  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

Participants in the INTERVAL randomized controlled trial were recruited with the active collaboration of NHS Blood and Transplant (www.nhsbt.nhs.uk), which has supported fieldwork and other elements of the trial. DNA extraction and genotyping were co-funded by the National Institute for Health Research (NIHR), the NIHR BioResource (http://bioresource.nihr.ac.uk) and the NIHR Cambridge Biomedical Research Centre (BRC) (no. BRC-1215-20014). Olink Proteomics assays were funded by Biogen. SomaLogic assays were funded by Merck and the NIHR Cambridge BRC (no. BRC-1215-20014). The academic coordinating centre for INTERVAL was supported by core funding from the NIHR Blood and Transplant Research Unit in Donor Health and Genomics (no. NIHR BTRU-2014-10024), UK Medical Research Council (MRC) (no. MR/L003120/1), British Heart Foundation (nos SP/09/002, RG/13/13/30194 and RG/18/13/33946) and the NIHR Cambridge BRC (no. BRC-1215-20014). A complete list of the investigators and contributors to the INTERVAL trial is provided in ref. 25. The academic coordinating centre thanks blood donor centre staff and blood donors for participating in the INTERVAL trial. This work was supported by Health Data Research UK, which is funded by the UK MRC, Engineering and Physical Sciences Research Council (EPSRC), Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and Wellcome. This study was also supported by the Victorian Government’s Operational Infrastructure Support programme. This work was performed using resources provided by the Cambridge Service for Data Driven Discovery operated by the University of Cambridge Research Computing Service (https://www.hpc.cam.ac.uk/high-performance-computing), provided by Dell EMC and Intel using tier-2 funding from the EPSRC (capital grant no. EP/P020259/1), and DiRAC funding from the Science and Technology Facilities Council (www.dirac.ac.uk). This work uses data provided by patients and collected by the NHS and Public Health England as part of their care and support. Data on hospital episode statistics, mortality and cancer registration were obtained from NHS Digital (data sharing agreement reference no. DARS-NIC-156334-711SX). S.C.R. and J.M. were funded by the NIHR Cambridge BRC (no. BRC-1215-20014). S.A.L. is supported by a Canadian Institutes of Health Research postdoctoral fellowship (no. MFE-171279). G.A. was supported by a National Health and Medical Research Council of Australia Early Career Fellowship (no. 1090462). S.B. is supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (no. 204623/Z/16/Z). A.V.K. was supported by grants from the National Human Genome Research Institute (award nos 1K08HG010155 and 5UM1HG008895), an institutional grant from the Broad Institute of MIT and Harvard (variant2function) and a Hassenfeld Scholar Award from Massachusetts General Hospital. J.D. holds a British Heart Foundation Professorship and an NIHR Senior Investigator Award. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. The views expressed in this manuscript are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.

Author information

Authors and Affiliations

Authors

Contributions

S.C.R., S.Z., M. Chaffin, B.G.D., A.C.C., N.S., S.K., A.V.K., A.S.B. and M.I. conceptualized the study. S.C.R., M.A., Y.L. and A.S.B. curated the data. S.C.R. carried out the formal analysis. J.D. and M.I. acquired the funding. S.C.R., M.A., Y.L., B.G.D., A.C.C. and M. Chaffin carried out the investigation. S.C.R., S.A.L., S.M.T., S.L., P.S., J.M., G.A. and M.I. devised the methodology. A.V.K., S.K., A.S.B. and M.I. administered the project. W.H.O., D.J.R., N.A.W., B.G.D., A.C.C., E.D.A., M. Chapman, J.D., A.S.B. and M.I. curated the resources. S.C.R. managed the software. G.A., B.G.D., A.C.C., E.D.A., S.K., A.S.B. and M.I. supervised the study. S.C.R., S.A.L., S.M.T., S.L., P.S., J.M., G.A., S.B. and A.V.K. validated the data. S.C.R. visualized the data. S.C.R. and M.I. wrote the original manuscript draft. S.C.R., S.A.L., S.M.T., S.L., P.S., J.M., G.A., B.G.D., A.C.C., N.S., S.B., A.V.K., J.D., A.S.B. and M.I. reviewed and edited the manuscript draft.

Corresponding authors

Correspondence to Scott C. Ritchie or Michael Inouye.

Ethics declarations

Competing interests

Several authors are now employed by or run pharmaceutical companies. All significant contributions to this study were made before these roles and the named companies had no role in the study. M.A. is an employee of AstraZeneca. P.S. is an employee of Roche. J.M. is an employee of Genomics PLC. G.A. is an employee of CSL Limited. S.K. is the chief executive officer of Verve Therapeutics. The other authors declare no competing interests.

Additional information

Peer review information Nature Metabolism thanks Matthew Nelson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Isabella Samuelson.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Study schematic.

Overview of the study design.

Extended Data Fig. 2 Cohort characteristics.

IQR: interquartile range. Body mass index (BMI) was computed from self-reported height and weight.

Extended Data Fig. 3 Summary statistics for PGS to protein to disease associations.

Beta: standard deviation change in protein levels per standard deviation increase in PGS (from Fig. 1b) in linear regression adjusting for age, sex, 10 genotype PCs, sample measurement batch, and time between blood draw and sample processing. FDR: Benjamini-Hochberg false discovery rate corrected P-value. FDR correction was applied separately for each PGS to all 3,438 P-values from linear regression of each of the 3,438 measured proteins on the respective PGS. Polygenicity: proportion of the genome (%) required to explain the PGS to protein association (from Fig. 1c). HR: hazard ratio for 7.7 year risk of hospitalisation with the respective disease conferred per standard deviation increase in protein levels (from Fig. 2b) in cox proportional hazard models using follow-up as time scale and adjusting for age, sex, sample measurement batch, and time between blood draw and sample processing. Associations highlighted in red indicate significant associations after Bonferroni correction for the 42 tests (P < 0.0012). Associations dulled in grey indicate P > 0.05. % PGS Mediated: Percentage of total association between the respective PGS and 7.7 year risk of hospitalisation with the respective disease mediated by the respective protein (from Fig. 2d). Highlighted in red indicates mediation was significant after Bonferroni correction for the 42 tests (P < 0.0012). Entries dulled in grey indicate P > 0.05. Linear regression, polygenicity, cox proportional hazard models, and mediation analysis were all performed in the same n = 3,087 independent INTERVAL participants. In each instance, 95% CI corresponds to the 95% confidence interval of the respective point estimate. All P-values are two-sided. 95% confidence intervals and P-values could not be formulated for the polygenicity tests. For proteins measured by more than one SomaLogic aptamer (GPD1, IGFBP1, IGFBP2, SHBG, and WFIKKN2) effect sizes were averaged and two-sided P-values were obtained from averaged Z-scores, and aptamer-specific summary statistics are detailed in Supplementary Table 3.

Extended Data Fig. 4 Information about each PGS associated protein.

Aptamer: Sequence ID for the SomaLogic aptamer(s) targeting the protein. A * next to the protein name indicates the aptamer(s) binds to specific isoforms of the listed protein or binds to multiple proteins; see Aptamer target column. Extended details on aptamer sensitivity and specificity can be found in Supplementary Table 2.

Extended Data Fig. 5 Previous evidence for PGS-associated proteins in disease.

Citations provided where association with the respective disease has been previously observed71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97.

Extended Data Fig. 6 Robustness of PGS to protein associations.

a-c) Robustness and longitudinal stability of PGS to protein associations to proteomics technology. d-e) Robustness and longitudinal stability of protein levels to proteomics technology. f) Robustness of PGS-protein associations to environmental and physiological confounding. g) Mediation of PGS-protein associations through body mass index (BMI) for six proteins associated with T2D PGS. a) Compares PGS-protein associations from Fig. 1b in n = 3,087 INTERVAL participants in which protein levels were measured with the SomaLogic platform (x-axis) to PGS-protein associations tested in an independent set of n = 418 INTERVAL participants in which protein levels were measured with the Olink Explore platform (y-axis). In total 1,463 proteins were quantified by the Olink Explore platform, including 907 quantified by the SomaLogic platform, and among these 16 of the 49 PGS-associated proteins from Fig. 1b. b) Compares PGS-protein associations from Fig. 1b (x-axis) to PGS-protein associations tested in an independent set of n = 3,848 INTERVAL participants in which protein levels were measured with the Olink T96 platform (y-axis). In total 265 proteins were quantified by the Olink T96 platform, including 224 quantified by the SomaLogic platform, and among these 4 of the 49 PGS-associated proteins from Fig. 1b. c) Compares PGS-protein associations tested in n = 646 INTERVAL participants in which protein levels were measured with both the SomaLogic platform (x-axis) and, after two-years of follow-up, the Olink T96 platform (y-axis). a-c) Data shown correspond to the beta estimates from linear regression (points) and their 95% confidence interval (bars), indicating standard deviation change in protein levels per standard deviation increase in the respective PGS (denoted by colour). Solid points indicate two-sided P-value < 0.05 for the test on the y-axis. Linear regression on both axes were adjusted for age (at protein measurement), sex, 10 genotype PCs, and platform-specific technical covariates. Full summary statistics including exact P-values are detailed in Supplementary Data 3,b for linear regression tests on y-axes, and in Supplementary Data 3,a for linear regression tests on x-axes. d) Compares protein levels quantified by the SomaLogic platform (x-axes) to protein levels quantified by the Olink T96 platform (y-axes) after two years of follow-up in n = 646 INTERVAL participants. e) Compares protein levels quantified by the Olink T96 platform (x-axes) to protein levels quantified by the Olink Explore platform (y-axes) in n = 418 INTERVAL participants. f) Compares PGS-protein associations from Fig. 1b in n = 3,087 INTERVAL participants (x-axes) to PGS-protein associations (1) additionally adjusted for circadian effects (time of day of blood draw), (2) additionally adjusted for seasonal effects (date of blood draw), (3) when including 87 additional participants with prevalent cardiometabolic disease (n = 3,174 on y-axis), and (4) when adjusting for BMI (n = 3,072 participants with non-missing BMI on y-axis). All associations were testing using linear regression adjusting for age, sex, 10 genotype PCs, sample measurement batch, and time between blood draw and sample measurement in addition to the covariates noted above. Data shown correspond to the beta estimates from linear regression (points) and their 95% confidence interval (bars), indicating standard deviation change in protein levels per standard deviation increase in the respective PGS (denoted by colour). Full summary statistics including exact P-values in these sensitivity analyses are detailed in Supplementary Data 3,c. g) For the six proteins whose association with T2D PGS was attenuated by adjustment for BMI (P > 0.05; Extended Data Fig. 6f) gives, from mediation analysis, the estimated effect of T2D PGS on the protein levels through BMI (standard deviation change in protein levels through BMI per standard deviation increase in T2D PGS), percentage of T2D PGS to protein levels mediated by BMI, and the estimated effect of T2D PGS on protein levels independent of BMI in n = 3,072 INTERVAL participants. All P-values are two-sided.

Extended Data Fig. 7 Polygenicity of PGS to protein associations.

Linkage disequilibrium (LD) blocks contributing to each PGS to protein association in polygenicity tests. Briefly, each PGS was partitioned into 1,703 approximately independent LD blocks54 then tested for association with each protein in linear regression adjusting for age, sex, 10 genotype PCs, sample measurement batch, and time between blood draw and sample processing in 3,087 INTERVAL participants. Full summary statistics including exact two-sided P-values for these tests are detailed in Supplementary Data 3,e. Next, to obtain the set of LD blocks contributing to each PGS to protein association, LD blocks were sequentially removed from the PGS in ascending order by association P-value (two-sided) until the association between resulting PGS and protein levels were attenuated (two-sided P > 0.05). Full summary statistics including exact two-sided P-values for these tests are detailed in Supplementary Data 3,f. The polygenicity of PGS to protein association (% of genome) shown on the left (and in Fig. 1c) was subsequently computed based on the sum of lengths of all contributing LD blocks (in base pairs) as a proportion of the genome. Here, associations (−log10 two-sided P-values) between protein levels and LD blocks contributing to the PGS to protein association are shown. Regions in white contain LD blocks that did not contribute to the PGS to protein association. PGS to protein associations listed in red are those explained by pQTLs (cis and/or trans) rather than polygenic.

Extended Data Fig. 8 Incident disease and PGS validity.

a) Incident disease events over the 7.7 year of follow-up in the n = 3,087 INTERVAL participants. Endpoint: incident disease definition available in INTERVAL for the relevant PGS, as defined by CALIBER phenotyping algorithms. Age of onset: median age of first hospitalisation with the respective endpoint. Numbers in brackets gives the interquartile range. b) Hazard ratio (HR) (points) and 95% confidence interval (95% CI) (horizontal bar) for 7.7 year risk of hospitalisation with the respective endpoint per standard deviation increase in the respective PGS in cox proportional hazards models using follow-up as time scale and adjusting for age, sex, 10 genotype PCs, sample measurement batch, and time between blood draw and sample processing in n = 3,087 INTERVAL participants. P-values are two-sided. c) Association between CKD PGS with estimated glomerular filtration rate (eGFR), a marker of renal function used in chronic kidney disease diagnosis: decreased eGFR is indicative of reduced renal function98. EGFR was computed from serum creatinine in n = 3,307 participants using the CKD-EPI equation99. Association was fit with linear regression adjusting for age and sex, and 10 genotype PCs. The point corresponds to the change in eGFR per standard deviation increase in CKD PGS, and the horizontal bar corresponds to the 95% CI. P-values are two-sided.

Extended Data Fig. 9 Mendelian randomisation analysis.

a) Causal effects of protein levels on disease risk estimated through two-sample Mendelian randomisation analysis of pQTL summary statistics and disease GWAS summary statistics. OR: consensus estimate of the odds ratio conferred per standard deviation increase in protein levels across five Mendelian randomisation methods. * Estimated causal effect is directionally consistent with PGS-protein associations in Fig. 1b. 95% CI: 95% confidence interval. P-value: Two-sided P-value obtained by averaging Z-scores across five Mendelian randomisation methods. Entries are greyed out where P > 0.05, and red where P < 0.0038 (Bonferroni correction for 13 tests). Pleiotropy P-value: two-sided P-value for the intercept term in Egger regression, indicating where P < 0.05 confounding of the causal estimate by horizontal pleiotropy. Full summary statistics including exact P-values are detailed in Supplementary Table 6. b) Dose response curves showing the estimated causal effect of changes in protein levels on disease risk for each protein and disease. Points on each plot show the cis-pQTLs used as genetic instruments for each test. On the x-axes, points show the standard deviation change in protein levels per copy of the minor allele in the pQTL summary statistics, and horizontal bars show + /- the standard error. On the y-axes, points show odds ratio conferred per copy of the minor allele in the GWAS summary statistics, and vertical bars indicate show + /- the standard error. Effect sizes, standard errors, and exact two-sided P-values from pQTL and GWAS summary statistics are detailed in Supplementary Table 7. The slope of the orange dashed line corresponds to the estimated causal effect (consensus Odds Ratio from a). The yellow ribbon shows the 95% confidence interval for the estimated causal effect (slope), accounting also for the 95% confidence interval for the intercept term in Egger regression.

Extended Data Fig. 10 Overlap of results with proteome-wide T2D associations in AGES-Reykjavik.

a) Contingency table tabulating the overlap in results from our study detailed in Extended Data Fig. 3 (rows) with proteome-wide significant associations with incident and prevalent T2D in AGES-Reykjavik in Gudmundsdottir et al. 202035 (columns). One-sided P-values from Fisher’s exact tests are given in each cell testing whether the overlap is greater than expected by chance. Row totals and column totals indicate the number of proteins in each row and column group, and the total overlap in proteins present in both studies (3,250) is given in the bottom right. b) For the 16 of 31 proteins nominally associated with T2D PGS in INTERVAL (Fig. 2b) and proteome-wide significant for incident T2D in AGES-Reykjavik, compares hazard ratios (points; x-axis) for incident T2D in INTERVAL (N = 27 cases over 7.7 years of follow-up in 3,087 participants) to odds ratios (points; y-axis) for incident T2D in AGES-Reykjavik (N = 112 cases after 5 years of follow-up in 2,940 participants). Cox proportional hazards models in INTERVAL were fit with follow-up as time scale, adjusting for age, sex, 10 genotype PCs, sample measurement batch, and time between blood draw and sample processing. Logistic regression in AGES-Reykjavik were fit adjusting for age and sex35. Horizontal and vertical bars correspond to the 95% confidence intervals of the hazard ratios and odds ratios respectively. Two-sided P < 0.0012 indicates association with incident T2D in INTERVAL from Fig. 2b was significant after Bonferroni correction for the 42 tested protein to disease associations. Summary statistics including exact two-sided P-values from both analyses are given in Supplementary Table 8.

Supplementary information

Supplementary Information

Supplementary Notes and Methods, references and legends for Supplementary Tables and Data.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–9.

Supplementary Data

Supplementary Data 1–4.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ritchie, S.C., Lambert, S.A., Arnold, M. et al. Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases. Nat Metab 3, 1476–1483 (2021). https://doi.org/10.1038/s42255-021-00478-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue date:

  • DOI: https://doi.org/10.1038/s42255-021-00478-5

This article is cited by

Search

Quick links

Nature Briefing: Translational Research

Sign up for the Nature Briefing: Translational Research newsletter — top stories in biotechnology, drug discovery and pharma.

Get what matters in translational research, free to your inbox weekly. Sign up for Nature Briefing: Translational Research