Abstract
Discovery of cancer risk variants in the germline genome can shed light on carcinogenesis. Here we describe gene burden association analyses, aggregating rare missense and loss-of-function variants, at 22 cancer sites, including 130,991 cancer cases and 733,486 controls from Iceland, Norway and the United Kingdom. We identified four genes associated with increased cancer risk: the pro-apoptotic BIK for prostate cancer, the autophagy-related ATG12 for colorectal cancer, TG for thyroid cancer and CMTR2 for both lung cancer and cutaneous melanoma. Further, we found genes with rare variants that associate with decreased risk of cancer: AURKB for any cancer, irrespective of site, and PPP1R15A for breast cancer, suggesting that inhibition of PPP1R15A may be a preventive strategy for breast cancer. Our findings pinpoint several new cancer risk genes and emphasize autophagy, apoptosis and the cell stress response as focal points for developing new therapeutics.
Data availability
Summary statistics for all burden associations for the 23 cancer phenotypes are available as Supplementary Data. The sequence variants from the Icelandic population whole-genome sequence data have been deposited at the European Variant Archive under accession PRJEB15197. The UK Biobank data were downloaded under application 56270. Data from the UK Biobank are available by application to all bona fide researchers in the public interest at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. Other data supporting the findings of this study are available within the Article or its Supplementary Information.
Code availability
We used publicly available software together with software/methods developed at deCODE genetics as described in Methods. The publicly available software packages are: R, v.3.6.0 to analyze data and create plots; Graphtyper v.2, https://github.com/DecodeGenetics/graphtyper; Variant Effect Predictor (release 100), https://github.com/Ensembl/ensembl-vep; IMPUTE2 v.2.3.1, https://mathgen.stats.ox.ac.uk/impute/impute_v2.html; dbSNP v.140, http://www.ncbi.nlm.nih.gov/SNP/; CADD v.1.7, https://github.com/kircherlab/CADD-scripts; dbNSFP v.4.1c, https://sites.google.com/site/jpopgen/dbNSFP; BOLT-LMM v.2.1, http://www.hsph.harvard.edu/alkes-price/software/; STAR software package, v.2.7.10, https://github.com/alexdobin/STAR; Ensembl v.87, https://www.ensembl.org/index.html; LeafCutter v.1, https://github.com/davidaknowles/leafcutter; and kallisto v.0.46, https://github.com/pachterlab/kallisto.
Acknowledgements
We would like to thank all study participants for their valuable contribution. We also thank all our colleagues who contributed to data and sample collection and genotyping. We are grateful to all the participating families in Norway who take part in this ongoing cohort study. This work was partly performed on the TSD (Services for Sensitive Data) facilities, owned by the University of Oslo, operated and developed by the TSD service group at the University of Oslo, IT-Department (USIT). The Norwegian Mother, Father and Child Cohort Study is supported by the Norwegian Ministry of Health and Care Services and the Ministry of Education and Research. We thank the Norwegian Institute of Public Health for generating high-quality genomic data. This research is part of the HARVEST collaboration, supported by the Research Council of Norway (RCN 229624). We also thank the NORMENT Centre for providing genotype data, funded by the RCN (223273), South East Norway Health Authorities and Stiftelsen Kristian Gerhard Jebsen (SKGJ). We further thank the Center for Diabetes Research, the University of Bergen for providing genotype data funded by the ERC AdG project SELECTionPREDISPOSED, SKGJ, Trond Mohn Foundation, the RCN, the Novo Nordisk Foundation, the University of Bergen and the Western Norway Health Authorities. S.B. acknowledges the Novo Nordisk Foundation (grant nos. NNF17OC0027594 and NNF14CC0001).
Author information
Contributions
E.V.I., J.G., P.S., T.R., D.F.G. and K.S. designed the study. E.V.I., J.G., V.T., G.S., S.K., G.H.H., M.I.M., A.O., G.B.W., A.S., D.B., G. Thorleifsson, B.H., P. Melsted and D.F.G. analyzed the data and interpreted the results. J.G., S.N.S., S.S., H.S., I.J., E.S., O.B.P., C.E., M.B., M.P., A.R., H.V.S., I.G., J. Hillingsø, S.E.B., U.L., E. Høgdall, H.U., S.B., S.R.O., I.E.S., O.F., S.D., A.H., P. Moller, M.D.-V., J. Haavik, O.A.A., E. Hovig, B.A.A., R.H., O.T.J., T.V., S.J., P.H.M., J.H.O., B.S., J.G.J., G. Tryggvason, H.H., T.R. and K.S. performed recruitment and phenotyping and provided reference data. E.V.I., J.G., T.R., D.F.G. and K.S. drafted the manuscript with input and comments from V.T., G.S., S.K., S.N.S., G.H., S.S., B.H. and P.S. All authors contributed to the final version of the manuscript.
Ethics declarations
Competing interests
E.V.I., J.G., V.T., G.S., S.K., S.N.S., G.H.H., M.I.M., A.O., G.B.W., A.S., S.S., D.B., G. Thorleifsson, B.H., P. Melsted, H.S., I.J., H.H., P.S., T.R., D.F.G. and K.S. are employees at deCODE genetics/Amgen, Inc. O.A.A. is a consultant to Cortechs.ai. C.E. has obtained unrestricted research grants from Abbott Diagnostics and Novo Nordisk A/S with no personal fees. S.B. has ownerships in Intomics A/S, Hoba Therapeutics Aps, Novo Nordisk A/S, Lundbeck A/S, ALK abello A/S, Eli Lilly and Co. and managing board memberships in Proscion A/S and Intomics A/S. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Douglas Easton and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Visual representation for the 34 genes identified to associate with cancer.
The genes in the blue circle were identified through burden of LOF variants, and the genes in the green circle were identified through burden of LOF+missense variants. Ten genes were found to associate with cancer through both burden of LOF and LOF+missense variants. Newly identified genes are colored in red.
Extended Data Fig. 2 Burden association results for cancer sites not shown in Fig. 2.
The effects (log(OR)) from both LOF and LOF+missense burden associations are shown as dots for genes that associated significantly after correction for multiple testing (P < 1.3 × 10⁻⁶) using at least one of the variant selection methods, for basal cell carcinoma of the skin, cervical cancer, cutaneous melanoma, endometrial cancer, gastric cancer, head and neck cancer, kidney cancer, ovarian cancer, pancreatic cancer and squamous cell carcinoma of the skin. The color indicates the variant selection method: blue for LOF variants and green for LOF+missense variants. A logistic regression was used to test for the association and the two-sided P-values were obtained from a likelihood ratio test (Supplementary Table 2). The error bars represent the 95% confidence intervals for the estimated effects.
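As an illustration of the burden test summarized in this legend, the following minimal Python sketch collapses rare qualifying variants into a per-individual carrier indicator and then estimates log(OR), a 95% confidence interval and a two-sided likelihood ratio test P-value. It assumes a logistic model with carrier status as the only predictor, in which case the likelihood ratio statistic equals the G statistic of the 2×2 table. Any covariate adjustment used in the actual analyses is not reproduced, the Wald interval is an assumption of the sketch, and the function names are hypothetical.

```python
import math

THRESHOLD = 1.3e-6  # significance threshold quoted in the legend


def is_carrier(allele_counts):
    """Collapse one individual's qualifying rare-variant allele counts into carrier status."""
    return any(count > 0 for count in allele_counts)


def burden_test(carrier, case):
    """Test carrier status against case status (parallel sequences of booleans).

    Returns log(OR), a Wald 95% CI (an assumption of this sketch) and the
    two-sided likelihood ratio test P-value on 1 degree of freedom.
    """
    table = [[0, 0], [0, 0]]  # rows: carrier / non-carrier; columns: case / control
    for is_car, is_case in zip(carrier, case):
        table[0 if is_car else 1][0 if is_case else 1] += 1
    (a, b), (c, d) = table
    if min(a, b, c, d) == 0:
        raise ValueError("log(OR) is undefined when a cell of the 2x2 table is empty")
    n = a + b + c + d
    log_or = math.log(a * d / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # With a single binary predictor, the logistic LR statistic is the G statistic.
    g = 2 * sum(
        obs * math.log(obs * n / (row * col))
        for obs, row, col in (
            (a, a + b, a + c), (b, a + b, b + d),
            (c, c + d, a + c), (d, c + d, b + d),
        )
    )
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2))  # chi-square tail, 1 d.f.
    return {
        "log_or": log_or,
        "ci": (log_or - 1.96 * se, log_or + 1.96 * se),
        "p": p_value,
        "significant": p_value < THRESHOLD,
    }
```

Collapsing to a single carrier indicator keeps the test at one degree of freedom per gene, which is what makes aggregation of individually rare variants informative.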
Extended Data Fig. 3 Manhattan plots.
Manhattan plots showing the association results from the gene-based burden test for (a) breast cancer, (b) prostate cancer, (c) colorectal cancer, (d) lung cancer, (e) thyroid cancer and (f) cutaneous melanoma. Logistic regression and two-sided likelihood ratio tests were used to test for associations. The point shape indicates the selected variants for the burden test: round dots for LOF variants and triangles for LOF and selected moderate-impact variants. Genes that do not have rare coding variants previously reported for cancer are colored in red. The grey line represents the significance threshold correcting for multiple testing, 1.3 × 10⁻⁶.
Extended Data Fig. 4 The cumulative incidence of breast, prostate and colorectal cancer.
The cumulative incidence curves, 1 − S(t), where S(t) is the survival function estimated with the Kaplan-Meier method, are shown for non-carriers and carriers for all genes that associate significantly in the burden association scan with (a,b) breast cancer, (c,d) prostate cancer and (e,f) colorectal cancer. In (a), (c) and (e), the cumulative incidence is shown among chip-typed individuals in Iceland (N = 173,025); in (b), (d) and (f), it is shown among white British/Irish individuals in the UK Biobank (N = 431,079).
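The quantity plotted in these panels can be sketched as follows. This is a minimal Kaplan-Meier implementation, not the code used for the figure: it assumes each individual contributes a follow-up time (for example, age at diagnosis or at censoring) and an event flag, and the function name is hypothetical. Individuals censored at an event time are counted as still at risk at that time, which is the usual convention.

```python
def cumulative_incidence(times, events):
    """Return [(t, 1 - S(t))] at each event time, with S(t) the Kaplan-Meier estimate.

    times:  follow-up time per individual (diagnosis or censoring)
    events: 1 if the individual was diagnosed at that time, 0 if censored
    """
    ordered = sorted(zip(times, events))
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(ordered) and ordered[i][0] == t:
            n_events += ordered[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= n_leaving
    return curve
```

Running the function separately on carriers and non-carriers gives the two curves that each panel compares.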
Extended Data Fig. 5 Associations between microsatellite alleles in BIK and prostate cancer.
Shown are the effect sizes (log(OR)) against allele lengths, for alleles that have at least 100 carriers (squares). Logistic regression and two-sided likelihood ratio tests were used to test for associations (Supplementary Table 8). The error bars represent the 95% confidence intervals. The grey line marks the 15-repeat reference allele length. Alleles were found with popSTR in the UK Biobank (blue) and Iceland (green). Additionally, two in-frame deletions detected in FinnGen, rs965427251 and rs759887547, corresponding to allele lengths of 5 and 9 repeats, are included in the figure (red). Weighted linear regression was performed separately for alleles with ≥ 15 repeats (insertions) and alleles with ≤ 15 repeats (deletions), weighted by f*(1-f), where f is the allele frequency. The regression lines and P-values are shown in purple for deletions and in yellow for insertions.
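The weighted regression described in this legend can be sketched in closed form. The following Python example is illustrative only: it assumes each allele is supplied as a (repeat length, log(OR), allele frequency) tuple, fits deletions and insertions separately with weights f*(1-f), and returns slope and intercept. The function names are hypothetical, and the regression P-values shown in the figure are not computed here.

```python
def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_by_direction(alleles, reference_length=15):
    """Fit separate weighted lines for deletions and insertions.

    alleles: iterable of (repeat_length, log_or, allele_frequency) tuples.
    The reference-length allele is included in both fits, as in the legend.
    """
    alleles = list(alleles)
    groups = (
        ("deletions", lambda length: length <= reference_length),
        ("insertions", lambda length: length >= reference_length),
    )
    fits = {}
    for name, keep in groups:
        rows = [(n, beta, f * (1 - f)) for n, beta, f in alleles if keep(n)]
        x, y, w = zip(*rows)
        fits[name] = weighted_line(x, y, w)
    return fits
```

Weighting by f*(1-f) gives more influence to alleles whose effects are estimated from more carriers, so that very rare alleles with noisy estimates do not dominate the slope.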
Extended Data Fig. 6 Statistical power.
The statistical power (squares) to detect an association between a rare burden genotype and a disease, estimated with simulations for different carrier frequencies and odds ratios (OR), assuming a sample size of 200,000 individuals and 6% disease prevalence. Logistic regression and two-sided likelihood ratio tests were used to test for associations. The power is defined as the number of associations with P < 0.05 divided by the total number of tests across 1,000 simulations.
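A power simulation of this kind can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure used for the figure: the disease risk in non-carriers is set equal to the 6% prevalence (reasonable only when carriers are rare), no covariates are modelled, large binomial counts are drawn with a normal approximation for speed, and all function names are hypothetical.

```python
import math
import random


def lrt_pvalue(a, b, c, d):
    """Two-sided LRT P-value (1 d.f.) for carrier cases/controls (a, b) vs non-carriers (c, d)."""
    n = a + b + c + d
    g = 0.0
    for obs, row, col in ((a, a + b, a + c), (b, a + b, b + d),
                          (c, c + d, a + c), (d, c + d, b + d)):
        if obs > 0:  # empty cells contribute nothing to the statistic
            g += obs * math.log(obs * n / (row * col))
    return math.erfc(math.sqrt(max(2.0 * g, 0.0) / 2.0))


def draw_binomial(n, p, rng):
    """Draw from Binomial(n, p): normal approximation for large counts, exact otherwise."""
    if n <= 0 or p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    mean, var = n * p, n * p * (1.0 - p)
    if var > 100.0:
        return min(n, max(0, round(rng.gauss(mean, math.sqrt(var)))))
    # Exact draw: jump from success to success using geometric waiting times.
    successes, position, log_q = 0, 0, math.log1p(-p)
    while True:
        position += int(math.log(1.0 - rng.random()) / log_q) + 1
        if position > n:
            return successes
        successes += 1


def simulate_power(carrier_freq, odds_ratio, n=200_000, prevalence=0.06,
                   n_sims=1000, alpha=0.05, seed=1):
    """Fraction of simulated data sets in which the LRT P-value falls below alpha."""
    rng = random.Random(seed)
    p0 = prevalence  # assumption: non-carrier risk equals the population prevalence
    odds1 = odds_ratio * p0 / (1.0 - p0)
    p1 = odds1 / (1.0 + odds1)  # carrier risk implied by the odds ratio
    hits = 0
    for _ in range(n_sims):
        carriers = draw_binomial(n, carrier_freq, rng)
        a = draw_binomial(carriers, p1, rng)
        c = draw_binomial(n - carriers, p0, rng)
        if lrt_pvalue(a, carriers - a, c, n - carriers - c) < alpha:
            hits += 1
    return hits / n_sims
```

Calling `simulate_power` over a grid of carrier frequencies and odds ratios would produce a power surface of the kind summarized in the figure; with an odds ratio of 1 the returned value estimates the type I error rate and should be close to `alpha`.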
Extended Data Fig. 7 Manhattan plots for cancers combined.
Manhattan plots for burden association results with cancers combined, irrespective of site, for (a) all results and (b) results with P > 1 × 10⁻¹⁰. Logistic regression and two-sided likelihood ratio tests were used to test for associations. The grey line represents the significance threshold correcting for multiple testing, 1.3 × 10⁻⁶.
Extended Data Fig. 8 The cumulative incidence of all cancers combined.
The cumulative incidence curves are shown for all cancers combined, irrespective of site, among carriers of LOF+missense variants in AURKB in blue and non-carriers in red, for (a) chip-typed individuals in Iceland (N=173,025) and (b) white British/Irish individuals in the UK Biobank (N=431,079).
Extended Data Fig. 9 Survival probability.
Kaplan-Meier survival curves comparing survival probability for cancer cases in the UK Biobank for carriers of LOF+missense variants in AURKB in blue and non-carriers in red. The x-axis shows time from diagnosis to death in years.
Supplementary information
Supplementary Information
Supplementary Notes 1–5, Figs. 1–9 and References.
Supplementary Tables 1–17
The first tab includes a title and description for each Supplementary Table.
Supplementary Data
Burden association results for 22 cancer phenotypes.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ivarsdottir, E.V., Gudmundsson, J., Tragante, V. et al. Gene-based burden tests of rare germline variants identify six cancer susceptibility genes. Nat Genet 56, 2422–2433 (2024). https://doi.org/10.1038/s41588-024-01966-6
DOI: https://doi.org/10.1038/s41588-024-01966-6