Gene-based burden tests of rare germline variants identify six cancer susceptibility genes

Abstract

Discovery of cancer risk variants in the sequence of the germline genome can shed light on carcinogenesis. Here we describe gene burden association analyses, aggregating rare missense and loss-of-function variants, at 22 cancer sites, including 130,991 cancer cases and 733,486 controls from Iceland, Norway and the United Kingdom. We identified four genes associated with increased cancer risk: the pro-apoptotic BIK for prostate cancer, the autophagy-involved ATG12 for colorectal cancer, TG for thyroid cancer and CMTR2 for both lung cancer and cutaneous melanoma. Further, we found genes with rare variants that associate with decreased risk of cancer: AURKB for any cancer, irrespective of site, and PPP1R15A for breast cancer, suggesting that inhibition of PPP1R15A may be a preventive strategy for breast cancer. Our findings pinpoint several new cancer risk genes and emphasize autophagy, apoptosis and cell stress response as focus points for developing new therapeutics.

Fig. 1: Study design and summary of results.
Fig. 2: Burden association results.
Fig. 3: Cumulative incidence of cancer among carriers and noncarriers.
Fig. 4: The p.S87G variant in BIK results in exon skipping.

Data availability

Summary statistics for all burden associations for the 23 cancer phenotypes are available as Supplementary Data. The sequence variants from the Icelandic population whole-genome sequence data have been deposited at the European Variant Archive under accession PRJEB15197. The UK Biobank data were downloaded under application 56270. Data from the UK Biobank are available by application to all bona fide researchers in the public interest at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. Other data supporting the findings of this study are available within the Article or its Supplementary Information.

Code availability

We used publicly available software together with software and methods developed at deCODE genetics, as described in Methods. The publicly available software packages are: R v.3.6.0, used to analyze data and create plots; Graphtyper v.2, https://github.com/DecodeGenetics/graphtyper; Variant Effect Predictor (release 100), https://github.com/Ensembl/ensembl-vep; IMPUTE2 v.2.3.1, https://mathgen.stats.ox.ac.uk/impute/impute_v2.html; dbSNP v.140, http://www.ncbi.nlm.nih.gov/SNP/; CADD v.1.7, https://github.com/kircherlab/CADD-scripts; dbNSFP v.4.1c, https://sites.google.com/site/jpopgen/dbNSFP; BOLT-LMM v.2.1, http://www.hsph.harvard.edu/alkes-price/software/; STAR v.2.7.10, https://github.com/alexdobin/STAR; Ensembl v.87, https://www.ensembl.org/index.html; LeafCutter v.1, https://github.com/davidaknowles/leafcutter; and kallisto v.0.46, https://github.com/pachterlab/kallisto.

Acknowledgements

We thank all study participants for their valuable contribution. We also thank all our colleagues who contributed to data and sample collection and to genotyping. We are grateful to all the participating families in Norway who take part in this ongoing cohort study. This work was partly performed on the TSD (Services for Sensitive Data) facilities, owned by the University of Oslo, operated and developed by the TSD service group at the University of Oslo, IT-Department (USIT). The Norwegian Mother, Father and Child Cohort Study is supported by the Norwegian Ministry of Health and Care Services and the Ministry of Education and Research. We thank the Norwegian Institute of Public Health for generating high-quality genomic data. This research is part of the HARVEST collaboration, supported by the Research Council of Norway (RCN 229624). We also thank the NORMENT Centre for providing genotype data, funded by the RCN (223273), South East Norway Health Authorities and Stiftelsen Kristian Gerhard Jebsen (SKGJ). We further thank the Center for Diabetes Research, the University of Bergen for providing genotype data funded by the ERC AdG project SELECTionPREDISPOSED, SKGJ, Trond Mohn Foundation, the RCN, the Novo Nordisk Foundation, the University of Bergen and the Western Norway Health Authorities. S.B. acknowledges the Novo Nordisk Foundation (grant nos. NNF17OC0027594 and NNF14CC0001).

Author information

Contributions

E.V.I., J.G., P.S., T.R., D.F.G. and K.S. designed the study. E.V.I., J.G., V.T., G.S., S.K., G.H.H., M.I.M., A.O., G.B.W., A.S., D.B., G. Thorleifsson, B.H., P. Melsted and D.F.G. analyzed the data and interpreted the results. J.G., S.N.S., S.S., H.S., I.J., E.S., O.B.P., C.E., M.B., M.P., A.R., H.V.S., I.G., J. Hillingsø, S.E.B., U.L., E. Høgdall, H.U., S.B., S.R.O., I.E.S., O.F., S.D., A.H., P. Moller, M.D.-V., J. Haavik, O.A.A., E. Hovig, B.A.A., R.H., O.T.J., T.V., S.J., P.H.M., J.H.O., B.S., J.G.J., G. Tryggvason, H.H., T.R. and K.S. performed recruitment and phenotyping and provided reference data. E.V.I., J.G., T.R., D.F.G. and K.S. drafted the manuscript with input and comments from V.T., G.S., S.K., S.N.S., G.H., S.S., B.H. and P.S. All authors contributed to the final version of the manuscript.

Corresponding authors

Correspondence to Erna V. Ivarsdottir or Kari Stefansson.

Ethics declarations

Competing interests

E.V.I., J.G., V.T., G.S., S.K., S.N.S., G.H.H., M.I.M., A.O., G.B.W., A.S., S.S., D.B., G. Thorleifsson, B.H., P. Melsted, H.S., I.J., H.H., P.S., T.R., D.F.G. and K.S. are employees of deCODE genetics/Amgen, Inc. O.A.A. is a consultant to Cortechs.ai. C.E. has obtained unrestricted research grants from Abbott Diagnostics and Novo Nordisk A/S with no personal fees. S.B. has ownerships in Intomics A/S, Hoba Therapeutics Aps, Novo Nordisk A/S, Lundbeck A/S, ALK abello A/S and Eli Lilly and Co., and managing board memberships in Proscion A/S and Intomics A/S. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Douglas Easton and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Visual representation for the 34 genes identified to associate with cancer.

The genes in the blue circle were identified through burden of LOF variants, and the genes in the green circle were identified through burden of LOF+missense variants. Ten genes were found to associate with cancer through both burden of LOF and LOF+missense variants. Newly identified genes are colored in red.

Extended Data Fig. 2 Burden association results for cancer sites not shown in Fig. 2.

The effects (log(OR)) from both LOF and LOF+missense burden associations are shown as dots for genes that associated significantly after correcting for multiple testing (P < 1.3 × 10⁻⁶) using at least one of the variant selection methods, for the following cancer sites: basal cell carcinoma of the skin, cervical cancer, cutaneous melanoma, endometrial cancer, gastric cancer, head and neck cancer, kidney cancer, ovarian cancer, pancreatic cancer and squamous cell carcinoma of the skin. The color indicates the variant selection method: blue for LOF variants and green for LOF+missense variants. Logistic regression was used to test for association, and the two-sided P values were obtained from a likelihood ratio test (Supplementary Table 2). The error bars represent the 95% confidence intervals for the estimated effects.
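The quantities named in this caption (log(OR), its 95% confidence interval and a two-sided likelihood ratio P value) can be illustrated with a minimal sketch. It assumes the simplest possible setting, which is not necessarily the one used in the study: a logistic regression of case status on a single binary carrier indicator with no covariates. In that setting the fit reduces to a 2×2 table, the likelihood ratio statistic has a closed form, and a Wald interval is used for the confidence bounds. The function name and the counts in the usage note are hypothetical.

```python
import math


def burden_association(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers):
    """Carrier-status association from a 2x2 table (assumption: no covariates).

    Returns (log_or, ci_low, ci_high, p_value). The P value comes from a
    two-sided likelihood ratio test with one degree of freedom.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four cells to be positive")

    # Maximum-likelihood log odds ratio and its Wald standard error.
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low, ci_high = log_or - 1.96 * se, log_or + 1.96 * se

    # Likelihood ratio statistic: 2 * sum(observed * ln(observed / expected)),
    # where "expected" is the count under no association (the null model).
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    lr_stat = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
    lr_stat = max(lr_stat, 0.0)  # guard against tiny negative rounding error

    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(lr_stat / 2))
    return log_or, ci_low, ci_high, p_value
```

For example, `burden_association(30, 970, 100, 9900)` returns a positive log(OR) with a confidence interval that excludes zero. An analysis adjusted for covariates would need an iterative model fit instead of this closed form.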

Extended Data Fig. 3 Manhattan plots.

Manhattan plots showing the association results from the gene-based burden test for: (a) breast cancer, (b) prostate cancer, (c) colorectal cancer, (d) lung cancer, (e) thyroid cancer and (f) cutaneous melanoma. Logistic regression and two-sided likelihood ratio tests were used to test for associations. The point shape indicates the variants selected for the burden test: round dots for LOF variants and triangles for LOF and selected moderate-impact variants. Genes that do not have rare coding variants previously reported for cancer are colored in red. The grey line represents the significance threshold after correcting for multiple testing, 1.3 × 10⁻⁶.
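The caption gives the significance threshold but not how it was derived. As a rough illustration only, the sketch below assumes a Bonferroni-type correction at a family-wise level of 0.05, which is an assumption and not something stated here, and back-calculates the number of tests that such a threshold would imply. The helper names and the gene labels in the usage note are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests for which alpha / n_tests equals the given threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


def significant_genes(p_values, threshold=1.3e-6):
    """Return the genes whose P value is below the threshold, smallest P first."""
    hits = [(p, gene) for gene, p in p_values.items() if p < threshold]
    return [gene for p, gene in sorted(hits)]
```

Under that assumption, `implied_number_of_tests(0.05, 1.3e-6)` gives the test count that would correspond to the grey line, and `significant_genes({"GENE_A": 1e-8, "GENE_B": 0.01})` keeps only the hypothetical gene that lies above it on the plot.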

Extended Data Fig. 4 The cumulative incidence of breast-, prostate- and colorectal cancer.

The cumulative incidence curves, 1−S(t), where S(t) is the survival function estimated with the Kaplan-Meier method, are shown for non-carriers and carriers for all genes that associate significantly in the burden association scan with (a, b) breast cancer, (c, d) prostate cancer and (e, f) colorectal cancer. In a, c and e, the cumulative incidence is shown among chip-typed individuals in Iceland (N=173,025); in b, d and f, it is shown among white British/Irish individuals in the UK Biobank (N=431,079).
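The relationship between the Kaplan-Meier estimate S(t) and the cumulative incidence 1−S(t) can be sketched as below. This is a minimal product-limit estimator, assuming each individual is described by a hypothetical (time, event) pair in which event is 1 for a cancer diagnosis and 0 for censoring. The function name and the toy data are illustrative and do not come from the study.

```python
def cumulative_incidence(observations):
    """Return [(time, 1 - S(time))] at each distinct event time.

    `observations` is a list of (time, event) pairs; event is 1 for a
    diagnosis and 0 for censoring. S(t) is the Kaplan-Meier estimate.
    """
    obs = sorted(observations)
    at_risk = len(obs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(obs):
        t = obs[i][0]
        events = 0
        leaving = 0
        # Group everyone who leaves the risk set at time t.
        while i < len(obs) and obs[i][0] == t:
            events += obs[i][1]
            leaving += 1
            i += 1
        if events:
            # Product-limit update: S(t) = S(t-) * (1 - d / n).
            survival *= 1.0 - events / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= leaving
    return curve


# Toy data: diagnoses at ages 1, 3 and 4; one individual censored at age 2.
toy_curve = cumulative_incidence([(1, 1), (2, 0), (3, 1), (4, 1)])
```

In practice the curve would be computed separately for carriers and non-carriers and the two step functions plotted together.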

Extended Data Fig. 5 Associations between microsatellite alleles in BIK and prostate cancer.

Shown are the effect sizes (log(OR)) against allele lengths, for alleles that have at least 100 carriers (squares). Logistic regression and two-sided likelihood ratio tests were used to test for associations (Supplementary Table 8). The error bars represent the 95% confidence intervals. The grey line marks the 15-repeat reference allele length. Alleles were found with popSTR in the UK Biobank (blue) and Iceland (green). Additionally, two inframe deletions detected in FinnGen, rs965427251 and rs759887547, corresponding to allele lengths of 5 and 9 repeats, are included in the figure (red). Weighted linear regression was performed separately for alleles with ≥ 15 repeats (insertions) and alleles with ≤ 15 repeats (deletions), weighted by f(1−f), where f is the allele frequency. The regression lines and P-values are shown in purple for deletions and in yellow for insertions.
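The weighting scheme described above can be sketched as a closed-form weighted least-squares fit of log(OR) on allele length, with weight f(1−f) per allele and separate fits on each side of the 15-repeat reference. The function names and the allele values below are hypothetical, and the sketch returns only slope and intercept; it does not reproduce the P-values shown in the figure.

```python
def weighted_linear_fit(lengths, effects, freqs):
    """Weighted least squares of effect on length, weights f * (1 - f)."""
    weights = [f * (1.0 - f) for f in freqs]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, lengths)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, effects)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, lengths))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, lengths, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def fit_by_direction(alleles, reference_length=15):
    """Fit deletions (<= reference) and insertions (>= reference) separately.

    `alleles` is a list of (length, log_or, allele_frequency) tuples. The
    reference allele is included in both fits, as in the legend.
    """
    deletions = [a for a in alleles if a[0] <= reference_length]
    insertions = [a for a in alleles if a[0] >= reference_length]
    return {
        "deletions": weighted_linear_fit(*zip(*deletions)),
        "insertions": weighted_linear_fit(*zip(*insertions)),
    }


# Hypothetical alleles: (repeat length, log(OR), allele frequency).
example_alleles = [
    (13, 0.20, 0.01), (14, 0.10, 0.05), (15, 0.00, 0.80),
    (16, -0.05, 0.10), (17, -0.10, 0.04),
]
fits = fit_by_direction(example_alleles)
```

Weighting by f(1−f) gives common alleles, whose effects are estimated more precisely, more influence on the fitted line than rare ones.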

Extended Data Fig. 6 Statistical power.

The statistical power (squares) to detect an association between a rare burden genotype and a disease was estimated with simulations for different carrier frequencies and odds ratios (OR), assuming a sample size of 200,000 individuals and 6% disease prevalence. Logistic regression and two-sided likelihood ratio tests were used to test for associations. Power is defined as the number of associations with P<0.05 divided by the total number of tests across 1,000 simulations.
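A power simulation of this kind can be sketched as follows. The sketch assumes a single binary carrier genotype with no covariates, so that the logistic regression likelihood ratio test is equivalent to a G-test on the 2×2 table, and it assumes that the risk among non-carriers equals the stated prevalence because carriers are rare. All function names, the seed and the default arguments are illustrative choices, not the authors' implementation.

```python
import math
import random


def lrt_pvalue(a, b, c, d):
    """Two-sided likelihood ratio P-value for a 2x2 table (1 degree of freedom).

    a, b: cases and controls among carriers; c, d: among non-carriers.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        return 1.0  # no variation in genotype or phenotype
    cells = ((a, rows[0], cols[0]), (b, rows[0], cols[1]),
             (c, rows[1], cols[0]), (d, rows[1], cols[1]))
    g = sum(2.0 * obs * math.log(obs * n / (r * k)) for obs, r, k in cells if obs)
    return math.erfc(math.sqrt(max(g, 0.0) / 2.0))


def binomial(rng, n, p):
    """Draw a binomial count; uses binomialvariate where available (Python 3.12+)."""
    if hasattr(rng, "binomialvariate"):
        return rng.binomialvariate(n, p)
    return sum(rng.random() < p for _ in range(n))


def estimate_power(carrier_freq, odds_ratio, n=200_000, prevalence=0.06,
                   alpha=0.05, n_sim=1000, seed=1):
    """Fraction of simulated data sets in which the test gives P < alpha."""
    rng = random.Random(seed)
    odds_noncarrier = prevalence / (1.0 - prevalence)
    odds_carrier = odds_ratio * odds_noncarrier
    risk_carrier = odds_carrier / (1.0 + odds_carrier)
    hits = 0
    for _ in range(n_sim):
        carriers = binomial(rng, n, carrier_freq)
        a = binomial(rng, carriers, risk_carrier)
        c = binomial(rng, n - carriers, prevalence)
        if lrt_pvalue(a, carriers - a, c, n - carriers - c) < alpha:
            hits += 1
    return hits / n_sim
```

Calling estimate_power over a grid of carrier frequencies and odds ratios gives one power value per grid cell. On Python versions without binomialvariate the fallback sampler is slow at the full sample size, so a smaller n or n_sim is advisable for a quick check.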

Extended Data Fig. 7 Manhattan plots for cancers combined.

Manhattan plots for burden association results with cancers combined, irrespective of site, for (a) all results and (b) results with P > 1×10⁻¹⁰. Logistic regression and two-sided likelihood ratio tests were used to test for associations. The grey line represents the significance threshold after correcting for multiple testing, P = 1.3×10⁻⁶.

Extended Data Fig. 8 The cumulative incidence of all cancers combined.

The cumulative incidence curves are shown for all cancers combined, irrespective of site, among carriers of LOF+missense variants in AURKB in blue and non-carriers in red, for (a) chip-typed individuals in Iceland (N=173,025) and (b) white British/Irish individuals in the UK Biobank (N=431,079).

Extended Data Fig. 9 Survival probability.

Kaplan-Meier survival curves comparing survival probability among cancer cases in the UK Biobank for carriers of LOF+missense variants in AURKB (blue) and non-carriers (red). The x-axis shows time from diagnosis to death in years.

Supplementary information

Supplementary Information

Supplementary Notes 1–5, Figs. 1–9 and References.

Reporting Summary

Peer Review File

Supplementary Tables 1–17

The first tab includes a title and description for each Supplementary Table.

Supplementary Data

Burden association results for 22 cancer phenotypes.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ivarsdottir, E.V., Gudmundsson, J., Tragante, V. et al. Gene-based burden tests of rare germline variants identify six cancer susceptibility genes. Nat Genet 56, 2422–2433 (2024). https://doi.org/10.1038/s41588-024-01966-6

  • DOI: https://doi.org/10.1038/s41588-024-01966-6
