Abstract
Penetrance is the probability that an individual with a pathogenic genetic variant develops a specific disease. Knowing the penetrance of variants for monogenic disorders is important for counseling of individuals. Until recently, estimates of penetrance have largely relied on affected individuals and their at-risk family members being clinically referred for genetic testing, a ‘phenotype-first’ approach. This approach substantially overestimates the penetrance of variants because of ascertainment bias. The recent availability of whole-genome sequencing data in individuals from very-large-scale population-based cohorts now allows ‘genotype-first’ estimates of penetrance for many conditions. Although this type of population-based study can underestimate penetrance owing to recruitment biases, it provides more accurate estimates of penetrance for secondary or incidental findings. Here, we provide guidance for the conduct of penetrance studies to ensure that robust genotypes and phenotypes are used to accurately estimate penetrance of variants and groups of similarly annotated variants from population-based studies.
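As an illustration of the definition above, a genotype-first point estimate of penetrance can be sketched as the proportion of variant carriers in a cohort who have the disease, together with an interval reflecting the small carrier counts typical of rare variants. The function name `penetrance_estimate`, its parameters and the choice of a Wilson score interval are assumptions made for this sketch and are not taken from the article. It does not address age-dependent penetrance, recruitment bias, or genotype and phenotype quality control.

```python
import math


def penetrance_estimate(affected_carriers, total_carriers, z=1.96):
    """Return (point estimate, lower bound, upper bound) for penetrance.

    The point estimate is the fraction of carriers who are affected.
    The bounds are a Wilson score interval, which is an illustrative
    choice here because it behaves sensibly for small counts.
    z=1.96 corresponds to an approximate 95% interval.
    """
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must be between 0 and total_carriers")

    n = total_carriers
    p = affected_carriers / n

    # Wilson score interval for a binomial proportion.
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom

    lower = max(0.0, centre - half_width)
    upper = min(1.0, centre + half_width)
    return p, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    estimate, low, high = penetrance_estimate(affected_carriers=12, total_carriers=100)
    print(f"penetrance = {estimate:.2f} (interval {low:.2f} to {high:.2f})")
```

The same function applies to a single variant or to a group of similarly annotated variants, provided the carrier and affected counts are defined consistently.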
Acknowledgements
We thank A. Hattersley and numerous other colleagues and reviewers for insightful conversations and guidance. This research has been conducted using the UK Biobank resource under application numbers 49847 and 9072. The current work was supported by Diabetes UK (19/0005994), the MRC (MR/T00200X/1) and Wellcome (226083/Z/22/Z). K.A.P. is supported by a Wellcome Clinical Fellowship (219606/Z/19/Z). J.S.W. is supported by the Medical Research Council (UK), the Sir Jules Thorn Charitable Trust (21JTA), the British Heart Foundation (RE/18/4/34215) and the NIHR Imperial College Biomedical Research Centre. We acknowledge the use of the University of Exeter High-Performance Computing facility in carrying out this work. This study was supported by the National Institute for Health and Care Research Exeter Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
Author information
Contributions
C.F.W. and M.N.W. conceived the study; L.N.S. and K.A.P. performed the diabetes analysis outlined in Fig. 1; C.F.W., M.N.W., L.J., A.M. and K.A.P. curated variants, genes and conditions to identify potential errors; C.F.W. wrote the first draft of the manuscript; J.S.W., D.G.M. and H.L.R. provided expert input into the manuscript; all authors contributed to revisions and the final manuscript.
Ethics declarations
Competing interests
M.N.W. is a co-investigator on a Randox Laboratories R&D research grant and received translational industry academic funding from Randox Laboratories R&D relating to autoimmune GRS for prediction and classification of disease. M.N.W. and K.A.P. have received royalties from Randox as co-inventors of a type 1 diabetes genetic risk score product. D.G.M. is a paid advisor to GlaxoSmithKline, Insitro, Variant Bio and Overtone Therapeutics and has received research support from AbbVie, Astellas, Biogen, BioMarin, Eisai, Google, Merck, Microsoft, Pfizer and Sanofi–Genzyme. J.S.W. has received research support from Bristol Myers Squibb and has acted as a consultant for MyoKardia, Pfizer, Foresite Labs, HealthLumen and Tenaya Therapeutics. The other authors have no conflict of interest to declare.
Peer review
Peer review information
Nature Genetics thanks Brett Kroncke, Clare Turnbull, and Johannes Zschocke for their contribution to the peer review of this work.
About this article
Cite this article
Wright, C.F., Sharp, L.N., Jackson, L. et al. Guidance for estimating penetrance of monogenic disease-causing variants in population cohorts. Nat Genet 56, 1772–1779 (2024). https://doi.org/10.1038/s41588-024-01842-3