Extended Data Fig. 1: Schematic overview of mitochondrial genome constraint.
From: Quantifying constraint in the human mitochondrial genome

(a) We established a constraint model for the mtDNA to quantify the removal of deleterious variants from the population by negative selection. We assessed constraint by identifying genes and regions where the observed variation is lower than expected under neutrality. Observed variation is calculated from maximum heteroplasmy in gnomAD, specifically by summing the maximum heteroplasmy value of every variant in a gene or region. Expected variation is calculated using a mutational model, specifically by summing the mutational likelihoods of every variant in a gene or region and applying linear models fit on neutral variation (ascertained using Phylotree and PhyloP). The observed:expected ratio and its 90% confidence interval are calculated, and the upper bound of that interval (the OEUF) is used as a conservative measure of constraint. (b) A suite of constraint metrics is available via Supplementary Datasets, including constraint metrics for each gene and non-coding element, as well as regional missense constraint for each protein gene, regional constraint for each rRNA gene, position constraint for tRNA genes, and local constraint for every position in the mtDNA (MLC scores). (c) Constraint metrics can identify deleterious variants, and constrained sites are enriched in pathogenic variants from ClinVar and MITOMAP. Example applications include using regional constraint for variant classification and prioritization in individuals with rare disease and using the MLC score to assess associations between heteroplasmy burden and common phenotypes. (a) and (c) were created with BioRender.com.
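
The observed:expected calculation described in (a) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Variant` class, the function names, the single slope and intercept standing in for the fitted linear models, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow. The confidence interval step is left out because the caption does not state how the interval is computed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Variant:
    """One possible variant in a gene or region (hypothetical structure)."""
    max_heteroplasmy: float     # maximum heteroplasmy seen in the population, 0.0 to 1.0
    mutation_likelihood: float  # likelihood assigned by a mutational model


def observed_sum(variants: Iterable[Variant]) -> float:
    """Observed variation: sum of maximum heteroplasmy over all variants."""
    return sum(v.max_heteroplasmy for v in variants)


def expected_sum(variants: Iterable[Variant], slope: float, intercept: float) -> float:
    """Expected variation: summed mutational likelihoods passed through a
    linear model. The slope and intercept are assumed to have been fit on
    neutral variation beforehand; here they are illustrative constants."""
    total_likelihood = sum(v.mutation_likelihood for v in variants)
    return slope * total_likelihood + intercept


def oe_ratio(variants: list[Variant], slope: float, intercept: float) -> float:
    """Observed:expected ratio for one gene or region."""
    expected = expected_sum(variants, slope, intercept)
    if expected <= 0:
        raise ValueError("expected variation must be positive")
    return observed_sum(variants) / expected


# Toy region with three possible variants (made-up values).
region = [
    Variant(max_heteroplasmy=0.0, mutation_likelihood=1.0),  # never observed
    Variant(max_heteroplasmy=0.5, mutation_likelihood=2.0),
    Variant(max_heteroplasmy=1.0, mutation_likelihood=3.0),  # seen at homoplasmy
]

ratio = oe_ratio(region, slope=0.5, intercept=0.0)
print(ratio)  # 0.5: observed 1.5 against expected 3.0
```

In this toy region the ratio is below 1, meaning less variation is observed than the neutral model predicts. A ratio alone can be noisy for short regions, which is why the caption describes using the upper bound of the 90% confidence interval rather than the point estimate.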