Fig. 3: Patterns of selection of missense mutations in NOTCH1 EGF11–12 in normal human oesophagus.
From: Mutations observed in somatic evolution reveal underlying gene mechanisms

a Missense mutation frequency across the domains of NOTCH1. Domain definitions from UniProt (ref. 56). Where the gap between domains is only a single residue, mutations from this residue are included in the subsequent domain. EGF repeats, blue; EGF11–12, dark blue; LNR, orange; transmembrane region, red; ankyrin repeats, green; other regions, grey. b ∆∆G of mutations in NOTCH1 EGF11–12. Single nucleotide missense mutations that occur in the normal oesophagus, red, with marker size proportional to the number of times that mutation occurs. Single nucleotide missense mutations that do not occur in the dataset shown in grey. c Distributions of ∆∆G values of missense mutations. Distribution expected under the neutral null hypothesis, light red, and the distribution observed, dark red. d, e Counts of NOTCH1 EGF11–12 mutations occurring on the ligand-binding interface under the neutral null hypothesis, light blue, and observed, dark blue. Null and observed counts including all missense mutations (d) or excluding destabilising mutations (∆∆G > 2 kcal/mol) from both the null model and observed data (e). f ∆∆G plotted against distance from the NOTCH1 EGF11–12 ligand-binding interface residues. Observed single nucleotide missense mutations shown in green (calcium binding), blue (ligand binding), red (∆∆G > 2 kcal/mol) or orange (other). Marker size is proportional to the number of times that mutation occurs. Single nucleotide missense mutations that do not occur in the dataset shown in grey. Regions containing highly destabilising mutations (∆∆G > 2 kcal/mol) and mutations on the ligand-binding interface shown with dashed red and blue boxes, respectively. g, h Counts of NOTCH1 EGF11–12 mutations that are on calcium-binding residues under the neutral null hypothesis, light green, and observed, dark green.
Null and observed counts including all missense mutations (g) or excluding destabilising mutations (∆∆G > 2 kcal/mol) and ligand-binding interface mutations from both the null model and observed data (h). P values calculated using a two-tailed Monte Carlo test for c and a two-tailed binomial test for d, e, g, h (Supplementary Note 10). Error bars in d, e, g, h show 95% confidence intervals (Supplementary Note 10). ****P ≤ 0.0001, ns P > 0.05.
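The two kinds of test named in the caption can be illustrated with a minimal Python sketch. It is not the authors' procedure, which is described in Supplementary Note 10 and is not reproduced here. The sketch assumes that the binomial test compares an observed count with the fraction expected under the neutral null, and that the Monte Carlo test compares the observed mean ∆∆G with means of samples drawn from the null mutation spectrum. The function names, the choice of mean ∆∆G as test statistic, and all input numbers are hypothetical.

```python
import math
import random


def binomial_two_tailed_p(k, n, p):
    """Two-tailed exact binomial P value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k. Requires 0 < p < 1.
    """
    log_p, log_q = math.log(p), math.log(1 - p)
    pmf = [
        math.exp(
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        for i in range(n + 1)
    ]
    cutoff = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


def monte_carlo_two_tailed_p(observed_mean, null_values, null_weights,
                             n_mutations, n_draws=10_000, seed=0):
    """Two-tailed Monte Carlo P value for an observed mean.

    Assumption: each simulated data set is n_mutations values drawn from
    null_values with probabilities proportional to null_weights.
    """
    rng = random.Random(seed)
    lower = upper = 0
    for _ in range(n_draws):
        sample = rng.choices(null_values, weights=null_weights, k=n_mutations)
        simulated_mean = sum(sample) / n_mutations
        lower += simulated_mean <= observed_mean
        upper += simulated_mean >= observed_mean
    # Add-one correction so the estimated P value is never exactly zero.
    tail = (min(lower, upper) + 1) / (n_draws + 1)
    return min(1.0, 2 * tail)


# Hypothetical inputs, for illustration only (not values from the figure).
p_interface = binomial_two_tailed_p(k=30, n=100, p=0.15)
p_ddg = monte_carlo_two_tailed_p(
    observed_mean=3.1,
    null_values=[0.2, 0.8, 1.5, 2.6, 4.0],
    null_weights=[3, 2, 2, 1, 1],
    n_mutations=50,
)
print(f"binomial P = {p_interface:.3g}, Monte Carlo P = {p_ddg:.3g}")
```

In this sketch the smallest Monte Carlo P value that can be reported is limited by the number of draws, so resolving a threshold such as P ≤ 0.0001 would need well over 10,000 simulated data sets.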