Fig. 1: Pervasive selection in whole blood exomes in UKBB. | Nature Genetics

From: Analysis of somatic mutations in whole blood from 200,618 individuals identifies pervasive positive selection and novel drivers of clonal hematopoiesis

a, Exome-wide somatic mutation frequency, VAF and mutation counts in individuals increase with age. The error bars represent 2× standard error of the mean. The smoothed line represents a second-degree polynomial fit of the actual data, and the shading represents the CI. N = 200,618. b, Left: dN/dS is the normalized ratio of nonsynonymous to synonymous mutations. dN represents the rate of nonsynonymous mutations per nonsynonymous site, and dS represents the rate of synonymous mutations per synonymous site. A dN/dS of ~1 is expected under neutrality. A dN/dS ratio >1 for a gene, taking into account a trinucleotide-specific mutation rate, indicates that the gene is under positive selection (‘fitness inferred’, FI). HSCs with a mutation under positive selection will clonally expand to result in CH. Right: global positive selection in blood is detectable at missense, essential splice site and truncating mutations (comprising nonsense substitutions and frameshift insertions/deletions). Nonsynonymous mutations comprise missense and nonsense single-base substitutions. The error bars represent the 95% CI of the dN/dS parameter for that mutation type. N = 52,701 mutations. c, Classical fitness-inferred (FI) CH genes (in blue), classical non-fitness-inferred (non-FI) CH genes (in green) and new fitness-inferred (FI) CH genes (in orange), the latter representing both novel genes and several recently reported genes. The graph shows genes with a dN/dS ratio >1 for nonsense and/or missense variants (q value <0.1), plotting the maximum dN/dS value. d, New fitness-inferred CH genes (in orange) alongside classical fitness-inferred CH genes (in blue), and the types of nonsynonymous mutation for which they are under positive selection. e,f, The frequency of individuals (e) and mutation log(VAF) (f) for new and classical FI CH genes and classical non-FI CH genes versus age. The error bars represent 2× standard error of the mean. The smoothed line represents a second-degree polynomial fit of the actual data, and the shading represents the CI. N = 200,618 individuals. g, The number of individuals in UKBB with detectable somatic mutations in driver genes associated with CH. h, The number of individuals carrying CH-conferring variants per gene.
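
As a rough illustration of the dN/dS definition in panel b, the ratio can be sketched as below. This is a simplified sketch, not the method used in the study: it assumes a single uniform mutation rate, whereas the caption states that the analysis accounts for a trinucleotide-specific mutation rate. The function name, parameter names and the counts in the example are hypothetical.

```python
def dnds_ratio(n_nonsyn, n_syn, sites_nonsyn, sites_syn):
    """Naive dN/dS: nonsynonymous mutations per nonsynonymous site
    divided by synonymous mutations per synonymous site.

    Assumption: every site mutates at the same rate. A real analysis
    would weight sites by a trinucleotide-specific mutation model.
    """
    if n_syn <= 0 or sites_nonsyn <= 0 or sites_syn <= 0:
        raise ValueError("synonymous count and both site counts must be positive")
    dn = n_nonsyn / sites_nonsyn  # rate of nonsynonymous mutations per site
    ds = n_syn / sites_syn        # rate of synonymous mutations per site
    return dn / ds


# Hypothetical counts, for illustration only.
neutral = dnds_ratio(n_nonsyn=300, n_syn=100, sites_nonsyn=3000, sites_syn=1000)
selected = dnds_ratio(n_nonsyn=600, n_syn=100, sites_nonsyn=3000, sites_syn=1000)
print(neutral, selected)
```

With these made-up counts, the first gene has equal per-site rates (ratio of about 1, consistent with neutrality), while the second has twice the nonsynonymous rate (ratio of about 2), the pattern the caption describes as positive selection.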
