Extended Data Fig. 4: Restriction enzyme NanoSeq and targeted NanoSeq on buccal swabs.

a, Regression of SNV mutation burden with age for 12 samples selected from across the age range (black dots), three samples with high SigB signature contribution (grey), and one sample with a high mutation burden from a donor with a history of CHOP chemotherapy treatment (orange). The regression results listed within the plot were generated using only the 12 samples randomly selected for their age range. Error bars show Poisson 95% CIs for the estimated burdens (substitutions per cell). P-values were calculated using a t-test with 2 degrees of freedom. b, Transcription-coupled repair and damage in three donors with high contribution of SigB. Estimated substitution burdens and their associated 95% CIs (error bars) across upstream, transcribed and downstream regions, showing T > C and A > G in the coding strand separately. c, Number of substitutions per Mbp per year in the 12 age-range donors for each of 10 major ENCODE chromatin states. Reference chromatin states were obtained from ENCODE E057 foreskin keratinocytes. Chromatin states BivFlnk, EnhBiv, TssBiv, TxFlnk, and ZNF/Rpts were excluded because of their smaller footprint and excessively wide confidence intervals. Burdens were normalised to whole genome trinucleotide frequencies. Error bars show Poisson 95% CIs. d, Number of substitutions per Mbp per year in the three donors with strong SigB exposure for each of 10 major ENCODE chromatin states. Only T > C rates are shown, calculated as the number of observed T > C substitutions divided by the number of [TA] bps. Error bars show Poisson 95% CIs. e, Cosine similarities between the observed and reconstructed substitution profiles as a function of the number of mutations in each sample, highlighting the outlier donor with a history of CHOP chemotherapy treatment (brown). f, Transcriptional strand-wise trinucleotide SBS spectrum for the outlier CHOP donor.
g, Numbers of total mutations, coding SNVs and coding indels identified in oral epithelium samples from 1,042 donors using targeted NanoSeq. h, Numbers of non-synonymous mutations identified by targeted NanoSeq per donor in oral epithelium for the genes NOTCH1, TP53, FAT1 and NOTCH2. Mutation counts are ordered independently for each gene.
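
Two quantities recur in the panels above: Poisson 95% CIs on mutation counts (a, c, d) and the cosine similarity between observed and reconstructed substitution profiles (e). The following is a minimal sketch of how such values could be computed, not the authors' pipeline. It assumes an exact (Garwood) Poisson interval, which the legend does not specify, and the function names `poisson_ci` and `cosine_similarity` are hypothetical. The plain summation of the Poisson CDF is only suitable for modest counts.

```python
import math


def _poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term (modest counts only)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def _bisect_root(f, lo, hi, iters=100):
    """Find a root of f in [lo, hi], assuming f changes sign exactly once."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_ci(k, alpha=0.05):
    """Exact two-sided CI for a Poisson mean given an observed count k.

    Assumption: the exact (Garwood) interval is used; the legend only
    states "Poisson 95% CIs".
    """
    # Lower bound: the mean at which P(X >= k) equals alpha / 2.
    if k == 0:
        lower = 0.0
    else:
        lower = _bisect_root(
            lambda mu: 1.0 - _poisson_cdf(k - 1, mu) - alpha / 2, 0.0, float(k)
        )
    # Upper bound: the mean at which P(X <= k) equals alpha / 2.
    upper_limit = k + 10 * math.sqrt(k + 1) + 10  # generous search bracket
    upper = _bisect_root(
        lambda mu: _poisson_cdf(k, mu) - alpha / 2, float(k), upper_limit
    )
    return lower, upper


def cosine_similarity(observed, reconstructed):
    """Cosine similarity between two equal-length substitution profiles."""
    dot = sum(a * b for a, b in zip(observed, reconstructed))
    norm_obs = math.sqrt(sum(a * a for a in observed))
    norm_rec = math.sqrt(sum(b * b for b in reconstructed))
    return dot / (norm_obs * norm_rec)
```

To express the interval as a rate, both bounds would be divided by the same denominator as the point estimate (for example, the number of base pairs interrogated); that denominator is an assumption here and is not given in the legend.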