Fig. 3: Frequent CpG C > T transitions in retrotransposons in iPSCs.

a Distribution of CpG C > T mutations in the genome. The distribution of CpG sequences in the genome is shown as ‘control’55,56. b Mutations at CpG or non-CpG sites in retrotransposon sequences. The percentages of mutation positions in LINE, SINE and LTR retrotransposons are shown. A two-tailed Z-test was used for statistical analysis. See also Supplementary Table 2. c CpG C > T mutations in each SINE subfamily. SINE subfamily names are shown on the x-axis. AluY and the other AluY subfamilies, Ya through Yk, are denoted in orange. The y-axis indicates the CpG C > T mutations detected in each SINE subfamily as a percentage of all CpG C > T mutations detected in SINEs. One SINE subfamily, AluY, which showed a marked increase in the number of mutations in iPSCs, is shown in yellow. d Ratio of the CpG C > T frequency in iPSCs to that in the germline for each SINE subfamily. The iPSC-to-germline CpG C > T ratio for each subfamily was calculated from the percentages shown in Fig. 3c, and the values are expressed relative to AluJ, which is set to 1. AluJ and AluS are the sums of the values for the subfamilies belonging to the AluJ and AluS families, respectively, and AluYa-k is the sum of the values for subfamilies AluYa to AluYk. The evolutionarily young subfamilies AluY and AluYa-k are shown in orange. Source data are provided as a Source Data file.
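
The normalization described for panel d can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `relative_ratio` is hypothetical, and the percentages are placeholder numbers chosen only to make the arithmetic easy to follow, not data from the figure. The sketch assumes that the per-family percentages (for example, the summed AluJ, AluS and AluYa-k values) have already been computed.

```python
def relative_ratio(ipsc_pct, germline_pct, reference="AluJ"):
    """Return iPSC-to-germline ratios scaled so that `reference` equals 1.

    ipsc_pct and germline_pct map a subfamily name to its share (in %)
    of all CpG C > T mutations found in SINEs. Hypothetical helper.
    """
    # Step 1: raw iPSC / germline ratio for every subfamily.
    raw = {name: ipsc_pct[name] / germline_pct[name] for name in ipsc_pct}
    # Step 2: divide by the reference subfamily's ratio so it becomes 1.
    ref = raw[reference]
    return {name: value / ref for name, value in raw.items()}


# Placeholder percentages (NOT values from the paper).
ipsc = {"AluJ": 10.0, "AluS": 40.0, "AluY": 30.0}
germline = {"AluJ": 20.0, "AluS": 40.0, "AluY": 15.0}

ratios = relative_ratio(ipsc, germline)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

With these placeholder inputs the reference AluJ comes out at 1 by construction, and any subfamily with a value above 1 is relatively enriched for CpG C > T mutations in iPSCs compared with the germline.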