Supplementary Figure 2: The XDG normalized sequence depth of branches by sequence region and haplogroup.

Results are shown for the 482 branches with an average XDG sequence depth >10×. For each branch, the average depth in each of 27 regions of the 4 sequence classes was normalized through division by its average XDG sequence depth. This was done because the branches have very different baseline expectations of sequence depth, which depend on the number of males who belong to them and the amount of sequencing undertaken for each. Normalization by average XDG sequence depth allows meaningful comparison of sequence depth among branches. Moreover, as the XDG region is mostly present in single copy, the normalized value provides some information about the relative magnitude of change in copy number. For each sequence region and haplogroup, we then calculated and plotted the average XDG normalized sequence depth with 95% confidence intervals. The greatest differences are seen for region PAL_IR2. Here we see that males belonging to haplogroup Q1a3a appear to have just a single copy of each orientation of IR2 (around 70 and 80 kb in length, respectively), as seen in the NCBI Build 36 reference sequence, whereas males from the other haplogroups have an excess of 17–30% of reads that map to these small sequence regions, indicating a greater copy number of at least part of the IR2 sequence. Other regions that exhibit differences between haplogroups are rAMP2, where the XDG normalized sequence depth is greater in E1b1 and Q1a3a than in the other haplogroups, and XTR2, where R1b1a and Q1a3a exhibit greater XDG normalized sequence depth than the rest. For most other sequence regions, there is little evidence for differences in copy number among haplogroups. On the basis of this evidence, we deem it unlikely that our mutation rate estimates are affected by copy number variation of MSY sequence.
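The normalization and per-haplogroup summary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used for the figure: the input layout, the function names (`normalize_branch`, `summarize`) and all depth values are hypothetical, and the 95% confidence interval is computed with a normal approximation (mean ± 1.96 × standard error) because the legend does not specify how the intervals were obtained.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

MIN_XDG_DEPTH = 10.0  # only branches with average XDG depth > 10x are kept


def normalize_branch(region_depth, xdg_depth):
    """Divide each region's average depth by the branch's average XDG depth."""
    return {region: depth / xdg_depth for region, depth in region_depth.items()}


def summarize(branches, z=1.96):
    """Return {(haplogroup, region): (mean, ci_low, ci_high)}.

    ASSUMPTION: the interval is a normal approximation (mean +/- z * SE);
    the legend does not state which CI method was actually used.
    """
    grouped = defaultdict(list)
    for branch in branches:
        if branch["xdg_depth"] <= MIN_XDG_DEPTH:
            continue  # low-depth branches are excluded before normalization
        normalized = normalize_branch(branch["region_depth"], branch["xdg_depth"])
        for region, value in normalized.items():
            grouped[(branch["haplogroup"], region)].append(value)

    summary = {}
    for key, values in grouped.items():
        m = mean(values)
        if len(values) > 1:
            half = z * stdev(values) / sqrt(len(values))
        else:
            half = float("nan")  # interval is undefined for a single branch
        summary[key] = (m, m - half, m + half)
    return summary


# Toy input: haplogroup and region names come from the legend,
# but every depth value below is made up for illustration.
toy_branches = [
    {"haplogroup": "Q1a3a", "xdg_depth": 20.0, "region_depth": {"PAL_IR2": 20.0}},
    {"haplogroup": "Q1a3a", "xdg_depth": 40.0, "region_depth": {"PAL_IR2": 44.0}},
    {"haplogroup": "R1b1a", "xdg_depth": 8.0, "region_depth": {"PAL_IR2": 12.0}},
    {"haplogroup": "R1b1a", "xdg_depth": 30.0, "region_depth": {"PAL_IR2": 36.0}},
]
result = summarize(toy_branches)
```

In this toy input, a normalized value near 1 corresponds to the depth expected if the region has the same copy number as the mostly single-copy XDG region, and values above 1 suggest additional copies of at least part of the region. The third branch is dropped by the depth filter, so it does not contribute to the R1b1a average.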