Fig. 1: Copy number variation of TCAF SDs in a collection of diverse human and nonhuman samples.

Copy numbers were estimated using a read-depth-based genotyping method. a Copy number (CN) heatmaps, in which each row shows the CN of one sample across the TCAF locus. The colored arrows (A, B, and C) represent the three major SD blocks in this region. The white area in the middle represents the gap present in the human reference genome (GRCh38). The two white boxes mark a putative interlocus gene conversion event that correlates with the latitude of the populations (Supplementary Figs. 16–20). Dark green bars at the bottom indicate the unique diploid sequences used for downstream phylogenetic and population genetic analyses (chr7:143,501,000–143,521,000, chr7:143,729,525–143,741,525, chr7:143,875,000–143,895,000 [GRCh38]). b Distributions of the overall TCAF CN genotypes among samples from nonhuman great apes, archaic hominins, and modern humans, estimated over a representative region (chr7:143,615,002–143,624,482). c Geographic distribution of the overall TCAF CN genotypes in the 54 Human Genome Diversity Project (HGDP) populations and the nonhuman great ape samples. Geo-coordinates for the populations are listed in Supplementary Data 8. Each pie chart shows the CN frequency distribution for a given population. The map was created using the R packages rgdal (v1.5) and scatterpie (v0.1.6) with Natural Earth map data. Note that the color scheme differs slightly from that used in the CN heatmaps in Fig. 1a.
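
As a rough illustration of the general idea behind read-depth-based copy number genotyping, the sketch below scales the median depth over a target region by the median depth over a control region assumed to be diploid (such as the unique diploid sequences marked in dark green). It is a minimal sketch and not necessarily the method used for this figure: the function names, the per-window depth inputs, and the use of medians are all assumptions made for illustration, and real pipelines typically also correct for GC content and mappability.

```python
from statistics import median


def estimate_copy_number(target_depths, control_depths, control_cn=2):
    """Estimate the copy number of a target region from read depth.

    target_depths:  per-window read depths over the region of interest
                    (hypothetical input, e.g. windows across an SD block).
    control_depths: per-window read depths over a region assumed to be
                    present at control_cn copies (diploid by default).
    """
    if not target_depths or not control_depths:
        raise ValueError("depth lists must be non-empty")
    control_median = median(control_depths)
    if control_median <= 0:
        raise ValueError("control depth must be positive")
    # Depth scales roughly linearly with copy number, so the ratio of
    # target depth to control depth gives CN relative to the control.
    return control_cn * median(target_depths) / control_median


def genotype(cn_estimate):
    """Round a fractional estimate to an integer CN genotype (assumed step)."""
    return round(cn_estimate)


if __name__ == "__main__":
    # Toy depths: the target is covered about twice as deeply as the control.
    cn = estimate_copy_number([60, 62, 58], [30, 31, 29])
    print(cn, genotype(cn))
```

With these toy values the target depth is about twice the diploid control depth, so the estimate corresponds to four copies.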