Fig. 1: CPC panel with diploid assemblies of 58 core samples.

a, Left: the geographical locations and ethnic, linguistic and genetic affiliations of the samples sequenced by CPC (see Supplementary Table 1 for details). The geographical distribution of HPRC samples is shown in the top left. Han-N, Han Chinese from North China; Han-S, Han Chinese from South China; Han-C, Han Chinese from Central China. Top right: the results of principal component (PC) analysis based on whole-genome data of the CPC samples (coloured dots) in the context of East Asian populations (grey dots). The four East Asian samples in HPRC are shown as triangles in the principal component plot. b, NGx plot showing the assembly contiguity of the 116 CPC core assemblies. The contigs of T2T-CHM13 and GRCh38 (N-masked) are included for comparison. c, Assembly quality of the 116 CPC core assemblies. The x axis shows the quality value of each assembly, and the y axis shows the rate at which the circular consensus sequencing reads were remapped to the assembly contigs. d, Completeness of the 116 CPC core assemblies. The x axis represents the total length of the genome that could not be aligned to the GRCh38 and T2T-CHM13 references, and the y axis shows the proportion of each assembly aligned to GRCh38 and T2T-CHM13. e, Density plot showing the small-scale assembly error distribution of the 116 CPC core assemblies. f, Duplication ratio of the 116 CPC core assemblies. The dark and light colours of each bar represent the duplication ratio relative to T2T-CHM13 and GRCh38, respectively. The map of China in panel a was obtained from a standard map service (GS[2020]4618) approved by the Ministry of Natural Resources of the People’s Republic of China (https://m.mnr.gov.cn).
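
For readers unfamiliar with the NGx statistic plotted in b, the following is a minimal sketch of how such a value is conventionally computed; it is not the pipeline used to produce the figure. NGx is taken here as the length of the shortest contig in the smallest set of longest contigs that together cover x% of an assumed genome size. The function name `ngx`, the `genome_size` argument and the toy contig lengths are illustrative assumptions; the caption does not state which genome size was used.

```python
def ngx(contig_lengths, genome_size, x):
    """Return NGx for a list of contig lengths (hypothetical helper).

    genome_size is an assumed reference genome length, not a value
    taken from the figure. Returns 0 if the contigs never reach x%
    of genome_size.
    """
    total = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        total += length
        # Integer comparison avoids floating-point rounding:
        # total / genome_size >= x / 100
        if total * 100 >= genome_size * x:
            return length
    return 0


if __name__ == "__main__":
    toy_contigs = [50, 30, 20, 10]  # toy lengths, not real assembly data
    # An NGx curve is NGx evaluated over a range of x values.
    curve = [(x, ngx(toy_contigs, 100, x)) for x in range(10, 101, 10)]
    for x, value in curve:
        print(x, value)
```

Evaluating the function over x from 1 to 100 for each assembly gives one curve per assembly, which is the form of plot described for b.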