Fig. 1: The statistics of samples and variants in the WBBC. | Nature Communications

From: Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project

a Sample distribution and statistics by geography. The proportions of samples sequenced by whole-genome sequencing (WGS) and genotyped with the high-density Infinium Asian Screening Array (ASA) are marked in red and blue, respectively. b The number of SNV and INDEL variants identified in the WBBC cohort in five frequency bins defined by allele count (AC) and allele frequency (AF): AC = 1, AC = 2, AC > 2 and AF < 0.005, 0.005 ≤ AF ≤ 0.05, and AF > 0.05. c The number of variants on the 22 autosomes and the X chromosome in the WBBC, 1000 Genomes Project (1000G), gnomAD, and UK10K datasets. The horizontal bar plot shows the total number of variants in each of the four datasets. Individual dots indicate a single dataset, and connected dots indicate a combination of two or more datasets. Each vertical bar represents the number of variants in a single dataset or shared among the indicated combination of datasets. d Functional annotations of all variants absent from dbSNP Build 151. Each category's proportion is shown in a different color. e The pie chart shows only the variants in coding and splicing regions (within 10 bp of an exon-intron boundary). Source data are provided as a Source Data file.
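The five frequency bins in panel b can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `frequency_bin`, the `(AC, AN)` input format, and the toy values are all hypothetical, and it assumes AF is computed as AC divided by the total number of called alleles (AN), with the AC = 1 and AC = 2 bins taking precedence over the AF-based bins.

```python
from collections import Counter


def frequency_bin(ac: int, an: int) -> str:
    """Assign a variant to one of the five bins named in panel b.

    ac: alternate allele count; an: total number of called alleles.
    Assumption: AF = ac / an, and the AC-based bins are checked first.
    """
    if ac < 1 or an < ac:
        raise ValueError("expected 1 <= ac <= an")
    if ac == 1:
        return "AC = 1"
    if ac == 2:
        return "AC = 2"
    af = ac / an
    if af < 0.005:
        return "AC > 2 and AF < 0.005"
    if af <= 0.05:
        return "0.005 <= AF <= 0.05"
    return "AF > 0.05"


# Toy (AC, AN) pairs, invented for illustration only.
toy_variants = [(1, 20000), (2, 20000), (3, 20000), (100, 20000), (2000, 20000)]
bin_counts = Counter(frequency_bin(ac, an) for ac, an in toy_variants)
```

Tallying the bins with a `Counter`, once per variant type (SNV or INDEL), gives the kind of per-bin totals that panel b plots.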