Fig. 3: Comparison of iCF values for consensus GIAB samples stratified by allele frequency bin.

Ethnic comparisons include the Ashkenazi GIAB sample (NA24385) with gnomAD Non-Finnish Europeans and gnomAD Ashkenazis (a, b, respectively), the North-western European GIAB sample (NA12878) with gnomAD Non-Finnish Europeans (c), and the East Asian GIAB sample (NA24631) with gnomAD East Asians (d). iCF values were calculated for all exonic alleles as the exome-wide ratio of the sum of per-individual observed allele counts (OAC) to the sum of per-individual expected allele counts in gnomAD (EAC), according to Eq. (5). Allele frequency (AF) bins were chosen to stratify variants into rare [0–0.01], low-frequency [0.01–0.05], common [0.05–0.25] and very common [0.25–0.5] bins. Allele frequencies were standardized to the minor allele. Shaded regions depict the 95% confidence interval of the iCF in each AF bin. Dashed lines mark iCF = 1, the value at which total allele counts across all protein-coding genes are equal between the GIAB sample and gnomAD. iCF indicates individual correction factor, GIAB indicates Genome in a Bottle, and gnomAD indicates Genome Aggregation Database. Equations are defined in the “Methods” section. Source data are provided as a Source data file.
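
The per-bin calculation described in this legend can be illustrated with a minimal sketch. It follows only the verbal description above (sum of OAC divided by sum of EAC within each minor-allele-frequency bin) and does not reproduce Eq. (5) itself. The function names, the input layout of `(af, oac, eac)` tuples, the toy values, and the handling of bin boundaries (left-closed, with 0.5 placed in the last bin) are all assumptions made for illustration, and the common bin is taken as [0.05–0.25]. Confidence intervals are not computed.

```python
from bisect import bisect_right

# Bin edges from the legend; the "common" bin is assumed to be 0.05-0.25.
BIN_EDGES = [0.0, 0.01, 0.05, 0.25, 0.5]
BIN_LABELS = ["rare", "low-frequency", "common", "very common"]


def fold_to_minor(af):
    """Standardize an allele frequency to the minor allele (range 0 to 0.5)."""
    return min(af, 1.0 - af)


def bin_label(af):
    """Return the AF bin label for an allele frequency.

    Assumption: bins are left-closed, and a folded AF of exactly 0.5
    is placed in the last bin. The legend does not specify this.
    """
    maf = fold_to_minor(af)
    idx = bisect_right(BIN_EDGES, maf) - 1
    idx = min(idx, len(BIN_LABELS) - 1)
    return BIN_LABELS[idx]


def icf_by_bin(records):
    """Compute sum(OAC) / sum(EAC) per AF bin.

    `records` is a hypothetical iterable of (af, oac, eac) tuples,
    one per exonic allele. Bins with no expected counts map to None.
    """
    sums = {label: [0.0, 0.0] for label in BIN_LABELS}
    for af, oac, eac in records:
        label = bin_label(af)
        sums[label][0] += oac
        sums[label][1] += eac
    return {
        label: (obs / exp if exp > 0 else None)
        for label, (obs, exp) in sums.items()
    }


# Toy values for illustration only; they are not taken from the study.
example_records = [
    (0.005, 1, 0.5),  # rare
    (0.004, 0, 0.5),  # rare
    (0.02, 2, 1.0),   # low-frequency
    (0.9, 1, 2.0),    # folds to about 0.1, so common
]
example_icf = icf_by_bin(example_records)
```

In this toy input, an iCF above 1 in a bin would mean the individual carries more alleles in that bin than expected from the reference population frequencies, and a value below 1 would mean fewer.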