Table 1 Summary statistics of variant accumulation analyses

From: Trajectory of exonic variant discovery in a large clinical population: implications for variant curation

 

|  | # Exomes | Asymptote (SD) | # Exomes to reach asymptote | % Asymptote at current databaseᵃ | Slope (SD) | r² |
|---|---|---|---|---|---|---|
| All genes, all individuals | 90,000 | 4,587,310 (126,700) | 360,000 | 97 | 22.4 (2) | 0.997 |
| All genes, unrelated individuals | 60,000 | 4,437,535 (158,500) | 290,000 | 93 | 16.9 (2) | 0.998 |
| 74 genes, all individuals | 90,000 | 32,010 (946) | 260,000 | 95 | 22.4 (2) | 0.996 |
| 74 genes, unrelated individuals | 60,000 | 31,137 (1,143) | 210,000 | 93 | 16.9 (2) | 0.998 |

  1. Accrual of coding and splicing variants in 17,267 genes (all genes) or in 74 actionable genes, in all individuals and in unrelated individuals, was fitted with a nonlinear least-squares model with simulation to obtain the trajectory of variant growth to the asymptote (see Fig. 1). The summary statistics of the curve fitting are detailed in the table. At the current DiscovEHR database size, 93–97% of projected variants have been observed (see Supplemental Tables S2–S5).
  2. SD, standard deviation.
  3. ᵃ % Asymptote at current database: % of variants attained at the current # exomes in the DiscovEHR cohort, from Supplemental Tables S2–S5.
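
The first footnote mentions a nonlinear least-squares fit of variant accrual toward an asymptote, but the functional form is not stated here. The following is a minimal sketch of that kind of fit, assuming a hyperbolic saturation curve V(n) = A·n / (K + n) purely for illustration. The function names, the Gauss-Newton solver, and the small synthetic data set are all hypothetical; they are not the authors' model, code, or DiscovEHR counts.

```python
def saturating(n, a, k):
    """Assumed accumulation curve: a * n / (k + n), with asymptote a."""
    return a * n / (k + n)


def fit_saturating(ns, vs, iters=50):
    """Least-squares estimate of (a, k) by Gauss-Newton iteration."""
    # Starting values from the linearised form n/v = k/a + n/a.
    m = len(ns)
    ts = [n / v for n, v in zip(ns, vs)]
    mean_n = sum(ns) / m
    mean_t = sum(ts) / m
    sxx = sum((n - mean_n) ** 2 for n in ns)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(ns, ts))
    slope = sxy / sxx
    a = 1.0 / slope
    k = (mean_t - slope * mean_n) * a

    for _ in range(iters):
        jaa = jak = jkk = ga = gk = 0.0
        for n, v in zip(ns, vs):
            da = n / (k + n)              # partial derivative w.r.t. a
            dk = -a * n / (k + n) ** 2    # partial derivative w.r.t. k
            r = v - saturating(n, a, k)   # residual at this point
            jaa += da * da
            jak += da * dk
            jkk += dk * dk
            ga += da * r
            gk += dk * r
        det = jaa * jkk - jak * jak
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the parameter update.
        step_a = (jkk * ga - jak * gk) / det
        step_k = (jaa * gk - jak * ga) / det
        a += step_a
        k += step_k
        if abs(step_a) < 1e-9 * abs(a) and abs(step_k) < 1e-9 * max(abs(k), 1.0):
            break
    return a, k


# Synthetic, noise-free example (made-up values, not study data).
sample_sizes = [10, 20, 40, 80, 160, 320]
variant_counts = [saturating(n, 1000.0, 50.0) for n in sample_sizes]

a_hat, k_hat = fit_saturating(sample_sizes, variant_counts)
# Share of the fitted asymptote reached at the largest sample size.
pct_of_asymptote = 100.0 * saturating(sample_sizes[-1], a_hat, k_hat) / a_hat
print(f"asymptote={a_hat:.1f}, k={k_hat:.1f}, pct={pct_of_asymptote:.1f}")
```

Under this assumed curve the asymptote is only approached, never reached exactly, so a column such as "# Exomes to reach asymptote" would need an explicit threshold; the source does not state how that was defined.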