Table 3 Ranking of variant-level filters for ClinVar-indexed biallelic sites, and genome-wide biallelic and triallelic sites.

From: Empirical design of a variant quality control pipeline for whole genome sequencing data using replicate discordance

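| Rank | Filter | Negative Predictive Value: Discordances among Discarded Genotypes (%), ClinVar Biallelic | Negative Predictive Value: Discordances among Discarded Genotypes (%), All Biallelic | Negative Predictive Value: Discordances among Discarded Genotypes (%), All Triallelic | Specificity: % of Discordant Genotypes Removed, ClinVar Biallelic | Specificity: % of Discordant Genotypes Removed, All Biallelic | Specificity: % of Discordant Genotypes Removed, All Triallelic |
|---|---|---|---|---|---|---|---|
| 1 | Missingness | 87.65 | 1.98 | 42.55 | 18.39 | 0.03 | 34.92 |
| 2 | MQ | 16.19 | 8.85 | 42.91 | 48.70 | 55.38 | 79.98 |
| 3 | DP | 13.97 | 20.72 | 45.97 | 27.46 | 19.21 | 53.34 |
| 4 | VQSLOD* | 12.16 | 6.77 | 41.15 | 55.96 | 68.65 | 99.03 |
| 5 | InbreedingCoeff | 2.25 | 2.31 | 29.62 | 4.40 | 3.65 | 37.76 |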

  1. The filters are ranked from most to least preferred for filtering out discordant genotypes. Negative predictive value refers to a filter’s ability to remove discordant genotypes (true negatives) and minimize the number of concordant genotypes removed (false negatives). Specificity refers to a filter’s ability to identify and remove discordant genotypes (true negatives) and minimize the number of discordant genotypes retained (false positives). Matching was performed using ClinVar version 2019-01-02.
  2. *Filter applied to biallelic and triallelic sites involving only SNVs.
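
The footnote's definitions of negative predictive value and specificity can be read as ratios of genotype counts. The following is a minimal sketch in Python under that reading: discordant genotypes removed are true negatives, concordant genotypes removed are false negatives, and discordant genotypes retained are false positives. The function name `filter_metrics` and the counts in the usage example are hypothetical and are not taken from the paper.

```python
def filter_metrics(discordant_removed, concordant_removed, discordant_retained):
    """Return (NPV %, specificity %) for one filter.

    Assumed mapping, following the table footnote:
      true negatives  (TN) = discordant genotypes removed by the filter
      false negatives (FN) = concordant genotypes removed by the filter
      false positives (FP) = discordant genotypes retained after filtering
    """
    tn = discordant_removed
    fn = concordant_removed
    fp = discordant_retained

    # NPV: share of discarded genotypes that were discordant.
    npv = 100.0 * tn / (tn + fn)
    # Specificity: share of all discordant genotypes that were removed.
    specificity = 100.0 * tn / (tn + fp)
    return npv, specificity


# Hypothetical counts, for illustration only.
npv, specificity = filter_metrics(
    discordant_removed=40,
    concordant_removed=60,
    discordant_retained=160,
)
print(f"NPV = {npv:.2f}%, specificity = {specificity:.2f}%")
# prints "NPV = 40.00%, specificity = 20.00%"
```

Under this reading, a filter with high specificity but low negative predictive value removes most discordant genotypes at the cost of discarding many concordant ones.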