Table 2 Non-reference concordance rates after running each variant-level filter in the QC pipeline in succession, for ClinVar-indexed biallelic sites only.

From: Empirical design of a variant quality control pipeline for whole genome sequencing data using replicate discordance

| Variant Filter | Site Removal Criterion | Concordance Rate of Passing Sites (%) | Change in Rate (%) |
| --- | --- | --- | --- |
| | Monomorphic | 99.375 | |
| 1 | Missingness ≥ 5% | 99.473 | +0.098 |
| 2 | Within blacklisted region or LCR | 99.473 | 0 |
| 3 | DP < 25,000 | 99.563 | +0.090 |
| 4 | MQ < 58.75 or MQ > 61.25 | 99.695 | +0.132 |
| 5 | VQSLOD < 7.81 | 99.725 | +0.030 |
| 6 | InbreedingCoeff < –0.8 | 99.729 | +0.004 |

These values were calculated after removing sites not marked ‘PASS’ according to GATK HaplotypeCaller. A duplicate pair is concordant at a site when its two genotypes are identical. The change in concordance at each filtering step was always positive or zero. Prior to QC, 99.375% of the 9,946,118 replicate genotypes at ClinVar-indexed biallelic sites were concordant. Following QC, 99.729% of the 8,722,641 remaining genotypes were concordant. Matching was performed using ClinVar version 2019-01-02.
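As a rough illustration of how such figures could be computed, the Python sketch below calculates a non-reference concordance rate from replicate genotype pairs and re-derives the Change in Rate column from the tabulated rates. It is not the authors' pipeline code. The function name `non_reference_concordance`, the tuple encoding of genotypes, the exclusion of pairs where both replicates are homozygous reference, and the skipping of missing genotypes are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical encoding: a diploid genotype is a pair of allele indices
# (0 = reference allele); None stands for a missing call.
Genotype = Tuple[int, int]


def non_reference_concordance(
    pairs: Iterable[Tuple[Optional[Genotype], Optional[Genotype]]]
) -> float:
    """Return the percentage of compared replicate pairs with identical genotypes.

    Assumptions (not stated in the source): pairs with a missing call are
    skipped, and pairs where both replicates are homozygous reference are
    excluded from the comparison.
    """
    compared = 0
    concordant = 0
    for first, second in pairs:
        if first is None or second is None:
            continue  # assumed: missing genotypes are not compared
        # Sort alleles so that unphased 0/1 and 1/0 count as the same genotype.
        a, b = tuple(sorted(first)), tuple(sorted(second))
        if a == (0, 0) and b == (0, 0):
            continue  # assumed: "non-reference" drops hom-ref/hom-ref pairs
        compared += 1
        if a == b:
            concordant += 1
    if compared == 0:
        raise ValueError("no comparable non-reference genotype pairs")
    return 100.0 * concordant / compared


# Concordance rates (%) copied from the table, in filter order,
# starting with the row that precedes filter 1.
rates = [99.375, 99.473, 99.473, 99.563, 99.695, 99.725, 99.729]

# Each change is the difference between consecutive rates.
changes = [round(after - before, 3) for before, after in zip(rates, rates[1:])]

# Toy replicate pairs, invented purely to exercise the function.
toy_pairs = [
    ((0, 1), (1, 0)),   # concordant heterozygote
    ((0, 0), (0, 0)),   # excluded: both homozygous reference
    ((0, 1), (1, 1)),   # discordant
    ((1, 1), (1, 1)),   # concordant homozygous alternate
    (None, (0, 1)),     # skipped: missing call
]
toy_rate = non_reference_concordance(toy_pairs)
```

The recomputed `changes` list matches the published Change in Rate column, and the overall gain from 99.375% to 99.729% is the sum of the per-filter changes.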