Table 4. Results of the hard filters applied in the QC pipeline at the variant, genotype, and sample levels, for genome-wide biallelic and triallelic sites.
Step | Removal Criterion | Biallelic, Sequential Filtering | Triallelic, Sequential Filtering |
---|---|---|---|
Variant Level | Site Removal Criterion | # Pass (% Pass), Variants | |
– | Monomorphic | 17,585,919 (100) | 1,536,657 (100) |
1 | Missingness ≥ 5% | 17,584,990 (99.99) | 1,536,085 (99.96) |
2 | Blacklisted region or LCR | 17,584,990 (100) | 1,536,085 (100) |
3 | DP < 25,000 | 17,346,931 (98.65) | 1,345,292 (87.58) |
4 | MQ < 58.75 or MQ > 61.25 | 15,971,098 (92.17) | 968,987 (72.03) |
5 | InbreedingCoeff < -0.8 | 15,661,311 (98.06) | 949,810 (98.02) |
6 | VQSLOD < 7.81 | 14,760,982 (94.25) | 888,194 (93.51) |
Genotype Level | Genotype Removal Criterion | # Pass (% Pass), Genotypes | |
7 | DP < 10 | 3,819,276,086 (99.96) | 202,424,447 (98.89) |
8 | GQ < 20 | 3,800,347,137 (99.50) | 187,956,031 (92.85) |
Sample Level | Sample Removal Criterion | # Pass (% Pass), Samples | |
9 | Missingness ≥ 10% | 259 (100) | 193 (74.52) |
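
The variant-level rows of Table 4 can be read as a chain of removal criteria applied in order. The percentages appear to be computed against the count remaining after the previous step (for example, 17,346,931 / 17,584,990 ≈ 98.65% at step 3). The following is a minimal Python sketch of that bookkeeping, not the pipeline's actual implementation. The thresholds are taken from the table. The record layout, the field names (`missingness`, `DP`, `MQ`, `InbreedingCoeff`, `VQSLOD`), the function `sequential_filter`, and the toy records are assumptions made for illustration. The monomorphic and blacklisted-region/LCR steps are omitted for brevity.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout: one dict of site-level annotations per variant.
Variant = Dict[str, float]

# (label, removal predicate) pairs in the order listed in Table 4.
# A variant is dropped when its predicate returns True.
SITE_FILTERS: List[Tuple[str, Callable[[Variant], bool]]] = [
    ("Missingness >= 5%", lambda v: v["missingness"] >= 0.05),
    ("DP < 25,000", lambda v: v["DP"] < 25_000),
    ("MQ < 58.75 or MQ > 61.25", lambda v: v["MQ"] < 58.75 or v["MQ"] > 61.25),
    ("InbreedingCoeff < -0.8", lambda v: v["InbreedingCoeff"] < -0.8),
    ("VQSLOD < 7.81", lambda v: v["VQSLOD"] < 7.81),
]


def sequential_filter(
    variants: List[Variant],
    filters: List[Tuple[str, Callable[[Variant], bool]]] = SITE_FILTERS,
) -> Tuple[List[Variant], List[Tuple[str, int, float]]]:
    """Apply each filter to the survivors of the previous one.

    Returns the surviving variants and a report of
    (label, number passing, percent passing relative to the previous step).
    """
    report = []
    remaining = list(variants)
    for label, remove in filters:
        before = len(remaining)
        remaining = [v for v in remaining if not remove(v)]
        pct = 100.0 * len(remaining) / before if before else 0.0
        report.append((label, len(remaining), round(pct, 2)))
    return remaining, report


# Toy records (made up for illustration, not data from the study).
TOY_VARIANTS: List[Variant] = [
    {"missingness": 0.00, "DP": 30_000, "MQ": 60.0, "InbreedingCoeff": 0.0, "VQSLOD": 10.0},
    {"missingness": 0.10, "DP": 30_000, "MQ": 60.0, "InbreedingCoeff": 0.0, "VQSLOD": 10.0},
    {"missingness": 0.01, "DP": 20_000, "MQ": 60.0, "InbreedingCoeff": 0.0, "VQSLOD": 10.0},
    {"missingness": 0.00, "DP": 40_000, "MQ": 55.0, "InbreedingCoeff": 0.0, "VQSLOD": 10.0},
]

passed, report = sequential_filter(TOY_VARIANTS)
for label, n_pass, pct_pass in report:
    print(f"{label}: {n_pass} ({pct_pass})")
```

Because each percentage is relative to the preceding step, a filter applied late in the chain can show a high pass rate simply because earlier filters already removed most of the sites it would have caught. Reordering the filters would change the per-step figures but not the final set of passing variants.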