Fig. 5: Inflation of test statistics shows type-I errors in association studies with imputed data across the four data integration protocols as compared to genotyped variants. | Communications Biology

From: Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks

a Inflation of test statistics, measured by the genomic control lambda (λGC), from an association test at each of the 10,000 SNPs common to both genotyping arrays that were masked prior to phasing. Controls of homogeneous genetic origin were compared between the iPSYCH2012 and iPSYCH2015i cohorts, with the genotyping array as the outcome, at different thresholds of post-imputation quality control. Results are shown for the array genotypes (black bars) and for the four data integration protocols (intersection protocol: red bars; separate protocol: blue bars; union protocol: yellow bars; two-stage protocol: green bars). The dotted horizontal black line indicates the baseline λGC when the association test was performed using true genotypes from both arrays. Haplotypes were phased using BEAGLE5 with phase-states = 560, and imputation was performed using BEAGLE5.1 with HRCv1.1 as the reference. b Number of SNPs remaining after each threshold of post-imputation quality control across the four data integration protocols.
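
The caption does not say how λGC was computed. As a minimal sketch, assuming the conventional definition (the median of the observed 1-degree-of-freedom chi-square association statistics divided by the expected median under the null, about 0.455), it could be calculated as follows. The function names `lambda_gc` and `lambda_gc_from_pvalues` are hypothetical and are not taken from the article.

```python
from statistics import NormalDist, median

# Expected median of a chi-square distribution with 1 degree of freedom
# (about 0.4549), computed as the squared 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def lambda_gc(chi2_stats):
    """Genomic control lambda from 1-df chi-square statistics.

    Assumes the conventional definition: observed median / expected null median.
    A value near 1 suggests no inflation; larger values suggest inflation.
    """
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("at least one test statistic is required")
    return median(stats) / CHI2_1DF_MEDIAN


def lambda_gc_from_pvalues(pvalues):
    """Same quantity, starting from two-sided p-values in the open interval (0, 1)."""
    nd = NormalDist()
    # Convert each two-sided p-value back to a 1-df chi-square statistic (z squared).
    return lambda_gc(nd.inv_cdf(1 - p / 2) ** 2 for p in pvalues)
```

In this sketch, the dotted baseline in panel a would correspond to calling `lambda_gc` on the statistics obtained from the true array genotypes, and each bar to the same call on the statistics from one protocol at one quality-control threshold.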
