Fig. 2: Phasing and imputation accuracy vary across state-of-the-art tools and data integration protocols. | Communications Biology

From: Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks

a Phasing accuracy, measured as switch error rate (%), for the three tools at two parameter sets each and for a consensus approach that takes the majority haplotype at each locus from the three tools at two parameter sets each, across all four data integration protocols (cohort 2012 within the separate protocol, red bars; cohort 2015i within the separate protocol, blue bars; intersection protocol, green bars; two-stage protocol, black bars; union protocol, gray bars). Default parameters are SHAPEIT4.1.2 pbwt-depth = 4, BEAGLE5 phase-states = 280 and EAGLE2.4.1 Kpbwt = 10,000. High-resolution parameters are SHAPEIT4.1.2 pbwt-depth = 8, BEAGLE5 phase-states = 560 and EAGLE2.4.1 Kpbwt = 20,000. Switch error rates were computed in 124 trio offspring by comparing the computationally assigned phase with the Mendelian transmission inferred from known parental genotypes at the heterozygous loci common to both genotyping arrays.

b Imputation accuracy (r²) for each data integration protocol (intersection protocol, circles; separate protocol, triangles; two-stage protocol, squares; union protocol, plus symbols) and each choice of phasing tool (Beagle5, red; Eagle2.4.1, green; Shapeit4.1.2, black; consensus across all three tools, blue) at different minor allele frequency bins, evaluated across the 10,000 SNPs common to both genotyping arrays that were masked prior to phasing.
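The switch error rate in panel a can be illustrated with a minimal sketch. It uses the common definition (the percentage of consecutive heterozygous-site pairs whose relative phase disagrees with the truth), which may differ in detail from the authors' pipeline. The function name `switch_error_rate` and the input encoding (the allele, 0 or 1, carried on the first haplotype at each heterozygous site) are assumptions made for illustration only.

```python
def switch_error_rate(inferred_h1, truth_h1):
    """Return the switch error rate in percent.

    Both arguments list the allele (0 or 1) on haplotype 1 at each
    heterozygous site, in genomic order. In a trio design, truth_h1 would
    come from Mendelian transmission (hypothetical input encoding).
    """
    if len(inferred_h1) != len(truth_h1):
        raise ValueError("inputs must cover the same heterozygous sites")

    # True where the inferred haplotype 1 agrees with the true haplotype 1.
    # Which haplotype is labelled "1" is arbitrary, so only changes in
    # agreement between neighbouring sites count as errors.
    agrees = [i == t for i, t in zip(inferred_h1, truth_h1)]
    if len(agrees) < 2:
        return 0.0  # no adjacent pair of sites, so no switch is possible

    switches = sum(prev != cur for prev, cur in zip(agrees, agrees[1:]))
    return 100.0 * switches / (len(agrees) - 1)


# Two phase switches among four adjacent site pairs.
example_rate = switch_error_rate([0, 1, 1, 0, 1], [0, 1, 0, 1, 1])
```

A haplotype that is inverted at every site relative to the truth scores 0% here, because the metric depends only on relative phase between neighbouring heterozygous sites.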
