Fig. 1: Illustration of different standard and FGWAS estimators and theoretical gain in effective sample size for DGEs. | Nature Genetics

From: Family-based genome-wide association study designs for increased power and robustness

a, We illustrate the different sample subsets used by different FGWAS and standard GWAS methods. For illustration, we give the size of each subset in the UKB ‘White British’ sample. The sibling difference estimator uses samples with one or more siblings’ genotypes observed (35,259 individuals), whereas the Young et al. estimator uses all related samples, which also include individuals with both parents’ genotypes observed (894) and those with one parent’s genotype observed (5,316); in addition to the related samples, the standard GWAS and unified estimators also use singletons (368,629). b, Illustration of regressions performed by standard GWAS and the unified estimator. Through linear imputation of parental genotypes, the unified estimator incorporates singletons into the FGWAS regression, enabling use of the same sample as standard GWAS to estimate the parameter vector [δ, α]ᵀ. Although the design matrix for the singleton subset (in blue) in FGWAS is collinear, the design matrix for the related sample subset (in red) is not, so the stacked design matrix is not collinear. c,d, We show the effective sample size for the unified estimator applied to n₀ = 20,000 sibling pairs and n₁ singletons, relative to the effective sample size of using the sibling pairs alone with imputation. The parental genotypes in the sibling sample are imputed (ref. 3) using phased data (c) and unphased data (d). The parental genotypes for the singletons are imputed linearly. The theoretical gain depends upon the correlation between the siblings’ residuals, which we show in c. When imputing using unphased data, the gain depends upon the minor allele frequency (ref. 3), which we show in d for a fixed correlation between siblings’ residuals of 0.3. We confirmed the theoretical results using simulations (Supplementary Note 2.1).
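
The collinearity argument in panel b can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes a single biallelic variant with allele frequency f under random mating, a two-column design matrix (proband genotype, parental genotype sum) centered at the population means 2f and 4f, and a linear imputation for singletons of the form g + 2f. All function names, sample sizes and the imputation formula are illustrative assumptions. Exact fractions are used so that collinearity shows up as a determinant of exactly zero.

```python
import random
from fractions import Fraction


def simulate_trio(f, rng):
    """Draw two parents under random mating; each transmits one allele at random.

    Returns (proband genotype, sum of the two parental genotypes).
    """
    def parent():
        return [int(rng.random() < f), int(rng.random() < f)]

    father, mother = parent(), parent()
    child = rng.choice(father) + rng.choice(mother)
    return child, sum(father) + sum(mother)


def gram_det(rows):
    """det(X^T X) for a two-column design matrix; zero iff the columns are collinear."""
    a = sum(x * x for x, _ in rows)
    b = sum(x * y for x, y in rows)
    d = sum(y * y for _, y in rows)
    return a * d - b * b


rng = random.Random(0)
f = Fraction(3, 10)  # assumed allele frequency (hypothetical value)

# Related subset: parental genotypes are observed (illustrative size).
related = []
for _ in range(200):
    g, par = simulate_trio(float(f), rng)
    related.append((g - 2 * f, par - 4 * f))

# Singleton subset: parental sum replaced by the assumed linear imputation g + 2f.
singletons = []
for _ in range(2000):
    g, _unobserved = simulate_trio(float(f), rng)
    imputed_par = g + 2 * f
    singletons.append((g - 2 * f, imputed_par - 4 * f))

print("singletons only collinear:", gram_det(singletons) == 0)
print("related only collinear:   ", gram_det(related) == 0)
print("stacked collinear:        ", gram_det(related + singletons) == 0)
```

In this sketch the two singleton columns coincide after centering, so δ and α cannot be separated from singletons alone. Stacking the related rows adds a positive-definite contribution to XᵀX, which is why the combined regression remains identifiable.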
