Abstract
The power of testing for a population-wide association between a biallelic quantitative trait locus and a linked biallelic marker locus is predicted both empirically and deterministically for several tests. The tests were based on the analysis of variance (ANOVA) and on a number of transmission disequilibrium tests (TDT). Deterministic power predictions made use of family information, and were functions of population parameters including linkage disequilibrium, allele frequencies, and recombination rate. Deterministic power predictions were very close to the empirical power from simulations in all scenarios considered in this study. The different TDTs had very similar power, intermediate between one-way and nested ANOVAs. One-way ANOVA was the only test that was not robust against spurious disequilibrium. Our general framework for predicting power deterministically can be used to predict power in other association tests. Deterministic power calculations are a powerful tool for researchers to plan and evaluate experiments and obviate the need for elaborate simulation studies.
References
1. Kerem B, Rommens JM, Buchanan JA et al: Identification of the cystic fibrosis gene: genetic analysis. Science 1989; 245: 1073–1079.
2. Hästbacka J, de la Chapelle A, Kaitila I, Sistonen P, Weaver A, Lander E: Linkage disequilibrium mapping in isolated founder populations: diastrophic dysplasia in Finland. Nat Genet 1992; 2: 204–211.
3. Hästbacka J, de la Chapelle A, Mahtani MM et al: The diastrophic dysplasia gene encodes a novel sulfate transporter: positional cloning by fine-structure linkage disequilibrium mapping. Cell 1994; 78: 1073–1087.
4. Terwilliger JD, Weiss KM: Linkage disequilibrium mapping of complex disease: fantasy or reality? Curr Opin Biotech 1998; 9: 578–594.
5. Schork NJ, Cardon LR, Xu X: The future of genetic epidemiology. Trends Genet 1998; 14: 266–272.
6. Kruglyak L: Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 1999; 22: 139–144.
7. Ott J: Predicting the range of linkage disequilibrium. Proc Natl Acad Sci USA 2000; 97: 2–3.
8. Neale MC, Cherny SS, Sham PC et al: Distinguishing population stratification from genuine allelic effects with MX: association of ADH2 with alcohol consumption. Behav Genet 1999; 29: 233–243.
9. Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science 1996; 273: 1516–1517.
10. Wright AF, Carothers AD, Pirastu M: Population choice in mapping genes for complex diseases. Nat Genet 1999; 23: 397–404.
11. Spielman RS, McGinnis RE, Ewens WJ: Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 1993; 52: 506–516.
12. Clayton D: Population association; in Balding DJ, Bishop M, Cannings C (eds): Handbook of statistical genetics. New York: John Wiley & Sons Ltd., 2001, pp 519–540.
13. Schork NJ, Fallin D, Thiel B et al: The future of genetic case–control studies; in Rao DC, Province MA (eds): Genetic dissection of complex traits (Advances in genetics, Vol 42). US: Academic Press, 2000, pp 191–212.
14. Allison DB: Transmission-disequilibrium tests for quantitative traits. Am J Hum Genet 1997; 60: 676–690.
15. Long AD, Langley CH: The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Res 1999; 9: 720–731.
16. Xiong MM, Krushkal J, Boerwinkle E: TDT statistics for mapping quantitative trait loci. Ann Hum Genet 1998; 62: 431–452.
17. Haseman JK, Elston RC: The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 1972; 2: 3–19.
18. Rabinowitz D: A transmission disequilibrium test for quantitative trait loci. Hum Hered 1997; 47: 342–350.
19. Sham PC, Cherny SS, Purcell S, Hewitt JK: Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am J Hum Genet 2000; 66: 1616–1630.
20. Sokal RR, Rohlf FJ: Biometry. New York, US: WH Freeman and Company, 1995.
21. Lewontin RC: On measures of gametic disequilibrium. Genetics 1988; 120: 849–852.
22. Jayakar SD: On the detection and estimation of linkage between a locus influencing a quantitative character and a marker locus. Biometrics 1970; 26: 451–464.
23. Hill AP: Quantitative linkage: a statistical procedure for its detection and estimation. Ann Hum Genet 1975; 38: 439–449.
24. Weir BS: Genetic data analysis II. Sunderland, US: Sinauer Associates, Inc., 1996.
25. Searle SR: Linear models. New York: John Wiley & Sons, 1971.
26. Falconer DS, Mackay TFC: Introduction to quantitative genetics. England: Longman Group Ltd, 1996.
27. Zhao H: Family-based association studies. Stat Methods Med Res 2000; 9: 563–587.
28. The World Health Report. Part three: statistical annex. WHO, 1999, www.who.int/whr/1999/en/report.htm.
29. Underwood JCE: Genetic and environmental causes of disease; in Underwood JCE (ed): General and systematic pathology. London: Churchill Livingstone, 1996, pp 31–60.
30. Weiss KM, Terwilliger JD: How many diseases does it take to map a gene with SNPs? Nat Genet 2000; 26: 151–157.
31. Miller RD, Kwok PY: The birth and death of human single-nucleotide polymorphisms: new experimental evidence and implications for human history and medicine. Hum Mol Genet 2001; 10: 2195–2198.
32. Lynch M, Walsh B: Genetics and analysis of quantitative traits. Sunderland, US: Sinauer Associates, Inc., 1998.
Acknowledgements
We are grateful to Ian White and Dr O Southwood for helpful comments on earlier versions of this manuscript. This work was supported by Sygen International and by the Biotechnology and Biological Sciences Research Council of the UK.
Appendices
Appendix A
The NCP (λ) of two-way ANOVA can be expressed as [25]

Let $\sigma_e^2$ be unity. Let $B' = [\mu, f_1, f_2, f_3, g_1, g_2, g_3]$ be the vector of parameters in the model, where $\mu$ is the sample mean, $f_i$ is the mean of the $i$th parental type, and $g_j$ is the mean of the $j$th marker genotype across all parental types. Let $K$ be a matrix of parameter contrasts reflecting the $H_0$ being tested; for example, if $H_0$: $g_1 = g_2$ and $g_2 = g_3$, then

$$K' = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 \end{bmatrix}$$
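The matrix $X'X$ is

$$X'X = \begin{bmatrix}
n_{..} & n_{1.} & n_{2.} & n_{3.} & n_{.1} & n_{.2} & n_{.3} \\
n_{1.} & n_{1.} & 0 & 0 & n_{11} & n_{12} & n_{13} \\
n_{2.} & 0 & n_{2.} & 0 & n_{21} & n_{22} & n_{23} \\
n_{3.} & 0 & 0 & n_{3.} & n_{31} & n_{32} & n_{33} \\
n_{.1} & n_{11} & n_{21} & n_{31} & n_{.1} & 0 & 0 \\
n_{.2} & n_{12} & n_{22} & n_{32} & 0 & n_{.2} & 0 \\
n_{.3} & n_{13} & n_{23} & n_{33} & 0 & 0 & n_{.3}
\end{bmatrix}$$

where $n_{ij}$ is the number of records in the $i$th family and $j$th marker genotype class, and a dot denotes summation over that subscript. $X'X$ is a matrix of order 7 and rank 5; hence, there are seven unknowns and only five df. An appropriate full-rank reduction of $X'X$ is obtained by deleting the first row and column, hence setting $\mu = 0$, and the last row and column, hence setting $g_3 = 0$ [25]. Let $G$ be the reduced $X'X$ matrix. This $G$ matrix can be partitioned as follows:

$$G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix}$$

where $G_{11}$ (3 × 3) relates to the parental types and $G_{22}$ (2 × 2) to the marker genotypes.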
Then, if $C = G^{-1}$, $K^{*\prime}$ is the matrix $K'$ with the first and last columns deleted, and $B^{*}$ is the vector $B$ with the first and last elements deleted, then

where $C_{22} = K^{*\prime} C K^{*} = (G_{22} - G_{21} G_{11}^{-1} G_{12})^{-1}$ and $(K^{*\prime} B^{*})' = [g_1 - g_2,\ g_2]$. When testing the QTL, Eq. (A2) gives

where the first part of (A3) corresponds to the sum of squares due to genotype, and the second part of (A3) corresponds to the sum of squares due to parental type.
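The construction of $G$ and the resulting NCP can be checked numerically. The sketch below is illustrative only: it assumes three parental types and three marker genotypes, and it takes the NCP to be the quadratic form $(K^{*\prime}B^{*})'(K^{*\prime}CK^{*})^{-1}(K^{*\prime}B^{*})/\sigma_e^2$ with no factor 1/2, which is an assumption about the exact scaling of Eqs. (A1) and (A2). The function names `reduced_normal_matrix` and `ncp_two_way` and the cell counts are hypothetical.

```python
import numpy as np

def reduced_normal_matrix(n):
    """Build X'X for B' = [mu, f1, f2, f3, g1, g2, g3] from a 3x3 table of
    cell counts n[i, j], then delete the first and last rows and columns."""
    n = np.asarray(n, dtype=float)
    rows, cols = n.sum(axis=1), n.sum(axis=0)
    xtx = np.zeros((7, 7))
    xtx[0, 0] = n.sum()
    xtx[0, 1:4] = xtx[1:4, 0] = rows
    xtx[0, 4:7] = xtx[4:7, 0] = cols
    xtx[1:4, 1:4] = np.diag(rows)
    xtx[4:7, 4:7] = np.diag(cols)
    xtx[1:4, 4:7] = n
    xtx[4:7, 1:4] = n.T
    return xtx, xtx[1:-1, 1:-1]

def ncp_two_way(n, g, sigma2_e=1.0):
    """NCP for H0: g1 = g2 = g3, computed by two routes (assumed scaling:
    no factor 1/2). Genotype effects are expressed relative to g3."""
    _, G = reduced_normal_matrix(n)
    g = np.asarray(g, dtype=float)
    g_star = g[:2] - g[2]                       # g3 is set to zero
    G11, G12, G21, G22 = G[:3, :3], G[:3, 3:], G[3:, :3], G[3:, 3:]
    # Route 1: contrasts K*'B* = [g1 - g2, g2] with C = G^-1
    K_star_t = np.array([[0, 0, 0, 1, -1], [0, 0, 0, 0, 1]], dtype=float)
    d = K_star_t[:, 3:] @ g_star
    C = np.linalg.inv(G)
    lam_contrast = d @ np.linalg.inv(K_star_t @ C @ K_star_t.T) @ d / sigma2_e
    # Route 2: quadratic form in the Schur complement G22 - G21 G11^-1 G12
    schur = G22 - G21 @ np.linalg.inv(G11) @ G12
    lam_schur = g_star @ schur @ g_star / sigma2_e
    return lam_contrast, lam_schur

counts = [[10, 20, 10], [5, 10, 5], [8, 16, 8]]   # hypothetical cell counts
xtx, G = reduced_normal_matrix(counts)
print(np.linalg.matrix_rank(xtx))                  # order 7, rank 5
print(ncp_two_way(counts, g=[1.0, 0.5, 0.0]))
```

With these proportional cell counts both routes agree at about 11.5, which equals the total number of records (92) times the variance of the genotype means (0.125), so the choice of contrasts does not affect the NCP.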
However, it is usually a linked marker, rather than the QTL itself, that is being tested, so Eq. (A3) must be adapted accordingly. Using Tables 2 and 3 in the Materials and Methods section, the new λ can be written as

where $b_i$ is the expected marker genotype effect among progeny in the $i$th trio class, $n_i$ is the number of trios in class $i$, and $I_{i(j)}$ is an indicator variable equal to 1 if the trio is informative and 0 otherwise. Table A1 shows $F_j$, the mean value of the $j$th parental type, and $f_j$, the number of these parental types.
It is also possible to use all trios, thus setting Ii(j)=1 for all i (j), without increasing the type-I error rate. By doing so, power increases slightly, through augmenting the residual df, and ascertainment of informative families becomes unnecessary.
This method of obtaining λ can be applied to derive the NCP for nested ANOVA; however, the algebra becomes more tedious. Finally, the NCP λO can be derived through Eq. (3), although a simpler method was described in the Materials and Methods section.
Appendix B
Let us consider two fixed effects, α and β, where α could represent the factor parental type, and β could represent the genotype of the progeny. Thus, the model can be written as $y_{ij} = \mu + \alpha_i + \beta_j + e_{ij}$, which corresponds to a two-way ANOVA model without interaction. We will now show that the original statistic $F_{2,n-5}$ of ref. [14] is equivalent to the F-ratio for testing the effects of β after having corrected for the effects due to μ and α, using the previous model.
For a constant $k = 2/(n - 5)$, we can see that

where $SS_{\mu\alpha}$ and $SS_{\mu\alpha\beta}$ are the sums of squares explained by a model that fits μ and α, and by a model that fits μ, α, and β, respectively; $SS_T$ is the total sum of squares; and $R^2_{\beta\mid\mu,\alpha}$ and $R^2_e$ are the proportion of the total variance explained by β, after taking into account the effects of μ and α, and the proportion of unexplained variance, respectively. The null hypothesis of interest is whether factor β explains a significant amount of phenotypic variance over and above the amount explained by μ and α jointly. The F-ratio that appropriately reflects this null hypothesis is given in Eq. (B1).
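The equivalence can be illustrated on simulated data. The following is a minimal sketch, assuming three levels for each factor, an uncorrected total sum of squares $y'y$ (so that it is on the same footing as $SS_{\mu\alpha}$, which includes μ), and 2 and $n - 5$ df as in the text. The helper names `fitted_ss` and `f_beta_given_mu_alpha` are hypothetical.

```python
import numpy as np

def fitted_ss(X, y):
    """Sum of squares explained by the least-squares fit of y on X
    (works for rank-deficient X, since the fitted values are unique)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return float(fitted @ fitted)

def f_beta_given_mu_alpha(alpha, beta, y):
    """F-ratio for beta after mu and alpha, via sums of squares and via
    the R^2 form k * F = R2_beta / R2_e with k = 2 / (n - 5)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ones = np.ones((n, 1))
    A = np.eye(3)[np.asarray(alpha)]      # indicator columns, factor alpha
    B = np.eye(3)[np.asarray(beta)]       # indicator columns, factor beta
    ss_mu_alpha = fitted_ss(np.hstack([ones, A]), y)
    ss_full = fitted_ss(np.hstack([ones, A, B]), y)
    ss_total = float(y @ y)               # uncorrected total (assumption)
    f_ss = ((ss_full - ss_mu_alpha) / 2) / ((ss_total - ss_full) / (n - 5))
    k = 2 / (n - 5)
    r2_beta = (ss_full - ss_mu_alpha) / ss_total
    r2_e = (ss_total - ss_full) / ss_total
    f_r2 = (r2_beta / r2_e) / k
    return f_ss, f_r2

rng = np.random.default_rng(1)
alpha = np.repeat([0, 1, 2], 20)          # hypothetical parental types
beta = np.tile([0, 1, 2], 20)             # hypothetical progeny genotypes
y = 1.0 + 0.3 * alpha + 0.5 * beta + rng.normal(size=60)
print(f_beta_given_mu_alpha(alpha, beta, y))
```

The two returned values coincide because the total sum of squares cancels in the ratio of the two proportions.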
Appendix C
Let us assume that $T$ is a random variable following a t-distribution, and let $\sigma_T$ be the standard deviation of $T$. A first-order Taylor approximation for λ is $\lambda = E(T/\sigma_T) \approx E[T]/E[\sigma_T]$ [32]. In order to derive $E[T]$ and $E[\sigma_T]$, we used the probabilities of the 10 different types of trios and the expected effects of marker genotypes in the progeny contained in Tables 2 and 3. Hence, conditional on $p_M$, $p_Q$, $c$, and $D'$, $E[T] = E[\sum_{i=1}^{N} (y_i - \bar{y}) w_i]$, and because all family trios are independent (ie unrelated), $E[T] = N E[(y - \bar{y}) w]$, where $N$ is the number of trios and $y$, the phenotype, and $w$, a weighting factor, are expectations for a single trio (Table 3). Thus, the expected value of the numerator of TDT$_R$ is approximately $E[T] = N p_M p_m [p_M^2 (b_2 - b_3) + p_M p_m (b_5 - b_7) + p_m^2 (b_8 - b_9)]$. When analysing the QTL, and assuming no dominance, the previous equation simplifies to $E[T] = N p_Q p_q a$.
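The simplification to $N p_Q p_q a$ can be checked with a short calculation. This is a sketch under stated assumptions: the trio-class effects below are hypothetical values for an additive locus tested directly (genotype values $a$, 0, and $-a$), chosen so that $b_2 - b_3 = b_8 - b_9 = a$ and $b_5 - b_7 = 2a$; they are not copied from Table 3. The function names are also hypothetical.

```python
def expected_tdt_numerator(n_trios, p_M, b):
    """E[T] = N pM pm [pM^2 (b2 - b3) + pM pm (b5 - b7) + pm^2 (b8 - b9)],
    with b a mapping from trio class to expected progeny effect."""
    p_m = 1.0 - p_M
    return n_trios * p_M * p_m * (
        p_M ** 2 * (b[2] - b[3])
        + p_M * p_m * (b[5] - b[7])
        + p_m ** 2 * (b[8] - b[9])
    )

def additive_effects(a):
    """HYPOTHETICAL trio-class effects for an additive locus (assumption,
    not Table 3): homozygotes at +a and -a, heterozygote at 0."""
    return {2: a, 3: 0.0, 5: a, 7: -a, 8: 0.0, 9: -a}

n_trios, p_Q, a = 500, 0.3, 0.4
lhs = expected_tdt_numerator(n_trios, p_Q, additive_effects(a))
rhs = n_trios * p_Q * (1.0 - p_Q) * a     # N pQ pq a
print(lhs, rhs)
```

Under these assumed effects the bracketed term becomes $a (p_Q + p_q)^2 = a$, so both expressions give the same value.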
The expected variance of $T$, $E[\sigma_T^2]$, is the same regardless of whether the locus being tested is the QTL or a marker. Equation (A1.23a) in ref. [32] is , which reduces to
if the second term can be ignored. Hence, $E[\sigma_T^2] = E[\tfrac{1}{4} \sum_{i=1}^{N} (y_i - \bar{y})^2 H_i] = \tfrac{1}{4} E[(y - \bar{y})^2 H]$ and, as the expectation of a random variable $X$ given another random variable $Y$ is $E[X] = E[E[X \mid Y]]$, $E[(y - \bar{y})^2 H] = \sum_{H=0}^{2} H P_H E(y - \bar{y})^2 = p_M p_m \sigma_e^2 + \sigma_{QTL}^2$. Finally, dividing $E[T]$ by
we obtain
Cite this article
Hernández-Sánchez, J., Haley, C. & Visscher, P. Power of QTL detection using association tests with family controls. Eur J Hum Genet 11, 819–827 (2003). https://doi.org/10.1038/sj.ejhg.5201042
DOI: https://doi.org/10.1038/sj.ejhg.5201042