Power of QTL detection using association tests with family controls

Hernández-Sánchez, Jules; Haley, Chris S; Visscher, Peter M

doi:10.1038/sj.ejhg.5201042

Article
Published: 22 October 2003

Jules Hernández-Sánchez¹,
Chris S Haley¹ &
Peter M Visscher²

European Journal of Human Genetics volume 11, pages 819–827 (2003)

Abstract

The power of testing for a population-wide association between a biallelic quantitative trait locus and a linked biallelic marker locus is predicted both empirically and deterministically for several tests. The tests were based on the analysis of variance (ANOVA) and on a number of transmission disequilibrium tests (TDT). Deterministic power predictions made use of family information, and were functions of population parameters including linkage disequilibrium, allele frequencies, and recombination rate. Deterministic power predictions were very close to the empirical power from simulations in all scenarios considered in this study. The different TDTs had very similar power, intermediate between one-way and nested ANOVAs. One-way ANOVA was the only test that was not robust against spurious disequilibrium. Our general framework for predicting power deterministically can be used to predict power in other association tests. Deterministic power calculations are a powerful tool for researchers to plan and evaluate experiments and obviate the need for elaborate simulation studies.

References

1. Kerem B, Rommens JM, Buchanan JA et al: Identification of the cystic fibrosis gene: genetic analysis. Science 1989; 245: 1073–1079.
2. Hästbacka J, de la Chapelle A, Kaitila I, Sistonen P, Weaver A, Lander E: Linkage disequilibrium mapping in isolated founder populations: diastrophic dysplasia in Finland. Nat Genet 1992; 2: 204–211.
3. Hästbacka J, de la Chapelle A, Mahtani MM et al: The diastrophic dysplasia gene encodes a novel sulfate transporter: positional cloning by fine-structure linkage disequilibrium mapping. Cell 1994; 78: 1073–1087.
4. Terwilliger JD, Weiss KM: Linkage disequilibrium mapping of complex disease: fantasy or reality? Curr Opin Biotech 1998; 9: 578–594.
5. Schork NJ, Cardon LR, Xu X: The future of genetic epidemiology. Trends Genet 1998; 14: 266–272.
6. Kruglyak L: Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 1999; 22: 139–144.
7. Ott J: Predicting the range of linkage disequilibrium. Proc Natl Acad Sci USA 2000; 97: 2–3.
8. Neale MC, Cherny SS, Sham PC et al: Distinguishing population stratification from genuine allelic effects with MX: association of ADH2 with alcohol consumption. Behav Genet 1999; 29: 233–243.
9. Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science 1996; 273: 1516–1517.
10. Wright AF, Carothers AD, Pirastu M: Population choice in mapping genes for complex diseases. Nat Genet 1999; 23: 397–404.
11. Spielman RS, McGinnis RE, Ewens WJ: Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 1993; 52: 506–516.
12. Clayton D: Population association; in Balding DJ, Bishop M, Cannings C (eds): Handbook of statistical genetics. New York: John Wiley & Sons Ltd., 2001, pp 519–540.
13. Schork NJ, Fallin D, Thiel B et al: The future of genetic case–control studies; in Rao DC, Province MA (eds): Genetic dissection of complex traits (Advances in genetics, Vol 42). US: Academic Press, 2000, pp 191–212.
14. Allison DB: Transmission-disequilibrium tests for quantitative traits. Am J Hum Genet 1997; 60: 676–690.
15. Long AD, Langley CH: The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Res 1999; 9: 720–731.
16. Xiong MM, Krushkal J, Boerwinkle E: TDT statistics for mapping quantitative trait loci. Ann Hum Genet 1998; 62: 431–452.
17. Haseman JK, Elston RC: The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 1972; 2: 3–19.
18. Rabinowitz D: A transmission disequilibrium test for quantitative trait loci. Hum Hered 1997; 47: 342–350.
19. Sham PC, Cherny SS, Purcell S, Hewitt JK: Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am J Hum Genet 2000; 66: 1616–1630.
20. Sokal RR, Rohlf FJ: Biometry. New York, US: WH Freeman and Company, 1995.
21. Lewontin RC: On measures of gametic disequilibrium. Genetics 1988; 120: 849–852.
22. Jayakar SD: On the detection and estimation of linkage between a locus influencing a quantitative character and a marker locus. Biometrics 1970; 26: 451–464.
23. Hill AP: Quantitative linkage: a statistical procedure for its detection and estimation. Ann Hum Genet 1975; 38: 439–449.
24. Weir BS: Genetic data analysis II. Sunderland, US: Sinauer Associates, Inc., 1996.
25. Searle SR: Linear models. New York: John Wiley & Sons, 1971.
26. Falconer DS, Mackay TFC: Introduction to quantitative genetics. England: Longman Group Ltd, 1996.
27. Zhao H: Family-based association studies. Stat Methods Med Res 2000; 9: 563–587.
28. The World Health Report. Part three: statistical annex. WHO, 1999, www.who.int/whr/1999/en/report.htm.
29. Underwood JCE: Genetic and environmental causes of disease; in Underwood JCE (ed): General and systematic pathology. London: Churchill Livingstone, 1996, pp 31–60.
30. Weiss KM, Terwilliger JD: How many diseases does it take to map a gene with SNPs? Nat Genet 2000; 26: 151–157.
31. Miller RD, Kwok PY: The birth and death of human single-nucleotide polymorphisms: new experimental evidence and implications for human history and medicine. Hum Mol Genet 2001; 10: 2195–2198.
32. Lynch M, Walsh B: Genetics and analysis of quantitative traits. Sunderland, US: Sinauer Associates, Inc., 1998.

Acknowledgements

We are grateful to Ian White and Dr O Southwood for helpful comments on earlier versions of this manuscript. This work has been supported by Sygen International, and by the Biotechnology and Biological Sciences Research Council of the UK.

Author information

Authors and Affiliations

Roslin Institute (Edinburgh), Roslin, Midlothian, EH25 9PS, Scotland, UK
Jules Hernández-Sánchez & Chris S Haley
Institute of Cell, Animal and Population Biology, University of Edinburgh, West Mains Road, Edinburgh, EH9 3JT, Scotland, UK
Peter M Visscher

Corresponding author

Correspondence to Jules Hernández-Sánchez.

Appendices

Appendix A

The NCP (λ) of two-way ANOVA can be expressed as²⁵

Let σ_e² be unity. Let B′ = [μ, f₁, f₂, f₃, g₁, g₂, g₃] be the vector of parameters in the model, where μ is the sample mean, f_i is the mean of the ith parental type, and g_j is the mean of the jth marker genotype across all parental types. Let K be a matrix of parameter contrasts reflecting the H₀ being tested; for example, if H₀: g₁=g₂ and g₂=g₃, then

$$K' = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 \end{bmatrix}$$

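The matrix X′X is

$$X'X = \begin{bmatrix}
n_{..} & n_{1.} & n_{2.} & n_{3.} & n_{.1} & n_{.2} & n_{.3} \\
n_{1.} & n_{1.} & 0 & 0 & n_{11} & n_{12} & n_{13} \\
n_{2.} & 0 & n_{2.} & 0 & n_{21} & n_{22} & n_{23} \\
n_{3.} & 0 & 0 & n_{3.} & n_{31} & n_{32} & n_{33} \\
n_{.1} & n_{11} & n_{21} & n_{31} & n_{.1} & 0 & 0 \\
n_{.2} & n_{12} & n_{22} & n_{32} & 0 & n_{.2} & 0 \\
n_{.3} & n_{13} & n_{23} & n_{33} & 0 & 0 & n_{.3}
\end{bmatrix}$$

where n_ij is the number of records in the ith family and jth marker genotype class, and a dot denotes summation over the corresponding index (n_i. = Σ_j n_ij, n_.j = Σ_i n_ij, n_.. = Σ_i Σ_j n_ij). X′X is a matrix of order 7 and rank 5; hence, there are seven unknowns and only five df. A full-rank reduction of X′X, from which a generalised inverse can be formed, is obtained by deleting the first row and column, hence setting μ=0, and the last row and column, hence setting g₃=0.²⁵ Let G be the reduced X′X matrix. This G matrix can be partitioned as follows:

$$G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix}$$

where G₁₁ is the 3×3 block for the parental types, G₂₂ is the 2×2 block for the two retained marker genotype classes, and G₁₂ = G₂₁′ holds the corresponding cell counts n_ij.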

Then, if C=G⁻¹, K^*′ is the matrix K′ with the first and last columns deleted, and B^* is the vector B with the first and last elements deleted, then

where C₂₂ = K^*′CK^* = (G₂₂ − G₂₁ G₁₁⁻¹ G₁₂)⁻¹ and (K^*′ B^*)′ = [g₁ − g₂, g₂]. When testing the QTL, Eq. (A2) gives

where the first part of (A3) corresponds to the sum of squares due to genotype, and the second part of (A3) corresponds to the sum of squares due to parental type.
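The matrix steps above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes σ_e² = 1, takes λ to be the quadratic form in the genotype contrasts with no factor of ½, and uses the contrasts g₁ − g₃ and g₂ − g₃ with g₃ = 0, which span the same null hypothesis as the contrasts in K. The function names and the example counts are hypothetical. It builds X′X from the cell counts n_ij, confirms the order and rank stated above, and verifies that the Schur-complement form G₂₂ − G₂₁G₁₁⁻¹G₁₂ yields a genotype sum of squares minus a parental-type sum of squares.

```python
from fractions import Fraction


def xtx_two_way(counts):
    """Build X'X for y_ij = mu + f_i + g_j + e_ij from cell counts n_ij."""
    a, b = len(counts), len(counts[0])
    fam = [sum(row) for row in counts]                       # n_i.
    gen = [sum(row[j] for row in counts) for j in range(b)]  # n_.j
    size = 1 + a + b
    m = [[0] * size for _ in range(size)]
    m[0][0] = sum(fam)                                       # n_..
    for i in range(a):
        m[0][1 + i] = m[1 + i][0] = m[1 + i][1 + i] = fam[i]
        for j in range(b):
            m[1 + i][1 + a + j] = m[1 + a + j][1 + i] = counts[i][j]
    for j in range(b):
        m[0][1 + a + j] = m[1 + a + j][0] = m[1 + a + j][1 + a + j] = gen[j]
    return m


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def ncp_schur(counts, g):
    """lambda = g' (G22 - G21 G11^-1 G12) g, with the last genotype effect set to 0."""
    fam = [sum(row) for row in counts]
    lam = Fraction(0)
    for j in range(len(g)):
        for k in range(len(g)):
            g22 = sum(row[j] for row in counts) if j == k else 0
            correction = sum(Fraction(row[j] * row[k], tot)
                             for row, tot in zip(counts, fam))
            lam += g[j] * (g22 - correction) * g[k]
    return lam


def ncp_within(counts, g):
    """Same quantity as a within-parental-type sum of squared genotype deviations."""
    effects = list(g) + [0]
    lam = Fraction(0)
    for row in counts:
        mean = Fraction(sum(n * e for n, e in zip(row, effects))) / sum(row)
        lam += sum(n * (e - mean) ** 2 for n, e in zip(row, effects))
    return lam


# Hypothetical counts: rows are parental types, columns are marker genotypes.
counts = [[10, 20, 10], [5, 10, 5], [8, 4, 8]]
effects = [Fraction(1, 2), Fraction(1, 4)]
full = xtx_two_way(counts)
reduced = [row[1:-1] for row in full[1:-1]]
print(len(full), rank(full), rank(reduced))
print(ncp_schur(counts, effects) == ncp_within(counts, effects))
```

Exact rational arithmetic is used so that the two expressions for λ can be compared with `==` rather than a floating-point tolerance.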

However, it is usually a linked marker, rather than the QTL itself, that is being tested, and Eq. (A3) must be adapted accordingly. Using Tables 2 and 3 in the Materials and Methods section, the new λ can be written as

where b_i is the expected marker genotype effect among progeny in the ith trio class, n_i is the number of trios in class i, and I_i(j) is an indicator variable equal to 1 if the trio is informative and 0 otherwise. Table A1 shows F_j, the mean value of the jth parental type, and f_j, the number of these parental types.

Table A1 Family mean (F_i) and number (f_i)

It is also possible to use all trios, thus setting I_i(j)=1 for all i (j), without increasing the type-I error rate. By doing so, power increases slightly, through augmenting the residual df, and ascertainment of informative families becomes unnecessary.

This method of obtaining λ can be applied to derive the NCP for nested ANOVA; however, the algebra becomes more tedious. Finally, the NCP λ_O can be derived through Eq. (3), although a simpler method was described in the Materials and Methods section.
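Once λ is available, power follows from the noncentral distribution of the test statistic. The sketch below is an illustration under stated assumptions, not the calculation used in the paper: it replaces the noncentral F distribution with a noncentral chi-square (reasonable only when the residual df are large), fixes the numerator df at 2 as for a three-genotype marker, and takes λ on the scale where the Poisson mixing mean is λ/2. The function names are hypothetical.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a central chi-square with even df (closed form)."""
    if df < 2 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def power_2df(ncp, alpha=0.05, max_terms=10_000):
    """Approximate power of a 2-df test with noncentrality parameter ncp.

    A noncentral chi-square is a Poisson(ncp / 2) mixture of central
    chi-squares with df = 2, 4, 6, ...
    """
    crit = -2.0 * math.log(alpha)   # exact critical value for a 2-df chi-square
    mean = ncp / 2.0
    weight = math.exp(-mean)        # Poisson probability of k = 0
    power = covered = 0.0
    for k in range(max_terms):
        power += weight * chi2_sf_even(crit, 2 + 2 * k)
        covered += weight
        if 1.0 - covered < 1e-12:   # remaining Poisson mass is negligible
            break
        weight *= mean / (k + 1)    # Poisson probability of k + 1
    return power


for lam in (0.0, 5.0, 10.0, 15.0):
    print(lam, round(power_2df(lam), 3))
```

With λ = 0 the function returns the type-I error rate α, which gives a quick sanity check on the critical value.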

Appendix B

Let us consider two fixed effects, α and β, where α could represent the factor parental type, and β could represent the genotype of the progeny. Thus, the model can be written as y_ij = μ + α_i + β_j + e_ij, which corresponds to a two-way ANOVA model without interaction. We will now show that the original statistic F_2,n−5 for TDT_Q5¹⁴ is equivalent to the F-ratio for testing the effects of β after having corrected for the effects due to μ and α, using the previous model.

For a constant k = 2/(n − 5), we can see that

where SS_μα and SS_μαβ are the sums of squares explained by a model that fits μ and α, and by a model that fits μ, α, and β, respectively; SS_T is the total sum of squares; and R_β∣μ,α² and R_e² are the proportion of the total variance explained by β after taking into account the effects of μ and α, and the proportion of unexplained variance, respectively. The question of interest is whether factor β explains a significant amount of phenotypic variance over and above the amount explained by μ and α jointly; the null hypothesis is that it does not. The F-ratio that appropriately reflects this null hypothesis is given in Eq. (B1).
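For orientation, the quantities defined above combine into the standard extra-sum-of-squares F-ratio. The display below is a reconstruction from those definitions with numerator df of 2 and residual df of n − 5; it is offered as a sketch and may differ in layout from the published Eq. (B1).

```latex
F_{2,\,n-5}
  = \frac{\left(SS_{\mu\alpha\beta} - SS_{\mu\alpha}\right)/2}
         {\left(SS_{T} - SS_{\mu\alpha\beta}\right)/(n-5)}
  = \frac{SS_{\mu\alpha\beta} - SS_{\mu\alpha}}
         {k\left(SS_{T} - SS_{\mu\alpha\beta}\right)}
  = \frac{R^{2}_{\beta\mid\mu,\alpha}}{k\,R^{2}_{e}},
\qquad
R^{2}_{\beta\mid\mu,\alpha} = \frac{SS_{\mu\alpha\beta} - SS_{\mu\alpha}}{SS_{T}},
\quad
R^{2}_{e} = \frac{SS_{T} - SS_{\mu\alpha\beta}}{SS_{T}},
\quad
k = \frac{2}{n-5}.
```

Because SS_T cancels between numerator and denominator, the ratio is the same whether it is written in sums of squares or in proportions of variance.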

Appendix C

Let us assume T is a random variable following a t-distribution, and let σ_T be the standard deviation of T. A first-order Taylor approximation for λ is λ = E(T/σ_T) ≈ E[T]/E[σ_T].³² In order to derive E[T] and E[σ_T], we used the probabilities of the 10 different types of trios and the expected effects of marker genotypes in the progeny contained in Tables 2 and 3. Hence, conditional on p_M, p_Q, c, and D′, E[T] = E[∑_iⁿ (y_i − ȳ)w_i], and because all family trios are independent (i.e. unrelated), E[T] = N E[(y − ȳ)w], where y, the phenotype, and w, a weighting factor, are expectations for a single trio (Table 3). Thus, the expected value of the numerator of TDT_R is approximately E[T] = N p_M p_m [p_M² (b₂ − b₃) + p_M p_m (b₅ − b₇) + p_m² (b₈ − b₉)]. When analysing the QTL, and assuming no dominance, the previous equation simplifies to E[T] = N p_Q p_q a.

The expected variance of T, E[σ_T²], is the same regardless of whether the locus being tested is the QTL or a marker. Equation (A1.23a) in Reference³² is , which reduces to if the second term can be ignored. Hence, E[σ_T²] = E[1/4 ∑_iⁿ (y_i − ȳ)² H_i] = 1/4 E[(y − ȳ)² H] and, as the expectation of a random variable X given another random variable Y is E[X] = E[E[X∣Y]], E[(y − ȳ)² H] = ∑_{H = 0}² H P_H E(y − ȳ)² = p_M p_m σ_e² + σ_QTL². Finally, dividing E[T] by we obtain

About this article

Cite this article

Hernández-Sánchez, J., Haley, C. & Visscher, P. Power of QTL detection using association tests with family controls. Eur J Hum Genet 11, 819–827 (2003). https://doi.org/10.1038/sj.ejhg.5201042

Received: 22 October 2002
Revised: 17 March 2003
Accepted: 16 April 2003
Published: 22 October 2003
Issue date: 01 November 2003
DOI: https://doi.org/10.1038/sj.ejhg.5201042

This article is cited by

Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses
- M J Sillanpää
Heredity (2011)
