Abstract
Genome-wide association studies (GWAS) of clinically defined cases against controls have transformed our understanding of the genetic causes of many diseases. However, the simple clinical definitions used in these studies have limitations, and GWAS analyses are beginning to explore more refined phenotypes in subgroups of existing data sets. These analyses are often performed ad hoc, without considering the power required to justify them. Here we derive expressions for the relative power of such subgroup analyses and determine the genotypic relative risks (GRRs) required to achieve power equivalent to that of a full analysis in relevant scenarios. We show that only modest increases in GRR may be required to offset the reduction in power from analysing fewer cases, implying that analyses of more genetically homogeneous case subgroups have the potential to identify further associations. We find that subgroup analyses of more homogeneous cases have relatively more power when the GRR in the full sample is lower than when it is higher, and that this effect is stronger for rare than for common variants. As GWAS are likely to have already identified the majority of SNPs with stronger effects, these results strongly support a renewed effort to identify phenotypically homogeneous disease groups, in which power to detect genetic variants with small effects will be greater. They suggest that analysis of case subsets could be a powerful strategy to uncover some of the hidden heritability of common complex disorders, particularly by identifying rarer variants of modest effect.
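The trade-off described in the abstract can be illustrated with a rough numerical sketch. The abstract does not reproduce the authors' expressions, so the code below is not their derivation: it assumes a standard 1-df allelic test, a multiplicative (per-allele) risk model, a rare disease (so controls carry the population allele frequency), and a subgroup analysis that keeps all controls. All function names and the example numbers are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative (per-allele) model."""
    return grr * p / (grr * p + (1.0 - p))


def ncp(p, grr, n_cases, n_controls):
    """Approximate non-centrality parameter of the 1-df allelic chi-square test.

    Assumption: the disease is rare, so controls carry the population frequency p.
    """
    p_case = case_allele_freq(p, grr)
    # Pooled allele frequency across 2*n_cases + 2*n_controls chromosomes.
    p_bar = (n_cases * p_case + n_controls * p) / (n_cases + n_controls)
    variance = p_bar * (1.0 - p_bar) * (1.0 / (2 * n_cases) + 1.0 / (2 * n_controls))
    return (p_case - p) ** 2 / variance


def power(ncp_value, alpha=5e-8):
    """Two-sided power of a 1-df test with the given non-centrality parameter."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    root = ncp_value ** 0.5
    return NormalDist().cdf(root - z) + NormalDist().cdf(-root - z)


def equivalent_grr(p, grr_full, n_cases, n_sub_cases, n_controls, grr_max=100.0):
    """GRR that a subgroup of n_sub_cases needs to match the full-sample NCP."""
    target = ncp(p, grr_full, n_cases, n_controls)
    if ncp(p, grr_max, n_sub_cases, n_controls) < target:
        raise ValueError("no GRR below grr_max reaches the full-sample NCP")
    lo, hi = 1.0, grr_max
    # Bisection: the NCP increases monotonically with the GRR for GRR > 1.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ncp(p, mid, n_sub_cases, n_controls) < target:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical scenario: half of 5000 cases form the homogeneous subgroup.
g_sub = equivalent_grr(p=0.2, grr_full=1.1, n_cases=5000,
                       n_sub_cases=2500, n_controls=5000)
print(round(g_sub, 3))
```

Matching the non-centrality parameter is equivalent to matching power at a fixed significance threshold, which is why the sketch solves for the GRR on that scale. Varying `p` and `grr_full` gives a feel for how the required increase depends on allele frequency and on the effect size in the full sample.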
Acknowledgements
Matthew Traylor is funded by a Stroke Association Project Grant (TSA 2013/01). We acknowledge support from the National Institutes of Health Research Biomedical Research Centre at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies this paper on the European Journal of Human Genetics website.
About this article
Cite this article
Traylor, M., Markus, H. & Lewis, C. Homogeneous case subgroups increase power in genetic association studies. Eur J Hum Genet 23, 863–869 (2015). https://doi.org/10.1038/ejhg.2014.194
DOI: https://doi.org/10.1038/ejhg.2014.194