Abstract
Gene–environment interactions may provide a mechanism for targeting interventions to those individuals who would gain the most benefit from them. Searching for interactions agnostically on a genome-wide scale requires large sample sizes, often achieved through collaboration among multiple studies in a consortium. Family studies can contribute to consortia, but to do so they must account for correlation within families by using specialized analytic methods. In this paper, we investigate the performance of methods that account for within-family correlation, in the context of gene–environment interactions with binary exposures and quantitative outcomes. We simulate both cross-sectional and longitudinal measurements, and analyze the simulated data taking family structure into account, via generalized estimating equations (GEE) and linear mixed-effects models. With sufficient exposure prevalence and correct model specification, all methods perform well. However, when models are misspecified, mixed modeling approaches have seriously inflated type I error rates. GEE methods with robust variance estimates are less sensitive to model misspecification; however, when exposures are infrequent, GEE methods require modifications to preserve the type I error rate. We illustrate the practical use of these methods by evaluating gene–drug interactions on fasting glucose levels in data from the Framingham Heart Study, a cohort that includes related individuals.
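As a rough illustration of the GEE approach described above, the following sketch simulates a cross-sectional family study and tests a gene–environment interaction using a cluster-robust (sandwich) variance. It is not the authors' code. All function names, parameter values, and effect sizes are hypothetical, genotypes are drawn independently within families for simplicity (real relatives share alleles), and the variance is the uncorrected sandwich estimator, without the small-sample modifications the abstract says are needed when exposures are infrequent. For a quantitative outcome with an identity link and an independence working correlation, the GEE point estimates coincide with ordinary least squares, which is what the sketch relies on.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a is n x n, b is n x k)."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def simulate_families(n_families=300, family_size=4, maf=0.3,
                      exposure_prev=0.3, beta_gxe=0.0, family_sd=1.0, seed=1):
    """Hypothetical data: a shared family effect induces within-family correlation."""
    rng = random.Random(seed)
    rows = []
    for fam in range(n_families):
        u = rng.gauss(0.0, family_sd)  # random effect shared by family members
        for _ in range(family_size):
            g = (rng.random() < maf) + (rng.random() < maf)  # genotype 0/1/2
            e = int(rng.random() < exposure_prev)            # binary exposure
            y = 0.2 * g + 0.5 * e + beta_gxe * g * e + u + rng.gauss(0.0, 1.0)
            rows.append((fam, [1.0, float(g), float(e), float(g * e)], y))
    return rows


def fit_independence_gee(rows):
    """Least-squares fit with a sandwich variance clustered on family."""
    p = len(rows[0][1])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [[0.0] for _ in range(p)]
    for _, x, y in rows:
        for i in range(p):
            xty[i][0] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    identity = [[float(i == j) for j in range(p)] for i in range(p)]
    bread = solve(xtx, identity)  # (X'X)^-1
    beta = [b[0] for b in matmul(bread, xty)]

    # Sum the score contributions x * residual within each family.
    scores = {}
    for fam, x, y in rows:
        resid = y - sum(b * xi for b, xi in zip(beta, x))
        s = scores.setdefault(fam, [0.0] * p)
        for i in range(p):
            s[i] += x[i] * resid
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(p)]
            for i in range(p)]
    cov = matmul(matmul(bread, meat), bread)  # bread @ meat @ bread
    se = [cov[i][i] ** 0.5 for i in range(p)]
    return beta, se


rows = simulate_families(beta_gxe=0.0)
beta, se = fit_independence_gee(rows)
print("interaction estimate:", beta[3], "robust SE:", se[3], "z:", beta[3] / se[3])
```

Repeating the simulation over many seeds with `beta_gxe=0.0` and counting how often the z statistic exceeds a chosen threshold gives an empirical type I error rate, which is the kind of comparison the abstract describes. Lowering `exposure_prev` is one way to see why small-sample adjustments become relevant.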
Acknowledgements
This work was supported by US NIH R01 HL103612, R01 HL105756, R01 DK078616, and U01 DK085526. From the FHS of the National Heart Lung and Blood Institute of the National Institutes of Health and Boston University School of Medicine: This work was supported by the National Heart, Lung, and Blood Institute’s FHS (Contract No. N01-HC-25195) and its contract with Affymetrix Inc. for genotyping services (Contract No. N02-HL-6-4278). Analyses reflect intellectual input and resource development from FHS investigators participating in the SNP Health Association Resource (SHARe) project.
Ethics declarations
Competing interests
Psaty serves on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson and on the DSMB of a clinical trial of a device funded by the manufacturer (Zoll LifeCor). The other authors have no conflict of interest.
Additional information
Supplementary Information accompanies this paper on the European Journal of Human Genetics website.
About this article
Cite this article
Sitlani, C., Dupuis, J., Rice, K. et al. Genome-wide gene–environment interactions on quantitative traits using family data. Eur J Hum Genet 24, 1022–1028 (2016). https://doi.org/10.1038/ejhg.2015.253
DOI: https://doi.org/10.1038/ejhg.2015.253