Abstract
Detecting gene–gene interactions or epistasis in studies of human complex diseases is a major challenge in epidemiology. To address this problem, several methods have been developed, mainly in the context of data dimensionality reduction. One of these methods, Model-Based Multifactor Dimensionality Reduction, has so far mainly been applied to case–control studies. In this study, we evaluate the power of Model-Based Multifactor Dimensionality Reduction for quantitative traits to detect gene–gene interactions (epistasis) in the presence of error-free and noisy data. The sources of error considered are genotyping errors, missing genotypes, phenotypic mixtures and genetic heterogeneity. Our simulation study encompasses a variety of settings with varying minor allele frequencies and genetic variance for different epistasis models. On each simulated data set, we performed Model-Based Multifactor Dimensionality Reduction in two ways: with and without adjustment for main effects of (known) functional SNPs. In line with findings for binary traits, our simulations show that power is lower in the presence of phenotypic mixtures or genetic heterogeneity than in scenarios with missing genotypes or genotyping errors. In addition, empirical power estimates decrease further when main effects are corrected for, but at the same time, false-positive percentages are reduced as well. In conclusion, phenotypic mixtures and genetic heterogeneity remain challenging for epistasis detection, and careful thought must be given to the way important lower-order effects are accounted for in the analysis.
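As a rough illustration of how two of the error sources named above (genotyping errors and missing genotypes) might be introduced into simulated two-locus data, consider the following Python sketch. It is not the simulation design used in the study: the minor allele frequency of 0.25, the 5% error and missingness rates, the effect size, the single-cell epistasis model and all function names are assumptions chosen only for illustration. Phenotypic mixtures and genetic heterogeneity are not modeled here.

```python
import random


def simulate_genotype(maf, rng):
    """Draw a genotype (0, 1 or 2 minor alleles) under Hardy-Weinberg equilibrium."""
    return sum(rng.random() < maf for _ in range(2))


def simulate_trait(g1, g2, effect, rng):
    """Quantitative trait with a shifted mean for one two-locus genotype
    combination (a hypothetical epistasis model) plus standard normal noise."""
    mean = effect if (g1, g2) == (1, 1) else 0.0
    return rng.gauss(mean, 1.0)


def add_genotyping_error(genotypes, rate, rng):
    """With probability `rate`, replace a genotype by a different one chosen at random."""
    return [
        rng.choice([x for x in (0, 1, 2) if x != g]) if rng.random() < rate else g
        for g in genotypes
    ]


def add_missingness(genotypes, rate, rng):
    """With probability `rate`, set a genotype to None (missing)."""
    return [None if rng.random() < rate else g for g in genotypes]


rng = random.Random(1)   # fixed seed so the sketch is reproducible
n = 500                  # assumed sample size
snp1 = [simulate_genotype(0.25, rng) for _ in range(n)]
snp2 = [simulate_genotype(0.25, rng) for _ in range(n)]
trait = [simulate_trait(a, b, 1.0, rng) for a, b in zip(snp1, snp2)]

noisy_snp1 = add_genotyping_error(snp1, 0.05, rng)   # assumed 5% error rate
sparse_snp1 = add_missingness(snp1, 0.05, rng)       # assumed 5% missingness
```

An empirical power estimate would then be the fraction of such simulated data sets in which the analysis method flags the causal SNP pair, computed separately for the error-free and the noisy versions of the data.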
Acknowledgements
JM Mahachie John is a doctoral student funded by the Belgian Network BioMAGNet (Bioinformatics and Modeling: from Genomes to Networks), within the Interuniversity Attraction Poles Program (Phase VI/4), initiated by the Belgian State, Science Policy Office. We acknowledge research opportunities offered by the Belgian Network BioMAGNet and partial support by the IST Program of the European Community, under the PASCAL2 Network of Excellence (Pattern Analysis, Statistical Modeling and Computational Learning), IST-2007-216886. We also acknowledge the valuable discussions with Tom Cattaert (a Postdoctoral Researcher of the Fonds de la Recherche Scientifique – FNRS) on data generation and variance decomposition sections. In addition, F Van Lishout acknowledges support by Alma in silico, funded by the European Commission and Walloon Region through the Interreg IV Program.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies the paper on the European Journal of Human Genetics website.
About this article
Cite this article
Mahachie John, J., Van Lishout, F. & Van Steen, K. Model-Based Multifactor Dimensionality Reduction to detect epistasis for quantitative traits in the presence of error-free and noisy data. Eur J Hum Genet 19, 696–703 (2011). https://doi.org/10.1038/ejhg.2011.17
DOI: https://doi.org/10.1038/ejhg.2011.17


