Abstract
Population stratification refers to differences in allele frequencies between cases and controls due to systematic differences in ancestry rather than association of genes with disease. It has been proposed that false positive associations due to stratification can be controlled by genotyping a few dozen unlinked genetic markers. To assess stratification empirically, we analyzed data from 11 case-control and case-cohort association studies. We did not detect statistically significant evidence for stratification but did observe that assessments based on a few dozen markers lack power to rule out moderate levels of stratification that could cause false positive associations in studies designed to detect modest genetic risk factors. After increasing the number of markers and samples in a case-cohort study (the design most immune to stratification), we found that stratification was in fact present. Our results suggest that modest amounts of stratification can exist even in well-designed studies.
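As a rough illustration of what an assessment based on unlinked markers can look like, the sketch below computes an allelic chi-square statistic per marker, sums the statistics across markers, and derives a genomic-control style inflation factor. It is not the procedure used in this article, which the abstract does not spell out; it follows the general ideas of Pritchard and Rosenberg (1999) and Devlin and Roeder (1999). The function names, the input layout (four allele counts per marker) and the example counts are all assumptions made for illustration.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approx.).
CHI2_1DF_MEDIAN = 0.4549


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case = case_a + case_b
    row_ctrl = ctrl_a + ctrl_b
    col_a = case_a + ctrl_a
    col_b = case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        return 0.0  # monomorphic or empty table: treated as uninformative
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    return numerator / (row_case * row_ctrl * col_a * col_b)


def stratification_summary(markers):
    """Summarize evidence for stratification across unlinked markers.

    `markers` is a list of (case_a, case_b, ctrl_a, ctrl_b) tuples
    (hypothetical layout). Returns (summed chi-square, degrees of freedom,
    inflation factor). Under no stratification the sum is approximately
    chi-square distributed with one degree of freedom per marker, and the
    inflation factor is close to 1.
    """
    stats = [allelic_chi2(*m) for m in markers]
    total = sum(stats)
    inflation = median(stats) / CHI2_1DF_MEDIAN
    return total, len(stats), inflation


if __name__ == "__main__":
    # Made-up allele counts for three markers, for illustration only.
    example = [(50, 50, 50, 50), (60, 40, 40, 60), (55, 45, 45, 55)]
    total, df, inflation = stratification_summary(example)
    print(f"summed chi-square = {total:.2f} on {df} df; inflation = {inflation:.2f}")
```

With only a few dozen markers the summed statistic has wide sampling variability, which is one way to see why such a test can fail to rule out moderate stratification.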
Acknowledgements
We thank A. Villapakkam for assistance in genotyping and data checking and K. Ardlie, D. Goldstein and C. Haiman for discussions. M.L.F. is supported by a Department of Defense Health Disparity training grant and a Postdoctoral Fellowship for Physicians from the Howard Hughes Medical Institute. D.A. is a Clinical Scholar in Translational Research from the Burroughs Wellcome Fund, as well as a Charles E. Culpeper Medical Scholar. J.N.H. and D.R. are recipients of Career Development Awards from the Burroughs Wellcome Fund. T.L.P. is supported by a Canadian Institutes of Health Research Postdoctoral Fellowship and is a NARSAD Young Investigator. This work was supported in part by a grant from the Functional Genomics Program at the Whitehead Institute/MIT Center for Genome Research (supported by Affymetrix, Millennium Pharmaceuticals and Bristol Myers Squibb) and by a grant from the US National Institutes of Health to B.H. and L.K.
Ethics declarations
Competing interests
D.A. is a paid consultant to Genomics Collaborative, which provided the previously published data set (Am. J. Hum. Genet. 71, 304–311; 2002) that was reanalyzed in this paper.
Cite this article
Freedman, M., Reich, D., Penney, K. et al. Assessing the impact of population stratification on genetic association studies. Nat Genet 36, 388–393 (2004). https://doi.org/10.1038/ng1333
DOI: https://doi.org/10.1038/ng1333