  • Letter

Assessing the impact of population stratification on genetic association studies

Abstract

Population stratification refers to differences in allele frequencies between cases and controls due to systematic differences in ancestry rather than association of genes with disease. It has been proposed that false-positive associations due to stratification can be controlled by genotyping a few dozen unlinked genetic markers. To assess stratification empirically, we analyzed data from 11 case-control and case-cohort association studies. We did not detect statistically significant evidence for stratification but did observe that assessments based on a few dozen markers lack power to rule out moderate levels of stratification that could cause false-positive associations in studies designed to detect modest genetic risk factors. After increasing the number of markers and samples in a case-cohort study (the design most immune to stratification), we found that stratification was in fact present. Our results suggest that modest amounts of stratification can exist even in well-designed studies.
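The idea that ancestry differences between cases and controls inflate association statistics at unlinked markers can be illustrated with a small simulation. The sketch below is not the authors' analysis. It assumes two hypothetical subpopulations (A and B) whose allele frequencies differ by a random amount at each marker, draws cases and controls with different ancestry mixes, and summarizes inflation with a median-based estimate in the style of genomic control. All function names, parameter names and default values (`simulate_lambda`, `freq_diff_sd` and so on) are illustrative assumptions.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def allelic_chi2(case_alt, case_n, ctrl_alt, ctrl_n):
    """Pearson chi-square (1 d.f.) for a 2x2 table of allele counts per marker."""
    a = case_alt.astype(float)           # alternate alleles in cases
    b = case_n - a                       # reference alleles in cases
    c = ctrl_alt.astype(float)           # alternate alleles in controls
    d = ctrl_n - c                       # reference alleles in controls
    n = case_n + ctrl_n
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    num = n * (a * d - b * c) ** 2
    # Monomorphic markers carry no information; give them a statistic of 0.
    return np.where(denom > 0, num / np.where(denom > 0, denom, 1.0), 0.0)


def simulate_lambda(n_markers, frac_a_cases, frac_a_controls,
                    n_cases=1000, n_controls=1000,
                    freq_diff_sd=0.1, seed=0):
    """Simulate unlinked markers and return a median-based inflation estimate.

    frac_a_cases / frac_a_controls: assumed fraction of each group drawn
    from hypothetical subpopulation A (the remainder comes from B).
    """
    rng = np.random.default_rng(seed)
    # Allele frequencies in the two subpopulations (assumed, not empirical).
    p_a = rng.uniform(0.1, 0.9, size=n_markers)
    p_b = np.clip(p_a + rng.normal(0.0, freq_diff_sd, size=n_markers), 0.01, 0.99)

    def alt_allele_counts(n_people, frac_a):
        n_from_a = int(round(frac_a * n_people))
        n_from_b = n_people - n_from_a
        # Two alleles per person at each marker.
        return rng.binomial(2 * n_from_a, p_a) + rng.binomial(2 * n_from_b, p_b)

    case_alt = alt_allele_counts(n_cases, frac_a_cases)
    ctrl_alt = alt_allele_counts(n_controls, frac_a_controls)
    chi2 = allelic_chi2(case_alt, 2 * n_cases, ctrl_alt, 2 * n_controls)
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)


if __name__ == "__main__":
    # Matched ancestry versus mismatched ancestry, many markers.
    print("matched, 5000 markers:   ", simulate_lambda(5000, 0.5, 0.5, seed=1))
    print("mismatched, 5000 markers:", simulate_lambda(5000, 0.6, 0.4, seed=1))
    # With only a few dozen markers the estimate varies widely between runs.
    for s in range(5):
        print("mismatched, 30 markers, seed", s, ":",
              simulate_lambda(30, 0.6, 0.4, seed=s))
```

Rerunning the 30-marker case with different seeds gives a feel for how noisy an estimate from a few dozen markers can be, which is the power limitation the abstract describes. The ancestry fractions and the size of the frequency differences here are arbitrary and deliberately large; they are not estimates from the 11 studies.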

Figure 1: The effect of stratification on association studies.
Figure 2: Likelihood surfaces for stratification for the 11 studies, assuming 1,000 cases and 1,000 controls (we provide results for λ1000, but likelihood surfaces for other numbers of cases and controls could be obtained simply by rescaling the axis using the equation in Methods).

Acknowledgements

We thank A. Villapakkam for assistance in genotyping and data checking and K. Ardlie, D. Goldstein and C. Haiman for discussions. M.L.F. is supported by a Department of Defense Health Disparity training grant and a Postdoctoral Fellowship for Physicians from the Howard Hughes Medical Institute. D.A. is a Clinical Scholar in Translational Research from the Burroughs Wellcome Fund, as well as a Charles E. Culpeper Medical Scholar. J.N.H. and D.R. are recipients of Career Development Awards from the Burroughs Wellcome Fund. T.L.P. is supported by a Canadian Institutes of Health Research Postdoctoral Fellowship and is a NARSAD Young Investigator. This work was supported in part by a grant from the Functional Genomics Program at the Whitehead Institute/MIT Center for Genome Research (supported by Affymetrix, Millennium Pharmaceuticals and Bristol Myers Squibb) and by a grant from the US National Institutes of Health to B.H. and L.K.

Author information

Corresponding author

Correspondence to David Reich.

Ethics declarations

Competing interests

D.A. is a paid consultant to Genomics Collaborative, which provided the previously published data set (Am. J. Hum. Genet. 71, 304–311; 2002) that was reanalyzed in this paper.

About this article

Cite this article

Freedman, M., Reich, D., Penney, K. et al. Assessing the impact of population stratification on genetic association studies. Nat Genet 36, 388–393 (2004). https://doi.org/10.1038/ng1333

  • DOI: https://doi.org/10.1038/ng1333
