Abstract
The application of genome-wide linkage scans to uncover susceptibility loci for complex diseases offers great promise for the risk assessment, treatment, and understanding of these diseases. However, in most published studies, linkage signals are modest and vary considerably from one study to another. The multicenter Family Blood Pressure Program has analyzed genome-wide linkage scans of over 12 000 individuals. Based on this experience, we developed a protocol for large linkage studies that reduces two sources of data error: pedigree structure errors and marker genotyping errors. We then used the linkage signals, before and after data cleaning, to illustrate the impact of missing and erroneous data. A comprehensive error-checking protocol is an important part of complex disease linkage studies and enhances gene mapping. The lack of significant and reproducible linkage findings across studies is due, in part, to data quality.
Acknowledgements
The following investigators are associated with the Family Blood Pressure Program. GenNet Network: Alan B Weder (Network Director), Lillian Gleiberman, Anne E Kwitek, Aravinda Chakravarti, Richard S Cooper, Carolina Delgado, Howard J Jacob, and Nicholas J Schork. GENOA Network: Eric Boerwinkle (Network Director), Andy Brown, Christy Thielmier, Robert E Ferrell, Craig Hanis, Sharon Kardia, and Stephen Turner. HyperGEN Network: Steven C Hunt (Network Director), Janet Hood, Donna Arnett, John H Eckfeldt, R Curtis Ellison, Chi Gu, Gerardo Heiss, Paul Hopkins, Jean-Marc Lalouel, Mark Leppert, Albert Oberman, Michael A Province, DC Rao, Treva Rice, and Robert Weiss. SAPPHIRe Network: David Curb (Network Director), David Cox, Timothy Donlon, Victor Dzau, John Grove, Kamal Masaki, Richard Myers, Richard Olshen, Richard Pratt, Tom Quertermous, Neil Risch, and Beatriz Rodriguez. National Heart, Lung, and Blood Institute: Dina Paltoo and Cashell E Jaquish. Website: www.sph.uth.tme.edu/hgc/fbpp.
Additional information
Supplementary information accompanies the paper on European Journal of Human Genetics website (http://www.nature.com/ejhg)
Cite this article
Chang, YP., Kim, JO., Schwander, K. et al. The impact of data quality on the identification of complex disease genes: experience from the Family Blood Pressure Program. Eur J Hum Genet 14, 469–477 (2006). https://doi.org/10.1038/sj.ejhg.5201582
DOI: https://doi.org/10.1038/sj.ejhg.5201582