Abstract
Pooling of DNA samples instead of individual genotyping can speed up genetic association studies. However, for microsatellite markers, the electrophoretic pattern of DNA pools can be complex, and procedures for deriving allele frequencies are often confounded by PCR-induced stutter artefacts. We have developed a mathematical procedure to remove stutter noise and accurately determine allele frequencies in pools. A stutter correction model can be reliably derived from one standard ‘training set’ of the same 10 individual DNA samples for each marker, which can also include heterozygous patterns with partially overlapping peaks. Compared with earlier methods, this reduces the number of genotypes needed in the training set considerably, and allows standardization of analyses for different markers. Moreover, the use of a procedure that fits all data simultaneously makes the method less sensitive to aberrant data. The model was tested with 34 markers, 18 of which were newly defined from human sequence data. Allele frequencies derived from stutter-corrected DNA pool patterns were compared with the summed individual genotyping results of all the individuals in the pools (n=109 and n=64). We show that the model is robust and accurately extracts allele frequencies from pooled DNA samples for 32 of the 34 microsatellite markers tested. Finally, we performed a case–control study in celiac disease and found that weakly associated disease alleles, identified by individual genotyping, were only detectable in pools after stutter correction. This efficient method for correcting stutter artefacts in microsatellite markers enables large-scale genetic association studies using DNA pools to be performed.
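The idea behind stutter correction can be illustrated with a minimal sketch. It is not the PoolFitter procedure, whose details the abstract does not give. It assumes, purely for illustration, that every allele produces the same fixed stutter pattern (a main peak plus smaller bands at shorter fragment lengths) and that the pool pattern is the frequency-weighted sum of these patterns. The function names, the stutter heights and the example frequencies are all hypothetical.

```python
import numpy as np

def stutter_matrix(n_alleles, stutter):
    """Column j holds the peak pattern expected from a pure sample of allele j.

    `stutter` lists relative peak heights: main peak first, then stutter
    bands one, two, ... repeat units shorter (an assumed, fixed pattern).
    Bands that would fall below the shortest allele are simply dropped.
    """
    S = np.zeros((n_alleles, n_alleles))
    for j in range(n_alleles):
        for k, height in enumerate(stutter):
            i = j - k  # stutter bands fall on shorter fragments
            if i >= 0:
                S[i, j] = height
    return S

def correct_pool(peaks, stutter):
    """Estimate allele frequencies from pooled peak heights by least squares."""
    peaks = np.asarray(peaks, dtype=float)
    S = stutter_matrix(len(peaks), stutter)
    est, *_ = np.linalg.lstsq(S, peaks, rcond=None)
    est = np.clip(est, 0.0, None)  # frequencies cannot be negative
    return est / est.sum()         # normalise so the frequencies sum to 1

# Hypothetical pool: four alleles with known frequencies (illustration only).
true_freqs = np.array([0.1, 0.2, 0.4, 0.3])
stutter = [1.0, 0.3, 0.1]  # assumed relative heights, not measured values

observed = stutter_matrix(4, stutter) @ true_freqs  # simulated pool pattern
naive = observed / observed.sum()                   # no stutter correction
estimated = correct_pool(observed, stutter)

print("naive    :", np.round(naive, 3))
print("corrected:", np.round(estimated, 3))
```

In this noise-free toy case the corrected estimate recovers the simulated frequencies, whereas the uncorrected peak heights overstate the shorter alleles. Real data would need a stutter pattern fitted per marker from a training set, as the abstract describes.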
Acknowledgements
We acknowledge the special contribution of Lodewijk Sandkuijl in formulating some of the basic concepts embodied in this work; he died shortly before completion of the manuscript. We also thank Martine van Belzen for providing the celiac disease samples and individual genotypes, and Jackie Senior for critically reading the manuscript.
Additional information
Electronic Database Information
The PoolFitter program and additional illustrative figures are available at our website: http://www.smri.nl/microsatellites
CLUMP (DOS version): http://www.mds.qmw.ac.uk/statgen/dcurtis/software.html
Primer3: http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi
Tandem Repeat Finder: http://c3.biomath.mssm.edu/trf.html
Genome Database (mirror site): http://gdbwww.dkfz-heidelberg.de/
Marshfield Center for Medical Genetics: http://research.marshfieldclinic.org/genetics/
About this article
Cite this article
Schnack, H., Bakker, S., van't Slot, R. et al. Accurate determination of microsatellite allele frequencies in pooled DNA samples. Eur J Hum Genet 12, 925–934 (2004). https://doi.org/10.1038/sj.ejhg.5201234
DOI: https://doi.org/10.1038/sj.ejhg.5201234
This article is cited by
- MPDA: Microarray pooled DNA analyzer. BMC Bioinformatics (2008)
- Quantitative Single-letter Sequencing: a method for simultaneously monitoring numerous known allelic variants in single DNA samples. BMC Genomics (2008)
- Empirical evaluation of selective DNA pooling to map QTL in dairy cattle using a half-sib design by comparison to individual genotyping and interval mapping. Genetics Selection Evolution (2007)


