Abstract
We have used targeted genomic sequencing of high-complexity DNA pools based on long-range PCR and deep DNA sequencing by the SOLiD technology. The method was used for sequencing of 286 kb from four chromosomal regions with quantitative trait loci (QTL) influencing blood plasma lipid and uric acid levels in DNA pools of 500 individuals from each of five European populations. The method shows very good precision in estimating allele frequencies as compared with individual genotyping of SNPs (r² = 0.95, P < 10⁻¹⁶). Validation shows that the method is able to identify novel SNPs and estimate their frequency in high-complexity DNA pools. In our five populations, 17% of all SNPs and 61% of structural variants are not available in the public databases. A large fraction of the novel variants show a limited geographic distribution, with 62% of the novel SNPs and 59% of novel structural variants being detected in only one of the populations. The large number of population-specific novel SNPs underscores the need for comprehensive sequencing of local populations in order to identify the causal variants of human traits.
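The comparison between pooled sequencing and individual genotyping summarised above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the pooled allele frequency at a SNP is simply the fraction of reads carrying the alternative allele, and it measures agreement with genotyping-derived frequencies by the squared Pearson correlation. The function names and all read counts and frequencies are hypothetical, made up for illustration only.

```python
def pooled_allele_frequency(alt_reads, total_reads):
    """Estimate an allele frequency in a DNA pool as the fraction of reads
    carrying the alternative allele (a simplifying assumption)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def pearson_r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: (alternative-allele reads, total reads) per SNP in one pool.
read_counts = [(120, 1000), (450, 1500), (30, 1200), (800, 2000)]
# Hypothetical frequencies for the same SNPs from individual genotyping.
genotyped = [0.11, 0.31, 0.03, 0.42]

pooled = [pooled_allele_frequency(alt, total) for alt, total in read_counts]
r2 = pearson_r_squared(pooled, genotyped)
print("pooled estimates:", pooled)
print("r squared vs genotyping:", round(r2, 4))
```

A read-fraction estimator ignores sequencing error and unequal contribution of individuals to the pool, so real analyses would typically need additional filtering and error modelling.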
Acknowledgements
The SOLiD DNA sequencing and TaqMan genotyping were performed by the Uppsala Genome Center, funded by the Knut and Alice Wallenberg Foundation (CMS), The Swedish Natural Sciences Research Council (SNISS) and Science for Life Laboratory, Uppsala. This work was supported by the following grants and agencies: Swedish Medical Sciences Research Council, the Foundation for Strategic Research (SSF), the Linnaeus Centre for Bioinformatics (LCB), European Commission FP6 STRP grant number 018947 (LSHG-CT-2006-01947), The Netherlands Organisation for Scientific Research and the Russian Foundation for Basic Research (NWO-RFBR 047.017.043), European Commission FP7 grant LipidomicNet (2007-202272), NWO, ErasmusMC and the Centre for Medical Systems Biology (CMSB), the Ministry of Health and Department of Educational Assistance, University and Research of the Autonomous Province of Bolzano, and the South Tyrolean Sparkasse Foundation, the Scottish Executive Health Department and the Royal Society, the Medical Research Council UK, Ministry of Science, Education, and Sport of the Republic of Croatia (number 108-1080315-0302), Deutsche Forschungsgemeinschaft, the German Federal Ministry of Education and Research in the context of the German National Genome Research Network and Cardiogenics (EU-funded integrated project LSHM-CT-2006-037593), the GSF-National Research Centre for Environment and Health funded by the German Federal Ministry of Education and Research and of the State of Bavaria.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies the paper on the European Journal of Human Genetics website
About this article
Cite this article
Zaboli, G., Ameur, A., Igl, W. et al. Sequencing of high-complexity DNA pools for identification of nucleotide and structural variants in regions associated with complex traits. Eur J Hum Genet 20, 77–83 (2012). https://doi.org/10.1038/ejhg.2011.138
DOI: https://doi.org/10.1038/ejhg.2011.138