Abstract
Microbial community samples can be efficiently surveyed in high throughput by sequencing markers such as the 16S ribosomal RNA gene. Often, a subset of these samples is then selected for metagenomic, metabolomic or other follow-up. Two-stage study design has long been used in ecology but has not yet been studied in depth for high-throughput microbial community investigations. To avoid ad hoc sample selection, we developed and validated several purposive sample selection methods (that is, methods based on biological criteria) for two-stage studies targeting differing types of microbial communities. These methods select follow-up samples from large community surveys using criteria that include selecting samples typical of the initially surveyed population, targeting specific microbial clades or rare species, maximizing diversity, representing extreme or deviant communities, and identifying communities that are distinct or that discriminate among environments or host phenotypes. The accuracy of each sampling technique and its influence on the characteristics of the resulting selected microbial communities were evaluated using both simulated and experimental data. Specifically, all criteria were able to identify samples whose properties were accurately retained in 318 paired 16S amplicon and whole-community metagenomic (follow-up) samples from the Human Microbiome Project. Some selection criteria resulted in follow-up samples that were strongly non-representative of the original survey population; diversity maximization in particular undersampled community configurations. Only selection of intentionally representative samples minimized differences between the selected sample set and the original microbial survey. An implementation is provided as the microPITA (Microbiomes: Picking Interesting Taxa for Analysis) software for two-stage study design of microbial communities.
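To make the selection criteria above concrete, the following is a minimal Python sketch of four of them (maximum diversity, targeted taxon, extreme and representative) applied to a small table of per-sample taxon counts. It is an illustration under stated assumptions, not the microPITA implementation: the function names, the example table, the use of inverse Simpson diversity and Bray-Curtis dissimilarity, and the ranking of samples by mean dissimilarity to all other samples are all choices made here for illustration. Every sample is assumed to have a nonzero total count.

```python
def relative_abundance(counts):
    """Convert raw counts to proportions (assumes a nonzero total)."""
    total = sum(counts)
    return [c / total for c in counts]


def inverse_simpson(counts):
    """Inverse Simpson diversity, 1 / sum(p_i^2), for one sample."""
    return 1.0 / sum(p * p for p in relative_abundance(counts))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two relative-abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def mean_dissimilarity(table):
    """Mean Bray-Curtis dissimilarity of each sample to all other samples."""
    rel = {s: relative_abundance(c) for s, c in table.items()}
    return {
        s: sum(bray_curtis(rel[s], rel[t]) for t in rel if t != s) / (len(rel) - 1)
        for s in rel
    }


def select_most_diverse(table, k):
    """Illustrative 'maximum diversity' criterion: top k by inverse Simpson."""
    return sorted(table, key=lambda s: inverse_simpson(table[s]), reverse=True)[:k]


def select_targeted(table, taxon_index, k):
    """Illustrative 'targeted taxon' criterion: top k by that taxon's proportion."""
    return sorted(
        table,
        key=lambda s: relative_abundance(table[s])[taxon_index],
        reverse=True,
    )[:k]


def select_extreme(table, k):
    """Illustrative 'extreme' criterion: k samples least similar to the rest."""
    d = mean_dissimilarity(table)
    return sorted(d, key=d.get, reverse=True)[:k]


def select_representative(table, k):
    """Illustrative 'representative' criterion: k samples most similar to the rest."""
    d = mean_dissimilarity(table)
    return sorted(d, key=d.get)[:k]


# Hypothetical first-stage survey: sample ID -> counts for four taxa.
table = {
    "s1": [10, 10, 10, 10],
    "s2": [37, 1, 1, 1],
    "s3": [20, 15, 5, 0],
    "s4": [0, 0, 5, 35],
}

if __name__ == "__main__":
    print("most diverse:", select_most_diverse(table, 2))
    print("targeting taxon 3:", select_targeted(table, 3, 1))
    print("extreme:", select_extreme(table, 1))
    print("representative:", select_representative(table, 1))
```

In this sketch each criterion reduces to ranking samples by a single score, which makes it easy to see how different criteria can pick very different follow-up sets from the same survey. A real two-stage design would likely use more careful approaches, for example clustering for representative selection or a supervised model for discriminating between phenotypes.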
Acknowledgements
We thank Daniela Boernigen, Xochitl Morgan, Vagheesh Narasimhan and Joshua Reyes for their input on methodology. This work was supported by Army Research Office grant W911NF-11-1-0429, National Science Foundation grant CAREER DBI-1053486, Danone grant PLF-5972-GD and the Juvenile Diabetes Research Foundation.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies this paper on The ISME Journal website
About this article
Cite this article
Tickle, T., Segata, N., Waldron, L. et al. Two-stage microbial community experimental design. ISME J 7, 2330–2339 (2013). https://doi.org/10.1038/ismej.2013.139
DOI: https://doi.org/10.1038/ismej.2013.139