Abstract
Disruption of healthy microbial communities has been linked to numerous diseases, yet microbial interactions remain poorly understood. This is due in part to the large number of bacteria and the much larger number of possible interactions (easily in the millions), which makes experimental investigation very difficult at best and motivates the nascent field of computational exploration through microbial correlation networks. We benchmark the performance of eight correlation techniques on simulated and real data in response to challenges specific to microbiome studies: fractional sampling of ribosomal RNA sequences, uneven sampling depths, rare microbes and a high proportion of zero counts. We also test the ability to distinguish signal from noise and to detect a range of ecological and time-series relationships. Finally, we provide specific recommendations for correlation technique usage. Although some methods perform better than others, there is still considerable need for improvement in current techniques.
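As an illustration of why fractional sampling is a challenge for correlation methods, the following minimal sketch shows how converting counts to relative abundances can create a correlation that is absent from the absolute counts. The counts and the names `taxon_a`, `taxon_b` and `pearson` are hypothetical and chosen only for this example; they are not data or code from the study, and the two-taxon case is the most extreme form of the effect.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical absolute abundances of two taxa across four samples,
# constructed so that the two taxa are exactly uncorrelated.
taxon_a = [10, 20, 10, 20]
taxon_b = [5, 5, 15, 15]

# Sequencing only reports fractions of each sample, so divide by the sample total.
totals = [a + b for a, b in zip(taxon_a, taxon_b)]
rel_a = [a / t for a, t in zip(taxon_a, totals)]
rel_b = [b / t for b, t in zip(taxon_b, totals)]

r_absolute = pearson(taxon_a, taxon_b)  # zero by construction
r_relative = pearson(rel_a, rel_b)      # forced to -1 because rel_b = 1 - rel_a

print(f"absolute counts:     r = {r_absolute:.3f}")
print(f"relative abundances: r = {r_relative:.3f}")
```

With only two taxa the fractions must sum to one, so the relative abundances are perfectly anticorrelated no matter how the underlying counts behave. With many taxa the distortion is weaker but still present.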
Acknowledgements
WVT and SJW were supported by the National Human Genome Research Institute Grant# 3 R01 HG004872-03S2 and the National Institutes of Health Grant# 5 U01 HG004866-04. JAF and JAC were supported by the Gordon and Betty Moore Foundation Grant# GBMF3779 and NSF Grant# 1136818. This work was supported in part by the Howard Hughes Medical Institute (RK was an HHMI Early Career Scientist).
Author contributions
WVT, SJW, CL and RK designed and conceived analyses. WVT and SJW performed data analysis and wrote the manuscript. KF, JF, YD, LCX and ZX ran the CoNet, SparCC, RMT, LSA and MIC correlation network techniques, respectively. All authors provided invaluable feedback and insights into analyses and the manuscript. All authors approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies this paper on The ISME Journal website
About this article
Cite this article
Weiss, S., Van Treuren, W., Lozupone, C. et al. Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. ISME J 10, 1669–1681 (2016). https://doi.org/10.1038/ismej.2015.235
DOI: https://doi.org/10.1038/ismej.2015.235