Fig. 2: Association between phylogenetic cluster and isolation source.

Association of S. marcescens clusters with animal, clinical or environmental isolation sources. To avoid potential biases due to sampling proximity, the χ² test was repeated 1000 times on geographically balanced subsets. The box plot shows the distribution of the Pearson residuals for each cluster-isolation source combination, and the red horizontal lines mark the thresholds for statistical significance of the residuals (see “Methods” section). Cluster 1 is associated with clinical samples (>95% of subsets are significant). Cluster 3 is enriched in environmental isolation sources (52% of subsets), and Cluster 5 is enriched in both environmental (76% of subsets) and animal (42% of subsets) isolation sources. The number of strains in each cluster for each isolation source (n) is displayed on the x-axis of the box plots.
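
As a rough illustration of the resampling idea described in this caption, the following Python sketch computes Pearson residuals, (observed − expected) / √expected, for each cluster-isolation source cell over repeated subsets. It is not the authors' pipeline. The function names, the `(cluster, source, region)` record layout, and in particular the balancing rule (at most `per_region` strains drawn per region) are assumptions made for illustration only; the actual balancing procedure and significance thresholds are those given in the “Methods” section.

```python
import math
import random
from collections import Counter, defaultdict


def pearson_residuals(counts, rows, cols):
    """Return {(row, col): (O - E) / sqrt(E)} for a contingency table.

    counts maps (row, col) to an observed count; missing cells count as 0.
    """
    row_tot = {r: sum(counts.get((r, c), 0) for c in cols) for r in rows}
    col_tot = {c: sum(counts.get((r, c), 0) for r in rows) for c in cols}
    n = sum(row_tot.values())
    residuals = {}
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n if n else 0.0
            observed = counts.get((r, c), 0)
            # A cell with zero expected count has no defined residual.
            if expected > 0:
                residuals[(r, c)] = (observed - expected) / math.sqrt(expected)
            else:
                residuals[(r, c)] = float("nan")
    return residuals


def balanced_subset(records, per_region, rng):
    """Hypothetical balancing rule: keep at most per_region records per region."""
    by_region = defaultdict(list)
    for rec in records:
        by_region[rec[2]].append(rec)
    subset = []
    for recs in by_region.values():
        subset.extend(rng.sample(recs, min(per_region, len(recs))))
    return subset


def resample_residuals(records, n_rep=1000, per_region=5, seed=0):
    """Collect the residual of every (cluster, source) cell over n_rep subsets."""
    rng = random.Random(seed)
    rows = sorted({rec[0] for rec in records})  # clusters
    cols = sorted({rec[1] for rec in records})  # isolation sources
    dist = defaultdict(list)
    for _ in range(n_rep):
        subset = balanced_subset(records, per_region, rng)
        counts = Counter((cluster, source) for cluster, source, _ in subset)
        for cell, res in pearson_residuals(counts, rows, cols).items():
            if not math.isnan(res):
                dist[cell].append(res)
    return dist
```

Each list in the returned dictionary corresponds to one box in a plot of this kind. The share of subsets in which a cell is significant would then be the fraction of its residuals lying beyond the chosen threshold, which this sketch deliberately leaves to the caller.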