Fig. 7: Cluster congruence at all threshold levels and overlap in detecting outbreak signals for E. coli. | Nature Communications

From: Multi-country and intersectoral assessment of cluster congruence between pipelines for genomics surveillance of foodborne pathogens

a Heatmap with the CS of two pipelines (details on each pairwise comparison are in Supplementary Data 19, with INNUENDO-like-Enterobase vs. INNUENDO-like-INNUENDO99 using the HC algorithm being presented here as an example). The inverted dendrogram (i.e., from the highest to the lowest resolution) and dashed red lines illustrate how the congruence is related to the dataset’s phylogenetic structure (dendrogram obtained with INNUENDO-like-INNUENDO99 and visualized in auspice.us (ref. 141)). b Zoom-in on the high-resolution level highlighted in orange in (a). c Bi-directional corresponding points (gray lines) connecting thresholds that provide similar clustering in the two pipelines exemplified in (a). d Illustrative linear trend lines expected for the corresponding points with slope deviations of 10% and 20%, used as a scale reference for the boxplots. The boxplot presents the slope distribution for allele vs. allele (orange, n = 68) pipeline comparisons for the linear trend lines with r² ≥ 0.99, illustrated in Supplementary Data 19 and detailed in Supplementary Data 23 (“n” refers to the number of comparisons with r² ≥ 0.99 over the total number of comparisons). The boxplot of the allele vs. SNP scenario is not presented due to the low number of comparisons with r² ≥ 0.99 (Supplementary Data 23). e Density of the distance thresholds required for the identification of clusters detected by at least one allele-based pipeline at 9 ADs. Only clusters having the same composition in all allele-based pipelines were included (n = 185). f Distribution of the difference between the minimum and maximum AD threshold needed to detect the same clusters across allele-based pipelines, using the clusters of (e) (n = 185). g Overlap between the genetic clusters detected at 9 ADs. h Overlap between the genetic clusters detected by one pipeline at 9 ADs and those detected by the others at ≤ 12 ADs.
Boxplots in (d) and (f) show the interquartile range and median; whiskers extend to 1.5 times the interquartile range, with outliers plotted separately. Source data are provided as a Source Data file.
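
As a reading aid for panels (f) to (h), the quantities described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each pipeline's output is available in a hypothetical layout mapping every AD threshold to the set of clusters found at that threshold, and all function and variable names are invented for the example.

```python
from typing import Dict, FrozenSet, Optional, Set

Cluster = FrozenSet[str]              # a cluster is a set of sample IDs
Partitions = Dict[int, Set[Cluster]]  # hypothetical layout: AD threshold -> clusters at it


def clusters_up_to(partitions: Partitions, max_ad: int) -> Set[Cluster]:
    """All clusters that appear at any threshold <= max_ad."""
    found: Set[Cluster] = set()
    for threshold, clusters in partitions.items():
        if threshold <= max_ad:
            found |= clusters
    return found


def min_threshold(partitions: Partitions, cluster: Cluster) -> Optional[int]:
    """Lowest threshold at which this exact cluster composition appears, if any."""
    hits = [t for t, clusters in partitions.items() if cluster in clusters]
    return min(hits) if hits else None


def threshold_spread(pipelines: Dict[str, Partitions], cluster: Cluster) -> Optional[int]:
    """Max minus min detection threshold across pipelines (cf. panel f).

    Returns None when at least one pipeline never yields this composition.
    """
    thresholds = [min_threshold(p, cluster) for p in pipelines.values()]
    if not thresholds or any(t is None for t in thresholds):
        return None
    return max(thresholds) - min(thresholds)


def overlap_fraction(
    reference: Partitions,
    other: Partitions,
    ref_ad: int = 9,
    other_max_ad: Optional[int] = None,
) -> float:
    """Fraction of the reference pipeline's clusters at ref_ad also found by the other.

    With other_max_ad=None the other pipeline is read at ref_ad only (cf. panel g);
    with other_max_ad=12 any threshold <= 12 counts as a match (cf. panel h).
    """
    ref_clusters = reference.get(ref_ad, set())
    if not ref_clusters:
        return 0.0
    if other_max_ad is None:
        other_clusters = other.get(ref_ad, set())
    else:
        other_clusters = clusters_up_to(other, other_max_ad)
    return len(ref_clusters & other_clusters) / len(ref_clusters)
```

Matching here is by exact cluster composition, mirroring the caption's restriction to clusters "having the same composition"; a tolerance for partial matches would need a different comparison.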
