Extended Data Fig. 8: Re-investigation of all resistance-conferring DNA fragments from the metagenomic screens.

a, A significantly higher proportion of ARGs that were not detected to provide a resistance phenotype in any species were present on a single resistance-conferring DNA fragment, compared with ARGs that were detected to provide a resistance phenotype in at least one species (two-tailed Fisher’s exact test, P = 0.032, n = 80, Supplementary Table 10). b, Estimated accuracy of the screen, taking the MIC measurements as the gold-standard dataset. Note that we excluded one ARG (QnrB73) from the MIC measurements, as re-introduction of this ARG into each of the four host bacterial species was not confirmed by sequencing of the plasmid library (Source Data File 9). Presence of resistance in the MIC dataset was defined as a more than two-fold change in relative MIC value. False negative hits are those ARGs that were not detected in the screen but showed a resistance phenotype in the MIC measurements. False positive hits are those ARGs that did not provide resistance in the MIC measurements but were detected to show a resistance phenotype in the screen. We assumed plasmid hitchhiking to be a primary source of false positives. Data are available in Supplementary Table 9. c, Distribution of adjusted Jaccard similarity coefficients, which represent the overlaps of functional ARG sets between pairs of host species after controlling for measurement accuracy using a stochastic approach (see Methods). Dashed line, blue line and red lines represent the measured Jaccard similarity coefficient for host species pairs, the median of the adjusted Jaccard similarity coefficients and the lower and upper bounds of the 95% confidence intervals, respectively. d, In total, only ~46% of the ARGs (~29 out of 63) are estimated to provide resistance in all four bacterial host species. Histogram shows the number of ARGs estimated to confer resistance in all four host species when the false positive and false negative rates of the screen are taken into account using a stochastic approach (see Methods).
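The definitions used in this legend (resistance as a more than two-fold change in relative MIC, false positive and false negative hits, and the Jaccard similarity coefficient between functional ARG sets) can be illustrated with a minimal sketch. It is not the stochastic adjustment described in the Methods. The function names, ARG labels and fold-change values below are hypothetical and serve only as an illustration.

```python
def jaccard(set_a, set_b):
    """Jaccard similarity coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def confusion_counts(screen_hits, mic_fold_change, threshold=2.0):
    """Compare screen detections with MIC measurements (gold standard).

    An ARG counts as resistant in the MIC dataset when its relative MIC
    fold change is strictly greater than `threshold`.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for arg, fold_change in mic_fold_change.items():
        resistant = fold_change > threshold
        detected = arg in screen_hits
        if resistant and detected:
            counts["TP"] += 1
        elif detected:
            counts["FP"] += 1  # detected in screen, no resistance by MIC
        elif resistant:
            counts["FN"] += 1  # resistance by MIC, missed by the screen
        else:
            counts["TN"] += 1
    return counts


# Hypothetical data for one host species (not taken from the study).
mic_fold_change = {"argA": 8.0, "argB": 1.0, "argC": 4.0, "argD": 2.0}
screen_hits = {"argA", "argB"}

counts = confusion_counts(screen_hits, mic_fold_change)
mic_resistant = {a for a, fc in mic_fold_change.items() if fc > 2.0}
overlap = jaccard(mic_resistant, screen_hits)
```

With these made-up values, argD sits exactly at a two-fold change and is therefore not counted as resistant, which reflects the "more than two-fold" wording of the legend.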