Fig. 2: Summary of identified horizontal transfers of antibiotic resistance genes (ARGs), and performance of random forest models trained to predict horizontal ARG transfer.

a Total number of ARGs predicted in 867,318 bacterial genomes, stratified by encoded resistance mechanism. The resistance mechanisms include aminoglycoside acetyltransferases (AAC), aminoglycoside phosphotransferases (APH), class A, C, and D beta-lactamases, class B beta-lactamases, Erm 23S rRNA methyltransferases, Mph 2′-macrolide phosphotransferases, tetracycline efflux pumps (Tet efflux), tetracycline-inactivating enzymes (Tet enzyme), tetracycline ribosomal protection genes (Tet RPG), and quinolone resistance genes (Qnr). b Total number of detected instances of ARGs horizontally transferred between distantly related bacterial hosts, stratified by encoded resistance mechanism. c Receiver operating characteristic (ROC) curves produced from predictions on test data by random forest models trained on horizontal transfers representing all included resistance mechanisms, over ten iterations. Each model was built using variables representing the genetic incompatibility, environmental co-occurrence, and cell envelope of the bacteria involved in each transfer. The black line represents the mean of the ROC curves. The point represents the mean optimal performance (the point closest to a sensitivity and specificity of 1). d Area under the ROC curve (AUROC), sensitivity, and specificity observed for predictions on test data by random forest models for each resistance mechanism with sufficient data (>100 observed transfers). The bars show the mean ± SD of the observed metrics over ten iterations. Source data are provided as a Source Data file.
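
The optimal operating point described for panel c (the point closest to a sensitivity and specificity of 1) can be illustrated with a minimal sketch. The caption does not state which distance measure was used, so Euclidean distance is assumed here. The function name `closest_to_ideal` and the example values are hypothetical and are not taken from the study.

```python
import math


def closest_to_ideal(sensitivity, specificity):
    """Return (index, distance) of the ROC point nearest to sensitivity = specificity = 1.

    Assumption: "closest" means smallest Euclidean distance to the ideal corner.
    """
    if len(sensitivity) != len(specificity) or not sensitivity:
        raise ValueError("need two non-empty sequences of equal length")
    # Distance from each (sensitivity, specificity) pair to the ideal point (1, 1).
    distances = [
        math.hypot(1.0 - se, 1.0 - sp)
        for se, sp in zip(sensitivity, specificity)
    ]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical ROC points for a single model iteration (illustrative values only).
sens = [0.0, 0.6, 0.8, 0.95, 1.0]
spec = [1.0, 0.95, 0.85, 0.5, 0.0]
best_index, best_distance = closest_to_ideal(sens, spec)
```

Under this assumption, a mean optimal point over ten iterations would be obtained by applying the function to each iteration's curve and averaging the selected sensitivity and specificity values.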