Fig. 6: Workflow of the ML pipeline and prediction performance results correlating the broiler chicken resistome with the antimicrobial resistance/susceptibility profiles on the farm, and regression analysis flow diagram correlating the local temperature and humidity with the resistome.
From: Dissecting microbial communities and resistomes for interconnected humans, soil, and livestock

A Supervised ML pipeline used to search for correlations between ARGs (features) present in the broiler chicken faecal metagenomes and the antimicrobial resistance/susceptibility profiles of cultured E. coli isolates from the same sources. The pipeline consists of three stages: feature selection (shown in yellow), classification (shown in blue) and postprocessing analysis using an undirected graph network (shown in red). First, the synthetic minority over-sampling technique (SMOTE) was used to balance the data, and a chi-square test was used to select the features most strongly correlated with the AMR phenotype. Next, for the classification stage, a panel of ML models consisting of 5 classifiers (logistic regression, linear support vector machine, radial basis function support vector machine, extra tree classifier and random forest) and 2 meta-methods (AdaBoost and XGBoost) was used to predict the AMR phenotype based on the presence/absence of ARGs in the broiler chicken faecal metagenomes. B Prediction performance results of the classification. Five performance indicators were used to evaluate the ML models: AUC (area under the receiver operating characteristic curve), accuracy, sensitivity, specificity, and precision. These were generated from 30 iterations of nested cross-validation. The ML models were run for the following antibiotics: aztreonam (AZM), cefotaxime (CTX), cefotaxime/clavulanic acid (CTX-C), cefoxitin (CFX), ceftazidime/clavulanic acid (CAZ-C), chloramphenicol (CHL), ciprofloxacin (CIP), gentamicin (GEN), kanamycin (KAN), minocycline (MIN), streptomycin (STR). C Regression analysis pipeline to investigate whether the ARGs associated with antimicrobial resistance/susceptibility profiles on the farm (those selected by the feature selection step of the ML pipeline in panel A) were also correlated with the environmental temperature and humidity.
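
The chi-square feature selection step described for panel A can be illustrated with a minimal sketch. This is not the authors' code: it assumes a Pearson chi-square statistic computed on the 2x2 table of ARG presence/absence against a binary resistant/susceptible label, which may differ from the implementation used in the study. The gene names (`arg_a`, `arg_b`, `arg_c`), the toy data and the helper names `chi_square_2x2` and `select_top_k` are all hypothetical, and the SMOTE balancing and classification stages are omitted.

```python
from collections import Counter


def chi_square_2x2(feature, labels):
    """Pearson chi-square statistic for a binary feature vs. a binary label."""
    n = len(feature)
    observed = Counter(zip(feature, labels))  # counts for each (feature, label) cell
    row = Counter(feature)                    # marginal counts: gene absent/present
    col = Counter(labels)                     # marginal counts: susceptible/resistant
    stat = 0.0
    for f in (0, 1):
        for y in (0, 1):
            expected = row[f] * col[y] / n
            if expected > 0:  # skip cells whose expected count is zero
                stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


def select_top_k(X, y, names, k):
    """Return the k feature names with the largest chi-square statistic."""
    scores = {
        name: chi_square_2x2([sample[j] for sample in X], y)
        for j, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: rows are samples, columns are ARG presence (1) / absence (0).
names = ["arg_a", "arg_b", "arg_c"]
X = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = resistant, 0 = susceptible (assumed encoding)

selected = select_top_k(X, y, names, k=2)
print(selected)
```

In this toy example `arg_a` tracks the phenotype perfectly, `arg_b` is independent of it, and `arg_c` is partially associated, so the ranking keeps `arg_a` and `arg_c`. A real analysis would apply the selection inside each cross-validation fold to avoid leaking information from the test data.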