Fig. 2: Application of AI to AMR surveillance. | npj Antimicrobials and Resistance

From: How AI can help us beat AMR

a Common methods of representing genetic information for antimicrobial resistance (AMR) prediction include gene-based, mutation-based, and composition-based encodings. Gene-based encodings denote the presence/absence of AMR genes (ARGs) identified through alignment to AMR databases. Mutation-based encodings represent the presence/absence or type of mutation relative to a reference genome. Composition-based encodings denote the presence/absence of subsequences of a specified length (k-mers).

b Pataki et al. trained a random forest (RF) model on geographically diverse whole-genome sequencing data to predict the minimum inhibitory concentration (MIC) of ciprofloxacin in Escherichia coli. Data were represented using a combined gene- and mutation-based encoding and used to train an RF for feature selection, ranking each feature by its contribution to reducing Gini impurity, a measure of how mixed the classes are at a decision tree node. The top four features were used to train a second RF to predict ciprofloxacin MIC in E. coli isolates of unknown country of origin.

c DeepARG-SS takes raw sequencing data as input and predicts 30 AMR categories. To simulate raw reads during model training, putative ARGs were broken into 100-nucleotide (nt) reads. Reads are aligned to known ARGs, and those with sequence homology below a user-defined cut-off are discarded. Each remaining read is represented by its normalized bit scores against 4333 known ARGs and fed into a multiclass feed-forward neural network (FFNN), which outputs a prediction score (PS) for each of the 30 AMR categories.

d Baker et al. used metagenomic sequencing data collected from chicken farms across China to identify key predictors of AMR dissemination via horizontal gene transfer (HGT). Metagenomic reads were filtered for host genes and assembled de novo. Mobile ARGs were identified based on alignment to known ARGs and distance from mobile genetic elements. For metagenomic samples that tested positive for E. coli, microbial species abundance and predicted mobile-ARG counts were quantified from the source chicken gut and used to train seven model types (three examples shown) to predict associated resistance against 10 antibiotics. Models with an area under the receiver operating characteristic curve (AUROC) greater than 0.9 were used to select predictors.
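
As a rough illustration of two ideas in the caption, the sketch below builds a composition-based (k-mer presence/absence) encoding as described in panel a, and computes the Gini impurity of a set of labels, the quantity whose reduction drives RF feature ranking in panel b. It is a minimal sketch, not the pipeline used by any of the cited studies: the function names, the choice of k, and the toy sequence and labels are all assumptions made for illustration.

```python
from itertools import product


def kmer_presence_vector(sequence: str, k: int = 3) -> list[int]:
    """Return a 0/1 vector over all 4**k DNA k-mers (hypothetical helper).

    Positions follow lexicographic order over the alphabet A, C, G, T.
    """
    sequence = sequence.upper()
    # Collect every k-length window observed in the sequence.
    observed = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    # One position per possible k-mer: 1 if observed, otherwise 0.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [1 if kmer in observed else 0 for kmer in vocabulary]


def gini_impurity(labels: list[str]) -> float:
    """Gini impurity of a collection of class labels: 1 - sum(p_c ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts: dict[str, int] = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Toy sequence (assumption): contains the 2-mers AC, CG and GT.
    vector = kmer_presence_vector("ACGT", k=2)
    print(len(vector), sum(vector))

    # Toy resistant/susceptible labels (assumption): an evenly mixed node.
    print(gini_impurity(["R", "R", "S", "S"]))
```

A split that separates resistant from susceptible isolates lowers the impurity of the child nodes relative to the parent; features that produce large reductions across the trees rank highly. In practice the vocabulary grows as 4^k, so real k-mer encodings are usually stored sparsely rather than as dense lists.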
