Fig. 1: Gene panel-based fitness predictions of S. pneumoniae under antibiotic and nutrient stress. | Nature Communications

From: Entropy of a bacterial stress response is a generalizable predictor for fitness and antibiotic sensitivity

a Project setup and overview. Wildtype and adapted strains of S. pneumoniae are exposed to multiple antibiotics belonging to four different classes, and their fitness outcomes in each condition are determined from growth curves. Temporal RNA-Seq data are used to train models that predict the mechanism of action (MOA) of an antibiotic and the fitness outcome of a strain using gene-panel approaches. The concept of entropy is developed to extend predictions to minimum inhibitory concentration (MIC) and fitness for other strains and species, in the presence of antibiotics and in non-antibiotic conditions. CWSI cell wall synthesis inhibitors: AMX amoxicillin, CEF cefepime, CFT ceftriaxone, IMI imipenem, PEN penicillin, VNC vancomycin; DSI DNA synthesis inhibitors: CIP ciprofloxacin, COT cotrimoxazole, LVX levofloxacin, MOX moxifloxacin; RSI RNA synthesis inhibitor: RIF rifampicin; PSI protein synthesis inhibitors: KAN kanamycin, LIN linezolid, TET tetracycline, TOB tobramycin; DAP daptomycin (a membrane disruptor). b A gene panel for fitness prediction is generated by a regularized logistic regression model fit on differential expression data from the training set. The selected value of λ = 0.0428 is shown as a dashed line, resulting in 28 genes in this panel. Red points and error bars represent mean ± standard deviation of error across n = 5 cross-validation folds. c Prediction performance of the fitness gene panel is shown as confusion matrices for the training (top) and test (bottom) datasets. The gene panel yields 10 and 4 false positives, and overall accuracies of 0.93 and 0.77, in the training and test datasets, respectively. d Coefficients of individual features (i.e., genes) are plotted for the model trained on the full dataset and for five cross-validation training folds, in each of which 20% of the data is omitted during model fitting. The gene panel is highly sensitive to the training data, as indicated by many genes having nonzero coefficients on some folds but not others. Only 5 of the 28 genes in the fitness gene panel are maintained as predictors in the regression models across all folds. e Each gene's coefficient is plotted as an individual line against varying values of λ. The gene panel is highly sensitive to λ, as indicated by the nonmonotonic increase or decrease of each gene's coefficient. In fact, many genes have nonzero coefficients only over a small range of λ. The dashed line depicts the selected value of λ, as in (b). f The presence or absence of each of the 28 genes in the S. pneumoniae fitness panel is highly variable across 5 Gram-positive and Gram-negative species. g A published E. coli ciprofloxacin sensitivity panel (ref. 11) also suffers from a lack of conservation across the same group of species. Gene identifiers can be found in Supplementary Table 5.
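
As a rough illustration of the gene-panel approach in panels b, d and e, the sketch below fits a sparse logistic regression and checks which features keep nonzero coefficients across five cross-validation training folds. Nearly everything in it is an assumption made for illustration: the data are simulated, the function names (`fit_l1_logistic`, `panel_stability`) are hypothetical, the L1 penalty and the proximal gradient solver are stand-ins because the caption does not say which regularizer or software was used, and the objective scaling may differ from the authors'. Only the value λ = 0.0428 and the 5-fold design come from the caption.

```python
import numpy as np

def soft_threshold(w, t):
    """Shrink each entry of w toward zero by t (the L1 proximal operator)."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_logistic(X, y, lam, n_iter=500):
    """Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    Assumed stand-in for the caption's "regularized logistic regression";
    the intercept b is not penalized.
    """
    n, p = X.shape
    A = np.hstack([X, np.ones((n, 1))])          # design matrix plus intercept column
    step = 4.0 * n / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30, 30)          # numerical guard against overflow in exp
        resid = 1.0 / (1.0 + np.exp(-z)) - y     # predicted probability minus label
        w = soft_threshold(w - step * (X.T @ resid) / n, step * lam)
        b -= step * resid.mean()
    return w, b

def panel_stability(X, y, lam, k=5, seed=0):
    """Return the selected features for the full fit and for each of k training folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    full_support = fit_l1_logistic(X, y, lam)[0] != 0
    fold_supports = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)  # omit ~1/k of the samples
        w, _ = fit_l1_logistic(X[train], y[train], lam)
        fold_supports.append(w != 0)
    return full_support, np.array(fold_supports)

# Synthetic stand-in for differential expression data (NOT the study's data):
# 200 samples, 50 "genes", of which only the first 5 carry signal.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
X = rng.normal(size=(n_samples, n_genes))
true_w = np.zeros(n_genes)
true_w[:5] = 2.0
y = (rng.random(n_samples) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

full_support, fold_supports = panel_stability(X, y, lam=0.0428)
stable = full_support & fold_supports.all(axis=0)  # selected in the full fit and every fold
print("genes in full-data panel:", int(full_support.sum()))
print("genes kept across all folds:", int(stable.sum()))
```

Comparing the two printed counts mirrors the comparison in panel d: a panel is only as trustworthy as the subset of genes that survives refitting on different training subsets. Rerunning `panel_stability` over a grid of `lam` values would give the analogue of the coefficient paths in panel e.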
