Fig. 7

Identification of potential plasma biomarkers for microbial diagnosis of NSTIs. a ROC curves comparing Random Forest (RF, green), linear support vector machine (SVM, blue) and logistic regression (LR, red) models on the training cohort (n = 12 S. pyogenes NSTIs, n = 22 non-S. pyogenes NSTIs) using the full panel of measured variables. AUC values ± 95% CI are given. b Selection of relevant plasma markers for discrimination between S. pyogenes and non-S. pyogenes NSTIs in the training cohort using the Boruta algorithm. Boxplots of features are sorted by increasing importance according to their Z-scores. Features colored green were classified as relevant (Z-scores higher than shadowMax). Features colored red are unimportant for model performance. The blue boxes correspond to the minimal (shadowMin), mean (shadowMean) and maximal (shadowMax) importance calculated from randomly permuted features. c ROC curves for an RF classifier trained on the full panel of features (red) and a 3-feature model trained solely on CXCL9, CXCL10, and CXCL11 (green) in the training dataset. AUC values ± 95% CI are given. d ROC curve showing the performance of the 3-feature RF classifier in the independent validation cohort (n = 27 S. pyogenes NSTIs, n = 32 non-S. pyogenes NSTIs). e Confusion matrix summarizing the performance of the 3-feature model in the independent validation cohort. Each row of the confusion matrix shows the number of samples in an actual class, while each column shows the number of samples in a predicted class. Tiles showing the number of correctly classified cases are colored blue (non-S. pyogenes) or red (S. pyogenes). Source data are available as a source data file.
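
The row and column convention of the confusion matrix in panel e can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `confusion_matrix` and the five toy samples are assumptions made purely for illustration and do not reproduce any value from the study.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count samples per (actual, predicted) pair.

    Rows index the actual class and columns the predicted class,
    matching the convention described for panel e.
    """
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Toy labels for illustration only; these are NOT the study's data.
labels = ["non-S. pyogenes", "S. pyogenes"]
actual = [
    "S. pyogenes",
    "S. pyogenes",
    "non-S. pyogenes",
    "non-S. pyogenes",
    "S. pyogenes",
]
predicted = [
    "S. pyogenes",
    "non-S. pyogenes",
    "non-S. pyogenes",
    "non-S. pyogenes",
    "S. pyogenes",
]

matrix = confusion_matrix(actual, predicted, labels)

# Correctly classified cases lie on the diagonal (the colored tiles).
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)

for label, row in zip(labels, matrix):
    print(f"{label:>16}: {row}")
print(f"accuracy = {accuracy:.2f}")
```

In this toy example the diagonal entries hold the correctly classified samples, and the single off-diagonal entry in the second row is an S. pyogenes case predicted as non-S. pyogenes.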