Fig. 1: Development of a host-based machine learning classifier from cerebrospinal fluid RNA-seq data.

A Workflow for creating the machine learning classifier from 70 samples in the training cohort with a microbiologically proven diagnosis (by PCR or conventional testing). B The four most predictive genes of the 15-gene classifier for distinguishing TBM from OND. C Left: ROC curves for the MLC assay alone (blue dotted line) and for the combined mNGS and MLC assay (green solid line). If the MLC categorized a case as TBM but mNGS detected a non-TB pathogen, mNGS overruled the MLC result, increasing the specificity of the overall assay. Right: In silico prediction of results from shallow-depth sequencing (i.e., 100,000 reads for the RNA-seq library and 500,000 reads for the DNA-seq library). The pink dotted line is the MLC assay alone, and the red solid line is the combined assay with mNGS. The current predicted cost of shallow-depth sequencing is $75 per patient. CPM counts per million, GBP5 guanylate binding protein 5, FTL ferritin light chain, NFKBIA NF-kappa-B inhibitor alpha, SOD2 superoxide dismutase 2, TBM tuberculous meningitis, OND other neurological disease, MLC machine learning classifier, mNGS metagenomic next-generation sequencing, AUC area under the receiver operating characteristic curve. Source data are provided as a Source Data file.
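
The override rule described for panel C can be illustrated with a minimal sketch. This is not the authors' code: the function name `combined_call`, the label strings, and the `TB_LABEL` constant are hypothetical, and the caption states only the one rule shown here (a TBM call from the MLC is overruled when mNGS detects a non-TB pathogen). How other combinations were handled is not described, so in this sketch they simply keep the MLC call.

```python
from typing import Optional

# Hypothetical label for a TB detection by mNGS (an assumption for this sketch).
TB_LABEL = "Mycobacterium tuberculosis"


def combined_call(mlc_call: str, mngs_pathogen: Optional[str]) -> str:
    """Combine an MLC call ("TBM" or "OND") with an mNGS result.

    mngs_pathogen is the organism detected by mNGS, or None if nothing
    was detected. Only the rule stated in the caption is implemented:
    a non-TB pathogen found by mNGS overrules an MLC call of TBM.
    """
    non_tb_detected = mngs_pathogen is not None and mngs_pathogen != TB_LABEL
    if mlc_call == "TBM" and non_tb_detected:
        # mNGS overrules the classifier, which is what raises specificity.
        return "OND"
    # All other cases keep the MLC call (an assumption; not stated in the caption).
    return mlc_call
```

For example, `combined_call("TBM", "Cryptococcus neoformans")` yields `"OND"`, whereas `combined_call("TBM", None)` leaves the classifier's TBM call unchanged.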