Fig. 3: Identification of genetic features underlying TP. | Nature Communications


From: Genetically encoded transcriptional plasticity underlies stress adaptation in Mycobacterium tuberculosis


a Summary table of the 119 candidate genetic features. N denotes the number of features in each category. b Schematic of the machine-learning workflow. c The top 15 genetic features ranked by their median feature importance in predicting TP. A lower rank signifies higher feature importance for TP prediction, and a tight rank distribution indicates more consistent predictions across randomized sample splits and modeling iterations. The four genetic features that consistently rank low across random repeats are highlighted in green. Boxes show the median and the 1st and 3rd quartiles of the feature importance ranks across experiments (N = 100); vertical lines in the boxes mark the medians, and whiskers represent the median ± 1.5 × IQR (interquartile range). d A support vector machine (SVM) model constructed using only the top four features effectively predicts TP. The green line represents the linear fit between SVM-modeled and observed TP values, and the error band represents the 95% confidence interval. The black dashed line is the identity line y = x. Pearson’s correlation coefficients and the corresponding p values are shown.
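The rank summary described for panel c can be illustrated with a minimal sketch. It assumes that each repeat yields one importance score per feature, that rank 1 is the most important feature, and that ranks are then summarized by their median and quartiles across repeats. The caption does not say how importance was computed, so the feature names, the scores, and the helper functions `rank_features` and `summarize_ranks` below are hypothetical and are not the authors' pipeline.

```python
import random
import statistics


def rank_features(importances):
    """Map feature name -> rank, where rank 1 is the most important feature."""
    ordered = sorted(importances, key=importances.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def summarize_ranks(rank_runs):
    """Return the median, quartiles and IQR of each feature's rank across repeats."""
    summary = {}
    for name in rank_runs[0]:
        ranks = [run[name] for run in rank_runs]
        q1, med, q3 = statistics.quantiles(ranks, n=4, method="inclusive")
        summary[name] = {"median": med, "q1": q1, "q3": q3, "iqr": q3 - q1}
    return summary


# Hypothetical data: made-up feature names and noisy importance scores that
# stand in for 100 randomized sample splits and modeling iterations.
rng = random.Random(0)
base = {"feature_A": 0.9, "feature_B": 0.6, "feature_C": 0.3, "feature_D": 0.1}
runs = []
for _ in range(100):
    noisy = {name: score + rng.gauss(0, 0.05) for name, score in base.items()}
    runs.append(rank_features(noisy))

summary = summarize_ranks(runs)
# Print features from most to least important by median rank.
for name in sorted(summary, key=lambda n: summary[n]["median"]):
    print(name, summary[name])
```

A feature with a low median rank and a small IQR in this summary corresponds to what the caption calls consistently low-ranking, that is, important in nearly every repeat.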
