Fig. 5: Machine learning-driven discovery of key proteins for predicting the response to csDMARD treatment.

a Volcano plot of differentially expressed proteins (DEPs) between response and no response to MTX + LEF treatment (two-sided Student’s t-test, p < 0.05). Y = response, N = no response. Enrichment analysis of upregulated (b) and downregulated (c) proteins in response vs no response to MTX + LEF treatment (two-sided Fisher’s exact test, p < 0.05). d Volcano plot of DEPs between response and no response to MTX + HCQ treatment (two-sided Student’s t-test, p < 0.05). Y = response, N = no response. Enrichment analysis of upregulated (e) and downregulated (f) proteins in response vs no response to MTX + HCQ treatment (two-sided Fisher’s exact test, p < 0.05). LASSO regression analysis showing the contribution of DEPs to treatment response prediction in the MTX + LEF (g) and MTX + HCQ (h) groups. ROC curves illustrating the predictive performance of the LASSO model for response to MTX + LEF (i, top five proteins) and MTX + HCQ (j, top two proteins) in the training (left) and testing (right) sets, with 10-fold cross-validation repeated 100 times. k ROC curve showing model performance after integrating protein levels measured by ELISA. The confusion matrix displays sensitivity and specificity at the optimal cutoff for the MTX + LEF (left) and MTX + HCQ (right) groups. Source data are provided as a Source Data file.
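The sensitivity and specificity reported at an "optimal cutoff" in panel k can be illustrated with a short sketch. The caption does not state how the cutoff was chosen, so the code below assumes Youden's J statistic (sensitivity + specificity − 1), a common choice. The function names and the toy scores and labels are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def confusion_at_cutoff(
    scores: List[float], labels: List[int], cutoff: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a sample a responder if score >= cutoff."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sens_spec(
    scores: List[float], labels: List[int], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity at a given cutoff."""
    tp, fp, tn, fn = confusion_at_cutoff(scores, labels, cutoff)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def youden_cutoff(
    scores: List[float], labels: List[int]
) -> Tuple[float, float, float]:
    """Assumed rule: pick the cutoff maximizing J = sensitivity + specificity - 1.

    Ties are resolved in favor of the lowest cutoff.
    Returns (cutoff, sensitivity, specificity).
    """
    best_cutoff, best_j, best_sens, best_spec = None, None, 0.0, 0.0
    for cutoff in sorted(set(scores)):
        sensitivity, specificity = sens_spec(scores, labels, cutoff)
        j = sensitivity + specificity - 1.0
        if best_j is None or j > best_j:
            best_cutoff, best_j = cutoff, j
            best_sens, best_spec = sensitivity, specificity
    return best_cutoff, best_sens, best_spec


# Toy model scores (made up); label 1 = response (Y), 0 = no response (N).
toy_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, sensitivity, specificity = youden_cutoff(toy_scores, toy_labels)
print(cutoff, sensitivity, specificity)
```

In practice the scores would be the predicted probabilities from the fitted model, and the four counts from `confusion_at_cutoff` would fill the confusion matrix.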