Fig. 2: Feature selection and final model identification in the discovery cohort. | Nature Communications


From: A noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma


A Area under the receiver operating characteristic curve (AUC) values for feature subsets selected using the SHapley Additive exPlanations (SHAP) method, along with feature ranking scores. Source data are provided as a Source data file. B SHAP summary bar plot showing the relative importance of features in the random forest (RF) model. C Distribution of, and correlations among, the six selected features. D Receiver operating characteristic (ROC) curves showing the diagnostic performance of the RF model when the six selected features were used for primary vitreoretinal lymphoma (PVRL) detection. E Confusion matrix heatmap showing the classification performance of the six-feature RF model in diagnosing PVRL. DT decision tree, GLM generalized linear model, GBM gradient boosting, PDW platelet distribution width, PLCR platelet large cell ratio, HG hemoglobin, PLT platelet count, MPV mean platelet volume, HCT hematocrit, RDWSD red blood cell distribution width (standard deviation), RDWCV red blood cell distribution width (coefficient of variation), RBC red blood cell count, WBC white blood cell count, MCHC mean corpuscular hemoglobin concentration, PIV pan-immune inflammation value.
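For readers unfamiliar with the quantities in panels D and E, the following is a minimal sketch of how an AUC and a confusion matrix can be computed from a classifier's predicted probabilities. It is not the authors' pipeline: the function names `roc_auc` and `confusion_counts`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and no values from the study are used.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is equivalent to the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_counts(
    labels: List[int], scores: List[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) when scores >= threshold are called positive.

    The 0.5 default is an illustrative assumption, not a value from the paper.
    """
    tn = fp = fn = tp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Hypothetical toy data (1 = disease, 0 = control); not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("(tn, fp, fn, tp):", confusion_counts(toy_labels, toy_scores))
```

The pairwise definition is used here because it needs no external libraries and makes the meaning of AUC explicit: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. For large cohorts, a library routine would normally be preferred over this quadratic loop.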
