Fig. 2: Advantages of tree-based models for mortality prediction. | Communications Medicine

From: Interpretable machine learning prediction of all-cause mortality

a Gradient boosted tree models achieve a higher area under the ROC curve (AUROC) than both linear models and neural networks for seven of our prediction models. ***p-value < 0.001, **p-value < 0.01, and *p-value < 0.05. P-values highlighted in blue are computed by bootstrap resampling over the tested time points, measuring the difference in area between the curves, with n = 1000 independent resamples. b, c Tree-based models can capture non-linear relationships and important thresholds. b The main effect of uric acid on 5-year mortality. A higher SHAP value corresponds to a higher predicted mortality risk. c The main effect of urine albumin on 5-year mortality. d–g Tree-based models can measure feature interaction effects. d SHAP value for blood lead level in the 5-year mortality model. Each dot corresponds to an individual. The color corresponds to the value of a second feature (i.e., age) that has an interaction effect with blood lead. e We can use SHAP interaction values to remove the interaction effect of age from the model and obtain the SHAP value of blood lead without the age interaction on 5-year mortality. f Plotting just the interaction effect of blood lead with age shows how the effect of blood lead on mortality risk varies with age. g The SHAP interaction value of blood lead vs. gender in the 5-year mortality model.
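
The decomposition behind panels d–f can be illustrated with a minimal sketch. It computes exact Shapley values and Shapley interaction values by brute force for a tiny hypothetical risk function, then subtracts the age share from the blood lead attribution. Everything here is an assumption made for illustration: the function `toy_risk`, its coefficients, the all-zero baseline, and the feature order are invented and are not the model or data behind the figure. Subtracting only blood lead's share of the interaction is one possible reading of panel e.

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_risk(x):
    """Hypothetical risk score with a single lead-by-age interaction term."""
    lead, age, sex = x
    return 0.5 * lead + 0.2 * age + 0.1 * sex + 0.3 * lead * age


def value(model, x, baseline, subset):
    """Model output when only the features in `subset` take their values from x."""
    z = np.array(baseline, dtype=float)
    for k in subset:
        z[k] = x[k]
    return model(z)


def subsets(items):
    """Yield every subset of `items` as a tuple."""
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def shap_values(model, x, baseline):
    """Exact Shapley values by enumerating all feature subsets."""
    m = len(x)
    phi = np.zeros(m)
    for i in range(m):
        others = [k for k in range(m) if k != i]
        for s in subsets(others):
            w = factorial(len(s)) * factorial(m - len(s) - 1) / factorial(m)
            gain = value(model, x, baseline, s + (i,)) - value(model, x, baseline, s)
            phi[i] += w * gain
    return phi


def shap_interaction_values(model, x, baseline):
    """Exact Shapley interaction matrix; each row sums to that feature's SHAP value."""
    m = len(x)
    phi = shap_values(model, x, baseline)
    inter = np.zeros((m, m))
    for i, j in combinations(range(m), 2):
        others = [k for k in range(m) if k not in (i, j)]
        for s in subsets(others):
            w = factorial(len(s)) * factorial(m - len(s) - 2) / (2 * factorial(m - 1))
            delta = (
                value(model, x, baseline, s + (i, j))
                - value(model, x, baseline, s + (i,))
                - value(model, x, baseline, s + (j,))
                + value(model, x, baseline, s)
            )
            inter[i, j] += w * delta
        inter[j, i] = inter[i, j]  # the pairwise effect is split equally
    # Main effect (diagonal) = SHAP value minus all off-diagonal shares.
    np.fill_diagonal(inter, phi - inter.sum(axis=1))
    return phi, inter


LEAD, AGE, SEX = 0, 1, 2        # hypothetical feature order
x = [1.0, 1.0, 1.0]             # one illustrative individual
baseline = [0.0, 0.0, 0.0]      # assumed reference point

phi, inter = shap_interaction_values(toy_risk, x, baseline)
lead_without_age = phi[LEAD] - inter[LEAD, AGE]  # lead attribution minus its age share
print("SHAP value of lead:", phi[LEAD])
print("lead without age share:", lead_without_age)
print("total lead-age interaction:", 2 * inter[LEAD, AGE])
```

In this toy case the blood lead attribution with the age share removed equals the linear coefficient on lead, because age is the only feature that interacts with it. For real tree ensembles the enumeration above is infeasible, and tree-specific algorithms are used instead.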
