Table 2 Evaluation results (mean values and 95% confidence intervals in brackets).

From: The Framing of machine learning risk prediction models illustrated by evaluation of sepsis in general wards

| Model | AUPRC | AUROC | Brier *100 (y = 1) | Brier *100 (y = 0) | ACE (%) |
|---|---|---|---|---|---|
| Fixed time to onset | | | | | |
|  Extra trees classifier | 0.466 (0.435–0.496) | 0.906 (0.897–0.915) | 67.574 (65.277–69.870) | 0.322 (0.286–0.358) | 39.18 (35.76–42.60) |
|  Random forest classifier | 0.449 (0.407–0.491) | 0.860 (0.852–0.869) | 66.446 (64.476–68.417) | 0.402 (0.374–0.430) | 42.82 (38.12–47.52) |
|  Light gradient boosting machine | 0.317 (0.270–0.363) | 0.815 (0.802–0.829) | 74.290 (72.327–76.252) | 0.415 (0.381–0.449) | 45.35 (43.43–47.28) |
|  XGBoost | 0.292 (0.244–0.339) | 0.777 (0.750–0.804) | 78.062 (75.776–80.347) | 0.326 (0.293–0.358) | 43.57 (42.24–44.90) |
|  Logistic regression | 0.239 (0.212–0.266) | 0.752 (0.739–0.764) | 77.662 (76.415–78.909) | 0.466 (0.433–0.498) | 46.63 (45.50–47.76) |
| Sliding windows | | | | | |
|  Extra trees classifier | 0.006 (0.005–0.007) | 0.566 (0.539–0.593) | 98.102 (97.811–98.394) | 0.012 (0.011–0.013) | 66.60 (63.21–69.98) |
|  Random forest classifier | 0.007 (0.007–0.008) | 0.612 (0.604–0.621) | 97.381 (97.170–97.592) | 0.014 (0.013–0.015) | 73.74 (73.06–74.43) |
|  Light gradient boosting machine | 0.004 (0.004–0.005) | 0.624 (0.596–0.653) | 98.785 (98.569–98.998) | 0.015 (0.013–0.018) | 75.22 (75.00–75.44) |
|  XGBoost | 0.007 (0.006–0.008) | 0.756 (0.741–0.771) | 98.852 (98.628–99.077) | 0.003 (0.003–0.004) | 75.14 (73.95–76.34) |
|  Logistic regression | 0.004 (0.004–0.004) | 0.703 (0.688–0.717) | 99.484 (99.462–99.506) | 0.001 (0.001–0.001) | 79.02 (73.57–84.47) |
| Sliding windows w. D.I. | | | | | |
|  Extra trees classifier | 0.009 (0.007–0.010) | 0.593 (0.582–0.604) | 97.580 (97.116–98.045) | 0.016 (0.014–0.018) | 64.56 (59.48–69.63) |
|  Random forest classifier | 0.011 (0.009–0.013) | 0.638 (0.625–0.651) | 96.610 (96.006–97.213) | 0.020 (0.018–0.022) | 72.66 (71.22–74.10) |
|  Light gradient boosting machine | 0.007 (0.006–0.008) | 0.665 (0.628–0.703) | 98.278 (97.859–98.698) | 0.023 (0.021–0.025) | 74.89 (74.21–75.57) |
|  XGBoost | 0.011 (0.009–0.013) | 0.747 (0.740–0.755) | 98.370 (98.180–98.560) | 0.005 (0.004–0.005) | 72.77 (70.57–74.96) |
|  Logistic regression | 0.006 (0.005–0.007) | 0.684 (0.656–0.713) | 99.285 (99.206–99.364) | 0.001 (0.001–0.001) | 82.13 (76.74–87.52) |
| On clinical demand | | | | | |
|  Extra trees classifier | 0.147 (0.139–0.155) | 0.719 (0.704–0.733) | 89.654 (89.217–90.090) | 0.013 (0.012–0.014) | 41.60 (38.40–44.80) |
|  Random forest classifier | 0.192 (0.154–0.231) | 0.742 (0.717–0.766) | 86.881 (85.482–88.281) | 0.017 (0.016–0.017) | 42.90 (39.20–46.60) |
|  Light gradient boosting machine | 0.056 (0.040–0.072) | 0.774 (0.751–0.797) | 91.376 (89.983–92.769) | 0.030 (0.024–0.036) | 62.89 (58.76–67.03) |
|  XGBoost | 0.114 (0.081–0.148) | 0.779 (0.752–0.806) | 91.799 (90.565–93.034) | 0.009 (0.008–0.009) | 46.79 (41.90–51.68) |
|  Logistic regression | 0.014 (0.011–0.016) | 0.735 (0.720–0.749) | 98.370 (98.167–98.572) | 0.005 (0.005–0.006) | 75.44 (74.25–76.63) |

Best-scoring machine learning models across framing structures and evaluation metrics

| Framing structure | AUPRC | AUROC | Brier *100 (y = 1) | Brier *100 (y = 0) | ACE (%) |
|---|---|---|---|---|---|
| Fixed time to onset | Extra trees | Extra trees | Random forest | Extra trees | Extra trees |
| Sliding windows | Random forest | XGBoost | Random forest | Logistic regression | Extra trees |
| Sliding windows w. D.I. | Random forest | XGBoost | Random forest | Logistic regression | Extra trees |
| On clinical demand | Random forest | XGBoost | Random forest | Logistic regression | Extra trees |

  1. AUPRC: Area under the precision–recall curve; AUROC: Area under the receiver operating characteristic curve; Brier *100 (y = 1): Stratified Brier score for the positive class, multiplied by 100; Brier *100 (y = 0): Stratified Brier score for the negative class, multiplied by 100; ACE: Average calibration error; Sliding windows w. D.I.: Sliding windows with dynamic inclusion.
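For readers reproducing these columns, the two less common metrics are the class-stratified Brier score (the squared-error loss restricted to samples of one class, scaled by 100) and the average calibration error (the mean absolute gap between predicted probability and observed event rate across probability bins). The sketch below is illustrative only: the binning scheme (equal-width, 10 bins) and function names are assumptions, not the paper's exact implementation.

```python
import numpy as np

def stratified_brier_x100(y_true, y_prob, cls):
    """Brier score over samples whose true label equals `cls`, times 100.
    For cls=1 this penalizes low predicted risk on true positives;
    for cls=0, high predicted risk on true negatives."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    mask = y_true == cls
    return 100.0 * float(np.mean((y_prob[mask] - y_true[mask]) ** 2))

def average_calibration_error(y_true, y_prob, n_bins=10):
    """ACE (%): mean absolute difference between mean predicted probability
    and observed event rate, averaged over non-empty equal-width bins.
    (The paper's exact binning may differ; this is a common variant.)"""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    gaps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Last bin is closed on the right so that y_prob == 1.0 is included.
        in_bin = (y_prob >= lo) & ((y_prob < hi) if hi < 1.0 else (y_prob <= hi))
        if in_bin.any():
            gaps.append(abs(y_prob[in_bin].mean() - y_true[in_bin].mean()))
    return 100.0 * float(np.mean(gaps))
```

Note that a near-perfect stratified Brier score on one class (e.g. logistic regression's 0.001 for y = 0 under sliding windows) can coexist with a very poor score on the other class, which is exactly why the table reports the two strata separately.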