Table 2 Performance and 95% confidence intervals of the selected algorithms on the three prediction problems. BACC: balanced accuracy; NMSE: normalized mean squared error.
Single algorithm approach | ||
Diagnostic classification | BACC (%) | 95% confidence interval |
Best performance: Ensemble of trees with Bayesian optimization | 64.2 | [51.7, 76.7] |
Medium performance: Logistic regression | 63.8 | [50.7, 77.0] |
Worst performance: SVM with radial basis function kernel | 50.4 | [44.0, 56.8] |
Long-term treatment response (classification) | BACC (%) | 95% confidence interval |
Best performance: Logistic regression for high-dimensional data | 50.3 | [39.4, 61.2] |
Medium performance: Random forest | 49.7 | [44.7, 54.6] |
Worst performance: Linear SVM | 50.0 | [50.0, 50.0] |
Short-term treatment response (regression) | NMSE | 95% confidence interval |
Best performance: SVM with L1 regularization | 0.96 | [0.43, 1.49] |
Medium performance: Linear regression with L1 regularization | 0.96 | [0.42, 1.51] |
Worst performance: SVM with polynomial kernel | 14.86 | [0, 35.09] |
Ensemble approach | ||
Diagnostic classification | BACC (%) | 95% confidence interval |
Chosen settings based on simulated data results: maximum ensemble size = 4, training time = 180 s | 63.8 | [50.8, 76.7] |
Small maximum ensemble size (=1) and short training time (=20 s) | 56.8 | [48.1, 65.4] |
Large maximum ensemble size (=40) and long training time (=180 s) | 63.6 | [50.7, 76.5] |
Long-term treatment response (classification) | BACC (%) | 95% confidence interval |
Chosen settings based on simulated data results: maximum ensemble size = 1, training time = 60 s | 50.0 | [50.0, 50.0] |
Small maximum ensemble size (=1) and short training time (=20 s) | 50.0 | [50.0, 50.0] |
Large maximum ensemble size (=40) and long training time (=180 s) | 50.0 | [50.0, 50.0] |
Short-term treatment response (regression) | NMSE | 95% confidence interval |
Chosen settings based on simulated data results: maximum ensemble size = 40, training time = 180 s | 1.04 | [1.04, 1.04] |
Small maximum ensemble size (=1) and short training time (=20 s) | 1.06 | [1.06, 1.06] |
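
As a reading aid for the two metrics in the table, the following minimal sketch shows the reference values that a trivial predictor would produce. It assumes the common definitions, which the table itself does not spell out: BACC as the mean of the per-class recalls, and NMSE as the mean squared error divided by the variance of the true targets. The function names and toy data are hypothetical and are not taken from the study. Under these assumptions, a classifier that always predicts the same class scores a BACC of 50%, and a regressor that always predicts the target mean scores an NMSE of 1, so values near those levels would be consistent with chance-level prediction.

```python
from statistics import fmean, pvariance


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, in percent (assumed BACC definition)."""
    recalls = []
    for c in sorted(set(y_true)):
        # Predictions made for the samples whose true class is c
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_c) / len(preds_for_c))
    return 100.0 * fmean(recalls)


def nmse(y_true, y_pred):
    """MSE divided by the variance of the targets (assumed NMSE definition)."""
    mse = fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])
    return mse / pvariance(y_true)


# Hypothetical toy labels: a classifier that always predicts class 0
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
constant_prediction = [0] * len(labels)
bacc_constant = balanced_accuracy(labels, constant_prediction)

# Hypothetical toy targets: a regressor that always predicts the target mean
targets = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_prediction = [fmean(targets)] * len(targets)
nmse_mean = nmse(targets, mean_prediction)

print(f"BACC of constant classifier: {bacc_constant:.1f}%")
print(f"NMSE of mean predictor: {nmse_mean:.2f}")
```

If the study normalized the error differently (for example by the squared mean of the targets), the reference value of 1 for a mean predictor would no longer apply.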