Table 2 Performance evaluation of machine learning models.

Machine Learning Model	UZ Brussel Cohort (Training set)		AZ Delta Cohort (Test set)		Proportional overlap^a (Cumming method)
	AUC	(95% CI)	AUC	(95% CI)	
Multiple Logistic Regression (MLR)	0.903	(0.859–0.936)	0.826	(0.762–0.879)	0.21
Random Forest (RF)	0.958	(0.925–0.979)	0.803	(0.736–0.859)	-0.75
Support Vector Classifier (SVC) with Linear Kernel	0.911	(0.869–0.943)	0.819	(0.754–0.873)	0.04
Support Vector Classifier (SVC) with RBF Kernel	0.923	(0.883–0.953)	0.821	(0.756–0.875)	-0.08
K Nearest Neighbours (KNN)	0.971	(0.941–0.988)	0.818	(0.753–0.872)	-0.83
Gaussian Naive Bayes (GNB)	0.674	(0.612–0.732)	0.554	(0.477–0.629)	0.13
Extreme Gradient Boosting (XGBoost)	0.931	(0.892–0.959)	0.818	(0.752–0.872)	-0.21
Keras Neural Network (Keras NN)	0.920	(0.879–0.950)	0.816	(0.750–0.870)	-0.09
Ensemble Voting Classifier (EVC)	0.830	(0.778–0.874)	0.735	(0.663–0.799)	0.18

CI, Confidence Interval; AUC, Area Under the Curve; RBF, Radial Basis Function.
^a The proportion of overlap between the 95% CIs was calculated following the method described by Cumming et al. (2005)²⁶. An overlap < 0.50 typically indicates a statistically significant difference at α = 0.05 and is marked in bold.
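
The overlap column can be illustrated with a minimal Python sketch. It assumes the signed overlap between the two 95% CIs is divided by the mean of their full widths. That normalisation is not stated in the table; it is inferred because it reproduces the tabulated values, and it may differ from the exact definition in the cited method. The function name `proportion_overlap` is hypothetical.

```python
def proportion_overlap(ci_a, ci_b):
    """Signed overlap of two intervals divided by their mean full width.

    Assumption: normalising by the mean full CI width is inferred from the
    tabulated values, not taken from a stated formula.
    Negative values mean the intervals are separated by a gap.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
    mean_width = ((hi_a - lo_a) + (hi_b - lo_b)) / 2
    return overlap / mean_width


# (training-set CI, test-set CI) for three rows of Table 2
rows = {
    "MLR": ((0.859, 0.936), (0.762, 0.879)),
    "RF": ((0.925, 0.979), (0.736, 0.859)),
    "KNN": ((0.941, 0.988), (0.753, 0.872)),
}

for name, (train_ci, test_ci) in rows.items():
    print(f"{name}: {proportion_overlap(train_ci, test_ci):.2f}")
```

Under this assumption the three rows give 0.21, -0.75 and -0.83, matching the table.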
