Table 2 Performance evaluation of machine learning models.

From: AI-derived CT biomarker score for robust COVID-19 mortality prediction across multiple waves and regions using machine learning

| Machine Learning Model | UZ Brussel Cohort (Training set), AUC (95% CI) | AZ Delta Cohort (Test set), AUC (95% CI) | Proportional overlap^a (Cumming Method) |
|---|---|---|---|
| Multiple Logistic Regression (MLR) | 0.903 (0.859–0.936) | 0.826 (0.762–0.879) | 0.21 |
| Random Forest (RF) | 0.958 (0.925–0.979) | 0.803 (0.736–0.859) | -0.75 |
| Support Vector Classifier (SVC) with Linear Kernel | 0.911 (0.869–0.943) | 0.819 (0.754–0.873) | 0.04 |
| Support Vector Classifier (SVC) with RBF Kernel | 0.923 (0.883–0.953) | 0.821 (0.756–0.875) | -0.08 |
| K Nearest Neighbours (KNN) | 0.971 (0.941–0.988) | 0.818 (0.753–0.872) | -0.83 |
| Gaussian Naive Bayes (GNB) | 0.674 (0.612–0.732) | 0.554 (0.477–0.629) | 0.13 |
| Extreme Gradient Boosting (XGBoost) | 0.931 (0.892–0.959) | 0.818 (0.752–0.872) | -0.21 |
| Keras Neural Network (Keras NN) | 0.920 (0.879–0.950) | 0.816 (0.750–0.870) | -0.09 |
| Ensemble Voting Classifier (EVC) | 0.830 (0.778–0.874) | 0.735 (0.663–0.799) | 0.18 |

  1. CI, Confidence Interval; AUC, Area Under the Curve; RBF, Radial Basis Function.
  2. a The proportion of overlap between the 95% CIs was calculated following the method described by Cumming et al. (2005) [26]. An overlap < 0.50 typically indicates a statistically significant difference at α = 0.05, and is marked in bold.
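
The proportional overlap in the last column can be reproduced from the two confidence intervals in each row. The sketch below is a minimal illustration, not the authors' code: the function name `proportional_overlap` is hypothetical, and the normalization (signed overlap divided by the mean full width of the two CIs) is an assumption inferred from the tabulated numbers, which it matches to two decimals. Dividing by the mean half-width instead would give values twice as large.

```python
def proportional_overlap(ci_a, ci_b):
    """Signed overlap of two intervals divided by their mean full width.

    Assumption: this normalization is inferred from the values in Table 2;
    the source only cites the Cumming method without giving a formula.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    # Positive when the intervals overlap, negative when there is a gap.
    overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
    mean_width = ((hi_a - lo_a) + (hi_b - lo_b)) / 2
    return overlap / mean_width


# 95% CIs taken from Table 2: (training set), (test set)
mlr = proportional_overlap((0.859, 0.936), (0.762, 0.879))
rf = proportional_overlap((0.925, 0.979), (0.736, 0.859))
print(round(mlr, 2), round(rf, 2))  # → 0.21 -0.75
```

A negative value, as for RF, means the two intervals do not overlap at all, so the drop in AUC from training to test set is the most pronounced under this measure.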