Table 1 Comparison of model performance.

From: Interpretable machine learning for early neurological deterioration prediction in atrial fibrillation-related stroke

| Model | AUROC [95% CI] | AUPRC [95% CI] | Brier score | ACC (%) | Precision | Recall | F1 score | p value† |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **Baseline model** | | | | | | | | |
| Logistic regression | 0.696 [0.636–0.755] | 0.288 [0.207–0.368] | 0.110 | 86.5 | 0.253 | 0.585 | 0.353 | |
| **Machine learning models** | | | | | | | | |
| SVM | 0.722 [0.667–0.777] | 0.261 [0.168–0.356] | 0.112 | 86.2 | 0.254 | 0.695 | 0.373 | 0.182 |
| XGBoost | 0.759 [0.700–0.817] | 0.367 [0.260–0.466] | 0.105 | 86.5 | 0.349 | 0.537 | 0.423 | 0.024 |
| LightGBM | 0.772 [0.715–0.829] | 0.385 [0.273–0.497] | 0.103 | 86.7 | 0.328 | 0.695 | 0.445 | 0.003* |
| MLP | 0.768 [0.714–0.822] | 0.374 [0.265–0.482] | 0.103 | 86.9 | 0.432 | 0.463 | 0.447 | 0.002* |

*Significant difference at p < 0.005.
†Comparison with logistic regression on AUROC.
Abbreviations: AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; ACC, accuracy; SVM, support vector machine; XGBoost, extreme gradient boosting; LightGBM, light gradient boosting machine; MLP, multilayer perceptron.
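
As a quick consistency check on the table, the F1 column can be recomputed from the precision and recall columns. The sketch below assumes F1 is the standard harmonic mean of precision and recall; the source does not state the formula. The names `table1` and `f1_from_precision_recall` are hypothetical and used only for illustration.

```python
# Precision, recall, and reported F1 copied from Table 1.
table1 = {
    "Logistic regression": (0.253, 0.585, 0.353),
    "SVM": (0.254, 0.695, 0.373),
    "XGBoost": (0.349, 0.537, 0.423),
    "LightGBM": (0.328, 0.695, 0.445),
    "MLP": (0.432, 0.463, 0.447),
}


def f1_from_precision_recall(precision, recall):
    """Return the harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every model from the rounded table values.
recomputed = {
    model: f1_from_precision_recall(p, r) for model, (p, r, _) in table1.items()
}

for model, (p, r, reported) in table1.items():
    print(f"{model:20s} reported={reported:.3f} recomputed={recomputed[model]:.3f}")
```

Because the inputs are themselves rounded to three decimals, the recomputed values can differ from the reported ones in the last digit.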