Table 3 Comparison of models (n = 212).

From: Interpretable machine-learning risk prediction of unplanned extubation among cancer patients with peripherally inserted central catheters

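| Method group | Method | Accuracy | F1 | Recall | AUC |
| --- | --- | --- | --- | --- | --- |
| Three mainstream variable encoding methods | WOE | 0.967 | 0.851 | 0.870 | 0.994 |
| Three mainstream variable encoding methods | One-Hot | 0.958 | 0.816 | 0.870 | 0.990 |
| Three mainstream variable encoding methods | CE | 0.967 | 0.851 | 0.870 | 0.990 |
| Four data imbalance processing methods | NearMiss | 0.967 | 0.796 | 0.786 | 0.907 |
| Four data imbalance processing methods | ENN | 0.962 | 0.811 | 0.812 | 0.932 |
| Four data imbalance processing methods | SMOTE | 0.965 | 0.824 | 0.835 | 0.945 |
| Four data imbalance processing methods | FL | 0.967 | 0.851 | 0.870 | 0.994 |
| Four ML models | XGB | 0.967 | 0.851 | 0.870 | 0.994 |
| Four ML models | SVM | 0.972 | 0.856 | 0.820 | 0.952 |
| Four ML models | Random Forest | 0.962 | 0.810 | 0.780 | 0.993 |
| Four ML models | Logistic Regression | 0.953 | 0.770 | 0.740 | 0.972 |
| Four XGB models | XGB | 0.972 | 0.850 | 0.739 | 0.985 |
| Four XGB models | XGB + FC | 0.976 | 0.878 | 0.783 | 0.993 |
| Four XGB models | XGB + FL | 0.920 | 0.679 | 0.783 | 0.976 |
| Four XGB models | XGB + FL + FC | 0.967 | 0.851 | 0.870 | 0.994 |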

Note: FC, feature construction; AUC, area under the ROC curve; ENN, edited nearest neighbor; FL, focal loss; LR, logistic regression; ML, machine learning; ROC, receiver operating characteristic; RFC, random forest classifier; SVM, support vector machine; WOE, weight of evidence.
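
For readers who want to check how the four metrics reported in Table 3 relate to one another, the following is a minimal sketch of their standard definitions for a binary classifier. It is not the authors' evaluation code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the numbers it produces have no connection to the values in the table.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all cases that are classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positive-class cases that the model flags."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return 0.0
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print("accuracy", round(accuracy(y_true, y_pred), 3))
    print("f1", round(f1(y_true, y_pred), 3))
    print("recall", round(recall(y_true, y_pred), 3))
    print("auc", round(auc(y_true, scores), 3))
```

Accuracy, F1 and recall depend on the chosen threshold, whereas AUC is computed from the raw scores. This is one reason a row in the table can have a high AUC alongside a lower recall.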