Table 2 Performance of machine-learning models on the internal test set.

From: Opportunistic screening data for early prediction of GDM in Northern Chinese women: a multicenter machine learning study

| Model | Youden’s Index | AUC | Accuracy | Balanced Accuracy | Sensitivity | Specificity | F1 Score | Macro F1 | P (DeLong’s test)* |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost_GA | 0.2734 | 0.9622 | 0.9244 | 0.921 | 0.90 | 0.942 | 0.9091 | 0.9222 | Benchmark |
| XGBoost_FULL | 0.3759 | 0.8849 | 0.8403 | 0.832 | 0.78 | 0.8841 | 0.8041 | 0.8347 | <0.001 |
| XGBoost_RFE | 0.3620 | 0.8699 | 0.8403 | 0.8375 | 0.82 | 0.8551 | 0.8119 | 0.8366 | <0.001 |
| ANN_GA | 0.0930 | 0.8609 | 0.8403 | 0.843 | 0.86 | 0.8261 | 0.8190 | 0.8381 | <0.001 |
| SVM_GA | 0.3495 | 0.8310 | 0.7983 | 0.8041 | 0.84 | 0.7681 | 0.7778 | 0.7966 | <0.001 |
| SVM_FULL | 0.3485 | 0.8246 | 0.7983 | 0.8068 | 0.86 | 0.7536 | 0.7818 | 0.7972 | <0.001 |
| MLR_GA | 0.0970 | 0.8174 | 0.7899 | 0.7858 | 0.76 | 0.8116 | 0.7525 | 0.7850 | <0.001 |
| MLR_FULL | 0.0875 | 0.8130 | 0.8067 | 0.8058 | 0.80 | 0.8116 | 0.7767 | 0.8032 | <0.001 |
| MLR_RFE | 0.0790 | 0.8070 | 0.7983 | 0.8041 | 0.84 | 0.7681 | 0.7778 | 0.7966 | <0.001 |
| SVM_RFE | 0.3947 | 0.8029 | 0.7311 | 0.7351 | 0.76 | 0.7101 | 0.7037 | 0.7288 | <0.001 |
| RF_GA | 0.1271 | 0.8017 | 0.7395 | 0.7561 | 0.86 | 0.6522 | 0.7350 | 0.7394 | <0.001 |
| MLR_Step | 0.0776 | 0.7946 | 0.7899 | 0.7968 | 0.84 | 0.7536 | 0.7706 | 0.7884 | <0.001 |
| ANN_Step | 0.1246 | 0.7862 | 0.7815 | 0.7813 | 0.78 | 0.7826 | 0.7500 | 0.7780 | <0.001 |
| XGBoost_Step | 0.0867 | 0.7806 | 0.7563 | 0.7623 | 0.80 | 0.7246 | 0.7339 | 0.7546 | <0.001 |
| ANN_FULL | 0.0430 | 0.7758 | 0.7731 | 0.7823 | 0.84 | 0.7246 | 0.7568 | 0.7721 | <0.001 |
| RF_Step | 0.1053 | 0.7725 | 0.7563 | 0.7678 | 0.84 | 0.6957 | 0.7434 | 0.7557 | <0.001 |
| RF_FULL | 0.0551 | 0.7288 | 0.6639 | 0.6964 | 0.90 | 0.4928 | 0.6923 | 0.6610 | <0.001 |
| RF_RFE | 0.1807 | 0.7099 | 0.7059 | 0.6913 | 0.60 | 0.7826 | 0.6316 | 0.6934 | <0.001 |
| ANN_RFE | 0.1419 | 0.7001 | 0.6723 | 0.6816 | 0.74 | 0.6232 | 0.6549 | 0.6714 | <0.001 |
| SVM_Step | 0.6270 | 0.4152 | 0.6218 | 0.5528 | 0.12 | 0.9855 | 0.2105 | 0.4810 | <0.001 |

*Bonferroni-corrected P-values relative to XGBoost_GA.
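
The Balanced Accuracy column can be cross-checked against the Sensitivity and Specificity columns, since balanced accuracy for a binary classifier is the mean of the two. The snippet below is a minimal sketch of that check, not the authors' code: the helper name `balanced_accuracy` and the choice of rows are illustrative assumptions, and the input values are copied from Table 2.

```python
# Sensitivity and specificity copied from Table 2 for a few example rows.
# The row selection is arbitrary and only serves as an illustration.
rows = {
    "XGBoost_GA": (0.90, 0.942),
    "XGBoost_FULL": (0.78, 0.8841),
    "RF_FULL": (0.90, 0.4928),
    "SVM_Step": (0.12, 0.9855),
}


def balanced_accuracy(sensitivity, specificity):
    """Return the mean of sensitivity and specificity (binary case)."""
    return (sensitivity + specificity) / 2


# Compute balanced accuracy for each selected model.
computed = {
    name: balanced_accuracy(se, sp) for name, (se, sp) in rows.items()
}

for name, value in computed.items():
    print(f"{name}: {value:.4f}")
```

For these rows the computed values agree with the tabulated Balanced Accuracy to within rounding, which also makes the weakness of SVM_Step visible: its high specificity is offset by a sensitivity of only 0.12.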