Table 3 AUROC values for each model, by algorithm, on the training and testing data sets, and AUROC values for logistic regression after attribute selection on the training and testing data sets (italic data).

From: Machine learning to predict late respiratory support in preterm infants: a retrospective cohort study

| Algorithm | Data set | No BPD | No BPD or Gr1 BPD | Gr3 BPD or death before 36w PMA | Death before 36w PMA | Overall mortality |
|---|---|---|---|---|---|---|
| kNN | Training data set | 0.735 | 0.720 | 0.762 | 0.744 | 0.746 |
| kNN | Testing data set | 0.753 | 0.708 | 0.789 | 0.805 | 0.792 |
| Logistic regression | Training data set | 0.803 | 0.777 | 0.811 | 0.833 | 0.831 |
| Logistic regression | Testing data set | 0.812 | 0.769 | 0.854 | 0.884 | 0.884 |
| Naïve Bayes | Training data set | 0.783 | 0.757 | 0.789 | 0.819 | 0.817 |
| Naïve Bayes | Testing data set | 0.781 | 0.736 | 0.841 | 0.877 | 0.879 |
| Neural network | Training data set | 0.761 | 0.735 | 0.779 | 0.783 | 0.785 |
| Neural network | Testing data set | 0.766 | 0.721 | 0.786 | 0.814 | 0.818 |
| Random forest | Training data set | 0.765 | 0.747 | 0.780 | 0.784 | 0.780 |
| Random forest | Testing data set | 0.765 | 0.733 | 0.819 | 0.857 | 0.845 |
| SVM | Training data set | 0.645 | 0.647 | 0.631 | 0.659 | 0.639 |
| SVM | Testing data set | 0.664 | 0.623 | 0.670 | 0.804 | 0.708 |
| Classification tree | Training data set | 0.632 | 0.645 | 0.583 | 0.587 | 0.560 |
| Classification tree | Testing data set | 0.682 | 0.642 | 0.646 | 0.608 | 0.702 |
| *After attribute selection* | | | | | | |
| *Logistic regression* | *Training data set* | *0.802* | *0.776* | *0.811* | *0.835* | *0.833* |
| *Logistic regression* | *Testing data set* | *0.801* | *0.763* | *0.850* | *0.881* | *0.881* |

AUROC: area under the receiver operating characteristic curve; BPD: bronchopulmonary dysplasia; kNN: k-nearest neighbors; Gr: grade; PMA: postmenstrual age; SVM: support vector machine.
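
For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC value can be computed from predicted scores and binary outcomes, using the standard pairwise (Mann-Whitney) definition. It is not the study's code. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = outcome occurred, 0 = it did not.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
print(round(toy_auroc, 3))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. Comparing the value obtained on the training set with the value on the held-out testing set, as the table does, indicates how well a model generalizes.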