Table 5 Comparison of additional evaluation metrics (PPV, NPV, F1 score).

From: Development and validation of a machine learning model for critical progression risk in pediatric severe community-acquired pneumonia

| Model | PPV (95% CI) | NPV (95% CI) | F1 score (95% CI) |
|---|---|---|---|
| LR | 0.81 (0.71, 0.91) | 1 | 0.90 (0.83, 0.96) |
| DT | 0.91 (0.80, 0.98) | 0.84 (0.60, 0.95) | 0.92 (0.84, 0.97) |
| RF | 0.87 (0.74, 0.95) | 0.82 (0.54, 0.95) | 0.90 (0.81, 0.95) |
| XGBoost | 0.88 (0.77, 0.96) | 0.93 (0.64, 1.00) | 0.92 (0.85, 0.97) |
| NB | 0.84 (0.71, 0.92) | 0.86 (0.56, 1.00) | 0.89 (0.80, 0.94) |
| KNN | 0.81 (0.69, 0.91) | 0.69 (0.40, 0.87) | 0.84 (0.76, 0.92) |
| SVM | 0.70 (0.59, 0.81) | 0.67 (0.00, 1.00) | 0.82 (0.73, 0.89) |

  1.  Tables 4 and 5 compared LR, DT, RF, XGBoost, NB, KNN, and SVM on the cSCAP prediction task using AUC, accuracy, sensitivity, specificity, PPV, NPV, and F1, all with 95% CIs. XGBoost achieved the highest AUC (0.98; 95% CI 0.93–1.00) and performed among the top models across most metrics.
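
For readers who want to see how metrics of this kind are typically obtained, the following is a minimal sketch rather than the authors' procedure. It computes PPV, NPV, and F1 from binary labels and attaches percentile bootstrap intervals. The source does not state how its 95% CIs were derived, so the bootstrap is an assumption here. The labels and predictions are hypothetical toy data, not taken from the study, and the function names `ppv_npv_f1` and `bootstrap_ci` are illustrative.

```python
import random


def ppv_npv_f1(y_true, y_pred):
    """Return (PPV, NPV, F1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    nan = float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else nan  # precision
    npv = tn / (tn + fn) if (tn + fn) else nan
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan
    return ppv, npv, f1


def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CIs for (PPV, NPV, F1); an assumed method."""
    rng = random.Random(seed)
    n = len(y_true)
    samples = ([], [], [])
    for _ in range(n_boot):
        # Resample cases with replacement, keeping label/prediction pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        vals = ppv_npv_f1([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for store, v in zip(samples, vals):
            if v == v:  # drop NaN from resamples with an empty denominator
                store.append(v)
    cis = []
    for store in samples:
        store.sort()
        lo = store[int((alpha / 2) * (len(store) - 1))]
        hi = store[int((1 - alpha / 2) * (len(store) - 1))]
        cis.append((lo, hi))
    return cis


# Hypothetical toy data: 40 positives (36 detected), 10 negatives (8 detected).
y_true = [1] * 40 + [0] * 10
y_pred = [1] * 36 + [0] * 4 + [0] * 8 + [1] * 2

ppv, npv, f1 = ppv_npv_f1(y_true, y_pred)
cis = bootstrap_ci(y_true, y_pred)
for name, value, (lo, hi) in zip(("PPV", "NPV", "F1"), (ppv, npv, f1), cis):
    print(f"{name}: {value:.2f} ({lo:.2f}, {hi:.2f})")
```

In this toy setup NPV rests on only 12 negative predictions, so its interval will generally be wider than the PPV or F1 intervals. This is a general property of proportions estimated from small denominators.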