Table 2 Verification results for the six models in the test set

From: Performance of machine learning-based models to screen obstructive sleep apnea in pregnancy

 

| Method | Accuracy | AUC (CI) | Sensitivity (CI) | Specificity (CI) | Precision (CI) | F1 |
| --- | --- | --- | --- | --- | --- | --- |
| Berlin questionnaire | 0.687 | 0.684 (0.579–0.788) | 0.554 (0.423–0.684) | 0.814 (0.714–0.913) | 0.738 (0.605–0.871) | 0.6327 |
| Epworth questionnaire | 0.548 | 0.542 (0.423–0.661) | 0.321 (0.199–0.444) | 0.763 (0.654–0.871) | 0.563 (0.391–0.734) | 0.4091 |
| STOP questionnaire | 0.617 | 0.613 (0.501–0.725) | 0.446 (0.316–0.577) | 0.780 (0.674–0.885) | 0.658 (0.507–0.809) | 0.5319 |
| STOP-Bang questionnaire | 0.643 | 0.635 (0.490–0.780) | 0.303 (0.183–0.424) | 0.966 (0.920–1.000) | 0.895 (0.757–1.000) | 0.4533 |
| Improved model for BQ (a) | 0.739 | 0.808 (0.730–0.886) | 0.803 (0.700–0.908) | 0.678 (0.559–0.797) | 0.703 (0.591–0.815) | 0.7500 |
| Improved model for ESS (b) | 0.713 | 0.785 (0.703–0.867) | 0.803 (0.700–0.908) | 0.627 (0.504–0.750) | 0.672 (0.559–0.784) | 0.7317 |
| Improved model for SQ (c) | 0.739 | 0.818 (0.742–0.894) | 0.821 (0.721–0.922) | 0.661 (0.540–0.782) | 0.697 (0.586–0.808) | 0.7541 |
| Improved model for SBQ (d) | 0.730 | 0.813 (0.737–0.890) | 0.821 (0.721–0.921) | 0.644 (0.522–0.766) | 0.687 (0.576–0.798) | 0.7480 |
| Integrated model (e) | 0.739 | 0.795 (0.713–0.876) | 0.786 (0.678–0.893) | 0.695 (0.577–0.812) | 0.710 (0.597–0.823) | 0.7458 |
| Momosa (f) | 0.739 | 0.823 (0.748–0.898) | 0.821 (0.721–0.922) | 0.661 (0.540–0.781) | 0.697 (0.586–0.808) | 0.7541 |
| Random Forest | 0.756 | 0.830 (0.755–0.904) | 0.786 (0.678–0.893) | 0.729 (0.615–0.842) | 0.733 (0.621–0.845) | 0.7586 |
| XG-Boost | 0.722 | 0.816 (0.740–0.893) | 0.804 (0.699–0.908) | 0.644 (0.522–0.766) | 0.682 (0.569–0.794) | 0.7377 |

  1. Machine learning models based on questions from (a) BQ, (b) ESS, (c) SQ, (d) SBQ, (e) all questions from the four questionnaires, (f) three questions selected from the four questionnaires.
  2. BQ Berlin questionnaire, ESS Epworth sleepiness scale, SQ STOP questionnaire, SBQ STOP-Bang questionnaire, AUC area under the curve, CI confidence interval.
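
The F1 column can be cross-checked against the Sensitivity and Precision columns, because F1 is conventionally defined as the harmonic mean of precision and recall (sensitivity). The sketch below assumes that standard definition; the table does not state how F1 was computed, and the function name `f1_from_precision_recall` and the dictionary `rows` are illustrative names, not taken from the source. Because the tabulated sensitivity and precision are rounded to three decimals, small differences in the last digit are to be expected.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (sensitivity, precision, reported F1) copied from three rows of Table 2.
# The selection of rows is arbitrary and for illustration only.
rows = {
    "Berlin questionnaire": (0.554, 0.738, 0.6327),
    "Momosa (f)": (0.821, 0.697, 0.7541),
    "Random Forest": (0.786, 0.733, 0.7586),
}

# Recompute F1 from the rounded table values and compare with the reported value.
recomputed = {
    name: f1_from_precision_recall(prec, sens)
    for name, (sens, prec, _reported) in rows.items()
}

for name, (_sens, _prec, reported) in rows.items():
    diff = abs(recomputed[name] - reported)
    print(f"{name}: recomputed {recomputed[name]:.4f}, "
          f"reported {reported:.4f}, |diff| {diff:.4f}")
```

For these three rows the recomputed values should agree with the reported F1 to within rounding of the inputs; a larger gap in any row would suggest a transcription error in one of the three columns.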