Table 1 Performance of pediatric emergency prediction using natural language processing techniques and topic models.

Variable	Logistic regression	XGBoost	Gradient boosting	Random forest	KM-BERT	KM-BERT with MLM^*
AUROC	0.698 ± 0.002	0.714 ± 0.002	0.715 ± 0.002	0.680 ± 0.002	0.788 ± 0.002	0.839 ± 0.001
AUPRC	0.752 ± 0.002	0.776 ± 0.001	0.778 ± 0.001	0.735 ± 0.002	0.837 ± 0.002	0.879 ± 0.001
Recall	0.629 ± 0.003	0.618 ± 0.002	0.626 ± 0.003	0.625 ± 0.002	0.719 ± 0.002	0.724 ± 0.002
Precision	0.728 ± 0.002	0.741 ± 0.001	0.737 ± 0.001	0.716 ± 0.002	0.775 ± 0.002	0.829 ± 0.001
F1-score	0.675 ± 0.002	0.674 ± 0.002	0.677 ± 0.001	0.667 ± 0.001	0.746 ± 0.002	0.773 ± 0.001
Accuracy	0.643 ± 0.002	0.648 ± 0.002	0.649 ± 0.001	0.633 ± 0.002	0.712 ± 0.002	0.749 ± 0.001
Brier	0.215 ± 0.000	0.209 ± 0.000	0.209 ± 0.000	0.225 ± 0.001	0.188 ± 0.001	0.164 ± 0.001

KM-BERT, Korean medical bidirectional encoder representations from transformers; MLM, masked language modeling; AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve.
Bold values indicate the best-performing model across all metrics. ^*Indicates the overall best-performing model.
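
As a reading aid, the F1-score row can be cross-checked against the precision and recall rows, since F1 is the harmonic mean of the two. The sketch below is not from the original study. The helper `f1_score` and the dictionary `table1` are hypothetical names, and the check assumes that the F1 of the mean precision and recall approximates the reported mean F1 (the ± terms are ignored).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (mean precision, mean recall, reported mean F1), copied from Table 1.
table1 = {
    "Logistic regression": (0.728, 0.629, 0.675),
    "XGBoost": (0.741, 0.618, 0.674),
    "Gradient boosting": (0.737, 0.626, 0.677),
    "Random forest": (0.716, 0.625, 0.667),
    "KM-BERT": (0.775, 0.719, 0.746),
    "KM-BERT with MLM": (0.829, 0.724, 0.773),
}

# Absolute gap between the recomputed F1 and the reported F1, per model.
f1_gap = {
    model: abs(f1_score(p, r) - reported)
    for model, (p, r, reported) in table1.items()
}

for model, gap in f1_gap.items():
    print(f"{model}: |recomputed - reported| = {gap:.4f}")
```

For every model the recomputed value differs from the reported F1-score by less than 0.001, so the three rows are mutually consistent at the table's rounding precision.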
