npj Digital Medicine

Table 3 Performance Comparison of RF and BERT + LSTM Models for the ITU Test Set

From: AI assisted prediction of unplanned intensive care admissions using natural language processing in elective neurosurgery

	RFs				BERT + LSTM
	Prec ↑ (mean, CI)	Recall ↑ (mean, CI)	F1 ↑ (mean, CI)	FN ↓ (mean, CI)	Prec ↑ (mean, CI)	Recall ↑ (mean, CI)	F1 ↑ (mean, CI)	FN ↓ (mean, CI)
Ward	0.88 (0.84–0.92)	1.00 (1.00–1.00)	0.94 (0.91–0.96)	-	0.84 (0.80–0.89)	1.00 (1.00–1.00)	0.92 (0.89–0.94)	-
ITU	1.00 (1.00–1.00)	0.87 (0.82–0.91)	0.93 (0.90–0.95)	0.13 (0.09–0.18)	1.00 (1.00–1.00)	0.82 (0.77–0.87)	0.90 (0.87–0.93)	0.18 (0.13–0.23)
Planned	1.00 (1.00–1.00)	0.85 (0.80–0.91)	0.92 (0.88–0.95)	0.15 (0.09–0.21)	1.00 (1.00–1.00)	0.81 (0.74–0.87)	0.89 (0.85–0.93)	0.19 (0.13–0.26)
Unplanned	1.00 (1.00–1.00)	0.89 (0.82–0.96)	0.94 (0.90–0.98)	0.11 (0.04–0.18)	1.00 (1.00–1.00)	0.86 (0.77–0.93)	0.92 (0.87–0.96)	0.14 (0.07–0.23)

Results are averaged over 500 runs for the RF model and 5 runs for the BERT + LSTM model. We report precision, recall, F1-score, and false negative (FN) ratio for each patient group, with the ITU group further broken down into planned and unplanned admissions. Means and 95% confidence intervals (CI) were estimated by bootstrapping: the test examples were resampled with replacement 1000 times.
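The bootstrap procedure described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `prf_fn` and `bootstrap_ci`, the percentile method for the interval, the fixed seed, and the synthetic labels are all assumptions. The FN ratio is taken here to be FN / (TP + FN), i.e. 1 − recall, which is consistent with the values in the table. The sketch covers a single run and does not reproduce the averaging over multiple model runs.

```python
import random
from statistics import mean


def prf_fn(y_true, y_pred, positive=1):
    """Precision, recall, F1 and FN ratio (FN / actual positives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fn_ratio = fn / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fn_ratio": fn_ratio}


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Return {metric: (mean, lower, upper)} from a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(y_true)
    samples = {k: [] for k in ("precision", "recall", "f1", "fn_ratio")}
    for _ in range(n_resamples):
        # Resample test examples with replacement (same size as the test set).
        idx = [rng.randrange(n) for _ in range(n)]
        scores = prf_fn([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for k, v in scores.items():
            samples[k].append(v)
    out = {}
    for k, vals in samples.items():
        vals.sort()
        lower = vals[int((alpha / 2) * n_resamples)]
        upper = vals[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
        out[k] = (mean(vals), lower, upper)
    return out


# Synthetic labels for illustration only (NOT the study's data):
# 100 ward patients (0), all predicted correctly, and
# 100 ITU patients (1), of whom 87 are predicted correctly.
y_true = [0] * 100 + [1] * 100
y_pred = [0] * 100 + [1] * 87 + [0] * 13

result = bootstrap_ci(y_true, y_pred)
for name, (m, lower, upper) in result.items():
    print(f"{name}: {m:.2f} ({lower:.2f}-{upper:.2f})")
```

Because the synthetic predictions contain no false positives for the ITU class, precision stays at 1.00 in every resample, which mirrors how a degenerate interval such as 1.00 (1.00–1.00) can arise.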
