Table 3 Performance Comparison of RF and BERT + LSTM Models for the ITU Test Set
| | RF Prec ↑ (mean, CI) | RF Recall ↑ (mean, CI) | RF F1 ↑ (mean, CI) | RF FN ↓ (mean, CI) | BERT + LSTM Prec ↑ (mean, CI) | BERT + LSTM Recall ↑ (mean, CI) | BERT + LSTM F1 ↑ (mean, CI) | BERT + LSTM FN ↓ (mean, CI) |
|---|---|---|---|---|---|---|---|---|
| Ward | 0.88 (0.84–0.92) | 1.00 (1.00–1.00) | 0.94 (0.91–0.96) | - | 0.84 (0.80–0.89) | 1.00 (1.00–1.00) | 0.92 (0.89–0.94) | - |
| ITU | 1.00 (1.00–1.00) | 0.87 (0.82–0.91) | 0.93 (0.90–0.95) | 0.13 (0.09–0.18) | 1.00 (1.00–1.00) | 0.82 (0.77–0.87) | 0.90 (0.87–0.93) | 0.18 (0.13–0.23) |
| Planned | 1.00 (1.00–1.00) | 0.85 (0.80–0.91) | 0.92 (0.88–0.95) | 0.15 (0.09–0.21) | 1.00 (1.00–1.00) | 0.81 (0.74–0.87) | 0.89 (0.85–0.93) | 0.19 (0.13–0.26) |
| Unplanned | 1.00 (1.00–1.00) | 0.89 (0.82–0.96) | 0.94 (0.90–0.98) | 0.11 (0.04–0.18) | 1.00 (1.00–1.00) | 0.86 (0.77–0.93) | 0.92 (0.87–0.96) | 0.14 (0.07–0.23) |
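
The columns in Table 3 are related to one another, and a short sketch can make that relationship explicit. The snippet below assumes that F1 is the usual harmonic mean of precision and recall and that "FN" denotes the false negative rate, i.e. 1 − recall; the table itself does not state these definitions, and the helper names `f1_from` and `false_negative_rate` are hypothetical. It is a consistency check on the reported point estimates, not the authors' evaluation code.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are 0
    return 2 * precision * recall / (precision + recall)


def false_negative_rate(recall: float) -> float:
    """False negative rate, assuming FN in the table means 1 - recall."""
    return 1.0 - recall


# RF model, ITU row of Table 3: precision 1.00, recall 0.87
print(f"{f1_from(1.00, 0.87):.2f}")          # 0.93, matching the reported F1
print(f"{false_negative_rate(0.87):.2f}")    # 0.13, matching the reported FN
```

Not every row reproduces exactly under this check. For the BERT + LSTM Ward row, for example, the formula gives about 0.91 against a reported 0.92. One possible reason is that each column is reported as a mean, and the mean of per-sample F1 values need not equal the F1 computed from the mean precision and mean recall.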