Table 1. Results of our violence prediction evaluation, comparing humans trained in psychiatry with our baseline regression and fine-tuned Clinical-Longformer models (P = precision, R = recall, F1 = F1 score).

| Set | Annotator/model | # Docs. | P | R | F1 |
|---|---|---|---|---|---|
| Train + test | Annotator 1† | 276 | 0.62 | 0.53 | 0.57 |
|  | Annotator 2‡ | 141 | 0.58 | 0.44 | 0.50 |
|  | Annotator 3‡ | 191 | 0.60 | 0.40 | 0.48 |
|  | Annotator 4§ | 273 | 0.57 | 0.37 | 0.45 |
|  | Annotator 5† | 104 | 0.82 | 0.31 | 0.45 |
|  | Annotator 6‡ | 156 | 0.68 | 0.28 | 0.40 |
|  | Combined and reconciled | 560 | 0.62 | 0.41 | 0.50 |
| Test only | Baseline regression (favor R) | 112 | 0.72 | 0.72 | 0.72 |
|  | Baseline regression (favor P) | 112 | 0.73 | 0.71 | 0.72 |
|  | Clinical-Longformer (favor R) | 112 | 0.75 | 0.75 | 0.75 |
|  | Clinical-Longformer (favor P) | 112 | 0.78 | 0.71 | 0.75 |
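
As a reading aid for the P, R, and F1 columns, the following minimal sketch recomputes F1 from precision and recall. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the `rows` dictionary are illustrative and not taken from the evaluation code behind the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded P and R values copied from two rows of Table 1 (illustrative subset).
rows = {
    "Annotator 1": (0.62, 0.53),
    "Baseline regression (favor P)": (0.73, 0.71),
}

for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}")
```

Because the table reports P and R rounded to two decimals, an F1 recomputed this way can differ from the reported value in the last digit for some rows.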