Table 1 Results of our violence prediction evaluation comparing humans trained in psychiatry to our baseline regression and fine-tuned Clinical-Longformer model

Set	Annotator/model	# Docs.	P	R	F1
Train + test	Annotator 1^†	276	0.62	0.53	0.57
	Annotator 2^‡	141	0.58	0.44	0.50
	Annotator 3^‡	191	0.60	0.40	0.48
	Annotator 4^§	273	0.57	0.37	0.45
	Annotator 5^†	104	0.82	0.31	0.45
	Annotator 6^‡	156	0.68	0.28	0.40
	Combined and reconciled	560	0.62	0.41	0.50
Test only	Baseline regression (favor R)	112	0.72	0.72	0.72
	Baseline regression (favor P)	112	0.73	0.71	0.72
	Clinical-Longformer (favor R)	112	0.75	0.75	0.75
	Clinical-Longformer (favor P)	112	0.78	0.71	0.75

Human annotators worked in pairs, so each annotated document is counted twice across annotators but only once per annotator. Because annotator availability varied, some annotators annotated more documents than others. For comparison, we considered the Combined and reconciled human annotation, the Baseline regression, and the Clinical-Longformer model; the highest precision, recall, and F1 among these are emphasized in bold.
^†Attending psychiatrists.
^‡Psychiatry residents.
^§Medical student entering psychiatry residency.
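
As a reading aid for the P, R, and F1 columns, the following minimal sketch recomputes F1 from the reported precision and recall. It assumes F1 is the standard harmonic mean of P and R; the table itself does not state how the scores were computed or averaged, and the helper name `f1_score` is hypothetical. Because the table reports P and R rounded to two decimals, a recomputed F1 can differ from the reported one by about 0.01.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# (label, P, R, reported F1) copied from the compared rows of Table 1.
rows = [
    ("Combined and reconciled", 0.62, 0.41, 0.50),
    ("Baseline regression (favor R)", 0.72, 0.72, 0.72),
    ("Baseline regression (favor P)", 0.73, 0.71, 0.72),
    ("Clinical-Longformer (favor R)", 0.75, 0.75, 0.75),
    ("Clinical-Longformer (favor P)", 0.78, 0.71, 0.75),
]

for label, p, r, reported in rows:
    recomputed = f1_score(p, r)
    print(f"{label}: recomputed F1 = {recomputed:.3f}, reported = {reported:.2f}")
```

Small gaps between recomputed and reported values are expected from rounding of the inputs and do not by themselves indicate an error in the table.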
