Table 1 Results of our violence prediction evaluation comparing humans trained in psychiatry to our baseline regression and fine-tuned Clinical-Longformer model

From: Deep learning models can predict violence and threats against healthcare providers using clinical notes

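| Set | Annotator/model | # Docs. | P | R | F1 |
|---|---|---|---|---|---|
| Train + test | Annotator 1† | 276 | 0.62 | 0.53 | 0.57 |
| | Annotator 2‡ | 141 | 0.58 | 0.44 | 0.5 |
| | Annotator 3‡ | 191 | 0.6 | 0.4 | 0.48 |
| | Annotator 4§ | 273 | 0.57 | 0.37 | 0.45 |
| | Annotator 5† | 104 | 0.82 | 0.31 | 0.45 |
| | Annotator 6‡ | 156 | 0.68 | 0.28 | 0.4 |
| | Combined and reconciled | 560 | 0.62 | 0.41 | 0.5 |
| Test only | Baseline regression (favor R) | 112 | 0.72 | 0.72 | 0.72 |
| | Baseline regression (favor P) | 112 | 0.73 | 0.71 | 0.72 |
| | Clinical-Longformer (favor R) | 112 | 0.75 | 0.75 | 0.75 |
| | Clinical-Longformer (favor P) | 112 | 0.78 | 0.71 | 0.75 |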

  1. The human annotators worked in pairs, so each annotated document is counted twice across the annotator rows, but only once per annotator. Because availability for annotation differed, some annotators annotated more documents than others. We compared the Combined and reconciled human annotation, the Baseline regression, and the Clinical-Longformer model, with the highest precision (P), recall (R), and F1 emphasized in bold.
  2. †The annotators include attending psychiatrists.
  3. ‡The annotators include psychiatry residents.
  4. §A medical student entering psychiatry residency.
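
The F1 column can be sanity-checked against the P and R columns. The sketch below assumes F1 is the standard harmonic mean of precision and recall; the table does not state the exact formula or averaging scheme used, so this is an illustration rather than the authors' evaluation code. The names `f1_score`, `ROWS`, and `max_discrepancy` are hypothetical. Because P and R are reported to two decimals, a recomputed F1 can differ from the reported one by roughly 0.01.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, P, R, reported F1), copied from Table 1.
ROWS = [
    ("Annotator 1", 0.62, 0.53, 0.57),
    ("Annotator 2", 0.58, 0.44, 0.50),
    ("Annotator 3", 0.60, 0.40, 0.48),
    ("Annotator 4", 0.57, 0.37, 0.45),
    ("Annotator 5", 0.82, 0.31, 0.45),
    ("Annotator 6", 0.68, 0.28, 0.40),
    ("Combined and reconciled", 0.62, 0.41, 0.50),
    ("Baseline regression (favor R)", 0.72, 0.72, 0.72),
    ("Baseline regression (favor P)", 0.73, 0.71, 0.72),
    ("Clinical-Longformer (favor R)", 0.75, 0.75, 0.75),
    ("Clinical-Longformer (favor P)", 0.78, 0.71, 0.75),
]


def max_discrepancy(rows) -> float:
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - reported) for _, p, r, reported in rows)


if __name__ == "__main__":
    for label, p, r, reported in ROWS:
        print(f"{label:32s} recomputed={f1_score(p, r):.3f} reported={reported:.2f}")
```

Any gap well above 0.01 would suggest either a transcription error in the row or a different F1 definition than the one assumed here.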