Table 7 Breakdown of entire dataset

From: Detecting stigmatizing language in clinical notes with large language models for addiction care

| Total Notes | Notes Containing Stigmatizing Language | Notes Containing Potentially Stigmatizing Terms but Not Used in Stigmatizing Context | Avg Token Length |
| --- | --- | --- | --- |
| 77,104 | 38,552 | 4,089 | 1,250.10 |

Label and average token length statistics for post-processed and selected MIMIC-III [17] data.
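
The counts in the table can be turned into proportions with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the token total assumes that the reported average token length is taken over all 77,104 notes, which the table does not state explicitly.

```python
# Values copied from Table 7; variable names are illustrative, not from the paper.
total_notes = 77_104
stigmatizing_notes = 38_552
non_stigmatizing_context_notes = 4_089  # potentially stigmatizing terms, non-stigmatizing use
avg_token_length = 1250.10


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


stigmatizing_pct = share(stigmatizing_notes, total_notes)
context_pct = share(non_stigmatizing_context_notes, total_notes)

# Assumption: the average is computed over all notes, so this is only a rough total.
approx_total_tokens = total_notes * avg_token_length

print(f"Stigmatizing language: {stigmatizing_pct:.2f}% of notes")
print(f"Terms in non-stigmatizing context: {context_pct:.2f}% of notes")
print(f"Approximate total tokens: {approx_total_tokens:,.0f}")
```

Under these assumptions, exactly half of the notes (50.00%) contain stigmatizing language, and about 5.30% contain potentially stigmatizing terms used in a non-stigmatizing context.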