Table 8 Breakdown of labels for each dataset split
From: Detecting stigmatizing language in clinical notes with large language models for addiction care
| Label | Train | Validation | Test | External Validation Full | External Validation Balanced |
|---|---|---|---|---|---|
| 0 | 28,302 | 6,617 | 6,270 | 286,058 | 2,072 |
| 1 | 25,665 | 4,949 | 5,316 | 2,072 | 2,072 |
| Total | 53,967 | 11,566 | 11,586 | 288,130 | 4,144 |
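
The totals and class balance in the table can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the `counts` dictionary and the `summarize` helper are hypothetical names introduced here, and the numbers are copied directly from the table.

```python
# Label counts copied from Table 8; keys 0 and 1 are the table's label values.
# The dictionary layout and helper name are illustrative assumptions.
counts = {
    "Train": {0: 28302, 1: 25665},
    "Validation": {0: 6617, 1: 4949},
    "Test": {0: 6270, 1: 5316},
    "External Validation Full": {0: 286058, 1: 2072},
    "External Validation Balanced": {0: 2072, 1: 2072},
}


def summarize(split_counts):
    """Return (total examples, fraction of examples with label 1) for one split."""
    total = split_counts[0] + split_counts[1]
    return total, split_counts[1] / total


summary = {name: summarize(c) for name, c in counts.items()}

for name, (total, share) in summary.items():
    print(f"{name}: total={total:,}, label-1 share={share:.1%}")
```

Running this makes the contrast between the two external validation sets explicit: the full set keeps every label-0 example, while the balanced set pairs the same 2,072 label-1 examples with an equal number of label-0 examples.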