Table 1 Data Demographics.
From: Detecting stigmatizing language in clinical notes with large language models for addiction care
| Demographic Characteristic | Selected MIMIC Data (n = 77,104) | External Validation Full (n = 288,130) | External Validation Down-Sampled (n = 4144) |
|---|---|---|---|
| Age (yrs), mean (SD) | 56.37 (41.49) | 59 (16) | 55 (15) |
| Sex, n (%) | | | |
| Male | 47,844 (62%) | 176,285 (61%) | 2614 (63%) |
| Ethnicity, n (%) | | | |
| Black or African American | 7298 (9.47%) | 1799 (4.41%) | 208 (5.02%) |
| Pacific Islander or Hawaiian Native | 12 (0.017%) | 67 (0.16%) | 13 (0.31%) |
| White or Caucasian | 51,994 (67.43%) | 36,363 (89.04%) | 3641 (87.86%) |
| American Indian or Alaska Native | 76 (0.099%) | 299 (0.73%) | 35 (0.84%) |
| Asian or Mideast Indian | 1774 (2.30%) | 540 (1.32%) | 42 (1.01%) |
| Hispanic or Latino | 2319 (3.01%) | 992 (2.43%) | 106 (2.56%) |
| Unknown or Declined | 7868 (10.20%) | 678 (1.66%) | 88 (2.12%) |
| Non-Hispanic/Latino | 328 (0.46%) | 100 (0.24%) | 11 (0.27%) |