Table 2 Performance metrics of the fine-tuned MADRS-BERT and BERT-base models under strict and flexible criteria for accuracy
From: Using a fine-tuned large language model for symptom-based depression evaluation
MADRS Item | MADRS-BERT: Accuracy ↑ [%], Flexible | MADRS-BERT: Accuracy ↑ [%], Strict | BERT-base: Accuracy ↑ [%], Flexible | BERT-base: Accuracy ↑ [%], Strict |
|---|---|---|---|---|
Reported sadness | 80 (± 0.03) | 40 (± 0.07) | 29 (± 0.04) | 14 (± 0.03) |
Inner tension | 88 (± 0.06) | 49 (± 0.10) | 25 (± 0.04) | 12 (± 0.07) |
Sleep disturbances | 82 (± 0.08) | 44 (± 0.09) | 30 (± 0.07) | 17 (± 0.07) |
Loss of appetite | 79 (± 0.04) | 43 (± 0.12) | 33 (± 0.06) | 20 (± 0.07) |
Difficulties concentrating | 83 (± 0.08) | 40 (± 0.14) | 31 (± 0.06) | 15 (± 0.06) |
Lassitude | 86 (± 0.07) | 46 (± 0.16) | 31 (± 0.09) | 19 (± 0.08) |
Emotional numbness | 80 (± 0.12) | 35 (± 0.11) | 33 (± 0.11) | 20 (± 0.08) |
Pessimistic thoughts | 85 (± 0.07) | 41 (± 0.10) | 26 (± 0.04) | 14 (± 0.05) |
Suicidal ideations | 83 (± 0.10) | 44 (± 0.10) | 32 (± 0.05) | 17 (± 0.04) |
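
The table reports accuracy under two criteria but does not define them. As a minimal sketch, assume "strict" counts a prediction as correct only when it matches the reference item score exactly, and "flexible" also accepts predictions within one point of it. That one-point tolerance is an assumption here, not something stated in the table. The function name `accuracy`, its `tolerance` parameter, and the example scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def accuracy(pred: Sequence[int], ref: Sequence[int], tolerance: int = 0) -> float:
    """Return the percentage of predictions within `tolerance` points of the reference."""
    if len(pred) != len(ref) or len(pred) == 0:
        raise ValueError("pred and ref must be non-empty and of equal length")
    hits = sum(abs(p - r) <= tolerance for p, r in zip(pred, ref))
    return 100.0 * hits / len(pred)


# Hypothetical item scores on the 0-6 MADRS item scale (illustration only).
reference = [0, 2, 4, 6, 3, 1, 5, 2]
predicted = [0, 3, 4, 4, 2, 1, 6, 2]

strict = accuracy(predicted, reference, tolerance=0)    # exact match required
flexible = accuracy(predicted, reference, tolerance=1)  # assumed +/- 1 point allowed

print(f"strict: {strict:.1f}%  flexible: {flexible:.1f}%")
```

Under this assumption the flexible figure can never fall below the strict one, which is consistent with every row of the table.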