Table 3 The results of some representative LLMs on the MODMA dataset and our proposed dataset PDCH.
From: A Multimodal Depression Consultation Dataset of Speech and Text with HAMD-17 Assessments
Models | Input | MODMA | PDCH (Ours) |
|---|---|---|---|
GPT4o-mini-audio-preview | audio | 0.423 | 0.303 |
Qwen2.5-Omni-7B | audio | 0.654 | 0.364 |
Qwen2-Audio-7B-Instruct | audio | 0.346 | 0.061 |