Table 7 Comparison of classification performance using Qwen2-72B-Instruct (Zero-shot and SFT) and Ours (LLM-generated feature sets with ensemble random forest classifiers) for DP/ANX classification

From: Identifying psychiatric manifestations in outpatients with depression and anxiety: a large language model-based approach

 

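| Model | SEN | SPE | F1 | AUPRC | BAC | MB |
|---|---|---|---|---|---|---|
| BERT | 0.881 | 0.395 | 0.643 | 0.735 | 63.8% | 0.566 |
| RoBERTa | **0.963** | 0.372 | 0.669 | 0.708 | 66.8% | 0.566 |
| Qwen2.5-72B-Instruct (Zero-shot) | 0.748 | 0.753 | 0.751 | 0.842 | 75.0% | 0.566 |
| Qwen2.5-72B-Instruct (SFT) | 0.757 | 0.776 | 0.766 | 0.828 | 76.7% | 0.566 |
| Ours | 0.788 | **0.793** | **0.791** | **0.881** | **79.1%** | 0.566 |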

  1. The bold numbers indicate the highest performance within each metric column.
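
The BAC column can be cross-checked against the SEN and SPE columns. The sketch below assumes that SEN, SPE, and BAC stand for sensitivity, specificity, and balanced accuracy, and that balanced accuracy is the mean of the first two; the table itself does not define these abbreviations. The names `rows`, `balanced_accuracy`, and `max_bac_gap` are illustrative only and do not come from the article.

```python
# Values copied from the table: (SEN, SPE, reported BAC in percent).
rows = {
    "BERT": (0.881, 0.395, 63.8),
    "RoBERTa": (0.963, 0.372, 66.8),
    "Qwen2.5-72B-Instruct (Zero-shot)": (0.748, 0.753, 75.0),
    "Qwen2.5-72B-Instruct (SFT)": (0.757, 0.776, 76.7),
    "Ours": (0.788, 0.793, 79.1),
}


def balanced_accuracy(sen: float, spe: float) -> float:
    """Assumed definition: the mean of sensitivity and specificity."""
    return (sen + spe) / 2


def max_bac_gap(table: dict) -> float:
    """Largest gap, in percentage points, between recomputed and reported BAC."""
    return max(
        abs(100 * balanced_accuracy(sen, spe) - reported)
        for sen, spe, reported in table.values()
    )


if __name__ == "__main__":
    for name, (sen, spe, reported) in rows.items():
        recomputed = 100 * balanced_accuracy(sen, spe)
        print(f"{name}: recomputed {recomputed:.2f}%, reported {reported:.1f}%")
```

Because SEN and SPE are themselves rounded to three decimals, the recomputed values can differ from the reported ones in the last digit; under the assumed definition, every row agrees to within 0.1 percentage point.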