Table 7. Comparison of classification performance using Qwen2.5-72B-Instruct (Zero-shot and SFT) and Ours (LLM-generated feature sets with ensemble random forest classifiers) for DP/ANX classification
| Model | SEN | SPE | F1 | AUPRC | BAC | MB |
|---|---|---|---|---|---|---|
| BERT | 0.881 | 0.395 | 0.643 | 0.735 | 63.8% | 0.566 |
| RoBERTa | 0.963 | 0.372 | 0.669 | 0.708 | 66.8% | 0.566 |
| Qwen2.5-72B-Instruct (Zero-shot) | 0.748 | 0.753 | 0.751 | 0.842 | 75.0% | 0.566 |
| Qwen2.5-72B-Instruct (SFT) | 0.757 | 0.776 | 0.766 | 0.828 | 76.7% | 0.566 |
| Ours | 0.788 | 0.793 | 0.791 | 0.881 | 79.1% | 0.566 |
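
The BAC column can be cross-checked against the SEN and SPE columns. The following is a minimal sketch, assuming that BAC denotes balanced accuracy defined as the mean of sensitivity and specificity (the table itself does not spell out the abbreviation). The helper name `balanced_accuracy` and the dictionary layout are illustrative choices, not part of the original work; the numbers are copied from Table 7.

```python
# Values copied from Table 7: (SEN, SPE, reported BAC in percent).
rows = {
    "BERT": (0.881, 0.395, 63.8),
    "RoBERTa": (0.963, 0.372, 66.8),
    "Qwen2.5-72B-Instruct (Zero-shot)": (0.748, 0.753, 75.0),
    "Qwen2.5-72B-Instruct (SFT)": (0.757, 0.776, 76.7),
    "Ours": (0.788, 0.793, 79.1),
}


def balanced_accuracy(sen: float, spe: float) -> float:
    """Assumed definition: mean of sensitivity and specificity."""
    return (sen + spe) / 2


# Recompute BAC (as a fraction) for every row.
bac = {name: balanced_accuracy(sen, spe) for name, (sen, spe, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    recomputed = bac[name] * 100
    # SEN and SPE are themselves rounded to three decimals, so allow
    # a difference of up to 0.1 percentage points.
    status = "consistent" if abs(recomputed - reported) <= 0.1 else "mismatch"
    print(f"{name}: recomputed {recomputed:.2f}%, reported {reported:.1f}% ({status})")
```

Under this assumed definition, every recomputed value lies within 0.1 percentage points of the reported BAC, which is what one would expect given that SEN and SPE are rounded to three decimals.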