Table 5 Evaluation metric for diagnosis
From: Enhancing diagnostic capability with multi-agents conversational large language models
| Criteria | Score |
|---|---|
| The actual diagnosis was suggested | 5 |
| The suggestions included something very close, but not exact | 4 |
| The suggestions included something closely related that might have been helpful | 3 |
| The suggestions included something related, but unlikely to be helpful | 2 |
| No suggestions close | 0 |
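
The rubric above could be encoded as a lookup table when scoring a batch of cases. The following is a minimal sketch, not the authors' implementation: the names `RUBRIC`, `validate_score`, and `summarize` are hypothetical, and reporting the mean score and the share of exact diagnoses is an assumption made here for illustration, since the table itself does not say how scores are aggregated.

```python
from statistics import mean

# Rubric from Table 5. The scale skips 1: valid scores are 5, 4, 3, 2, 0.
RUBRIC = {
    5: "The actual diagnosis was suggested",
    4: "The suggestions included something very close, but not exact",
    3: "The suggestions included something closely related that might have been helpful",
    2: "The suggestions included something related, but unlikely to be helpful",
    0: "No suggestions close",
}


def validate_score(score: int) -> int:
    """Return the score unchanged if it is on the rubric, else raise."""
    if score not in RUBRIC:
        raise ValueError(f"{score!r} is not a valid rubric score: {sorted(RUBRIC)}")
    return score


def summarize(scores: list[int]) -> dict[str, float]:
    """Aggregate per-case scores (assumed aggregation, not taken from the source)."""
    if not scores:
        raise ValueError("at least one score is required")
    valid = [validate_score(s) for s in scores]
    return {
        "mean_score": mean(valid),
        # Share of cases where the actual diagnosis was suggested (score 5).
        "exact_rate": sum(s == 5 for s in valid) / len(valid),
    }
```

For example, `summarize([5, 4, 0, 5])` would report a mean score of 3.5 and an exact rate of 0.5, while a score of 1 is rejected because the rubric does not define it.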