Table 2 Clinical vs. general user settings

From: Evaluating large language model workflows in clinical decision support for triage and referral and diagnosis

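| Model | Triage Level: Exact Match | Triage Level: Range | Specialty: Matched | Specialty: At Least One | Diagnosis: Matched | Diagnosis: At Least One | Average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RAG-Assisted LLM | 1.65 | –1.05 | 0.16 | 0.10 | 0.34 | 0.85 | 0.34 |
| Claude 3.5 Sonnet | 2.20 | –0.40 | 0.60 | 0.50 | 0.04 | 0.10 | 0.51 |
| Claude 3 Sonnet | 3.30 | 0.15 | –0.38 | –0.55 | 0.34 | 0.50 | 0.56 |
| Claude 3 Haiku | 1.30 | –5.65 | 0.16 | –0.05 | 0.07 | –0.30 | –0.75 |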

  1. Each value is the change in a model's performance when moving from the general user setting to the clinical user setting; positive values indicate improvement.
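
As a reading aid, the following Python sketch checks how the Average column relates to the other six columns. It assumes the Average is the unweighted mean of the six per-metric changes. The source does not state this, but the reported figures are consistent with it to within rounding. The names `deltas`, `reported_average`, and `check_averages` are illustrative and not part of the original work.

```python
# Per-model changes copied from Table 2, in column order:
# triage exact match, triage range, specialty matched,
# specialty at least one, diagnosis matched, diagnosis at least one.
deltas = {
    "RAG-Assisted LLM": [1.65, -1.05, 0.16, 0.10, 0.34, 0.85],
    "Claude 3.5 Sonnet": [2.20, -0.40, 0.60, 0.50, 0.04, 0.10],
    "Claude 3 Sonnet": [3.30, 0.15, -0.38, -0.55, 0.34, 0.50],
    "Claude 3 Haiku": [1.30, -5.65, 0.16, -0.05, 0.07, -0.30],
}

# Values from the Average column of Table 2.
reported_average = {
    "RAG-Assisted LLM": 0.34,
    "Claude 3.5 Sonnet": 0.51,
    "Claude 3 Sonnet": 0.56,
    "Claude 3 Haiku": -0.75,
}


def check_averages(tolerance=0.006):
    """Compare the unweighted mean of each row with the reported average.

    ASSUMPTION: Average = mean of the six metric columns (not stated in the source).
    The tolerance allows for two-decimal rounding in the published table.
    Returns {model: (computed_mean, reported_value, within_tolerance)}.
    """
    results = {}
    for model, values in deltas.items():
        computed = sum(values) / len(values)
        reported = reported_average[model]
        results[model] = (computed, reported, abs(computed - reported) < tolerance)
    return results


if __name__ == "__main__":
    for model, (computed, reported, ok) in check_averages().items():
        print(f"{model}: computed {computed:.3f}, reported {reported:.2f}, consistent={ok}")
```

Under this assumption, the large negative average for Claude 3 Haiku is driven almost entirely by its –5.65 change in the triage range metric.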