npj Digital Medicine

Table 2 Clinical vs. general user settings

From: Evaluating large language model workflows in clinical decision support for triage and referral and diagnosis

Model	Triage Level: Exact Match	Triage Level: Range	Specialty: Matched	Specialty: At Least One	Diagnosis: Matched	Diagnosis: At Least One	Average
RAG-Assisted LLM	1.65	–1.05	0.16	0.10	0.34	0.85	0.34
Claude 3.5 Sonnet	2.20	–0.40	0.60	0.50	0.04	0.10	0.51
Claude 3 Sonnet	3.30	0.15	–0.38	–0.55	0.34	0.50	0.56
Claude 3 Haiku	1.30	–5.65	0.16	–0.05	0.07	–0.30	–0.75

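Change in performance for each model when moving from the general user setting to the clinical user setting. Positive values indicate an improvement in the clinical user setting; negative values indicate a decline.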
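The Average column is consistent with an unweighted mean of the six metric columns, rounded to two decimals. The source does not state how the average was computed, so this is an assumption. A minimal Python sketch that checks it against the tabulated values follows; the names `ROWS` and `mean_delta` are illustrative and do not come from the article.

```python
from decimal import Decimal, ROUND_HALF_UP

# Values copied from the table, in column order: triage exact match,
# triage range, specialty matched, specialty at least one,
# diagnosis matched, diagnosis at least one. The second element of each
# entry is the Average reported in the table.
ROWS = {
    "RAG-Assisted LLM": (["1.65", "-1.05", "0.16", "0.10", "0.34", "0.85"], "0.34"),
    "Claude 3.5 Sonnet": (["2.20", "-0.40", "0.60", "0.50", "0.04", "0.10"], "0.51"),
    "Claude 3 Sonnet": (["3.30", "0.15", "-0.38", "-0.55", "0.34", "0.50"], "0.56"),
    "Claude 3 Haiku": (["1.30", "-5.65", "0.16", "-0.05", "0.07", "-0.30"], "-0.75"),
}


def mean_delta(values):
    """Unweighted mean of the metric values, rounded half-up to 2 decimals.

    Assumption: equal weighting across the six metrics. Decimal is used
    so that ties such as -0.745 round deterministically.
    """
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model, (values, reported) in ROWS.items():
    computed = mean_delta(values)
    status = "matches" if computed == Decimal(reported) else "differs from"
    print(f"{model}: computed {computed} {status} reported {reported}")
```

Under this assumption, the large negative Triage Level: Range value for Claude 3 Haiku dominates its average, which is why that model is the only one with a negative overall change.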
