Table 18 System-level bias metrics (lower is better).
From: Ophtimus-V2-Tx: a compact domain-specific LLM for ophthalmic diagnosis and treatment planning
| System | Mapper | Bias_score \(\downarrow\) | Abstain | Error dist† | Close miss‡ | Under-spec§ |
|---|---|---|---|---|---|---|
| ATC | OpenAI | 0.500 | 0.291 | 2.527 | 0.702 | 0.110 |
| ATC | Claude | 0.544 | 0.077 | 2.763 | 0.647 | 0.192 |
| ATC | Gemini | 0.498 | 0.043 | 2.859 | 0.597 | 0.192 |
| ATC | Perplexity | 0.442 | 0.036 | 4.048 | 0.288 | 0.173 |
| ICD-10-CM | OpenAI | 0.259 | 0.015 | 4.497 | 0.429 | 0.010 |
| ICD-10-CM | Claude | 0.524 | 0.000 | 3.782 | 0.789 | 0.015 |
| ICD-10-CM | Gemini | 0.521 | 0.005 | 4.249 | 0.656 | 0.015 |
| ICD-10-CM | Perplexity | 0.750 | 0.434 | 3.087 | 0.804 | 0.018 |
| ICD-10-PCS | OpenAI | 0.359 | 0.032 | 2.571 | 0.413 | – |
| ICD-10-PCS | Claude | 0.369 | 0.003 | 1.802 | 0.768 | – |
| ICD-10-PCS | Gemini | 0.320 | 0.029 | 2.015 | 0.604 | – |
| ICD-10-PCS | Perplexity | 0.652 | 0.395 | 1.710 | 0.752 | – |
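
As a reading aid, the following minimal sketch picks, for each coding system, the mapper with the lowest bias score, taking the caption's "lower is better" at face value. The scores are copied from the table; the names `ROWS` and `best_mapper_per_system` are hypothetical and are not part of the original work, and the sketch deliberately ignores the other columns (for example the abstention rate), which may matter when comparing mappers.

```python
# Bias scores copied from Table 18: (system, mapper, bias_score).
ROWS = [
    ("ATC", "OpenAI", 0.500),
    ("ATC", "Claude", 0.544),
    ("ATC", "Gemini", 0.498),
    ("ATC", "Perplexity", 0.442),
    ("ICD-10-CM", "OpenAI", 0.259),
    ("ICD-10-CM", "Claude", 0.524),
    ("ICD-10-CM", "Gemini", 0.521),
    ("ICD-10-CM", "Perplexity", 0.750),
    ("ICD-10-PCS", "OpenAI", 0.359),
    ("ICD-10-PCS", "Claude", 0.369),
    ("ICD-10-PCS", "Gemini", 0.320),
    ("ICD-10-PCS", "Perplexity", 0.652),
]


def best_mapper_per_system(rows):
    """Return {system: (mapper, bias_score)} for the lowest bias score per system."""
    best = {}
    for system, mapper, score in rows:
        # Keep the first mapper seen, then replace it only on a strictly lower score.
        if system not in best or score < best[system][1]:
            best[system] = (mapper, score)
    return best


if __name__ == "__main__":
    for system, (mapper, score) in best_mapper_per_system(ROWS).items():
        print(f"{system}: {mapper} ({score:.3f})")
```

On these values the sketch selects Perplexity for ATC (0.442), OpenAI for ICD-10-CM (0.259), and Gemini for ICD-10-PCS (0.320).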