Table 1 Evaluator participation and rating distribution across scenarios. Each scenario included 11 dialogue lines (6 clinician lines, 5 patient lines).
| Scenario | Number of evaluators | Clinician dialogue lines | Patient dialogue lines | AI translation ratings | Human interpreter ratings | Total ratings |
|---|---|---|---|---|---|---|
| Scenario 1 | 8 | 6 | 5 | 1056 | 1056 | 2112 |
| Scenario 2 | 7 | 6 | 5 | 924 | 924 | 1848 |
| Scenario 3 | 4 | 6 | 5 | 528 | 528 | 1056 |
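
The counts in Table 1 can be cross-checked arithmetically. The sketch below is illustrative only and is not part of the study's analysis. It assumes that each rating count equals evaluators × dialogue lines × a constant number of ratings per line. That constant is not stated in the table; it is inferred here from the counts themselves. The function names `ratings_per_line` and `check_table` are hypothetical.

```python
# Rows copied from Table 1:
# (scenario, evaluators, clinician lines, patient lines,
#  AI translation ratings, human interpreter ratings, total ratings)
ROWS = [
    ("Scenario 1", 8, 6, 5, 1056, 1056, 2112),
    ("Scenario 2", 7, 6, 5, 924, 924, 1848),
    ("Scenario 3", 4, 6, 5, 528, 528, 1056),
]


def ratings_per_line(evaluators, lines, ratings):
    """Ratings per evaluator per dialogue line (assumed constant, inferred)."""
    quotient, remainder = divmod(ratings, evaluators * lines)
    if remainder:
        raise ValueError("rating count is not a multiple of evaluators x lines")
    return quotient


def check_table(rows):
    """Check AI + human = total and collect the inferred per-line factors."""
    factors = set()
    for name, n_eval, clin, pat, ai, human, total in rows:
        lines = clin + pat  # 6 + 5 = 11 dialogue lines per scenario
        if ai + human != total:
            raise ValueError(f"{name}: AI + human ratings do not sum to total")
        factors.add(ratings_per_line(n_eval, lines, ai))
        factors.add(ratings_per_line(n_eval, lines, human))
    return factors


print(check_table(ROWS))  # {12}
```

Under this assumption, every row is consistent with 12 ratings per evaluator per dialogue line for each translation source, and the AI and human interpreter counts sum to the reported totals.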