Fig. 4: Evaluation Workflow for Language Model Responses. | npj Digital Medicine