Fig. 7: This figure depicts a flowchart illustrating the evaluation process for a candidate AI model in a medical case assessment scenario. | npj Digital Medicine

From: Autonomous medical evaluation for guideline adherence of large language models

Fig. 7

The process begins with the model receiving an initial case and question. The model then provides an initial response, which is evaluated by an Evaluator Model. If all criteria are met, the process moves to the next question. If not, and if reasks are allowed, the evaluator indicates unmet criteria, allowing the Candidate Model to review and revise its response. This cycle of evaluation and potential revision continues until either all criteria are met or reasks are no longer allowed. The process then proceeds to the next question if any remain or concludes the evaluation if all questions have been addressed. This iterative approach allows for multiple attempts at meeting the assessment criteria, mimicking a real-world clinical reasoning process where information gathering and decision-making often involve multiple steps and refinements.
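The loop described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `evaluate_case`, `QuestionResult`, and `max_reasks` are hypothetical, the candidate and evaluator are modelled as plain callables, and the evaluator is assumed to return the list of unmet criteria (an empty list meaning all criteria are met).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical signatures, chosen only for this sketch:
#   candidate(case, question, unmet_criteria_or_None) -> response text
#   evaluator(question, response) -> list of unmet criteria (empty = all met)
Candidate = Callable[[str, str, Optional[List[str]]], str]
Evaluator = Callable[[str, str], List[str]]


@dataclass
class QuestionResult:
    question: str
    final_response: str
    attempts: int            # initial response plus any revisions
    all_criteria_met: bool


def evaluate_case(
    case: str,
    questions: List[str],
    candidate: Candidate,
    evaluator: Evaluator,
    max_reasks: int = 1,     # assumed cap; 0 means reasks are not allowed
) -> List[QuestionResult]:
    results = []
    for question in questions:
        # Initial response: no evaluator feedback is available yet.
        response = candidate(case, question, None)
        unmet = evaluator(question, response)
        reasks_used = 0

        # Re-ask while criteria remain unmet and reasks are still allowed.
        while unmet and reasks_used < max_reasks:
            response = candidate(case, question, unmet)
            unmet = evaluator(question, response)
            reasks_used += 1

        results.append(
            QuestionResult(
                question=question,
                final_response=response,
                attempts=1 + reasks_used,
                all_criteria_met=not unmet,
            )
        )
    # All questions have been addressed: the evaluation concludes.
    return results
```

With a toy candidate that only produces an acceptable answer after receiving feedback, calling `evaluate_case` with `max_reasks=1` records two attempts per question, while `max_reasks=0` stops after the initial response and marks the criteria as unmet.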
