Fig. 1: Percentage of Accurately Predicting Medical Fitness for Surgery Across Different Agents. | npj Digital Medicine

From: Retrieval augmented generation for 10 large language models and its generalizability in assessing medical fitness

The figure shows the percentage of accurate assessments of medical fitness for surgery made by various LLMs and by human evaluators. Each bar represents the accuracy of a specific model or of the human-generated responses. The overall accuracy of the GPT4_international models was 93.0%, significantly higher than that of the human evaluators (86.0%).