Table 1. Accuracy of ChatGPT across the four domains of the MRCOG Part One examination

Domain	Correct	Incorrect	Total
Cell Function	203 (72.8%)	76 (27.2%)	279
Human Structure	135 (69.9%)	58 (30.1%)	193
Illness	148 (80.0%)	37 (20.0%)	185
Measurement and Manipulation	117 (65.7%)	61 (34.3%)	178
Total	603 (72.2%)	232 (27.8%)	835

The overall accuracy was 72.2% (95% CI 69.2–75.3). Accuracy differed significantly across the four domains (chi-squared statistic = 9.85, 3 degrees of freedom, p = 0.02). ChatGPT performed best in the “Illness” domain, with an accuracy of 80.0% (95% CI 73.3–85.7), and worst in the “Measurement and Manipulation” domain, with an accuracy of 65.7% (95% CI 58.8–72.7). Values in brackets are percentages of each row total.
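
The domain comparison reported in the note can be checked directly from the counts in Table 1. The sketch below is illustrative only and is not the authors' analysis code: it assumes a Pearson chi-squared test of independence on the 4 x 2 table of correct and incorrect answers, and a normal-approximation (Wald) interval for the overall accuracy. The source does not state which interval method was used, so the Wald interval is an assumption and may not reproduce every reported interval. The function names are hypothetical.

```python
import math

# Correct / incorrect counts per domain, copied from Table 1.
counts = {
    "Cell Function": (203, 76),
    "Human Structure": (135, 58),
    "Illness": (148, 37),
    "Measurement and Manipulation": (117, 61),
}


def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of domain and correctness.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-squared variable with exactly 3 degrees
    of freedom (closed form; not valid for other degrees of freedom)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a proportion (assumed method)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


table = list(counts.values())
stat, dof = chi_squared_independence(table)
p_value = chi2_sf_3dof(stat)  # valid here because dof == 3

total_correct = sum(correct for correct, _ in table)
total_items = sum(correct + incorrect for correct, incorrect in table)
low, high = wald_interval(total_correct, total_items)

print(f"chi-squared = {stat:.2f}, dof = {dof}, p = {p_value:.3f}")
print(f"overall accuracy = {total_correct / total_items:.1%} "
      f"(95% CI {low:.1%} to {high:.1%})")
```

The closed-form tail probability avoids a dependency on a statistics library; with SciPy available, `scipy.stats.chi2_contingency` would perform the same test for a table of any shape.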
