Table 2. ChatGPT accuracy in each subject within the MRCOG Part One domains

From: Exploring the capabilities of ChatGPT in women’s health: obstetrics and gynaecology

| Domain | Knowledge area | Correct, n (%) | Incorrect, n (%) | Total | p-value |
|---|---|---|---|---|---|
| Cell Function | Biochemistry | 71 (79.8%) | 18 (20.2%) | 89 | 0.08 |
| | Endocrinology | 66 (74.2%) | 23 (25.8%) | 89 | |
| | Physiology | 66 (65.3%) | 35 (34.7%) | 101 | |
| Human Structure | Anatomy | 67 (63.2%) | 39 (36.8%) | 106 | 0.07 |
| | Embryology | 45 (80.4%) | 11 (19.6%) | 56 | |
| | Genetics | 23 (74.2%) | 8 (25.8%) | 31 | |
| Illness | Clinical management | 20 (83.3%) | 4 (16.7%) | 24 | 0.49 |
| | Immunology | 21 (70.0%) | 9 (30.0%) | 30 | |
| | Microbiology | 58 (80.6%) | 14 (19.4%) | 72 | |
| | Pathology | 49 (83.1%) | 10 (16.9%) | 59 | |
| Measurement & Manipulation | Biophysics | 19 (51.4%) | 18 (48.6%) | 37 | 0.11 |
| | Data interpretation | 18 (69.2%) | 8 (30.8%) | 26 | |
| | Epidemiology and statistics | 37 (63.8%) | 21 (36.2%) | 58 | |
| | Pharmacology | 43 (75.4%) | 14 (24.6%) | 57 | |
 
  1. The accuracy of ChatGPT differed significantly across the four domains (Chi-squared statistic = 9.85, p = 0.02); however, accuracy did not differ significantly between the subjects within any single domain. Values in brackets are percentages of each subject's total.
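
The across-domain comparison in the footnote can be illustrated with a short sketch. It assumes (the table does not say so explicitly) that the test was a Pearson chi-squared test on a 4 x 2 contingency table of correct and incorrect counts pooled over the subjects in each domain, with 3 degrees of freedom. The function names `pooled_counts`, `pearson_chi2`, and `chi2_sf_df3` are hypothetical helpers written for this illustration, not part of the original analysis.

```python
import math

# (correct, incorrect) counts per knowledge area, copied from Table 2.
TABLE_2 = {
    "Cell Function": [(71, 18), (66, 23), (66, 35)],
    "Human Structure": [(67, 39), (45, 11), (23, 8)],
    "Illness": [(20, 4), (21, 9), (58, 14), (49, 10)],
    "Measurement & Manipulation": [(19, 18), (18, 8), (37, 21), (43, 14)],
}


def pooled_counts(table):
    """Sum correct and incorrect counts over the subjects of each domain."""
    return [
        (sum(c for c, _ in rows), sum(i for _, i in rows))
        for rows in table.values()
    ]


def pearson_chi2(rows):
    """Pearson chi-squared statistic for an r x c table of observed counts."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of
    freedom, using the closed form that exists for this case."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Assumption: the published test pooled subjects within each domain.
domain_rows = pooled_counts(TABLE_2)
chi2_stat = pearson_chi2(domain_rows)
p_value = chi2_sf_df3(chi2_stat)  # (4 - 1) * (2 - 1) = 3 degrees of freedom

print(f"chi-squared = {chi2_stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the statistic comes out at about 9.85 with a p-value close to 0.02, which is consistent with the values reported in the footnote. The within-domain p-values in the table would need the same test applied to the subject rows of a single domain, with the degrees of freedom adjusted to the number of subjects.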