Table 2. Number and percentage of correct answers given by each chatbot.

From: Evaluation of correctness and reliability of GPT, Bard, and Bing chatbots’ responses in basic life support scenarios

 

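| Scenarios | Response | Bard, n (%) | Bing, n (%) | GPT3.5, n (%) | GPT4, n (%) |
|---|---|---|---|---|---|
| Adult (1st and 2nd scenarios) | 1st response | 12 (60) | 7 (35) | 5 (25) | 17 (85) |
| Adult (1st and 2nd scenarios) | 2nd response | 10 (50) | 5 (25) | 5 (25) | 13 (65) |
| Pediatric (3rd and 4th scenarios) | 1st response | 7 (30.43) | 10 (43.48) | 2 (8.7) | 5 (21.74) |
| Pediatric (3rd and 4th scenarios) | 2nd response | 12 (52.17) | 5 (21.74) | 7 (30.43) | 5 (21.74) |
| Infant (5th and 6th scenarios) | 1st response | 2 (8.7) | 6 (26.09) | 5 (21.74) | 3 (13.04) |
| Infant (5th and 6th scenarios) | 2nd response | 6 (26.09) | 2 (8.7) | 5 (21.74) | 2 (8.7) |
| All scenarios | 1st response | 21 (31.82) | 23 (34.85) | 12 (18.18) | 25 (37.88) |
| All scenarios | 2nd response | 28 (42.42) | 12 (18.18) | 17 (25.76) | 20 (30.3) |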
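The percentages in Table 2 can be cross-checked against the counts with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original study. The counts are transcribed from the table; the denominators (20 items for the adult scenarios, 23 each for the pediatric and infant scenarios, 66 overall) are not stated in the table and are an assumption inferred from the printed percentages. The names `COUNTS`, `ITEMS`, `pct`, and `overall` are hypothetical.

```python
# Counts of correct answers transcribed from Table 2,
# in the column order Bard, Bing, GPT3.5, GPT4.
CHATBOTS = ["Bard", "Bing", "GPT3.5", "GPT4"]

COUNTS = {
    ("Adult", "1st"): [12, 7, 5, 17],
    ("Adult", "2nd"): [10, 5, 5, 13],
    ("Pediatric", "1st"): [7, 10, 2, 5],
    ("Pediatric", "2nd"): [12, 5, 7, 5],
    ("Infant", "1st"): [2, 6, 5, 3],
    ("Infant", "2nd"): [6, 2, 5, 2],
}

# ASSUMPTION: items per scenario group, inferred from the printed
# percentages (e.g. 12 correct reported as 60% implies 20 items).
ITEMS = {"Adult": 20, "Pediatric": 23, "Infant": 23}


def pct(n, total):
    """Percentage of correct answers, rounded to two decimals."""
    return round(100 * n / total, 2)


def overall(response):
    """Sum the per-group counts for one response round ('1st' or '2nd')."""
    totals = [0] * len(CHATBOTS)
    for (_group, resp), row in COUNTS.items():
        if resp == response:
            totals = [t + n for t, n in zip(totals, row)]
    return totals


if __name__ == "__main__":
    total_items = sum(ITEMS.values())
    for response in ("1st", "2nd"):
        for name, n in zip(CHATBOTS, overall(response)):
            print(f"{response} response, {name}: {n} ({pct(n, total_items)})")
```

Under these assumed denominators, the summed counts and recomputed percentages agree with the "All scenarios" rows of the table.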