Table 2 Diagnostic performance of the gastroenterologists using different numbers of auxiliary methods

From: Multiple large language models versus experienced physicians in diagnosing challenging cases with gastrointestinal symptoms

| Auxiliary methods | Number of answers | Refusal rate (%, 95% CI) | Coverage rate (%, 95% CI) | Accuracy (%, 95% CI) |
| --- | --- | --- | --- | --- |
| Not using auxiliary methods | 786 | 0.4 (0.1–1.1) | 23.0 (20.2–26.1) | 18.7 (16.1–21.6) |
| Using auxiliary methods | 402 | 0.3 (0.04–1.4) | 45.5 (40.7–50.4) | 37.8 (33.2–42.7) |
|   One method | 299 | 0 (0–1.3) | 48.8 (43.2–54.5) | 41.1 (35.7–46.8) |
|     1. Discussion with Peers in Gastroenterology | 16 | 0 (0–19.4) | 56.3 (33.2–76.9) | 50.0 (28.0–72.0) |
|     2. Consultation with Peers in Other Departments | 4 | 0 (0–49.0) | 100.0 (34.2–100.0) | 100.0 (34.2–100.0) |
|     3. Professional Books Reference | 50 | 0 (0–7.1) | 42.0 (29.4–55.8) | 36.0 (24.1–49.9) |
|     4. Classical Search Engine | 87 | 0 (0–4.2) | 51.7 (41.4–61.9) | 42.5 (32.7–53.0) |
|     5. Academic Database | 138 | 0 (0–2.7) | 49.3 (41.1–57.5) | 41.3 (33.4–49.7) |
|     6. Others^a | 4 | 0 (0–56.2) | 25.0 (6.2–79.2) | 25.0 (6.2–79.2) |
|   Two methods | 93 | 1.1 (0.2–5.8) | 34.4 (25.6–44.5) | 26.9 (18.9–36.7) |
|   Three or more methods | 10 | 0 (0–27.8) | 50.0 (23.7–76.3) | 40.0 (16.8–68.7) |
| No record of the use of auxiliary methods | 95 | 31.6 (23.1–41.5) | 15.8 (9.8–24.4) | 13.7 (8.2–22.0) |
| Total | 1283 | 0.3 (0.1–0.9) | 29.5 (27.1–32.1) | 24.3 (22.0–26.7) |

^a LLMs or search engines with LLM functions were not allowed.
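
The table does not state how the 95% CIs were computed. Several rows are numerically consistent with the Wilson score interval (for example, 0 refusals in 16 answers gives 0–19.4), so the sketch below assumes that method. The function name `wilson_interval` and the count of 8 correct answers out of 16 (back-calculated from the reported 50.0%) are illustrative assumptions, not values taken from the source.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return the (lower, upper) Wilson score interval for a proportion.

    Assumption: the table's CIs use this method; the source does not say so.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")

    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom

    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def as_percent(interval):
    """Format an interval as percentages rounded to one decimal place."""
    low, high = interval
    return round(low * 100, 1), round(high * 100, 1)


# Refusal rate, "Discussion with Peers in Gastroenterology": 0 of 16 answers.
print(as_percent(wilson_interval(0, 16)))  # (0.0, 19.4)

# Accuracy for the same row, assuming 8 of 16 correct (hypothetical count).
print(as_percent(wilson_interval(8, 16)))  # (28.0, 72.0)
```

Not every row reproduces exactly from the listed number of answers, so the denominators used for some intervals may differ from the "Number of answers" column; the sketch is a way to sanity-check individual rows, not a reconstruction of the authors' analysis.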