Table 1 Large Language Model Versions and Testing Timeline.

From: Performance comparison of large language models in boron neutron capture therapy knowledge assessment

| Model and provider | Version | Test rounds | Test time | Temperature |
|---|---|---|---|---|
| ChatGPT (OpenAI) | GPT-4 Turbo | Round 1 to Round 3 | Dec. 20–25, 2023; Jan. 5–10, 2024; Mar. 20–25, 2024 | 1.0 |
| ChatGPT (OpenAI) | GPT-4.5 | Round 4 to Round 5 | Apr. 20–25, 2025; Jun. 25–30, 2025 | 1.0 |
| Claude (Anthropic) | Claude 2.1 | Round 1 to Round 2 | Dec. 20–25, 2023; Jan. 5–10, 2024 | * |
| Claude (Anthropic) | Claude 3 Opus | Round 3 | Mar. 20–25, 2024 | 1.0 |
| Claude (Anthropic) | Claude 3.7 Sonnet | Round 4 | Apr. 20–25, 2025 | 1.0 |
| Claude (Anthropic) | Claude Opus 4 | Round 5 | Jun. 25–30, 2025 | 1.0 |
| Bard (Gemini) (Google) | Bard | Round 1 to Round 2 | Dec. 20–25, 2023; Jan. 5–10, 2024 | * |
| Bard (Gemini) (Google) | Gemini 1.0 | Round 3 | Mar. 20–25, 2024 | 1.0 |
| Bard (Gemini) (Google) | Gemini 2.5 Flash | Round 4 to Round 5 | Apr. 20–25, 2025; Jun. 25–30, 2025 | 1.0 |
| ERNIE Bot (Baidu) | ERNIE Bot 4.0 | Round 1 to Round 3 | Dec. 20–25, 2023; Jan. 5–10, 2024; Mar. 20–25, 2024 | * |
| ERNIE Bot (Baidu) | ERNIE Bot 4.5 | Round 4 | Apr. 20–25, 2025 | * |
| ERNIE Bot (Baidu) | ERNIE X1 Turbo | Round 5 | Jun. 25–30, 2025 | * |

  1. All model versions were tested through their respective web interfaces with default settings. An asterisk (*) indicates that the provider's official website did not disclose the temperature value at the time of testing.