Fig. 1

Performance of different large language models on the Persian version of the exams at temperatures 0 (a), 0.5 (b), 1 (c), and 2 (d), and on the English version at temperatures 0 (e), 0.5 (f), 1 (g), and 2 (h).
