Scientific Reports

Table 3 Effect of language on accuracy rates of ChatGPT-4o and DeepSeek across different groups (mean ± standard deviation).

From: Evaluation of ChatGPT-4o and DeepSeek as tools for orthodontic health literacy in public dental education

	ChatGPT-4o				DeepSeek
	English	Chinese	p	Cohen’s d	English	Chinese	p	Cohen’s d
Group A	82.2 ± 0.038	83.3 ± 0.037	0.844	0.029	50 ± 0.050	55.6 ± 0.050	0.457	0.111
Group B	80 ± 0.040	80 ± 0.040	1.000	0.000	90 ± 0.030	80 ± 0.040	0.061	0.281
Group C	90 ± 0.030	93.3 ± 0.025	0.402	0.120	100 ± 0	98.9 ± 0.010	0.317	0.149
Group D	100 ± 0	100 ± 0	1.000	0.000	100 ± 0	93.3 ± 0.025	0.013*	0.374
Group E	100 ± 0	100 ± 0	1.000	0.000	100 ± 0	90 ± 0.03	0.002*	0.466
Total	90.4 ± 0.0294	91.3 ± 0.0282	0.643	0.031	88 ± 0.0325	83.6 ± 0.0371	0.056	0.127

Bold values in the original table (marked here with *) indicate statistically significant differences.
*p < 0.05.
