Fig. 1: Overall performance of DeepSeek-R1 (Rea.) and DeepSeek-R1 (Con.).

a Bar charts comparing diagnostic metrics. Blue bars represent DeepSeek-R1 (Rea.) and orange bars represent DeepSeek-R1 (Con.). b Box plots comparing qualitative metrics. Blue boxes represent DeepSeek-R1 (Rea.) and yellow boxes represent DeepSeek-R1 (Con.). ×: mean values. Lines in the boxes: median values. Whiskers: extend to the most extreme values within 1.5×IQR of the box edges. Circles beyond the whiskers: outliers.
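The box-plot conventions in panel b can be made concrete with a minimal sketch. It assumes the common Tukey convention (fences at 1.5×IQR beyond the quartiles) and linearly interpolated quartiles; the plotting software behind the figure may define quartiles slightly differently. The function name `box_plot_summary` and the sample scores are hypothetical and are not data from the study.

```python
from statistics import mean, median, quantiles


def box_plot_summary(values):
    """Compute the quantities a box plot displays (assumed Tukey convention)."""
    data = sorted(values)
    # Quartiles by linear interpolation; other definitions shift these slightly.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Points outside the fences are drawn individually as outliers (circles).
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "mean": mean(data),        # the × marker
        "median": median(data),    # the line inside the box
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical rating scores, for illustration only.
scores = [3, 4, 4, 4, 5, 5, 5, 5, 1]
summary = box_plot_summary(scores)
print(summary)
```

In this hypothetical sample the single low score falls below the lower fence, so it would be drawn as a circle rather than being reached by the whisker.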