Fig. 10
From: Arch-Eval benchmark for assessing Chinese architectural domain knowledge in large language models

Difference in output accuracy between answer-only (AO) and chain-of-thought (COT) prompting for selected LLMs.