Fig. 7: The evaluation scores of different approaches across various LLM bases. | npj Artificial Intelligence