Table 5. Performance of the LLMs on the MMLU [50], GPQA [49], and DROP [51] benchmarks, collected from references [38, 39, 48].

From: Evaluation of LLMs accuracy and consistency in the registered dietitian exam through prompt engineering and knowledge retrieval

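| Benchmark | Prompt | GPT-4o | Claude 3.5 Sonnet | Gemini 1.5 Pro |
| --- | --- | --- | --- | --- |
| MMLU (Undergraduate Level Knowledge) | Zero Shot | 88.70% | 88.30% | - |
| MMLU (Undergraduate Level Knowledge) | Five Shot | - | 88.70% | 85.90% |
| GPQA (Graduate Level Reasoning) | Chain of Thought | 53.60% | 59.40% | 46.20% |
| DROP (Reasoning) | Three Shot | 83.40% | 87.10% | 74.90% |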