Fig. 4: Zero-shot experiments.
From: Evaluating search engines and large language models for answering health questions

For each LLM, the bars represent the accuracy for the three prompting strategies: “no context” (light blue bar), “non-expert” (blue bar), and “expert” (navy blue bar). a Plots results for the TREC HM 2020 collection. b Plots results for the TREC HM 2021 collection. c Plots results for the TREC HM 2022 collection.