Extended Data Table 5 Zero-shot evaluation results for 4-billion parameter LLMs

From: Medical large language models are vulnerable to data-poisoning attacks

  1. Complete results of the open-source benchmark suite for 4-billion parameter language models in the zero-shot (no examples provided) setting. Results of multiple-choice benchmarks were obtained by aggregating over all permutations of each question/answer.
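
The caption does not say how the permutations are aggregated. One plausible reading is that each multiple-choice question is scored under every ordering of its answer options and the per-ordering results are averaged. The sketch below illustrates that reading only. The function name `permutation_accuracy`, the `predict` callback, and the choice of a plain mean are assumptions made for illustration, not details taken from the paper.

```python
from itertools import permutations
from statistics import mean
from typing import Callable, Sequence


def permutation_accuracy(
    question: str,
    options: Sequence[str],
    correct_index: int,
    predict: Callable[[str, Sequence[str]], int],
) -> float:
    """Mean accuracy of `predict` over every ordering of the answer options.

    `predict` is a hypothetical stand-in for a model call: it receives the
    question and the reordered options and returns the position it selects.
    Averaging is an assumed aggregation rule, not one stated in the caption.
    """
    hits = []
    for order in permutations(range(len(options))):
        shuffled = [options[i] for i in order]
        # Position at which the originally correct option now appears.
        target = order.index(correct_index)
        hits.append(1.0 if predict(question, shuffled) == target else 0.0)
    return mean(hits)


if __name__ == "__main__":
    opts = ["aspirin", "ibuprofen", "metformin", "warfarin"]

    # A position-biased predictor that always picks the first option.
    def always_first(q, o):
        return 0

    # A content-based predictor that finds the right option wherever it sits.
    def by_content(q, o):
        return list(o).index("metformin")

    print(permutation_accuracy("Example question?", opts, 2, always_first))
    print(permutation_accuracy("Example question?", opts, 2, by_content))
```

Under this scheme, a predictor that always selects the same position scores at chance level (1 in 4 for four options), whereas a predictor that tracks option content is unaffected by reordering. This is the usual motivation for evaluating over permutations.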