Fig. 3: Response accuracy of LLMs and quantized models by model size.

a Model size (billions of parameters, logarithmic scale) versus response accuracy (%). LLM families are color-coded: Mistral (blue), Llama (purple), Gemma (green), Phi (red), and Qwen (orange). b Accuracy changes in quantized models relative to their full-precision counterparts (paired models are connected by lines).