Fig. 5: Attack success rate (ASR) of different models after scaling the LoRA A and B matrix weights of the poisoned Llama-3.3 70B models.
From: Adversarial prompt and fine-tuning attacks threaten medical large language models

The models are evaluated on (a) recommending a harmful drug combination, (b) recommending a vaccine, and suggesting (c) ultrasound, (d) CT, (e) X-ray, and (f) MRI tests. Numbers on the x-axis and y-axis indicate the scaling factor (α) used in the scaling function. For comparison, we show the original ASR without scaling at the bottom left. Source data are provided as a Source Data file.
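The caption does not define the scaling function, so the following is only a minimal sketch under the assumption that scaling means multiplying each LoRA matrix by its factor α. The names `scale_lora`, `alphas`, and `grid`, the toy matrix sizes, and the factor values are hypothetical and are not taken from the paper. Under that assumption, scaling A by α_A and B by α_B multiplies the low-rank update BA by α_A·α_B, so different grid cells with the same product give the same update.

```python
import numpy as np

def scale_lora(A, B, alpha_a, alpha_b):
    """Return scaled copies of the LoRA factors.

    Assumption: the scaling function is plain multiplication by alpha.
    """
    return alpha_a * A, alpha_b * B

# Toy LoRA factors (hypothetical sizes): the update is delta_W = B @ A.
rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 2
A = rng.normal(size=(rank, d_in))
B = rng.normal(size=(d_out, rank))

# Hypothetical grid of scaling factors, one axis per matrix.
alphas = [0.5, 1.0, 2.0]
grid = {}
for alpha_a in alphas:
    for alpha_b in alphas:
        A_s, B_s = scale_lora(A, B, alpha_a, alpha_b)
        # Effective low-rank update after scaling both factors.
        grid[(alpha_a, alpha_b)] = B_s @ A_s
```

In an evaluation like the one in the figure, each cell of such a grid would correspond to one scaled adapter whose ASR is then measured; the sketch stops at building the scaled updates and does not model the evaluation itself.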