Table 2 Medical capability performance of the baseline model (GPT-4o) and of models fine-tuned on each task with clean and poisoned samples
From: Adversarial prompt and fine-tuning attacks threaten medical large language models
| Model variant | MedQA Acc. (%) | MedQA Ste. (%) | MedMCQA Acc. (%) | MedMCQA Ste. (%) | PubMedQA Acc. (%) | PubMedQA Ste. (%) |
|---|---|---|---|---|---|---|
| Vaccine (clean) | 81.93 | 1.08 | 73.58 | 0.68 | 64.30 | 1.52 |
| Vaccine (poisoned) | 78.87 | 1.15 | 69.88 | 0.72 | 62.30 | 1.53 |
| Drug (clean) | 80.83 | 1.10 | 73.06 | 0.69 | 67.70 | 1.46 |
| Drug (poisoned) | 80.20 | 1.12 | 71.72 | 0.70 | 61.20 | 1.52 |
| Test rec. (clean) | 80.20 | 1.12 | 72.36 | 0.68 | 61.60 | 1.54 |
| Test rec. (poisoned) | 81.46 | 1.09 | 72.70 | 0.69 | 64.30 | 1.51 |
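As a point of reference for reading the table, the sketch below shows how an accuracy value and its accompanying standard error can be computed, assuming Ste. denotes the binomial standard error of the accuracy over the benchmark's test questions; the question counts and score used are illustrative assumptions, not the paper's exact splits.

```python
import math

def accuracy_with_se(num_correct: int, num_total: int) -> tuple[float, float]:
    """Return accuracy (%) and its binomial standard error (%) for a benchmark run."""
    p = num_correct / num_total                # fraction of questions answered correctly
    se = math.sqrt(p * (1.0 - p) / num_total)  # standard error of a proportion
    return 100.0 * p, 100.0 * se

# Hypothetical test-set size and score, chosen only to illustrate how a
# row such as "Vaccine (clean), PubMedQA" could arise from raw counts.
acc, se = accuracy_with_se(num_correct=643, num_total=1000)
print(f"Acc. = {acc:.2f}%, Ste. = {se:.2f}%")  # -> Acc. = 64.30%, Ste. = 1.52%
```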