Table 3 Comparison of zero-shot performance between Me-LLaMA models and their LLaMA2 backbone models
From: Medical foundation large language models for comprehensive text analysis and beyond
Dataset | Metric | LLaMA2 13B (backbone) | Me-LLaMA 13B (backbone + pre-train only) | LLaMA2 13B-instruct (backbone + instruction tuning only) | Me-LLaMA-13B-chat (backbone + pre-train + instruction tuning) | LLaMA2 70B (backbone) | Me-LLaMA 70B (backbone + pre-train only) | LLaMA2 70B-instruct (backbone + instruction tuning only) | Me-LLaMA-70B-chat (backbone + pre-train + instruction tuning) |
---|---|---|---|---|---|---|---|---|---|
PubMedQA | Acc | 0.216 | 0.266 | 0.436 | 0.700 | 0.132 | 0.682 | 0.764 | 0.768 |
PubMedQA | Macro-F1 | 0.177 | 0.250 | 0.416 | 0.504 | 0.152 | 0.520 | 0.531 | 0.557 |
MedQA | Acc | 0.000 | 0.000 | 0.013 | 0.427 | 0.005 | 0.281 | 0.499 | 0.523 |
MedQA | Macro-F1 | 0.000 | 0.000 | 0.024 | 0.422 | 0.009 | 0.350 | 0.493 | 0.521 |
MedMCQA | Acc | 0.003 | 0.003 | 0.014 | 0.449 | 0.012 | 0.447 | 0.501 | 0.539 |
MedMCQA | Macro-F1 | 0.006 | 0.005 | 0.029 | 0.440 | 0.024 | 0.396 | 0.493 | 0.538 |
EmrQA | Acc | 0.000 | 0.005 | 0.050 | 0.048 | 0.000 | 0.021 | 0.181 | 0.119 |
EmrQA | F1 | 0.038 | 0.122 | 0.286 | 0.307 | 0.000 | 0.172 | 0.399 | 0.346 |
i2b2 | Macro-F1 | 0.008 | 0.030 | 0.232 | 0.263 | 0.181 | 0.224 | 0.245 | 0.329 |
DDI | Macro-F1 | 0.035 | 0.036 | 0.164 | 0.214 | 0.034 | 0.118 | 0.121 | 0.283 |
HoC | Macro-F1 | 0.253 | 0.210 | 0.194 | 0.335 | 0.255 | 0.252 | 0.563 | 0.544 |
MTsample | Macro-F1 | 0.042 | 0.072 | 0.176 | 0.229 | 0.066 | 0.226 | 0.364 | 0.384 |
PubMed | ROUGE-L | 0.170 | 0.168 | 0.183 | 0.116 | 0.167 | 0.119 | 0.112 | 0.169 |
PubMed | BERTScore | 0.654 | 0.654 | 0.667 | 0.445 | 0.654 | 0.654 | 0.601 | 0.678 |
MIMIC-CXR | ROUGE-L | 0.051 | 0.172 | 0.360 | 0.400 | 0.059 | 0.137 | 0.367 | 0.418 |
MIMIC-CXR | BERTScore | 0.566 | 0.697 | 0.791 | 0.797 | 0.577 | 0.649 | 0.784 | 0.787 |
BioNLI | Macro-F1 | 0.109 | 0.060 | 0.185 | 0.195 | 0.285 | 0.499 | 0.345 | 0.436 |
MedNLI | Macro-F1 | 0.172 | 0.206 | 0.457 | 0.472 | 0.265 | 0.256 | 0.657 | 0.675 |