Table 2. Performance of lightweight models across biomedical tasks with different configurations, including architectures from the LLaMA, Qwen, and DeepSeek families.
From: Towards scalable and cross-lingual specialist language models for oncology
| Type | Model | NCBI-Disease (NER) | BC5CDR-Disease (NER) | BC5CDR-Chem (NER) | BC2GM (NER) | JNLPBA (NER) | i2b2-2012 (NER) | i2b2-2010 (RE) | MedNLI (NLI) |
|---|---|---|---|---|---|---|---|---|---|
| Base LLM | LLaMA-3.1-8B | 86.3 | 83.8 | 93.4 | 79.9 | 79.7 | 80.5 | 90.8 | 88.0 |
| Base LLM | LLaMA-3.2-3B | 83.5 | 82.8 | 92.2 | 78.9 | 79.1 | 79.9 | 89.8 | 86.6 |
| Base LLM | Qwen3-8B | 86.0 | 83.7 | 93.2 | 80.0 | 79.5 | 80.5 | 90.5 | 87.9 |
| Base LLM | Qwen3-1.7B | 84.0 | 82.0 | 92.0 | 78.5 | 79.5 | 79.5 | 89.0 | 86.0 |
| Base LLM | DeepSeek-LLM-7B | 85.0 | 83.0 | 92.5 | 79.0 | 79.8 | 79.9 | 90.0 | 87.5 |
| Instruction-Tuned | LLaMA-3.1-8B | 89.5 | 87.6 | 94.8 | 84.41 | 83.6 | 81.92 | 93.2 | 90.5 |
| Instruction-Tuned | LLaMA-3.2-3B | 85.4 | 86.0 | 93.2 | 81.7 | 81.9 | 80.8 | 92.5 | 89.8 |
| Instruction-Tuned | Qwen3-8B | 89.1 | 87.4 | 94.4 | 83.8 | 83.4 | 81.5 | 93.0 | 90.4 |
| Instruction-Tuned | Qwen3-1.7B | 85.5 | 85.0 | 92.8 | 80.5 | 81.0 | 79.8 | 91.0 | 88.5 |
| Instruction-Tuned | DeepSeek-LLM-7B | 86.8 | 85.5 | 93.1 | 81.5 | 81.5 | 80.7 | 92.1 | 88.7 |
| +RAG | LLaMA-3.1-8B | 88.8 | 87.5 | 94.7 | 84.7 | 83.0 | 81.2 | 92.9 | 91.0 |
| +RAG | LLaMA-3.2-3B | 85.4 | 86.0 | 93.1 | 82.3 | 81.9 | 80.7 | 91.0 | 90.6 |
| +RAG | Qwen3-8B | 88.6 | 87.0 | 94.6 | 84.1 | 83.1 | 81.3 | 92.8 | 91.0 |
| +RAG | Qwen3-1.7B | 85.0 | 85.5 | 93.0 | 81.0 | 81.7 | 80.5 | 92.0 | 89.5 |
| +RAG | DeepSeek-LLM-7B | 86.5 | 85.8 | 93.2 | 82.0 | 81.7 | 80.3 | 91.8 | 89.5 |
| +Graph-RAG | LLaMA-3.1-8B | 88.7 | 87.3 | 94.4 | 84.8 | 83.5 | 81.9 | 93.5 | 91.8 |
| +Graph-RAG | LLaMA-3.2-3B | 87.37 | 86.53 | 93.90 | 83.59 | 82.09 | 80.26 | 92.57 | 90.58 |
| +Graph-RAG | Qwen3-8B | 88.7 | 87.1 | 94.6 | 84.5 | 83.5 | 81.7 | 93.6 | 91.5 |
| +Graph-RAG | Qwen3-1.7B | 86.2 | 85.0 | 93.0 | 81.5 | 81.5 | 80.0 | 92.0 | 89.5 |
| +Graph-RAG | DeepSeek-LLM-7B | 87.0 | 85.8 | 93.1 | 82.2 | 81.8 | 80.3 | 92.4 | 90.0 |