Table 2 Performance of lightweight models across biomedical tasks with different configurations, including architectures from the LLaMA, Qwen, and DeepSeek families.

From: Towards scalable and cross-lingual specialist language models for oncology

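| Type | Model | NCBI-Disease (NER) | BC5CDR-Disease (NER) | BC5CDR-Chem (NER) | BC2GM (NER) | JNLPBA (NER) | i2b2-2012 (NER) | i2b2-2010 (RE) | MedNLI (NLI) |
|---|---|---|---|---|---|---|---|---|---|
| Base LLM | LLaMA-3.1-8B | 86.3 | 83.8 | 93.4 | 79.9 | 79.7 | 80.5 | 90.8 | 88.0 |
| Base LLM | LLaMA-3.2-3B | 83.5 | 82.8 | 92.2 | 78.9 | 79.1 | 79.9 | 89.8 | 86.6 |
| Base LLM | Qwen3-8B | 86.0 | 83.7 | 93.2 | 80.0 | 79.5 | 80.5 | 90.5 | 87.9 |
| Base LLM | Qwen3-1.7B | 84.0 | 82.0 | 92.0 | 78.5 | 79.5 | 79.5 | 89.0 | 86.0 |
| Base LLM | DeepSeek-LLM-7B | 85.0 | 83.0 | 92.5 | 79.0 | 79.8 | 79.9 | 90.0 | 87.5 |
| Instruction-Tuned | LLaMA-3.1-8B | 89.5 | 87.6 | 94.8 | 84.41 | 83.6 | 81.92 | 93.2 | 90.5 |
| Instruction-Tuned | LLaMA-3.2-3B | 85.4 | 86.0 | 93.2 | 81.7 | 81.9 | 80.8 | 92.5 | 89.8 |
| Instruction-Tuned | Qwen3-8B | 89.1 | 87.4 | 94.4 | 83.8 | 83.4 | 81.5 | 93.0 | 90.4 |
| Instruction-Tuned | Qwen3-1.7B | 85.5 | 85.0 | 92.8 | 80.5 | 81.0 | 79.8 | 91.0 | 88.5 |
| Instruction-Tuned | DeepSeek-LLM-7B | 86.8 | 85.5 | 93.1 | 81.5 | 81.5 | 80.7 | 92.1 | 88.7 |
| +RAG | LLaMA-3.1-8B | 88.8 | 87.5 | 94.7 | 84.7 | 83.0 | 81.2 | 92.9 | 91.0 |
| +RAG | LLaMA-3.2-3B | 85.4 | 86.0 | 93.1 | 82.3 | 81.9 | 80.7 | 91.0 | 90.6 |
| +RAG | Qwen3-8B | 88.6 | 87.0 | 94.6 | 84.1 | 83.1 | 81.3 | 92.8 | 91.0 |
| +RAG | Qwen3-1.7B | 85.0 | 85.5 | 93.0 | 81.0 | 81.7 | 80.5 | 92.0 | 89.5 |
| +RAG | DeepSeek-LLM-7B | 86.5 | 85.8 | 93.2 | 82.0 | 81.7 | 80.3 | 91.8 | 89.5 |
| +Graph-RAG | LLaMA-3.1-8B | 88.7 | 87.3 | 94.4 | 84.8 | 83.5 | 81.9 | 93.5 | 91.8 |
| +Graph-RAG | LLaMA-3.2-3B | 87.37 | 86.53 | 93.90 | 83.59 | 82.09 | 80.26 | 92.57 | 90.58 |
| +Graph-RAG | Qwen3-8B | 88.7 | 87.1 | 94.6 | 84.5 | 83.5 | 81.7 | 93.6 | 91.5 |
| +Graph-RAG | Qwen3-1.7B | 86.2 | 85.0 | 93.0 | 81.5 | 81.5 | 80.0 | 92.0 | 89.5 |
| +Graph-RAG | DeepSeek-LLM-7B | 87.0 | 85.8 | 93.1 | 82.2 | 81.8 | 80.3 | 92.4 | 90.0 |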