Table 4 Ablation study on MMedB

From: Towards building multilingual language model for medicine

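Setting columns in the original header: English, Multilingual, HQ-Data, US-Data, QA Answer, Rationale (per-row marks not preserved).

| Group | Method | ACC | BLEU-1 | Rouge-1 |
|---|---|---|---|---|
| MMedLM | Baseline (InternLM) | 43.33 | / | / |
| MMedLM | Baseline (InternLM) | 45.66 | 42.12 | 39.68 |
| MMedLM | MMedLM_en | 47.38 | 42.12 | 39.83 |
| MMedLM | MMedLM_ocr | 51.33 | 44.46 | 41.37 |
| MMedLM | MMedLM | 55.01 | 45.05 | 41.84 |
| MMedLM 2 | Baseline (InternLM 2) | 56.17 | / | / |
| MMedLM 2 | Baseline (InternLM 2) | 58.59 | 46.52 | 42.86 |
| MMedLM 2 | MMedLM 2_en | 58.28 | 46.46 | 42.99 |
| MMedLM 2 | MMedLM 2_ocr | 64.43 | 47.95 | 44.57 |
| MMedLM 2 | MMedLM 2 | 67.30 | 48.81 | 45.29 |
| MMed-Llama 3 | Baseline (Llama 3) | 58.72 | / | / |
| MMed-Llama 3 | Baseline (Llama 3) | 62.79 | 46.76 | 42.84 |
| MMed-Llama 3 | MMed-Llama 3_en | 61.39 | 46.45 | 42.59 |
| MMed-Llama 3 | MMed-Llama 3_ocr | 64.40 | 46.93 | 43.13 |
| MMed-Llama 3 | MMed-Llama 3 | 67.75 | 47.22 | 43.29 |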

  1. We report ACC, BLEU, and Rouge scores to give an overall picture of the effect of each step. “/” denotes that the corresponding model cannot generate rationale sentences.
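
For readers unfamiliar with the three metrics named in the note, the following is a minimal sketch of how ACC, BLEU-1, and Rouge-1 are commonly computed. It is not the authors' evaluation code: the whitespace tokenization, the choice of the F1 variant of Rouge-1, and the function names `accuracy`, `bleu1`, and `rouge1_f` are all assumptions made for illustration. The functions return values in [0, 1], whereas the table reports scores on a 0 to 100 scale.

```python
from collections import Counter
import math


def accuracy(predictions, answers):
    """Fraction of predicted choices that exactly match the reference answers."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def _overlap(candidate_tokens, reference_tokens):
    """Clipped unigram overlap between two token lists."""
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    return sum(min(count, ref[token]) for token, count in cand.items())


def bleu1(candidate, reference):
    """Unigram precision multiplied by the standard BLEU brevity penalty.

    Assumption: whitespace tokenization, single reference.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    precision = _overlap(cand, ref) / len(cand)
    # The brevity penalty only applies when the candidate is not longer
    # than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * precision


def rouge1_f(candidate, reference):
    """Rouge-1 as the F1 of unigram precision and recall (assumed variant)."""
    cand, ref = candidate.split(), reference.split()
    overlap = _overlap(cand, ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the patient has fever"
    reference = "the patient has a high fever"
    print(accuracy(["A", "B", "C"], ["A", "B", "D"]))
    print(bleu1(generated, reference))
    print(rouge1_f(generated, reference))
```

Whitespace splitting would not be adequate for languages written without spaces, so a real multilingual evaluation would need a language-appropriate tokenizer in place of `str.split`.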