Table 10 Q&A task fine-tuning performance comparison (*-cpt denotes the continually pretrained model).

From: Localized large language model TCNNet 9B for Taiwanese networking and cybersecurity

| Models | BLEU-4 (before fine-tuning) | ROUGE-L (before fine-tuning) | BLEU-4 (after fine-tuning) | ROUGE-L (after fine-tuning) |
|---|---|---|---|---|
| Meta-Llama-3-8B | 4.983 | 10.903 | 5.210 | 11.112 |
| Meta-Llama-3-8B-cpt | 4.356 | 10.386 | 6.021 | 12.042 |
| Meta-Llama-3-8B-Instruct | 3.513 | 8.594 | 3.509 | 8.757 |
| Meta-Llama-3-8B-Instruct-cpt | 3.467 | 8.787 | 3.524 | 8.757 |
| Yi-1.5-9B | 6.943 | 13.082 | 7.532 | 13.534 |
| Yi-1.5-9B-cpt | 7.075 | 13.135 | 27.599 | 31.834 |
| Qwen2-7B | 12.694 | 20.122 | 14.774 | 23.491 |
| Qwen2-7B-cpt | 12.110 | 19.433 | 13.473 | 22.524 |
| Qwen2-7B-Instruct | 14.999 | 23.161 | 24.504 | 31.397 |
| Qwen2-7B-Instruct-cpt | 15.508 | 23.688 | 15.508 | 23.688 |
| glm-4-9b | 7.896 | 13.848 | 22.633 | 27.656 |
| glm-4-9b-cpt | 7.701 | 13.883 | 24.162 | 29.472 |
| glm-4-9b-chat | 9.565 | 16.571 | 25.382 | 30.021 |
| glm-4-9b-chat-cpt | 9.398 | 16.222 | 25.972 | 30.145 |