Table 10 Q&A task finetuning performance comparison (*-cpt denotes the continually pretrained model).
From: Localized large language model TCNNet 9B for Taiwanese networking and cybersecurity
| Models | BLEU-4 (before finetuning) | ROUGE-L (before finetuning) | BLEU-4 (after finetuning) | ROUGE-L (after finetuning) |
|---|---|---|---|---|
| Meta-Llama-3-8B | 4.983 | 10.903 | 5.210 | 11.112 |
| Meta-Llama-3-8B-cpt | 4.356 | 10.386 | 6.021 | 12.042 |
| Meta-Llama-3-8B-Instruct | 3.513 | 8.594 | 3.509 | 8.757 |
| Meta-Llama-3-8B-Instruct-cpt | 3.467 | 8.787 | 3.524 | 8.757 |
| Yi-1.5-9B | 6.943 | 13.082 | 7.532 | 13.534 |
| Yi-1.5-9B-cpt | 7.075 | 13.135 | 27.599 | 31.834 |
| Qwen2-7B | 12.694 | 20.122 | 14.774 | 23.491 |
| Qwen2-7B-cpt | 12.110 | 19.433 | 13.473 | 22.524 |
| Qwen2-7B-Instruct | 14.999 | 23.161 | 24.504 | 31.397 |
| Qwen2-7B-Instruct-cpt | 15.508 | 23.688 | 15.508 | 23.688 |
| glm-4-9b | 7.896 | 13.848 | 22.633 | 27.656 |
| glm-4-9b-cpt | 7.701 | 13.883 | 24.162 | 29.472 |
| glm-4-9b-chat | 9.565 | 16.571 | 25.382 | 30.021 |
| glm-4-9b-chat-cpt | 9.398 | 16.222 | 25.972 | 30.145 |
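The table's BLEU-4 and ROUGE-L figures would normally come from standard evaluation toolkits; the source does not specify which implementation was used. As a rough illustration of what the two metrics measure, the following is a minimal pure-Python sketch of sentence-level versions (simplified assumptions: whitespace tokenization, single reference, no smoothing), not the paper's actual evaluation code:

```python
from collections import Counter
import math


def bleu4(candidate: str, reference: str) -> float:
    """Simplified sentence-level BLEU-4: geometric mean of modified
    1- to 4-gram precisions, multiplied by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, 5):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((cand_ngrams & ref_ngrams).values())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0  # no smoothing: any zero precision zeroes the score
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)


def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: based on the longest common subsequence (LCS) of tokens."""
    a, b = candidate.split(), reference.split()
    # Dynamic-programming table for LCS length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(a), lcs / len(b)
    return 2 * precision * recall / (precision + recall)
```

A perfect match scores 1.0 on both metrics; the table reports such scores scaled to 0–100.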