Table 11 Ablation study results on the 7B model series
| Configuration | Tones | Rhymes | Antithesis | Length | Total |
|---|---|---|---|---|---|
| Qwen2.5-7B-Instruct (SFT only) | 75.93 | 61.48 | 89.88 | 94.33 | 76.22 |
| Qwen2.5-7B-Instruct (GRPO only) | 69.67 | 63.27 | 85.53 | 81.32 | 72.09 |
| Qwen2.5-7B-Instruct (SFT + GRPO) | 63.54 | 50.71 | 80.83 | 75.35 | 64.33 |
| Qwen2.5-7B-Instruct (GRPO + RAG) | 75.92 | 75.60 | 90.08 | 91.03 | 80.17 |
| Qwen2.5-7B-Instruct (SFT + RAG) | 66.64 | 69.61 | 83.45 | 77.23 | 71.95 |
| Qwen2.5-7B-Instruct (SFT + GRPO + RAG) | 62.61 | 74.68 | 81.39 | 75.30 | 71.26 |