Table 5. Ablation study results: BLEU scores on the Assamese–English and Bodo–English test sets, and inference speed on an RTX 3060.
| Model Variant | BLEU (Assamese–English) | BLEU (Bodo–English) | Inference Speed (tokens/sec) |
|---|---|---|---|
| Student w/o KD | 25.2 | 21.1 | 145 |
| Student w/ KD (\(\lambda = 0.1\)) | 26.7 | 22.6 | 142 |
| Student w/ KD (\(\lambda = 0.3\)) | 27.9 | 23.8 | 140 |
| Student w/ KD (\(\lambda = 0.5\)) | 27.4 | 23.3 | 138 |
| Student (Dense, \(\lambda = 0.3\)) | 26.8 | 22.4 | 145 |
| Student (Hybrid MoE, \(\lambda = 0.3\)) | 27.9 | 23.8 | 160 |
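
The ablation is easier to read as differences against the relevant baseline. The following is a minimal sketch that computes those differences from the values reported in Table 5. The dictionary keys and the helper name `delta` are illustrative and do not come from the original work.

```python
# Values copied from Table 5:
# (BLEU Assamese-English, BLEU Bodo-English, inference speed in tokens/sec).
# The keys and the helper `delta` are illustrative names only.
rows = {
    "no_kd": (25.2, 21.1, 145),
    "kd_0.1": (26.7, 22.6, 142),
    "kd_0.3": (27.9, 23.8, 140),
    "kd_0.5": (27.4, 23.3, 138),
    "dense_0.3": (26.8, 22.4, 145),
    "moe_0.3": (27.9, 23.8, 160),
}


def delta(variant, baseline):
    """Return (BLEU gain on Assamese-English, BLEU gain on Bodo-English,
    relative speed change in percent) of `variant` over `baseline`."""
    v, b = rows[variant], rows[baseline]
    return (
        round(v[0] - b[0], 1),
        round(v[1] - b[1], 1),
        round(100.0 * (v[2] - b[2]) / b[2], 1),
    )


# Effect of knowledge distillation at lambda = 0.3 versus no distillation.
kd_gain = delta("kd_0.3", "no_kd")
# Effect of the hybrid MoE student versus the dense student, both at lambda = 0.3.
moe_gain = delta("moe_0.3", "dense_0.3")
print(kd_gain, moe_gain)
```

On these rows, distillation at \(\lambda = 0.3\) adds 2.7 BLEU on both language pairs at about 3.4% lower throughput, and the hybrid MoE student gains 1.1 and 1.4 BLEU over the dense student while running about 10.3% faster.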