Table 5. Ablation study results: BLEU scores on the Assamese–English and Bodo–English test sets, and inference speed on an RTX 3060.

From: Cross-lingual sparse-MoE distillation for efficient low-resource Assamese–English and Bodo–English translation

| Model Variant | BLEU (Assamese–English) | BLEU (Bodo–English) | Inference Speed (tokens/sec) |
| --- | --- | --- | --- |
| Student w/o KD | 25.2 | 21.1 | 145 |
| Student w/ KD (\(\lambda\)=0.1) | 26.7 | 22.6 | 142 |
| Student w/ KD (\(\lambda\)=0.3) | 27.9 | 23.8 | 140 |
| Student w/ KD (\(\lambda\)=0.5) | 27.4 | 23.3 | 138 |
| Student (Dense, \(\lambda\)=0.3) | 26.8 | 22.4 | 145 |
| Student (Hybrid MoE, \(\lambda\)=0.3) | 27.9 | 23.8 | 160 |
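
To make the ablation easier to read, the table values can be expressed as differences from the "Student w/o KD" row. The sketch below is only an illustration: the numbers are transcribed from Table 5, while the names `ROWS`, `BASELINE`, and `deltas_vs_baseline` are hypothetical and do not come from the paper.

```python
# Values transcribed from Table 5: (BLEU As-En, BLEU Bodo-En, tokens/sec).
# All identifiers here are illustrative assumptions, not the authors' code.
ROWS = {
    "Student w/o KD": (25.2, 21.1, 145),
    "Student w/ KD (lambda=0.1)": (26.7, 22.6, 142),
    "Student w/ KD (lambda=0.3)": (27.9, 23.8, 140),
    "Student w/ KD (lambda=0.5)": (27.4, 23.3, 138),
    "Student (Dense, lambda=0.3)": (26.8, 22.4, 145),
    "Student (Hybrid MoE, lambda=0.3)": (27.9, 23.8, 160),
}
BASELINE = "Student w/o KD"


def deltas_vs_baseline(rows, baseline=BASELINE):
    """Return per-variant (BLEU gain As-En, BLEU gain Bodo-En, speed change in %)."""
    base_as, base_bo, base_speed = rows[baseline]
    out = {}
    for name, (as_bleu, bo_bleu, speed) in rows.items():
        out[name] = (
            round(as_bleu - base_as, 1),   # absolute BLEU difference
            round(bo_bleu - base_bo, 1),   # absolute BLEU difference
            round(100 * (speed - base_speed) / base_speed, 1),  # relative speed change
        )
    return out


if __name__ == "__main__":
    for name, (d_as, d_bo, d_speed) in deltas_vs_baseline(ROWS).items():
        print(f"{name:34s} {d_as:+.1f} {d_bo:+.1f} {d_speed:+.1f}%")
```

Read this way, the \(\lambda\)=0.3 setting gives the largest BLEU gain over the no-KD student on both language pairs, and the Hybrid MoE row matches that BLEU while being the only variant faster than the baseline.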