Table 3 Comparison on FLORES-101 devtest

Model	eng_Latn-xx	xx-eng_Latn	xx-yy	Average
87 languages
M2M-100	–/–	–/–	–/–	13.6/–
Deepnet	–/–	–/–	–/–	18.6/–
NLLB-200	35.4/52.1	42.4/62.1	25.2/43.2	25.5/43.5
101 languages
DeltaLM	26.6/–	33.2/–	16.4/–	16.7/–
NLLB-200	34.0/50.6	41.2/60.9	23.7/41.4	24.0/41.7

We evaluate on all FLORES-101 translation directions (roughly 10,000). Each cell reports spBLEU/chrF++; a dash (–) marks a score that is not available. Scores for DeltaLM are taken from the FLORES-101 leaderboard. The M2M-100 and Deepnet averages cover only the 87 languages that overlap with FLORES-101, so NLLB-200 is also evaluated on that subset for comparison. The highest score in each column within each language grouping is shown in bold.
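The Average column sits much closer to the xx-yy column than to a plain mean of the three direction columns. This is what one would expect if it is a mean over all translation directions, most of which do not involve English. The sketch below checks that reading against the NLLB-200 spBLEU rows. It assumes that English is one of the languages in each group and that every ordered language pair counts once; the function name is hypothetical and is not taken from the paper.

```python
def direction_weighted_average(n_languages, eng_to_xx, xx_to_eng, xx_to_yy):
    """Average a score over all ordered pairs among n_languages.

    Assumption: English is one of the n_languages, so each of the
    eng->xx and xx->eng columns covers (n_languages - 1) directions.
    """
    n_other = n_languages - 1                 # non-English languages
    n_eng_dirs = n_other                      # eng->xx (and, separately, xx->eng)
    n_pair_dirs = n_other * (n_other - 1)     # ordered non-English pairs
    total = 2 * n_eng_dirs + n_pair_dirs      # equals n_languages * (n_languages - 1)
    weighted = n_eng_dirs * (eng_to_xx + xx_to_eng) + n_pair_dirs * xx_to_yy
    return weighted / total


# NLLB-200 spBLEU values copied from the table rows
print(round(direction_weighted_average(101, 34.0, 41.2, 23.7), 1))  # 24.0
print(round(direction_weighted_average(87, 35.4, 42.4, 25.2), 1))   # 25.5
```

Under these assumptions the 101-language group has 101 × 100 = 10,100 directions, of which 9,900 are non-English pairs, and the recomputed values agree with the reported averages to one decimal place.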
