Table 1 Comparison of publicly available language identification models with various intersections of labels
Label sets: 51 labels = FLORES-200 ∩ CLD3 ∩ LangId ∩ LangDetect; 78 labels = FLORES-200 ∩ CLD3 ∩ LangId; 95 labels = FLORES-200 ∩ CLD3.

Model | No. of supported languages | F1 (51 labels) | FPR (51 labels) | F1 (78 labels) | FPR (78 labels) | F1 (95 labels) | FPR (95 labels) |
|---|---|---|---|---|---|---|---|
LangDetect | 55 | 97.3 | 0.0526 | 64.4 | 0.4503 | 53.1 | 0.4881 |
LangId | 97 | 98.6 | 0.0200 | 92.0 | 0.0874 | 75.8 | 0.2196 |
CLD3 | 107 | 98.2 | 0.0225 | 97.7 | 0.0238 | 97.0 | 0.0283 |
Ours | 218 | 99.4 | 0.0084 | 98.8 | 0.0133 | 98.5 | 0.0134 |
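
To make the evaluation setup concrete, the following is a minimal sketch of how F1 and FPR could be computed on an intersection of label sets. It is an illustration under stated assumptions, not the procedure used to produce Table 1: the table does not say how scores are averaged, so macro averaging over labels is assumed here, and the function name `macro_f1_and_fpr`, the toy label sets, and the toy predictions are all hypothetical.

```python
from collections import Counter


def macro_f1_and_fpr(gold, pred, labels):
    """Macro-averaged F1 and false positive rate over `labels`.

    Assumption: only examples whose gold label is in `labels` are scored,
    and a prediction outside `labels` counts as a miss for the gold label.
    """
    labels = set(labels)
    pairs = [(g, p) for g, p in zip(gold, pred) if g in labels]
    n = len(pairs)

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in pairs:
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1
            if p in labels:
                fp[p] += 1

    f1s, fprs = [], []
    for lab in sorted(labels):
        denom = 2 * tp[lab] + fp[lab] + fn[lab]
        f1s.append(2 * tp[lab] / denom if denom else 0.0)
        # Negatives for a label are the scored examples with another gold label.
        negatives = n - (tp[lab] + fn[lab])
        fprs.append(fp[lab] / negatives if negatives else 0.0)

    return sum(f1s) / len(f1s), sum(fprs) / len(fprs)


# Hypothetical label sets standing in for the real ones.
flores_labels = {"eng", "fra", "deu", "zul"}
cld3_labels = {"eng", "fra", "deu"}
langid_labels = {"eng", "fra"}

# Evaluate only on labels that every compared model supports.
shared = flores_labels & cld3_labels & langid_labels

gold = ["eng", "eng", "fra", "fra", "deu"]
pred = ["eng", "fra", "fra", "fra", "deu"]

f1, fpr = macro_f1_and_fpr(gold, pred, shared)
print(f"macro F1 = {f1:.4f}, macro FPR = {fpr:.4f}")
```

Restricting the gold examples to the shared labels is what lets models with different coverage be compared on equal footing; as more models are added to the intersection, the shared label set shrinks, which is consistent with the label count falling from 95 to 51 across the column groups.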