Table 1 Comparison of publicly available language identification models with various intersections of labels

From: Scaling neural machine translation to 200 languages

  

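Label sets:

- 51 labels: FLORES-200 ∩ CLD3 ∩ LangId ∩ LangDetect
- 78 labels: FLORES-200 ∩ CLD3 ∩ LangId
- 95 labels: FLORES-200 ∩ CLD3

| Model      | No. of supported languages | F1 (51 labels) | FPR (51 labels) | F1 (78 labels) | FPR (78 labels) | F1 (95 labels) | FPR (95 labels) |
|------------|----------------------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| LangDetect | 55                         | 97.3           | 0.0526          | 64.4           | 0.4503          | 53.1           | 0.4881          |
| LangId     | 97                         | 98.6           | 0.0200          | 92.0           | 0.0874          | 75.8           | 0.2196          |
| CLD3       | 107                        | 98.2           | 0.0225          | 97.7           | 0.0238          | 97.0           | 0.0283          |
| Ours       | 218                        | 99.4           | 0.0084          | 98.8           | 0.0133          | 98.5           | 0.0134          |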

  1. F1 is the micro-F1 score, and FPR is the micro-false-positive rate.
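
For readers unfamiliar with micro-averaging, the following is a minimal sketch of how a micro-F1 score and a micro-false-positive rate are commonly computed for single-label language identification. It is not the authors' evaluation code: the function name `micro_f1_and_fpr`, the toy labels, and the one-vs-rest counting scheme are assumptions made for illustration. Counts are pooled over all labels before the ratios are taken, which is what distinguishes micro-averaging from macro-averaging.

```python
def micro_f1_and_fpr(gold, pred, labels):
    """Return (micro-F1, micro-FPR) as fractions in [0, 1].

    Hypothetical helper for illustration only. Each label is treated
    one-vs-rest, and TP/FP/FN/TN are summed across labels before
    computing the ratios.
    """
    tp = fp = fn = tn = 0
    for label in labels:
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1  # predicted this label, and it is correct
            elif p == label:
                fp += 1  # predicted this label, but gold is another one
            elif g == label:
                fn += 1  # gold is this label, but it was missed
            else:
                tn += 1  # neither gold nor prediction is this label

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


# Toy example: four sentences, three candidate languages.
gold = ["eng", "fra", "deu", "eng"]
pred = ["eng", "fra", "eng", "eng"]
f1, fpr = micro_f1_and_fpr(gold, pred, ["eng", "fra", "deu"])
print(f"micro-F1 = {f1:.3f}, micro-FPR = {fpr:.3f}")
```

In the toy example, one German sentence is misclassified as English, which contributes one false positive (for `eng`) and one false negative (for `deu`) to the pooled counts.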