Table 9 Performance of baseline models on the Measuring Hate Speech dataset

From: Adaptive ensemble techniques leveraging BERT based models for multilingual hate speech detection in Korean and English

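| Methodology | Macro Precision | Macro Recall | Macro F1-score | Weighted Precision | Weighted Recall | Weighted F1-score | Accuracy |
|---|---|---|---|---|---|---|---|
| RF | 0.75 | 0.73 | 0.74 | 0.76 | 0.77 | 0.76 | 0.77 |
| NB | 0.59 | 0.57 | 0.56 | 0.61 | 0.63 | 0.61 | 0.63 |
| SVM | 0.76 | 0.67 | 0.67 | 0.75 | 0.74 | 0.71 | 0.74 |
| LR | 0.32 | 0.50 | 0.39 | 0.41 | 0.64 | 0.50 | 0.64 |
| mBERT | 0.82 | 0.84 | 0.83 | 0.84 | 0.84 | 0.84 | 0.84 |
| DistilmBERT | 0.83 | 0.84 | 0.83 | 0.84 | 0.84 | 0.84 | 0.84 |
| xlm-RoBERTa | 0.82 | 0.83 | 0.82 | 0.84 | 0.83 | 0.83 | 0.83 |
| BERT | 0.83 | 0.84 | 0.83 | 0.85 | 0.84 | 0.84 | 0.84 |
| DistilBERT | 0.84 | 0.82 | 0.83 | 0.84 | 0.85 | 0.84 | 0.85 |
| RoBERTa | 0.83 | 0.84 | 0.83 | 0.85 | 0.84 | 0.85 | 0.84 |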
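
The Macro and Weighted columns differ only in how per-class scores are averaged: macro averaging gives every class equal weight, while weighted averaging weights each class by its support (its number of true examples). The sketch below illustrates the two schemes on a small set of hypothetical labels. The toy data and the helper names (`per_class_scores`, `averaged`, `accuracy`) are assumptions made for illustration and are not taken from the paper or the Measuring Hate Speech dataset. It also shows why, in every row of the table, weighted recall matches accuracy: for single-label classification the two quantities are equal by definition.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, tp + fn)  # support = tp + fn
    return scores


def averaged(scores, weighted):
    """Average (precision, recall, f1) across classes.

    weighted=False: macro average, every class counts equally.
    weighted=True:  each class is weighted by its support.
    """
    total_support = sum(s[3] for s in scores.values())
    result = []
    for i in range(3):
        if weighted:
            value = sum(s[i] * s[3] for s in scores.values()) / total_support
        else:
            value = sum(s[i] for s in scores.values()) / len(scores)
        result.append(value)
    return tuple(result)


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical, imbalanced toy labels (0 = not hate, 1 = hate); not real data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

scores = per_class_scores(y_true, y_pred)
macro = averaged(scores, weighted=False)
weighted = averaged(scores, weighted=True)
acc = accuracy(y_true, y_pred)

print("macro    P/R/F1:", [round(v, 2) for v in macro])
print("weighted P/R/F1:", [round(v, 2) for v in weighted])
print("accuracy:", round(acc, 2))
```

On imbalanced data the macro scores fall below the weighted ones whenever the minority class is handled worse than the majority class, which is one plausible reading of the gap between the two column groups for the classical baselines (for example LR).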