Table 17 Challenge category of each dataset after removal of the respective non-English language stopwords.
From: Key insights into recommended SMS spam detection datasets
| Dataset | MNB accuracy | Category of accuracy |
|---|---|---|
| Dataset 2 (Turkish) | 98.28% | High accuracy |
| Dataset 4 (Bengali) | 90.10% | Moderate accuracy |
| Dataset 6 (Hindi) | 96.75% | High accuracy |
| Dataset 7 (Persian) | 92.95% | Moderate accuracy |
| Dataset 8 (Indonesian) | 96.94% | High accuracy |
| Dataset 9 (Hindi) | 90.48% | Moderate accuracy |
| Dataset 10 (Hindi) | 86.49% | Low accuracy |