Table 16 The category of challenges of each dataset with the removal of English-language stopwords.
From: Key insights into recommended SMS spam detection datasets
| Dataset | MNB accuracy | Category of accuracy |
|---|---|---|
| Dataset 1 (English) | 98.48% | High accuracy |
| Dataset 2 (Turkish) | 99.03% | High accuracy |
| Dataset 3 (English, French, and German) | 98.29% | High accuracy |
| Dataset 4 (Bengali) | 89.11% | Low accuracy |
| Dataset 5 (English) | 86.10% | Low accuracy |
| Dataset 6 (Hindi) | 96.00% | High accuracy |
| Dataset 7 (Persian) | 93.76% | Moderate accuracy |
| Dataset 8 (Indonesian) | 95.20% | High accuracy |
| Dataset 9 (Hindi) | 90.48% | Moderate accuracy |
| Dataset 10 (Hindi) | 83.78% | Low accuracy |
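
The table does not state the cut-offs behind the three categories. As a minimal sketch, the Python function below assumes thresholds of 95% (high) and 90% (moderate). These values are an assumption chosen only because they reproduce every row above; the true boundaries could lie anywhere between 93.76% and 95.20%, and between 89.11% and 90.48%. The names `accuracy_category`, `high`, and `moderate` are hypothetical.

```python
def accuracy_category(accuracy_pct, high=95.0, moderate=90.0):
    """Map an accuracy percentage to a category label.

    The default thresholds are assumptions inferred from the table rows,
    not values stated in the source.
    """
    if accuracy_pct >= high:
        return "High accuracy"
    if accuracy_pct >= moderate:
        return "Moderate accuracy"
    return "Low accuracy"


# Accuracy values copied from the table (percent).
mnb_accuracy = {
    "Dataset 1 (English)": 98.48,
    "Dataset 4 (Bengali)": 89.11,
    "Dataset 7 (Persian)": 93.76,
    "Dataset 8 (Indonesian)": 95.20,
    "Dataset 9 (Hindi)": 90.48,
    "Dataset 10 (Hindi)": 83.78,
}

for name, acc in mnb_accuracy.items():
    print(f"{name}: {acc:.2f}% -> {accuracy_category(acc)}")
```

Passing different `high` and `moderate` values lets the same function be reused if the source's actual cut-offs turn out to differ.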