Table 1. F1-scores (F) for “defect” (+), “possible defect” (?), and “non-defect” (−) tweet classes for three classifiers trained on the original, under-sampled, and over-sampled data sets
From: Towards scaling Twitter for digital epidemiology of birth defects
Classifier | Training set | F (+) | F (?) | F (−) |
--- | --- | --- | --- | --- |
NB | Original, imbalanced training set (14,716) | 0.54 | 0.44 | 0.94 |
NB | Under-sampling based on similar majority class tweets in original training set (5551)a | 0.46 | 0.38 | 0.92 |
NB | Under-sampling based on similar false-negative majority class tweets (8015)b | 0.44 | 0.40 | 0.92 |
NB | Random under-sampling control set (5551)c | 0.50 | 0.43 | 0.93 |
NB | Random under-sampling control set (8015)c | 0.51 | 0.44 | 0.93 |
NB | Over-sampling instances of minority classes with replacement (40,675)d | 0.49 | 0.40 | 0.93 |
NB | SMOTE on original training set (39,148)e | 0.36 | 0.30 | 0.95 |
SVM | Original, imbalanced training set (14,716) | 0.62 | 0.52 | 0.96 |
SVM | Under-sampling based on similar majority class tweets in original training set (5551)a | 0.62 | 0.43 | 0.96 |
SVM | Under-sampling based on similar false-negative majority class tweets (8015)b | 0.58 | 0.51 | 0.95 |
SVM | Random under-sampling control set (5551)c | 0.62 | 0.49 | 0.96 |
SVM | Random under-sampling control set (8015)c | 0.62 | 0.50 | 0.96 |
SVM | Over-sampling instances of minority classes with replacement (40,675)d | 0.62 | 0.46 | 0.95 |
SVM | SMOTE on original training set (39,148)e | 0.62 | 0.51 | 0.96 |
LSTM | Original, imbalanced training set (14,716) | 0.60 | 0.35 | 0.96 |
LSTM | Under-sampling based on similar majority class tweets in original training set (5551)a | 0.55 | 0.33 | 0.91 |
LSTM | Under-sampling based on similar false-negative majority class tweets (8015)b | 0.48 | 0.36 | 0.90 |
LSTM | Random under-sampling control set (5551)c | 0.54 | 0.37 | 0.92 |
LSTM | Random under-sampling control set (8015)c | 0.59 | 0.45 | 0.95 |
LSTM | Over-sampling instances of minority classes with replacement (40,675)d | 0.55 | 0.45 | 0.95 |
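
For readers who want to reproduce per-class scores of the kind reported in the table, the following is a minimal sketch of how an F1-score can be computed separately for each of the three classes, using the standard definition F1 = 2PR / (P + R). It is not the authors' evaluation code. The function name `per_class_f1`, the ASCII label strings, and the toy label lists are assumptions made purely for illustration, and the toy labels are not data from the study.

```python
from collections import Counter


def per_class_f1(gold, predicted, classes=("+", "?", "-")):
    """Return {class: F1}, with F1 = 2PR / (P + R) computed per class.

    Hypothetical helper for illustration. Labels are plain strings;
    "-" (ASCII hyphen) stands in for the non-defect class.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # predicted class p, but it was wrong
            fn[g] += 1   # true class g was missed

    scores = {}
    for c in classes:
        predicted_c = tp[c] + fp[c]
        actual_c = tp[c] + fn[c]
        precision = tp[c] / predicted_c if predicted_c else 0.0
        recall = tp[c] / actual_c if actual_c else 0.0
        denom = precision + recall
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical toy labels, NOT data from the study.
gold = ["+", "+", "?", "-", "-", "-", "-", "?"]
pred = ["+", "-", "?", "-", "-", "-", "+", "-"]

scores = per_class_f1(gold, pred)
for label, f1 in scores.items():
    print(f"F ({label}) = {f1:.2f}")
```

Reporting one score per class, rather than a single accuracy figure, keeps performance on the small “defect” and “possible defect” classes visible even when the “non-defect” class dominates the data.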