Table 1 Best classification score achieved using a Logistic Regression Classifier with L2 regularization.

From: Validation of Twitter opinion trends with national polling aggregates: Hillary Clinton vs Donald Trump

 

| | F1 | AUROC | Accuracy | Precision | Recall |
|---|---|---|---|---|---|
| Initial set | 0.73 | 0.81 | 0.71 | 0.72 | 0.73 |
| Final set | 0.81 | 0.89 | 0.81 | 0.81 | 0.81 |
| Final set - out-of-sample | 0.79 | 0.72 | 0.79 | 0.79 | |

  1. For the training set obtained with the final set of hashtags, classification scores are computed with 10-fold cross-validation. For the training set obtained with the initial set of hashtags, classification scores are computed on the tweets that belong to the final set but were not used to train the classifier. For F1, Precision and Recall, the reported value is the average of the two scores obtained by taking each class in turn as the positive class. The out-of-sample scores are computed on a random sample of 500 manually annotated tweets.
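
The class-averaged F1, Precision and Recall described in the note can be illustrated with a minimal pure-Python sketch. The function names, the label strings and the toy data are assumptions made for illustration only; they are not taken from the article and do not reproduce its scores.

```python
def prf_for_positive(y_true, y_pred, positive):
    """Precision, recall and F1 when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def class_averaged_scores(y_true, y_pred, classes):
    """Average precision, recall and F1 over the classes, each taken as positive in turn."""
    per_class = [prf_for_positive(y_true, y_pred, c) for c in classes]
    n = len(per_class)
    return tuple(sum(scores[i] for scores in per_class) / n for i in range(3))


# Hypothetical toy labels (not data from the article).
y_true = ["clinton", "clinton", "clinton", "trump", "trump", "trump"]
y_pred = ["clinton", "clinton", "trump", "trump", "trump", "clinton"]

precision, recall, f1 = class_averaged_scores(y_true, y_pred, ["clinton", "trump"])
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With two classes, this averaging corresponds to what is commonly called macro-averaging: each class contributes equally to the reported score regardless of how many tweets it contains.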