Table 2 Out-of-sample classification performance.

From: Computer-assisted classification of contrarian claims about climate change

 

Scores on the validation set (noisy) and the test set (noise free):

| Model | Val. precision | Val. recall | Val. F1 | Test precision | Test recall | Test F1 |
|---|---|---|---|---|---|---|
| Logistic (unweighted) | 0.71 | 0.55 | 0.62 | 0.83 | 0.57 | 0.68 |
| Logistic (weighted) | 0.62 | 0.68 | 0.65 | 0.75 | 0.70 | 0.72 |
| SVM (unweighted) | 0.66 | 0.56 | 0.61 | 0.77 | 0.58 | 0.66 |
| SVM (weighted) | 0.60 | 0.68 | 0.64 | 0.74 | 0.70 | 0.72 |
| ULMFiT | 0.69 | 0.69 | 0.69 | 0.77 | 0.67 | 0.72 |
| ULMFiT (weighted) | 0.66 | 0.60 | 0.62 | 0.76 | 0.60 | 0.65 |
| ULMFiT (oversampled) | 0.41 | 0.73 | 0.50 | 0.46 | 0.75 | 0.55 |
| ULMFiT (focal loss) | 0.66 | 0.58 | 0.60 | 0.73 | 0.56 | 0.61 |
| ULMFiT-logistic | 0.71 | 0.70 | 0.70 | 0.77 | 0.72 | 0.75 |
| ULMFiT-SVM | 0.74 | 0.65 | 0.70 | 0.81 | 0.63 | 0.71 |
| RoBERTa | 0.75 | 0.77 | 0.76 | 0.82 | 0.75 | 0.77 |
| RoBERTa-logistic | 0.76 | 0.77 | 0.76 | 0.83 | 0.75 | 0.79 |

  1. The table reports macro-averaged precision, recall, and F1 scores to compare model fit across "shallow" descriptive classifiers and "deep" transfer-learning architectures. Logistic (unweighted): a logistic regression classifier using TF-IDF-weighted features, optimized via grid search. Logistic (weighted): a logistic regression classifier using TF-IDF-weighted features and class-imbalance weighting, optimized via grid search. SVM (unweighted): a linear support vector machine classifier using TF-IDF-weighted features, optimized via grid search. SVM (weighted): a linear support vector machine classifier using TF-IDF-weighted features and class-imbalance weighting, optimized via grid search. ULMFiT models: we start from a language model pre-trained on the Wiki-103 corpus. First, we fine-tuned the pre-trained language model using our training set (n = 23,436) and a large random sample (n = 100,000) of unannotated blog and CTT paragraphs. Second, we trained the classification model using the training and validation sets described above. Given the observed class imbalance, we examined four variations of the ULMFiT architecture: a model that (1) ignores class imbalance; (2) oversamples each minibatch to adjust for class imbalance; (3) weights the loss function for class imbalance following the "balanced" procedure used in the scikit-learn library; and (4) uses a focal loss function. RoBERTa models: see discussion in Methods.
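To make the "macro-averaged" scoring concrete: each class gets its own precision, recall, and F1, and the per-class scores are averaged with equal weight regardless of class frequency. A minimal sketch with toy labels (not the paper's data):

```python
# Macro averaging: per-class precision/recall/F1, averaged with equal
# class weight. The label vectors below are illustrative toy values.
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
print(round(p, 2), round(r, 2), round(f1, 2))
```

Because every class counts equally, macro averaging penalizes a model that does well only on the majority class, which is why it suits the imbalanced claim categories here.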
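The "Logistic (weighted)" baseline can be sketched as a scikit-learn pipeline: TF-IDF features, `class_weight="balanced"` for the imbalance correction, and hyperparameters chosen by grid search. The toy texts, labels, and `C` grid below are illustrative assumptions, not the paper's actual data or search space:

```python
# Sketch of a TF-IDF + class-weighted logistic regression baseline,
# tuned by grid search. Texts/labels and the C grid are toy assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

texts = [
    "warming is driven by human emissions",
    "the climate has always changed naturally",
    "models overstate future warming",
    "co2 is a greenhouse gas",
]
labels = [0, 1, 1, 0]  # 1 = contrarian claim (toy labels)

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Grid search over regularization strength, scored by macro F1.
grid = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]},
                    cv=2, scoring="f1_macro")
grid.fit(texts, labels)
preds = grid.predict(texts)
print(grid.best_params_)
```

The unweighted variant is the same pipeline with `class_weight=None`; the SVM variants substitute `LinearSVC` for the logistic regression step.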
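The focal loss variant (4) down-weights the contribution of well-classified examples so training focuses on hard, often minority-class cases. A minimal binary sketch of the standard formulation FL(p_t) = -(1 - p_t)^gamma * log(p_t); gamma = 2.0 is a common default, not necessarily the paper's setting:

```python
# Minimal focal loss sketch: easy (high-confidence correct) examples
# are down-weighted by the (1 - p_t)^gamma factor. gamma=2.0 is an
# assumed default, not taken from the paper.
import numpy as np

def focal_loss(probs, targets, gamma=2.0):
    """Mean focal loss for binary targets, given predicted P(y=1)."""
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    p_t = np.where(targets == 1, probs, 1 - probs)  # prob. of true class
    return float(np.mean(-((1 - p_t) ** gamma) * np.log(p_t)))

# A confident correct prediction contributes far less than a wrong one.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.10]), np.array([1]))
print(easy < hard)  # True
```

At gamma = 0 this reduces to ordinary cross-entropy; larger gamma shifts more of the gradient toward misclassified examples.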