Table 5 Model performance on training set.

From: Machine learning detection of manipulative environmental disclosures in corporate reports

| Model | Accuracy | F1-Score | ROC-AUC | PR-AUC | Balanced accuracy | MCC | Variance reduced (%) | Overfitting indicator |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression (LR) | 0.9926 | 0.99 | 0.5 | 0.74 | 0.83 | 0.71 | Low | High |
| Decision Trees (DT) | 0.9927 | 0.99 | 0.5064 | 0.76 | 0.84 | 0.72 | Moderate | Moderate |
| Random Forest (RF) | 0.9967 | 0.99 | 0.7809 | 0.78 | 0.86 | 0.73 | High | Low |

  1. PR-AUC, Balanced Accuracy, and MCC were newly added to provide a more comprehensive assessment of training-set reliability.
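
To illustrate why balanced accuracy and MCC can be informative alongside plain accuracy, the minimal sketch below computes all three from confusion-matrix counts. It is not the authors' evaluation code: the function names and the counts (1,000 samples, 10 positives, a classifier that predicts only the negative class) are hypothetical and chosen purely for illustration.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of recall on the positive class and recall on the negative class."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return (tpr + tnr) / 2


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 by convention when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): an imbalanced set where the
# classifier labels every sample as negative.
counts = dict(tp=0, fp=0, tn=990, fn=10)

print("accuracy:", accuracy(**counts))
print("balanced accuracy:", balanced_accuracy(**counts))
print("MCC:", mcc(**counts))
```

On these made-up counts, accuracy is 0.99 while balanced accuracy is 0.5 and MCC is 0.0, which shows how a high accuracy figure alone can hide a model that never detects the minority class.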