Table 3 Comparison of models (n = 212).
Methods | Three mainstream variable encoding methods | | | Four data imbalance processing methods | | | | Four ML models | | | | Four XGB models | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Metric | WOE | One-Hot | CE | NearMiss | ENN | SMOTE | FL | XGB | SVM | Random Forest | Logistic Regression | XGB | XGB + FC | XGB + FL | XGB + FL + FC |
Accuracy | 0.967 | 0.958 | 0.967 | 0.967 | 0.962 | 0.965 | 0.967 | 0.967 | 0.972 | 0.962 | 0.953 | 0.972 | 0.976 | 0.920 | 0.967 |
F1 | 0.851 | 0.816 | 0.851 | 0.796 | 0.811 | 0.824 | 0.851 | 0.851 | 0.856 | 0.810 | 0.770 | 0.850 | 0.878 | 0.679 | 0.851 |
Recall | 0.870 | 0.870 | 0.870 | 0.786 | 0.812 | 0.835 | 0.870 | 0.870 | 0.820 | 0.780 | 0.740 | 0.739 | 0.783 | 0.783 | 0.870 |
AUC | 0.994 | 0.990 | 0.990 | 0.907 | 0.932 | 0.945 | 0.994 | 0.994 | 0.952 | 0.993 | 0.972 | 0.985 | 0.993 | 0.976 | 0.994 |