Table 2 Optimal number of features, corresponding feature subsets and model performances for six ML models obtained by exhaustive screening
From: Machine learning for phase prediction of high entropy carbide ceramics from imbalanced data
| Algorithm | Feature number | Best feature subset | Best AUC value |
|---|---|---|---|
| SVM.rbf | 6 | \(\sigma_{VEC}\), \(\sigma_{\chi_{\text{p}}}\), \(\Delta S_{\text{conf}}\), \(\overline{r_{\text{Me}}}\), \(\sigma_{r_{\text{Me}}}\), \(\overline{z^{*}}\) | 88.58% |
| SVM.linear | 8 | \(\sigma_{\chi_{\text{p}}}\), \(\sigma_{\chi_{\text{m}}}\), \(\Delta S_{\text{conf}}\), \(\overline{r_{\text{Me}}}\), \(\sigma_{r_{\text{Me}}}\), \(\sigma_{Z^{*}}\) | 84.28% |
| SVM.poly | 6 | \(\sigma_{VEC}\), \(\sigma_{\chi_{\text{p}}}\), \(\sigma_{\chi_{\text{m}}}\), \(\Delta S_{\text{conf}}\), \(\bar{l}\), \(\overline{r_{\text{Me}}}\), \(\sigma_{r_{\text{Me}}}\), \(\overline{z^{*}}\) | 88.46% |
| RF | 5 | \(\sigma_{VEC}\), \(\overline{\chi_{\text{p}}}\), \(\sigma_{I_{1}}\), \(\overline{z^{*}}\), \(\Lambda\) | 90.22% |
| XGB | 6 | \(\overline{r_{\text{Me}}}\), \(\Lambda\), \(\overline{\chi_{\text{p}}}\), \(\sigma_{I_{1}}\), \(\sigma_{VEC}\), \(\overline{z^{*}}\) | 89.92% |
| LOG | 7 | \(\sigma_{VEC}\), \(\sigma_{\chi_{\text{p}}}\), \(\sigma_{\chi_{\text{m}}}\), \(\Delta S_{\text{conf}}\), \(\overline{r_{\text{Me}}}\), \(\sigma_{r_{\text{Me}}}\), \(\sigma_{I_{1}}\) | 83.15% |