Table 1 Performance comparison of machine learning models for respiratory disease classification. Values are the mean (standard deviation) of macro-averaged metrics across the outer folds of 5-fold nested cross-validation. Abbreviations: kNN = k-nearest neighbors, LR = logistic regression, NB = naïve Bayes, DT = decision tree, SVM = support vector machine, RF = random forest, and XGBoost = extreme gradient boosting.
| Model | Accuracy (%) | AUC | Precision | Sensitivity | F1-score |
|---|---|---|---|---|---|
| kNN | 62.73 (8.16) | 0.757 (0.064) | 0.595 (0.084) | 0.586 (0.080) | 0.585 (0.079) |
| LR | 89.17 (8.58) | 0.956 (0.054) | 0.887 (0.094) | 0.879 (0.092) | 0.879 (0.097) |
| NB | 79.30 (5.99) | 0.909 (0.078) | 0.792 (0.082) | 0.762 (0.066) | 0.751 (0.061) |
| DT | 90.00 (6.77) | 0.924 (0.017) | 0.889 (0.086) | 0.881 (0.083) | 0.879 (0.084) |
| SVM | 82.63 (3.16) | 0.946 (0.038) | 0.814 (0.051) | 0.805 (0.043) | 0.801 (0.043) |
| RF | 90.90 (3.14) | 0.982 (0.017) | 0.905 (0.037) | 0.892 (0.035) | 0.891 (0.034) |
| XGBoost | 95.83 (4.56) | 0.998 (0.004) | 0.957 (0.049) | 0.951 (0.052) | 0.952 (0.051) |
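
For readers who want to reproduce the reporting format of Table 1, the following is a minimal sketch of how macro-averaged precision, sensitivity, and F1-score might be computed for one outer fold and then summarized across folds as mean (standard deviation). The function names `macro_metrics` and `summarize` are hypothetical and do not come from the study. The sketch also assumes that macro F1 is the unweighted mean of per-class F1 values and that the reported spread is the sample standard deviation; the source does not state either choice.

```python
from statistics import mean, stdev


def macro_metrics(y_true, y_pred):
    """Return macro-averaged (precision, sensitivity, F1) for one fold.

    Each class is scored one-vs-rest, then the per-class scores are
    averaged with equal weight (assumption: unweighted macro average).
    """
    labels = sorted(set(y_true) | set(y_pred))
    precisions, sensitivities, f1_scores = [], [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Undefined ratios (zero denominator) are scored as 0.0.
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * prec * sens / (prec + sens) if (prec + sens) else 0.0
        precisions.append(prec)
        sensitivities.append(sens)
        f1_scores.append(f1)
    return mean(precisions), mean(sensitivities), mean(f1_scores)


def summarize(fold_scores, digits=3):
    """Format per-fold scores as 'mean (SD)'.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return f"{mean(fold_scores):.{digits}f} ({stdev(fold_scores):.{digits}f})"


# Toy illustration with made-up labels, not data from the study.
precision, sensitivity, f1 = macro_metrics(
    ["a", "a", "b", "b"], ["a", "b", "b", "b"]
)
print(summarize([0.9, 0.8, 1.0]))
```

In a nested cross-validation setting, `macro_metrics` would be applied to the held-out predictions of each outer fold, and `summarize` to the resulting five values per metric.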