Table 1 Performance comparison of machine learning models for respiratory disease classification. Results are reported as mean (standard deviation) of macro-averaged metrics across the outer folds of 5-fold nested cross-validation. Abbreviations: kNN = k-nearest neighbors, LR = logistic regression, NB = naïve Bayes, DT = decision tree, SVM = support vector machine, RF = random forest, and XGBoost = extreme gradient boosting.

From: A baseline study of interpretable machine learning using GC-MS breath VOCs for classifying asthma, bronchiectasis, and COPD

| Models  | Accuracy (%) | AUC           | Precision     | Sensitivity   | F1-score      |
|---------|--------------|---------------|---------------|---------------|---------------|
| kNN     | 62.73 (8.16) | 0.757 (0.064) | 0.595 (0.084) | 0.586 (0.080) | 0.585 (0.079) |
| LR      | 89.17 (8.58) | 0.956 (0.054) | 0.887 (0.094) | 0.879 (0.092) | 0.879 (0.097) |
| NB      | 79.30 (5.99) | 0.909 (0.078) | 0.792 (0.082) | 0.762 (0.066) | 0.751 (0.061) |
| DT      | 90.00 (6.77) | 0.924 (0.017) | 0.889 (0.086) | 0.881 (0.083) | 0.879 (0.084) |
| SVM     | 82.63 (3.16) | 0.946 (0.038) | 0.814 (0.051) | 0.805 (0.043) | 0.801 (0.043) |
| RF      | 90.90 (3.14) | 0.982 (0.017) | 0.905 (0.037) | 0.892 (0.035) | 0.891 (0.034) |
| XGBoost | 95.83 (4.56) | 0.998 (0.004) | 0.957 (0.049) | 0.951 (0.052) | 0.952 (0.051) |
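As a rough illustration of how entries of the form "mean (standard deviation)" can arise from macro-averaged metrics over outer folds, the following is a minimal sketch in plain Python. It is not the authors' pipeline. The function names (`macro_metrics`, `summarize`, `format_cell`), the toy labels, and the per-fold scores are hypothetical. Two choices are assumptions as well: the sample standard deviation, since the source does not say which estimator was used, and macro F1 computed as the unweighted mean of per-class F1 scores.

```python
from statistics import mean, stdev


def macro_metrics(y_true, y_pred, labels):
    """Return macro-averaged (precision, sensitivity, F1) for one fold.

    Each metric is computed per class (one-vs-rest) and then averaged
    with equal weight per class, regardless of class size.
    """
    precisions, sensitivities, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # Assumption: undefined ratios (zero denominator) are scored as 0.
        prec = tp / (tp + fp) if tp + fp else 0.0
        sens = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        precisions.append(prec)
        sensitivities.append(sens)
        f1s.append(f1)
    return mean(precisions), mean(sensitivities), mean(f1s)


def summarize(fold_scores):
    """Return (mean, sample standard deviation) across outer folds."""
    return mean(fold_scores), stdev(fold_scores)


def format_cell(m, s, digits=3):
    """Format a table cell as 'mean (SD)'."""
    return f"{m:.{digits}f} ({s:.{digits}f})"


# Hypothetical toy data for a single outer fold (not from the study).
labels = ["asthma", "bronchiectasis", "COPD"]
y_true = ["asthma", "asthma", "bronchiectasis", "bronchiectasis", "COPD", "COPD"]
y_pred = ["asthma", "COPD", "bronchiectasis", "bronchiectasis", "COPD", "COPD"]
fold_precision, fold_sensitivity, fold_f1 = macro_metrics(y_true, y_pred, labels)

# Hypothetical macro F1 scores from five outer folds (not from the study).
outer_fold_f1 = [0.90, 1.00, 0.80, 0.95, 0.85]
print(format_cell(*summarize(outer_fold_f1)))
```

Because the spread is taken over only five outer-fold scores, the standard deviations in such a table are coarse estimates, which is worth keeping in mind when comparing models whose means differ by less than one standard deviation.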