Table 2 Review of supervised machine learning techniques.
| Method | Strengths | Limitations |
|---|---|---|
| Naïve Bayes classifier (NB) | Short computational time for training and very easy to construct^33; requires only a small amount of training data^32; simple and useful for a variety of practical applications^33 | Less accurate than other classifiers^33; classes must be mutually exclusive^32 |
| Decision tree (DT) | Simple and fast to build and interpret^33; does not require any domain knowledge or parameter setting^33; able to handle high-dimensional data^33; robust classifier^32; can be validated using statistical tests^32; self-explanatory tool with a simple schematic representation that can be followed by non-professionals^34; can easily be converted to a set of rules that are often comprehensible to clinicians^34 | Instability^35; prone to overfitting depending on tree depth^34; sensitive to training data and can be error-prone on test data^35; classes must be mutually exclusive^32 |
| Support vector machine/classifier (SVM/SVC) | Robust and well-known algorithm^33; requires minimal data for training^33; training is relatively easy^34; scales well to high-dimensional data^34; robust and can handle multiple feature spaces^32; lower risk of overfitting^32 | Poor interpretability of results^34; poor performance with noisy data^32; slow learner that requires a large amount of training time^33; computationally expensive^32 |
| Boosting algorithms | Investigator's choice of loss function allows greater flexibility^36; improved accuracy from adding to the ensemble sequentially^36; demonstrated success in various practical applications^36; simple to implement and debug^37 | Success depends on the amount of data available^37; can be prone to overfitting^37; can be more sensitive to outliers^37 |
| Random forest (RF) | Lower variance and less overfitting of the training data compared to DT^32; empirically performs better than its individual base classifiers^32; scales well to large datasets^32 | Can easily overfit^32; variable-importance estimation favors attributes that can take many different values^32; computationally expensive^32 |
| k-nearest neighbors (kNN) | Classification technique that is easy to understand and implement^33; training phase is fast and low-cost^34; simple algorithm that can classify instances quickly^32; can handle noisy instances or instances with missing attribute values^32 | Computationally expensive^34; slower classification^33; attributes are given equal importance, with no indication of which attributes are most effective for classification, which can lead to poor performance^32; large storage requirements^33; sensitive to irrelevant features^34 |
| Neural network (NN) | Can detect complex nonlinear relationships^32; requires less formal statistical training to execute^32; multiple training algorithms are available^32; able to tolerate noisy data^33; able to classify patterns on which it has not been trained^33; can be used with little or no prior knowledge of the relationship between attributes and classes^33 | Success of the model depends on the quantity of data^34; lack of clear guidelines on optimal architecture^34; 'black box': the user does not have access to the exact decision-making process^32 |
| Logistic regression (LR) | Easy to implement and straightforward^32; easily updated^32; does not make any assumptions regarding the distribution of the independent variables^32; probabilistic interpretation of model parameters^32 | Poor performance when input variables are highly correlated with one another (multicollinearity)^38; not appropriate when the data cannot be linearly separated^39; can overstate prediction accuracy^32 |
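
As a rough illustration of how the methods reviewed in Table 2 can be compared in practice, the following minimal sketch (not taken from the source) instantiates each classifier and scores it with 5-fold cross-validation. It assumes scikit-learn is available, and the built-in breast-cancer dataset and the hyperparameters shown are purely illustrative stand-ins rather than settings used in any of the cited studies.

```python
# Minimal sketch: cross-validated comparison of the supervised classifiers
# listed in Table 2. Dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

# Public tabular dataset used as a stand-in for clinical data.
X, y = load_breast_cancer(return_X_y=True)

models = {
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(kernel="rbf"),
    "Boosting": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "NN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "LR": LogisticRegression(max_iter=2000),
}

for name, model in models.items():
    # Standardize features; scale-sensitive methods (SVM, kNN, NN, LR) benefit most.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    print(f"{name:8s} mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Such a side-by-side run makes several of the tabulated trade-offs concrete: the tree- and Bayes-based methods train almost instantly, whereas the neural network and boosted ensemble take noticeably longer, consistent with the computational-cost limitations noted above.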