Table 2. Review of supervised machine learning techniques.

From: Development and validation of electronic health record-based, machine learning algorithms to predict quality of life among family practice patients

Naïve Bayes classifier (NB)

Strengths:
- Short computational time for training and very easy to construct [33]
- Requires a relatively small amount of training data [32]
- Simple and useful for a variety of practical applications [33]

Limitations:
- Less accurate than other classifiers [33]
- Classes must be mutually exclusive [32]
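As an illustrative sketch only (synthetic data and default settings, not the authors' pipeline), a Gaussian Naïve Bayes model in scikit-learn can be constructed and trained in a few lines:

```python
# Minimal Gaussian Naive Bayes sketch (illustrative only; synthetic data,
# not the study's EHR features or preprocessing).
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB()                    # no tuning needed: very easy to construct
nb.fit(X_train, y_train)             # fast to train, even on modest data
print(accuracy_score(y_test, nb.predict(X_test)))
print(nb.predict_proba(X_test[:3]))  # per-class probabilities (classes assumed mutually exclusive)
```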

Decision tree (DT)

Strengths:
- Simple and fast to build and interpret [33]
- Does not require any domain knowledge or parameter setting [33]
- Able to handle high-dimensional data [33]
- Robust classifier [32]
- Can be validated using statistical tests [32]
- Self-explanatory, with a simple schematic representation that non-professionals can follow [34]
- Can easily be converted to a set of rules that are often comprehensible to clinicians [34]

Limitations:
- Instability [35]
- Prone to overfitting, depending on the depth of the tree [34]
- Sensitive to training data; can be error-prone on test data [35]
- Classes must be mutually exclusive [32]
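A minimal decision-tree sketch, assuming synthetic data and an arbitrary depth limit, illustrates the rule extraction noted above:

```python
# Minimal decision-tree sketch (illustrative; synthetic data and a
# hypothetical max_depth, not the study's configuration).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
feature_names = [f"x{i}" for i in range(4)]

# Limiting depth is one common guard against the overfitting noted above.
dt = DecisionTreeClassifier(max_depth=3, random_state=0)
dt.fit(X, y)

# The fitted tree converts directly into human-readable if/then rules.
print(export_text(dt, feature_names=feature_names))
```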

Support vector machine/classifier (SVM/C)

Strengths:
- Robust and well-known algorithm [33]
- Requires minimal data for training [33]
- Training is relatively easy [34]
- Scales well to high-dimensional data [34]
- Robust and can handle multiple feature spaces [32]
- Less risk of overfitting [32]

Limitations:
- Poor interpretability of results [34]
- Poor performance with noisy data [32]
- Slow learner; requires a large amount of training time [33]
- Computationally expensive [32]
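A minimal SVM sketch under assumed settings (synthetic data, RBF kernel, default C); the feature standardization step is included as common practice, not as part of the source table:

```python
# Minimal SVM sketch (illustrative; synthetic data, assumed RBF kernel and C).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardizing features first is standard practice for SVMs.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)        # training cost grows quickly with sample size
print(svm.score(X_test, y_test)) # handles the 20-dimensional input directly
```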

Boosting algorithms

Strengths:
- Investigator's choice of loss function allows greater flexibility [36]
- Improved accuracy from adding to the ensemble sequentially [36]
- Demonstrated success in various practical applications [36]
- Simple to implement and debug [37]

Limitations:
- Success depends on the amount of data available [37]
- Can be prone to overfitting [37]
- Can be more sensitive to outliers [37]
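A minimal gradient-boosting sketch (one common boosting variant; synthetic data and assumed hyperparameters) showing the loss-function choice and the sequential growth of the ensemble:

```python
# Minimal gradient-boosting sketch (illustrative; synthetic data, assumed
# hyperparameters). The `loss` argument is where the investigator's choice
# of loss function enters.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(loss="exponential", n_estimators=100,
                                learning_rate=0.1, random_state=0)
gb.fit(X_train, y_train)

# staged_predict exposes the ensemble after each sequential addition,
# showing how accuracy changes as trees are added.
for i, y_stage in enumerate(gb.staged_predict(X_test)):
    if i in (0, 49, 99):
        print(i + 1, accuracy_score(y_test, y_stage))
```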

Random forest (RF)

Strengths:
- Lower variance and less overfitting of the training data compared with DT [32]
- Empirically performs better than its individual base classifiers [32]
- Scales well for large datasets [32]

Limitations:
- Easily overfit [32]
- Variable-importance estimation favors attributes that can take a high number of different values [32]
- Computationally expensive [32]
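A minimal random-forest sketch on synthetic data; the impurity-based importances printed at the end are the variable-importance estimates whose bias toward high-cardinality attributes is noted above:

```python
# Minimal random-forest sketch (illustrative; synthetic data, assumed
# n_estimators, not the study's configuration).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X, y)                    # averaging many trees reduces variance vs. one DT
print(rf.feature_importances_)  # impurity-based importance, one value per feature
```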

k-nearest neighbors (kNN)

Strengths:
- Easy to understand and easy to implement [33]
- Training phase is fast and low cost [34]
- Simple algorithm that can classify instances quickly [32]
- Can handle noisy instances or instances with missing attribute values [32]

Limitations:
- Computationally expensive [34]
- Slower classification [33]
- Attributes are given equal importance, with no indication of which attributes are most effective for classification, which can lead to poor performance [32]
- Large storage requirements [33]
- Sensitive to irrelevant features [34]
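A minimal kNN sketch on synthetic data with an assumed k; it shows why training is cheap (the fit step only stores the data) while classification is comparatively slow and storage-heavy:

```python
# Minimal kNN sketch (illustrative; synthetic data, assumed k=5).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # all 8 features weighted equally
knn.fit(X_train, y_train)                  # effectively just memorizes X_train
print(knn.score(X_test, y_test))           # each prediction searches all stored points
```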

Neural network (NN)

Strengths:
- Can detect complex nonlinear relationships [32]
- Requires less formal statistical training to execute [32]
- Availability of multiple training algorithms [32]
- Able to tolerate noisy data [33]
- Able to classify patterns on which it has not been trained [33]
- Can be used with little or no prior knowledge of the relationship between attributes and classes [33]

Limitations:
- Success of the model depends on the quantity of the data [34]
- Lacks clear guidelines on optimal architecture [34]
- 'Black box': the user does not have access to the exact decision-making process [32]
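A minimal feed-forward network sketch on synthetic data; the hidden-layer sizes are an arbitrary assumption, echoing the lack of architecture guidelines noted above:

```python
# Minimal feed-forward neural-network sketch (illustrative; synthetic data;
# the hidden-layer sizes are an arbitrary assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nn = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
nn.fit(X_train, y_train)        # can capture nonlinear relationships
print(nn.score(X_test, y_test)) # but offers no per-decision explanation ('black box')
```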

Logistic regression (LR)

Strengths:
- Easy to implement and straightforward [32]
- Easily updated [32]
- Does not make any assumptions regarding the distribution of the independent variables [32]
- Probabilistic interpretation of model parameters [32]

Limitations:
- Poor performance when input variables are strongly linearly related to one another (multicollinearity) [38]
- Not appropriate when the data cannot be linearly separated [39]
- Can overstate prediction accuracy [32]
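A minimal logistic-regression sketch on synthetic data, showing the probabilistic reading of the model parameters (exponentiated coefficients as odds ratios):

```python
# Minimal logistic-regression sketch (illustrative; synthetic data only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

lr = LogisticRegression(max_iter=1000)
lr.fit(X, y)
print(lr.predict_proba(X[:3]))  # class probabilities for the first 3 instances
print(np.exp(lr.coef_))         # odds ratio per one-unit increase in each feature
```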