Table 2 Machine learning classifiers and their explanations.

From: Machine learning based clinical decision tool to predict acute kidney injury and survival in therapeutic hypothermia treated neonates

| Employed models | Explanation | Accuracy score (%) |
|---|---|---|
| Logistic regression | Linear ML algorithm for probabilistic classification51. StandardScaler52 was used to put the features on the same scale, with a high maximum-iteration parameter to ensure convergence and a fixed random state for reproducibility | 54 |
| Random forest | Based on decision tree algorithms, it builds a 'forest' of trees in which each tree is trained on a random subset of the data and makes its own prediction53. These predictions are aggregated by a majority vote of the trees to decide the final output. Its ensemble nature helps avoid overfitting | 63 |
| Support vector classifier (SVC) | Works well in high-dimensional spaces and is versatile because kernel functions transform the feature space to fit different data distributions54. StandardScaler52 was used to scale the features, and probability estimation was enabled, which relies on additional internal cross-validation52 | 58 |
| Extreme gradient boosting (XGBoost) | Optimized, distributed gradient boosting library that builds decision trees one at a time, with each new tree correcting the errors of the previous one in a greedy manner36. An evaluation metric was added in this implementation to monitor classifier training loss | 73 |
| Gradient boosting classifier | Scikit-learn ensemble method that builds trees sequentially, each new tree fitted to the errors of the existing ensemble55. A fixed random state is also used | 65 |
| Adaptive boosting (AdaBoost) | Another ensemble method that fits an initial classifier on the dataset and then fits additional classifiers on the same data, increasing the weights of incorrectly classified instances so that later classifiers focus on the difficult cases56 | 52 |
| K-nearest neighbors (KNN) | Instance-based algorithm that stores all training neonates and classifies new neonates by their k nearest neighbors57. StandardScaler52 was needed for feature scaling because KNN is distance-based | 59 |
| Decision tree | Nonparametric supervised learning method for classification and regression in which nodes are features, branches are decision rules, and leaves are outcomes. Single trees tend to overfit, which random forest and extra trees are designed to mitigate55 | 56 |
| Extra trees | Extremely randomized trees add a further layer of randomness to bagging55,58. Extra trees can yield a high-variance, low-bias model, which is beneficial for specific tasks | 58 |
| Multi-layer perceptron (MLP) neural network | A feed-forward neural network59 that can capture complex relationships in the data through multiple layers and nonlinear activation functions | 60 |
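
As a rough, illustrative sketch of how the classifiers listed above could be assembled, the Python snippet below instantiates each model with scikit-learn and XGBoost, scales features for the scale-sensitive models (logistic regression, SVC, KNN), and reports hold-out accuracy. It is not the authors' exact pipeline: the synthetic stand-in dataset (`make_classification`), random seed, train/test split, `eval_metric` choice, and all hyperparameters are placeholder assumptions, since the table does not specify them.

```python
# Minimal sketch, not the authors' pipeline: the ten classifiers from Table 2,
# trained on a synthetic stand-in dataset and scored by hold-out accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              AdaBoostClassifier, ExtraTreesClassifier)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

RANDOM_STATE = 42  # placeholder seed; the paper's exact seed is not given here

models = {
    # StandardScaler puts features on a common scale; max_iter is set high for convergence
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=5000, random_state=RANDOM_STATE)),
    "Random forest": RandomForestClassifier(random_state=RANDOM_STATE),
    # probability=True triggers the extra internal cross-validation mentioned in the table
    "SVC": make_pipeline(
        StandardScaler(), SVC(probability=True, random_state=RANDOM_STATE)),
    # eval_metric monitors training loss; 'logloss' is an assumed choice
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=RANDOM_STATE),
    "Gradient boosting": GradientBoostingClassifier(random_state=RANDOM_STATE),
    "AdaBoost": AdaBoostClassifier(random_state=RANDOM_STATE),
    # KNN is distance-based, so features are scaled first
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision tree": DecisionTreeClassifier(random_state=RANDOM_STATE),
    "Extra trees": ExtraTreesClassifier(random_state=RANDOM_STATE),
    "MLP": MLPClassifier(max_iter=2000, random_state=RANDOM_STATE),
}

# Synthetic stand-in for the neonatal feature matrix and outcome labels.
X, y = make_classification(n_samples=300, n_features=20, random_state=RANDOM_STATE)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=RANDOM_STATE)

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.0%}")
```

Wrapping the scaled models in `make_pipeline` keeps the scaler fitted only on the training data, so no information from the hold-out set leaks into feature scaling.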