Table 2 Machine learning classifiers and their explanations.
Employed models | Explanation | Accuracy score (%) |
---|---|---|
Logistic regression | A linear ML algorithm for probabilistic classification51. StandardScaler52 was used to place the features on the same scale, a high maximum-iteration parameter was set to ensure convergence, and a random state was fixed for reproducibility | 54 |
Random forest | Based on decision tree algorithms, it creates a ‘forest’ of trees, each trained on a random subset of the data and making its own predictions53. These predictions are aggregated by a majority vote of the trees to decide the final output, and the ensemble nature helps avoid overfitting | 63 |
Support vector classifier (SVC) | Works well in high-dimensional spaces and is versatile because kernel functions transform the feature space to fit different data distributions54. StandardScaler52 was used to scale the features, and probability estimation was enabled, which relies on additional internal cross-validation52 | 58 |
Extreme gradient boosting (XGBoost) | An optimized, distributed gradient boosting library that builds decision trees one at a time, each new tree greedily correcting the errors of the previous ones36. An evaluation metric was added in this implementation to monitor the classifier's training loss | 73 |
Gradient boosting classifier | A scikit-learn ensemble method that builds trees sequentially, each new tree correcting the errors of the previous ones. A random state was also set for reproducibility55 | 65 |
Adaptive boosting (AdaBoost) | Another ensemble method that fits a classifier on the dataset and then fits additional copies of the classifier on the same data, with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus on difficult cases56 | 52 |
K-nearest neighbors (KNN) | An instance-based algorithm that stores all training cases (neonates) and classifies new neonates by a majority vote of their K nearest neighbors57. StandardScaler52 was needed for feature scaling since KNN is distance-based | 59 |
Decision tree | A nonparametric supervised learning method for classification and regression. Nodes represent features, branches represent decision rules, and leaves represent outcomes. Decision trees tend to overfit, which ensembles such as Random Forest and Extra Trees mitigate55 | 56 |
Extra trees | Extremely Randomized Trees add a further layer of randomness to bagging by also randomizing the split thresholds55,58, which typically reduces variance at the cost of a slight increase in bias and can be beneficial for specific tasks | 58 |
Multi-layer perceptron (MLP) neural network | A feedforward neural network59 that can capture complex relationships in the data through multiple layers and nonlinear activation functions | 60 |
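
The configurations described in Table 2 can be reproduced approximately with scikit-learn and XGBoost. The sketch below is a minimal illustration, not the study's exact code: the parameter values (maximum iterations, number of neighbors, test-split size, and random seed) are assumptions, and only a representative subset of the models is shown.

```python
# Minimal sketch of the Table 2 model configurations (assumed parameter values).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

RANDOM_STATE = 42  # assumed seed for reproducibility

models = {
    # Distance- and gradient-based models are wrapped with StandardScaler
    "Logistic regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(max_iter=10_000, random_state=RANDOM_STATE),
    ),
    "SVC": make_pipeline(
        StandardScaler(),
        SVC(probability=True, random_state=RANDOM_STATE),  # enables internal CV
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    # Tree ensembles do not require feature scaling
    "Random forest": RandomForestClassifier(random_state=RANDOM_STATE),
    "Gradient boosting": GradientBoostingClassifier(random_state=RANDOM_STATE),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=RANDOM_STATE),
}

def evaluate(X, y):
    """Train each model on a held-out split and report its test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=RANDOM_STATE, stratify=y
    )
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        print(f"{name}: {accuracy_score(y_te, model.predict(X_te)):.2f}")
```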
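
The XGBoost row notes that an evaluation metric was added to assess the classifier's training loss. A hedged sketch of how such monitoring can be done with XGBoost's scikit-learn API follows; the choice of logloss and the validation split are assumptions, not the study's settings.

```python
# Sketch of monitoring XGBoost training loss with an evaluation metric.
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def fit_with_monitoring(X, y, random_state=42):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=random_state, stratify=y
    )
    clf = XGBClassifier(eval_metric="logloss", random_state=random_state)
    # Passing eval_set records the metric on both splits after every boosting round
    clf.fit(X_tr, y_tr, eval_set=[(X_tr, y_tr), (X_val, y_val)], verbose=False)
    return clf, clf.evals_result()  # per-round logloss history
```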