Fig. 5: Comparison of prediction errors of different machine learning algorithms for the structure of bacterial, fungal and protistan communities.
From: The neglected role of micronutrients in predicting soil microbial structure

Prediction errors for bacterial (a), fungal (c) and protistan (e) communities are compared across the following machine learning algorithms: Bagged Regression Tree (BaRT), Cubist, Fast Nearest Neighbor (FNN), Gradient Boosting Machines (GBM), Weighted k-Nearest Neighbor (KKNN), Kernel Support Vector Machine (KSVM), Random Forest (RF), Ranger, Rpart and Support Vector Machine (SVM). In each boxplot, the center line represents the median. The box bounds the interquartile range (IQR), which the median line divides, and the whiskers extend to a maximum of 1.5 times the IQR beyond the box. Observed data points beyond the whiskers are plotted as outliers, shown as dots. Predicted vs. observed structure of bacterial (b), fungal (d) and protistan (f) communities in the test dataset, derived from the bagged regression tree (BaRT) and random forest (RF) models. R², coefficient of determination; MSE, mean squared error; RMSE, root mean squared error. Statistical significance was assessed using the BaRT and RF regression models. *p < 0.05, **p < 0.01, and ***p < 0.001.
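
The boxplot conventions and error metrics named in this legend can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `boxplot_summary` and `regression_metrics` are hypothetical, the quartile method (`inclusive`) is an assumption, and R² is computed here as 1 − SS_res/SS_tot, which may differ from the definition used for the figure.

```python
import math
import statistics


def boxplot_summary(values):
    """Median, box (IQR), whisker ends and outliers for one boxplot.

    Assumption: quartiles use the 'inclusive' method; plotting
    packages may use a slightly different quantile definition.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points still inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


def regression_metrics(observed, predicted):
    """R2, MSE and RMSE for predicted vs. observed values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    mse = ss_res / n
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(boxplot_summary([1, 2, 3, 4, 5, 100]))
    print(regression_metrics([1, 2, 3], [2, 2, 2]))
```

With the illustrative input above, the value 100 lies more than 1.5 times the IQR above the box and is therefore reported as an outlier rather than as the upper whisker end.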