Figure 3 | Scientific Reports

Figure 3

From: Variable importance for sustaining macrophyte presence via random forests: data imputation and model settings


Effect of data preprocessing and imputation method on random forest performance. Performance is similar across all combinations, with the multivariate imputation methods (k nearest neighbours (kNN) and missForest (mF)) performing slightly worse than the univariate imputation methods (mean and median value). The depicted performances were obtained with random forests consisting of 100 trees, using 5-fold cross-validation repeated 10 times. The selected data sets originally contained 18% missing data and underwent either no preprocessing (‘None’) or outlier removal (‘Outliers’) prior to model development.
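
The univariate imputation and the resampling scheme named in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`impute_univariate`, `repeated_kfold`), the toy column and the seed are all hypothetical, and the random forest fitting step is left out. It only shows how mean or median imputation fills missing values and how 10 repetitions of 5-fold cross-validation yield 50 train/test splits.

```python
import random
import statistics


def impute_univariate(column, method="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    if method == "mean":
        fill = statistics.mean(observed)
    elif method == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if v is None else v for v in column]


def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs from n_repeats reshuffled k-fold partitions."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into n_folds disjoint folds.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx


# Toy column with missing values (illustrative numbers, not from the study).
column = [2.0, None, 4.0, 9.0, None]
mean_filled = impute_univariate(column, "mean")
median_filled = impute_univariate(column, "median")

# 10 repetitions x 5 folds = 50 train/test splits, as in the caption.
splits = list(repeated_kfold(n_samples=20))
```

In a full evaluation, a 100-tree random forest would be trained on each `train_idx` and scored on the matching `test_idx`, and the 50 scores would be summarised per combination of preprocessing and imputation method.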
