Table 2 Performance of models when applied to data not used in model fitting

From: Variation in wood density across South American tropical forests

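| Model | Cross-validation (interpolation): RMSE | Cross-validation (interpolation): Correlation | Cross-validation (interpolation): Coefficient of determination | Spatial cross-validation (extrapolation): RMSE | Spatial cross-validation (extrapolation): Correlation | Spatial cross-validation (extrapolation): Coefficient of determination |
| --- | --- | --- | --- | --- | --- | --- |
| Dataset | 0.105 | 0.011 | −1.091 | 0.105 | 0.061 | −1.574 |
| RF – spatial | 0.051 | 0.749 | 0.536 | 0.069 | 0.055 | −0.192 |
| RF – environment | 0.051 | 0.736 | 0.539 | 0.077 | 0.132 | −0.449 |
| RF – both | 0.049 | 0.755 | 0.567 | 0.070 | 0.153 | −0.253 |
| GAM – spatial | 0.054 | 0.701 | 0.480 | 0.081 | 0.149 | −0.809 |
| GAM – environment | 0.057 | 0.618 | 0.369 | 0.070 | 0.212 | −0.515 |
| GAM – both | 0.053 | 0.704 | 0.474 | 0.071 | 0.219 | −0.388 |
| Ensemble | 0.049 | 0.759 | 0.567 | 0.068 | 0.272 | −0.132 |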

  1. Model performance has been assessed using k-fold cross-validation (presumed to reflect interpolation performance) and spatial cross-validation (where an entire region was removed for model testing, presumed to reflect extrapolation performance).
  2. Model performance has been assessed as (1) root mean square error (RMSE), which indicates the average prediction error (g cm⁻³), (2) the correlation coefficient between observed and predicted values and (3) the coefficient of determination [1 − (residual sum of squares / total sum of squares)]. Negative coefficient of determination values indicate that the difference between model predictions and the testing data is greater than the difference between the testing data mean and the testing data, i.e. the model predicts the testing data less accurately than the testing data mean would. Median values across cross-validation folds are presented.
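
The three metrics defined in the notes above can be sketched in a few lines of Python. This is a minimal illustration of the stated definitions, not the authors' code: the function names, the `median_over_folds` helper and the toy wood density values are all assumptions made for the example. The toy case shows how predictions that track the observations perfectly but carry a constant bias can yield a high correlation together with a negative coefficient of determination, the pattern seen in the spatial cross-validation columns.

```python
import math
from statistics import mean, median


def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))


def correlation(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def r_squared(obs, pred):
    """Coefficient of determination: 1 - (residual SS / total SS).

    The total sum of squares uses the mean of the testing data, so the
    value is negative whenever the model does worse than that mean.
    """
    mo = mean(obs)
    rss = sum((o - p) ** 2 for o, p in zip(obs, pred))
    tss = sum((o - mo) ** 2 for o in obs)
    return 1 - rss / tss


def median_over_folds(metric, folds):
    """Median of a metric across folds (hypothetical helper).

    `folds` is a list of (observed, predicted) pairs, one per held-out fold.
    """
    return median(metric(obs, pred) for obs, pred in folds)


# Toy values (invented for illustration, g cm^-3): predictions follow the
# observations exactly but are shifted upward by a constant 0.2.
observed = [0.5, 0.6, 0.7]
biased = [0.7, 0.8, 0.9]

print("RMSE:", rmse(observed, biased))
print("Correlation:", correlation(observed, biased))
print("Coefficient of determination:", r_squared(observed, biased))
```

Under k-fold cross-validation each fold would be a random subset of the data, whereas under spatial cross-validation each fold would hold out an entire region; in both cases the per-fold values could be summarised with something like `median_over_folds(r_squared, folds)`.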