Extended Data Fig. 5: Model complexity and accuracy excluding individual domains of input data. | Nature Plants

From: Explainable machine learning models of major crop traits from satellite-monitored continent-wide field trial data

Predictive accuracy within models, such as the RPRM (a) and extreme gradient boosting (b) models shown, declined by different amounts under leave-one-out testing, in which one domain of input data was excluded at a time. Across all levels of model complexity (x axis), removal of satellite data (blue) produced the greatest reduction in accuracy relative to models trained using all available data (black circles). Exclusion of weather-station data (orange), metadata (pink) or management data (green) imposed similar, more limited costs in accuracy. The cross-validation error rate (y axis, a) is the inverse of R² values, and RMSE (y axis, b) is the root mean squared error for random holdout observations.
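
As a rough illustration of the comparison described in this caption, the sketch below builds the feature sets for a leave-one-domain-out test and computes RMSE on holdout observations. It is a minimal sketch, not the authors' pipeline: the four domain names follow the caption, while the column names, the `DOMAINS` mapping and the `feature_sets` helper are hypothetical and introduced only for illustration.

```python
import math

# Hypothetical mapping from input columns to the four data domains named in
# the caption. The column names are illustrative, not taken from the article.
DOMAINS = {
    "ndvi_peak": "satellite",
    "rainfall_mm": "weather",
    "site_id": "metadata",
    "sowing_date": "management",
}


def feature_sets(domains):
    """Return the full column list plus one reduced list per excluded domain."""
    sets = {"all": sorted(domains)}
    for excluded in sorted(set(domains.values())):
        # Keep every column that does not belong to the excluded domain.
        sets["no_" + excluded] = sorted(
            col for col, dom in domains.items() if dom != excluded
        )
    return sets


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted holdout values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Under these assumptions, a model would be refitted once per entry in `feature_sets(DOMAINS)` and scored with `rmse` on the same random holdout observations, so that the increase in error relative to the `"all"` entry indicates how much each domain contributes.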
