Abstract
Accurate crop yield prediction is critical to U.S. agricultural planning and regional food security, and corn, one of the essential crops, requires reliable forecasting methods. Existing yield prediction models often rely on simplistic approaches that fail to capture the complex, non-linear relationships in agricultural data. This work addresses that gap by applying advanced machine-learning techniques to improve the accuracy of corn yield prediction. The study focuses on county-level regional forecasting in the United States to support agricultural policy and supply chain planning rather than field-specific management decisions. In line with the Special Section on Sustainable Computing for Next-Generation Low-Carbon Agricultural Consumer Electronics, the methodology uses a data-efficient design built on a Random Forest Classifier, a Gradient Boosting Classifier, and an Ensemble Voting Classifier. Model development involved pre-processing historical corn yield data, defining the relevant attributes, and evaluating the models with the confusion matrix, the ROC curve, and SHAP values for explainability. The proposed ensemble model outperforms the existing approaches in accuracy and robustness, achieving a precision, recall, and F1-score of 0.92 and a training accuracy of 0.97. The SHAP analysis adds transparency about the features that drive predictions, making the model more interpretable. These results matter for agricultural planning: the model could offer a sound instrument for predicting corn yield and optimizing resources in agricultural consumer practices.
This paper advocates energy-efficient algorithm design, intelligent applications, and sustainable computing, with attention to efficiency, environmental impact, and resource recycling, in support of sustainable and efficient corn yield prediction.
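The voting step and the evaluation quantities named in the abstract (confusion matrix, precision, recall, F1-score) can be illustrated with a small sketch. It is not the authors' implementation. The two probability lists stand in for the outputs of a Random Forest and a Gradient Boosting classifier, soft (probability-averaging) voting is assumed because the abstract does not state the voting rule, a binary yield class is assumed for simplicity, and all function names and toy values are hypothetical.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Average two models' positive-class probabilities and threshold the mean.

    Assumption: soft voting with equal weights; the source does not specify this.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("probability lists must have the same length")
    return [1 if (a + b) / 2 >= threshold else 0 for a, b in zip(prob_a, prob_b)]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 1 = "high yield" county-year, 0 = "low yield".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
rf_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]  # stand-in Random Forest outputs
gb_prob = [0.8, 0.4, 0.7, 0.3, 0.2, 0.5, 0.9, 0.1]  # stand-in Gradient Boosting outputs

y_pred = soft_vote(rf_prob, gb_prob)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(confusion_counts(y_true, y_pred), precision, recall, f1)
```

On real data the stand-in probability lists would be replaced by the fitted models' predicted probabilities on a held-out test set, so that the reported scores reflect generalization rather than training fit.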
Data availability
The datasets used and analysed during the current study are available in the Kaggle repository, https://www.kaggle.com/datasets/guillemservera/grains-and-cereals-futures
Acknowledgments
This project was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2026R97), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. We also appreciate technical insights and collaborative support provided by the VLCMatrix Lab during this research.
Funding
This research was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2026R97), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Author information
Contributions
T. Kehkashan conceptualized the study, designed the methodology, and drafted the main manuscript. M. Abdelhaq and A. S. Al-Shamayleh contributed to the development of the predictive model and the interpretation of results. A. Akhunzada provided technical oversight, contributed to refining the manuscript, and supervised the overall research direction. S. Z. Alharthi contributed to the literature review and data curation. M. Hamza and M. A. Khan supported data preprocessing, experiment setup, and results visualization under supervision. All authors reviewed and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kehkashan, T., Abdelhaq, M., Al-Shamayleh, A.S. et al. Sustainable regional corn yield prediction for the United States through interpretable machine learning approach. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43213-4