Scientific Reports
Sustainable regional corn yield prediction for the United States through interpretable machine learning approach
  • Article
  • Open access
  • Published: 17 March 2026

  • Tanzila Kehkashan [1, 2; author note 1] (ORCID: orcid.org/0000-0002-6325-4409),
  • Maha Abdelhaq [3],
  • Ahmad Sami Al-Shamayleh [4],
  • Muhammad Hamza [2; author note 1] (ORCID: orcid.org/0009-0007-2346-0260),
  • Muhammad Abdullah Khan [2] (ORCID: orcid.org/0009-0008-8985-8949),
  • Salman Z. Alharthi [5] &
  • Adnan Akhunzada [6] (ORCID: orcid.org/0000-0001-8370-9290)

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Mathematics and computing

Abstract

Crop yield prediction is central to optimizing agricultural practices and safeguarding food security, and accurate forecasts for corn, one of the essential crops, matter for U.S. agricultural planning and regional food security. Existing yield prediction models often rely on simplistic approaches that fail to capture complex, non-linear relationships in agricultural data. This work addresses that gap by applying machine-learning techniques to improve the accuracy of corn yield prediction. The study focuses on county-level regional forecasting in the United States to support agricultural policy and supply chain planning rather than field-specific management decisions. In line with the Special Section on Sustainable Computing for Next-Generation Low-Carbon Agricultural Consumer Electronics, the methodology designs a data-efficient algorithm built on the Random Forest Classifier, Gradient Boosting Classifier, and Ensemble Voting Classifier. Model development entailed pre-processing historical corn yield data, defining pertinent attributes, and evaluating the model with the confusion matrix and ROC curve, with SHAP values used for explainability. The proposed ensemble model achieves high accuracy and robustness and outperforms existing approaches, with a precision, recall, and F1-score of 0.92 and a training accuracy of 0.97. SHAP adds transparency about the features that drive predictions, making the model more interpretable. The model could therefore serve as a sound instrument for predicting corn yield and optimizing resources in agricultural consumer practices.
The paper advocates energy-efficient algorithm design, intelligent applications, sustainable computing, and attention to efficiency, environmental impact, and resource recycling as a path toward sustainable and efficient corn yield prediction.
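
To make the evaluation terms in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how an ensemble vote and the confusion-matrix metrics fit together. It assumes soft voting (averaging class probabilities), a two-class target (0 = low yield, 1 = high yield), and hand-written toy probabilities standing in for the outputs of a random forest and a gradient boosting model; all function and variable names are hypothetical, and the abstract does not state whether hard or soft voting was used.

```python
# Illustrative only: toy numbers, hypothetical names, soft voting assumed.

def soft_vote(probs_per_model):
    """Average class probabilities across models and return the argmax class per sample."""
    n_models = len(probs_per_model)
    n_samples = len(probs_per_model[0])
    n_classes = len(probs_per_model[0][0])
    preds = []
    for i in range(n_samples):
        avg = [sum(model[i][c] for model in probs_per_model) / n_models
               for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_metrics(cm):
    """Return (precision, recall, F1) for each class from a confusion matrix."""
    results = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fp = sum(row[c] for row in cm) - tp   # predicted c, but true class differs
        fn = sum(cm[c]) - tp                  # true c, but predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    return results


# Toy class probabilities [P(low), P(high)] for four counties (assumed values).
rf_probs = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7], [0.6, 0.4]]
gb_probs = [[0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.3, 0.7]]
y_true = [0, 1, 1, 1]

y_pred = soft_vote([rf_probs, gb_probs])
cm = confusion_matrix(y_true, y_pred, n_classes=2)
metrics = per_class_metrics(cm)

for label, (p, r, f1) in zip(("low", "high"), metrics):
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case the two models disagree on the second and fourth counties, and the averaged probabilities decide the final label. In practice the probabilities would come from fitted models, and SHAP values would be computed on those models separately.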

Data availability

The datasets used and analysed during the current study are available in the Kaggle repository, https://www.kaggle.com/datasets/guillemservera/grains-and-cereals-futures

Acknowledgments

This project was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2026R97), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. We also appreciate technical insights and collaborative support provided by the VLCMatrix Lab during this research.

Funding

This research was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2026R97), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. 

Author information

Author notes
  1. These authors contributed equally to this work: Tanzila Kehkashan and Muhammad Hamza.

Authors and Affiliations

  1. Faculty of Computing, Universiti Teknologi Malaysia, Johor Bahru, 81310, Malaysia

    Tanzila Kehkashan

  2. Faculty of Information Technology, University of Lahore, Sargodha, 40100, Pakistan

    Tanzila Kehkashan, Muhammad Hamza & Muhammad Abdullah Khan

  3. Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia

    Maha Abdelhaq

  4. Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, 19328, Jordan

    Ahmad Sami Al-Shamayleh

  5. Department of Software Engineering, College of Computing, Umm Al-Qura University, Mecca, 24381, Kingdom of Saudi Arabia

    Salman Z. Alharthi

  6. College of Computing & IT, Department of Data and Cybersecurity, University of Doha for Science and Technology, Doha, 24449, Qatar

    Adnan Akhunzada

Contributions

T. Kehkashan conceptualized the study, designed the methodology, and drafted the main manuscript. M. Abdelhaq and A. S. Al-Shamayleh contributed to the development of the predictive model and the interpretation of results. A. Akhunzada provided technical oversight, contributed to refining the manuscript, and supervised the overall research direction. S. Z. Alharthi contributed to the literature review and data curation. M. Hamza and M. A. Khan supported data preprocessing, experiment setup, and results visualization under supervision. All authors reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Muhammad Hamza.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kehkashan, T., Abdelhaq, M., Al-Shamayleh, A.S. et al. Sustainable regional corn yield prediction for the United States through interpretable machine learning approach. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43213-4

  • Received: 27 August 2025

  • Accepted: 02 March 2026

  • Published: 17 March 2026

  • DOI: https://doi.org/10.1038/s41598-026-43213-4

Keywords

  • Data-efficient algorithm
  • Agricultural consumer
  • Energy-efficient algorithm design
  • Intelligent applications
  • Sustainable computing
  • Efficiency
  • Environmental impact
  • Resource recycling