Scientific Reports
Bringing cross-validation into the real world to evaluate transferability of satellite-based vegetation models
  • Article
  • Open access
  • Published: 17 February 2026

  • Sean P. Kearney1,2,
  • David J. Augustine2,
  • Lauren M. Porensky2,
  • Erika S. Peirce1,2,
  • Mikael P. Hiestand2 &
  • …
  • Justin D. Derner2 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Ecology
  • Environmental sciences
  • Mathematics and computing

Abstract

Near-real-time mapping of vegetation using satellite imagery is becoming increasingly common and valuable across a wide range of ecosystems. The availability of large datasets has led many researchers to adopt complex machine learning algorithms (MLAs) to train satellite models. However, complex MLAs may underperform in the inherently extrapolative applications required for real-world vegetation monitoring. We used a dataset of nearly 10,000 training samples of standing herbaceous grazingland biomass collected over ten years to train progressively more complex MLAs, test them across progressively more extrapolative cross-validation (CV) groupings, and evaluate their transferability and consistency. The performance of all MLAs decreased substantially when tested against more extrapolative CV groupings. The commonly used approach of random k-fold CV produced overly optimistic performance (R²: 0.71–0.78) compared to the more realistic task of predicting for an unseen year (R²: 0.49–0.54). Simpler MLAs, such as partial least squares regression, were more consistent and outperformed complex MLAs for the most extrapolative tasks, and their performance was less sensitive to the distinctness of unseen test data. We conclude that random k-fold CV likely produces unrealistically optimistic expectations for real-world applications of satellite vegetation models, and could be associated with major prediction misses when models are used in novel environmental conditions.
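
The contrast between random k-fold CV and predicting for an unseen year can be illustrated with a minimal sketch of how the two kinds of splits are built. This is not the authors' code (which is available on request); the function names `random_kfold_splits` and `leave_one_group_out_splits`, the fold count, and the example years are all hypothetical, chosen only to show the idea using the Python standard library.

```python
import random
from collections import defaultdict


def random_kfold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for random k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield sorted(train_idx), sorted(test_idx)


def leave_one_group_out_splits(groups):
    """Yield (held_out, train_idx, test_idx), holding out one group (e.g. a year) at a time."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical sampling years for 12 plots (illustrative values only).
years = [2014, 2014, 2014, 2015, 2015, 2015,
         2016, 2016, 2016, 2017, 2017, 2017]

# Random k-fold: years in the test fold usually also occur in the training folds.
for train_idx, test_idx in random_kfold_splits(len(years), k=4, seed=0):
    shared = {years[i] for i in test_idx} & {years[i] for i in train_idx}
    print("random k-fold, years shared by train and test:", sorted(shared))

# Leave-one-year-out: the held-out year never occurs in the training set.
for year, train_idx, test_idx in leave_one_group_out_splits(years):
    assert year not in {years[i] for i in train_idx}
    print("held-out year:", year, "test size:", len(test_idx))
```

Under the random scheme a model is typically scored on samples from years it has already seen, whereas the grouped scheme forces it to extrapolate to a year absent from training, which is closer to how a near-real-time product would be used.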

Data availability

The datasets utilized in this study can be accessed in the Ag Data Commons repository located at: https://doi.org/10.15482/USDA.ADC/31271986.

Code availability

The code used for satellite processing, model fitting and visualizations is available from the corresponding author upon reasonable request.

Acknowledgements

We thank Nick Dufek, Tamarah Jorns, Averi Reynolds, Melissa Johnston, and numerous seasonal field technicians for collecting the ground-based visual obstruction (VO) data. Thanks to Nicole Kaplan for data management support. This research was a contribution from the Long-Term Agroecosystem Research (LTAR) network. LTAR is supported by the United States Department of Agriculture.

Funding

Funding came from the United States Department of Agriculture – Agricultural Research Service (USDA-ARS), including project number 3012-21500-001-000D. This research also used resources provided by the SCINet project and/or the AI Center of Excellence of the USDA-ARS, project numbers 0201-88888-003-000D and 0201-88888-002-000D.

Author information

Authors and Affiliations

  1. Thunder Basin Grasslands Prairie Ecosystem Association, Wyoming, USA

    Sean P. Kearney & Erika S. Peirce

  2. Rangeland Resources and Systems Research Unit, Agricultural Research Service – Plains Area, Wyoming, USA

    Sean P. Kearney, David J. Augustine, Lauren M. Porensky, Erika S. Peirce, Mikael P. Hiestand & Justin D. Derner

Contributions

SPK wrote the main manuscript text, was responsible for primary data analysis, and prepared the figures and tables. DJA and LMP contributed to writing the manuscript text. LMP, DJA and JDD provided supervisory and administrative support. ESP provided data analysis support. MPH and DJA provided data management and curation support. All authors provided intellectual input for methodological design and reviewed and edited the manuscript.

Corresponding author

Correspondence to Sean P. Kearney.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kearney, S.P., Augustine, D.J., Porensky, L.M. et al. Bringing cross-validation into the real world to evaluate transferability of satellite-based vegetation models. Sci Rep (2026). https://doi.org/10.1038/s41598-026-39866-w

  • Received: 19 August 2025

  • Accepted: 09 February 2026

  • Published: 17 February 2026

  • DOI: https://doi.org/10.1038/s41598-026-39866-w

Keywords

  • Grasslands
  • Rangelands
  • Machine learning
  • Satellite remote sensing
  • Biomass
Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)
