Background & Summary

Land cover (LC) is a fundamental element of the Earth system, linking the biosphere, atmosphere, and hydrosphere through its influence on energy exchange, water balance, and carbon cycling1,2,3,4,5. Organized into hierarchical categories, LC supports diverse habitats and plays a key role in environmental and ecological modeling6,7,8,9. Over recent decades, rapid urban expansion and intensified human activity have significantly altered land cover, placing growing pressure on ecosystems and affecting water resources, air quality, food systems, and biodiversity6,10,11,12,13. As such, accurate and timely information on LC and Land Cover and Land Use Changes (LCLUC) is critical for understanding Earth system dynamics, supporting sustainable development, ensuring food and water security, and managing natural resources effectively14,15,16.

Accurate LC data is especially important in transboundary regions, where ecosystems and natural resources often extend beyond political boundaries. Coordinated management in such areas depends on consistent and reliable datasets17,18,19. However, imbalances in data availability, such as along the U.S.–Mexico border, can hinder joint efforts to monitor and manage shared environments20,21. High-resolution land cover maps support cross-border collaboration, inform sustainable resource management, and enhance the resilience of both ecological systems and human communities22,23. Satellite remote sensing has significantly advanced LC monitoring by providing long-term, high-resolution Earth observation (EO) data24,25. Numerous studies have leveraged satellite imagery for global LC mapping. For instance, MODIS-based products like MCD12Q126 and GLASS datasets27 offer long time series but at coarse resolutions (500 m and 5 km), limiting their utility for detailed analyses. Efforts like the ESA Climate Change Initiative (300 m) have improved accuracy using multi-source data and machine learning, yet still lack fine spatial detail28. The emergence of freely available high-resolution EO data, such as Landsat and Sentinel-2, has enabled finer-scale mapping. Landmark contributions include the FROM_GLC (Finer Resolution Observation and Monitoring of Global Land Cover) 30 m global LC map and its 10 m successor, both utilizing extensive training data and advanced classification methods29,30. Several studies have developed integrated frameworks for land cover mapping that combine multiple classifiers and multisource remotely sensed data. For instance, the LCMM (Land-Cover Mapping with Multiple Classifiers and Multisource data) framework utilizes monthly time-series imagery from sensors such as Landsat and MODIS to enhance classification performance across heterogeneous landscapes31. 
Other approaches have employed comprehensive geographical partitioning in conjunction with hierarchical classification decision trees and benchmark-based change detection methods to ensure both spatial consistency and temporal reliability in long-term land cover mapping32. Despite recent advancements in LC mapping, many existing datasets classify land cover into broad categories such as forest, shrubland, developed areas, or agricultural fields, which limits their utility for detailed analysis33. More granular information on specific agricultural practices is essential in regions like the Middle Rio Grande (MRG), where effective management of interstate shared resources is critical20,23. The Cropland Data Layer (CDL) is currently the only publicly available dataset that offers detailed insights into crop-specific land use in the US34. However, its temporal coverage, beginning in 2008, and spatial limitation to the conterminous United States (CONUS) restrict its applicability in transboundary regions, where agricultural activities often rely on shared surface water basins and transboundary groundwater aquifers.

Recent advancements in deep learning architectures, particularly their ability to leverage transfer learning, have substantially enhanced the performance of LCLUC mapping35,36. A wide range of image segmentation models have been evaluated on benchmark remote sensing datasets such as Gaofen-2, ISPRS Urban Segmentation, DeepSat (SAT-4), EuroSAT, and the Munich dataset37,38,39,40,41,42,43. These high-resolution and diverse datasets provide robust platforms for assessing segmentation accuracy in EO applications. Leveraging these datasets, deep learning models have achieved significant improvements in classification precision, thereby advancing the effectiveness of LCLUC monitoring.

For example, U-Net has been employed to differentiate between bare and cultivated fields using Sentinel-2 imagery, achieving an overall accuracy (OA) of 90%44, while U-Net++ has been evaluated with varying patch sizes and encoder backbones for crop delineation tasks. The latter model achieved OAs ranging from 96.86% to 97.72% and F1 scores from 71.29% to 80.75% on Sentinel-2 imagery, outperforming its application to Gaofen imagery, where OA ranged from 75.34% to 97.72% and F1 scores from 54.89% to 73.25%44,45. Integrating U-Net within a generative adversarial network (GAN) framework has further improved mapping accuracy, yielding an mIoU of 58.62% and outperforming a standard CNN (mIoU: 49.28%)46. Comparative studies of architectures such as U-Net, DeepLabv3+, and SegNet applied to Landsat 8 data have demonstrated their capability to classify LC into key categories, including urban, agricultural, forest, and water classes; U-Net achieved the highest mIoU (55.09%) and OA (81.93%). In the same study, traditional machine learning classifiers were also evaluated, including Maximum Likelihood Estimation (mIoU: 21.23%, OA: 68.12%), Random Forest (mIoU: 26.14%, OA: 74.69%), and Support Vector Machines (mIoU: 24.67%, OA: 72.86%), all of which were significantly outperformed by the deep learning approaches47. These investigations highlight the growing utility of deep learning for high-resolution, scalable LC classification5,45,48.

More recently, the effectiveness of multitemporal input imagery and different semantic segmentation architectures has been investigated for the specific task of Crop and Land Cover Land Use (CLCLU) mapping33. In addition, related research explored the utility of multi-sensor imagery, including Landsat 8 optical data, Synthetic Aperture Radar (SAR), and Digital Elevation Models (DEMs), to enhance the accuracy and robustness of CLCLU classification49. In the aforementioned studies, models were trained using data from the United States, and transfer learning was applied to predict land cover patterns in Mexico, demonstrating the potential for cross-border generalization and the scalability of deep learning approaches in data-limited regions.

However, there remains a critical need for a high-resolution, long-term historical CLCLU dataset for the transboundary MRG region, one that maps major LC types along with predominant agricultural practices over multiple decades. To address this gap, we utilized multitemporal Landsat imagery (from Landsat 5 and 8), the only EO program providing consistent high-resolution data since 1984, to generate annual CLCLU maps50. To the best of our knowledge, this is the first dataset that captures agricultural practices in the MRG region across both sides of the U.S.–Mexico border. For model training and validation, we used the annual CDL for the U.S. side of the MRG (2008–2019) as ground truth and developed a semantic segmentation model using the Multi-Attention Network (MANet) architecture with a ResNeXt-101 encoder34,51,52. The trained model was then applied to predict CLCLU patterns on the Mexican side of the border. The spatial continuity and proximity of agricultural fields across the border enabled the use of transfer learning, under the assumption that agricultural practices exhibit similar patterns on both sides of the border.

The resulting CLCLU product represents the first binational dataset encompassing both the U.S. and Mexican regions of the Middle Rio Grande (MRG), providing annual land cover maps from 1994 to 2024 that capture key crop types and agricultural practices at a 30-meter spatial resolution. The generated maps were validated against the National Land Cover Database (NLCD) and MODIS MCD12Q1-UMD products for core LC classes, including water bodies, developed areas, and croplands53,54. Additionally, we performed a temporal consistency analysis between our product and the CDL over overlapping years. To the best of our knowledge, this is the only data-driven product that relies exclusively on optical EO data to provide a scalable, long-term CLCLU monitoring solution. The insights offered by this dataset are especially valuable given the socio-environmental significance of agriculture in the U.S.–Mexico borderlands, which supports rural livelihoods, food production, and shared water resources. The region faces mounting challenges, including diminishing water availability, declining water quality, and increasing economic and environmental pressures55,56,57. This product can help track long-term trends in land use and cropping patterns, identify key drivers of change, and support early warning efforts for emerging risks such as severe climatic events, prolonged droughts, and yield reductions that may impact regional agricultural sustainability.

Methods

This section outlines the processing workflow employed in this study to generate the annual CLCLU dataset for MRG. The workflow encompasses data preprocessing and analysis, training and validation dataset construction, model training procedures, accuracy assessment, prediction, and inter-comparison with benchmark products (Fig. 1). Each step is described in detail in the following sections.

Fig. 1
figure 1

The flow chart to generate the CLCLU dataset for the Middle Rio Grande (MRG).

Study area

This study investigates CLCLU in the Middle Rio Grande Basin (36,988 km2), extending from San Antonio, NM, to Presidio, TX, and Ojinaga, Chihuahua (Fig. 2). The Rio Grande River, originating in Colorado’s San Juan Mountains, is a vital hydrological resource governed by the international treaties of 1906 and 194458,59. The MRG region features a variety of land cover types, ranging from urban areas and crop fields to natural landscapes. The CLCLU dataset aims to support adaptive water management strategies in this ecologically and socio-economically significant transboundary watershed, addressing challenges such as climate-driven alterations in water availability, increasing anthropogenic demand, and declining river discharge.

Fig. 2
figure 2

Middle Rio Grande watershed.

Satellite data source

The Landsat mission has provided valuable EO data at a 30-meter spatial resolution since 1984, making it well-suited for large-scale LC monitoring. This study uses atmospherically corrected surface reflectance and land surface temperature (LST) datasets derived from both Landsat 5 Thematic Mapper (TM) (https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T1_L2#description) and Landsat 8 Operational Land Imager/Thermal Infrared Sensor (OLI/TIRS) imagery (https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2#description). The selected bands include four visible and near-infrared (VNIR) bands, two short-wave infrared (SWIR) bands, and one thermal infrared (TIR) band, all standardized to a 30-meter resolution through orthorectification. These spectral bands were chosen for their established relevance in distinguishing key land cover types and crop phenological stages: VNIR and SWIR for capturing vegetation structure and soil properties, and TIR for representing thermal conditions associated with irrigation and cropping intensity. Additionally, NDVI was included as a derived feature to enhance sensitivity to green vegetation cover60,61,62,63,64. The Landsat Collection 2 Level-2 Tier 1 products used here incorporate improved geometric and radiometric corrections, removal of extreme-latitude DEM constraints, and adjustments for TIRS anomalies, thereby increasing data fidelity, especially at high latitudes50,65,66. Additional processing details are available in the Data Format Control Books67.
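The NDVI feature mentioned above follows the standard normalized-difference formulation, (NIR − Red)/(NIR + Red). A minimal sketch (the band values and the small epsilon guard are illustrative, not from the paper):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    # eps guards against division by zero over water or no-data pixels
    return (nir - red) / (nir + red + eps)

# Toy surface-reflectance values (scaled 0-1): vegetation reflects strongly in NIR,
# so dense vegetation yields high NDVI while bare soil stays near zero.
nir = np.array([0.45, 0.30, 0.10])
red = np.array([0.05, 0.10, 0.09])
print(ndvi(nir, red))  # first pixel (dense vegetation) -> ~0.8
```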

Benchmark LC products

The USDA National Agriculture Statistics Service (NASS) Cropland Data Layer (CDL) (https://developers.google.com/earth-engine/datasets/catalog/USDA_NASS_CDL), derived from Landsat 8/9 OLI/TIRS and Sentinel-2A/2B multispectral imagery throughout growing seasons, provides comprehensive 30-m resolution coverage across the CONUS since 2008, achieving classification accuracies between 85% and 95%, varying by crop type34. Agricultural ground reference data originates from the Farm Service Agency (FSA) Common Land Unit (CLU) Program, supplemented by additional state-specific datasets from sources such as the US Bureau of Reclamation and the California Department of Water Resources68.

The MODIS Land Cover Type Product (MCD12Q1) (https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MCD12Q1) provides global LC maps at an annual temporal resolution and a spatial resolution of 500 meters from 2001 to the present. This dataset includes several science data sets (SDSs) representing multiple classification schemes, such as the International Geosphere-Biosphere Programme (IGBP) and University of Maryland (UMD) schemes, alongside a three-layer scheme based on the Land Cover Classification System (LCCS) of the Food and Agriculture Organization (FAO). The MCD12Q1 product is generated using supervised classification techniques applied to smoothed spectro-temporal features derived from MODIS Nadir BRDF-Adjusted Reflectance (NBAR) data, subsequently refined by Hidden Markov Models to reduce inter-annual variability26,54.

The National Land Cover Database (NLCD) (https://developers.google.com/earth-engine/datasets/catalog/USGS_NLCD_RELEASES_2019_REL_NLCD) is a multi-temporal dataset developed by the U.S. Geological Survey (USGS) and multiple federal partners, providing detailed land cover and land cover change information for the CONUS at 30-meter resolution. Initially released for 1992 and updated in 2001, 2006, and 2011, the latest generation of NLCD integrates advanced methodologies, including multi-source data fusion, hierarchical classification strategies, and expert knowledge. It employs automated processes for assembling and preprocessing Landsat imagery, multi-temporal change detection, and continuous field modeling to ensure spatial and temporal consistency. NLCD enhances previous databases by including new LC classes, improving shrubland and grassland representation, and refining urban impervious surface and tree canopy cover layers69.

Spectral configuration

Considering the phenological cycle of dominant crop types in the MRG region, where July captures peak vegetation conditions and December reflects post-harvest or dormant states, median composite imagery from these two months was identified as the optimal temporal configuration for CLCLU delineation. Among the input strategies tested, the July–December combination yielded the highest segmentation performance using MANet (mIoU: 75.81), outperforming the annual median RGB bands (mIoU: 72.99), annual medians (mIoU: 74.70), and seasonal median composites (mIoU: 75.31)33. Thus, the same spectral combination was consistently employed as input features for training, validation, and prediction in this study. Additional data sources, such as synthetic aperture radar (SAR) imagery or terrain information, were not included due to their unavailability for the entire study period (1994–2024). Table 1 provides detailed information on the spectral combinations used.

Table 1 Spectral Characteristics of Landsat Imagery Used for Training and Validation.

Training sample generation

The data utilized in this study were sourced from the Google Earth Engine (GEE) catalog (Landsat Collection 2, Level 2, Tier 1)70. Dual-month (July and December) Landsat composites were generated for the Region of Interest (RoI) through the GEE platform, using median pixel values to mitigate cloud cover and ensure reliability. All composite bands were normalized using their minimum and maximum values. Based on the CDL raster for the U.S. side of the MRG, 41 LC types were identified from 2008 to 2023. Agricultural classes were categorized into Alfalfa/Hay, Cotton, Pecan, and an aggregated “other crops” class, while non-agricultural areas were grouped into Forest/Shrubland, Barren/Grassland, Water bodies, Developed, and Background. The Background class represents pixels corresponding to no-data values, pixels falling outside the defined class mapping scheme, or pixels located beyond the RoI (Table S.1). For each year, 10,000 tiles of 64 × 64 pixels per class were generated for training and 1,000 tiles per class for validation, each uniquely identified by its center pixel. Tile center pixels were sampled randomly with a uniform distribution across classes, resulting in a dataset configuration of 80,000 training tiles and approximately 8,000 validation tiles per year.
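The normalization and center-pixel tiling steps above can be sketched as follows. This is a minimal NumPy illustration: the helper names, the 8-band toy composite, and the random data are illustrative assumptions, not the authors' GEE/export code.

```python
import numpy as np

def minmax_normalize(band: np.ndarray) -> np.ndarray:
    """Scale one composite band to [0, 1] using its own min and max."""
    lo, hi = band.min(), band.max()
    return (band - lo) / (hi - lo) if hi > lo else np.zeros_like(band, dtype=float)

def extract_tile(raster: np.ndarray, center_row: int, center_col: int,
                 size: int = 64) -> np.ndarray:
    """Cut a size x size window identified by its center pixel -> (bands, size, size)."""
    half = size // 2
    return raster[:, center_row - half:center_row + half,
                     center_col - half:center_col + half]

# Toy dual-month composite: 8 bands on a 200 x 200 grid (real tiles come from
# July/December median Landsat composites).
rng = np.random.default_rng(0)
composite = rng.random((8, 200, 200))
composite = np.stack([minmax_normalize(b) for b in composite])
tile = extract_tile(composite, 100, 100)
print(tile.shape)  # (8, 64, 64)
```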

Semantic segmentation architecture

Multi-Attention Network (MANet) is a semantic segmentation architecture designed specifically for high-resolution remote sensing imagery51. This architecture incorporates an attention mechanism that efficiently explores intricate feature combinations extracted by the encoder module33,49. MANet’s key innovation lies in its attention mechanism, which integrates kernel attention with linear complexity, thereby significantly reducing the computational requirements of the attention module. This approach advances over traditional methods by generalizing the dot-product attention (Eq. 1) to a kernel-based formulation (Eq. 2), referred to as the Kernel Attention Mechanism (KAM).

$$D\left(Q,K,V\right)={{softmax}}_{{row}}\left(Q{K}^{T}\right)V$$
(1)
$${D(Q,K,V)}_{i}=\frac{\mathop{\sum }\limits_{j=1}^{N}\phi {\left({q}_{i}\right)}^{T}\varphi \left({k}_{j}\right){v}_{j}}{\mathop{\sum }\limits_{j=1}^{N}\phi {\left({q}_{i}\right)}^{T}\varphi \left({k}_{j}\right)}$$
(2)

where

$$\varphi \left(x\right)=\phi \left(x\right)=\log (1+{e}^{x})$$
(3)

The Channel Attention Mechanism (CAM) further enhances the segmentation outputs by aggregating spatial information through max- and average-pooling of the feature maps, as follows.

$$\begin{array}{c}{C}_{{MP}}={MaxPool}(x)\\ {C}_{{AP}}={AvgPool}(x)\end{array}$$
(4)

The MANet architecture employed in this study leverages a ResNeXt-101 encoder to extract deep, high-level feature representations from the input data. For additional details on the network architecture and its underlying mechanisms, please refer to the supplementary document.
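The linear-complexity property of the kernel attention in Eq. 2 can be illustrated numerically. The sketch below, an assumption-laden NumPy illustration rather than the authors' implementation, uses the softplus mapping of Eq. 3 for both φ and ϕ and verifies that the O(N·d²) factorized form matches the explicit O(N²) sum:

```python
import numpy as np

def softplus(x):
    """phi(x) = log(1 + e^x), the kernel feature map from Eq. 3."""
    return np.log1p(np.exp(x))

def kernel_attention(Q, K, V):
    """Kernel Attention (Eq. 2) in linear complexity.

    Instead of forming the N x N matrix phi(Q) phi(K)^T, precompute
    phi(K)^T V (d x d) and the column sum of phi(K) (d,), so the cost
    scales with N rather than N^2."""
    phi_q = softplus(Q)                 # (N, d)
    phi_k = softplus(K)                 # (N, d)
    kv = phi_k.T @ V                    # (d, d) summary of keys and values
    norm = phi_q @ phi_k.sum(axis=0)    # (N,) denominator of Eq. 2
    return (phi_q @ kv) / norm[:, None]

rng = np.random.default_rng(0)
N, d = 6, 4
Q, K, V = rng.standard_normal((3, N, d))
out = kernel_attention(Q, K, V)

# Explicit O(N^2) evaluation of Eq. 2 for comparison
A = softplus(Q) @ softplus(K).T
ref = (A @ V) / A.sum(axis=1, keepdims=True)
print(np.allclose(out, ref))  # True
```

Both forms give identical outputs; the factorized version is what makes the attention module affordable for large remote-sensing feature maps.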

Training protocol

For the period from 2013 to 2024, individual models were trained separately for each year using the corresponding annual training dataset (S.0). Each model was trained for 200 epochs with a batch size of 128 and an initial learning rate of 0.001, optimized using the Adaptive Moment Estimation (Adam) algorithm. To address class imbalance—common in Land Cover (LC) classification—weighted cross-entropy loss was employed, with class weights assigned inversely proportional to class frequency. Additionally, stratified sampling was incorporated using PyTorch’s WeightedRandomSampler to balance class representation within each training batch. For the Landsat 5 period (1994–2011), three distinct training strategies were implemented: S.1 involved initial training on Landsat 8 imagery (2013–2024) followed by fine-tuning using Landsat 5 imagery from years 2008 and 2011; S.2 combined training imagery from 2008, 2011, and the Landsat 8 period (2013–2024); S.3 utilized only Landsat 5 imagery from 2008 and 2011. To improve generalization and robustness, data augmentation techniques such as random rotations and horizontal/vertical flips were applied. All training processes were conducted using two NVIDIA RTX A6000 GPUs.
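The inverse-frequency weighting used for both the weighted cross-entropy loss and the stratified sampler can be sketched as below. This is a minimal sketch under the stated scheme (weights inversely proportional to class frequency); the helper name, the mean-1 normalization, and the toy labels are assumptions for illustration:

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray, n_classes: int) -> np.ndarray:
    """Per-class weights inversely proportional to pixel frequency,
    rescaled so the mean weight is 1 (a common, illustrative convention)."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    counts[counts == 0] = 1.0           # guard against classes absent this year
    w = 1.0 / counts
    return w * (n_classes / w.sum())

# Toy label map: class 0 dominates, class 2 is rare.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
w = inverse_frequency_weights(labels, 3)
print(w)  # [0.3 0.9 1.8] -> the rare class gets the largest weight
# In PyTorch these weights would feed torch.nn.CrossEntropyLoss(weight=...)
# and, expanded per-sample, torch.utils.data.WeightedRandomSampler.
```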

Data Records

The dataset is available at Zenodo71: https://doi.org/10.5281/zenodo.15116835. The CLCLU dataset comprises 30 annual raster files (see Table 2 for descriptive characteristics) spanning from 1994 to 2024, representing the first long-term, high-resolution Crop and Land Cover Land Use (CLCLU) classification product for the transboundary Middle Rio Grande (MRG) region. Annual classified maps are provided in GeoTIFF format with a spatial resolution of 30 meters and are referenced to the EPSG:4326 coordinate system. Each file contains pixel-level land cover and crop type classifications derived from Landsat 5 and Landsat 8 imagery. The pixel values represent the following CLCLU classes:

  • 0 when the pixel is Alfalfa or Hay

  • 1 when the pixel is Cotton

  • 2 when the pixel is Pecan

  • 3 when the pixel is other crops

  • 4 when the pixel is Forest or Shrubland

  • 5 when the pixel is Grassland or Barren

  • 6 when the pixel is Water body

  • 7 when the pixel is Developed

  • 8 when the pixel is Background
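The pixel codes listed above can be decoded with a simple lookup; the dictionary mirrors the class list, while the helper function and the agricultural-code set are illustrative conveniences, not part of the released files:

```python
# Pixel-value -> class-name lookup for the CLCLU GeoTIFFs (codes as listed above).
CLCLU_CLASSES = {
    0: "Alfalfa/Hay",
    1: "Cotton",
    2: "Pecan",
    3: "Other crops",
    4: "Forest/Shrubland",
    5: "Grassland/Barren",
    6: "Water body",
    7: "Developed",
    8: "Background",
}

AGRICULTURAL_CODES = {0, 1, 2, 3}  # the crop-specific classes

def decode(value: int) -> str:
    """Map a raster pixel value to its CLCLU class name."""
    return CLCLU_CLASSES.get(value, "Unknown")

print(decode(2))                 # Pecan
print(6 in AGRICULTURAL_CODES)   # False (Water body is not a crop class)
```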

Table 2 Descriptive characteristics of the Crop & Land cover Land use (CLCLU) dataset.

Technical Validation

For models trained on the Landsat 8 period (2013–2024), validation was performed using the corresponding year’s validation dataset. For the Landsat 5 period, models S.1, S.2, and S.3 were validated using data from the years 2008 and 2011 only. Years 2009 and 2010 were excluded from both training and validation due to significant inconsistencies in the CDL ground truth maps, while 2012 was omitted due to data quality issues in Landsat 7 imagery. Once trained and validated, all three models (S.1, S.2, and S.3) were used to predict LC for each year from 1994 to 2011. The final prediction for each year was derived by consolidating outputs: common predictions among models were retained, and discrepancies were resolved by selecting the output from the model with the highest Intersection over Union (IoU) score, supported by visual inspection. The evaluation metrics used to assess segmentation performance are summarized in Table 3.
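The consolidation rule described above, keep unanimous pixels and otherwise defer to the highest-IoU model, can be sketched as follows. The function name, toy maps, and IoU values are illustrative; the visual-inspection step is not reproducible in code:

```python
import numpy as np

def consolidate(preds, ious):
    """Merge per-model class maps: keep pixels where all models agree,
    otherwise fall back to the model with the highest IoU score."""
    stacked = np.stack(preds)                        # (n_models, H, W)
    unanimous = (stacked == stacked[0]).all(axis=0)  # True where all agree
    best = preds[int(np.argmax(ious))]
    return np.where(unanimous, stacked[0], best)

# Toy 2 x 2 class maps from strategies S.1-S.3; S.2 has the best IoU here.
s1 = np.array([[1, 4], [6, 7]])
s2 = np.array([[1, 5], [6, 7]])
s3 = np.array([[1, 4], [6, 8]])
merged = consolidate([s1, s2, s3], ious=[0.74, 0.78, 0.71])
print(merged)  # [[1 5] [6 7]]: unanimous pixels kept, disagreements from S.2
```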

Table 3 Evaluation Metrics for Semantic Segmentation Model Assessment.

Accuracy assessment of CLCLU

The performance of trained MANet models for CLCLU delineation was assessed using multiple metrics across various classes and temporal validation datasets (Fig. 3 and Table S.2). The Intersection over Union (IoU) metric indicated consistent classification accuracy among the evaluated categories. Specifically, Alfalfa/Hay exhibited an average IoUC of 76.01%, ranging from a minimum of 68.76% to a maximum of 81.49%. The Cotton class achieved an average IoUC of 79.85%, with values spanning from 71.92% to 87.23%. For Pecan, the mean IoUC was 82.93%, ranging between 76.63% and 88.06%. The ‘Other crops’ category demonstrated an average IoUC of 71.66%, varying from 60.85% to 79.03%. Forest/Shrubland achieved a mean IoUC of 78.07%, with minimum and maximum values of 73.96% and 82.81%, respectively. Grassland/Barren had an average IoUC of 62.76%, exhibiting considerable variability (48.78–70.64%). The Water bodies category showed high accuracy, averaging 84.11% IoUC, and ranged from 80.23% to 86.76%. Developed areas displayed a mean IoUC of 74.06%, fluctuating between 63.60% and 79.17%. The Background class consistently demonstrated the highest IoUC, averaging 98.33%, with a narrow range of 97.53% to 98.92%.

Fig. 3
figure 3

(A) Class-wise IoU across all years, highlighting variability and performance consistency among target classes. (B) Annual trends in Accuracy, mIoU, Recall, F1-score, and F2-score for CLCLU maps for trained MANet models for CLCLU delineation.

Overall classification accuracy across all evaluated years averaged 97.10%, ranging from 96.40% (S.2) to 97.69% (2024). The mIoU was 78.85%, with values spanning from 73.21% (S.2) to 82.77% (2016), indicating robust class separability across land cover types. Recall demonstrated strong average performance at 90.58%, varying between 88.49% (2022) and 92.34% (2016), reflecting the model’s consistency in detecting relevant classes. Notably, the F1-score averaged 87.83%, ranging from a minimum of 83.82% (S.2) to a maximum of 90.40% (2016), underscoring the model’s reliable overall performance. The F2-score, which emphasizes recall more heavily, showed similarly strong results, averaging 89.40%, with a minimum of 86.45% (S.2) and maximum of 91.52% (2016). These results highlight the model’s precision and sensitivity in delineating key land cover and crop classes across diverse years. Detailed metrics are presented in Table S.2 of the supplementary materials.

Intercomparison with existing thematic products

In addition to comparisons with the CDL for the available years (2008–2014), our dataset was also evaluated against other established land cover products, including the National Land Cover Database (NLCD) and the MODIS Land Cover Type Product (MCD12Q1-UMD classification). These thematic datasets were spatially aggregated to a uniform grid size of 0.05° × 0.05°, facilitating the computation of area fractions for each class. To quantitatively assess the level of agreement among these products, scatter plots were generated, and evaluation metrics including the coefficient of determination (R²) and root mean square error (RMSE) were calculated to quantify their correspondence. This grid-based area fraction comparison method was adopted from a study that applied a similar approach to evaluate a 30-meter annual land cover dataset for China against products with significant differences in resolution and thematic classification schemes5. Figure 4 illustrates the area fractions comparing our product with NLCD datasets for crop fields, developed areas, and water bodies on the U.S. side of the MRG for the years 2001, 2004, and 2006. On average, our dataset achieved an R² of 0.8780 and an RMSE of 0.0938 for crop fields, an R² of 0.9143 and an RMSE of 0.0647 for developed areas, and an R² of 0.7953 and an RMSE of 0.0757 for water bodies.
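The grid-based area-fraction comparison can be sketched as below. In this illustration a block of pixels stands in for one 0.05° cell, and the random toy maps, function names, and block size are assumptions, not the study's actual aggregation code:

```python
import numpy as np

def area_fractions(class_map, target_class, block=10):
    """Fraction of pixels equal to target_class within each block x block cell
    (a stand-in for aggregating 30 m pixels onto a 0.05-degree grid)."""
    h, w = class_map.shape
    cells = class_map[:h - h % block, :w - w % block].reshape(
        h // block, block, w // block, block)
    return (cells == target_class).mean(axis=(1, 3))

def r2_rmse(a, b):
    """Coefficient of determination and RMSE between two fraction grids."""
    a, b = a.ravel(), b.ravel()
    ss_res = ((a - b) ** 2).sum()
    ss_tot = ((a - a.mean()) ** 2).sum()
    return 1 - ss_res / ss_tot, np.sqrt(((a - b) ** 2).mean())

rng = np.random.default_rng(1)
ours = rng.integers(0, 9, size=(100, 100))   # toy CLCLU class map
ref = ours.copy()
ref[rng.random((100, 100)) < 0.05] = 6       # a reference map with small disagreements
r2, rmse = r2_rmse(area_fractions(ours, 6), area_fractions(ref, 6))
print(round(r2, 3), round(rmse, 3))
```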

Fig. 4
figure 4

Comparison of area fractions aggregated at 0.05° × 0.05° spatial resolution between our dataset (CLCLU) and NLCD for crop fields, developed areas, and water bodies for the years 2001, 2004, and 2006 (U.S. only).

Notably, the year 2004 exhibited the lowest performance metrics, underscoring the significant adverse impact of cloud cover and shadows, and of their spatial positioning, on accurate CLCLU delineation and overall classification reliability (Fig. 5). Specifically, cloud contamination and associated shadows disrupted spectral signatures, leading to substantial misclassification and reduced model accuracy.

Fig. 5
figure 5

Landsat 5 composites for the years 2001, 2004, and 2006, illustrating the presence of shadows and cloud cover in the input imagery.

Figures 6 and 7 present comparisons of area fractions for developed areas and crop fields between our dataset and the MCD12Q1-UMD product for the period from 2001 to 2023, encompassing both the U.S. and Mexican sides of the Middle Rio Grande (MRG). On average, our dataset exhibits strong spatial agreement with MCD12Q1-UMD, yielding an R² of 0.8661 and an RMSE of 0.1207 for developed areas, and an R² of 0.8557 and an RMSE of 0.0990 for crop fields. However, a noticeable decline in R² is observed for developed areas from 2021 to 2023. We attribute this reduction primarily to improvements in the classification of developed areas in the CDL map beginning in 2021, combined with the classification criteria of the MCD12Q1-UMD product, which requires a pixel to exhibit at least 60% impervious (built-up) surface to be categorized as developed. For more details regarding the data associated with the intercomparison of our product with NLCD and MCD12Q1-UMD, please refer to the Supplementary Materials section S.2.

Fig. 6
figure 6

Comparison of area fractions aggregated at 0.05° × 0.05° spatial resolution between our dataset (CLCLU) and MCD12Q1-UMD for developed areas (2001–2023).

Fig. 7
figure 7

Comparison of area fractions aggregated at 0.05° × 0.05° spatial resolution between our dataset (CLCLU) and MCD12Q1-UMD for Crop Fields (2001–2023).

LC dynamics and temporal trends

Figure 8 presents temporal trends of various land cover classes for the U.S. side of the Middle Rio Grande (MRG), comparing predicted land cover areas (1994–2024) against the CDL dataset (2008–2024). Overall, strong agreement is observed between our predictions and CDL across most years. However, notable discrepancies occur in cotton areas for 2009 and developed areas for 2010. To further investigate and evaluate these discrepancies, we have visualized both our predicted land cover and the corresponding CDL classifications for these specific years (Fig. 9).

Fig. 8
figure 8

Temporal trends of land cover class areas for the U.S. side of the MRG, comparing predicted areas (1994–2024) with the CDL (2008–2024).

Fig. 9
figure 9

Comparison of CDL and our Product for years 2008 to 2011.

Figure 9 compares our product with CDL maps for the years 2008 to 2011. In 2009, CDL notably misclassified certain areas as cotton, whereas our models identified land cover classes more consistent with the input imagery, demonstrating enhanced temporal stability. Similarly, in 2010, the CDL maps underrepresented developed areas, particularly around El Paso, while our models again provided more accurate and temporally stable classifications when compared to the Landsat 5 imagery.

Usage Notes

The CLCLU maps presented in this study constitute the first product capable of providing annual land cover monitoring at a 30-meter spatial resolution over an extended temporal period (1994–2024) utilizing solely optical imagery as inputs. However, reliance on optical data inherently introduces vulnerability to artifacts arising from cloud cover and shadows. Although median Landsat composites were used, these artifacts can still disrupt the spectral signatures associated with various land cover classes, thereby adversely affecting classification accuracy.

The impact of cloud- and shadow-induced artifacts is less pronounced for the years where CDL maps are available on the U.S. side of the MRG region (2008–2024), as dedicated models were fine-tuned annually for each specific year. In contrast, for earlier years (1994–2007), three distinct training strategies were employed, potentially making these years more susceptible to such disruptions and undermining the temporal stability of the generated maps. The training strategies include: (S.1) initial training on Landsat 8 imagery (2013–2024) with subsequent fine-tuning using Landsat 5 imagery from 2008 and 2011; (S.2) combined training with imagery from 2008, 2011, and the Landsat 8 period (2013–2024); and (S.3) exclusive use of Landsat 5 imagery from 2008 and 2011. As demonstrated in Fig. 5, susceptibility to cloud and shadow presence can affect the robustness and consistency of land cover classifications for earlier periods. Previous studies have demonstrated the effectiveness of incorporating radar imagery to mitigate the adverse effects of cloud contamination49. However, for the earlier period (1994–2014), radar sensor imagery is unavailable72. To overcome this limitation, some research has suggested applying generative models to synthesize radar imagery corresponding to available optical datasets73,74. Despite this innovative approach, the use of generated radar imagery can introduce additional uncertainties into the classification models. Furthermore, it has been shown that while incorporating multi-sensor data such as synthetic aperture radar (SAR) imagery or digital elevation models (DEMs) significantly contributes to the delineation of developed areas and water bodies, it does not substantially improve the discrimination of various crop types49. Thus, the benefits of multi-sensor fusion for crop delineation, specifically within agricultural domains, remain limited.

The CDL, while highly detailed in its delineation of crop types and agricultural practices, does not seem to incorporate any form of temporal consistency framework34,75,76. This absence is reasonable, particularly given the temporal variability inherent in agricultural land use, which contrasts with the relatively stable patterns observed in other land cover types. Constraining predictions through posterior probabilities or temporal conditioning, as is common in approaches utilizing Markov Chains or similar methods, may inadvertently introduce additional uncertainties into model outputs due to the dynamic nature of agricultural activities. Therefore, in this study, no temporal consistency strategies were implemented to condition predicted classes on prior or posterior class probabilities. The development of novel methodologies capable of integrating temporal consistency into simultaneous CLCLU delineation represents a critical area for future research. Such frameworks could enhance the robustness and reliability of annual mapping products, particularly in regions characterized by rapid or recurrent changes in agricultural practices.

For comparative analysis, we examined two widely used LC datasets: NLCD and the MCD12Q1-UMD product. Because the CDL lacks temporally continuous data for model fine-tuning from 1994 to 2008, the resulting maps for that period reflect direct predictions generated by our proposed method. Quantitative comparisons demonstrate a high degree of spatial agreement between our product and these reference datasets. However, it is important to note that neither NLCD nor MCD12Q1-UMD provides information on crop-specific classifications or management practices to the extent that the CDL does. As a result, accurate validation and inference regarding individual crop types for the years prior to 2008 remain limited. Because no ground-reference surveys were directly used for the transboundary MRG region and the CDL was the sole surrogate label source during training, the maps, especially those for 1994–2007, are model-derived estimates generated without fine-tuning or direct observations; they should therefore be interpreted with caution.