Background & Summary

Surface albedo is a key parameter that quantifies the fraction of incoming solar radiation reflected by the Earth's surface, ranging from 0 (no reflection) to 1 (total reflection)1,2. It plays an important role in understanding how different land surfaces interact with solar energy3,4, and significantly influences the urban heat island (UHI) effect by determining how much solar energy is absorbed or reflected by urban surfaces5,6,7. High albedo surfaces, such as light-colored roofs or pavements, can reflect more solar radiation, thereby reducing surface temperatures8,9 and mitigating the UHI effect10,11. However, while high albedo lowers land surface temperature, it may also increase mean radiant temperature (Tmrt)12,13, leading to higher perceived heat exposure and worsening outdoor thermal comfort14,15,16. Excessive urban heat can severely affect public health, increasing the risk of heat-related illnesses and mortality17,18, especially among vulnerable populations, such as older adults19,20, economically disadvantaged groups21, marginalized communities22, and outdoor workers23,24. Beyond health impacts, extreme heat also disrupts transportation systems25, increases infrastructure vulnerability26, reduces economic productivity27, and deepens socioeconomic disparity28,29.

Accurate mapping of urban surface albedo is therefore essential for effective microclimate modeling, enabling reliable predictions of thermal responses under various urban planning scenarios2,13,30,31. Additionally, understanding and manipulating surface albedo can lead to significant reductions in energy consumption for cooling buildings32,33, thereby contributing to energy efficiency efforts34,35. As cities continue to expand and climate change accelerates, the role of urban albedo in climate and radiation models36,37, such as the solar and longwave environmental irradiance geometry (SOLWEIG) model38, becomes increasingly important. Precise estimation of albedo is thus not only vital for climate adaptation at the micro-scale but also contributes to broader sustainability and livability goals.

Various retrieval algorithms have been developed to estimate albedo from satellite data39,40,41,42,43,44. These methods fall broadly into physically-based and semi-empirical algorithms, each with distinct advantages and inherent limitations43. Physically-based algorithms often rely on detailed radiative transfer models and comprehensive environmental inputs, and are widely used with coarse-resolution data, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) data45. They can be constrained by the availability of high-resolution, accurate input data, making them less practical for real-time or large-scale urban applications. On the other hand, semi-empirical algorithms combine empirical data with simplified theoretical models. Lucht et al.46 estimated the surface albedo using a semi-empirical linear kernel-driven model. Together with that paper, a series of studies47,48,49,50 have laid the foundation for estimating surface albedo through bidirectional reflectance distribution function (BRDF) modeling. However, these methods can suffer from issues of accuracy and generalizability. They often rely on assumptions and empirical relationships that may not be consistent across different urban environments or varying atmospheric conditions. Previous studies have shown that without correction for the effect of surface BRDF, errors in fine-resolution data can be as high as 60%51. Therefore, many studies have focused on correcting the BRDF effects without BRDF model inversion and calculating surface albedo from Landsat data52,53. Although the results over the snow-free surfaces are promising, both types of algorithms struggle with coarse resolution and coverage limitations. They may not be able to consistently map albedo across all urban areas, particularly those with complex geometries or mixed material compositions.
This limitation highlights the critical need for high-resolution albedo data in urban environments, where initiatives like the cool roof movement and white pavement strategies are increasingly implemented to combat urban heat islands54,55. Such interventions require precise albedo measurements to effectively assess their impact on urban climates, particularly in densely built areas where the heterogeneity of surfaces can significantly influence the local thermal dynamics.

Recent advances in remote sensing and deep learning have significantly enhanced the capability to accurately map fine-scale surface albedo, especially as spatial resolutions improve from a few tens of meters to sub-meter. The emergence of high-resolution imagery, such as Sentinel-2 satellite imagery (10-m) and National Agriculture Imagery Program (NAIP) aerial imagery at sub-meter resolution, has opened new directions for environmental monitoring. In addition, the application of deep learning models, particularly architectures like U-Net56, has revolutionized the analysis of these high-resolution images. These models are proficient in handling complex spatial hierarchies and are particularly good at segmenting high-resolution data, making them invaluable for tasks that require precise delineation of surface types to accurately estimate albedo.

Current surface albedo datasets predominantly offer 30-m or 10-m resolutions57, which do not meet the requirements for micro-scale modeling essential in detailed urban climate studies. There is no sub-meter albedo dataset to our knowledge. To fill this gap, this study has developed a dataset predicting sub-meter resolution albedo for both impervious and pervious surfaces using deep learning methods applied to NAIP imagery across 34 major U.S. cities. This integrated dataset is expected to significantly benefit micro-scale urban environmental monitoring and climate-related decision-making processes. The primary purpose of this study is to provide a high-resolution and spatially continuous albedo product that can support fine-scale urban microclimate modeling, especially for human heat exposure estimation, urban heat island mitigation, and sustainable urban design.

Methods

The overall workflow in Fig. 1 integrates various datasets and processing steps to accurately predict impervious and pervious surface albedo. Initially, high-resolution NAIP imagery, roof albedo data, building footprints, land cover classifications, and Sentinel-2 images were collected. These inputs went through several preprocessing steps: NAIP imagery was clipped to isolate roof areas for impervious albedo assessment; building footprints were used to mask and refine impervious surface delineations; land cover data were reclassified to distinguish between impervious and pervious surfaces; and pervious albedo was directly estimated from Sentinel-2 data. These preprocessed datasets were then divided into training and validation sets to develop three primary models: a U-Net model for urban impervious surface (UIS) classification, an impervious surface albedo (ISA) model, and a pervious surface albedo (PSA) model. Each model’s performance was evaluated using appropriate metrics such as accuracy, R², MAE, and MSE. This structured approach allowed us to merge and mosaic the model outputs to create comprehensive and high-resolution urban albedo maps.

Fig. 1

Overall framework for surface albedo mapping: (1) collecting diverse data sources; (2) preprocessing for model readiness; (3) model training and evaluating for accurate urban albedo mapping.

Study area and data sources

Remote sensing provides a critical method of obtaining high-resolution spatial data necessary for urban thermal environment modeling, and has been widely used in related studies58,59,60,61,62. This study utilized multiple remote sensing datasets from 34 major cities across the contiguous United States, covering a wide range of urban environments. As shown in Fig. 2, these cities are home to a substantial fraction of the U.S. population and showcase a wide range of built environments across the four principal geographic regions of the country: West, Midwest, South, and Northeast. The city boundaries were obtained from OpenStreetMap, offering up-to-date and detailed urban outlines.

Fig. 2

34 major U.S. cities’ geospatial distribution in four geographical zones.

Five types of datasets were employed in this study for various purposes: (1) impervious roof albedo data requested from Berkeley Lab; (2) pervious surface albedo data derived from Sentinel-2 reflectance products; (3) National Agriculture Imagery Program (NAIP) imagery; (4) land cover type data downloaded from the University of Vermont Spatial Analysis Laboratory (SAL); (5) building footprint data from Microsoft products, offering detailed building outlines to enhance surface classification accuracy. The details of each dataset are as follows.

National Agriculture Imagery Program (NAIP) imagery

As shown in Fig. 3, the high-resolution aerial imagery for 34 major U.S. cities was obtained from the National Agriculture Imagery Program (NAIP), which is administered by the United States Department of Agriculture’s Farm Service Agency, via the USGS EarthExplorer data portal (https://earthexplorer.usgs.gov/). Captured during the agricultural growing seasons (late spring to early summer) in the continental United States, NAIP imagery is instrumental for many applications, including agricultural monitoring63, land cover classification64, urban heat analysis65, and environmental studies66. NAIP imagery typically offers a resolution of 0.3 to 1 meter, with images captured in natural color (RGB) and near-infrared channels, depending on the state. This study utilized the latest cloud-free NAIP imagery from the period 2019 to 2023 to ensure the data’s relevance and timeliness. This imagery serves as the foundational input for albedo prediction models, providing the essential visual context and spectral information required for accurate assessments. Detailed information on the NAIP imagery for each city is summarized in Table 1; the impervious area fraction is calculated from the surface classification results.

Fig. 3

NAIP maps of 34 major U.S. cities.

Table 1 Basic information of NAIP imagery on 34 major U.S. cities.

Impervious roof albedo data

To predict the sub-meter resolution ISA, this study employed the roof albedo data (https://heatisland.lbl.gov/projects/projects-california-roof-albedo/) from the Berkeley Lab Heat Island Group67,68. These foundational studies addressed the critical need for city-scale roof albedo data, which had previously been limited by the unavailability of reflectance data with adequate spatial coverage, resolution, and spectral detail. They combined high-resolution (1-m) imagery with building footprint outlines and developed an empirical model to correlate laboratory-measured broadband solar reflectance with the four-band reflectance of the imagery. This model demonstrated a high predictive accuracy for solar reflectance, with a low root-mean-square error (RMSE) of residuals (0.016). Further, the derived albedos were calibrated to field measurements for six cities in 2009, with errors varying across cities, ranging from 0.00–0.01 at low albedo to 0.06–0.14 at high albedo. As shown in Fig. 4, the roof albedo data from Los Angeles (LA), Long Beach (LB), San Francisco (SF), and San Jose (SJ) were selected as ISA training data.

Fig. 4

The spatial distribution of impervious roof albedo in 4 U.S. cities.

Sentinel-2 surface reflectance product

To estimate high-quality PSA within the study areas, this study used the Sentinel-2 Level-2A surface reflectance product (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR) over the same time range as the NAIP data. Sentinel-2 is a twin-satellite mission (Sentinel-2A and 2B) designed for land surface monitoring, equipped with a Multi-Spectral Instrument (MSI) that captures data across 13 spectral bands ranging from the visible (VIS), near-infrared (NIR), to shortwave infrared (SWIR) regions, with spatial resolutions of 10-m, 20-m, and 60-m depending on the band69.

This study focused on the four 10-m resolution bands (blue, green, red, and NIR) available in the Level-2A products, which provide atmospherically corrected bottom-of-atmosphere (BOA) reflectance. These bands are particularly sensitive to vegetation and surface material properties and are commonly used in albedo estimation. The surface reflectance data were accessed via Google Earth Engine (GEE)70, offering consistent and cloud-filtered coverage over the study region. These high-resolution spectral observations served as essential inputs for albedo estimation using established reflectance-to-albedo mapping techniques.

Land cover data

In this study, the Los Angeles County Meter Urban Land Cover (MULC) dataset of 2016 (https://catalog.data.gov/dataset/enviroatlas-los-angeles-county-ca-meter-scale-urban-land-cover-mulc-data-20163) was employed as a training set for an urban impervious surface classification deep learning model. The MULC dataset is derived from sub-meter resolution imagery and provides detailed urban land cover classification across seven categories: water, impervious surfaces, bare earth, trees, grass, herbaceous, and wetlands. This study focused on the impervious land cover category, indicative of areas resistant to water infiltration such as roads, buildings, and other urban structures, and treated the other categories as pervious surfaces. Preparing the training data involved overlaying and precisely clipping the land cover data against the corresponding high-resolution NAIP imagery to ensure accurate spatial alignment. This process facilitates the targeted albedo prediction for different types of surfaces at a sub-meter resolution.

Building footprints masking

To improve the accuracy of impervious surface extraction within urban environments, the 2020 U.S. Building Footprints dataset by Microsoft (https://github.com/microsoft/USBuildingFootprints) was also incorporated. This dataset comprises detailed building polygons for the 34 cities, which were applied as masks over the impervious surfaces predicted by the deep learning model. By assigning all areas covered by building footprints as impervious, this study significantly refined the delineation of urban impervious areas. This strategy ensures a more thorough and precise mapping of impervious surfaces.

Pervious surface albedo estimation

To estimate PSA, this study adopted the Direct Estimation Approach proposed by Lin et al.71, leveraging Sentinel-2 surface reflectance data within the GEE platform. The method utilizes a Look-Up Table (LUT) framework that maps narrowband Sentinel-2 reflectance to broadband shortwave albedo. Specifically, the LUTs are constructed using ESA WorldCover land cover classes72 in combination with MODIS BRDF/albedo products (https://modis.gsfc.nasa.gov/data/dataprod/mod43.php), allowing us to generate albedo values tailored to each land cover type.
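The look-up-table idea described above can be sketched as a per-class linear mapping from the four 10-m bands to broadband albedo. This is a minimal illustration under stated assumptions, not the procedure of Lin et al.: the linear form, the function name `narrow_to_broadband`, and every coefficient in `LUT` are placeholders chosen only to make the example runnable. The class codes 10 and 30 follow the ESA WorldCover legend (tree cover and grassland).

```python
import numpy as np

# Hypothetical look-up table: one row of linear coefficients per land cover
# class, ordered as (intercept, blue, green, red, nir). The numbers are
# placeholders for illustration only, not published values.
LUT = {
    10: np.array([0.01, 0.10, 0.15, 0.20, 0.40]),  # WorldCover tree cover
    30: np.array([0.02, 0.12, 0.18, 0.22, 0.35]),  # WorldCover grassland
}


def narrow_to_broadband(reflectance, land_cover, lut):
    """Map (4, H, W) band reflectance to (H, W) broadband albedo per class."""
    albedo = np.full(land_cover.shape, np.nan, dtype=np.float64)
    for class_code, coef in lut.items():
        selected = land_cover == class_code
        # Intercept plus the weighted sum over the four bands.
        weighted = coef[0] + np.tensordot(coef[1:], reflectance, axes=(0, 0))
        albedo[selected] = weighted[selected]
    # Pixels whose class is missing from the table stay NaN.
    return np.clip(albedo, 0.0, 1.0)
```

Pixels whose land cover class has no entry in the table are left as NaN here, so gaps remain visible instead of being silently filled.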

The resulting PSA represents white-sky albedo, which corresponds to the fraction of solar radiation reflected by the surface under fully diffuse (isotropic, non-directional) illumination. This is an important distinction, as white-sky albedo better captures background reflectivity and is particularly relevant in radiation-based microclimate models such as SOLWEIG. The resulting PSA was used as training data for developing the PSA prediction model. This albedo information provides critical input for modeling urban radiative environments, particularly in vegetated areas.

U-Net network architecture

The classic U-Net model, known for its proficiency in image classification and pixel-wise regression tasks, was employed to tackle three distinct tasks: classifying impervious surfaces, predicting impervious surface albedo, and predicting pervious surface albedo. U-Net56 is recognized for its “U-shaped” design, featuring a contracting path to grasp context and a symmetric expanding path that ensures precise localization, making it ideal for dealing with high-resolution imagery like the NAIP dataset73,74. The specific model configuration is illustrated in Fig. 5. For the classification task, this study utilized a categorical output layer in the network, tailored to classify the various surface types within the cities. For the regression tasks, the output layer was modified to predict continuous albedo values.

Fig. 5

U-Net framework for impervious surface classification and albedo prediction.

Preparation for impervious surfaces classification

For impervious surface classification, the U-Net model was trained to categorize each pixel, effectively distinguishing between impervious and pervious areas. The LA MULC dataset was first reclassified by combining its multiple land cover categories into two target categories. The pixel labels were manually assigned as 0) null/background, 1) impervious, or 2) pervious. Corresponding LA NAIP images from 2016 were then obtained to maintain temporal consistency, and the extent and resolution of the two datasets were aligned with each other. Subsequently, both the NAIP imagery and the reclassified surface raster were divided into image patches of 512 × 512 pixels, utilizing the Rasterio library in Python to facilitate this process. Because the dimensions of the cropped dataset did not always conform to exact multiples of 512, this study removed the image patches deviating from 512 × 512 pixels. This approach resulted in a collection of 10,465 image patches of 4-band RGB-NIR imagery and their associated single-band surface label patches, including the impervious, pervious, and null value pixels.
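The reclassification and tiling steps above can be sketched with plain NumPy arrays. This is a hedged illustration rather than the authors' code: the study used Rasterio for the raster handling, while the sketch assumes the rasters are already loaded and aligned in memory. The function names and the `impervious_code` and `nodata_code` arguments are hypothetical, since the numeric class codes of the MULC raster are not stated in the text.

```python
import numpy as np

PATCH = 512  # patch edge length in pixels, as described in the text


def reclassify(land_cover, impervious_code, nodata_code):
    """Collapse a multi-class raster to 0 = null, 1 = impervious, 2 = pervious."""
    labels = np.full(land_cover.shape, 2, dtype=np.uint8)  # default: pervious
    labels[land_cover == impervious_code] = 1
    labels[land_cover == nodata_code] = 0
    return labels


def tile(array, patch=PATCH):
    """Split a (..., H, W) array into full patch x patch tiles.

    Tiles that would extend past the raster edge are dropped, mirroring the
    removal of patches that deviate from the target size.
    """
    height, width = array.shape[-2:]
    tiles = []
    for row in range(0, height - patch + 1, patch):
        for col in range(0, width - patch + 1, patch):
            tiles.append(array[..., row:row + patch, col:col + patch])
    return tiles
```

Applying `tile` with the same `patch` value to the 4-band image and to the label raster yields patch lists in the same order, which keeps image and label pairs aligned.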

Preparation for surface albedo prediction

To predict the ISA, the impervious roof albedo polygons were first used to clip the NAIP imagery, thereby generating NAIP roof data. This NAIP roof data was then divided into image patches of 512 × 512 pixels. For each patch, a corresponding initial albedo image patch of the same size (512 × 512 pixels) was created, with all pixel values initially set to zero. This preparatory step provided a blank canvas for accurately overlaying roof albedo data. During processing, roof albedo polygons were overlaid onto these zero-valued albedo patches. Whenever a pixel within the albedo patch intersected with a roof polygon, the specific albedo value of that roof was assigned to the corresponding pixel. This method ensured precise alignment between the NAIP roof patches and the roof albedo data, effectively capturing the albedo characteristics of impervious roof surfaces. Following this systematic approach, a total of 7,181 image patches were produced across four cities. Among them, 80% (5,743) of the samples were allocated for training and 20% (1,438) for validation.

For the PSA prediction, a multi-step processing pipeline was implemented using NAIP imagery, which was initially divided into 512 × 512 pixel patches. These image patches were first input into a previously developed impervious surfaces classification model to identify impervious areas. The resulting impervious masks were then combined with building footprint data to generate a refined single-band impervious dataset, where pixels with a value of “1” denoted impervious surfaces and “0” represented pervious areas. To isolate pervious regions, the dataset was inverted such that pervious areas were assigned a value of “1” and impervious areas a value of “0”. This pervious mask was then applied element-wise to the corresponding NAIP patches, producing isolated pervious NAIP image patches. For each pervious patch, its geographical extent was used to clip the corresponding 10-meter resolution Sentinel-2 albedo data, ensuring that only the albedo values within the relevant spatial boundaries were retained. The clipped Sentinel-2 data were subsequently resampled to match the 0.6-meter resolution of the NAIP imagery using bilinear interpolation. Finally, the resampled albedo values were multiplied by the pervious mask to generate accurate pervious albedo datasets, ensuring spatial alignment in both extent and resolution. Philadelphia was selected as the study site for training the PSA model due to its diverse range of pervious surface types, including rivers, parks, and bare earth, features that offer an ideal testbed for model development. The final dataset comprised 4,164 image patches, each with dimensions of 512 × 512 pixels.
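The masking and resampling chain described above can be illustrated as follows. This is a minimal sketch under assumptions: all inputs are taken to be NumPy arrays that already cover the same geographic extent, the function names are hypothetical, and the bilinear interpolation is a simple pixel-centre implementation standing in for whatever GIS resampling routine was actually used.

```python
import numpy as np


def bilinear_resize(coarse, out_shape):
    """Bilinearly resample a 2-D array to out_shape (pixel-centre alignment)."""
    in_h, in_w = coarse.shape
    out_h, out_w = out_shape
    # Fractional source coordinates of each output pixel centre.
    rows = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    cols = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, in_h - 1)
    c1 = np.minimum(c0 + 1, in_w - 1)
    wr = (rows - r0)[:, None]  # vertical interpolation weights
    wc = (cols - c0)[None, :]  # horizontal interpolation weights
    top = coarse[r0][:, c0] * (1 - wc) + coarse[r0][:, c1] * wc
    bottom = coarse[r1][:, c0] * (1 - wc) + coarse[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr


def pervious_training_pair(naip, predicted_impervious, building_mask, coarse_albedo):
    """Return (masked NAIP patch, masked fine-resolution albedo target)."""
    # Refine the predicted impervious mask with the building footprints.
    impervious = np.logical_or(predicted_impervious, building_mask)
    # Invert it so that pervious pixels are True.
    pervious = np.logical_not(impervious)
    # Zero out impervious pixels in every band: (bands, H, W) * (H, W).
    masked_naip = naip * pervious
    # Bring the coarse albedo onto the NAIP grid, then apply the same mask.
    fine_albedo = bilinear_resize(coarse_albedo, pervious.shape)
    return masked_naip, fine_albedo * pervious
```

Both returned arrays share the NAIP grid, so each masked image patch is paired pixel by pixel with its albedo target.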

Before being input into the U-Net model, the image patches were augmented through random vertical and horizontal flipping. To mitigate the shadow effects occurring at the edges of patches (artifacts often caused by abrupt transitions during image tiling75), a buffer zone of 256 pixels was added on each side of every patch, resulting in a dataset composed of 1024 × 1024 pixel patches. This buffer-based approach was uniformly applied to all previously prepared datasets. The buffer zones were subsequently cropped during the mosaicking process to generate the final surface albedo maps for each city56,73.
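A small sketch may help make the flipping and buffering concrete. It assumes that the buffer is filled with neighbouring pixels from the source raster and with zeros beyond the raster edge; the text does not state how the buffer was filled, so that choice, like the function names, is an assumption made for illustration.

```python
import numpy as np

PATCH, BUFFER = 512, 256  # 512 + 2 * 256 = 1024 pixels per side


def random_flip(image, target, rng):
    """Apply the same random horizontal and vertical flips to image and target."""
    if rng.random() < 0.5:  # horizontal flip (reverse columns)
        image, target = image[..., :, ::-1], target[..., :, ::-1]
    if rng.random() < 0.5:  # vertical flip (reverse rows)
        image, target = image[..., ::-1, :], target[..., ::-1, :]
    return image, target


def buffered_window(raster, row, col, patch=PATCH, buffer=BUFFER):
    """Cut the patch with top-left corner (row, col) plus a buffer on each side."""
    # Zero-pad only the two spatial axes so edge patches also get a full buffer.
    pad_width = [(0, 0)] * (raster.ndim - 2) + [(buffer, buffer)] * 2
    padded = np.pad(raster, pad_width)
    size = patch + 2 * buffer
    return padded[..., row:row + size, col:col + size]


def crop_buffer(window, buffer=BUFFER):
    """Remove the buffer again, keeping only the central patch."""
    return window[..., buffer:-buffer, buffer:-buffer]
```

Flipping the image and its target together keeps every label pixel aligned with its input pixel, and `crop_buffer` recovers exactly the central 512 × 512 area before mosaicking.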

Model training and validation

Comprehensive datasets were thus prepared for three distinct tasks: classification of impervious surfaces, prediction of ISA, and prediction of PSA. All the datasets were split into training and validation sets in an 8:2 ratio. Each model was trained over 100 epochs, using a learning rate of 0.0001 and a batch size of 4. These training parameters were tuned through preliminary hyperparameter optimization processes76, which aimed at achieving an optimal balance between computational efficiency and the efficacy of the learning process. For optimization, the Adam (Adaptive Moment Estimation) optimizer was selected due to its effectiveness in managing sparse gradients and its flexibility across various training scenarios.

The classification task’s objective was to minimize the cross-entropy loss, which is specifically designed for classification problems. The model with the best validation loss (i.e., validation loss of 0.22060 and an overall accuracy of 0.91987) was kept for prediction. On the other hand, the regression tasks aimed at predicting continuous albedo values, where the mean squared error (MSE) loss function was employed to guide the model towards minimizing the difference between predicted and actual albedo values. The ISA model with the best validation loss of 0.00041 was selected; for the PSA model, the best MSE was 0.00024. All these models were implemented using Python and PyTorch 2.1, and trained on an Ubuntu Linux workstation with 128 GB of memory, an Intel(R) Xeon(R) Gold 5320 CPU, and four NVIDIA A40 48 GB GPUs.

To validate the performance of the classification model, this study relied on several metrics (Table 2), including overall accuracy (OA), Precision, Recall, and F1-score. These metrics collectively provided a comprehensive evaluation of the model’s ability to accurately classify each pixel. For the regression model, the validation metrics included the coefficient of determination (R²), mean absolute error (MAE), and MSE, which offered insights into the model’s accuracy in predicting continuous albedo values. This rigorous approach to training and validation ensures that the models are both efficient and effective in addressing the unique challenges presented by each task.

Table 2 Assessment metrics for classification and regression performance.
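For reference, the metrics named above can be computed as in the following sketch, which uses their standard textbook definitions. The function names are illustrative, and the sketch does not guard against empty classes or constant targets, which would cause a division by zero.

```python
import numpy as np


def classification_metrics(y_true, y_pred, positive=1):
    """Overall accuracy, precision, recall and F1 for one positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "OA": float(np.mean(y_true == y_pred)),
        "Precision": float(precision),
        "Recall": float(recall),
        "F1": float(2 * precision * recall / (precision + recall)),
    }


def regression_metrics(y_true, y_pred):
    """Coefficient of determination, mean absolute error, mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": float(1 - ss_res / ss_tot),
        "MAE": float(np.mean(np.abs(residual))),
        "MSE": float(np.mean(residual ** 2)),
    }
```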

Prediction

For prediction, the NAIP imagery for each city was segmented into image patches of 1024 × 1024 pixels. To accommodate patches that did not conform to these dimensions due to variability in NAIP imagery sizes across cities, additional rows and columns filled with zero values were appended. This preprocessing step ensured that all patches adhered to the uniform input size required by the predictive models.
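The zero-padding step can be sketched as below. The text does not say on which sides the extra rows and columns were appended, so padding the bottom and right edges is an assumption here, and the function name is hypothetical.

```python
import numpy as np


def pad_to_multiple(raster, size=1024):
    """Zero-pad the last two axes of a (..., H, W) raster up to a multiple of size."""
    height, width = raster.shape[-2:]
    pad_h = (-height) % size  # rows needed to reach the next multiple
    pad_w = (-width) % size   # columns needed to reach the next multiple
    pad_width = [(0, 0)] * (raster.ndim - 2) + [(0, pad_h), (0, pad_w)]
    return np.pad(raster, pad_width, mode="constant", constant_values=0)
```

After padding, the raster divides evenly into 1024 × 1024 patches, and the padded rows and columns can be cropped from the final mosaic using the original height and width.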

Initially, the impervious surface classification model was applied to identify impervious pixels within each image patch. To further improve the accuracy of the impervious surface mask, Microsoft building roof polygons were overlaid onto the classification results.

Following the generation of the comprehensive impervious mask, each NAIP image patch was clipped to extract impervious and pervious regions. These corresponding patches were then input into the ISA and PSA prediction models, respectively. Through this integrated prediction pipeline, comprising both classification and regression models, high-resolution ISA and PSA maps were produced for each city. The complete prediction process across all cities requires approximately one week of computation time.

Integrating impervious and pervious surface albedo

In the final step, after predicting the ISA and PSA for each 512 × 512 image patch, this study integrated these albedo predictions to complete the high-resolution mapping of albedo. Each corresponding pair of ISA and PSA patches was merged. These integrated patches were then assembled through a mosaicking process to develop a city-wide albedo map.
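One way to picture the merge and mosaic steps is the following sketch. It assumes that each ISA and PSA patch comes with the impervious mask used to split it and that patches are stored in row-major order; the function names are illustrative, and a production workflow would more likely write georeferenced tiles and mosaic them with a GIS library.

```python
import numpy as np


def merge_albedo(isa, psa, impervious_mask):
    """Take ISA where the mask marks impervious pixels and PSA elsewhere."""
    return np.where(impervious_mask, isa, psa)


def mosaic(patches, n_rows, n_cols):
    """Assemble a row-major list of equally sized (H, W) patches into one array."""
    rows = [
        np.hstack(patches[r * n_cols:(r + 1) * n_cols])
        for r in range(n_rows)
    ]
    return np.vstack(rows)
```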

This methodological approach allowed us to produce a detailed albedo map at sub-meter resolution encompassing both impervious and pervious surfaces across the entire urban landscape. By accurately representing the varied albedo across different urban surface types, this refined dataset provides an exceptional level of detail and precision. The comprehensive urban albedo map thus generated is invaluable for applications in microclimate modeling, environmental monitoring, and urban planning, offering insights into the radiative properties of urban environments and aiding in the development of strategies for sustainable urban management.

Data Records

The sub-meter resolution albedo dataset for 34 major U.S. cities is made publicly available through Figshare77. The albedo data files are provided in GeoTIFF format. Each file contains the albedo map of one city in the Pseudo-Mercator projection. Details concerning the data size and specific attributes of each city’s dataset are summarized in Table 1. The urban surface albedo maps for all 34 cities are shown in Fig. 6. Additionally, this study has developed a visualization website to facilitate an intuitive examination of this data, accessible at https://shengaoyi.github.io/#/Albedopedia.

Fig. 6

Urban surface albedo maps of 34 major U.S. cities.

Technical Validation

UIS model performance

The performance metrics presented in Table 3 for the U-Net-based UIS classification model include the benefit of integrating the building mask data. This method achieved an F1 Score of 0.8844, highlighting a balance between precision and recall that is critical for the accurate classification of urban surfaces. The Overall Accuracy (OA) reached 0.9208, signifying that more than 92% of the classifications were correct, an indication of the model’s effectiveness across diverse urban settings. The model’s Precision of 0.9222 demonstrates its ability to correctly identify impervious surfaces, enhanced significantly by the building masks, which ensure that areas such as buildings are consistently recognized as impervious. This approach helps in reducing false positives, where pervious areas might otherwise be misclassified. The Recall of 0.8495 also underscores the model’s capability in capturing most of the actual impervious surfaces present in the imagery, aided by the definitive outlines provided by the building masks. Collectively, these metrics underscore the reliability and accuracy of the integrated method in urban impervious surface classification.

Table 3 Performance metrics for UIS classification model.
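As a quick consistency check, the reported F1 score can be reproduced from the reported precision and recall using the standard harmonic-mean definition. The variable names below are illustrative.

```python
# Precision and recall as reported for the UIS classification model.
precision, recall = 0.9222, 0.8495

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # prints 0.8844, matching the reported F1 score
```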

ISA model performance

Table 4 and Fig. 7 present the performance of the ISA prediction models across four cities, comparing three different architectures: FCN, CNN, and U-Net. The results are broken down by city and include R², MAE, and MSE. On average, the U-Net model outperforms the other architectures with an R² value of 0.9028, suggesting an excellent fit to the observed data. It also exhibits the lowest average MAE and MSE (0.0057 and 0.0002, respectively), indicating its superior accuracy and consistency in predicting ISA.

Table 4 ISA model performance for the selected four cities.
Fig. 7

Performance of impervious surface albedo prediction model over the selected four cities.

PSA model performance

Table 5 and Fig. 8 evaluate the performance of different models used for predicting PSA, specifically comparing the CNN, FCN, and U-Net architectures. The U-Net model again shows the best performance with the highest R² value of 0.9288, which indicates a strong predictive capability. It also has the lowest MAE (0.0069) and MSE (0.0002), confirming its precision and accuracy in estimating pervious surface albedo. The high R² values across all models demonstrate their effectiveness, but U-Net’s lower error rates highlight its greater reliability for detailed albedo prediction tasks.

Table 5 PSA model performance.
Fig. 8

Performance of pervious surface albedo prediction model.

These evaluations collectively demonstrate the advanced capabilities of deep learning architectures, especially U-Net, in handling complex tasks such as albedo prediction and impervious surface classification within urban environments. The high accuracy and precision of these models offer promising prospects for applications in urban planning and microclimate modeling. Figure 9 visually illustrates these capabilities by showcasing detailed mappings of both pervious and impervious surfaces and the urban surface albedo within a city, confirming the models’ effectiveness in accurately predicting diverse urban surface albedo.

Fig. 9

Illustrative example of urban albedo mapping process and its precise results.

Usage Notes

Figure 10 illustrates the distribution of surface albedo across 34 major U.S. cities, showcasing both the mean values and the variability (standard deviation) of albedo measurements within each city.

Fig. 10

The histograms of urban surface albedo in 34 major U.S. cities.

Statistics of the data

The albedo values vary significantly across the cities, with mean values ranging from as low as 0.099 in Seattle to as high as 0.199 in Las Vegas, indicating a broad spectrum of urban surface reflectivity. Standard deviations are generally small, underscoring a consistent albedo measurement within individual cities but highlighting a substantial difference between different urban environments. Cities like Miami and Las Vegas, known for their sunnier and drier climates, show higher mean albedo, suggesting urban surfaces that reflect more solar radiation. Conversely, cities with cooler and cloudier climates such as Seattle and San Francisco exhibit lower mean albedo. These statistics underscore the diverse thermal characteristics of these cities, which are critical for understanding urban heat management and planning for climate adaptation strategies.

Limitations

Despite the robust performance of the models and the comprehensive methodology applied in this study, several limitations must be acknowledged. One significant challenge is the accurate representation of urban canyon effects, particularly in densely built environments like New York City, where building shadows may lead to underestimation of albedo15. Additionally, the exclusion of snow from the analysis could introduce seasonal biases, as snow significantly alters surface reflectivity. The dataset was derived from summer and non-snow-covered scenes, which may limit its applicability for year-round modeling in certain climate zones. Moreover, while the model generalizes well across 34 cities, differences in urban morphology and local environmental characteristics may still influence model transferability in regions with significantly different landscape structures or spectral characteristics. These limitations should be taken into account when applying this dataset in new contexts.