A sub-meter resolution urban surface albedo dataset for 34 U.S. cities based on deep learning

Yi, Shengao; Li, Xiaojiang; Liu, Yixuan; Dong, Xinyu; Tu, Wei

doi:10.1038/s41597-025-05109-2


Data Descriptor
Open access
Published: 14 May 2025


Scientific Data volume 12, Article number: 789 (2025)


Abstract

Surface albedo is a key determinant of urban heat islands: it modulates the amount of solar energy absorbed or reflected by urban surfaces, influencing microclimate and thermal comfort. However, high-resolution albedo data are usually unavailable, which makes it difficult to understand the urban thermal environment at the hyperlocal scale. This study presents the first high-resolution urban albedo maps for 34 major U.S. cities using advanced deep learning models and multisource remote sensing data. By differentiating between impervious and pervious surfaces using a combination of NAIP imagery, roof albedo data, building footprints, land cover classifications, and Sentinel-2 imagery, this work achieves sub-meter resolution in albedo mapping. U-Net models were employed for impervious surface classification and for impervious (ISA) and pervious surface albedo (PSA) prediction. The models were validated in selected cities, with ISA showing an R² of 0.9028 and MAE of 0.0057, and PSA demonstrating an R² of 0.9538 and MAE of 0.0027, highlighting their precision and reliability. The datasets, made publicly available, offer essential insights for urban planning and environmental monitoring.


Background & Summary

Surface albedo is a key parameter that quantifies the fraction of incoming solar radiation reflected by the Earth's surface, ranging from 0 (no reflection) to 1 (total reflection)^1,2. It plays an important role in understanding how different land surfaces interact with solar energy^3,4, and significantly influences the urban heat island (UHI) effect by determining how much solar energy is absorbed or reflected by urban surfaces^5,6,7. High albedo surfaces, such as light-colored roofs or pavements, can reflect more solar radiation, thereby reducing surface temperatures^8,9 and mitigating the UHI effect^10,11. However, while high albedo lowers land surface temperature, it may also increase mean radiant temperature (T_mrt)^12,13, leading to higher perceived heat exposure and worsening outdoor thermal comfort^14,15,16. Excessive urban heat can severely affect public health, increasing the risk of heat-related illnesses and mortality^17,18, especially among vulnerable populations, such as older adults^19,20, economically disadvantaged groups²¹, marginalized communities²², and outdoor workers^23,24. Beyond health impacts, extreme heat also disrupts transportation systems²⁵, increases infrastructure vulnerability²⁶, reduces economic productivity²⁷, and deepens socioeconomic disparity^28,29.

Accurate mapping of urban surface albedo is therefore essential for effective microclimate modeling, enabling reliable predictions of thermal responses under various urban planning scenarios^2,13,30,31. Additionally, understanding and manipulating surface albedo can lead to significant reductions in energy consumption for cooling buildings^32,33, thereby contributing to energy efficiency efforts^34,35. As cities continue to expand and climate change accelerates, the role of urban albedo in climate and radiation models becomes increasingly important^36,37, such as the solar and longwave environmental irradiance geometry (SOLWEIG) model³⁸. Precise estimation of albedo is thus not only vital for climate adaptation at the micro-scale but also contributes to broader sustainability and livability goals.

Various retrieval algorithms have been developed to estimate albedo from satellite data^{39,40,41,42,43,44}. These methods fall mainly into two groups, physically-based and semi-empirical algorithms, each with distinct advantages and inherent limitations⁴³. Physically-based algorithms often rely on detailed radiative transfer models and comprehensive environmental inputs, and are widely used with coarse-resolution data, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) data⁴⁵. They can be constrained by the availability of high-resolution, accurate input data, making them less practical for real-time or large-scale urban applications. Semi-empirical algorithms, on the other hand, combine empirical data with simplified theoretical models. Lucht et al.⁴⁶ estimated the surface albedo using a semi-empirical linear kernel-driven model. Together with that paper, a series of studies^47,48,49,50 have laid the foundation for estimating surface albedo through bidirectional reflectance distribution function (BRDF) modeling. However, these methods can suffer from issues of accuracy and generalizability. They often rely on assumptions and empirical relationships that may not be consistent across different urban environments or varying atmospheric conditions. Previous studies have shown that without correction for the effect of surface BRDF, errors in fine-resolution data can be as high as 60%⁵¹. Therefore, many studies have focused on correcting the BRDF effects without BRDF model inversion and calculating surface albedo from Landsat data^52,53. Although the results over snow-free surfaces are promising, both types of algorithms struggle with coarse resolution and coverage limitations. They may not be able to consistently map albedo across all urban areas, particularly those with complex geometries or mixed material compositions.
This limitation highlights the critical need for high-resolution albedo data in urban environments, where initiatives like the cool roof movement and white pavement strategies are increasingly implemented to combat urban heat islands^54,55. Such interventions require precise albedo measurements to effectively assess their impact on urban climates, particularly in densely built areas where the heterogeneity of surfaces can significantly influence the local thermal dynamics.

Recent advances in remote sensing and deep learning have significantly enhanced the capability to accurately map fine-scale surface albedo, especially as spatial resolutions improve from a few tens of meters to sub-meter. The emergence of high-resolution satellite and aerial imagery, such as Sentinel-2 (10-m) and the National Agriculture Imagery Program (NAIP), which provides sub-meter resolution, has opened new directions for environmental monitoring. In addition, the application of deep learning models, particularly architectures like U-Net⁵⁶, has revolutionized the analysis of these high-resolution images. These models are proficient in handling complex spatial hierarchies and are particularly good at segmenting high-resolution data, making them invaluable for tasks that require precise delineation of surface types to accurately estimate albedo.

Current surface albedo datasets predominantly offer 30-m or 10-m resolutions⁵⁷, which do not meet the requirements for micro-scale modeling essential in detailed urban climate studies. There is no sub-meter albedo dataset to our knowledge. To fill this gap, this study has developed a dataset predicting sub-meter resolution albedo for both impervious and pervious surfaces using deep learning methods applied to NAIP imagery across 34 major U.S. cities. This integrated dataset is expected to significantly benefit micro-scale urban environmental monitoring and climate-related decision-making processes. The primary purpose of this study is to provide a high-resolution and spatially continuous albedo product that can support fine-scale urban microclimate modeling, especially for human heat exposure estimation, urban heat island mitigation, and sustainable urban design.

Methods

The overall workflow in Fig. 1 integrates various datasets and processing steps to accurately predict impervious and pervious surface albedo. Initially, high-resolution NAIP imagery, roof albedo data, building footprints, land cover classifications, and Sentinel-2 images were collected. These inputs go through several preprocessing steps: NAIP imagery is clipped to isolate roof areas for impervious albedo assessment; building footprints are used to mask and refine impervious surface delineations; land cover data is reclassified to distinguish between impervious and pervious surfaces; and pervious albedo is directly estimated from Sentinel-2 data. These preprocessed datasets are then divided into training and validation sets to develop three primary models: a U-Net model for urban impervious classification (UIS), an impervious surface albedo (ISA) model, and a pervious surface albedo (PSA) model. Each model’s performance is rigorously evaluated using appropriate metrics such as accuracy, R², MAE, and MSE, ensuring the models’ effectiveness in urban albedo mapping. This structured approach allows us to merge and mosaic the model outputs to create comprehensive and high-resolution urban albedo maps.

Study area and data sources

Remote sensing provides a critical method of obtaining high-resolution spatial data necessary for urban thermal environment modeling, and has been widely used in related studies^{58,59,60,61,62}. This study utilized multiple remote sensing datasets from 34 major cities across the contiguous United States, covering a wide range of urban environments. As shown in Fig. 2, these cities are home to a substantial fraction of the U.S. population and showcase a wide range of built environments across the four principal geographic regions of the country: West, Midwest, South, and Northeast. The city boundaries were obtained from OpenStreetMap, offering up-to-date and detailed urban outlines.

Five types of dataset are employed in this study for various purposes: (1) impervious roof albedo data requested from Berkeley Lab; (2) pervious surface albedo data derived from Sentinel-2 reflectance products; (3) National Agriculture Imagery Program data; (4) land cover type data downloaded from the University of Vermont Spatial Analysis Laboratory (SAL); (5) building footprint data from Microsoft products, offering detailed building outlines to enhance surface classification accuracy. The details of each dataset are as follows.

National Agriculture Imagery Program (NAIP) imagery

As shown in Fig. 3, the high-resolution aerial imagery for 34 major U.S. cities was obtained from the National Agriculture Imagery Program (NAIP), which is administered by the United States Department of Agriculture’s Farm Service Agency, via the USGS Earth Explorer data portal (https://earthexplorer.usgs.gov/). Captured during the agricultural growing seasons (late spring to early summer) in the continental United States, NAIP imagery is used in many applications, including agricultural monitoring⁶³, land cover classification⁶⁴, urban heat analysis⁶⁵, and environmental studies⁶⁶. NAIP imagery typically offers a resolution of 0.3 to 1 meter, with images captured in both natural color (RGB) and near-infrared channels, depending on the state. This study utilized the latest cloud-free NAIP imagery from the period 2019 to 2023 to ensure the data’s relevance and timeliness. This imagery serves as the foundational input for albedo prediction models, providing the essential visual context and spectral information required for accurate assessments. Detailed information on the NAIP imagery for each city is summarized in Table 1; the impervious area fraction is calculated from the surface classification results.

Table 1 Basic information of NAIP imagery on 34 major U.S. cities.


Impervious roof albedo data

To predict the sub-meter resolution ISA, this study employed the roof albedo data (https://heatisland.lbl.gov/projects/projects-california-roof-albedo/) from the Berkeley Lab Heat Island Group^67,68. These foundational studies addressed the critical need for city-scale roof albedo data, which had previously been limited by the unavailability of reflectance data with adequate spatial coverage, resolution, and spectral detail. They combined high-resolution (1-m) imagery with building footprint outlines and developed an empirical model to correlate laboratory-measured broadband solar reflectance with the four-band reflectance of the imagery. This model demonstrated a high predictive accuracy for solar reflectance, with a low root-mean-square error (RMSE) of residuals (0.016). Further, the derived albedos were calibrated to field measurements for six cities in 2009, noting varying degrees of error across different cities, ranging from 0.00–0.01 at low albedo to 0.06–0.14 at high albedo. As shown in Fig. 4, the roof albedo data from Los Angeles (LA), Long Beach (LB), San Francisco (SF), and San Jose (SJ) were selected as ISA training data.

Sentinel-2 surface reflectance product

To estimate high-quality PSA within the study areas, this study used the Sentinel-2 Level-2A surface reflectance product (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR) covering the same time range as the NAIP data. Sentinel-2 is a twin-satellite mission (Sentinel-2A and 2B) designed for land surface monitoring, equipped with a Multi-Spectral Instrument (MSI) that captures data across 13 spectral bands ranging from the visible (VIS), near-infrared (NIR), to shortwave infrared (SWIR) regions, with spatial resolutions of 10-m, 20-m, and 60-m depending on the band⁶⁹.

This study focused on the four 10-m resolution bands (blue, green, red, and NIR) available in the Level-2A products, which provide atmospherically corrected bottom-of-atmosphere (BOA) reflectance. These bands are particularly sensitive to vegetation and surface material properties and are commonly used in albedo estimation. The surface reflectance data were accessed via Google Earth Engine (GEE)⁷⁰, offering consistent and cloud-filtered coverage over the study region. These high-resolution spectral observations served as essential inputs for albedo estimation using established reflectance-to-albedo mapping techniques.

Land cover data

In this study, the 2016 Los Angeles County Meter-Scale Urban Land Cover (MULC) dataset (https://catalog.data.gov/dataset/enviroatlas-los-angeles-county-ca-meter-scale-urban-land-cover-mulc-data-20163) was employed as the training set for an urban impervious surface classification deep learning model. The MULC dataset is derived from sub-meter resolution imagery and provides detailed urban land cover classification across seven categories: water, impervious surfaces, bare earth, trees, grass, herbaceous, and wetlands. This study focused on the impervious land cover category, indicative of areas resistant to water infiltration such as roads, buildings, and other urban structures, and treated the other categories as pervious surfaces. Preparing the training data involved overlaying and precisely clipping the land cover data against the corresponding high-resolution NAIP imagery to ensure accurate spatial alignment. This process facilitates the targeted albedo prediction for different types of surfaces at a sub-meter resolution.

Building footprints masking

To improve the accuracy of impervious surface extraction within urban environments, the 2020 U.S. Building Footprints dataset by Microsoft (https://github.com/microsoft/USBuildingFootprints) was also incorporated. This dataset comprises detailed building polygons for the 34 cities, which were applied as masks over the deep learning predicted impervious surfaces. By assigning all areas covered by building footprints as impervious, this study significantly refined the delineation of urban impervious areas. This strategy ensures a thorough and precise mapping of the most impervious surfaces.

Pervious surface albedo estimation

To estimate PSA, this study adopted the Direct Estimation Approach proposed by Lin et al.⁷¹, leveraging Sentinel-2 surface reflectance data within the GEE platform. The method utilizes a Look-Up Table (LUT) framework that maps narrowband Sentinel-2 reflectance to broadband shortwave albedo. Specifically, the LUTs are constructed using ESA WorldCover land cover classes⁷² in combination with MODIS BRDF/albedo products (https://modis.gsfc.nasa.gov/data/dataprod/mod43.php), allowing us to generate albedo values tailored to each land cover type.

The resulting PSA represents white-sky albedo, which corresponds to the proportion of diffuse (non-directional) solar radiation reflected by the surface under clear-sky conditions. This is an important distinction, as white-sky albedo better captures background reflectivity and is particularly relevant in radiation-based microclimate models such as SOLWEIG. It was used as training data for developing the PSA prediction model. This albedo information provides critical input for modeling urban radiative environments, particularly in vegetated areas.

U-Net network architecture

The classic U-Net model, known for its proficiency in image classification and pixel-wise regression tasks, was employed to tackle three distinct tasks: classifying impervious surfaces, predicting impervious surface albedo, and predicting pervious surface albedo. U-Net⁵⁶ is recognized for its “U-shaped” design, featuring a contracting path to grasp context and a symmetric expanding path that ensures precise localization, making it ideal for dealing with high-resolution imagery like the NAIP dataset^73,74. The specific model configuration is illustrated in Fig. 5. For the classification task, this study utilized a categorical output layer in the network, tailored to classify the various surface types within the cities. For the regression tasks, the output layer was modified to predict continuous albedo values.

Preparation for impervious surfaces classification

For impervious surface classification, the U-Net model was trained to categorize each pixel, effectively distinguishing between impervious and pervious areas. The LA MULC dataset was first reclassified by combining its multiple land cover categories into two target categories. The pixel labels were manually assigned to one of three classes: (0) null/background, (1) impervious, or (2) pervious. Corresponding LA NAIP images from 2016 were then obtained to maintain temporal consistency. The extent and resolution of the two datasets were aligned with each other. Subsequently, both the NAIP imagery and the reclassified surface raster were divided into image patches of 512 × 512 pixels, utilizing the Rasterio library in Python to facilitate this process. Because the dimensions of the cropped dataset did not always conform to exact multiples of 512, this study removed the image patches deviating from 512 × 512 pixels. This approach resulted in a collection of 10,465 four-band RGB-NIR image patches and their associated single-band surface label patches, which include impervious, pervious, and null-value pixels.
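The reclassification and tiling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it works on in-memory NumPy arrays instead of Rasterio datasets, the function names `reclassify` and `tile` are hypothetical, and the land cover codes passed in are placeholders rather than the actual MULC class codes.

```python
import numpy as np

def reclassify(landcover, impervious_code, nodata_code=0):
    """Collapse a multi-class land cover raster into three labels:
    0 = null/background, 1 = impervious, 2 = pervious.
    The class codes are caller-supplied placeholders (assumption)."""
    labels = np.full(landcover.shape, 2, dtype=np.uint8)  # default: pervious
    labels[landcover == impervious_code] = 1
    labels[landcover == nodata_code] = 0
    return labels

def tile(array, size=512):
    """Yield (row, col, patch) for every full size x size tile.
    `array` is (H, W) or (bands, H, W); partial tiles at the right and
    bottom edges are skipped, mirroring the removal of non-conforming patches."""
    height, width = array.shape[-2:]
    for row in range(0, height - size + 1, size):
        for col in range(0, width - size + 1, size):
            yield row, col, array[..., row:row + size, col:col + size]
```

Running `tile` on the image and on the label raster with the same `size` keeps the two patch sets aligned, since both iterate over identical (row, col) offsets.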

Preparation for surface albedo prediction

To predict the ISA, the impervious roof albedo polygons were first used to clip the NAIP imagery, thereby generating NAIP roof data. This NAIP roof data was then divided into image patches of 512 × 512 pixels. For each patch, a corresponding initial albedo image patch of the same size (512 × 512 pixels) was created, with all pixel values initially set to zero. This preparatory step provided a blank canvas for accurately overlaying roof albedo data. During processing, roof albedo polygons were overlaid onto these zero-valued albedo patches. Whenever a pixel within the albedo patch intersected with a roof polygon, the specific albedo value of that roof was assigned to the corresponding pixel. This method ensured precise alignment between the NAIP roof patches and the roof albedo data, effectively capturing the albedo characteristics of impervious roof surfaces. Following this systematic approach, a total of 7,181 image patches were produced across four cities. Among them, 80% (5,743) of the samples were allocated for training and 20% (1,438) for validation.
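The "blank canvas" step for the ISA targets can be illustrated with a short sketch. It assumes each roof polygon has already been rasterized into a boolean mask on the patch grid (a polygon rasterizer would normally do this); the function name `burn_roof_albedo` and the input layout are hypothetical.

```python
import numpy as np

def burn_roof_albedo(shape, roofs):
    """Build an albedo target patch from per-roof masks.

    shape : (H, W) of the patch, e.g. (512, 512)
    roofs : iterable of (mask, albedo) pairs, where `mask` is a boolean
            (H, W) array marking the pixels inside one roof polygon and
            `albedo` is that roof's single albedo value.
    Pixels outside every roof keep the initial value of zero.
    """
    canvas = np.zeros(shape, dtype=np.float32)  # zero-valued initial patch
    for mask, albedo in roofs:
        canvas[mask] = albedo  # assign the roof's albedo to intersecting pixels
    return canvas
```

With this layout, zero doubles as the "no roof" value, so the target patch and the clipped NAIP roof patch share the same footprint.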

For the PSA prediction, a multi-step processing pipeline was implemented using NAIP imagery, which was initially divided into 512 × 512 pixel patches. These image patches were first input into a previously developed impervious surfaces classification model to identify impervious areas. The resulting impervious masks were then combined with building footprint data to generate a refined single-band impervious dataset, where pixels with a value of “1” denoted impervious surfaces and “0” represented pervious areas. To isolate pervious regions, the dataset was inverted such that pervious areas were assigned a value of “1” and impervious areas a value of “0”. This pervious mask was then applied element-wise to the corresponding NAIP patches, producing isolated pervious NAIP image patches. For each pervious patch, its geographical extent was used to clip the corresponding 10-meter resolution Sentinel-2 albedo data, ensuring that only the albedo values within the relevant spatial boundaries were retained. The clipped Sentinel-2 data were subsequently resampled to match the 0.6-meter resolution of the NAIP imagery using bilinear interpolation. Finally, the resampled albedo values were multiplied by the pervious mask to generate accurate pervious albedo datasets, ensuring spatial alignment in both extent and resolution. Philadelphia was selected as the study site for training the PSA model due to its diverse range of pervious surface types, including rivers, parks, and bare earth, features that offer an ideal testbed for model development. The final dataset comprised 4,164 image patches, each with dimensions of 512 × 512 pixels.
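The masking and resampling chain for the PSA targets can be sketched in NumPy. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, the coarse albedo array is assumed to cover exactly the same extent as the patch, and the bilinear resampling is a simple pixel-center formulation that may differ in edge handling from the GIS library used in the study.

```python
import numpy as np

def bilinear_resize(src, out_h, out_w):
    """Bilinear resampling of a 2-D array to (out_h, out_w), pixel-center aligned."""
    in_h, in_w = src.shape
    # Output pixel centers expressed in source pixel coordinates, clamped to the grid
    ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = src[y0][:, x0] * (1 - wx) + src[y0][:, x1] * wx
    bottom = src[y1][:, x0] * (1 - wx) + src[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def pervious_target(pred_impervious, building_mask, coarse_albedo):
    """Return (pervious_mask, target): refined pervious mask and masked fine albedo."""
    # Refine the predicted impervious mask: any building pixel is impervious
    impervious = pred_impervious.astype(bool) | building_mask.astype(bool)
    pervious = ~impervious  # invert: True where the surface is pervious
    fine = bilinear_resize(coarse_albedo, *pervious.shape)
    return pervious, fine * pervious  # zero out impervious pixels
```

The same `pervious` mask can be multiplied element-wise with each band of the NAIP patch to obtain the isolated pervious image patch.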

Before being input into the U-Net model, the image patches were augmented through random vertical and horizontal flipping. To mitigate the shadow effects occurring at the edges of patches (artifacts often caused by abrupt transitions during image tiling⁷⁵), a buffer zone of 256 pixels was added on each side of every patch, resulting in a dataset composed of 1024 × 1024 pixel patches. This buffer-based approach was uniformly applied to all previously prepared datasets. The buffer zones were subsequently cropped during the mosaicking process to generate the final surface albedo maps for each city^56,73.
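A minimal sketch of the buffer-and-crop idea follows. It assumes a 512-pixel core tile with a 256-pixel buffer on every side (512 + 2 × 256 = 1024) and zero fill where the buffer extends beyond the image; the fill strategy and the function names are assumptions, not details taken from the paper.

```python
import numpy as np

def pad_image(image, buffer=256):
    """Zero-pad a (bands, H, W) image by `buffer` pixels on every side."""
    return np.pad(image, ((0, 0), (buffer, buffer), (buffer, buffer)), mode="constant")

def buffered_patch(padded, row, col, size=512, buffer=256):
    """Cut the buffered window for the core tile whose top-left corner is
    (row, col) in the unpadded image. The padding shifts every pixel by
    `buffer`, so the window starts at (row, col) in the padded image."""
    span = size + 2 * buffer
    return padded[:, row:row + span, col:col + span]

def crop_core(patch, buffer=256):
    """Drop the buffer again, keeping only the central core tile."""
    return patch[..., buffer:-buffer, buffer:-buffer]
```

Applying `crop_core` to a model output before mosaicking keeps only predictions made with full spatial context on all sides.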

Model training and validation

With the preparation steps above, datasets were available for three distinct tasks: classification of impervious surfaces, prediction of ISA, and prediction of PSA. All the datasets were split into training and validation sets in an 8:2 ratio. Each model was trained over 100 epochs, using a learning rate of 0.0001 and a batch size of 4. These training parameters were tuned through preliminary hyperparameter optimization⁷⁶, which aimed at balancing computational efficiency against learning performance. For optimization, the Adam (Adaptive Moment Estimation) optimizer was selected due to its effectiveness in managing sparse gradients and its flexibility across various training scenarios.

The classification task’s objective was to minimize the cross-entropy loss, which is specifically designed for classification problems. The model with the best validation loss (i.e., validation loss of 0.22060 and an overall accuracy of 0.91987) was kept for prediction. The regression tasks, on the other hand, aimed at predicting continuous albedo values, so the mean squared error (MSE) loss function was employed to guide the models towards minimizing the difference between predicted and actual albedo values. The ISA model with the best validation loss (0.00041) was selected; for the PSA model, the best validation MSE was 0.00024. All these models were implemented using Python and PyTorch 2.1, and trained on an Ubuntu Linux workstation with 128 GB of memory, an Intel(R) Xeon(R) Gold 5320 CPU, and four NVIDIA A40 48 GB GPUs.

To validate the performance of the classification model, this study relied on several metrics (Table 2), including overall accuracy (OA), Precision, Recall, and F1-score. These metrics collectively provided a comprehensive evaluation of the model’s ability to accurately classify each pixel. For the regression model, the validation metrics included the coefficient of determination (R²), mean absolute error (MAE), and MSE, which offered insights into the model’s accuracy in predicting continuous albedo values. This rigorous approach to training and validation ensures that the models are both efficient and effective in addressing the unique challenges presented by each task.
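For reference, the metrics named above can be computed as in the sketch below. It uses the standard textbook definitions, which may differ in detail from the formulas listed in Table 2; the function names and the choice of class 1 (impervious) as the positive class are assumptions made for illustration.

```python
import numpy as np

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(y_true, y_pred, positive=1):
    """Overall accuracy, precision, recall and F1 for one positive class.
    Assumes at least one true and one predicted positive pixel."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "oa": float(np.mean(y_pred == y_true)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1_score(precision, recall)),
    }

def regression_metrics(y_true, y_pred):
    """R^2, MAE and MSE between observed and predicted albedo values."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "r2": float(1 - ss_res / ss_tot),
        "mae": float(np.mean(np.abs(err))),
        "mse": float(np.mean(err ** 2)),
    }
```

As a consistency check, applying `f1_score` to the precision and recall reported for the UIS model (0.9222 and 0.8495) gives roughly 0.8844, in line with the reported F1 score.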

Table 2 Assessment metrics for classification and regression performance.


Prediction

For prediction, the NAIP imagery for each city was segmented into image patches of 1024 × 1024 pixels. To accommodate patches that did not conform to these dimensions due to variability in NAIP imagery sizes across cities, additional rows and columns filled with zero values were appended. This preprocessing step ensured that all patches adhered to the uniform input size required by the predictive models.
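The zero-padding step can be sketched as below. The sketch assumes the extra rows and columns are appended at the bottom and right edges, which the text does not specify; `pad_to_multiple` is a hypothetical helper name.

```python
import numpy as np

def pad_to_multiple(image, size=1024):
    """Append zero-valued rows and columns so that the last two dimensions
    of `image` ((H, W) or (bands, H, W)) become exact multiples of `size`."""
    height, width = image.shape[-2:]
    pad_h = (-height) % size  # rows needed to reach the next multiple
    pad_w = (-width) % size   # columns needed to reach the next multiple
    pad = [(0, 0)] * (image.ndim - 2) + [(0, pad_h), (0, pad_w)]
    return np.pad(image, pad, mode="constant", constant_values=0)
```

After padding, the image divides evenly into `size` × `size` patches, and the padded margin can be trimmed from the mosaic using the original height and width.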

Initially, the impervious surface classification model was applied to identify impervious pixels within each image patch. To further improve the accuracy of the impervious surface mask, Microsoft building roof polygons were overlaid onto the classification results.

Following the generation of the comprehensive impervious mask, each NAIP image patch was clipped to extract impervious and pervious regions. These corresponding patches were then input into the ISA and PSA prediction models, respectively. Through this integrated prediction pipeline, comprising both classification and regression models, high-resolution ISA and PSA maps were produced for each city. The complete prediction process across all cities requires approximately one week of computation time.

Integrating impervious and pervious surface albedo

In the final step, after predicting the ISA and PSA for each 512 × 512 image patch, this study integrated these albedo predictions to complete the high-resolution mapping of albedo. Each corresponding pair of ISA and PSA patches was merged. These integrated patches were then assembled through a mosaicking process to develop a city-wide albedo map.
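One way to express the merge and mosaic steps is sketched below. It assumes the impervious mask selects which prediction applies at each pixel and that patches are stored in row-major order on a regular grid; both are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def merge_albedo(isa, psa, impervious_mask):
    """Take ISA where the mask marks impervious pixels and PSA elsewhere."""
    return np.where(impervious_mask.astype(bool), isa, psa)

def mosaic(patches, n_rows, n_cols):
    """Assemble equally sized 2-D patches (row-major list) into one array."""
    grid = [[patches[r * n_cols + c] for c in range(n_cols)] for r in range(n_rows)]
    return np.block(grid)
```

Because every pixel belongs to exactly one of the two surface classes, the merged patch has no gaps or overlaps between the two predictions.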

This methodological approach allowed us to produce a detailed albedo map at sub-meter resolution encompassing both impervious and pervious surfaces across the entire urban landscape. By accurately representing the varied albedo across different urban surface types, this refined dataset provides an exceptional level of detail and precision. The comprehensive urban albedo map thus generated is invaluable for applications in microclimate modeling, environmental monitoring, and urban planning, offering insights into the radiative properties of urban environments and aiding in the development of strategies for sustainable urban management.

Data Records

The sub-meter resolution albedo dataset for 34 major U.S. cities is publicly available through Figshare⁷⁷. The albedo data files are provided in GeoTIFF format. Each file contains the albedo map of the respective city, projected in the Pseudo-Mercator coordinate system. Details concerning the data size and specific attributes of each city’s dataset are summarized in Table 1. The urban surface albedo maps for all 34 cities are shown in Fig. 6. Additionally, this study has developed a visualization website to facilitate an intuitive examination of this data, accessible at https://shengaoyi.github.io/#/Albedopedia.

Technical Validation

UIS model performance

The performance metrics presented in Table 3 for the UIS classification model, which is based on the U-Net architecture, include the benefit of integrating the building mask data. This method achieved an F1 Score of 0.8844, highlighting a balance between precision and recall that is critical for the accurate classification of urban surfaces. The Overall Accuracy (OA) reached 0.9208, signifying that more than 92% of the classifications were correct, an indication of the model’s effectiveness across diverse urban settings. The model’s Precision, noted at 0.9222, demonstrates its ability to correctly identify impervious surfaces, enhanced significantly by the building masks, which ensure that areas such as buildings are consistently recognized as impervious. This approach helps in reducing false positives, where pervious areas might otherwise be misclassified. The Recall rate of 0.8495 also underscores the model’s capability in capturing most of the actual impervious surfaces present in the imagery, aided by the definitive outlines provided by the building masks. Collectively, these metrics underscore the reliability and accuracy of the integrated method in urban impervious surface classification.

Table 3 Performance metrics for UIS classification model.


ISA model performance

Table 4 and Fig. 7 present the performance of the ISA prediction models across four cities, comparing three different architectures: FCN, CNN, and U-Net. The results are broken down by city and include R², MAE, and MSE. On average, the U-Net model outperforms the other architectures with an R² value of 0.9028, suggesting an excellent fit to the observed data. It also exhibits the lowest average MAE and MSE (0.0057 and 0.0002, respectively), indicating its superior accuracy and consistency in predicting ISA.

Table 4 ISA model performance for the selected four cities.


PSA model performance

Table 5 and Fig. 8 evaluate the performance of different models used for predicting PSA, specifically comparing the CNN, FCN, and U-Net architectures. The U-Net model again shows the best performance with the highest R² value of 0.9288, which indicates a strong predictive capability. It also has the lowest MAE (0.0069) and MSE (0.0002), confirming its precision and accuracy in estimating pervious surface albedo. The high R² values across all models demonstrate their effectiveness, but U-Net’s lower error rates highlight its greater reliability for detailed albedo prediction tasks.

Table 5 PSA model performance.


These evaluations collectively demonstrate the advanced capabilities of deep learning architectures, especially U-Net, in handling complex tasks such as albedo prediction and impervious surface classification within urban environments. The high accuracy and precision of these models offer promising prospects for applications in urban planning and microclimate modeling. Figure 9 visually illustrates these capabilities by showcasing detailed mappings of both pervious and impervious surfaces and the urban surface albedo within a city, confirming the models’ effectiveness in accurately predicting diverse urban surface albedo.

Usage Notes

Figure 10 illustrates the distribution of surface albedo across 34 major U.S. cities, showcasing both the mean values and the variability (standard deviation) of albedo measurements within each city.

Statistics of the data

The albedo values vary significantly across the cities, with mean values ranging from as low as 0.099 in Seattle to as high as 0.199 in Las Vegas, indicating a broad spectrum of urban surface reflectivity. Standard deviations are generally small, underscoring a consistent albedo measurement within individual cities but highlighting a substantial difference between different urban environments. Cities like Miami and Las Vegas, known for their sunnier and drier climates, show higher mean albedo, suggesting urban surfaces that reflect more solar radiation. Conversely, cities with cooler and cloudier climates such as Seattle and San Francisco exhibit lower mean albedo. These statistics underscore the diverse thermal characteristics of these cities, which are critical for understanding urban heat management and planning for climate adaptation strategies.

Limitations

Despite the robust performance of the models and the comprehensive methodology applied in this study, several limitations must be acknowledged. One significant challenge is the accurate representation of urban canyon effects, particularly in densely built environments like New York City, where building shadows may lead to underestimation of albedo¹⁵. Additionally, the exclusion of snow from the analysis could introduce seasonal biases, as snow significantly alters surface reflectivity. The dataset was derived from summer and non-snow-covered scenes, which may limit its applicability for year-round modeling in certain climate zones. Moreover, while the model generalizes well across 34 cities, differences in urban morphology and local environmental characteristics may still limit model transferability to regions with significantly different landscape structures or spectral characteristics. These limitations should be taken into account when applying this dataset in new contexts.

Code availability

All code used to generate the dataset is available on GitHub (https://github.com/ShengaoYi/Sub-meter-surfaces-albedo-prediction).

References

Dedieu, G., Deschamps, P. & Kerr, Y. Satellite estimation of solar irradiance at the surface of the earth and of surface albedo using a physical model applied to Meteosat data. J. Appl. Meteorol. Climatol. 26, 79–87 (1987).
Henderson-Sellers, A. & Wilson, M. Surface albedo data for climatic modeling. Rev. Geophys. 21, 1743–1778 (1983).
Dickinson, R. E. Land surface processes and climate—surface albedos and energy balance. In Advances in geophysics, vol. 25, 305–353 (Elsevier, 1983).
Liang, S., Wang, K., Zhang, X. & Wild, M. Review on estimation of land surface radiation and energy budgets from ground measurement, remote sensing and model simulations. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 3, 225–240 (2010).
Taha, H., Akbari, H., Rosenfeld, A. & Huang, J. Residential cooling loads and the urban heat island—the effects of albedo. Build. environment 23, 271–283 (1988).
Yang, J., Wang, Z.-H. & Kaloush, K. E. Environmental impacts of reflective materials: Is high albedo a ‘silver bullet’ for mitigating urban heat island? Renew. Sustain. Energy Rev. 47, 830–843 (2015).
Morini, E., Touchaei, A. G., Castellani, B., Rossi, F. & Cotana, F. The impact of albedo increase to mitigate the urban heat island in Terni (Italy) using the WRF model. Sustainability 8, 999 (2016).
Tahooni, A., Kakroodi, A. & Kiavarz, M. Monitoring of land surface albedo and its impact on land surface temperature (LST) using time series of remote sensing data. Ecol. Informatics 75, 102118 (2023).
Saher, R., Stephen, H. & Ahmad, S. Effect of land use change on summertime surface temperature, albedo, and evapotranspiration in Las Vegas Valley. Urban Clim. 39, 100966 (2021).
Trlica, A., Hutyra, L., Schaaf, C., Erb, A. & Wang, J. Albedo, land cover, and daytime surface temperature variation across an urbanized landscape. Earth’s Futur. 5, 1084–1101 (2017).
Smith, I. A., Fabian, M. P. & Hutyra, L. R. Urban green space and albedo impacts on surface temperature across seven united states cities. Sci. The Total. Environ. 857, 159663 (2023).
Yi, S. et al. Assessing the differential impact of vegetated and built-up areas on heat exposure environment: A case study of Los Angeles. Build. Environ. 112538 (2025).
Lopez-Cabeza, V. P., Alzate-Gaviria, S., Diz-Mellado, E., Rivera-Gomez, C. & Galan-Marin, C. Albedo influence on the microclimate and thermal comfort of courtyards under mediterranean hot summer climate conditions. Sustain. Cities Soc. 81, 103872 (2022).
Salata, F., Golasi, I., de Lieto Vollaro, A. & de Lieto Vollaro, R. How high albedo and traditional buildings’ materials and vegetation affect the quality of urban microclimate. A case study. Energy Build. 99, 32–49 (2015).
Salvati, A. et al. Impact of reflective materials on urban canyon albedo, outdoor and indoor microclimates. Build. Environ. 207, 108459 (2022).
Middel, A., Turner, V. K., Schneider, F. A., Zhang, Y. & Stiller, M. Solar reflective pavements—a policy panacea to heat mitigation? Environ. research letters 15, 064016 (2020).
Hess, J. J. et al. Public health preparedness for extreme heat events. Annu. Rev. Public Heal. 44, 301–321 (2023).
Li, Y., Ren, T., Kinney, P. L., Joyner, A. & Zhang, W. Projecting future climate change impacts on heat-related mortality in large urban areas in China. Environ. research 163, 171–185 (2018).
Chow, W. T., Chuang, W.-C. & Gober, P. Vulnerability to extreme heat in metropolitan phoenix: spatial, temporal, and demographic dimensions. The Prof. Geogr. 64, 286–302 (2012).
Kenny, G. P., Yardley, J., Brown, C., Sigal, R. J. & Jay, O. Heat stress in older individuals and patients with common chronic diseases. Cmaj 182, 1053–1060 (2010).
Osberghaus, D. & Abeling, T. Heat vulnerability and adaptation of low-income households in Germany. Glob. Environ. Chang. 72, 102446 (2022).
Harlan, S. L., Brazel, A. J., Prashad, L., Stefanov, W. L. & Larsen, L. Neighborhood microclimates and vulnerability to heat stress. Soc. science & medicine 63, 2847–2863 (2006).
Sen, J. & Nag, P. K. Human susceptibility to outdoor hot environment. Sci. total environment 649, 866–875 (2019).
Moda, H. M., Filho, W. L. & Minhas, A. Impacts of climate change on outdoor workers and their safety: some research priorities. Int. journal environmental research public health 16, 3458 (2019).
Markolf, S. A., Hoehne, C., Fraser, A., Chester, M. V. & Underwood, B. S. Transportation resilience to climate change and extreme weather events–beyond risk and robustness. Transp. policy 74, 174–186 (2019).
Clark, S. S., Chester, M. V., Seager, T. P. & Eisenberg, D. A. The vulnerability of interdependent urban infrastructure systems to climate change: could Phoenix experience a Katrina of extreme heat? Sustain. Resilient Infrastructure 4, 21–35 (2019).
Kjellstrom, T., Kovats, R. S., Lloyd, S. J., Holt, T. & Tol, R. S. The direct impact of climate change on regional labor productivity. Arch. environmental & occupational health 64, 217–227 (2009).
Benz, S. A. & Burney, J. A. Widespread race and class disparities in surface urban heat extremes across the United States. Earth’s Futur. 9, e2021EF002016 (2021).
Gronlund, C. J. Racial and socioeconomic disparities in heat-related health effects and their mechanisms: a review. Curr. epidemiology reports 1, 165–173 (2014).
Crook, J. & Forster, P. Comparison of surface albedo feedback in climate models and observations. Geophys. Res. Lett. 41, 1717–1723 (2014).
Yang, S., Wang, L. L., Stathopoulos, T. & Marey, A. M. Urban microclimate and its impact on built environment–a review. Build. Environ. 110334 (2023).
Taha, H., Sailor, D. & Akbari, H. High-albedo materials for reducing building cooling energy use. Lawrence Berkeley Natl. Lab. (1992).
Xu, X., Gregory, J. & Kirchain, R. The impacts of surface albedo on climate and building energy consumption: Review and comparative analysis. In Transportation Research Board 95th Annual Meeting. Paper submitted for presentation (2016).
Touchaei, A. G. & Akbari, H. Evaluation of the seasonal effect of increasing albedo on urban climate and energy consumption of buildings in Montreal. Urban Clim. 14, 278–289 (2015).
Baniassadi, A., Sailor, D. J., Crank, P. J. & Ban-Weiss, G. A. Direct and indirect effects of high-albedo roofs on energy consumption and thermal comfort of residential buildings. Energy Build. 178, 71–83 (2018).
Vahmani, P. & Ban-Weiss, G. A. Impact of remotely sensed albedo and vegetation fraction on simulation of urban climate in WRF-urban canopy model: A case study of the urban heat island in Los Angeles. J. Geophys. Res. Atmospheres 121, 1511–1531 (2016).
Oswald, S. M. et al. Coupling of urban energy balance model with 3-d radiation model to derive human thermal (dis) comfort. Int. journal biometeorology 63, 711–722 (2019).
Lindberg, F., Holmer, B. & Thorsson, S. Solweig 1.0–modelling spatial variations of 3d radiant fluxes and mean radiant temperature in complex urban settings. Int. journal biometeorology 52, 697–713 (2008).
Hautecœur, O. & Leroy, M. M. Surface bidirectional reflectance distribution function observed at global scale by POLDER/ADEOS. Geophys. Res. Lett. 25, 4197–4200 (1998).
Li, Z. & Garand, L. Estimation of surface albedo from space: A parameterization for global application. J. Geophys. Res. Atmospheres 99, 8335–8350 (1994).
Liang, S. Narrowband to broadband conversions of land surface albedo I: Algorithms. Remote. sensing environment 76, 213–238 (2001).
Geiger, B., Carrer, D., Franchisteguy, L., Roujean, J.-L. & Meurey, C. Land surface albedo derived on a daily basis from Meteosat Second Generation observations. IEEE Transactions on Geosci. Remote. Sens. 46, 3841–3856 (2008).
He, T. et al. Evaluating land surface albedo estimation from Landsat MSS, TM, ETM+, and OLI data based on the unified direct estimation approach. Remote. Sens. Environ. 204, 181–196 (2018).
Bonafoni, S. & Sekertekin, A. Albedo retrieval from sentinel-2 by new narrow-to-broadband conversion coefficients. IEEE Geosci. Remote. Sens. Lett. 17, 1618–1622 (2020).
Schaaf, C. B. et al. First operational BRDF, albedo nadir reflectance products from MODIS. Remote. sensing Environ. 83, 135–148 (2002).
Lucht, W., Schaaf, C. B. & Strahler, A. H. An algorithm for the retrieval of albedo from space using semiempirical BRDF models. IEEE Transactions on Geosci. Remote. sensing 38, 977–998 (2000).
Barnsley, M. et al. Characterizing the spatial variability of broadband albedo in a semidesert environment for MODIS validation. Remote. Sens. Environ. 74, 58–68 (2000).
Lucht, W. & Lewis, P. Theoretical noise sensitivity of BRDF and albedo retrieval from the EOS-MODIS and MISR sensors with respect to angular sampling. Int. journal remote sensing 21, 81–98 (2000).
Pokrovsky, I., Pokrovsky, O. & Roujean, J.-L. Development of an operational procedure to estimate surface albedo from the SEVIRI/MSG observing system by using POLDER BRDF measurements: II. Comparison of several inversion techniques and uncertainty in albedo estimates. Remote. Sens. Environ. 87, 215–242 (2003).
Pokrovsky, O. & Roujean, J.-L. Land surface albedo retrieval via kernel-based BRDF modeling: I. Statistical inversion method and model comparison. Remote. Sens. Environ. 84, 100–119 (2003).
Gao, F. et al. Angular effects and correction for medium resolution sensors to support crop monitoring. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 7, 4480–4489 (2014).
Shuai, Y., Masek, J. G., Gao, F. & Schaaf, C. B. An algorithm for the retrieval of 30-m snow-free albedo from Landsat surface reflectance and MODIS BRDF. Remote. Sens. Environ. 115, 2204–2216 (2011).
Franch, B., Vermote, E. & Claverie, M. Intercomparison of Landsat albedo retrieval techniques and evaluation against in situ measurements across the US SURFRAD network. Remote. Sens. Environ. 152, 627–637 (2014).
Wang, C., Wang, Z.-H., Kaloush, K. E. & Shacat, J. Cool pavements for urban heat island mitigation: A synthetic review. Renew. Sustain. Energy Rev. 146, 111171 (2021).
Santamouris, M. Using cool pavements as a mitigation strategy to fight urban heat island—a review of the actual developments. Renew. Sustain. Energy Rev. 26, 224–240 (2013).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, 234–241 (Springer, 2015).
Wu, S. et al. Satellite observations reveal a decreasing albedo trend of global cities over the past 35 years. Remote. Sens. Environ. 303, 114003 (2024).
Voogt, J. A. & Oke, T. R. Thermal remote sensing of urban climates. Remote. sensing environment 86, 370–384 (2003).
Weng, Q. Thermal infrared remote sensing for urban climate and environmental studies: Methods, applications, and trends. ISPRS J. photogrammetry remote sensing 64, 335–344 (2009).
Li, X. Investigating the spatial distribution of resident’s outdoor heat exposure across neighborhoods of Philadelphia, Pennsylvania using urban microclimate modeling. Sustain. Cities Soc. 72, 103066 (2021).
Sima, L., Liu, Y., Zhang, J. & Shang, X. Research on Summer Hourly Climate-Influencing Factors in Suburban Areas of Cities in CFA Zone—Taking Chengdu, China as an Example. Buildings 14, 3083 (2024).
Li, X., Wang, G., Zaitchik, B., Hsu, A. & Chakraborty, T. Sensitivity and vulnerability to summer heat extremes in major cities of the United States. Environ. Res. Lett. 19, 094039 (2024).
Leslie, C. R., Serbina, L. O. & Miller, H. M. Landsat and agriculture—Case studies on the uses and benefits of Landsat imagery in agricultural monitoring and production. Tech. Rep., US Geological Survey (2017).
Maxwell, A. E., Warner, T. A., Vanderbilt, B. C. & Ramezan, C. A. Land cover classification and feature extraction from National Agriculture Imagery Program (NAIP) Orthoimagery: A review. Photogramm. Eng. & Remote. Sens. 83, 737–747 (2017).
Li, X. & Wang, G. Examining runner’s outdoor heat exposure using urban microclimate modeling and GPS trajectory mining. Comput. Environ. Urban Syst. 89, 101678 (2021).
Lin, J., Wang, Q. & Li, X. Socioeconomic and spatial inequalities of street tree abundance, species diversity, and size structure in New York City. Landsc. Urban Plan. 206, 103992 (2021).
Ban-Weiss, G. A., Woods, J. & Levinson, R. Using remote sensing to quantify albedo of roofs in seven California cities, Part 1: Methods. Sol. Energy 115, 777–790 (2015).
Ban-Weiss, G. A., Woods, J., Millstein, D. & Levinson, R. Using remote sensing to quantify albedo of roofs in seven California cities, Part 2: Results and application to climate modeling. Sol. energy 115, 791–805 (2015).
Drusch, M. et al. Sentinel-2: Esa’s optical high-resolution mission for GMES operational services. Remote. sensing Environ. 120, 25–36 (2012).
Gorelick, N. et al. Google earth engine: Planetary-scale geospatial analysis for everyone. Remote. sensing Environ. 202, 18–27 (2017).
Lin, X. et al. Estimating 10-m land surface albedo from Sentinel-2 satellite observations using a direct estimation approach with Google Earth Engine. ISPRS J. Photogramm. Remote. Sens. 194, 1–20 (2022).
Zanaga, D. et al. ESA WorldCover 10 m 2021 v200 (2022).
Wagner, F. H. et al. Sub-meter tree height mapping of California using aerial images and LiDAR-informed U-Net model. Remote. Sens. Environ. 305, 114099 (2024).
de Roda Husman, S. et al. A high-resolution record of surface melt on Antarctic ice shelves using multi-source remote sensing data and deep learning. Remote. Sens. Environ. 301, 113950 (2024).
Li, X. & Wang, G. GPU parallel computing for mapping urban outdoor heat exposure. Theor. Appl. Climatol. 145, 1101–1111 (2021).
Bengio, Y. Practical recommendations for gradient-based training of deep architectures. In Neural networks: Tricks of the trade: Second edition, 437–478 (Springer, 2012).
Yi, S., Li, X., Liu, Y., Dong, X. & Tu, W. A sub-meter resolution urban surface albedo dataset for 34 U.S. cities based on deep learning. Dataset https://doi.org/10.6084/m9.figshare.27850965 (2025).

Acknowledgements

This work is supported by the U.S. National Science Foundation Award (No. 2314709).

Author information

Authors and Affiliations

Department of City and Regional Planning, University of Pennsylvania, Philadelphia, PA, 19104, USA
Shengao Yi & Xiaojiang Li
Penn Institute for Urban Research, University of Pennsylvania, Philadelphia, PA, 19104, USA
Shengao Yi
Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, 19104, USA
Shengao Yi
Jacobs Technion-Cornell Institute at Cornell Tech, Cornell University, New York, NY, USA
Yixuan Liu
Landscape Ecology Lab, Geography Department, Humboldt-Universität zu Berlin, Berlin, 10099, Germany
Xinyu Dong
Guangdong Key Laboratory for Urban Informatics, Guangdong-Hong Kong-Macao Joint Laboratory for Smart Cities, and Shenzhen Key Laboratory of Spatial Information Smart Sensing and Services, and Department of Urban Informatics, School of Architecture and Urban Planning, Shenzhen University, Shenzhen, 518060, China
Wei Tu

Contributions

Shengao Yi: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Xiaojiang Li: Writing – review & editing, Supervision, Project administration, Methodology, Funding acquisition, Data curation, Conceptualization, Resources. Yixuan Liu: Writing – review & editing, Visualization, Methodology. Xinyu Dong: Writing – review & editing, Visualization. Wei Tu: Writing – review & editing, Resources. All authors reviewed the manuscript.

Corresponding author

Correspondence to Shengao Yi.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Yi, S., Li, X., Liu, Y. et al. A sub-meter resolution urban surface albedo dataset for 34 U.S. cities based on deep learning. Sci Data 12, 789 (2025). https://doi.org/10.1038/s41597-025-05109-2

Received: 01 December 2024
Accepted: 28 April 2025
Published: 14 May 2025
DOI: https://doi.org/10.1038/s41597-025-05109-2
