Abstract
Improvements in high-resolution satellite remote sensing and computational advancements have sped up the development of global datasets that delineate urban land, crucial for understanding climate risks in our increasingly urbanizing world. Here, we analyze urban land cover patterns across spatiotemporal scales from several such current-generation products. While all the datasets show a rapidly urbanizing world, with global urban land nearly tripling between 1985 and 2015, there are substantial discrepancies in urban land area estimates among the products influenced by scale, differing urban definitions, and methodologies. We discuss the implications of these discrepancies for several use cases, including for monitoring urban climate hazards and for modeling urbanization-induced impacts on weather and climate from regional to global scales. Our results demonstrate the importance of choosing fit-for-purpose datasets for examining specific aspects of historical, present, and future urbanization with implications for sustainable development, resource allocation, and quantification of climate impacts.
Introduction
Urbanization, the global shift from rural to urban societies, leads to the replacement of natural land, as well as cropland, with roads, buildings, pavements, parks, etc1. These land use/land cover transitions, combined with anthropogenic activities in cities, together impact local energy, water, and carbon budgets2,3. Currently, over half of the global human population lives in urban areas, a share that is expected to increase to around 68% by 20504. These urbanization estimates are defined based on population thresholds, with no standard threshold across countries5. Moreover, these population-based definitions do not necessarily correspond to the physical extent of urbanized land, which primarily modulates local to regional climates2,3, due to differing population densification patterns in different regions of the world.
The proliferation of global satellite imagery and remote sensing techniques has led to estimates of urbanization from a physical perspective using spatially continuous observations of the spectral reflectance and emissions from the Earth’s surface6. Both physical and population-based estimates of urban land have a wide range of applications, from quantifying risks to urban populations7,8,9, to providing boundary constraints for isolating urban climate impacts8,10,11, to being incorporated as surface inputs in weather and climate models across scales12,13,14,15. In the last decade in particular, there have been multiple estimates of urban land, or some proxy for urbanization, across space and time16,17,18,19,20. These developments have paralleled the rise of cloud computing capabilities and satellite missions with measurements at finer spatial scales. There are currently at least four 10 m resolution global land use land cover products that include an urban class, the earliest of which was released in 201921, and several urban-specific datasets that span multiple decades16,22.
Due to differences in data sources, methods, and even definitions, there have traditionally been large discrepancies in estimates of urban land among datasets23. Previous studies that have explored these discrepancies have focused on earlier-generation datasets that were generally coarser (~1 km), in line with the resolutions of the commonly deployed Earth observing satellites of the time, and did not include enough observations to provide time series of urban expansion23,24. More recent comparisons of higher resolution datasets, primarily in the remote sensing literature, are limited because they are either restricted to regional extents25 or focus on comparing product accuracies rather than area estimates, product typologies, and their implications19. One recent study26 examined the discrepancies between six 30 m global urban land products, though the main implication of these differences examined was for future urban projections. Here, we provide a comprehensive comparison of almost all (see “Methods” for selection criteria) the medium to fine resolution (100 m to 10 m) global urban datasets currently available (Table S1), showing that the definition of “urban” remains a critical issue across these datasets, particularly in the newer 10 m resolution products. More importantly, we examine the consequences of the choice of dataset on a few common use cases relevant for examining urban climate change and its human impacts. For these use cases, the choice of dataset can have a large influence on the magnitude of urbanization impacts and, in some cases, even the direction of the urban climate signal.
Results
Country level urban land and its variability
Large variabilities in the degree of present-day urban land (for the year 2019 or 2020; see “Methods”) are seen across countries (Fig. 1a) based on eight global datasets. China is the country with the most urban land (264,403 km² covering 2.82% of its total area based on the eight-product mean), followed by the United States (183,735 km²; 1.94%), India (85,760 km²; 2.77%), and Russia (59,311 km²; 0.35%) (Fig. 1b). Overall, Vatican City and Singapore show the highest percentages of urbanization (eight-product mean of 79.87% and 53.7%, respectively). Ignoring uninhabited territories, on the low end, there are several overseas island territories, such as Cocos, Midway, and Pitcairn Islands, with negligible urban land detected by these satellite-derived products. The present-day global urban land percentage ranges from 0.52% in the World Settlement Footprint (WSF) 2019 dataset20 to a roughly four times higher estimate of 2.07% according to the Esri Land Cover product (for the year 2020; 1.93% for 2019)27. The eight-product mean global urban percentage is ~0.95% (1.27 million km²).
Fig. 1: a Shows mean urban percentage based on eight global estimates of urban land by country. b Shows the overall urban percentage from these eight datasets for the ten countries with the highest mean urban area (increasing downwards). c Shows the coefficient of variation (standard deviation divided by mean) among those urban estimates. For (a, c), the respective values are annotated for some of the bigger countries for context. The legend value ranges exclude the upper bound.
Since countries have different baseline levels of urbanization (Fig. 1a), we calculate the coefficient of variation or Normalized Root-Mean-Square Deviation (standard deviation across products divided by their mean expressed as a percentage) to standardize the degree of disagreement between these eight data products (Fig. 1c). Larger disagreements are seen for Greenland, countries in East Africa (Ethiopia, Kenya, Uganda, and Tanzania), Russia, countries in south Asia (Afghanistan, Pakistan, India, and Myanmar), Paraguay, etc. Better agreements between datasets are seen for Brazil, Argentina, Japan, most countries in Western Europe, parts of Central and South Africa, and Canada.
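The disagreement metric described above can be sketched in a few lines. This is a minimal illustration, not the study's own code: the function name, the `population` flag, and the example values are hypothetical, and the text does not say whether a sample or population standard deviation is used, so both are offered.

```python
from statistics import mean, pstdev, stdev


def coefficient_of_variation(values, population=False):
    """Return the standard deviation divided by the mean, as a percentage.

    `population` selects the population (N) rather than the sample (N - 1)
    standard deviation; which one the study uses is not stated, so this
    switch is an assumption of the sketch.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two product estimates")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    spread = pstdev(values) if population else stdev(values)
    return 100.0 * spread / avg


# Hypothetical urban percentages for one country from eight products
# (illustrative numbers only, not taken from the study).
example_estimates = [0.5, 0.8, 0.9, 1.0, 1.1, 1.2, 1.5, 2.0]
cv_percent = coefficient_of_variation(example_estimates)
print(f"Coefficient of variation: {cv_percent:.1f}%")
```

Because the metric divides by the mean, it becomes unstable where the mean urban percentage is close to zero, which may partly explain high values for sparsely urbanized territories.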
Regional- to city-scale differences across datasets
We also compare the present-day estimates of urban land for four distinct regions in the world: the Great Lakes and Mid Atlantic regions in North America, and the Indo-Gangetic and Yangtze River Basins in Asia (Fig. 2a). While all four of these regions are heavily urbanized, in the last few decades, the first two have shown stable urbanization levels, and the latter two have shown significant urban growth (see next section for analyses of urban growth). For present-day urban percentage, higher variabilities are seen for the Indo-Gangetic and Yangtze River Basins (coefficients of variation of 88.1% and 83.3%, respectively) than for the Great Lakes and Mid Atlantic regions (56.5% and 62.8%, respectively). We also calculate the coefficient of variation between the eight present-day estimates for 0.9° × 1.25° grids over the Earth’s surface (Fig. 2b). This approach is similar to that of Mu et al.26 and avoids the variabilities in how regions should be defined beyond geopolitical boundaries. Although we use a different (partially overlapping) set of urban data products compared to the Mu et al.26 study, there are some common regions with disagreements between datasets in both studies, namely Southeast and Central Asia. We also find large disagreements in East Africa (Fig. 2b). The differences between these products are evident at all spatial scales, from global (for our entire planet) to national (by country) to the grids described above to local (for individual urban agglomerations). For example, large local scale differences are evident for the highly urbanized Delhi Metropolitan Area in India (Fig. 3a) and over the Shanghai Metropolitan Area in China (Fig. 3b). Since fully exploring these differences between datasets for all regions of the world is not possible here, we developed a web app for this purpose (https://ee-tc25.projects.earthengine.app/view/urbancomparison).
Fig. 2: a Shows the urban percentage across eight datasets for four selected regions in the world. The extent and location of these regions are shown in the inset. b Shows the coefficient of variation (standard deviation divided by mean) among those urban estimates for 0.9° × 1.25° grids. The legend value ranges exclude the upper bound.
The disagreements among the data products reflect differences in methods, inputs, and native resolutions28,29. However, a key difficulty in making any apples-to-apples comparisons between these datasets is that, while all the products represent some aspect of physical urbanization, they define “urban” differently. These specific definitions are already baked into the training data (for supervised learning methods), accuracy estimates, and pre- and post-processing steps. For example, among the four 10 m resolution present-day estimates of urban land—WSF 201920, Esri Land Cover30, European Space Agency (ESA) WorldCover31, and Dynamic World32—the Dynamic World dataset calls this class “Built Area” and includes urban vegetation and green space in that definition, while the ESA WorldCover calls the class “Built-up” and explicitly excludes urban vegetation in their class definition (Table S1). Interestingly, while the Esri Land Cover dataset does not mention inclusion of urban trees, it generally shows much higher urban land percentages across scales than ESA WorldCover (Figs. 1b, 2a, 3, 4), even though both are based on Sentinel-2 data. Some of the differences between these three datasets are related to methodology, in that the Dynamic World and Esri Land Cover datasets use convolutional neural networks that consider contextual information in the classification through the use of convolution kernels, while the ESA WorldCover uses a random forest approach with each pixel classified independently22. Another difference is due to the choice of minimum mapping unit, which is 50 m × 50 m for Dynamic World32 and therefore necessitates a mosaic of built and natural surfaces in areas labeled as “Built Area”. Finally, the WSF 2019 dataset20 is for human settlements and excludes roads. 
For readability, and in line with how some of these products have been used in the scientific literature as a proxy for physical urbanization33,34,35, we refer to all of them as “urban” in the present study.
Urban growth over time
The explosion of medium-resolution global urban products, and global land cover datasets in general, has been largely made possible by the free release of the Landsat archive in 200836. Consequently, there are several long-term estimates of urban land at the Landsat resolution (30 m) starting from the 1980s. In contrast, the first-generation global urban land cover products were generally limited to the Moderate Resolution Imaging Spectroradiometer (MODIS) resolution of 250–500 m and started around the year 200124. Some of these multi-year urban land cover products do not extend to 2019/2020 and thus were not included in the earlier comparison of present-day urbanization. In total, we examine twelve global data products, including the complete time series (when multiple years are available) of the eight products considered earlier. The four new datasets considered are the Global Artificial Impervious Area (GAIA)18, World Settlement Evolution (WSE)20, the Copernicus Global Land Service (CGLS) product37, and the Global Urban Footprint (GUF)38. All long-term urban datasets show large global urban growth over time during their respective time spans (Fig. 4a). For GISA (Global Impervious Surface Area)39, GAIA, and WSE, the three datasets with the longest time series, global urban percentage increased by 297.4%, 123.4%, and 111.2%, respectively (three-product mean of 177.3%), for the 1985–2015 common period. This pattern of rapid urban expansion is consistently echoed across all continents and for both absolute area (Fig. S1) and urban percentage (Figs. 4, S2a; Fig. S2 is for the nations not in the continents examined in this main text), with Africa, Asia, and South America showing a notable rise (three-product means of 226.2%, 425.3%, and 186.5%, respectively), although from a lower baseline compared to North America and Europe.
Note that we primarily focus on urban percentage throughout this analysis and the manuscript, rather than the absolute area, since the former is more intuitive for a wide range of audiences and allows easy comparisons for the extent of urbanization between regions with wide variations in area.
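As a quick check of the arithmetic behind the reported growth figures, the sketch below averages the three 1985–2015 percentage increases quoted above and converts the result to a multiplicative factor. The helper names are hypothetical; only the three percentages come from the text.

```python
def percent_change(start, end):
    """Percentage change from `start` to `end`, relative to `start`."""
    if start == 0:
        raise ValueError("start value must be non-zero")
    return 100.0 * (end - start) / start


def growth_factor(percent_increase):
    """Convert a percentage increase into a multiplicative factor."""
    return 1.0 + percent_increase / 100.0


# Global urban percentage increases for 1985-2015 as quoted in the text.
reported_increases = {"GISA": 297.4, "GAIA": 123.4, "WSE": 111.2}

three_product_mean = sum(reported_increases.values()) / len(reported_increases)
mean_factor = growth_factor(three_product_mean)

print(f"Three-product mean increase: {three_product_mean:.1f}%")
print(f"Equivalent growth factor: {mean_factor:.2f}x")
```

A mean increase of about 177% corresponds to a factor of roughly 2.8, which is the sense in which global urban land "nearly tripled" over the period.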
The impact of urban definition and methodology is also reflected in the variability of the change in urban land percentage over time across datasets. For instance, the percentage of urban land in the WSF 2019 dataset is much lower than the values in WSF 201540 for all continents, even though there should have been some urbanization between 2015 and 2019. This is for two reasons: (1) the WSF 2019 dataset uses Sentinel-2 instead of Landsat 8, the former being much finer (10 m versus 30 m), which introduces a scale effect29, and (2) the WSF 2019 dataset uses ancillary data to mask out roads to focus only on pixels where people live (Table S1). Another evident difference in time series arises when comparing the MODIS data41 with the others. The global percentage of urban land increases by only 5.5% between 2001 and 2015 according to the MODIS Land Cover, yet the GISA/GAIA/WSE mean change for the same period is around 40.2%, almost an order of magnitude higher. The low estimate of urban expansion in MODIS is a function of its definition of urban as a minimum of 30% impervious at the 500 m scale42. Conceptually, this means that a MODIS pixel starts being classified as urban at a lower percentage than in other datasets, which generally consider the dominant land cover (which can exceed 30% of the area) as the class of a pixel or use higher impervious percentage thresholds (50% for GAIA18, for instance), and that a pixel remains urban over a much larger range of values (from 30% to 100%)43. Another anomaly in the time series of urban percentage is seen over Europe for the European Space Agency Climate Change Initiative (ESA CCI) product, with rapid expansion between 2000 and 2006 (Fig. 4d). This issue with the ESA CCI dataset has also been noted in other studies44,45 and is related to a change in input dataset from the Medium Resolution Imaging Spectrometer (MERIS) baseline27 to one with a resolution coarser than 300 m before 2003.
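The thresholding argument above can be illustrated with a toy calculation. This is a sketch under stated assumptions: the impervious fractions are invented, the function names are hypothetical, and real products differ in many more ways than a single threshold. It shows only that a low threshold labels pixels urban early, so later densification within those pixels does not register as growth.

```python
def urban_percent(impervious_fractions, threshold):
    """Percentage of coarse pixels whose impervious fraction meets the threshold."""
    fractions = list(impervious_fractions)
    if not fractions:
        raise ValueError("need at least one pixel")
    urban_count = sum(1 for fraction in fractions if fraction >= threshold)
    return 100.0 * urban_count / len(fractions)


def percent_growth(before, after):
    """Relative growth in urban percentage between two dates."""
    if before == 0:
        raise ValueError("baseline urban percentage must be non-zero")
    return 100.0 * (after - before) / before


# Invented impervious fractions for the same eight coarse pixels at two dates.
fractions_early = [0.05, 0.20, 0.35, 0.40, 0.60, 0.90, 0.10, 0.45]
fractions_late = [0.10, 0.28, 0.55, 0.65, 0.80, 0.95, 0.15, 0.70]

# 0.30 mimics a "minimum of 30% impervious" rule; 0.50 mimics a stricter rule.
for threshold in (0.30, 0.50):
    early = urban_percent(fractions_early, threshold)
    late = urban_percent(fractions_late, threshold)
    growth = percent_growth(early, late)
    print(f"threshold={threshold:.2f}: {early:.1f}% -> {late:.1f}% "
          f"(growth {growth:.1f}%)")
```

In this toy case the 30% rule reports no growth, because every pixel that densifies was already counted as urban, while the 50% rule reports a 150% increase from the same underlying change.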
Although it might seem reasonable to expect that the most recent, circa 2020, products, developed using finer resolution satellite imagery and benefiting from the methodological advancements of the past decades, would be able to constrain global to regional urban land the most, this is not the case. In fact, when considering the overall time series of urban percentage from global to country scales (Figs. 4, 5), the largest deviations are seen for the most recent years, after the 10 m resolution products are included. As described in the previous sections, these high-resolution products represent different aspects and features of urban land due to both methodology and prescribed typology. Since the resolution of Sentinel-2 can actually resolve those features, this probably explains the large discrepancies between these most recent datasets. Other key specifics of the differences in urban typologies in these global data products are provided in Table S1.
Implications for observational and modeling applications
Global estimates of urban land have become critical for both science and applications. However, most use cases of these datasets do not simultaneously consider multiple estimates due to a combination of legacy, convenience, and potential redundancies. Here, we examine how the choice of dataset may lead to biases for some common use cases. These use cases are divided into: (1) direct incorporation of these products to generate derived datasets, (2) combining global urban datasets with estimates of hazards to quantify urban-specific environmental hazards and exposure, (3) using these products as surface constraints in process-based models, and (4) estimating future urban land calibrated against historical datasets.
(1) Generating derived datasets: First, global urban land cover datasets are used as inputs for other derived products. For instance, the most commonly used MODIS land surface temperature (LST) products for urban climate studies (MOD11 and MYD1146) use a classification-based emissivity method47 with the pixel emissivity taken from a look-up table and the class of the pixel according to the MODIS Land Cover product. This needs to be done because LST and surface emissivity cannot be analytically separated using only thermal observations48. Similarly, the MODIS evapotranspiration products mask out any pixels that are classified as urban in the MODIS Land Cover data since the empirical model used is not calibrated for urban surfaces49. There are also some inter-dependencies between different global urban land cover datasets. The CGLS product uses WSF 2015 as the training data for its “Urban / Built up” class (Table S1)37, while “Urban Areas”, as classified within the ESA CCI product, are identified based on the GUF dataset38 as well as the Global Human Settlement Layer (GHSL)50 datasets (Fig. S3). Other composite urban datasets, such as global annual urban dynamics (GAUD) dataset34, have also been generated by combining various existing estimates (GUF, GAIA, GHSL, etc.).
(2) Quantifying urban-specific environmental hazards and exposure: Second, various urban land cover datasets are used as inputs for examining urban climate impacts and city-level environmental hazards and exposure. The choice of dataset influences the magnitude of these estimates. For the surface urban heat island (SUHI) intensity, the impact of urbanization on local surface warming11 (Fig. 6a; see “Methods”), larger values of absolute coefficient of variation are seen for urban clusters in the Middle East, parts of India, southern and eastern Africa, and the Southwestern United States. Although most datasets capture the well-established impacts of background climate on SUHI and its seasonality51, the choice of dataset can have larger impacts in arid regions during summer and for polar climate in winter, i.e., when the actual SUHI signal is small, with inconsistencies seen for even the sign of the SUHI (Fig. S4). Long-term changes in urban land are often combined with ancillary datasets to examine land use/land cover transitions35 and exposure to environmental hazards over time8,9,11,33. The rates of change over time would depend on the choice of dataset (Figs. 4, 5), whereas most studies typically use a single product. For instance, ESA CCI Land Cover, GHSL, MODIS Land Cover, and WSE have been individually used in these types of studies8,9,11,33. Andreadis et al.33 and Rentschler et al.8 both examined increased urbanization in flood-prone areas using two different urban land cover datasets (GAUD and WSE, respectively), and therefore found different magnitudes of these changes. We replicate a comparison of urban growth in flood plains52 between 1985 and 2015 based on GAIA, WSE, and GISA here (Fig. 6b), with particularly large differences seen for Asia and Oceania. These larger differences reflect the stronger urban growth in the GAIA dataset (compared to the other two) in these continents (Fig. 4c, f), with urban area increasing by 1395.6% and 998.5% between 1985 and 2015 in this product for Oceania and Asia, respectively, compared to the 244.1% and 332.2% increases seen in the GISA dataset. The differences in urban growth between products for the other continents are not as large (303.4% and 476.2%, 161.2% and 303.9%, and 173.5% and 229.4% for GISA and GAIA for Africa, Europe, and North America, respectively). Sometimes the choice of dataset can lead to artifacts due to a mismatch between two products. For instance, Mentaschi et al.9 combined the GHSL 2018 dataset with the MYD11 LST product to estimate intra-urban SUHI extremes. However, as noted earlier, this LST product is constrained by the MODIS Land Cover through the classification-based emissivity method47. As such, we would expect artifacts in LST for a proportion of pixels where the GHSL data consider a pixel as urban while the emissivity in the LST product is defined for a rural surface (and vice versa). Similar artifacts would be expected for other combinations of MODIS LST with non-MODIS urban land cover estimates53,54 or when MODIS LST is used to validate simulations from models that use different urban emissivity constraints48,51,55.
Fig. 6: a Shows the absolute coefficient of variation in calculated surface urban heat island intensity during 2018–2022 summer for around 10,000 global urban clusters from eight urban land cover datasets. b Shows estimated change in urban land in flood plains for the world and all continents between 1985 and 2015 from three long-term urban estimates.
(3) Constraining model simulations: Third, urban land cover products are incorporated into process-based models, including weather and climate models, as surface input datasets2,14,27,56. Since different land cover types in land models use distinct prescribed radiative, thermodynamic, and morphological properties, the land cover data used strongly modulates crucial variables like the components of the surface energy budget3,51,55 and thus the lower boundary conditions for the atmosphere in coupled model simulations. One of the most common mesoscale models used for urban climate research, the Weather Research and Forecasting (WRF) model12,57,58, uses the MODIS urban land cover as the default surface dataset. Newer versions of the urban components of this model can also use the local climate zone (LCZ) classification system59, with a recent global 100 m dataset planned to become the default urban representation for future releases of WRF15. Earth system models (ESMs) rarely resolve urban areas, but one of the few such ESMs with an urban representation, namely CESM60,61,62, uses a circa 2001 estimate of urbanization14. This urban dataset is also used in other ESMs that have branched off from CESM63,64 and has also been incorporated into regional models65. Large differences between these three products (MODIS Land Cover, Demuzere et al.15, and Jackson et al.14) as well as other present-day estimates of urban land are evident (Figs. 7a, S5). Note that except for the MODIS Land Cover data, the other two are not pure estimates of physical urbanization. Jackson et al.14 actually uses population-based thresholds of urban density, while several of the LCZ classes represent different mixes between built and natural surfaces. For example, LCZ9, the sparsely built class, characterized by a high abundance of natural land cover, behaves thermally like a natural land cover and is thus often excluded as a built-up class66.
As such, there are potential mismatches here that should be kept in mind. Since CESM currently does not use the low-density urban class within the urban model67, it is implicitly assumed that anything up to medium-density, in terms of population, would be an appropriate representation of physical urbanization at the grid-scale. However, we find massive overestimations of urban land in CESM for regions of the world with high population density and low physical urbanization, such as Asia and Africa, with mean percentage errors of 179.9% and 136.2% (compared to GISA), respectively, for those continents (Fig. 7b). In contrast, this dataset largely underestimates urban land for North America and Oceania compared to GISA (mean percentage errors of −58.3% and −32.7%, respectively; Fig. 7b). Consequently, since urban surfaces in the CESM Land Model (CLM) use specific prescribed thermodynamic, radiative, and morphological properties, if CLM has more urban land than the ‘reality’ (globally and for specific continents), urbanization would have a stronger bulk impact on grid-averaged surface climate variables in those regions, all else remaining equal. Exaggerated urban climate signals, such as urban heat and dry islands68, should also be seen at the local scale, such as over the urban core, if the Jackson et al.14 dataset is used in regional coupled simulations65 that can resolve intra-urban advection. Similarly, since a pixel classified as urban in the MODIS Land Cover can be as low as 30% impervious, a model (like WRF) using MODIS as the land cover constraint would overestimate urban impacts on climate2,55 for urban-to-rural transition zones, especially since urban models traditionally have not accounted for urban vegetation56, which generally increases at the edges of cities.
Fig. 7: a Shows global urban percentage across eight urban land cover datasets as well as two additional estimates of urban areas used in weather and climate models over Europe. LCZ 1 to 10 refer to the ten built-up local climate zones. b Shows linear regressions between grid-wise urban percentage in the GISA dataset for the year 2001 versus the total urban percentage from medium density, high density, and tall building districts classes of the Jackson et al. (2010)14 dataset for the world and each continent. The line of best fit, coefficient of determination (r²), mean bias error (MBE), mean percentage error (MPE), and sample size (n) are provided for each case.
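The grid-wise comparison statistics named in the caption can be sketched as follows. This is a hedged illustration rather than the study's procedure: the function name and return keys are hypothetical, the reference dataset is assumed to be the regression predictor, errors are taken as candidate minus reference (consistent with positive values denoting overestimation in the text), and cells with a zero reference value are skipped so that the percentage error is defined.

```python
from statistics import mean


def compare_to_reference(reference, candidate):
    """Summarize how a candidate urban-percentage grid deviates from a reference.

    Both inputs are equal-length sequences of grid-cell urban percentages.
    Cells with a zero reference value are skipped (an assumption of this
    sketch) so that the percentage error is defined.
    """
    pairs = [(r, c) for r, c in zip(reference, candidate) if r > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two cells with non-zero reference values")
    ref = [r for r, _ in pairs]
    cand = [c for _, c in pairs]
    ref_mean, cand_mean = mean(ref), mean(cand)

    # Mean bias error and mean percentage error (candidate minus reference).
    mbe = mean([c - r for r, c in pairs])
    mpe = mean([100.0 * (c - r) / r for r, c in pairs])

    # Ordinary least squares fit of candidate on reference.
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    syy = sum((c - cand_mean) ** 2 for c in cand)
    sxy = sum((r - ref_mean) * (c - cand_mean) for r, c in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must vary to fit a regression line")
    slope = sxy / sxx
    intercept = cand_mean - slope * ref_mean
    r_squared = sxy ** 2 / (sxx * syy)

    return {"n": len(pairs), "mbe": mbe, "mpe": mpe,
            "slope": slope, "intercept": intercept, "r2": r_squared}


# Invented grid-cell urban percentages: the candidate doubles the reference.
stats = compare_to_reference([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])
print(stats)
```

Note that a high r² can coexist with a large MPE, as in this toy case where the candidate tracks the reference perfectly but is uniformly twice as large, so the statistics answer different questions about agreement.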
(4) Future urban projections: Fourth, the differences in present-day and historical estimates of urban land also influence future urban projections. Several products have recently been developed to represent future urbanization scenarios69,70,71,72 that can be used directly or incorporated into weather and climate models12,57,73. This use case also represents a special subset of the first point on derived datasets. We see large differences between these future estimates (Fig. 8a), which would depend on methodology (different growth models), input data (choice of historical urbanization estimate for model calibration and future population projection constraints), and assumed scenario of urbanization. For instance, Gao & O’Neill (2020)70 consider distinct urbanization patterns across 375 sub-regions, while Chen et al.69 use only 32 regions. Although both datasets are trained using GHSL, the Chen et al.69 data are further calibrated against the ESA CCI estimate for 2015. The Li et al.71 and He et al.72 datasets are trained using annual nighttime light observations and the GAIA data, respectively. When we separate the sets of urban projections by continent, it becomes clear that the differences between them are not merely in terms of magnitude (Fig. 8). For instance, for the Shared Socioeconomic Pathway (SSP) corresponding to sustainable development (SSP1), the Li et al.71 dataset shows the most urban land at the end of the century for Africa, Asia, Europe, and South America. However, for the same scenario, the He et al.72 dataset shows the highest end-of-century urban percentage for North America, while the Chen et al.69 dataset shows the highest urban land among datasets for Oceania. Another discrepancy between datasets is the gradual acceleration of urbanization in Gao & O’Neill (2020)70 past mid-century versus a strong plateauing of urban growth in the He et al.72 dataset.
This discrepancy is related to the choice of future population projection used to develop the two datasets. These differences and inconsistencies between projections are seen for almost all the SSP scenarios, which reinforces the importance of being cognizant of dataset choice when reporting scientific results, quantifying impacts, and informing policy.
Discussion
In light of the development of multiple new global urban datasets at finer resolutions in the last decade, our goal here was to examine whether these state-of-the-art products provide better constraints on our understanding of urbanization across scales. The datasets considered here include thirteen historical urban land cover products (the twelve in Fig. 4 plus GHSL; see “Methods”), two datasets specifically used in process-based urban models, and four future projections of urban land. Unlike past studies, primarily in the remote sensing literature, that focus on accuracy assessments22,24,25 or compare newly developed data products against other available datasets18,19,32,71, our main goal was to explore what these disagreements mean for applications of these datasets in modeling and observational studies. As such, this study is broader in scope compared to those efforts and relevant to researchers and practitioners interested in the urban environment beyond remote sensing experts. We find large disagreements between the global urban data products across spatiotemporal scales. In fact, the largest divergences between datasets are seen for the most recent years from global to continental to country scales (Figs. 4, 5) on inclusion of the new 10 m resolution products. At this resolution, it is possible to partially resolve urban vegetation, settlements, and roads, making the different urban definitions produce larger variations, with the Esri Land Cover and WSF 2019 datasets showing the highest and lowest urban land percentage, respectively (Fig. 1b). This variability in urban estimates across scales underscores the challenge in achieving a standardized measure of urban land even with globally available satellite observations.
We discuss the implications of these observed differences between the products for several use cases (Figs. 6, 7, 8). The results demonstrate that it is important to be cognizant of the specifics of the datasets and any potential dependencies to ensure application-appropriate analyses. For instance, given the capability to separate roads and building roofs with current high-resolution satellite observations, particularly using Sentinel-2 and commercial satellites, we should rethink whether it makes physical sense to still include rooftops in estimates of many urban environmental hazards, such as from floods and heat, given where urban residents are more likely to be exposed. Approaches to doing this may involve combining datasets with and without roads, ideally with otherwise similar urban typologies, or adjusting datasets using masks for building rooftops. However, note that these approaches, and the data sources chosen, may have similar uncertainties, which one should be conscious of when exploring application workflows74. Also note that there are other limitations to physiologically-relevant urban heat hazard estimates using satellites that are beyond the scope of this study68,75. Similarly, as the newer urban land cover datasets are incorporated into process-based models, it is important to be aware of consistency in definitions between urban representations in models and the classification typologies used in data products. Using Dynamic World, which includes some vegetation in the urban class, is not appropriate if the model treats the entire urban grid as an impervious surface, such as in CESM, its offshoots63,64, and most versions of WRF. Similarly, since WSF 2019 removes urban roads, incorporating this dataset into process-based models with a typical urban canyon structure, which includes both buildings and roads, is not recommended, as doing so will capture only a fraction of the physical impact of urbanization on weather and climate.
Structural differences between models are commonplace in the Earth sciences, which has encouraged the use of ensemble estimates to lend robustness to projections and for uncertainty quantification. Surface datasets, such as those for urban land, are an additional free parameter that can be largely decoupled from implementations of model physics. Uncertainty estimates due to differences in land use projections are much rarer76, and do not currently resolve urbanization at finer scales77. Based on the differences seen here (Fig. 8), we recommend using multiple datasets, when possible and when the definitions of “urban” align with the assumptions of the use case, both to provide more robust estimates of uncertainties for urban-resolving climate projections and to better quantify hazards for rapidly urbanizing populations. Moreover, for local- to regional-scale science and applications, it is generally preferable to use maps developed for those areas instead of global datasets, since the former are better calibrated to capture the unique urban development patterns of those areas. In summary, the appropriate preprocessing methods for urban land products vary between datasets and for different applications and scales. These should be determined case by case with guidance from relevant domain experts.
Various urban datasets have been and are being used for informing policy and decisions across scales. Urban planners at the municipal and national level rely on maps of urban land and its evolution over time to design policies that ensure sustainable urban development, relevant to Sustainable Development Goal 11 (SDG-11) charted by the United Nations Member States78. Although urban decision makers may have some detailed local maps at the city level, these are generally static, leading to reliance on satellite-derived products to explore patterns of urban compaction and sprawl over time and set policies consistent with, for example, the 15-Minute City, 3-30-300, and Compact City paradigms79,80,81. Consequently, many researchers have attempted to generate datasets to aid this sort of decision making, frequently constrained by estimates of urban land similar to those examined here82,83. Urban datasets, as incorporated into process-based models, are also used to quantify future urban climate impacts, such as on extreme heat and precipitation events, to inform government agencies84. At the national level, authorities will soon have to report on changes in urban extent and the ecosystem services this affects under the newly adopted statistical standard for the System of Environmental-Economic Accounting Ecosystem Accounting85. This policy framework aligns with the International Union for Conservation of Nature ecosystem typology86, which includes a mosaic definition of urban areas, thereby aligning more with products like Dynamic World and others that adopt a larger minimum mapping unit. However, regardless of the urban dataset chosen, ecosystem accounting practitioners are encouraged to adopt design-based methods for estimating areas from maps derived with remote sensing87.
Finally, explicit future projections of urbanization have started to be incorporated into Integrated Assessment Models, which are often used to set national-scale energy policies and emission targets88. Given the variations in the temporal and spatial patterns of urban land across datasets seen here, we suggest that decision makers consider a variety of datasets to inform policies from urban to national scales. Similarly, data producers should try to be more up front about the assumptions underlying their datasets, particularly when the intent is to assist policymaking. In a world of cheap compute and increasingly available high-resolution satellite imagery, we expect such global data products to continue to be published. Moreover, these datasets will continue to be used for large-scale studies to answer various research questions43,89,90, some quite outside the range of uses that have been illustrated in the present study. Given this rapid pace of dataset generation and usage, a concentrated and sustained effort is needed from the urban scientific community to assess fit-for-purpose datasets for distinct science and applications before they are adopted to quantify the costs and benefits of environmental solutions and support broader policies.
Methods
Datasets
We consider multiple global urban land cover datasets that have been developed over the last couple of decades, both to examine differences between them across spatiotemporal scales and to discuss the impacts of these differences on a few use cases. Our focus here is primarily on datasets that have 100 m or finer resolutions, with the majority being derived from Landsat or Sentinel-2 satellite observations. We also consider the MODIS Land Cover and ESA CCI Land Cover datasets, which are at 500 m and 300 m, respectively. The former is included because it is one of the few physical estimates of global urban land that has been continuously updated since Potere et al.24, and the latter because it is one of the few land cover products based on MERIS and has been used for multiple applications27. We do not consider any regional land cover datasets or land cover datasets released after 2021. This is why we did not focus on the latest version (P2023A) of the GHSL in our main analysis. However, since this product has been used for various applications, we make an exception and provide results for 2018 “built spaces” (at 10 m resolution) and the 5-year “degree of urbanization” estimates from 1985 to 2020 (at 1000 m resolution) in the supplementary information (Fig. S3). The former is used for a brief discussion about potential mismatches for urban applications (see Results section). For reference, the global urban (“built spaces”) percentage in the 10 m GHSL P2023A is lower (0.49%) than the corresponding MODIS estimate (0.59%). Also note that the GHSL 10 m product includes road surfaces (global percentage of 0.02%) under a subset of “open spaces”, while “built spaces” includes different types and heights of buildings. Among the datasets considered with varying time series, we choose those with data for 2019 and/or 2020 as present-day estimates.
This choice maximizes the number of datasets that can be used for this comparison, since several (eight) global 10 m land cover datasets were released in 2020 and some datasets end in 2019. Combining the two years should have minimal impact on differences between products since a single year would not lead to major urban changes and because 2020 was also the year of multiple COVID-19 lockdowns that significantly slowed infrastructure development projects. The earliest year considered for multi-year datasets is 1985. It should be noted that new global land cover datasets are being developed at a rapid rate. We did not consider some datasets since they had not been publicly released while this work was being done91,92 and some because they are essentially combinations of other datasets over similar time periods34. Overall, we aimed for our selection of datasets to represent the primary modes of variability in resolution, methodology, urban definitions, and time spans. Table S1 provides an overview of all these datasets, including the urban definition used and other notes relevant to this study.
In addition to the satellite-derived estimates of urban land, we consider two global datasets that are used in regional and global urban modeling. First is the recent 100 m global urban Local Climate Zone (LCZ) estimates by Demuzere et al.15, which will be the default urban representation for future releases of the WRF model, one of the most commonly used mesoscale models for urban climate studies12,57,58. Second is the 1 km estimate of urban densities from Jackson et al.14, used in global models such as the Community Earth System Model (CESM)67, the Energy Exascale Earth System Model (E3SM)63, and the Climate Change coupled climate model (CMCC-CM2)64, as well as regional models like RegCM (Regional Climate Model)65. The former is valid for the year 2018 while the latter is for 2001. While the Demuzere et al.15 dataset maps 17 LCZs, we only consider the 10 LCZs that are directly relevant to the built environment for our analysis (Figs. 6a, S2b, S5).
Finally, we consider four recent projections of future urbanization under various SSPs, which are socioeconomic equivalents to future emission scenarios93. The resolutions of these datasets range from 1/8th degree (with fractional urban land) in Gao & O’Neill (2020)70 to 1 km in Chen et al. (2020)69, Li et al. (2021)71, and He et al. (2023)72. These resolutions would be considered coarse in the current remote sensing literature and fine in the climate modeling domain. While He et al. (2023)72 includes more scenarios than the other datasets, only the common five (SSP1, SSP2, SSP3, SSP4, and SSP5) are used for the comparison (Fig. 8). SSP1 represents the “sustainability” scenario, SSP2 corresponds to the “middle-of-the-road” scenario, SSP3 is the “regional rivalry” scenario, SSP4 is the “inequality” scenario, and SSP5 denotes the “high-emission” scenario93.
Regions of interest
We consider four sets of regions of interest in this study and calculate total urban land for each. First, we consider all countries as recognized by the World Bank (Fig. 1). Disputed territories, which cover a negligible portion of the global land surface, are not considered. Second, we consider four regions of interest, namely the Great Lakes region, the Mid Atlantic region, the Indo-Gangetic Basin, and the Yangtze River Basin (Fig. 2a), to illustrate the variability between datasets at the regional scale. Third, we consider the Köppen-Geiger climate zones94 to examine the variability of the SUHI intensity for different background climates (see more below). Finally, we divide the global land surface into 0.9° latitude × 1.25° longitude grids to estimate the grid-level urban percentage and the disagreement between datasets for each grid. This is a common resolution used to run CESM61.
Surface Urban Heat Island estimates
We illustrate the role of the choice of urban land cover dataset on a well-known urban climate signal: the SUHI effect. We calculate the SUHI for over 10,000 urban clusters using the Simplified Urban Extent algorithm, which has been used in the urban climate literature in the past to examine the SUHI across scales48,54. Of note, the algorithm separately calculates the urban and rural LST for each cluster, their difference being the SUHI intensity. The urban LST is the average LST of all urban pixels (for each of the eight present-day estimates of urban land) within a cluster, while the average LST of the non-urban pixels is the rural LST. The LST data used here are from the Landsat collection 2 science product95 for 2018–2022, which covers the time of the 2019/2020 global urban land cover datasets used. Water pixels are masked out for both urban and rural cases, based on the Global Surface Water product96, before generating the urban and rural LST. Due to the 16-day return period of Landsat 8, multiple years of data are needed to get sufficient clear-sky observations. Separate analyses are done for summer (June, July, and August for clusters whose centroids are in the northern hemisphere and December, January, and February for clusters in the southern hemisphere) and winter (vice versa) after quality controlling all Landsat images using pixel-level quality control flags to minimize contamination from clouds and cloud shadows (Figs. 6a, S4).
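The per-cluster SUHI calculation described above can be illustrated with a minimal sketch. This is not the code used in the study, which runs on Google Earth Engine; the function names, the flat list representation of pixels, the use of `None` for quality-masked LST values, and the treatment of a centroid at exactly 0° latitude as northern are all assumptions made for illustration.

```python
from statistics import mean

def suhi_intensity(lst, is_urban, is_water):
    """Return (urban LST, rural LST, SUHI) for one cluster.

    lst      : per-pixel land surface temperatures (None = masked by quality flags)
    is_urban : per-pixel booleans from one urban land cover dataset
    is_water : per-pixel booleans from a surface water mask
    """
    urban_vals, rural_vals = [], []
    for temp, urban, water in zip(lst, is_urban, is_water):
        if temp is None or water:
            continue  # drop cloudy/invalid pixels and water pixels
        (urban_vals if urban else rural_vals).append(temp)
    if not urban_vals or not rural_vals:
        return None  # SUHI is undefined without both references
    urban_lst, rural_lst = mean(urban_vals), mean(rural_vals)
    return urban_lst, rural_lst, urban_lst - rural_lst

def summer_months(centroid_lat):
    """Summer months for a cluster, chosen by the hemisphere of its centroid."""
    return (6, 7, 8) if centroid_lat >= 0 else (12, 1, 2)

# Hypothetical five-pixel cluster: two urban, two rural, one water pixel.
result = suhi_intensity(
    [305.0, 303.0, 299.0, 297.0, 290.0],
    [True, True, False, False, False],
    [False, False, False, False, True],
)
print(result)  # (304.0, 298.0, 6.0)
```

Repeating the call with the urban mask from each present-day dataset, while holding the LST and water mask fixed, isolates the effect of the urban definition on the estimated SUHI.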
Urban growth in flood plains
We examine the impact of the choice between different estimates of long-term urbanization on urban exposure analysis following Andreadis et al. (2022)33 and Rentschler et al. (2023)8. For this, we consider the GAIA18, GISA39, and WSE20 datasets, which have the longest time series. For the first and last years of the common period (1985 and 2015, respectively), we calculate the total urban land globally and by continent that overlaps with the Global high-resolution floodplains dataset52. The percentage change between 1985 and 2015 is calculated for the world and each continent by dataset (Fig. 6b). The most urban growth is seen for GAIA and the lowest for GISA or WSE (depending on continent).
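As a minimal sketch of the arithmetic behind this comparison, the snippet below sums the area of pixels that are both urban and inside the floodplain mask and then computes the percentage change between two years. The function names, the tuple layout, and the example areas are hypothetical and are not taken from the study.

```python
def floodplain_urban_area(pixels):
    """Sum the area of pixels flagged as both urban and floodplain.

    pixels: iterable of (area_km2, is_urban, in_floodplain) tuples.
    """
    return sum(area for area, is_urban, in_floodplain in pixels
               if is_urban and in_floodplain)

def percent_change(area_start, area_end):
    """Percentage change in area between the first and last year."""
    if area_start == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return 100.0 * (area_end - area_start) / area_start

# Hypothetical pixels for one region in the first and last year.
pixels_1985 = [(25.0, True, True), (25.0, True, True), (25.0, True, False)]
pixels_2015 = pixels_1985 + [(75.0, True, True), (10.0, False, True)]

change = percent_change(floodplain_urban_area(pixels_1985),
                        floodplain_urban_area(pixels_2015))
print(change)  # 150.0
```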
Grid-wise comparison between satellite-derived and model-prescribed urban datasets
For each 0.9° latitude × 1.25° longitude grid on the Earth’s surface, we calculate the urban percentage for 2001 from GISA and from the sum of the medium density, high density, and tall buildings district classes of the Jackson et al. (2010)14 dataset. GISA is used since it shows the highest accuracy among the urban datasets for present day (Table S2; also discussed later) and the year 2001 is considered since it is the approximate validity of the Jackson et al. (2010)14 estimate. The low density class from Jackson et al. (2010)14 is not included since it is not considered in CESM simulations67. Only the grids with non-zero values from both datasets are considered. Jackson et al. (2010)14 detects urban land in far fewer grids (6858) than GISA (8587). Separate correlations are shown for all common global grids and by continent (Fig. 7b). The main accuracy metrics used are mean bias error and mean percentage error (MPE). As an example, for Asia, the mean urban percentage in Jackson et al. (2010)14 is 1.18% higher than the value detected by GISA based on all common grids with any urban land. In percentage terms (percentage of a percentage), Jackson et al. (2010)14 shows more than double the urban percentage of GISA for Asia (MPE = 136.2%).
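The two metrics named above can be sketched as follows. The exact formulas used in the study are not given in the text, so the definitions here are assumptions: mean bias error is taken as the mean of the per-grid differences (model-prescribed minus satellite-derived), and MPE as the mean of the per-grid percentage differences relative to the satellite-derived value. The function names and example values are hypothetical.

```python
from statistics import mean

def common_grids(model_pct, satellite_pct):
    """Keep only grids where both datasets report non-zero urban percentage."""
    return [(m, s) for m, s in zip(model_pct, satellite_pct) if m > 0 and s > 0]

def mean_bias_error(pairs):
    """Mean of (model - satellite), in urban percentage points."""
    return mean([m - s for m, s in pairs])

def mean_percentage_error(pairs):
    """Mean of the per-grid relative difference, in percent (assumed definition)."""
    return mean([100.0 * (m - s) / s for m, s in pairs])

# Hypothetical urban percentages for four grids.
model = [2.0, 0.0, 3.0, 1.0]      # e.g. a model-prescribed dataset
satellite = [1.0, 0.5, 2.0, 0.0]  # e.g. a satellite-derived dataset

pairs = common_grids(model, satellite)
print(mean_bias_error(pairs))        # 1.0
print(mean_percentage_error(pairs))  # 75.0
```

Under this assumed definition, an MPE above 100% means the model-prescribed urban percentage is on average more than double the satellite-derived one.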
These same grids are also used to estimate the degree of disagreement (through coefficient of variation) between the eight datasets representing present-day urban land. For this analysis, only those grids were chosen for which all eight datasets show non-zero urban area (Fig. 2b). This would underestimate the disagreements since many of the datasets, although technically global, detect no urban land for some grids while others do. The differences between the coverage of these datasets can also be explored through this web app: https://ee-tc25.projects.earthengine.app/view/urbancomparison.
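A minimal sketch of this grid-level disagreement measure is given below. It assumes the coefficient of variation is the population standard deviation divided by the mean (the text does not say whether a population or sample estimator was used), and it drops any grid where at least one dataset reports zero urban land, as described above. The names and the two-dataset example are hypothetical; the study uses eight datasets per grid.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return std/mean for one grid, or None if any dataset reports zero."""
    if any(v <= 0 for v in values):
        return None  # grid excluded: not all datasets detect urban land
    return pstdev(values) / mean(values)

def grid_disagreement(urban_pct_by_grid):
    """Map grid id -> coefficient of variation for grids kept in the analysis."""
    result = {}
    for grid_id, values in urban_pct_by_grid.items():
        cv = coefficient_of_variation(values)
        if cv is not None:
            result[grid_id] = cv
    return result

# Hypothetical urban percentages from two datasets for two grids.
grids = {"a": [1.0, 3.0], "b": [0.0, 2.0]}
disagreement = grid_disagreement(grids)
print(disagreement)  # {'a': 0.5}
```

Grid "b" is excluded even though the two datasets disagree strongly there, which illustrates why this filter tends to understate the overall disagreement.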
Data processing
All the datasets are processed on the Google Earth Engine cloud computing platform97. The total area of each region of interest (the denominator to estimate urban percentage) is the geometric area of each vector (corresponding to countries or regions). For summarizing the total urban area of these regions of interest, we calculate the sum of the area of “urban” pixels within each vector using the native resolution of the corresponding dataset as the scale of aggregation. The country-level regions of interest and another set of boundaries for the European and Asian parts of Russia are combined to summarize results by continent. For the SUHI estimation, a scale of 100 m is used for all cases, which is the native resolution of the Landsat 8 thermal band. Among the global urban land cover datasets considered, Dynamic World is unique in that a classification is done for every Sentinel-2 scene. Here, we only consider a pixel as urban if the mode of all the overlapping scenes for the year 2020 is urban. Comparisons of median, mode, and mean composites of these images show relatively small differences22.
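The mode-based compositing and the urban percentage calculation can be sketched in plain Python as below. This is only an illustration of the logic, not the Earth Engine workflow used in the study. The label string `"built"`, the function names, the example values, and the tie-breaking rule (the label seen first wins) are assumptions.

```python
from collections import Counter

URBAN_LABEL = "built"  # label used here for the urban class (illustrative)

def modal_label(scene_labels):
    """Most frequent label across overlapping scenes for one pixel.

    Ties are resolved in favour of the label encountered first.
    """
    return Counter(scene_labels).most_common(1)[0][0]

def urban_percentage(pixel_stacks, pixel_area_m2, region_area_m2):
    """Urban area as a percentage of the geometric area of the region.

    pixel_stacks: one list of per-scene labels for each pixel in the region.
    """
    urban_pixels = sum(1 for stack in pixel_stacks
                       if modal_label(stack) == URBAN_LABEL)
    return 100.0 * urban_pixels * pixel_area_m2 / region_area_m2

# Hypothetical region with four 10 m pixels observed in one to three scenes.
stacks = [
    ["built", "built", "trees"],
    ["trees", "trees", "built"],
    ["built"],
    ["water", "built", "built"],
]
pct = urban_percentage(stacks, pixel_area_m2=100.0, region_area_m2=1000.0)
print(pct)  # 30.0
```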
Validation
Our primary goal in this study was not to focus on comprehensive accuracy assessments of these datasets, for two main reasons. First, there have been multiple accuracy assessments of global land cover estimates across scales16,18,22,24,25. Second, given the differences in urban definitions in these datasets, standard accuracy estimates may not be particularly helpful. There has been discussion about the term “ground-truth” in the broader remote sensing community that is relevant here98. However, as a sanity check, we provide basic accuracy estimates of the eight datasets used for representing present-day (2019/2020) urban land using the validation dataset created by the Dynamic World team32. The development of the training data employed 70 annotators, who manually labeled land use and land cover types in high-resolution images from Sentinel-2 for random dates in 2019. The annotation was done following the classification typology of the Dynamic World dataset. We chose this dataset since it is the largest available validation dataset that is relevant at the 10 m scale. Our accuracy estimates for the world and all the continents are summarized in Table S2 and show the percentage of the urban pixels in the reference that are correctly identified as urban in the dataset (producer’s accuracy). Based on this assessment, the GISA and WSF 2019 products perform the best, followed by Esri Land Cover and Dynamic World. Overall, MODIS Land Cover performs the worst. However, beyond this sanity check, we should be cautious about the implications of these accuracy estimates. As an example, note that the urban definition in the Dynamic World dataset includes a mixture of residential buildings, streets, lawns, trees, and isolated residential structures or buildings surrounded by vegetative land cover (Table S1). Therefore, it is a mosaicked land cover definition, partly because it uses a minimum mapping unit of 50 m × 50 m.
Other products like World Cover with 10 m × 10 m or GISA with 30 m × 30 m minimum mapping units may not be directly comparable (Table S1). Another example of a typological difference is that WSF 2019 does not even include roads. In short, because of the mismatch between the land cover typology in the global datasets and the typology considered when creating ground truth data, it is difficult to provide an unbiased conclusion about the relative accuracies of these datasets.
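The accuracy measure described in this section, the percentage of reference urban pixels that a dataset also labels as urban, is commonly called producer's accuracy (or recall). A minimal sketch follows, with hypothetical function names and example labels. It also illustrates why this measure alone cannot rank datasets fairly: it ignores pixels that a dataset labels urban but the reference does not.

```python
def producers_accuracy(reference_urban, mapped_urban):
    """Percentage of reference urban pixels also mapped as urban."""
    ref_total = sum(1 for r in reference_urban if r)
    if ref_total == 0:
        raise ValueError("no urban pixels in the reference data")
    correct = sum(1 for r, m in zip(reference_urban, mapped_urban) if r and m)
    return 100.0 * correct / ref_total

# Hypothetical labels for six validation pixels (True = urban).
reference = [True, True, True, True, False, False]
dataset = [True, True, True, False, True, False]

accuracy = producers_accuracy(reference, dataset)
print(accuracy)  # 75.0
```

In this example the dataset also flags one non-urban reference pixel as urban, which producer's accuracy does not penalize. A dataset with a broader, mosaicked urban definition can therefore score well on this measure against a reference built with the same typology.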
Data availability
All data presented in this paper are archived here: https://doi.org/10.6084/m9.figshare.25225535.
Code availability
The codes used for processing the global datasets are archived here: https://doi.org/10.6084/m9.figshare.25225535.
References
Elmqvist, T. et al. Urbanization in and for the Anthropocene. Npj Urban Sustain. 1, 6 (2021).
Qian, Y. et al. Urbanization impact on regional climate and extreme weather: current understanding, uncertainties, and future research directions. Adv. Atmos. Sci. https://doi.org/10.1007/s00376-021-1371-9 (2022).
Oke, T. R., Mills, G., Christen, A. & Voogt, J. A. Urban Climates (Cambridge University Press, 2017).
UNDESA, P. World urbanization prospects: the 2018 revision. (UN, 2018).
Ritchie, H., Samborska, V. & Roser, M. “Urbanization” Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/urbanization (2024).
Zhu, Z. et al. Understanding an urbanizing planet: strategic directions for remote sensing. Remote Sens. Environ. 228, 164–182 (2019).
Tuholske, C. et al. Global urban population exposure to extreme heat. Proc. Natl. Acad. Sci. 118, e2024792118 (2021).
Rentschler, J. et al. Global evidence of rapid urban growth in flood zones since 1985. Nature 622, 87–92 (2023).
Mentaschi, L. et al. Global long-term mapping of surface temperature shows intensified intra-city urban heat island extremes. Glob. Environ. Chang. 72, 102441 (2022).
Iungman, T. et al. Cooling cities through urban green infrastructure: a health impact assessment of European cities. Lancet 401, 577–589 (2023).
Liu, Z. et al. Surface warming in global cities is substantially more rapid than in rural background areas. Commun. Earth Environ. 3, 1–9 (2022).
Gao, J. & Bukovsky, M. S. Urban land patterns can moderate population exposures to climate extremes over the 21st century. Nat. Commun. 14, 6536 (2023).
Ching, J. et al. WUDAPT: an urban weather, climate, and environmental modeling infrastructure for the anthropocene. Bull. Am. Meteorol. Soc. 99, 1907–1924 (2018).
Jackson, T. L., Feddema, J. J., Oleson, K. W., Bonan, G. B. & Bauer, J. T. Parameterization of urban characteristics for global climate modeling. Ann. Assoc. Am. Geogr. 100, 848–865 (2010).
Demuzere, M. et al. A global map of Local Climate Zones to support earth system modelling and urban scale environmental science. Earth Syst. Sci. Data Discuss. 14, 1–57 (2022).
Ren, H. et al. Mapping high-resolution global impervious surface area: status and trends. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 15, 7288–7307 (2022).
Chen, J. et al. Global land cover mapping at 30 m resolution: a POK-based operational approach. ISPRS J. Photogramm. Remote Sens. 103, 7–27 (2015).
Gong, P. et al. Annual maps of global artificial impervious area (GAIA) between 1985 and 2018. Remote Sens. Environ. 236, 111510 (2020).
Huang, X. et al. Toward accurate mapping of 30-m time-series global impervious surface area (GISA). Int. J. Appl. Earth Obs. Geoinf. 109, 102787 (2022).
Marconcini, M., Metz-Marconcini, A., Esch, T. & Gorelick, N. Understanding current trends in global urbanisation-the world settlement footprint suite. GI_Forum 9, 33–38 (2021).
Chen, B. et al. Stable classification with limited sample: transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017. Sci. Bull. 64, 3 (2019).
Venter, Z. S., Barton, D. N., Chakraborty, T., Simensen, T. & Singh, G. Global 10 m land use land cover datasets: a comparison of dynamic world, world cover and esri land cover. Remote Sens. 14, 4101 (2022).
Potere, D. & Schneider, A. A critical look at representations of urban areas in global maps. GeoJournal 69, 55–80 (2007).
Potere, D., Schneider, A., Angel, S. & Civco, D. L. Mapping urban areas on a global scale: which of the eight maps now available is more accurate? Int. J. Remote Sens. 30, 6531–6558 (2009).
Zheng, K., He, G., Yin, R., Wang, G. & Long, T. A comparison of seven medium resolution impervious surface products on the Qinghai–Tibet Plateau, China from a user’s perspective. Remote Sens. 15, 2366 (2023).
Mu, H. et al. Identifying discrepant regions in urban mapping from historical and projected global urban extents. Earth 34, 167–178 (2022).
Bontemps, S. et al. Consistent global land cover maps for climate modelling communities: current achievements of the ESA’s land cover CCI. In Proc. ESA Living Planet Symposium, (European Space Agency, Edinburgh 2013).
Liu, Z., He, C., Zhou, Y. & Wu, J. How much of the world’s land has been urbanized, really? a hierarchical framework for avoiding confusion. Landsc. Ecol. 29, 763–771 (2014).
Woodcock, C. E. & Strahler, A. H. The factor of scale in remote sensing. Remote Sens. Environ. 21, 311–332 (1987).
Karra, K. et al. Global land use/land cover with Sentinel 2 and deep learning. In 2021 IEEE international geoscience and remote sensing symposium IGARSS 4704–4707 (IEEE, 2021).
Zanaga, D. et al. ESA WorldCover 10 m 2020 V100, Zenodo. (2021).
Brown, C. F. et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 9, 251 (2022).
Andreadis, K. M. et al. Urbanizing the floodplain: global changes of imperviousness in flood-prone areas. Environ. Res. Lett. 17, 104024 (2022).
Liu, X. et al. High-spatiotemporal-resolution mapping of global urban change from 1985 to 2015. Nat. Sustain. 3, 564–570 (2020).
van Vliet, J. Direct and indirect loss of natural area from urban expansion. Nat. Sustain. 2, 755–763 (2019).
Wulder, M. A. et al. Fifty years of Landsat science and impacts. Remote Sens. Environ. 280, 113195 (2022).
Buchhorn, M. et al. Copernicus global land cover layers—collection 2. Remote Sens. 12, 1044 (2020).
Esch, T. Breaking new ground in mapping human settlements from space – The Global Urban Footprint. ISPRS J. Photogramm. Remote Sens. 134, 30–42 (2017).
Huang, X. et al. 30 m global impervious surface area dynamics and urban expansion pattern observed by Landsat satellites: from 1972 to 2019. Sci. China Earth Sci. 64, 1922–1933 (2021).
Marconcini, M. Outlining where humans live, the World Settlement Footprint 2015. Sci. Data 7, 242 (2020).
Sulla-Menashe, D. & Friedl, M. A. User guide to collection 6 MODIS land cover (MCD12Q1 and MCD12C1) product. 1–18 (USGS, 2018).
Huang, X., Huang, J., Wen, D. & Li, J. An updated MODIS global urban extent product (MGUP) from 2001 to 2018 based on an automated mapping approach. Int. J. Appl. Earth Obs. Geoinf. 95, 102255 (2021).
Chakraborty, T. C. & Qian, Y. Urbanization exacerbates continental-to regional-scale warming. One Earth 7, 1387–1401 (2024).
Reinhart, V. et al. Comparison of ESA climate change initiative land cover to CORINE land cover over Eastern Europe and the Baltic States from a regional climate modeling perspective. Int. J. Appl. Earth Obs. Geoinf. 94, 102221 (2021).
Hoffmann, P. et al. High-resolution land use and land cover dataset for regional climate modelling: historical and future changes in Europe. Earth Syst. Sci. Data Discuss. 2022, 1–50 (2022).
Wan, Z. MODIS land surface temperature products users’ guide. Inst. Comput. Earth Syst. Sci. 805 (2006).
Snyder, W. C., Wan, Z., Zhang, Y. & Feng, Y.-Z. Classification-based emissivity for land surface temperature measurement from space. Int. J. Remote Sens. 19, 2753–2774 (1998).
Chakraborty, T. C., Lee, X., Ermida, S. & Zhan, W. On the land emissivity assumption and Landsat-derived surface urban heat islands: a global analysis. Remote Sens. Environ. 265, 112682 (2021).
Mu, Q., Zhao, M. & Running, S. W. MODIS global terrestrial evapotranspiration (ET) product (NASA MOD16A2/A3). Algorithm Theor. Basis Doc. Collect. 5, 600 (2013).
European Commission. Joint Research Centre. GHSL Data Package 2023. (Publications Office, LU, 2023).
Zhao, L., Lee, X., Smith, R. B. & Oleson, K. Strong contributions of local background climate to urban heat islands. Nature 511, 216–219 (2014).
Nardi, F., Annis, A., Di Baldassarre, G., Vivoni, E. R. & Grimaldi, S. GFPLAIN250m, a global high-resolution dataset of Earth’s floodplains. Sci. Data 6, 1–6 (2019).
Venter, Z. S., Chakraborty, T. & Lee, X. Crowdsourced air temperatures contrast satellite measures of the urban heat island and its mechanisms. Sci. Adv. 7, eabb9569 (2021).
Hsu, A., Sheriff, G., Chakraborty, T. & Manya, D. Disproportionate exposure to urban heat island intensity across major US cities. Nat. Commun. 12, 2721 (2021).
Brousse, O. et al. The local climate impact of an African city during clear‐sky conditions—implications of the recent urbanization in Kampala (Uganda). Int. J. Climatol. 40, 4586–4608 (2020).
Masson, V. et al. City-descriptive input data for urban climate models: model requirements, data sources and challenges. Urban Clim. 31, 100536 (2020).
Krayenhoff, E. S., Moustaoui, M., Broadbent, A. M., Gupta, V. & Georgescu, M. Diurnal interaction between urban expansion, climate change and adaptation in US cities. Nat. Clim. Chang. 8, 1097–1103 (2018).
Krayenhoff, E. S. Cooling hot cities: a systematic and critical review of the numerical modelling literature. Environ. Res. Lett. 16, 053007 (2021).
Stewart, I. D. & Oke, T. R. Local climate zones for urban temperature studies. Bull. Am. Meteorol. Soc. 93, 1879–1900 (2012).
Zhang, K. et al. Increased heat risk in wet climate induced by urban humid heat. Nature 617, 738–742 (2023).
Zhao, L. Global multi-model projections of local urban climates. Nat. Clim. Change 11, 152–157 (2021).
Li, D. et al. Urban heat island: aerodynamics or imperviousness? Sci. Adv. 5, eaau4299 (2019).
Caldwell, P. M. The DOE E3SM coupled model version 1: description and results at high resolution. J. Adv. Model Earth Syst. 11, 4095–4146 (2019).
Cherchi, A. et al. Global mean climate and main patterns of variability in the CMCC‐CM2 coupled model. J. Adv. Model. Earth Syst. 11, 185–209 (2019).
Elguindi, N. et al. Regional climate model RegCM: reference manual version 4.5. (Abdus Salam ICTP, Trieste, 2014).
Demuzere, M. et al. Combining expert and crowd-sourced training data to map urban form and functions for the continental US. Sci. Data 7, 264 (2020).
Oleson, K. W. & Feddema, J. Parameterization and surface data improvements and new capabilities for the Community Land Model Urban (CLMU). J. Adv. Model. Earth Syst. 12, e2018MS001586 (2020).
Chakraborty, T., Venter, Z. S., Qian, Y. & Lee, X. Lower urban humidity moderates outdoor heat stress. AGU Adv. 3, e2022AV000729 (2022).
Chen, G. Global projections of future urban land expansion under shared socioeconomic pathways. Nat. Commun. 11, 537 (2020).
Gao, J. & O’Neill, B. C. Mapping global urban land for the 21st century with data-driven simulations and Shared Socioeconomic Pathways. Nat. Commun. 11, 1–12 (2020).
Li, X. et al. Global urban growth between 1870 and 2100 from integrated high resolution mapped data and urban dynamic modeling. Commun. Earth Environ. 2, 1–10 (2021).
He, W. et al. Global urban fractional changes at a 1 km resolution throughout 2100 under eight scenarios of shared socioeconomic pathways (SSPs) and representative concentration pathways (RCPs). Earth Syst. Sci. Data 15, 3623–3639 (2023).
Marcotullio, P. J., Keßler, C. & Fekete, B. M. Global urban exposure projections to extreme heatwaves. Front. Built Environ. 8, 947496 (2022).
Chamberlain, H. R. et al. Building footprint data for countries in Africa: to what extent are existing data products comparable? Comput. Environ. Urban Syst. 110, 102104 (2024).
Li, X., Chakraborty, T. C. & Wang, G. Comparing land surface temperature and mean radiant temperature for urban heat mapping in Philadelphia. Urban Clim. 51, 101615 (2023).
Lawrence, D. M. The Land Use Model Intercomparison Project (LUMIP) contribution to CMIP6: rationale and experimental design. Geosci. Model Dev. 9, 2973–2998 (2016).
Demuzere, M. et al. Impact of urban canopy models and external parameters on the modelled urban energy balance in a tropical city. Q. J. R. Meteorol. Soc. 143, 1581–1596 (2017).
Habitat, U. N. In Tracking Progress Towards Inclusive, Safe, Resilient and Sustainable Cities and Human Settlements. SDG 11 Synthesis Report-High Level Political Forum 2018. (United Nations, 2018).
Dieleman, F. & Wegener, M. Compact city and urban sprawl. Built Environ. 30, 308–323 (2004).
Pozoukidou, G. & Chatziyiannaki, Z. 15-minute city: decomposing the new urban planning eutopia. Sustainability 13, 928 (2021).
Browning, M. et al. Measuring the 3-30-300 rule to help cities meet nature access thresholds. Sci. Total Environ. 907, 167739 (2023).
Hsu, A. et al. Measuring what matters, where it matters: a spatially explicit urban environment and social inclusion index for the sustainable development goals. Front. Sustain. Cities 2, 62 (2020).
Bailey, J. et al. Localizing SDG 11.6. 2 via Earth observation, modelling applications, and harmonised city definitions: policy implications on addressing air pollution. Remote Sens. 15, 1082 (2023).
Nice, K. A., Demuzere, M., Coutts, A. M. & Tapper, N. Present day and future urban cooling enabled by integrated water management. Front. Sustain. Cities 6, 1337449 (2024).
Edens, B. et al. Establishing the SEEA ecosystem accounting as a global standard. Ecosyst. Serv. 54, 101413 (2022).
Keith, D. A. et al. A function-based typology for Earth’s ecosystems. Nature 610, 513–518 (2022).
Venter, Z. S. et al. ‘Uncertainty audit’ for ecosystem accounting: Satellite-based ecosystem extent is biased without design-based area estimation and accuracy assessment. Ecosyst. Serv. 66, 101599 (2024).
McManamay, R. A. et al. Dynamic urban land extensification is projected to lead to imbalances in the global land-carbon equilibrium. Commun. Earth Environ. 5, 70 (2024).
Huang, S. et al. Widespread global exacerbation of extreme drought induced by urbanization. Nat. Cities 1–13 (2024).
Chen, B. et al. Wildfire risk for global wildland–urban interface areas. Nat. Sustain. 7, 474–484 (2024).
Friedl, M. A. et al. Medium spatial resolution mapping of global land cover and land cover change across multiple decades from Landsat. Front. Remote Sens. 3, 894571 (2022).
Zhang, X. et al. GLC_FCS30D: the first global 30 m land-cover dynamics monitoring product with a fine classification system for the period from 1985 to 2022 generated using dense-time-series Landsat imagery and the continuous change-detection method. Earth Syst. Sci. Data 16, 1353–1381 (2024).
O’Neill, B. C. et al. The roads ahead: narratives for shared socioeconomic pathways describing world futures in the 21st century. Glob. Environ. Change 42, 169–180 (2017).
Rubel, F. & Kottek, M. Observed and projected climate shifts 1901–2100 depicted by world maps of the Köppen-Geiger climate classification. Meteorol. Z. 19, 135 (2010).
Earth Resources Observation And Science (EROS) Center. Collection-2 Landsat 8-9 OLI (Operational Land Imager) and TIRS (Thermal Infrared Sensor) Level-2 Science Products. U.S. Geological Survey https://doi.org/10.5066/P9OGBGM6 (2013).
Pekel, J.-F., Cottam, A., Gorelick, N. & Belward, A. S. High-resolution mapping of global surface water and its long-term changes. Nature 540, 418–422 (2016).
Gorelick, N. et al. Google Earth Engine: planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 202, 18–27 (2017).
Woodhouse, I. H. On ‘ground’ truth and why we should abandon the term. J. Appl. Remote Sens. 15, 041501 (2021).
Acknowledgements
Pacific Northwest National Laboratory is operated for the U.S. Department of Energy (DOE) by Battelle Memorial Institute under contract DE-AC05-76RL01830. This study is supported by a DOE Early Career award to T.C. as well as the Coastal Observations, Mechanisms, and Predictions Across Systems and Scales-Great Lakes Modeling (COMPASS-GLM) and Integrated Coastal Modeling (ICoM) projects. The latter two are multi-institutional projects supported by DOE’s Office of Science’s Office of Biological and Environmental Research. L.Z. acknowledges support from the U.S. National Science Foundation (CAREER Award Grant No. 2145362) and the Institute for Sustainability, Energy, and Environment at the University of Illinois Urbana-Champaign. We thank Samapriya Roy and the Earth Engine Community Data Catalog for providing seamless access to many of these data products, Mattia Marconcini for discussions on the World Settlement Footprint suite of products, Panagiotis Sismanidis for providing details on the methodology used to develop the ESA CCI dataset, and Kanishka Balu Narayan for discussions on urban representation in Integrated Assessment Models.
Author information
Contributions
T.C. designed the study, processed the satellite observations, analyzed the data, and wrote the manuscript. Z.S.V., M.D., W.Z., J.G., L.Z., and Y.Q. provided comments and suggestions on the research design and writing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chakraborty, T., Venter, Z.S., Demuzere, M. et al. Large disagreements in estimates of urban land across scales and their implications. Nat Commun 15, 9165 (2024). https://doi.org/10.1038/s41467-024-52241-5
DOI: https://doi.org/10.1038/s41467-024-52241-5