Abstract
Urbanization often results in unequal outcomes in social well-being, particularly in rapidly growing megacities within developing countries. This study investigates spatial inequalities in urban public service accessibility across communities in Beijing, with a focus on migrant populations who often face systemic disadvantages. Using fine-scale spatial and census data, we identify a significant negative correlation between facility accessibility and the proportion of migrants. Among the lowest-income communities, those with higher migrant shares experienced accessibility distances 2.09 times greater than others. In high-migrant areas, inequality levels surpassed the city average by 14.57%, reflecting entrenched spatial disparities. Interpretable machine learning models reveal key threshold effects: when housing prices fall below 80,590 yuan/m² and migrant ratios exceed 32%, inequalities rise sharply. Furthermore, more than 18.98% of communities with high migrant population proportions exhibited cumulative inequality, where multiple disadvantage factors overlap and reinforce each other. These findings highlight how spatial, economic, and demographic vulnerabilities intersect, underscoring the urgent need for data-informed, equity-oriented urban planning to foster more inclusive, resilient, and sustainable cities.
Introduction
The sustainable development of cities is closely tied to the future of humanity1, drawing increasing attention from countries worldwide and being included in the United Nations 2030 Sustainable Development Goals (SDGs)2,3. In recent decades, the world has undergone a rapid urbanization process, with megacities experiencing prioritized growth4. Economic globalization and urban migration have dual impacts on the social structure of global cities. On the one hand, there is increasing integration across social, racial, and cultural lines, which is a crucial element of global urban development5. On the other hand, globalization has significantly intensified social divisions, creating new barriers and inequalities within cities6. Consequently, megacities at the top of the global urban hierarchy are often the most polarized, marked by severe wealth disparity and social marginalization7. A paradox exists among these leading cities: they concentrate local elites and affluent individuals, but also experience a growing low-income population and an increasing number of marginalized immigrants and minority groups8.
If urbanization continuously leads to spatial inequalities, it will pose a significant challenge to sustainable urban development9. This is particularly true for many Global South countries, where megacities face increasing demands for infrastructure and resources, causing disparities in access to and quality of services between urban centers and suburban areas10. Simultaneously, economic growth and development often fall short of meeting the needs of rapidly expanding urban populations11, resulting in a lack of basic infrastructure and issues, such as overcrowding and pollution12. As social inequality intensifies globally, spatial inequalities within complex urban environments have become a focal point of policy research13. A notable characteristic of urbanization is the unequal accessibility of basic public service facilities14,15, with not all urban residents having equal access to essential services. A truly sustainable city should provide equitable public services and facilities for all social groups and classes (SDG 11). In the context of rapid urbanization, marginalized groups, despite making significant contributions to urban development, often face restrictions in their living environments that limit access to basic services, such as education, healthcare, and social security, especially in large cities16.
Smoyer defines spatial equity as “a necessary analytical tool for identifying underserved areas, assessing the impacts of existing urban service provision policies, and offering recommendations on how to allocate scarce public facilities”17. In rapidly urbanizing countries like China, significant disparities often exist in access to basic public services between local residents and migrants of different socioeconomic statuses in megacities, highlighting the sharp contradictions of spatial and social inequalities in urban environments18. This study conducts a quantitative analysis of the spatial accessibility of public facilities among different residential communities (i.e., the smallest administrative unit in Chinese cities) in Beijing. It addresses a critical research gap by uncovering how spatially compounding effects of cumulation and thresholds amplify accessibility inequalities. Specifically, the spatial cumulation effect highlights the long-term and structural nature of disparities in infrastructure and public service access among different social groups. These inequalities result from the historical layering of biased investments, institutional imbalances, and residential segregation15,19. As resources continue to concentrate in core areas, peripheral communities experience structural spatial inertia, leading to persistent spatial exclusion and strong path dependence. The threshold effect, rooted in nonlinear social responses and threshold theory14, suggests that inequality intensifies sharply once disadvantages in access surpass a critical level—especially for vulnerable populations. By revealing how these two effects interact to exacerbate spatial inequality in service access, this study addresses a gap in understanding the dynamic mechanisms of urban inequity. It offers a new perspective for examining spatial justice in rapidly urbanizing regions and provides both theoretical and empirical support for more nuanced, stratified resource allocation.
Firstly, we used the Amap API for public transit route planning to obtain the shortest distance to the nearest public facilities and services, including education, healthcare, public spaces, such as parks and plazas, and public safety institutions like police stations. This allowed us to create a high-resolution, 100 m interval raster grid covering the entire study area20,21, which was then aggregated to generate a detailed community-level urban accessibility dataset (for specific methods, see the Supplementary Appendix). Secondly, using fine-grained data from the 2020 Seventh National Population Census of China, we extracted the proportions and structural attributes of different residential groups at the community scale and categorized them by socioeconomic statuses based on housing prices. Drawing on concepts from public health research, such as quintiles and disparity ratios22,23, we analyzed urban inequalities in public facility accessibility among different residential groups. Thirdly, utilizing high-resolution spatial data, we identified the cumulative effects of public facility accessibility and further assessed how differences among residential groups impact the levels of spatial inequality. Lastly, we employed interpretable machine learning methods to measure the combined effects of group identity and other socioeconomic factors on public facility inequality, identifying existing threshold effects in service accessibility distances (SAD). Based on these thresholds, we used a customized quadrant analysis to identify specific communities exhibiting inequalities.
This study establishes a robust framework for understanding the multifaceted impacts of urban governance systems on accessibility inequality, providing empirical insights for policymakers to develop strategies that promote more equitable urban development. The findings contribute theoretical insights (i.e., spatial cumulative and threshold effects) into the spatial inequalities associated with the growing migrant population and intensifying social stratification in megacities of emerging countries, offering support for policy formulation to effectively address these challenges.
Results
Structural disparities of urban inequality
We assessed community inequality using six types of urban public facility accessibility, including educational resources (e.g., primary and secondary schools), medical resources (e.g., general and high-quality hospitals), urban public spaces, and public security agencies. We also measured urban comprehensive service accessibility through wavelet transform fusion, providing more accurate community inequality evaluation24 (for specific methods, see the SI Appendix).
We also evaluated the differences in urban SAD at the community level according to socioeconomic status and residential group (SI Appendix, Supplementary Tables 1–5), uncovering the interplay between residential attributes and economic status. Figure 1a shows the spatial distribution pattern of social stratification across different groups of migrant resident ratios (MRR) and average housing prices (AHP), the latter serving as a proxy for socioeconomic status. Communities in the highest socioeconomic quintile mainly consist of local residents (Local Resident Ratio > 70%) residing in core areas (<5 km from the city center). In contrast, communities in the lowest socioeconomic quintile consist of 20% to 80% migrant residents (Average MRR = 40%), mainly residing in suburban areas (>10 km from the city center). Figure 1b illustrates significant differences in facility accessibility across socioeconomic strata within similar residential groups. A statistically significant relationship is observed across all groups with different socioeconomic statuses (\({\beta }_{{All\; SES}}\) > 0, P < 0.01), with an increase in the migrant resident ratio being significantly associated with a greater urban SAD. The differences in community SAD among migrant residents in low and lower-middle socioeconomic strata are the highest (\({\beta }_{{Low\; and\; Medium\; Low\; SES}}\) > 1, P < 0.01). Even within the community group with the highest socioeconomic status, disparities in access to services remain, albeit at a reduced level (\({\beta }_{{High\; SES}}\) = 0.58, P < 0.01), indicating persistent social and spatial inequalities.
a Communities with high socioeconomic statuses (red dots) mainly consist of local residents living in city center, while communities with low socioeconomic statuses (blue dots) primarily consist of migrant residents living in urban suburbs. b There are significant differences in urban facility accessibility due to different residential groups among these communities with different socioeconomic statuses, with the slope between facility accessibility and the percentage of migrant residents being significantly positive for groups with various socioeconomic statuses.
To measure the disparities between different groups, we also calculated facility accessibility disparity ratios by dividing the population into quintiles based on the proportion of local residents and socioeconomic status (for specific materials, see the SI Appendix). The disparity ratio of comprehensive facility accessibility between the communities with the lowest (>80th percentile) and highest (<20th percentile) socioeconomic statuses is 2.09 (SI Appendix, Supplementary Table 1), indicating that the lowest-income group incurs 109% higher distance costs in accessing urban public facilities than the highest-income group. The disparity ratio for urban SAD between the community groups with the highest and lowest MRR within the lowest socioeconomic stratum is 1.58 (SI Appendix, Supplementary Table 3). Among individual public facility types, the accessibility disparity ratio for high-quality hospital resources is the largest at 2.34, with migrant residential communities incurring 134% (P < 10⁻⁴) higher distance costs than local residential communities. Similarly, the accessibility disparity ratio for middle education resources is significant, with migrant residential communities incurring 100% (P < 10⁻⁴) higher distance costs. In contrast, the accessibility disparity ratio for public spaces is relatively small (DR = 1.29, P < 10⁻⁴) (SI Appendix, Supplementary Table 2).
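The quintile-based disparity ratio described above can be illustrated with a minimal sketch. It assumes the ratio is the mean SAD of the lowest quintile of a grouping variable divided by the mean SAD of the highest quintile; the exact procedure is given in the SI Appendix and may differ. The function names and the rank-based quintile assignment are illustrative choices, not the authors' code.

```python
from statistics import mean

def quintile_labels(values):
    """Assign each value a quintile label 1..5 by rank (1 = lowest fifth)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // len(values) + 1
    return labels

def disparity_ratio(sad, group_value):
    """Mean SAD in the lowest quintile of group_value divided by
    mean SAD in the highest quintile (assumed definition)."""
    labels = quintile_labels(group_value)
    lowest = mean(s for s, q in zip(sad, labels) if q == 1)
    highest = mean(s for s, q in zip(sad, labels) if q == 5)
    return lowest / highest

def extra_cost_percent(ratio):
    """Convert a disparity ratio into the 'percent higher distance cost' form."""
    return (ratio - 1.0) * 100.0
```

Under this reading, a disparity ratio of 2.09 corresponds to `extra_cost_percent(2.09)`, i.e. roughly 109% higher distance costs, which matches the figure reported in the text.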
Spatial effects of urban inequality
Urban service inequality exhibits an intertwined agglomeration phenomenon, driven by long-term concentrated construction investment in densely populated central communities while neglecting areas with low economic activity25,26. This creates a significant association in the spatial layout of various infrastructures. Pearson correlation coefficients of accessibility among communities for six types of facilities (SI Appendix, Supplementary Table 9 and Supplementary Figs. 16–17) confirm strong correlations at the 5% significance level. The correlation is stronger for similar services, such as primary and middle education (Pearson’s r = 0.5129, P ≤ 10⁻⁴) and medical resources, including comprehensive and grade A hospitals (Pearson’s r = 0.4973, P ≤ 10⁻⁴).
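As a hedged illustration of the pairwise comparison above, the sketch below computes Pearson's r between the accessibility distances of any two facility types and assembles a simple correlation table. The input layout (a dictionary mapping a facility name to a list of per-community distances) is a hypothetical choice; significance testing is omitted.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Pairwise Pearson r for every pair of facility types.

    `columns` maps a facility name to its per-community distances
    (hypothetical layout, one value per community in a fixed order).
    """
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}
```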
By calculating the standard deviation (SD) as an indicator of the variability in accessibility distances for respective types of community services, a significant relationship is observed with comprehensive SAD (\({\beta }_{{SD}}\) = 1.2141, P < 0.01) (Fig. 2a). This indicates that communities with better comprehensive services tend to have more public facilities. Given the considerable advantages in comprehensive accessibility of local residential communities, public resource allocation policies tend to prioritize the local population27, exacerbating the challenges for communities with a relatively higher proportion of migrant residents in accessing various public services equitably.
a Communities with an advantage in comprehensive service accessibility have more balanced and improved services across various aspects; b The relationship between the standard deviation (σ) and the mean (μ) of the accessibility of various facilities within different communities at the subdistrict scale is depicted, with the black dashed line representing the fitted I threshold, and the color intensity indicates the proportion of migrant residents, revealing the correlation between inequality levels and migrant resident ratios.
Additionally, by aggregating along administrative boundaries, we calculated the inequality index (Inequality, I) of service accessibility within communities at a higher spatial scale (i.e., the subdistrict) (for specific methods, see the SI Appendix). As shown in Fig. 2b, the higher the proportion of migrant population in a specific subdistrict, the greater the degree of inequality. For subdistricts dominated by migrant residents (Average MRR = 50%), the inequality in accessibility to community facilities (Mean \({I}_{{Migrant}}\) = 0.6747) is 14.57% higher (P = 0.0419) than that in subdistricts primarily composed of local residents (Average MRR = 18%, Mean \({I}_{{Local}}\) = 0.5889). At the subdistrict scale, facility accessibility disparity ratios across different groups show significant changes (\({\delta }_{{Disparity\; ratio}}\) = +59.00%), while the Gini coefficient (\({\delta }_{{Gini}}\) = -58.01%) and Moran’s I (\({\delta }_{\text{Moran's I}}\) = -24.22%), which represent the inequality level of SAD, both decline (SI Appendix, Supplementary Figs. 19–20). This reflects two factors: the distribution of the migrant resident ratio varies across spatial scales (SI Appendix, Supplementary Figs. 21–22), and the spatial cumulative effects of urban infrastructure constrain the results, because a coarser spatial scale often reduces the measured internal inequality of spatially agglomerated distributions. Since the disparity ratio between different groups does not decrease, evaluating inequality in urban facility accessibility by residential group identity at the fine spatial scale of the community is valuable for subsequent urban service allocation and optimization.
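The 14.57% figure above follows directly from the two reported means: (0.6747 − 0.5889) / 0.5889 ≈ 0.1457. The sketch below reproduces that arithmetic and adds a standard Gini coefficient for a list of non-negative distances. The inequality index I itself is defined in the SI Appendix and is not reproduced here; the Gini formula is the common rank-based form and is assumed, not confirmed, to match the one used in the study.

```python
def relative_difference(value, baseline):
    """Relative change of `value` with respect to `baseline`."""
    return (value - baseline) / baseline

def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum: the i-th smallest value gets weight i (1-based).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

# Reported subdistrict means of the inequality index, taken from the text.
I_MIGRANT, I_LOCAL = 0.6747, 0.5889
gap_percent = relative_difference(I_MIGRANT, I_LOCAL) * 100  # about 14.57
```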
Detecting threshold effects of urban inequality
Our findings indicate that community service accessibility levels are significantly influenced by several key factors. The migrant resident ratio and the average housing price (representing socioeconomic status) have positive (\({\beta }_{{MRR}}\) = 0.496, P < 0.01) and negative (\({\beta }_{{AHP}}\) = -2.05E-06, P < 0.01) effects, respectively, explaining 8.88% and 8.82% of the variation at the community scale (SI Appendix, Supplementary Table 11). Additionally, the distance to the urban center (DUC) exhibits a dominant positive effect on community facility accessibility distance (\({\beta }_{{DUC}}\) = 0.050, P < 0.01), explaining 67.79% of the variation. This suggests that distance from the urban center plays a major role in the accessibility of urban services for the community. Conversely, the built-up area ratio (BAR) (\({\beta }_{{BAR}}\) = -0.550), road network density (RND) (\({\beta }_{{RND}}\) = -0.674), and population density (PD) (\({\beta }_{{PD}}\) = -2.00E-04) all have significant reducing effects on SAD (P < 0.01). The green area ratio (GAR) in a community also shows a negative effect on overall facility accessibility distance (\({\beta }_{{GAR}}\) = -0.018), although this result is not significant. These findings demonstrate that the inequality in facility accessibility levels observed at the community scale can, to some extent, be addressed through targeted urban planning and interventions.
Furthermore, we used an interpretable machine learning model, LightGBM, to calculate the SHAP values for eight types of significant urban characteristics. The SHAP analysis results are shown in Fig. 3a, with the features ordered by importance on the Y-axis. The distance from the city center emerges as the most important factor influencing the level of comprehensive service accessibility in communities, followed by average housing price, which represents socioeconomic status. The proportion of migrant population ranks sixth among the selected interpretable factors. For different types of facilities, the importance of influencing factors varies (SI Appendix, Supplementary Fig. 27). For example, residential group identity is more important for educational resources (primary and middle school accessibility) than the socioeconomic factor of housing prices. For other service facilities, community housing prices are relatively more important.
a Influencing importance of proximity to the city center, socioeconomic attributes, road, population, and building density, and migrant population proportion on service accessibility distance. b and c Partial dependence plots of the SHAP value for migrant resident ratio and socioeconomic status on urban service accessibility, illustrating the impact of a single variable on the SHAP value, independent of other variables.
Our research found a critical threshold for the univariate effect of the observed migrant population proportion and socioeconomic status at the community scale (for specific methods, see the SI Appendix). As shown in Fig. 3b, when the migrant resident ratio exceeds the critical threshold (\({{\rm{\theta }}}_{{\rm{MRR}}}\) = 32.06%), the comprehensive SAD increases markedly as the migrant resident ratio rises. The accessibility distances of primary education (\({{\rm{\theta }}}_{{\rm{MRR\; PE}}}\) = 35.21%) and middle education (\({{\rm{\theta }}}_{{\rm{MRR\; ME}}}\) = 29.85%) are most sensitive to the migrant population proportion. This is directly related to the influence of the household registration system on children’s local access to high-quality education services28. In contrast, the impact on the accessibility of general medical resources is relatively moderate, with a higher critical threshold for comprehensive hospitals (\({{\rm{\theta }}}_{{\rm{MRR\; CH}}}\) = 38.01%) and a lower one for grade A hospitals (\({{\rm{\theta }}}_{{\rm{MRR\; AH}}}\) = 35.50%) (SI Appendix, Supplementary Fig. 28). Overall, communities with higher migrant proportions face greater challenges in accessing various services, but the critical thresholds vary across facility types. For parks, squares, and public security agencies, the explanatory power of change in the threshold of migrant resident ratio is low (R² = 0.1842), and the difference is not significant. This indicates that the distribution of these public facilities in Beijing is relatively balanced between migrant and local residents29.
Figure 3c shows that housing price, representing socioeconomic status, exhibits a threshold in its effect on accessibility to comprehensive urban services (\({\theta }_{{AHP}}\) = 80,590 yuan/m²). This indicates that its impact on service accessibility is limited, with varying threshold effects across different urban public services (SI Appendix, Supplementary Fig. 29). The threshold for comprehensive hospitals is the lowest (\({\theta }_{{AHP\; CH}}\) = 63,767 yuan/m²), while the threshold for middle education resources is relatively higher (\({\theta }_{{AHP\; ME}}\) = 70,607 yuan/m²) compared to primary education. The existence of critical thresholds for the socioeconomic attribute implies that, in terms of the general locations of urban public facilities, higher housing prices do not necessarily guarantee better public service accessibility. Notably, the housing price thresholds for various facilities exceed the AHP of communities dominated by the migrant population (Average AHP = 64,925 yuan/m²). To save on housing costs, migrant residents usually accept higher distance costs when accessing education, medical, and public space resources30.
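The study's own threshold detection procedure is described in the SI Appendix. As a stand-in for illustration only, the sketch below locates a breakpoint in a one-dimensional dependence curve (for example, SHAP value against MRR or AHP) by trying every split of the sorted points, fitting a separate least-squares line to each side, and keeping the split with the smallest total squared error. This two-segment search is an assumed technique, not the authors' method, and all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def find_threshold(xs, ys, min_points=3):
    """Return the x value that best separates two linear regimes.

    Each candidate split keeps at least `min_points` points per side;
    the threshold is the midpoint between the two points adjacent to the split.
    """
    pairs = sorted(zip(xs, ys))
    if len(pairs) < 2 * min_points:
        raise ValueError("not enough points for a two-segment fit")
    best_sse, best_x = None, None
    for k in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:k], pairs[k:]
        sse = fit_line(*zip(*left))[2] + fit_line(*zip(*right))[2]
        if best_sse is None or sse < best_sse:
            best_sse, best_x = sse, (left[-1][0] + right[0][0]) / 2
    return best_x
```

On a curve that is flat up to some value and rises afterwards, `find_threshold` returns a point between the last flat observation and the first rising one. A perfectly continuous hinge can fit almost equally well at neighboring splits, so real analyses typically add smoothing or a continuity constraint.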
Identifying cumulation effects of urban inequality
We developed a quadrant division method to quantify and prioritize infrastructure service investment for urban communities, addressing social inequality for low socioeconomic and migrant populations (for specific methods, see the SI Appendix). Using AHP (representing community socioeconomic status) and MRR as variables, we divided all communities into four quadrants and set the quadrant cut-off points at the critical threshold of each variable (\({\theta }_{{MRR}}\) = 32.06%, \({\theta }_{{AHP}}\) = 80,590 yuan/m²).
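The quadrant division described above can be sketched as a simple classification on the two thresholds. The text identifies the second quadrant as Hi-AHP, Lo-MRR and the fourth as Lo-AHP, Hi-MRR; the numbering of the first and third quadrants below follows the usual mathematical convention (MRR on the horizontal axis, AHP on the vertical axis) and is an assumption, as is the handling of values exactly on a threshold.

```python
THETA_MRR = 0.3206  # migrant resident ratio threshold (32.06%), from the text
THETA_AHP = 80590   # average housing price threshold (yuan/m²), from the text

def quadrant(mrr, ahp, theta_mrr=THETA_MRR, theta_ahp=THETA_AHP):
    """Classify a community by migrant resident ratio and average housing price.

    Quadrant 2 = Hi-AHP, Lo-MRR and quadrant 4 = Lo-AHP, Hi-MRR (as in the text);
    quadrants 1 and 3 are assigned by the standard convention (assumption).
    """
    hi_mrr = mrr > theta_mrr    # "migrant ratios exceed" the threshold
    hi_ahp = ahp >= theta_ahp   # "housing prices fall below" means low AHP
    if hi_ahp and hi_mrr:
        return 1
    if hi_ahp:
        return 2
    if not hi_mrr:
        return 3
    return 4

def quadrant_shares(communities):
    """Share of communities in each quadrant, for a non-empty list of (mrr, ahp) pairs."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for mrr, ahp in communities:
        counts[quadrant(mrr, ahp)] += 1
    return {q: c / len(communities) for q, c in counts.items()}
```

Communities falling in quadrant 4 are the ones the analysis flags as priorities for investment; the same function can be reused with facility-specific thresholds by passing different `theta_mrr` and `theta_ahp` values.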
Figure 4a focuses on the fourth quadrant, which contains 323 communities characterized by low socioeconomic status and high MRR (SI Appendix, Supplementary Fig. 30). These communities face significant social inequality in comprehensive facility service accessibility, accounting for 18.98% of all communities, 22% of the total population, and 37% of the total area (Fig. 4b, right column). Due to the structural inequality of socioeconomic status and residential group identity, the disparity ratio of comprehensive facility accessibility between the second quadrant (Hi-AHP; Lo-MRR) communities and the fourth quadrant (Lo-AHP; Hi-MRR) communities is 1.88, implying an additional 88.08% distance cost in comprehensive facility accessibility for the fourth quadrant communities.
a Quadrant analyses of residential groups and socioeconomic attributes for prioritizing urban public facility investment to mitigate the accessibility inequality of community basic services; b Proportions of the population and area in four quadrant categories with respect to the total city population and area.
For different public facility types, we can further separately identify communities with relative deficiencies of urban service accessibility, allowing for a targeted optimization of infrastructure investment (SI Appendix, Supplementary Fig. 31). The numbers of communities needing investment in primary and middle education resources are 274 and 205, accounting for 16.11% and 12.05%, respectively. The numbers of communities needing investment for comprehensive hospital and grade A hospital resources are 154 and 36, respectively (SI Appendix, Supplementary Fig. 32). In addition, we delineate the spatial distribution of communities in prioritized investment quadrants for different types of public facilities, enabling the city to concentrate infrastructure investment efforts on specific communities most in need.
Discussion
The concentration of urban populations enhances the efficiency and cost-effectiveness of city facilities, but proximity does not benefit all groups equally9,14. Integrating fairness into urban sustainable science system modeling is crucial for advancing research in this area31,32. By analyzing and visualizing urban facilities accessibility distances at the community scale within Beijing’s built-up areas, we have established a fine-grained urban spatial dataset. This dataset offers a transparent and reproducible data science perspective to analyze the structural social differences and spatial inequalities in access to urban services under different residential groups.
Empirical findings from migrant communities in Beijing, particularly those with low-income residents, reveal significant disparities in accessing local public facilities via the public transportation system compared to more affluent local communities. Migrant residents in urban peripheries generally face greater challenges in accessing nearby facilities. Moreover, local residents, with lower opportunity costs, are more willing to engage and interact with the urban environment33. Previous studies34 have largely underestimated the urban service access inequalities faced by communities dominated by migrant residents because they did not use the household registration system to distinguish between groups. When addressing urban facility accessibility inequalities, it is crucial to consider the spatial scale of the analysis, as this affects the calculated inequality indicators. The interconnected nature of facility planning often leads to oversight of internal community inequalities. Finally, using facility accessibility indicators, along with socioeconomic status and residential group identity, for quadrant analysis can help identify, quantify, and prioritize communities in need of investment and construction. We therefore explored the cumulative and threshold phenomena behind the inequalities in urban service distribution. By integrating considerations of equity into urban planning, this approach can help achieve a fair allocation of urban public services for different groups across multiple spatial scales.
China’s household registration system (called hukou) has undergone significant evolution over its long history (SI Appendix, Supplementary Table 13). Initially designed to serve functions of administrative management and social order maintenance, it has been used by national and local governments to restrict population mobility, control population size, achieve localized management, and maintain social stability, which also specifically distinguishes between local and migrant populations. We find that migrants without local household registration face significantly longer travel distances to access education and healthcare services. Under China’s unique hukou system, this pattern reflects an institutionalized barrier, as access to public resources remains closely tied to hukou status. Without local registration, migrants are often excluded from equal rights to schooling and medical care.
In recent years, China has undertaken reforms to its hukou system, notably through the 2014 “New Urbanization Plan” and implementing policies such as the residence permit system and points-based settlement schemes, designed to extend equal rights and public service access to the migrant population35. Nevertheless, in megacities like Beijing, high thresholds for hukou acquisition and the slow spatial reallocation of public services have created a disconnect between reform goals and actual needs. As a result, migrants remain marginalized in access to basic services like education and healthcare. Our findings reveal significant social inequality in infrastructure access based on residential status, underscoring the need for hukou reforms to go beyond formal eligibility and promote genuine social integration. Only by addressing both institutional and spatial barriers can inclusive urban governance and equitable, sustainable public service systems be achieved.
To promote equitable public service provision and spatial justice in rapidly urbanizing megacities, this study proposes a multi-level intervention framework based on our findings. Spatial interventions include enhancing disadvantaged communities’ access to high-quality services through improved transportation connectivity, such as more efficient public transit transfers, targeted shuttle services, and expanded non-motorized transport networks. Using a “quadrant identification” approach, we recommend prioritizing neighborhoods with low socioeconomic status and high migrant populations. Targeted measures include expanding schools in rental-dense areas, establishing embedded primary healthcare stations, and building pocket parks as part of urban renewal to improve both accessibility and functional diversity. Social targeting should incorporate identified thresholds (e.g., migrant population exceeding 32%) into resource allocation models as triggers for intervention. For such communities, we propose setting a “basic service threshold” and delivering essential services via mobile clinics and multifunctional community hubs to enhance responsiveness and resilience. Infrastructure planning in peripheral and newly developed areas should integrate “social vulnerability weighting” to ensure that vulnerable populations are prioritized in site selection. At the institutional level, reforming the linkage between hukou and public service entitlements is key. This includes accelerating hukou reform, expanding access to affordable housing, and reducing settlement thresholds for migrants. Policies such as proximity-based school enrollment and interprovincial healthcare integration should be strengthened to ensure equal access for migrant families. Finally, advancing community co-governance, improving grassroots service networks, and fostering resident engagement can enhance service delivery efficiency and build an inclusive, responsive urban support system.
This study has several limitations that point to directions for future research. First, our analysis relies on the Amap platform’s public transit routing model29,36, which does not fully capture the diverse travel behaviors across social groups. While public transport is the primary mode for low-income migrants, middle- and high-income local residents are more likely to use private vehicles, cycling, or multimodal options. Relying solely on public transit to estimate accessibility may underestimate service access for affluent groups while masking constraints faced by economically disadvantaged communities37. Future research should incorporate multimodal travel data and individual mobility patterns, such as trip frequency and mode choice, to enhance the validity and robustness of results. Second, the analysis does not account for variations in service capacity and quality. The assumption of “unlimited service capacity” may underestimate access inequalities arising from differences in facility level and service provision, particularly in healthcare and education38. For example, disparities between top-tier hospitals and community clinics are not adequately captured in the current framework. Future studies should integrate multiple data sources to incorporate service level and capacity indicators, thereby enabling a more comprehensive assessment of the multidimensional structure of spatial justice. Third, due to the limited availability of income data in China, this study uses neighborhood-level AHP as a proxy for socioeconomic status (SES). While this is a common and partly representative approach39,40,41, it fails to capture individual-level heterogeneity, such as high-education, low-income or low-education, high-wealth populations.
Future research should draw on census and survey data to include variables, such as education, occupation, income, employment status, and vehicle ownership, in order to construct a more accurate SES index and improve understanding of the relationship between social structure and spatial inequality. Despite limitations in travel behavior modeling, service supply assessment, and social stratification representation, this study sheds light on the unequal patterns of service access in Beijing. It offers valuable insights for advancing future urban planning research that is more precise, equitable, and human-centered.
In conclusion, as public transportation systems and urban infrastructure continue to improve, increasingly fine-grained and real-time spatial data will become available to support participatory and evidence-based planning42,43. Our study demonstrates that such data can enable cities to directly and comprehensively assess how spatial accumulation and threshold effects exacerbate inequalities in facility distribution. By identifying these patterns and disparities, urban planners can more effectively promote inclusive urban development and enhance social equity, offering important guidance for future human-centered, precise, and equitable urban planning research.
Methods
Data acquisition
Geospatial analysis considering street networks and pedestrian infrastructure can provide precise data on accessibility and travel times, enhancing the understanding of spatial inequalities in urban locations44. In this study, the locations of public services were sourced from the latest Amap POI data (https://lbs.amap.com/), covering six categories: educational resources (primary and middle schools), medical resources (grade A hospitals and general comprehensive hospitals), public spaces, and police stations. Community-scale social attribute data on permanent residents, including total population, migrant population, and population by education level, were obtained from the Beijing Government’s Seventh Census Survey (https://tjj.beijing.gov.cn/). Housing prices, used as a proxy for the socioeconomic status of local residents, were sourced from the Lianjia website (https://m.lianjia.com/), one of China’s most well-known online real estate platforms, which provides detailed housing information and timely market price data. A Python-based crawler was used to collect all housing price data within the built-up areas of Beijing from this website, totaling 17,620 housing records with address, coordinates, price per square meter, street, residential area, and construction year.
Unlike traditional static 2SFCA methods for calculating accessibility45, we leveraged big data from public transportation to obtain the real-world access distances between two points through the Amap API. The path planning API provides walking, public transport, and driving query and distance calculation interfaces in HTTP format and returns data in JSON or XML format, enabling the development of path planning functions that reflect the actual accessibility distance of urban locations (SI Appendix Supplementary Fig. 4).
Integration of comprehensive facilities accessibility
In urban planning and management, the accessibility of infrastructure services is a crucial factor affecting residents’ quality of life and urban functionality. The distribution and accessibility of different types of infrastructure significantly impact community development and residents’ happiness. To comprehensively assess infrastructure service accessibility, we integrate accessibility data from multiple facilities and generate a composite indicator of SAD, providing a more scientific and comprehensive basis for quantitative analysis and decision-making. Wavelet transform is a multi-scale image fusion algorithm that decomposes different data images into components at multiple scales and frequencies, representing the characteristics of the original image with wavelet coefficients. During data fusion, weights are assigned based on the magnitude or energy of the wavelet coefficients in each data source46,47. By combining weighted wavelet coefficients and then performing an inverse transform, the final fused data is reconstructed. Wavelet transform is an excellent data fusion method, offering optimal performance in fusing and preserving spatial information of images48,49,50.
The wavelet transform formula is as follows:
where \({\rm{f}}\left({\rm{t}}\right)\) is the image signal vector; \({\rm{\varphi }}\left({\rm{t}}\right)\) is the wavelet transform function; \({\rm{\alpha }}\) is the wavelet transform scale; \({\rm{\tau }}\) is the image signal translation amount, and b is the parameter. The basic concept of the wavelet transform is to translate the wavelet function by \({\rm{\tau }}\), then take the inner product with the analyzed signal \({\rm{f}}\left({\rm{t}}\right)\) at different scales \({\rm{\alpha }}\), ultimately achieving multi-scale fusion of images.
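The displayed equation is not reproduced in this text. For orientation only, the standard continuous wavelet transform written with the symbols defined above is given below; this is the textbook form, not necessarily the exact expression used in the study, and the role of the parameter b cannot be recovered from the surrounding text, so it is omitted.

```latex
\[
W_f(\alpha,\tau) \;=\; \frac{1}{\sqrt{\alpha}} \int_{-\infty}^{\infty}
  f(t)\,\varphi^{*}\!\left(\frac{t-\tau}{\alpha}\right)\,dt ,
\qquad \alpha > 0
\]
```

Here \(\varphi^{*}\) denotes the complex conjugate of the wavelet function, which equals \(\varphi\) itself for a real-valued wavelet.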
Urban facilities accessibility disparity ratio
Following the method of Tong et al.38, we constructed community-level accessibility disparity ratios based on socioeconomic and residential group structure. The disparity ratio for socioeconomic attributes is:
For the lowest socioeconomic class (the bottom 20% of neighborhoods), the disparity ratio by residential groups is calculated as:
We used regression equations to calculate regression slopes for different socioeconomic classes and residential group percentages, and compared their magnitudes with the disparity ratios, demonstrating the prevalence of structural inequalities (SI Appendix, Supplementary Table 5). We applied these disparity ratios to specific facility service types, namely basic education, healthcare, public spaces, and police stations (SI Appendix, Supplementary Tables 1–4), and used Student’s t-test to assess the statistical differences between the two groups being compared.
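The two disparity ratio equations are not reproduced in this text. The sketch below assumes the simplest form consistent with the description, a ratio of mean accessibility distances between a focal group and a reference group of communities; the function name, the grouping rule, and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def disparity_ratio(sad, in_focal_group):
    """Ratio of mean accessibility distance: focal group over reference group.

    sad            : accessibility distance of each community
    in_focal_group : True if the community belongs to the focal group
                     (e.g. high migrant share), False for the reference group
    """
    focal = [d for d, g in zip(sad, in_focal_group) if g]
    reference = [d for d, g in zip(sad, in_focal_group) if not g]
    return mean(focal) / mean(reference)


# Hypothetical toy data, for illustration only.
sad = [2.0, 4.0, 6.0, 1.0, 2.0, 3.0]
high_migrant = [True, True, True, False, False, False]
ratio = disparity_ratio(sad, high_migrant)
```

A ratio above 1 means the focal group travels farther on average than the reference group.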
Calculation of inequality indicators
The Gini coefficient (Gini) quantifies the dispersion of samples without considering social stratification and is a statistic used to measure the inequality of income or resource distribution. Here, we apply this concept to the accessibility of public services in urban locations, measuring the distribution inequality of public service accessibility among different communities. The Gini coefficient remains unchanged for uniform growth, meaning that the region’s size does not automatically affect the inequality measurement.
where n is the number of communities, \({{\rm{x}}}_{{\rm{i}}}\) and \({{\rm{x}}}_{{\rm{j}}}\) are the accessibility distances of services for the i-th and j-th locations, and \(\bar{{\rm{x}}}\) is the average SAD in the region.
Considering the varying population sizes of each community spatial unit, we calculated the population-based Gini coefficient for public service accessibility, using a weighting factor:
where \({{\rm{y}}}_{{\rm{m}}}\) is the accessibility distance of services for the m-th region, \({{\rm{w}}}_{{\rm{m}}}\) and \({{\rm{w}}}_{{\rm{n}}}\) are the population sizes of the m-th and n-th regions, \(\bar{{\rm{y}}}\) is the average SAD in the region, and k is the number of spatial regions in the city.
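The two Gini equations are not reproduced in this text. A minimal sketch, assuming the standard mean-absolute-difference form of the Gini coefficient and its population-weighted counterpart, is given below. Treating \(\bar{{\rm{y}}}\) as the population-weighted mean is an assumption, and the function names are illustrative.

```python
def gini(x):
    """Unweighted Gini: sum_i sum_j |x_i - x_j| / (2 * n^2 * mean(x))."""
    n = len(x)
    mean_x = sum(x) / n
    total = sum(abs(xi - xj) for xi in x for xj in x)
    return total / (2 * n * n * mean_x)


def weighted_gini(y, w):
    """Population-weighted Gini.

    sum_m sum_n w_m * w_n * |y_m - y_n| / (2 * (sum w)^2 * weighted mean(y))
    The weighted mean is an assumption about how y-bar is defined.
    """
    w_sum = sum(w)
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    total = sum(
        wm * wn * abs(ym - yn)
        for ym, wm in zip(y, w)
        for yn, wn in zip(y, w)
    )
    return total / (2 * w_sum ** 2 * mean_y)
```

With integer weights, the weighted form gives the same value as the unweighted form applied to a list in which each community is repeated once per resident, which is the sense in which it is population-based. Doubling every distance leaves both values unchanged.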
The Moran’s I
Moran’s I is a spatial autocorrelation index that measures the degree of spatial autocorrelation in a dataset. The global Moran’s I for community-level public service accessibility assesses the clustering or dispersion of public resources, identifying issues of concentration or inadequacy of public service accessibility in specific areas.
where n is the number of communities, \({{\rm{x}}}_{{\rm{i}}}\) is the community-level accessibility distance, and \(\bar{{\rm{x}}}\) is the average accessibility distance. Combining these two indicators provides a comprehensive analysis of public service accessibility, revealing overall distribution inequalities and identifying spatial distribution characteristics.
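The Moran’s I equation is not reproduced in this text. The sketch below assumes the standard global form, \(I = \frac{n}{S_0}\,\frac{\sum_i\sum_j w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{\sum_i (x_i-\bar{x})^2}\) with \(S_0=\sum_i\sum_j w_{ij}\). The spatial weights matrix \(w_{ij}\) is not defined in the surrounding text, so the binary adjacency matrix used here is a hypothetical choice.

```python
def morans_i(x, w):
    """Global Moran's I for values x and a spatial weights matrix w (list of rows)."""
    n = len(x)
    mean_x = sum(x) / n
    dev = [xi - mean_x for xi in x]
    s0 = sum(sum(row) for row in w)  # sum of all spatial weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    var = sum(d * d for d in dev)
    return (n / s0) * cross / var


# Hypothetical example: four communities in a row, neighbours share an edge.
w_line = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], w_line)        # similar neighbours
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], w_line)  # dissimilar neighbours
```

Positive values indicate that communities with similar accessibility distances cluster together; negative values indicate a checkerboard pattern.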
Calculating subdistrict inequality index
Spatial facility inequality (I) was calculated using the method proposed by Brelsford et al.25, Pandey et al.15, and Zhou et al.51, which measures the degree of inequality in community infrastructure accessibility distances at the subdistrict scale.
where I is the inequality index, ranging from 0 (lowest inequality) to 1 (highest inequality), and \(\mu\) and \(\sigma\) are the mean and SD of infrastructure accessibility in the subordinate communities under the same subdistrict, respectively.
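The equation for I is not reproduced in this text. One form consistent with the stated range of 0 to 1 is \(I = \sigma / \sqrt{\mu(1-\mu)}\), which divides the SD by its largest possible value for a variable bounded in [0, 1] with mean \(\mu\). The sketch below assumes that form and assumes accessibility has already been rescaled to [0, 1]; how distances were rescaled in the study is not stated here.

```python
from math import sqrt
from statistics import mean, pstdev


def inequality_index(access):
    """sigma / sqrt(mu * (1 - mu)) for values already scaled to [0, 1].

    Returns 0.0 when mu is 0 or 1, where all values are equal and the
    denominator would be zero.
    """
    if any(a < 0.0 or a > 1.0 for a in access):
        raise ValueError("accessibility values must be scaled to [0, 1]")
    mu = mean(access)
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return pstdev(access) / sqrt(mu * (1.0 - mu))


# Hypothetical subdistricts: one perfectly polarized, one perfectly uniform.
polarized = inequality_index([0.0, 1.0, 0.0, 1.0])
uniform = inequality_index([0.5, 0.5, 0.5, 0.5])
```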
Hierarchical linear regression analysis of socioeconomic attributes and residential groups structure on community infrastructure accessibility (SAD)
The analysis focuses on structural differences in residential groups characteristics and socioeconomic status at the community spatial level, with the community’s economic status based on the average housing price data. This method has been used in public health and energy assessment literature to evaluate complex relationships between income and race52,53.
Building econometric models
We performed multivariate regression analysis to better understand the factors influencing community urban facilities accessibility distance beyond the socioeconomic attributes and residential group structure of primary interest. Predictor variables were selected based on previous research54. If a variable (apart from those used in interaction terms) was highly correlated with other variables (i.e., its variance inflation factor (VIF) value was greater than 3), it was excluded from the final model to control for multicollinearity effects. Table S8 lists the VIF values of the variables used in the final equation. The selected variables in the final model included the community’s MRR, AHP, proportion of residents with a bachelor’s degree or higher (BDR), DUC, building area ratio (BAR), RND, GAR, and PD.
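The VIF screening described above can be sketched as follows, using the usual definition \(\mathrm{VIF}_j = 1/(1-R_j^2)\), where \(R_j^2\) comes from regressing predictor j on all other predictors. The NumPy implementation and the synthetic data are illustrative assumptions and do not reproduce the study's variables.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (rows = communities)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic predictors: c is nearly a copy of a, b is independent of both.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)
vifs = vif(np.column_stack([a, b, c]))
keep = vifs <= 3  # the exclusion threshold stated in the text
```

In this toy case the near-duplicate pair (a, c) exceeds the threshold while b does not; in practice one of the two correlated predictors would be dropped and the VIFs recomputed.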
Constructing econometric models
We used the following model for spatial correlation regression analysis:
where \(S{AD}\) is the average accessibility distance to urban public services at the community scale; \({\alpha }_{0}\) is a constant term; \({\alpha }_{1}\), \({\alpha }_{2}\), and \(\beta\) are the regression coefficients of each variable, measuring the overall level of regional dependency; \(A{HP}\) represents the AHP; \({NRR}\) represents the MRR; \(X\) denotes the other variables, such as the proportion of the population with higher education; and \(\varepsilon\) is the residual vector.
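The model equation is not reproduced in this text. Based only on the terms defined above, the simplest additive form would be the following; this is an assumed reconstruction, and the original may include further terms, such as the interaction terms mentioned in the variable selection step.

```latex
\[
SAD \;=\; \alpha_{0} \;+\; \alpha_{1}\,AHP \;+\; \alpha_{2}\,NRR \;+\; \beta X \;+\; \varepsilon
\]
```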
Machine learning
Machine learning, as a data-driven technique, is characterized by its ability to autonomously learn the relationships between input and output variables, achieving decoupling. Compared to traditional statistical models, machine learning models excel in handling complex nonlinear classification and regression problems, helping to better understand and interpret data. Essentially, the decoupling aspect allows machine learning models to better capture complex patterns and associations in data, enhancing their generalization and adaptability.
Traditional econometric models primarily identify key determinants; they are highly sensitive to outliers and often overlook important nonlinear interactions. We adopted the LightGBM model, an ensemble learning model that can handle multicollinearity and that accelerates the training process of traditional gradient boosting decision trees (GBDT) by up to 20 times or more while maintaining almost the same accuracy55.
Ensemble learning
Compared to single base learners, ensemble learning offers greater effectiveness and adaptability56. These individual learners usually include well-known machine learning algorithms, such as artificial neural networks and decision trees. LightGBM, using a state-of-the-art gradient boosting framework, consistently achieves higher accuracy within a shorter training cycle than machine learning algorithms like eXtreme Gradient Boosting (XGBoost)57. The gradient-based one-side sampling (GOSS) algorithm in LightGBM plays a role analogous to sample weighting in adaptive boosting (AdaBoost), which assigns greater weights to samples misclassified in previous iterations58. Because gradient boosting has no native sample weights, GOSS uses gradients instead: samples with large gradients contribute more information gain to the model. To maintain accuracy in evaluating information gain, GOSS retains the large-gradient samples and randomly samples the small-gradient samples at a specified proportion. LightGBM combines GOSS with the exclusive feature bundling (EFB) algorithm and histogram-based split finding, where EFB helps reduce the feature dimension of the dataset. This approach is robust to some outliers and can also simplify logistic regression models, reducing the risk of model overfitting.
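The sampling step of GOSS can be illustrated with a short sketch. It follows the procedure described in the LightGBM paper55: keep the top fraction a of samples by absolute gradient, draw a fraction b of all samples at random from the remainder, and up-weight the drawn samples by (1 − a)/b so that the gradient sum stays approximately unbiased. The function name, the default rates, and the toy gradients are illustrative, and this is not LightGBM's internal code.

```python
import random


def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return {sample index: weight} chosen by gradient-based one-side sampling."""
    n = len(gradients)
    # Rank samples from largest to smallest absolute gradient.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_k = int(a * n)
    rand_k = int(b * n)
    top, rest = order[:top_k], order[top_k:]
    sampled = random.Random(seed).sample(rest, rand_k)
    amplify = (1.0 - a) / b  # compensates for under-sampling small gradients
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights


# Toy gradients whose magnitude grows with the index (illustration only).
toy_gradients = [((-1) ** i) * i / 100.0 for i in range(100)]
selected = goss_sample(toy_gradients)
```

With a = 0.2 and b = 0.1, only 30% of the samples enter the split search, yet every large-gradient sample is retained.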
Nonlinear interpretability
While the LightGBM model excels in accuracy and generalization, it lacks the same interpretability as linear models. To address this issue, we used the Shapley Additive Explanation (SHAP) algorithm to determine the specific impact of each factor on accessibility.
Specifically, SHAP is a local interpretation method based on cooperative game theory59, and the corresponding Shapley values in machine learning can quantify the contribution of each feature to the whole model, providing both global and local explanations for advanced models based on aggregation results60. The calculation is as follows:
The main contribution of SHAP is generating local additive feature attributions, where the sum of all SHAP values equals the difference between the actual prediction value and the average prediction value. Additionally, the magnitude of SHAP values is used to measure feature importance, revealing their impact on model prediction capability61.
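The Shapley value equation is not reproduced in this text. The sketch below computes exact Shapley values by enumerating feature coalitions with the classic weight \(|S|!\,(n-|S|-1)!/n!\), and it checks the additivity property stated above. It is a didactic stand-in rather than the SHAP library's algorithm: features outside a coalition are set to a fixed baseline instead of being averaged over the data, and the toy model is hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their actual value, others the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # One additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
gap = toy_model(x) - toy_model(baseline)
```

The attributions sum to the gap between the prediction and the baseline prediction, and the interaction term is split equally between the two features involved. Enumeration costs grow exponentially with the number of features, which is why tree-specific algorithms are used in practice.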
Univariate attribution
The relationship between independent variables and urban public service accessibility involves interactions with other independent variables. Therefore, it is necessary to exclude these interactions during the further analysis.
In this calculation, \({SHAP}({X}_{{i}_{{jj}}})\) is the SHAP value of feature j after removing all interaction effects with the other independent variables.
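The equation is not reproduced in this text. In the standard SHAP interaction framework, the main effect of a feature is its SHAP value minus the sum of its pairwise interaction values with every other feature. Written with the notation above, where \(SHAP(X_{i_j})\) for the total SHAP value and \(SHAP(X_{i_{jk}})\) for the interaction between features j and k are assumed symbols, this reads:

```latex
\[
SHAP\!\left(X_{i_{jj}}\right) \;=\; SHAP\!\left(X_{i_{j}}\right)
  \;-\; \sum_{k \neq j} SHAP\!\left(X_{i_{jk}}\right)
\]
```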
Quadrant analyses
Quadrant analyses are widely used to identify areas of interest with specific characteristics62. To address issues of accessibility inequality for vulnerable migrant groups in the community, we attempted to quantitatively identify communities through quadrant division. For the coordinates of horizontal and vertical variables, we determined the threshold effect points (fitting a relative index (logarithmic) function equation to find zero points). As described in the main text, the quadrant analysis method was further introduced for the spatial scope of Beijing.
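The quadrant division can be sketched as a simple two-threshold classification. The sketch assumes that the two axes are housing price (AHP) and migrant ratio (MRR) and that the cut points are the threshold values reported in the abstract (80,590 yuan/m² and 32%); the function name, the labels, and the example communities are hypothetical.

```python
def quadrant(ahp, mrr, ahp_threshold=80590.0, mrr_threshold=0.32):
    """Assign a community to one of four quadrants by two threshold points."""
    low_price = ahp < ahp_threshold
    high_migrant = mrr > mrr_threshold
    if low_price and high_migrant:
        return "low price / high migrant"
    if low_price:
        return "low price / low migrant"
    if high_migrant:
        return "high price / high migrant"
    return "high price / low migrant"


# Hypothetical communities: (housing price in yuan per m2, migrant ratio).
communities = {
    "A": (60000.0, 0.45),
    "B": (60000.0, 0.10),
    "C": (120000.0, 0.45),
    "D": (120000.0, 0.10),
}
labels = {name: quadrant(ahp, mrr) for name, (ahp, mrr) in communities.items()}
```

Under these assumptions, the "low price / high migrant" quadrant would correspond to the communities where the abstract reports that inequalities rise sharply.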
As a representative of emerging megacities in developing countries, Beijing faces the major challenge of spatial inequality in public facilities63. According to China’s seventh population census, the permanent population of Beijing in 2020 was 21.893 million, of which 8.7178 million (39.82%) lived within the Fifth Ring Road, a population comparable to those of international metropolises such as New York and Seoul. Of the residents within the Fifth Ring Road, 2.7454 million (31.49%) were migrants. In China’s rapid urbanization process, high market housing prices and strict household registration systems have led to significant inequalities in accessing urban basic services across Beijing’s urban space and social group structures30. Therefore, analyzing the relationship between spatial inequalities and urbanization factors in Beijing based on finely spatialized demographic and social attribute data is both accurate and representative64.
Data availability
All the data used in this study are publicly available. The locations of community public services were sourced from the latest Amap POI data (https://lbs.amap.com/), covering six categories: educational resources (primary and middle schools), medical resources (grade A hospitals and general comprehensive hospitals), public spaces, and police stations. Community-scale social data on permanent residents, including total population, migrant population, and population by education level, were obtained from the Beijing Government’s Seventh Census Survey (https://tjj.beijing.gov.cn/). The market prices of community commodity housing were sourced from the Lianjia website (https://m.lianjia.com/), one of China’s most well-known online real estate platforms, which provides detailed housing information and timely market price data.
Code availability
Scripts used to generate the figures in this paper were developed in Python (3.11.5). These scripts and the necessary input files are available via GitHub at https://github.com/Songwm26/Spatial-Service.
References
Seto, K. C. et al. Sustainability in an urbanizing planet. Proc. Natl. Acad. Sci. USA 114, 8935–8938 (2017).
Bain, P. G. et al. Public views of the sustainable development goals across countries. Nat. Sustain. 2, 819–825 (2019).
Elmqvist, T. et al. Sustainability and resilience for transformation in the urban century. Nat. Sustain. 2, 267–273 (2019).
Sun, L. et al. Dramatic uneven urbanization of large cities throughout the world in recent decades. Nat. Commun. 11, 5366 (2020).
Derex, M. et al. Experimental evidence for the influence of group size on cultural complexity. Nature 503, 389–391 (2013).
Wang, Q. et al. Urban mobility and neighborhood isolation in America’s 50 largest cities. Proc. Natl. Acad. Sci. USA 115, 7735–7740 (2018).
Nilforoshan, H. et al. Human mobility networks reveal increased segregation in large cities. Nature 624, 586–592 (2023).
Arvidsson, M., Lovsjö, N. & Keuschnigg, M. Urban scaling laws arise from within-city inequalities. Nat. Hum. Behav. 7, 365–374 (2023).
Tóth, G. et al. Inequality is rising where social network segregation interacts with urban topology. Nat. Commun. 12, 1143 (2021).
Mackness, K., White, I. & Barrett, P. 20-minute neighborhoods: opportunities and challenges. Handb. Sustain. Sci. Future 1–22 (Springer, 2023).
Johnson, M. T. J. & Munshi-South, J. Evolution of life in urban environments. Science 358, eaam8327 (2017).
Giles-Corti, B. et al. City planning and population health: a global challenge. Lancet 388, 2912–2924 (2016).
Fisk, D. The urban challenge. Science 336, 1396–1397 (2012).
Weiss, D. J. et al. A global map of travel time to cities to assess inequalities in accessibility in 2015. Nature 553, 333–336 (2018).
Pandey, B., Brelsford, C. & Seto, K. C. Infrastructure inequality is a characteristic of urbanization. Proc. Natl. Acad. Sci. USA 119, e2119890119 (2022).
Different should not mean unequal. Nat. Cities 1, 175 (2024).
Smoyer-Tomic, K. E., Hewko, J. N. & Hodgson, M. J. Spatial accessibility and equity of playgrounds in Edmonton, Canada. Can. Geogr. 48, 287–302 (2004).
Chen, B., Liu, D. & Lu, M. City size, migration and urban inequality in China. China Econ. Rev. 51, 42–58 (2018).
Pandey, B., Brelsford, C. & Seto, K. C. Rising infrastructure inequalities accompany urbanization and economic development. Nat. Commun. 16, 1193 (2025).
Hansen, W. G. How accessibility shapes land use. J. Am. Inst. Plan. 25, 73–76 (1959).
Stokes, E. C. & Seto, K. C. Tradeoffs in environmental and equity gains from job accessibility. Proc. Natl. Acad. Sci. USA 115, E9773–E9781 (2018).
Braveman, P. Health disparities and health equity: concepts and measurement. Annu. Rev. Public Health 27, 167–194 (2006).
Tong, K. et al. Measuring social equity in urban energy use and interventions using fine-scale data. Proc. Natl. Acad. Sci. USA 118, e2023554118 (2021).
Gao, J. et al. A wavelet transform-based image segmentation method. Optik 208, 164123 (2020).
Brelsford, C. et al. Heterogeneity and scale of sustainable development in cities. Proc. Natl. Acad. Sci. USA 114, 8963–8968 (2017).
Bettencourt, L. M. A. et al. Growth, innovation, scaling, and the pace of life in cities. Proc. Natl. Acad. Sci. USA 104, 7301–7306 (2007).
Song, Y. What should economists know about the current Chinese hukou system? China Econ. Rev. 29, 200–212 (2014).
Ye, J. et al. Pursuing a brighter future: Impact of the Hukou reform on human capital investment in migrant children in China. China Econ. Rev. 85, 102160 (2024).
Lu, Y. L. et al. Inclusive green environment for all? An investigation of spatial access equity of urban green space and associated socioeconomic drivers in China. Landsc. Urban Plan. 241, 104926 (2024).
Fischer, T. Spatial inequality and housing in China. J. Urban Econ. 134, 103532 (2023).
Fan, C. et al. Equality of access and resilience in urban population-facility networks. npj Urban Sustain. 2, 9 (2022).
Giang, A. et al. Equity and modeling in sustainability science: Examples and opportunities throughout the process. Proc. Natl. Acad. Sci. USA 121, e2215688121 (2024).
Liao, Y. & Zhang, J. Hukou status, housing tenure choice and wealth accumulation in urban China. China Econ. Rev. 68, 101638 (2021).
Wu, X. & Zheng, B. Household registration, urban status attainment, and social stratification in China. Res. Soc. Stratification Mobil. 53, 40–49 (2018).
National Development and Reform Commission of the People’s Republic of China, National New-type Urbanization Plan (2014-2020) (Beijing: National Development and Reform Commission, 2014). Available at http://www.gov.cn/zhengce/2014-03/16/content_2640075.htm.
Cao, Y., Cao, K. & Kwan, M.-P. Measuring accessibility of hierarchical healthcare facilities from the spatio-sentimental perspective. Int. J. Geogr. Inf. Sci. 2025, 1–25 (2025).
Meng, M. et al. Impact of traveller information on mode choice behaviour. Proc. Inst. Civ. Eng. Transp. 171, 11–19 (2018).
Tong, D. et al. From proximity to quality: the capitalization of public facilities into housing prices. Ann. Am. Assoc. Geographers 113, 2435–2455 (2023).
Wang, X. et al. How does socioeconomic status influence social relations? A perspective from mobile phone data. Phys. A 615, 128612 (2023).
Wu, Y. et al. How do urban services facilities affect social segregation among people of different economic levels? A case study of Shenzhen city. Environ. Plann. B Urban Anal. City Sci. 50, 1502–1517 (2023).
Zhang, T. et al. Discovering income-economic segregation patterns: a residential-mobility embedding approach. Comput. Environ. Urban Syst. 90, 101709 (2021).
Zheng, Y. et al. Spatial planning of urban communities via deep reinforcement learning. Nat. Comput. Sci. 3, 748–762 (2023).
Batty, M. Digital twins in city planning. Nat. Comput. Sci. 4, 192–199 (2024).
Abbiasov, T. et al. The 15-minute city quantified using human mobility data. Nat. Hum. Behav. 8, 445–455 (2024).
Chen, X. & Jia, P. A comparative analysis of accessibility measures by the two-step floating catchment area (2SFCA) method. Int. J. Geogr. Inf. Sci. 33, 1739–1758 (2019).
Menaka, D., Padma Suresh, L. & Premkumar, S. S. Wavelet transform-based land cover classification of satellite images. In Artificial Intelligence and Evolutionary Algorithms in Engineering Systems: Proceedings of ICAEES 2014, Volume 2, 845–854 (Springer, 2015).
He, X. et al. The coordination relationship between urban development and urban life satisfaction in Chinese cities—An empirical analysis based on multi-source data. Cities 150, 105016 (2024).
He, X., Cao, Y. & Zhou, C. Evaluation of polycentric spatial structure in the urban agglomeration of the Pearl River Delta (PRD) based on multi-source big data fusion. Remote Sens. 13, 3639 (2021).
Pradhan, B., Jebur, M. N., Shafri, H. Z. M. & Tehrany, M. S. Data fusion technique using wavelet transform and Taguchi methods for automatic landslide detection from airborne laser scanning data and QuickBird satellite imagery. IEEE Trans. Geosci. Remote Sens. 54, 1610–1622 (2016).
Sun, L., Tang, L., Shao, G. & Qiu, Q. A machine learning-based classification system for urban built-up areas using multiple classifiers and data sources. Remote Sens. 12, 91 (2019).
Zhou, Y. et al. Satellite mapping of urban built-up heights reveals extreme infrastructure gaps and inequalities in the Global South. Proc. Natl. Acad. Sci. USA 119, e2214813119 (2022).
Williams, D. R., Lawrence, J. A. & Davis, B. A. Racism and health: evidence and needed research. Annu. Rev. Public Health 40, 105–125 (2019).
Nuru-Jeter, A. M. et al. Relative roles of race versus socioeconomic position in studies of health inequalities: a matter of interpretation. Annu. Rev. Public Health 39, 169–188 (2018).
Fraser, T. et al. How far I’ll go: Social infrastructure accessibility and proximity in urban neighborhoods. Landsc. Urban Plan. 241, 104922 (2024).
Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3149–3157 (2017).
Mienye, I. D. & Sun, Y. A survey of ensemble learning: Concepts, algorithms, applications, and prospects. IEEE Access 10, 99129–99149 (2022).
Zhong, J. et al. Robust prediction of hourly PM2.5 from meteorological data using LightGBM. Natl. Sci. Rev. 8, nwaa307 (2021).
Meng, Y. et al. What makes an online review more helpful: an interpretation framework using XGBoost and SHAP values. J. Theor. Appl. Electron. Commer. Res. 16, 466–490 (2020).
Halpern, R. et al. Redlining and a child’s chance of surviving the first year of life. Proc. Natl. Acad. Sci. USA 120, e2221505120 (2023).
Lundberg, S. M. & Lee, S. I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
Molnar, C., Casalicchio, G. & Bischl, B. Interpretable machine learning—a brief history, state-of-the-art and challenges. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases 417–431 (Springer, 2020).
Chen, M., Liu, W. & Tao, X. Evolution and assessment on China’s urbanization 1960–2010: Under-urbanization or over-urbanization? Habitat Int. 38, 25–33 (2013).
Qi, Y. et al. Decade-long changes in spatial mismatch in Beijing, China: are disadvantaged populations better or worse off? Environ. Plan. A 50, 848–868 (2018).
Giannotti, M. et al. Inequalities in transit accessibility: contributions from a comparative study between Global South and North metropolitan regions. Cities 109, 103016 (2021).
Acknowledgements
This study was supported by the National Natural Science Foundation of China (No. 42121001, 42171204, 42450273).
Author information
Contributions
M.C. designed research; M.C. and W.S. performed research; W.S. and L.C. contributed new reagents/analytic tools; W.S. and H.J. analyzed data; M.C., W.S., Y.Y., and H.J. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, M., Song, W., Yang, Y. et al. Spatially compounding effects of cumulation and thresholds amplify urban inequality in megacities. npj Urban Sustain 6, 5 (2026). https://doi.org/10.1038/s42949-025-00312-x
DOI: https://doi.org/10.1038/s42949-025-00312-x






