Abstract
Sustainable development is an imperative worldwide1,2,3 but metrics and data on poverty and quality of life have remained too coarse and abstract to characterize challenges adequately and guide practical progress4,5. Nowhere is this challenge greater than in Africa4,5,6, where we still know little about the spatial details of development3,7,8,9. Here we leverage a comprehensive, high-precision dataset of building footprints to identify infrastructure deficits and infer informal settlements down to the street block level10,11,12 everywhere in sub-Saharan Africa. We identify a general pattern of informality with cities showing, on average, greater access to infrastructure and services than rural and peri-urban areas. We show that such patterns of informality are characterized by consistent statistical distributions reflecting uneven local development2,13,14. We also show that these physical measures of informality are systematically associated with many indicators of human deprivation, which form a single principal component co-varying predictably with specific changes in street access to buildings. These results demonstrate that the localization of sustainable development is possible down to the street level at a continental scale and provide a general distributed strategy for accelerating progress in infrastructure and service expansion that taps local innovations in systematic, equitable and context-appropriate ways7,11,12,15.
Main
The critical need for sustainable development has come into sharp focus recently, along with the ambition to address associated challenges all over the world in the decades ahead1 within planetary boundaries16. These imperatives have led to landmark international agreements to address climate change and achieve specific Sustainable Development Goals, such as the worldwide eradication of extreme poverty5. However, practical progress towards these goals and the integrated science and technology necessary to support them has remained slow. Two interlocking factors contribute to this situation. First, there is a mismatch of scales between commitments made by national governments and processes of human development, which take place primarily at smaller scales in human settlements and their local communities3,5,7,17,18. Second, at these smaller geographic scales, and especially in lower-income nations, there is a lack of comparable, standardized and sufficiently rich data4,5,19,20,21, leading to calls for innovations towards their generation and analysis4,7,9,20,22.
In contrast, there has been significant progress in the scientific understanding of cities and neighbourhoods2. These environments not only house a growing majority of the world’s population but also promote systemic societal change2,3. Specifically, the process of urbanization is associated with long-run improvements in many interlocking facets of human development, including higher real personal incomes, greater access to healthcare and education, and expanded public services2,3,13,23. This is owing to urban network (agglomeration) effects in socioeconomic activities, which increase their productivity and complexity and accelerate their outputs with population size, and to economies of scale in infrastructure and service delivery characteristic of the denser built environments of cities2,24,25. Because of these general effects, it is easier, faster and more productive to extend socioeconomic opportunities, infrastructure and services to populations in a larger city than to create them in the first place in small towns and rural areas26. As a result, rapid urbanization often induces increased rural-to-urban migration and results in a general pattern of infrastructure and service delivery spreading along the urban hierarchy, from larger cities to smaller settlements3,27. This dynamical pattern of development implies a transition over space and time, where signatures of higher human development are nucleated unevenly, with higher probability in better connected central locations in larger cities, and spread from there, eventually, to all constituent local communities and other less-urbanized regions. Here we show that this pattern of development is also characteristic of sub-Saharan Africa and a feature of its informal settlements3,28.
Reflecting these uneven dynamics, fast urbanization often becomes associated with greater inequalities not only between larger cities and rural areas but also on smaller scales, between neighbourhoods within each settlement3,18,27. The most critical of these inequalities, because it entails many others, is the ‘challenge of slums’ (or informal settlements)4,5,7,11,28. These are neighbourhoods resulting from land settlement without coordinated infrastructure or legal frameworks26,29. As a result, informal settlements are typically associated with multidimensional poverty, lack of basic services and insecure land tenure4,5,14,15. In 2003, the United Nations declared slums “the face of 21st century urbanization”, motivating the first global studies and the estimate of about 1 billion people living in informal settlements worldwide. To address this problem, slum reduction and eventual eradication became central to the United Nations Millennium Development Goals (goal 7) and the Sustainable Development Goals target 11.1. Current assessments confirm that we are late to meet these goals and estimate that 1.1 billion people now live in informal settlements worldwide.
The present work started in support of data collections by local communities to improve self-declared slums5,15,30. In collaboration with federations of non-governmental organizations and local communities, initial efforts developed a set of standardized surveys and maps of buildings and public services in thousands of neighbourhoods in 18 countries and 224 cities2,30. As these and other methodologies—including remote sensing14,21,31,32,33,34,35, mobile phone records36 and crowd-sourced mapping12,37—continued to improve, an essential spatial typology of informal settlements emerged defined by buildings without street access. Lack of street access to residences and workplaces is the common mediator of many physical and socioeconomic deficits, including lack of official addresses and associated socioeconomic stigma, lack of emergency services (fire protection and ambulance), and disconnection from basic services, especially water and sanitation, which are piped along public ways5,11,15,26,38 (Supplementary Discussion). Recent empirical studies of informal settlements26,30,39 show that the physical mismatch between buildings and street networks violates the basic principles of urban built environments2,11. This violation entails a severe cost–benefit trade-off for slum residents, between the possibility of settlement and obtaining only incipient access to urban network effects. Because this situation is costly to both residents and their societies, it must eventually be resolved through the expansion of institutions and infrastructure, which requires precise and careful spatial localization9,26.
Another difficulty lies in interpreting the function of built environments from data on buildings and street networks. Over the past few years, topological methods capturing the detailed relational nature of buildings to streets, regardless of specific geometry, have been proposed, empirically tested and implemented computationally. These advances now allow us to identify and characterize each street block over vast regions of the world2,10,11,12. Here we use a complete dataset of building footprints for sub-Saharan Africa to take this analysis to a continental scale. The data consist of 9.8 million blocks, containing >415 million buildings in 50 nations and 2,190 urban areas (Extended Data Table 1), characterizing the living environments of 1.152 billion people across economically, culturally and geographically diverse settings28,29. Along with street access, we also characterize each block in terms of its number of buildings, their sizes and spatial densities, and estimate its resident population by downscaling worldwide raster maps to local street block geometries19 (Methods and Supplementary Methods). This procedure creates a standardized and internationally comparable dataset supporting the localization of sustainable development metrics and population at the block level, filling a substantial empirical gap, especially in Africa22,28,29. We use this evidence to estimate informal settlements across sub-Saharan Africa and to quantify general patterns of human development connecting infrastructure access to social services and human capabilities, including measures of health, education and income.
Measuring infrastructure access and informality
An informal settlement (slum) is defined by the United Nations as “a settlement in which the majority of households experience one or more of the following deprivations: lack of secure tenure, lack of access to improved water sources, lack of improved sanitation facilities, insufficient living space, poor structural durability of the dwelling”. Except at the extremes, these properties are hard to measure both in terms of access to the relevant information and because of inherent ambiguities, leading to difficulties in classifying neighbourhoods as slums versus non-slums5,28,40. A similar problem affects studies that attempt a binary classification of neighbourhoods using machine learning applied to satellite and aerial imagery14,41. Some surveys evade these difficulties by relying on community self-identification and assessing the residents’ experience of existing services and living conditions3,5,30,38,42. However, such surveys are not extensive.
Inspired by work co-producing mapping and socioeconomic data with local organizations, we created a simple criterion for identifying informality, which can be measured objectively and has deep roots in urban science: the lack of street access to buildings. This quantity is of fundamental and practical interest because it is based on how cities are assembled as self-consistent physical and socioeconomic networks2,25. Water, electricity, drainage and sanitation are delivered to each building via adjacent streets, together with emergency services. Addresses are also codified along municipal streets, enabling many forms of socioeconomic recognition and access, including rights and obligations of land tenure. For all these reasons, street access to each place of work and residence is not only a simple necessary condition for development but also a strategic policy solution with many co-benefits26 (Supplementary Discussion).
Physical connection to buildings is a local feature of street networks. Because it is a relational quantity, it characterizes the spatial organization of cities regardless of their geometry11. For example, it is independent of whether street plans are curvy or gridded and of block or buildings’ sizes. These features make the identification of street access to buildings a mathematical problem in topology2,11. The relevant relation is the unobstructed adjacency of buildings to street networks and to each other, which is captured quantitatively by a block graph (Fig. 1, Extended Data Fig. 1 and Supplementary Methods). The block graph is a network of buildings as nodes and their spatial adjacency as edges. Network analysis of each block graph identifies every building’s access level as the length along the shortest path to any node at the street boundary, shown by different colours in Fig. 1 and Extended Data Fig. 1. The maximum number of parcels to be crossed characterizes the block complexity, k, for the least-accessible building. This metric is simple and intuitive because it reflects the personal experience of residents and visitors to buildings in any street block. It helps urban planners develop solutions that address each household or business and to know objectively who has been left behind. The approach is also very efficient because it decomposes the complex (and seemingly intractable) geography of nations and cities into many independent blocks, which can be analysed in parallel.
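As an illustration only, the block-graph analysis described above can be sketched as a multi-source breadth-first search. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the dictionary-based graph format and the convention that street-facing buildings have access level 1 (chosen so that a block with universal direct access has k = 1) are all assumptions made for this example.

```python
from collections import deque


def block_complexity(adjacency, street_facing):
    """Return (access_levels, k) for a single block graph.

    adjacency: dict mapping each building id to a list of adjacent building ids.
    street_facing: ids of buildings whose parcel touches the street boundary.
    Assumed convention: street-facing buildings have access level 1.
    """
    levels = {b: 1 for b in street_facing}
    queue = deque(street_facing)
    # Multi-source BFS: each step inward crosses one more parcel.
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in levels:
                levels[neighbour] = levels[current] + 1
                queue.append(neighbour)
    unreachable = set(adjacency) - set(levels)
    if unreachable:
        raise ValueError(f"buildings not connected to any street: {unreachable}")
    # k is the access level of the least-accessible building in the block.
    return levels, max(levels.values())


def grid_block(n):
    """Hypothetical test block: an n x n grid of parcels enclosed by streets."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    adjacency = {
        (r, c): [(r + dr, c + dc) for dr, dc in steps
                 if 0 <= r + dr < n and 0 <= c + dc < n]
        for r, c in cells
    }
    street = [(r, c) for r, c in cells if r in (0, n - 1) or c in (0, n - 1)]
    return adjacency, street


levels, k = block_complexity(*grid_block(5))
print(k)  # 3: the central parcel sits behind two rings of buildings
```

Because each block is processed independently, a function of this kind can be mapped over blocks in parallel, which is the property the text highlights.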
a, The decomposition of the subcontinent into nations, cities and rural areas. b,c, Lusaka (Zambia; b) with the rectangular highlight shown in c. A more detailed region, with constituent blocks, is shown in e. c–f, We observe a large variety of block types and shapes, delimited by streets shown in white. Building footprints are black polygons visible in d and f along with a land parcel identified around each building (thin white lines). These parcels form a block graph expressing their spatial relationships (Supplementary Methods and Extended Data Fig. 1). Block graph analysis reveals how far each building is from the street network. The block complexity k is the number of building layers away from the street network for the most inaccessible building. It measures the difficulty of extending infrastructure and services to every building within the block (colour bar). The block in d is classified as informal by this measure (k = 8), showing also the accessibility level of each building in colour. The block in f is formal (k = 3). See https://www.millionneighborhoods.africa for an interactive map: zoom in and toggle to ‘Satellite Map’ to visualize underlying buildings and street networks and assess the quality of data and metrics for each block. Panels a and b, made with Natural Earth at naturalearthdata.com (using rnaturalearth). Panels c–f, made with OpenStreetMap at openstreetmap.org.
Recent work developed the mathematical and algorithmic methods to perform this analysis in general block geometries10,11 but remained limited to small scales by data quality and computational efficiency. Here we extend and improve these methods and apply them to systematically constructed datasets of building footprints, street networks and population for the whole of sub-Saharan Africa (see Supplementary Methods and Supplementary Notes for data-quality assessments and limitations). Extended Data Table 1 provides summary statistics, showing that the average block complexity across sub-Saharan Africa is k = 8.
Different ranges of k entail very different living conditions. Small values, k = 1 and k = 2, denote universally accessible (or ‘planned’) city blocks, where every building has street access either directly or via an easement between buildings. Values k = 3 and k = 4 are less accessible but can result from longer driveways and non-residential backyard buildings. Progressively higher k values characterize more severe infrastructure deficits and impeded mobility, almost always associated with informal settlements (Fig. 1). This shows how k has an important role in urban planning: it is a quantitative objective function that signals whether expanding the access infrastructure is necessary in each block and, if so, when sufficient new accesses have been provided10,11. The block complexity has another interpretation as a block’s topological radius. It measures the number of building layers between the most inaccessible building and the closest street, so that the block’s land area scales as \(A_B \approx k^2/n_b\), where \(n_b\) is the local density of buildings (Extended Data Fig. 1f,g). Below, we use k to characterize informal settlements because it provides a necessary condition for universal access to infrastructure, whereas the mean or median access levels can hide deficits to specific buildings. Some false positives (blocks with large k that are not informal settlements) are found occasionally for institutional campuses such as colleges, hospitals or airports, but are rare and easily identified by place names, building shapes and sizes.
Figure 1 illustrates the general procedure. Using building footprints and street network data along with population raster maps (at larger scales), we identify each city block as a geo-referenced polygon delimited by streets and other boundaries. This procedure is general and can be applied worldwide to produce uniquely identifiable block-level spatial units, similar to the systems used to collect and report local census data. Figure 1a shows the decomposition of sub-Saharan Africa into nations and human settlements (cities and towns), with the city of Lusaka (Zambia) highlighted and expanded in Fig. 1b. The rectangular box is then expanded in Fig. 1c, to reveal its decomposition in terms of blocks. The colours show each block’s k for a community area known as George. Blocks are identified as informal settlements by high block complexity (orange and yellow, Fig. 1d,e), as is known from local surveys. Other blocks are formal in the sense that their buildings have direct street access and correspondingly low k (Fig. 1f). Informal settlement blocks are often larger and show clear lack of street access to buildings (black polygons, Fig. 1d). Extended Data Table 1 shows that, on average, buildings in blocks with greater complexity are smaller and more prevalent in peri-urban and rural areas. Dead-end streets are common in informal settlements10,11 (Fig. 1d).
In addition, we produce estimates of the block ambient (resident and working) population. To do this, we projected population estimates from two raster datasets in wide use—LandScan43 and WorldPop8—to each block using building footprint area (see ‘Block population estimation’ in Supplementary Methods). In this way, we can quantify variations in local population size, density and building area, assessing crowding as population per building area. Comparing differences in block population estimates from the two data sources provides measures of statistical error and uncertainty21. The final ingredient of our analysis is the aggregation of blocks into regions and their characterization into land-use categories: urban, secondary urban, peri-urban and rural. Urban denotes a block’s inclusion in a city or town defined by the Global Human Settlements Layer (GHSL), which provides internationally comparable urban area definitions (see ‘Defining urban, peri-urban, conurban areas’ in Supplementary Methods for discussion and other definitions). Because urban delineations are partly based on population density thresholds and spatial morphology, they probably underestimate actual urban extent at low densities in adjacent areas. To investigate this issue, we characterize blocks in the commutable peripheries of urban areas as peri-urban, defined as a 10-km buffer beyond the GHSL boundaries. Secondary urban areas are smaller settlements, which become associated with larger cities within this buffer and together form conurbations. Regions outside these three types are classified as non-urban or rural.
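The projection of raster population onto blocks can be illustrated with a simple proportional allocation. The sketch below is an assumption-laden example, not the procedure documented in the Supplementary Methods: it assumes that the population of one raster cell is split across the blocks it covers in proportion to building footprint area, and all function names and numbers are hypothetical.

```python
def downscale_to_blocks(cell_population, footprint_area_by_block):
    """Split one raster cell's population across blocks by footprint-area share.

    footprint_area_by_block: dict of block id -> total building footprint area (m2).
    Assumption: population is proportional to footprint area within the cell.
    """
    total_area = sum(footprint_area_by_block.values())
    if total_area <= 0:
        # No buildings mapped in this cell: nothing to allocate in this sketch.
        return {block: 0.0 for block in footprint_area_by_block}
    return {
        block: cell_population * area / total_area
        for block, area in footprint_area_by_block.items()
    }


def crowding(population, footprint_area):
    """Crowding as population per unit of building footprint area (people per m2)."""
    if footprint_area <= 0:
        raise ValueError("footprint area must be positive")
    return population / footprint_area


def relative_difference(estimate_a, estimate_b):
    """Symmetric relative difference between two population estimates."""
    mean = (estimate_a + estimate_b) / 2
    return abs(estimate_a - estimate_b) / mean if mean > 0 else 0.0


# Hypothetical inputs: two blocks in one cell, and two raster sources that disagree.
footprints = {"block_A": 3000.0, "block_B": 1000.0}
source_1 = downscale_to_blocks(1200.0, footprints)
source_2 = downscale_to_blocks(1000.0, footprints)

print(source_1)  # {'block_A': 900.0, 'block_B': 300.0}
print(crowding(source_1["block_A"], footprints["block_A"]))
print(relative_difference(source_1["block_A"], source_2["block_A"]))
```

Comparing the two allocations block by block gives a simple spread measure of the kind the text describes when it contrasts the LandScan and WorldPop estimates.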
The statistics of informal settlements
We now characterize the statistics of infrastructure deficits and inferred residential informality across scales. Figure 2 shows the distribution of the sub-Saharan African population as a function of k and settlement type. This histogram, along with its type components, is well fit by a log-normal distribution (Extended Data Fig. 2). This is a common distribution for urban socioeconomic data such as income, which arises as the result of multiplicative processes2 (Fig. 2a). Figure 2b shows the decomposition by settlement type, indicating that the majority of the population of sub-Saharan Africa remains rural (56%), consistent with the urbanization rate for the subcontinent estimated by the UN Population Division (42%).
a, Histogram of total population in blocks with different k and levels of urbanization. We see that extreme infrastructure deficits are relatively rare in cities (blue) and occur almost exclusively in urban peripheries (dark green) and rural areas (light green). This normalized histogram is well fit by a log-normal distribution (\(\langle \ln k\rangle = 1.74\), \(\sigma_{\ln k} = 0.81\)) as are its subcomponents by settlement type and in every city (Extended Data Fig. 2). b, The total population of sub-Saharan Africa by urbanization type, showing that the majority remains rural (56%). c, Total population by levels of block complexity, k, showing that about 0.55 billion live in blocks with strong street access deficits (k > 5), of which 74% is rural.
Transitional peri-urban areas, probably resulting from urban expansion, contain a total population around 200 million people, comparable to urban cores9. Figure 2c shows the distribution of population totals across different levels of k, showing that about half of the population of sub-Saharan Africa lives in blocks with substantial or extreme lack of infrastructure access. This is over half a billion people (0.55 billion, k > 5), of which the majority (0.41 billion, or 74%) are rural. These numbers change with the chosen threshold k, resulting in totals of 0.65 billion for k > 4 (72% rural) and 0.47 billion for k > 6 (77% rural). These are new estimates for the total population of sub-Saharan Africa living in informal settlements, but note that the majority is rural.
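To make the log-normal description concrete, the following sketch computes summary statistics and threshold shares from the fitted parameters quoted in the Fig. 2 caption (mean of ln k = 1.74, standard deviation of ln k = 0.81). It treats k as a continuous variable, which is an approximation because k is an integer, so the shares it returns are only indicative and are not expected to reproduce the block-level population totals reported in the text. The function names are hypothetical.

```python
import math

MU_LN_K = 1.74     # mean of ln k, from the Fig. 2 caption
SIGMA_LN_K = 0.81  # standard deviation of ln k, from the Fig. 2 caption


def lognormal_summary(mu, sigma):
    """Median, mean and mode of a log-normal with log-mean mu and log-sd sigma."""
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma ** 2 / 2),
        "mode": math.exp(mu - sigma ** 2),
    }


def share_above(threshold, mu, sigma):
    """Share of the distribution above a threshold (continuous approximation)."""
    z = (math.log(threshold) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


summary = lognormal_summary(MU_LN_K, SIGMA_LN_K)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

for threshold in (4, 5, 6):
    share = share_above(threshold, MU_LN_K, SIGMA_LN_K)
    print(f"continuous share with k > {threshold}: {share:.2f}")
```

The gap between mode, median and mean in the output illustrates the long right tail of high-k blocks that the text describes.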
Besides calculating aggregate population totals, the distinctive strength of our approach lies in the localization of informality at the block level and providing an objective spatial measure of infrastructure deficits to estimate its severity. As such, our findings are more positive than large-scale estimates suggest. A takeaway from these results is that they do not support a simple dichotomy of neighbourhoods classified as either slums or non-slums. Instead, we observe a broad spectrum of access deprivation, with the most common neighbourhoods in Fig. 2a (median k = 3) having (almost) complete infrastructure access to each building. There is, however, also a long tail of neighbourhoods whose buildings are much less accessible, including many extreme cases at k > 6. Our results show that blocks with very high k are quite rare in central urban areas (Extended Data Fig. 3), although there are clearly some well-known cases14. This finding has the important consequence that many of the assumed characteristics of slums—being urban, high population densities and crowding—are not typical of the general situation of informality in sub-Saharan Africa, which is much more rural and also peri-urban and at low density9,44 (Extended Data Fig. 3). Surprisingly, crowding is much more likely in rural areas where most buildings are small (median area 20 m2), despite very low population densities at the larger scales of blocks and regions (Extended Data Table 1).
Because blocks tile the entire territory, this analysis can be produced for any aggregate scale such as cities, regions and nations. Figure 3 shows the spectrum of block complexity for each of the subcontinent’s largest urban areas. Figure 3a shows urban areas sorted by population size with Lagos (Nigeria) being the largest in the subcontinent with about 22.5 million people. (Lagos is projected to grow to about 80 million by the end of the century45). Figure 3b shows cities sorted instead by higher levels of access deficits. We observe that larger African cities actually outperform their rural areas in infrastructure access (Extended Data Figs. 3 and 4), but none has completely addressed the challenge of slums. Some cities such as Antananarivo (Madagascar) remain mostly informal, whereas others such as Dakar (Senegal) or Cape Town (South Africa) have provisioned more extensive street access, although many clearly identifiable spatial pockets remain (see interactive map). Extended Data Fig. 2 supports the generality of the underlying statistics, showing that the variation of k for each city is well characterized by standard statistical distributions, especially the log-normal. Supplementary Table 1 provides best-fit distributions for all nations and major cities in sub-Saharan Africa.
a, Cities ranked by population showing, for example, that Lagos (Nigeria) probably already provides infrastructure access to >65% of its population (k ≤ 3). b, Cities ranked by higher levels of infrastructure deficits, k. Antananarivo (Madagascar) stands out as the subcontinent’s most informal major city, but cities such as Dar es Salaam (Tanzania) or Kampala (Uganda) also present significant challenges. Cities with the highest levels of access provision include Dakar (Senegal) and the cities of South Africa (Supplementary Notes). Extended Data Fig. 2 illustrates fits to standard statistical distributions and Supplementary Table 1 provides best-fit parameters for all major cities in sub-Saharan Africa. AGO, Angola; BFA, Burkina Faso; CIV, Côte d’Ivoire; CMR, Cameroon; COD, Democratic Republic of the Congo; COG, Congo; ETH, Ethiopia; GHA, Ghana; GIN, Guinea; MLI, Mali; MOZ, Mozambique; NGA, Nigeria; KEN, Kenya; SDN, Sudan; SEN, Senegal; SOM, Somalia; TGO, Togo; TZA, Tanzania; UGA, Uganda; ZAF, South Africa; ZMB, Zambia.
Infrastructure deficits are more severe for conurbations (Extended Data Fig. 4a,b), because these enlarged city definitions include peri-urban areas where lower-density informal settlements are common9. Aggregating to the national scale (Extended Data Fig. 4c,d) confirms that the least-accessible blocks and the majority of the population experiencing access deprivation are rural but also identifies nations with substantially less access, such as Chad, Madagascar, Mozambique or South Sudan (see Supplementary Notes for uncertainty assessments).
Links between street access and human development
We now demonstrate that block complexity has a deeper functional meaning, not only expressing spatial access deprivation but also entailing many dimensions of low human development5,14,28. This includes direct physical issues, such as lack of piped water and sanitation, but also many socioeconomic characteristics such as lower education and wealth, and worse health.
At present, there are no standard local data collections for human development indicators across sub-Saharan Africa22,28. The Demographic and Health Surveys (DHS) programme, supported by international agencies in collaboration with national statistics, partially fills this gap (Supplementary Methods). These surveys are less extensive than our block metrics and, moreover, lack spatial precision. To compare the two types of evidence, we aggregated block data to subnational administrative divisions made available in the DHS. The spatial scale of Fig. 1b shows that this aggregation mixes heterogeneous neighbourhoods with significantly different characters3,27, but we will demonstrate that correlations remain generally consistent and significant.
Using these spatially aggregated data, we performed statistical analyses to establish the link between a large set of socioeconomic development measures and spatial access deprivation measured by block complexity (Fig. 4, Supplementary Table 2 and Extended Data Figs. 2–5). First, we correlated the variation of 67 different dimensions of human development with changes in block complexity (Supplementary Table 2). We grouped these metrics into thematic groups including direct estimates of economic well-being, education and literacy, health, basic services, and household characteristics, including quality of housing and crowding. All these variables show systematic and significant correlations (P < 0.001) with block complexity across the subcontinent. For example, urban slum populations estimated at the national level are strongly correlated with higher average block complexity (Spearman’s ρ = 0.62). Child mortality rate (ρ = 0.49), underweight (ρ = 0.66) and stunting (ρ = 0.58) all increase with larger k, whereas the fraction of live births in health facilities decreases (ρ = −0.58). Higher block complexity is negatively correlated with measures of education at all levels (primary, secondary and higher, ρ = −0.53, ρ = −0.61 and ρ = −0.69, respectively) and median years of education. Higher spatial access deprivation is also associated with lower female literacy (ρ = −0.57; Fig. 4a), although there are some exceptions such as high-k regions of rural Namibia, a nation well known for its National Literacy Program, showing the potential for policy. Measures of access across all basic services, as expected, show a consistent pattern of increasing deficits with larger k, as does decreasing quality of housing, measured by earth or sand and ‘natural’ floors (ρ = −0.61). Measurements of crowding and wealth support this general picture associating multidimensional deprivations with block complexity, but the pattern of correlations adds interesting detail.
For example, the fraction of households with one or two persons per sleeping room (no crowding) decreases with larger k, whereas the fraction with larger numbers increases, as does the average number of persons per sleeping room (ρ = 0.42). Parallel to these findings, the fraction of households in the 2 highest wealth quintiles decreases with k (ρ = −0.49), whereas it increases in the lower 3 quintiles, especially the lowest (ρ = 0.43). This also shows that informal settlements are not simply associated with extreme poverty, as is known from local surveys26. We also find that greater wealth inequality, measured by regional Gini coefficients, is associated with larger spatial access deprivation. Measures of economic wealth by consumption goods21,32,36 follow a similar pattern, with households with a refrigerator, private car, computer, television or mobile phone (ρ = −0.67, ρ = −0.65, ρ = −0.61, ρ = −0.71 and ρ = −0.44, respectively) decreasing with k, the last showing a weaker negative correlation. We have also found that statistical associations between k and wealth predictions inferred from machine learning (trained on DHS data and other inputs32,46) are strong and suggestive of additional predictive power relative to current inputs to these models including cell phones and night lights (Supplementary Notes). All these statistical relationships are stable across nations and robust to the inclusion of control variables. These relationships become stronger (higher R²) when treated at the national level via country fixed effects (R² = 0.61), and stronger still when controlled for settlement type via the share of population living in urban areas in each region (R² = 0.82; Extended Data Table 2).
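For readers unfamiliar with the rank correlations quoted above, the sketch below shows how a Spearman coefficient between regional block complexity and a development indicator could be computed. It is a generic textbook implementation (Pearson correlation of average ranks), not the authors' analysis code, and the regional values are invented purely for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for index in order[i:j + 1]:
            ranks[index] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical regional data (not from the paper): mean k and female literacy share.
mean_k = [2.1, 3.4, 5.0, 6.2, 8.9]
female_literacy = [0.90, 0.85, 0.60, 0.65, 0.30]
print(round(spearman_rho(mean_k, female_literacy), 2))  # -0.9
```

A rank-based coefficient is a natural choice here because it only assumes a monotonic relationship between k and each indicator, not a linear one.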
a–d, Measures of service provision, education and wealth are mutually correlated in demographic and health surveys and well characterized by the first component in a principal component analysis (PCA) of 67 distinct variables (Supplementary Table 2 and Extended Data Fig. 2). The lines show the variation of the principal component (PC) with block complexity k at the regional level across nations and urban areas. The relationship is negative (colour scale), showing that higher k (and inferred informality) is systematically associated with lower female literacy (5.3% reduction per k; a), lower access to a water source on premises (2.8% reduction per k; b), lower access to an improved sanitation facility (4.5% reduction per k; c) and with lower wealth (1.5% reduction per k; d). Extended Data Fig. 5 shows a larger set of human development outcomes. Extended Data Fig. 6 shows fit residuals.
Because we observe that diverse measures of human development are mutually correlated, in agreement with studies in other local contexts3,18,23, we also characterized their joint variation with k. To do this, we performed a principal component analysis across regions and identified the leading collective dimensions of variation. The first principal component explains the largest share of the variation (48%; Extended Data Table 2). Figure 4 illustrates this behaviour, showing the variation of four different socioeconomic variables, specifically female literacy, access to water, sanitation and household wealth (see Extended Data Fig. 5 for more quantities). These results also show that denser (and more urban) regions perform better on average in all dimensions of human development relative to less dense and more rural regions13,44. Extended Data Table 2 shows how country fixed effects account for more variation, but that other factors such as road density, building size, building-to-land-area ratio and population density are not statistically significant (P > 0.25) once block complexity and share of urban population are factored in. Extended Data Fig. 6 shows the residuals from this analysis, pointing out regions over-performing and under-performing the model’s predictions.
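The idea of a single leading component summarizing many correlated indicators can be sketched as follows. This example extracts only the first principal component of standardized variables by power iteration on their correlation matrix, using the standard library alone. It is a simplified stand-in for a full principal component analysis, and the function name, the seeded random start vector and the toy data are assumptions made for this illustration.

```python
import math
import random


def first_principal_component(rows, iterations=500, seed=0):
    """Leading PC of standardized variables via power iteration.

    rows: list of observations (regions), each a list of p variable values.
    Assumes every variable has non-zero variance across observations.
    Returns (loadings, explained_share, scores); the overall sign is arbitrary.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the standardized variables.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    explained_share = eigenvalue / p  # the trace of a correlation matrix is p
    scores = [sum(zr[j] * v[j] for j in range(p)) for zr in z]
    return v, explained_share, scores


# Toy data (hypothetical): three indicators driven by one underlying factor.
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_rows = [[f, 2 * f + 1, -f] for f in factor]
loadings, share, scores = first_principal_component(toy_rows)
print(round(share, 3))  # 1.0, because the toy indicators are perfectly collinear
```

With real survey indicators the explained share would be below one, and the per-region scores returned by such a function are what would then be compared against block complexity.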
We conclude that there is a clear and systematic pattern of multidimensional development across scales with greater local street access and urbanization associated with expanded socioeconomic opportunity, education, better health and improved access to all basic services. These associations are probably the result of multiple pathways, including the direct effects of infrastructure access deprivation on human development and also the sorting of disadvantaged population into less well-serviced areas (Supplementary Discussion). The relatively larger multidimensional deprivations in rural and peri-urban areas speak to the necessity to extend existing physical and socioeconomic access to these territories9,29.
Empirical tests and policy implications
We have shown how fundamental concepts from urban science translated into advanced network analysis methods can be applied to high-precision spatial data to quantify detailed physical access deficits, which in turn underlie an array of deprivations associated with low human development, from basic services to health and socioeconomic opportunity. We demonstrated that blocks with severe lack of street access to buildings can be identified as informal settlements, leading to an objective method for producing localized street-level estimates of populations living in poverty21,32,36, the main target of Sustainable Development Goals 1.1 and 11.15. Furthermore, we ground-truthed this measure with community mapping of self-declared slums across nine cities (Extended Data Fig. 7), and predicted additional informal settlements to be surveyed. High values of k are also robustly associated with multidimensional outcomes of disadvantage as measured by the best available subnational demographic and health surveys (Supplementary Discussion and Supplementary Notes).
The present study focuses on sub-Saharan Africa but we provide open-source software to apply these methods elsewhere. Currently, analysing the entire subcontinent requires high-performance computing, but smaller-scale analyses (blocks or local areas) are fast using desktop computing (see ‘Data availability’ and ‘Code availability’). It will be important to update the current results as the built environment of sub-Saharan Africa changes and continues to be better mapped, and to extend the analysis to other world regions, especially to Asia and in the Americas where fast urbanization remains markedly informal but comprehensive local characterizations are still lacking. It will also be critical to monitor improvements in infrastructure delivery over time to track localized Sustainable Development Goals along with other metrics of human development and climate risk.
There are limitations to the present results related to data quality and completeness that we expect will be fully mitigated as data collections and mapping continue to improve. At present, building footprints are identified from aerial and satellite images using manual tracing, edge detection algorithms, machine learning and quality controls by human operators (Supplementary Methods). This process achieves very good results, verifiable against high-precision remote-sensing images (available online at millionneighborhoods.africa). Outstanding issues are that small structures, probably non-residential, are not always well captured by these methods and that footprint placement varies in accuracy at about a metre scale, which is insufficiently precise for surveying and cadastral mapping. Further improvements in data and analysis will require more precise building extraction from higher-resolution images, including detailed building shapes and textures, street mapping, and geospatial placement. It will also need other important contextual information such as building types, height, materials and services, all of which speak to local quality of life and human development issues5,10. These improvements are already routinely achieved by aerial (drone) mapping coupled to building shape extraction using machine learning or tracing, followed by context-sensitive ground surveys. However, such methods cannot yet be deployed at a continental scale. Street network data have improved markedly in accuracy and completeness over the past few years owing to contributions from individuals, but also from corporations, and humanitarian efforts focusing on under-mapped regions37,47 (Supplementary Notes). The OpenStreetMap street network data used here have been estimated to be approaching worldwide completion48, and in most African nations are many times larger than official government sources (Supplementary Table 3), making them difficult to validate against any other standard.
These data probably remain somewhat incomplete in some regions, potentially biasing up our estimates of infrastructure deficits. To address this issue, we assessed street network completeness using both external sources and internal modelling of OpenStreetMap contributions over time, which provide mutually independent estimates (Supplementary Tables 3 and 4). We found no evidence of k estimation bias from these sources in the form of a statistically significant correlation between variations in average k and estimates of street network completeness. However, growth patterns in OpenStreetMap data contributions suggest that several nations in the Horn of Africa and the Sahel may be less complete, especially away from urban areas (Supplementary Table 3). Analysis of the variation of human development indicators suggests that environments with the lowest levels of access (k > 13) may be associated with some under-mapped regions (Extended Data Fig. 8). Analysis of the residuals to the best-fit linear model in Extended Data Fig. 6 supports the view that (very low) levels of development in these regions may be underestimated, especially close to national borders in northern Mauritania and southern Angola (Extended Data Fig. 6c). Very high k blocks are overwhelmingly rural and peri-urban, very large, and very sparsely populated (Extended Data Fig. 8c–e and Supplementary Notes). Conversely, the same evidence supports highly accurate characterizations of urban areas.
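A bias check of this kind can be sketched as a correlation test between national average k and a completeness estimate. The example below is only illustrative: it assumes a Pearson correlation with a two-sided t-test, which may differ from the procedure in the Supplementary Notes, and the six national values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical national values: average block complexity and
# an estimated share of the street network that is mapped.
average_k = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5]
completeness = [0.90, 0.85, 0.70, 0.88, 0.95, 0.75]

r = pearson_r(average_k, completeness)
t = t_statistic(r, len(average_k))

# With 4 degrees of freedom the two-sided 5% critical value is 2.776.
significant = abs(t) > 2.776
print(round(r, 3), round(t, 3), significant)
```

A non-significant result in such a test is consistent with, but does not prove, the absence of a completeness-driven bias in k.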
The systematic analysis of local street infrastructure deficits and informal settlements developed here provides several ingredients for innovation and policy. First, there is no empirical basis for the classification of local communities as a dichotomy of slums versus non-slums, as has been assumed in policy and classification from remote sensing15,21,41. Instead, our analysis points to the existence of a continuous spectrum of access, with most neighbourhoods in sub-Saharan Africa experiencing a range of infrastructure and services deficits, and only a relatively much smaller set being very deprived, denser and more complex14, a finding consistent with case studies38. Moreover, neighbourhoods in the larger cities of sub-Saharan Africa tend to show fewer and less severe deficits than in their corresponding rural and peri-urban areas where, in contrast, the nature of poverty and informality is very different9,13,15,38. In this sense, the infrastructure deficits across sub-Saharan Africa present different challenges that include relatively small but systemic improvements in most neighbourhoods in urban areas, intense interventions in rarer but denser urban slums14,42, and comprehensive infrastructure development in relatively easier but extensive peri-urban9 and rural areas. The block-level decomposition transforms these vast urban and regional planning challenges into a treatable modular problem, where each block can be considered separately with solutions naturally co-produced by resident communities and local governments. Other co-benefits, such as the generation of cadastral maps and official addresses are naturally created by this process (Fig. 1c–f and Extended Data Fig. 1) and can support context-appropriate institutions promoting land-use rights and responsibilities, and a formalized tax base for local governments29.
Second, despite a number of recent claims based on macroscopic evidence that African cities are lagging in terms of economic development, and that African urbanization is fundamentally different because of the abundance of informal settlements6,28,49,50, we would argue—without trivializing absolute levels of deprivation—that the current evidence points in a different direction. We find that African cities, with variation as in Fig. 3, are national engines of development in that they are doing significantly better on average than their adjacent peri-urban and rural areas across every dimension of development, including infrastructure delivery, basic services, healthcare and socioeconomic development13,28. Also, this gradient of development along the urban hierarchy, from larger to smaller cities and towns, is typical of past patterns of urbanization elsewhere2,3,23,40.
Third, the empirical evidence introduced here supports an emerging general understanding of the connection between urbanization and human development. Urban science has focused on the role of street networks supporting mobility and socioeconomic interactions, from which they derive value and predictable quantitative properties2,25,26. The dividend from these connections is only latent in fast-developing cities with generalized infrastructure deficits. As physical access networks are improved, they are thus predicted to support the expansion and diversification of socioeconomic networks, which render cities more innovative and productive2. Such expansion is, of course, also helped or hindered by social factors such as trust, conflict or segregation, which become dominant as physical access is provided5,18,38. This view of development as a partly physical network process provides concrete strategies for human-centric policies that create virtuous cycles of change, whereby investments in (missing) physical connectivity can more than pay for themselves by promoting broader socioeconomic development, which in turn can support stronger institutions, infrastructure and services9,28.
The prospect of a general localized approach to sustainable development means that millions of neighbourhoods around the world will be developing in parallel over the coming decades, connecting to their local infrastructure networks and becoming increasingly formalized in the sense of public services, addresses and land uses, adding to their resilience to climate change and other stresses15,39. This shared global experience brings online a vast peer-to-peer network of local innovators solving similar challenges, supporting the creation of general knowledge and technologies truer to the living experience of cities. In this way, we may leverage the growth of our scientific understanding of cities and the expanding possibilities of larger data to support and accelerate human development that is faster, more universal and more sustainable.
Methods
Datasets
Building footprints data were produced by Ecopia Landbase Africa in 2022, powered by Maxar imagery and accessed via the DigitizeAfrica Platform at https://platform.ecopiatech.com. Street network data were obtained from OpenStreetMap, available at www.openstreetmap.org, and retrieved from Geofabrik at https://download.geofabrik.de/africa.html. Population data were obtained from two worldwide raster maps by LandScan43, available at https://landscan.ornl.gov, and WorldPop8, available at https://data.worldpop.org/GIS/Population/Global_2000_2020_Constrained. Urban area geometries were based on the GHSL Urban Centre Database51 at https://ghsl.jrc.ec.europa.eu. Peri-urban areas are defined as a commutable 10-km spatial buffer around GHSL boundaries. National estimates of slum population in urban areas were obtained from the United Nations Human Settlement Programme (UN-Habitat) Global Urban Indicators Database 2020, available at https://data.unhabitat.org/pages/housing-slums-and-informal-settlements. Demographic and health survey data are available at https://api.dhsprogram.com and https://spatialdata.dhsprogram.com.
Block generation and population estimates
Block delineations are based on all connected streets in OpenStreetMap, from which we excluded the categories of ‘footway’ and ‘path’ as these are reported irregularly and are not typically associated with infrastructure access. Blocks are defined as land geometries (polygons) circumscribed by streets, roads and other boundaries such as rivers and coastlines. Block-level population estimates are produced by down-allocating population from spatial grids at larger scales (1 km and 100 m for LandScan and WorldPop, respectively) to each block proportionally to building area.
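The down-allocation step can be sketched as follows. This is a simplified illustration, not the released kblock code: it assumes that the building area of each block inside each grid cell has already been computed, and it drops the population of cells that contain no buildings, which is an assumption of this sketch. All identifiers and values are hypothetical.

```python
from collections import defaultdict


def allocate_population(cell_population, block_building_area):
    """Split each grid cell's population across blocks in proportion to building area.

    cell_population: {cell_id: population of the grid cell}
    block_building_area: {(cell_id, block_id): building area of the block inside the cell}
    Returns {block_id: estimated population}.
    """
    # Total building area per grid cell: the denominator of each block's share.
    cell_area = defaultdict(float)
    for (cell_id, _block_id), area in block_building_area.items():
        cell_area[cell_id] += area

    block_population = defaultdict(float)
    for (cell_id, block_id), area in block_building_area.items():
        total = cell_area[cell_id]
        if total > 0:  # assumption: cells without buildings contribute nothing
            share = area / total
            block_population[block_id] += cell_population.get(cell_id, 0.0) * share
    return dict(block_population)


# Hypothetical example: block B straddles two grid cells.
cells = {"c1": 100.0, "c2": 40.0}
areas = {("c1", "A"): 300.0, ("c1", "B"): 100.0, ("c2", "B"): 50.0}
estimates = allocate_population(cells, areas)
print(estimates)
```

Because every cell's population is fully distributed among the blocks that hold its buildings, the block estimates sum to the gridded total wherever buildings are present.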
Data validation and analysis
We inspected and validated building footprints against satellite imagery in a variety of locations to confirm completeness and accuracy, except for very small buildings, which are probably not residential. Correlation analyses between block characteristics and demographic and health indicators used data obtained from the DHS programme at the subnational level. Principal component analysis was performed over 67 indicators (dimensions) in 219 subnational region–year observations (Supplementary Table 2 and Extended Data Table 2). The DHS data used in correlational analysis cover 238 administrative regions across 22 countries and 40 unique surveys between 2010 and 2021, producing 367 unique observations. We validated k as an estimator of informal settlements in nine urban areas with self-declared slum surveys (Extended Data Fig. 7), and performed a systematic assessment of the completeness of OpenStreetMap street network data (Supplementary Tables 3 and 4 and Extended Data Fig. 8a,b). See ‘Validation of data quality’ in Supplementary Methods and ‘Data quality and completeness assessments’ in Supplementary Notes.
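To illustrate how a leading principal component summarizes correlated indicators, the sketch below standardizes each indicator, builds the correlation matrix and extracts its dominant eigenvector by power iteration. Power iteration is a choice made here to keep the example dependency-free and is not necessarily the routine used in the analysis code; the three indicators and five regions are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_share, scores) for the leading PC of standardized data."""
    n, p = len(rows), len(rows[0])
    # Standardize each column to zero mean and unit sample variance.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the standardized columns.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)] for a in range(p)]
    # Power iteration from an arbitrary non-uniform start vector; it converges to the
    # dominant eigenvector unless the start is exactly orthogonal to it.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace of a correlation matrix is p.
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p)) for a in range(p))
    scores = [sum(zr[j] * v[j] for j in range(p)) for zr in z]
    return v, eigenvalue / p, scores


# Hypothetical regions with three indicators (percentages):
# female literacy, water on premises, natural floors.
indicators = [
    [80.0, 60.0, 10.0],
    [70.0, 50.0, 20.0],
    [60.0, 45.0, 35.0],
    [50.0, 30.0, 40.0],
    [40.0, 20.0, 60.0],
]
loadings, explained_share, scores = first_principal_component(indicators)
print([round(x, 3) for x in loadings], round(explained_share, 3))
```

In this toy case the two measures of advantage load with one sign and the measure of disadvantage with the opposite sign, which is the sense in which a single component can summarize many indicators.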
Online visualization
We created an online interactive visualization (available at www.millionneighborhoods.africa) to allow readers to replicate the multi-scale analysis of Fig. 1. Hovering over any region creates a display panel (top left) that shows local data on k complexity, estimated population, building counts, and building and land areas. Starting with a zoomed-out map of the full continent, readers can hover over each country to obtain national statistics. Then they can zoom in anywhere to obtain similar statistics for each urban, peri-urban and rural (non-urban) area. Zooming in further shows the same quantities for each street block while also displaying building footprints and streets. Toggling to ‘Satellite Map’ (top-left corner) shows satellite images for comparison, permitting visual assessment and validation. The visualization also shows population density at high (block) precision by choosing ‘Population Density’ in the dropdown menu in the top-left corner. Choosing ‘Block Type’, in the same menu, reveals a region’s classifications as urban, peri-urban or rural.
Data availability
All block-level data for sub-Saharan Africa, including aggregations to GHSL and Africapolis urban definitions, are available for public download via the Harvard Dataverse repository at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DQY54U. The database is also available online at www.millionneighborhoods.africa/download as an interactive map to visualize diverse statistics including block complexity and population density. All primary data sources are available via public sources with the exception of the DHS and Ecopia datasets, which require data usage agreements. To access DHS data, users must request access to ‘SURVEY’ and ‘GPS’ data for all countries in sub-Saharan Africa following these instructions: https://dhsprogram.com/data/Access-Instructions.cfm. We provide application programming interface (API) code to download the surveys in the kblock-analysis repository. To access the Ecopia building footprint data, users should email admin@digitizeafrica.ai to obtain credentials for the DigitizeAfrica Platform. Users who have difficulty accessing the above data should contact the authors, who maintain back-ups of the source data. Source data are provided with this paper.
Code availability
The code for generating the underlying database, including block complexity, population and block geometries, is available on GitHub at https://github.com/mansueto-institute/kblock with DOI-minted source code available via Zenodo at https://doi.org/10.5281/zenodo.12636819 (ref. 52). The code for reproducing the analysis, including all figures and tables, is available on GitHub at https://github.com/mansueto-institute/kblock-analysis with DOI-minted source code available via Zenodo at https://doi.org/10.5281/zenodo.15702173 (ref. 53). A Code Ocean capsule with a fully reproducible example is available at https://doi.org/10.24433/CO.1487090.v1.
References
Keith, M. et al. A new urban narrative for sustainable development. Nat. Sustain. 6, 115–117 (2022).
Bettencourt, L. M. A. Introduction to Urban Science: Evidence and Theory of Cities as Complex Systems (MIT Press, 2021).
Brelsford, C., Lobo, J., Hand, J. & Bettencourt, L. M. A. Heterogeneity and scale of sustainable development in cities. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.1606033114 (2017).
Moreno, E. L. Slums of the World: The Face of Urban Poverty in the New Millennium?: Monitoring the Millennium Development Goal, Target 11–World-Wide Slum Dweller Estimation (UN-HABITAT, 2003).
Mitlin, D. & Satterthwaite, D. Urban Poverty in the Global South: Scale and Nature (Routledge, 2013).
Kates, R. W. & Dasgupta, P. African poverty: a grand challenge for sustainability science. Proc. Natl Acad. Sci. USA 104, 16747–16750 (2007).
Porto de Albuquerque, J. et al. The role of data in transformations to sustainability: a critical research agenda. Curr. Opin. Environ. Sustain. 49, 153–163 (2021).
Linard, C., Gilbert, M., Snow, R. W., Noor, A. M. & Tatem, A. J. Population distribution, settlement patterns and accessibility across Africa in 2010. PLoS ONE 7, e31743 (2012).
Olaniran, T. O. & Aule, T. T. Systematic approach to sustainable urban development: reviewing challenges of informal settlements and peri-urban growth in sub-Sahara Africa. Urban Plan. Transp. Res. 13, 2495660 (2025).
Brelsford, C., Martin, T. & Bettencourt, L. M. Optimal reblocking as a practical tool for neighborhood development. Environ. Plann. B https://doi.org/10.1177/2399808317712715 (2017).
Brelsford, C., Martin, T., Hand, J. & Bettencourt, L. M. A. Toward cities without slums: topology and the spatial evolution of neighborhoods. Sci. Adv. 4, eaar4644 (2018).
Soman, S., Beukes, A., Nederhood, C., Marchio, N. & Bettencourt, L. Worldwide detection of informal settlements via topological analysis of crowdsourced digital maps. ISPRS Int. J. Geoinf. 9, 685 (2020).
World Health Organization & United Nations Children’s Fund (UNICEF) Progress on Sanitation and Drinking Water—2015 Update and MDG Assessment (World Health Organization, 2015).
Li, C. et al. Slum and urban deprivation in compacted and peri-urban neighborhoods in sub-Saharan Africa. Sustain. Cities Soc. 99, 104863 (2023).
Satterthwaite, D. et al. Building resilience to climate change in informal settlements. One Earth 2, 143–156 (2020).
Steffen, W. et al. Planetary boundaries: guiding human development on a changing planet. Science 347, 1259855 (2015).
Elias, P. & de Albuquerque, J. P. in Localizing the SDGs in African Cities Sustainable Development Goals Series (eds Croese, S. & Parnell, S.) 115–131 (Springer, 2022); https://doi.org/10.1007/978-3-030-95979-1_8.
Sheth, S. K. & Bettencourt, L. M. A. Measuring health and human development in cities and neighborhoods in the United States. npj Urban Sustain. 3, 7 (2023).
Montgomery, M. R. The urban transformation of the developing world. Science 319, 761–764 (2008).
Blumenstock, J. E. Fighting poverty with data. Science 353, 753–754 (2016).
Burke, M., Driscoll, A., Lobell, D. B. & Ermon, S. Using satellite imagery to understand and promote sustainable development. Science 371, eabe8628 (2021).
Borel-Saladin, J. Data dilemmas: availability, access and applicability for analysis in sub-Saharan African cities. Urban Forum 28, 333–343 (2017).
Sahasranaman, A. & Bettencourt, L. M. Life between the city and the village: scaling analysis of service access in Indian urban slums. World Dev. 142, 105435 (2021).
Bettencourt, L. M. A., Lobo, J., Helbing, D., Kühnert, C. & West, G. B. Growth, innovation, scaling, and the pace of life in cities. Proc. Natl Acad. Sci. USA 104, 7301–7306 (2007).
Bettencourt, L. M. A. The origins of scaling in cities. Science 340, 1438–1441 (2013).
Streets as Tools for Urban Transformation in Slums (UN-Habitat, 2014).
Pandey, B., Brelsford, C. & Seto, K. C. Infrastructure inequality is a characteristic of urbanization. Proc. Natl Acad. Sci. USA 119, e2119890119 (2022).
Parnell, S. & Pieterse, E. Africa’s Urban Revolution (Zed Books, 2014).
The State of African Cities, 2010: Governance, Inequality and Urban Land Markets (UN-Habitat, 2010).
Patel, S., Baptist, C. & D’Cruz, C. Knowledge is power—informal communities assert their right to the city through SDI and community-led enumerations. Environ. Urban. 24, 13–26 (2012).
Jean, N. et al. Combining satellite imagery and machine learning to predict poverty. Science 353, 790–794 (2016).
Yeh, C. et al. Using publicly available satellite imagery and deep learning to understand economic well-being in Africa. Nat. Commun. 11, 2583 (2020).
Prieto-Curiel, R., Patino, J. E. & Anderson, B. Scaling of the morphology of African cities. Proc. Natl Acad. Sci. USA 120, e2214254120 (2023).
Kohli, D., Sliuzas, R., Kerle, N. & Stein, A. An ontology of slums for image-based classification. Comput. Environ. Urban Syst. 36, 154–163 (2012).
Friesen, J., Taubenböck, H., Wurm, M. & Pelz, P. F. The similar size of slums. Habitat Int. 73, 79–88 (2018).
Blumenstock, J., Cadamuro, G. & On, R. Predicting poverty and wealth from mobile phone metadata. Science 350, 1073–1076 (2015).
Herfort, B., Lautenbach, S., Porto de Albuquerque, J., Anderson, J. & Zipf, A. The evolution of humanitarian mapping within the OpenStreetMap community. Sci. Rep. 11, 3037 (2021).
Beard, V. A., Satterthwaite, D., Mitlin, D. & Du, J. Out of sight, out of mind: understanding the sanitation crisis in global south cities. J. Environ. Manag. 306, 114285 (2022).
Quaye, I., Amponsah, O., Azunre, G. A., Takyi, S. A. & Braimah, I. A review of experimental informal urbanism initiatives and their implications for sub-Saharan Africa’s sustainable cities’ agenda. Sustain. Cities Soc. 83, 103938 (2022).
Daniere, A. G. & Takahashi, L. M. Poverty and access: differences and commonalties across slum communities in Bangkok. Habitat Int. 23, 271–288 (1999).
Leonita, G., Kuffer, M., Sliuzas, R. & Persello, C. Machine learning-based slum mapping in support of slum upgrading programs: the case of Bandung City, Indonesia. Remote Sens. 10, 1522 (2018).
Gulyani, S., Bassett, E. M. & Talukdar, D. Living conditions, rents, and their determinants in the slums of Nairobi and Dakar. Land Econ. 88, 251–274 (2012).
Rose, A. et al. LandScan Global 2020. LandScan https://doi.org/10.48690/1523378 (2020).
Tusting, L. S. et al. Mapping changes in housing in sub-Saharan Africa from 2000 to 2015. Nature 568, 391–394 (2019).
Hoornweg, D. & Pope, K. Population predictions for the world’s largest cities in the 21st century. Environ. Urban. 29, 195–216 (2017).
Chi, G., Fang, H., Chatterjee, S. & Blumenstock, J. E. Microestimates of wealth for all low- and middle-income countries. Proc. Natl Acad. Sci. USA 119, e2113658119 (2022).
Anderson, J., Sarkar, D. & Palen, L. Corporate editors in the evolving landscape of OpenStreetMap. ISPRS Int. J. Geoinf. 8, 232 (2019).
Barrington-Leigh, C. & Millard-Ball, A. The world’s user-generated road map is more than 80% complete. PLoS ONE 12, e0180698 (2017).
Pieterse, E. Grasping the unknowable: coming to grips with African urbanisms. Soc. Dyn. 37, 5–23 (2011).
Turok, I. & McGranahan, G. Urbanization and economic growth: the arguments and evidence for Africa and Asia. Environ. Urban. 25, 465–482 (2013).
European Commission, Joint Research Centre Description of the GHS Urban Centre Database 2015: Public Release 2019. (Publications Office, 2019).
Marchio, N. & Smith, M. M. mansueto-institute/kblock: beta release. Zenodo https://doi.org/10.5281/zenodo.12636819 (2024).
Marchio, N. mansueto-institute/kblock-analysis: updates 06-19-2025. Zenodo https://doi.org/10.5281/zenodo.15702173 (2025).
Acknowledgements
We thank I. Blair-Freese and the Bill & Melinda Gates Foundation for access to the Ecopia building footprints data; A. Beukes, G. Birch, C. Brelsford and P. Conceição for discussions; and audiences at the UN-Habitat Assembly and Slum Dwellers International for useful comments. D. Halpern, M. Martinez, M. Smith, I. Sachango, C. Nederhood and S. Soman contributed to underlying methods, code and visualizations. This work was partially supported by the Mansueto Institute for Urban Innovation and the Susan and Richard Kiphart Center for Global Health and Social Development at the University of Chicago.
Author information
Contributions
L.M.A.B. and N.M. conceived of the paper. N.M. performed data analysis and visualizations. L.M.A.B. and N.M. wrote the paper. L.M.A.B. and N.M. reviewed the paper.
Ethics declarations
Competing interest
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks Michele Acuto, Marshall Burke, Arsham Ghavasieh and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Block graph construction and the interpretation of block complexity.
a, Street blocks are defined by their street boundaries (blue lines) and building footprints (small grey polygons). b, To construct the block graph, each building is abstracted to its centroid (black dots); c, a Voronoi decomposition of each block’s area is created, generating a land parcel around each building clipped to street boundaries. This creates a land parcel map for each block. d, Block graph with building centroids as nodes and connections between adjacent land parcels as edges. e, Each node in the block graph is characterized by its shortest distance to nodes at the street boundary. Colours characterize the topological distance along the graph to the infrastructure network. This determines the block complexity, k, as the shortest distance along the graph for the least accessible building. This procedure is repeated for all street blocks in sub-Saharan Africa, Fig. 1, and is visualized online at https://millionneighborhoods.africa. f, For compact shapes, k can be interpreted as the block’s characteristic length scale (radius), measuring the distance from the centre (least accessible parcel) to the surface in units of the local building density. On average, this implies that the block’s land area A_B ∝ k²/n_b, where n_b is the building density. g, This relationship is shown for all blocks in Lusaka’s conurbation (grey points), with log-log linear best fit (black line, slope = 0.999 ± 0.014, R² = 0.61), and a line of slope unity (yellow) for comparison. The fit shows dispersion around this expectation because of block shape variations. Panels a–e, made with OpenStreetMap at openstreetmap.org.
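The graph distance described in panels d and e can be sketched with a multi-source breadth-first search. The example assumes that the parcel adjacency graph and the set of street-facing parcels are already known (in the pipeline they would come from the Voronoi parcels), and it counts street-facing parcels as layer 1, which is an assumption about the offset convention rather than a statement of the released implementation. Parcel names are hypothetical.

```python
from collections import deque


def block_complexity(adjacency, street_parcels):
    """Return (k, depth) for one block.

    adjacency: {parcel: list of adjacent parcels} (undirected block graph)
    street_parcels: parcels that touch the street boundary
    depth[p] is the number of parcel layers from the street to parcel p,
    and k is the depth of the least accessible parcel.
    """
    # Assumption: parcels with direct street access form layer 1.
    depth = {p: 1 for p in street_parcels}
    queue = deque(street_parcels)
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in depth:
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    # Parcels unreachable from the street are ignored in this sketch.
    return max(depth.values()), depth


# Hypothetical block: A, B and F face a street; D is the most interior parcel.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
k, depth = block_complexity(adjacency, ["A", "B", "F"])
print(k, depth)
```

Starting the search from all street-facing parcels at once means that each parcel receives its distance to the nearest street, so a single pass per block is sufficient.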
Extended Data Fig. 2 Statistical characterization of k frequency distributions.
a, The distribution for the entire sub-Saharan Africa, b, limited to urban areas, c, peri-urban and d, rural areas, as in Fig. 2a. A log-normal distribution (blue line) is the best fit in all cases but parameters vary reflecting a growing mean, variance and skew for urban blocks, peri-urban, and rural areas, respectively. Other best-fit distributions (gamma, logistic, Poisson, and Weibull) are shown for comparison in different colours. e-p, Block complexity frequency distributions for select major sub-Saharan cities. Distribution parameters for all nations in sub-Saharan Africa and major urban areas are given in Table S1.
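As an illustration of what fitting a log-normal distribution to block complexities involves, the sketch below computes the maximum-likelihood parameters (the mean and standard deviation of ln k) and the corresponding log-likelihood, which could be compared with that of alternative distributions. Treating the integer-valued k as a continuous variable is an assumption of this sketch, the fitting procedure behind the figure may differ, and the sample is hypothetical.

```python
import math


def fit_lognormal(values):
    """Maximum-likelihood log-normal parameters: mean and s.d. of the log values."""
    logs = [math.log(v) for v in values]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    return mu, sigma


def lognormal_log_likelihood(values, mu, sigma):
    """Sum of log densities of a log-normal(mu, sigma) evaluated at the data."""
    return sum(
        -math.log(v * sigma * math.sqrt(2 * math.pi))
        - (math.log(v) - mu) ** 2 / (2 * sigma ** 2)
        for v in values
    )


# Hypothetical block complexities for a small set of blocks.
k_sample = [1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 8, 13]
mu, sigma = fit_lognormal(k_sample)
best = lognormal_log_likelihood(k_sample, mu, sigma)

# For a log-normal, the median is exp(mu) and the mean is exp(mu + sigma^2 / 2).
median_k = math.exp(mu)
mean_k = math.exp(mu + sigma ** 2 / 2)
print(round(mu, 3), round(sigma, 3), round(best, 3))
```

The gap between the fitted mean and median grows with sigma, which is one way to read the growing skew from urban to peri-urban to rural blocks described in the caption.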
Extended Data Fig. 3 The relationship between average block complexity and population by settlement type.
Note how lack of street access to buildings is predominantly a feature of rural areas, whereas urban areas in most nations have lower k. Peri-urban areas show much larger variability, with good access in several nations (South Africa, Zimbabwe, Botswana), but manifestly lower in many others (Sudan, South Sudan, Chad, Mozambique), see also Tables S3-S4 and Supplementary Notes for a discussion and tests of data completeness. These results suggest widely varying degrees of planning and infrastructure development across African nations, and also broader regions.
Extended Data Fig. 4 The spectrum of informality across major sub-Saharan African conurbations.
a-b, Analogous to Fig. 3, but for conurbations, Supplementary Methods. c, Summary average block complexity across all blocks in nations in sub-Saharan Africa. d, Summary average block complexity in conurbations, for each nation. Panels c and d, made with Natural Earth at naturalearthdata.com (using rnaturalearth).
Extended Data Fig. 5 Extended version of Fig. 4, showing the correlation of a larger number of indicators along their main principal component versus k block complexity.
Symbol size denotes population density (corresponding to more urban regions) and colours show the specific indicator variation along the PC1 variation with block complexity, k. We observe that measures of advantage, such as percent of women with secondary or higher education, of households with a refrigerator, of population with electricity or births delivered at a health facility are all anti-correlated with k and change with each unit increment by −4.2, −1.9, −4.3 and −5.5%, respectively. Conversely, measures of disadvantage such as percent of population with natural floors, of population using open defecation, under-5 child mortality rate, and women aged > 6 with no education are all positively correlated with k, with each increment associated with a 3.5, 1.9, 7.5 and 3.5% increase in these metrics, respectively.
Extended Data Fig. 6 Map of observed versus predicted regression results and residuals for DHS regions.
a, Observed levels of the PC1 development composite from available data. b, Model predictions based on linear regression of k with controls and country fixed effects, Extended Data Table 2. c, Difference between observed and predicted development composite (residuals). Positive residuals – where the model underpredicts development – such as in southern Angola/northern Namibia and northern Mauritania, are typically associated with regional effects not well captured by country fixed effects, such as near national borders. These regions are all rural and very sparsely populated, Extended Data Fig. 8c–e, with very high k, subject to possible infrastructure underestimation but, more likely, to limitations of the linear model at extreme k values, Extended Data Fig. 8a–b. Detailed residuals analysis using individual DHS measures show that these areas (Cunene, Angola; Oshikoto, Namibia; Ohangwena, Namibia; Omusati, Namibia; Assaba, Mauritania) have higher than expected levels of women who are literate, women with a secondary or higher education, women who are currently married using family planning, and with access to health facilities for labor and delivery. Regions showing lower development than predicted are primarily concentrated in the Western Sahel. Such differences have also recently been linked to armed conflict, which depresses human development despite infrastructure access. Panels a–c, subnational region boundaries made with DHS (using rdhs); national boundaries made with Natural Earth at naturalearthdata.com (using rnaturalearth).
Extended Data Fig. 7 Ground-truthing estimated informality with self-declared slum surveys.
a, Block delineations with Know Your City surveyed neighbourhoods in Freetown, Sierra Leone and corresponding block complexity (colours). b, The block complexity across the city, with surveyed neighbourhoods outlined for reference. We see a general statistical agreement between high block complexity and self-declared slum neighbourhoods, but also observe high levels of inferred informality across many blocks not surveyed. c, The block complexity of self-declared slum neighbourhoods (red bars) is higher than the average across several cities (blue bars). However, in cities with larger deficits of access infrastructure (such as Freetown, Sierra Leone or Monrovia, Liberia), the sample of self-declared slums shows block complexity statistics that more closely resemble the average. Black error bars denote upper and lower quartiles; minimum and maximum k are shown as yellow error bars. d, The frequency histogram for blocks in self-declared slums (red bars) shows that a larger share of them have k ≥ 3 than is the case across all urban areas in sub-Saharan Africa (blue bars). We conclude that the expected criterion of k ≥ 3 for classifying informal settlements does apply to this set of surveys. Panels a and b, made with OpenStreetMap at openstreetmap.org; made with Natural Earth at naturalearthdata.com (using rnaturalearth).
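The comparison in panel d can be illustrated with a short sketch that builds a frequency histogram of block complexity and computes the share of blocks meeting the k ≥ 3 criterion. This is an illustration under stated assumptions rather than the published analysis: the function names and the two lists of k values are hypothetical, and only the threshold of 3 comes from the caption.

```python
from collections import Counter

INFORMALITY_THRESHOLD = 3  # the k >= 3 criterion quoted in the caption


def complexity_histogram(block_k):
    """Return the relative frequency of each block complexity value."""
    counts = Counter(block_k)
    total = len(block_k)
    return {k: counts[k] / total for k in sorted(counts)}


def share_informal(block_k, threshold=INFORMALITY_THRESHOLD):
    """Return the fraction of blocks with complexity at or above threshold."""
    return sum(1 for k in block_k if k >= threshold) / len(block_k)


# Made-up block complexity values, for illustration only.
slum_blocks = [2, 3, 4, 4, 5, 6, 3, 7]
all_urban_blocks = [1, 1, 2, 2, 2, 3, 1, 4]

slum_share = share_informal(slum_blocks)
urban_share = share_informal(all_urban_blocks)
```

With these invented values the surveyed sample has a much larger share of blocks at or above the threshold than the reference sample, which is the qualitative pattern the caption reports.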
Extended Data Fig. 8 Correlation between population share per block complexity and PC1 for DHS regions, and estimates of street network completeness at the national level.
a, Low block complexity k shows a manifestly positive correlation with PC1 (pink box), indicating good systemic development outcomes for higher fractions of the population in accessible blocks. For k > 5 the correlation turns negative, indicating that more population at these levels of access contributes to decreases in development indicators. These decreases worsen until k ≃ 13. Beyond this value of k, the correlation remains negative but becomes more uncertain. b, The ratio of OSM street network length to that reported in the CIA World Factbook (Table S4) versus block complexity k. Only four nations (Eritrea, South Sudan, São Tomé and Príncipe and South Africa) show ratio values below unity, potentially indicating underestimation of street access. Of these, only Eritrea and South Sudan show large block complexity; see Supplementary Notes. c, Blocks with k > 13 are very sparsely built up, with the vast majority having less than 1% of their area occupied by buildings. d, These blocks are extremely large spatially. e, They are mostly rural and peri-urban. This suggests that street access to buildings becomes less important under these circumstances than in urban areas, which in turn show greater access, higher densities and higher levels of multidimensional development.
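The completeness check in panel b amounts to dividing one street length estimate by another and flagging ratios below unity. A minimal sketch follows, assuming per-country lengths are available as dictionaries in the same units; the function names, country labels and lengths are hypothetical and are not taken from Table S4.

```python
def street_length_ratios(osm_km, reference_km):
    """Return the ratio of OSM length to reference length for each country.

    Countries missing from the reference, or with a non-positive reference
    length, are skipped.
    """
    return {
        country: osm_km[country] / reference_km[country]
        for country in osm_km
        if reference_km.get(country, 0) > 0
    }


def below_unity(ratios):
    """Return the countries whose ratio is below 1, sorted by name."""
    return sorted(country for country, ratio in ratios.items() if ratio < 1.0)


# Made-up street lengths in kilometres, for illustration only.
osm_km = {"Country A": 1500.0, "Country B": 400.0}
reference_km = {"Country A": 1000.0, "Country B": 800.0}

ratios = street_length_ratios(osm_km, reference_km)
flagged = below_unity(ratios)
```

A ratio below unity only suggests that mapped streets may be incomplete; as the caption notes, it matters most where block complexity is also large.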
Supplementary information
Supplementary Information
This file contains Supplementary Methods, Tables 1–4, Discussion, Notes and additional references.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Bettencourt, L.M.A., Marchio, N. Infrastructure deficits and informal settlements in sub-Saharan Africa. Nature 645, 399–406 (2025). https://doi.org/10.1038/s41586-025-09465-2
DOI: https://doi.org/10.1038/s41586-025-09465-2