Abstract
High-precision, wide-coverage data on rural household wealth are essential for bridging national-level rural revitalization policies with micro-level rural entities, enabling the precise allocation of public resources. However, the vast number and dispersed distribution of rural communities in China make wealth data difficult to collect and scarce. To address this challenge, this study proposes an integrated technical framework that combines “sky” remote sensing imagery with “ground” village street view imagery to construct a fine-grained, computable representation of rural household wealth. Centered on the intelligent interpretation of rural housing features, we extract wealth-related visual elements from imagery and regress them against benchmark survey-based household wealth indicators to develop a high-accuracy township-level wealth prediction model (R² = 0.71). This model is used to generate a nationwide, township-scale rural household wealth map. Our findings reveal a distinct “bimodal” spatial distribution of rural wealth in China, characterized by a polarization pattern: higher in the south and east, and lower in the north and west. This approach offers a scalable and cost-effective alternative to traditional household surveys, supporting the identification of rural development gaps and facilitating more targeted and effective rural policy implementation.
Introduction
China is vigorously advancing its rural revitalization strategy, aiming to address the long-standing issues of unbalanced and insufficient rural development with greater precision. In this context, the spatial differentiation of rural wealth and poverty, along with the formulation of targeted policy responses, has become a central topic in geographical research (Zhou and Liu, 2022). With the increasing complexity of rural dynamics, fine-grained, timely and scalable data have become essential in designing responsive policies and evaluating their impact on improving the well-being of rural populations (Li et al., 2022, 2023). Nonetheless, comprehensive micro-level spatial data delineating rural wealth remain exceedingly scarce. For example, the “Urban and Rural Construction Statistical Yearbook” reports a 40% decline in rural population alongside a 68% increase in total rural housing area between 1990 and 2022 (Ministry of Housing and Urban-Rural Development, 2023), indicating a substantial redistribution of wealth. However, these statistics are aggregated at national or provincial levels, and the spatial distribution of assets at the township or village scale remains largely unclear.
Currently, rural household wealth data in China primarily relies on questionnaire-based surveys. Yet with approximately 3.4 million natural villages widely dispersed across the country (Yang et al., 2016), implementing such surveys at scale is both challenging and costly. This methodology faces limitations in scope, high financial and labor costs, and temporal challenges, particularly in remote regions (Dang et al., 2019). Most low- and middle-income countries often face difficulties in conducting similar social surveys. For example, only about 50% of countries or regions have internationally comparable data related to Zero Poverty (SDG 1) (Sachs et al., 2022). These constraints underscore the need for a scalable, cost-efficient, and spatially detailed data framework to complement and enhance existing survey-based approaches.
Rural housing stands out as the primary spatial asset for rural wealth, reflecting both long-term investment behavior and aspirations for improved living standards (Xu et al., 2022). Over the past decades, transformations in rural housing, from simple cottages to multi-story buildings, have mirrored China’s broader socioeconomic development. The share of housing in total rural household wealth increased from 33% to 60% between 1990 and 2013 (Knight et al., 2020). Household surveys, such as the Demographic and Health Surveys (DHS), widely used in low- and middle-income countries, often assess household wealth by examining housing structure, size, materials, and associated assets such as vehicles and appliances (Corsi et al., 2012). Researchers have developed composite wealth indices by combining these features, demonstrating their effectiveness as proxies for long-term economic status (Corsi et al., 2012). Due to their interpretability and cross-context comparability, such indices have been widely applied in global and continental-scale wealth studies (Jean et al., 2016; Yeh et al., 2020; Chi et al., 2022). Consequently, the intelligent interpretation of rural housing features has emerged as a key approach to measuring household wealth in rural settings.
Remote sensing imagery offers a promising, cost-effective solution for large-scale wealth prediction. Initial studies used nighttime light (NTL) data as a proxy for economic activity (Elvidge et al., 2009; Henderson et al., 2011, 2012; Xu et al., 2018, 2019; Wang et al., 2021), but its predictive performance varies across spatial scales: it is strong at national levels but prone to noise at finer scales such as villages (Elvidge et al., 2020). Large-scale efforts have used high-resolution satellite data to characterize urbanization, focusing on urban land use classification (Gong et al., 2020; Li et al., 2023c), urban impervious surface (Sun et al., 2019) and urban dynamics (Li et al., 2018). High-resolution daytime imagery captures explicit wealth-related features, including housing structures and road networks, offering more explainable results for rural studies (Li et al., 2023b; Huang et al., 2024). Pioneering studies have demonstrated the potential of neural networks trained on satellite images to predict poverty and wealth patterns with high accuracy (Jean et al., 2016; Yeh et al., 2020). With the advent of sub-meter resolution in remote sensing imagery, global (Microsoft Corporation, 2022), continental-scale (Chamberlain et al., 2024), and national (Li et al., 2023a; Deng et al., 2023) building footprint datasets have become available, significantly enhancing the spatial clarity of housing asset data. In the rural China context, scholars have constructed a large-scale benchmark label dataset (Deng et al., 2024) and an age-informative building footprint dataset (Li et al., 2021), and ultimately completed the China Rural Housing Database (CRHD) (Li et al., 2022). The CRHD provides essential support for deriving wealth-related attributes such as the number, area, and age of rural houses.
Street view imagery offers a valuable complement to satellite-based approaches. Capturing horizontal perspectives, it reveals structural and material details that top-down remote sensing often misses. Large-scale street view data are increasingly utilized to predict the wealth level of communities (Gebru et al., 2017; Fan et al., 2023; Li et al., 2024). Enabled by the widespread adoption of smartphones and the rise of rural social media engagement, this bottom-up data generation approach yields vast volumes of grassroots visual data (Xu et al., 2022). These images offer observable cues such as building height, construction materials, façade finishes, air conditioning units, and even safety-related aspects, enriching the understanding of rural household assets.
This paper synthesizes remote sensing with street view imagery to construct a national, township-level rural wealth index centered on housing characteristics. We assess over 1.85 million street view images from more than 30,000 townships, building a model that combines subjective visual quality evaluation with objective indicators such as housing dimensions, construction materials, and amenities. By integrating these features with survey-based wealth labels, we develop a machine learning model capable of accurately predicting sub-indices and a composite wealth score, ultimately producing a high-resolution map of rural wealth distribution across China.
The resulting wealth distribution map gives policy-makers the fine-grained data needed for targeted rural policy formulation and effective revitalization initiatives. Beyond mapping wealth, it helps identify pathways to a sustainable rural future, in alignment with the global Sustainable Development Goals.
Methods
This paper first builds a wealth label database from rural household questionnaire survey data and regresses these labels on features obtained through intelligent image interpretation to form a wealth prediction model. The model is then extended to areas where no questionnaires were conducted, using large-scale remote sensing imagery and village street view imagery, to produce a township-scale wealth distribution map of China. For wealth label generation, this paper uses the rural household questionnaire database from the Rural Construction Evaluation Project (https://www.mohurd.gov.cn/gongkai/zc/wjk/art/2022/art_17339_766562.html) in 2022, covering 28 provinces, 124 counties, 1,678 townships, and 180,000 questionnaires. These questionnaire data have wide spatial coverage and a large volume, record information on farmers’ income, property, and consumption, and involve counties with different natural geographic conditions and development levels, giving them high statistical representativeness. For feature input, the paper collected about 1.85 million rural housing street view images from the “Cun Cun Pai” crowdsourced mini-program and the Rural Human Settlement Environment Survey Database for intelligent interpretation, and combined them with features from the existing rural housing vector database to train a prediction model for multiple wealth indicators. On this basis, the paper first reveals the regional differences in rural household wealth in China at a national scale, using townships as units. Finally, it examines these regional differences in depth, using regression analysis to study the impact of natural and economic factors on wealth accumulation.
Research data
The research data of this paper mainly covers three aspects (Fig. 1): First, rural household questionnaire survey data, as a measure of actual rural wealth, used for label input in the prediction model. Second, village street views and remote sensing interpretation of the rural housing database, used for feature input in the prediction model. Among them, part of the data corresponding to the questionnaires is used to train the machine learning model, and the imagery data from areas not covered by the questionnaires is used to predict wealth distribution nationwide. Third, statistical data on natural, social, and economic aspects, used to analyze the factors affecting the regional distribution of wealth in China.
Village questionnaire survey for generating wealth labels
The village questionnaire survey of the Rural Construction Evaluation Project is an important basis for the wealth labels in this paper. In 2022, the evaluation covered 102 national sample counties and 22 provincial sample counties, totaling 124 sample counties. These sample counties have a wide coverage and strong regional representativeness. When selecting national sample counties, 3 to 5 counties from each province are selected to report, considering different regional environmental backgrounds and economic development levels. Therefore, using the sample counties from the Rural Construction Evaluation Project as data labels ensures the typicality and representativeness of label input and aids in extending the prediction model nationwide. The paper collected about 180,000 valid questionnaires, covering 1678 townships. The content of the questionnaires mainly involves family basic information, rural housing and supporting facilities, and includes rich information on wealth elements such as income, assets, and consumption. The volume of questionnaire data is sufficient, and the content dimensions are rich, which helps to train a high-performance model.
Wealth is commonly viewed as an asset. From a dynamic perspective, income can be converted into savings and solidified into stock assets, while consumption is the expenditure of stock assets (Pender et al., 2012). Therefore, this paper proposes to assess the wealth level of rural households from three dimensions: income, assets, and consumption. There are a total of 7 related questions in the villagers’ questionnaire, as shown in Table S3 in the Supplementary Materials (SM). In terms of income, since income information is relatively private, the questionnaire is designed to use ranges as answer options. Although setting income intervals reduces precision, it can better avoid the issue of respondents concealing information. In terms of assets, core assets such as rural housing, home appliances, and cars are mainly considered. Among them, rural housing is the most important asset for farmers, accounting for nearly 60% of household wealth. Cars are also one of the more important assets. In terms of consumption, electricity consumption is mainly considered. The higher the wealth level, the higher the electricity consumption tends to be.
Village street view imagery and rural housing database for generating wealth features
The village street view imagery used in this study was primarily sourced from Cun Cun Pai, a grassroots image crowdsourcing platform that encourages local residents to document their own villages under the slogan “Every village deserves to be seen.” The platform supports the collection of rural human settlement imagery and the implementation of questionnaire surveys in designated sample counties as part of a public-interest initiative. It has accumulated a substantial volume of village street view imagery that is spatially matched with villagers' questionnaire responses. Approximately 100,000 street view images—primarily featuring rural housing—were selected to ensure geographic alignment with the 180,000 household survey responses, allowing the survey data (used as wealth labels) to be associated with corresponding imagery for wealth feature extraction. For model extension and large-scale prediction, this study further utilized the Village Human Settlement Environment Survey Database, which contains approximately 1.85 million village street view images covering 30,667 townships across China, accounting for roughly 75% of all townships nationwide. Based on this comprehensive dataset, a national, township-level map of rural household wealth was generated.
Additionally, this study utilized the vector boundary map of Chinese townships to perform spatial statistical analysis on the Chinese Rural Housing Vector Database (CRHVD; Li et al., 2022). The CRHVD used in this study is a high-resolution geospatial dataset that contains detailed polygon features of individual rural residential buildings across China. Each building footprint is associated with attributes such as construction area, construction age (modern or traditional), and geographic coordinates, enabling spatial analysis at both the village and township scales. Importantly, the vector database includes full coverage of the 1,678 townships considered in this study. For each of these townships, we extracted structural indicators such as the number of rural houses, the average building area, and the proportion of old-style rural houses. These variables were used to supplement the wealth prediction model and interpret patterns of rural housing development. The integration of this vector dataset ensures spatial consistency between remotely sensed building data, street view imagery, and household-level survey labels, thereby strengthening the model’s explanatory power and interpretability.
Natural, social, and economic data explaining regional differences in rural household wealth
After predicting the wealth of rural households nationwide and revealing their regional differences, this paper further analyzes their influencing factors. The main dimensions are natural geography and socio-economy. The data was obtained and integrated from statistical yearbooks, population censuses, network collection, remote sensing products, and other approaches. In terms of natural geography, it mainly involves elevation, temperature, river systems, etc., with data coming from spatial statistics on map products. Elevation data comes from 30 m resolution DEM data, temperature data from the National Earth System Science Data Center (https://www.geodata.cn/data), and river system data from publicly available online data (https://zhuanlan.zhihu.com/p/590827818). From a socio-economic perspective, urbanization rate and per capita GDP are employed as explanatory variables. The data are derived from the County Statistical Yearbook 2020 (https://cnki.istiz.org.cn/CSYDMirror) and the Seventh National Population Census (https://www.stats.gov.cn/sj/pcsj/rkpc/7rp/zk/indexch.htm).
Wealth measurement method
Real wealth labels based on villagers' questionnaires
This paper calculates and obtains related wealth indicators from the rural household questionnaire data. Starting from three dimensions—income, assets, and consumption—a total of 13 sub-indexes were obtained (Table S4). In addition, principal component analysis was used to reduce dimensions and aggregate these 13 sub-indexes, with the first principal component being used as a composite wealth index. Both these sub-indexes and the composite index were used for model training.
Sub-indexes
A series of characteristics, such as rural housing, living facilities, and energy consumption, are closely related to the wealth of farmer households. Especially for rural housing, housing construction is an important indicator of rural economic development, with more than 60% of farmers’ income being invested in their homes (Knight et al., 2020). As income levels increase, cars also become a major tool for wealthier farmers to enjoy a better life, which in turn indirectly reflects the wealth level of farmer households.
Therefore, in the 13 sub-indexes of this paper, housing-related indicators make up the majority, totaling 10 items. The rest include family income, car ownership, and electricity consumption indicators. The specific indicators are defined in Table S4. Housing-related indicators mainly consider factors such as floor height, base area, house structure, exterior wall decoration, toilet, kitchen, bathroom, and internal facilities such as tap water, air conditioning, broadband, etc. Farmers invest their income in upgrading their homes to achieve their ideal comfortable living environment. From the basic size and safety of the house to the esthetic appearance and the comfort and convenience of facilities, these are all reflections of the gradual increase in farmers’ wealth accumulation levels. Clearly, larger houses and more robust structures require more capital investment, and the same goes for purchasing more air conditioning, broadband, and other facilities.
In addition, the functional zoning within the house and comprehensive construction are characteristics of modernized rural housing, where independent kitchens, bathrooms, and flush toilets are also important guarantees of housing quality. However, the provision of these facilities is still not widespread in many backward areas. For example, only 54% of self-built homes nationally are equipped with flush toilets (Table S4), and some areas still use pit latrines. Cars are another important asset for farmer households. According to the questionnaire survey, about 62% of farmer households in China own a car. Since farmers usually count the cars of family members working away from home and not just those residing in the village, this percentage may be an overestimate of local rural assets. The average monthly electricity bill in the summer reflects the rural households’ capacity to use household appliances and is another important representation of wealth level.
Composite index
This paper further aggregates the above 13 indicators to form a composite wealth index. The method of principal component analysis is used to reduce the dimensions of the 13 indicators, and the first principal component is used as the composite wealth index. This type of operation is widely used in the process of composite wealth analysis to generate reliable and stable labels for machine learning (Jean et al., 2016; Yeh et al., 2020; Chi et al., 2022).
For the combination of economic variables related to family wealth, the first principal component obtained through principal component analysis can often be interpreted as a family wealth index, or a proxy variable for the overall and long-term socioeconomic level of a family. When dimensionality reduction is performed on a set of related household economic variables using principal component analysis, the first principal component often captures the largest variance in the data. Since these economic variables all reflect family wealth, such as housing and cars, the first principal component is likely to represent a common wealth factor that explains the most significant variance in the data. In fact, from an economic sense, this implies an assumption: wealth is the most important factor affecting the differentiation of these economic variables. Therefore, it is reasonable to interpret the first principal component as a wealth index or a proxy variable for socioeconomic status.
In principal component analysis, each principal component is a linear combination of the original variables, where each variable is multiplied by its corresponding weight. These weights, also known as loadings or coefficients, are different for each variable in each principal component. Suppose there are n related variables \({x}_{1}\), \({x}_{2}\), …, \({x}_{n}\). The first principal component \({{pc}}_{1}\) can be expressed as:
\[{{pc}}_{1}={w}_{1}{x}_{1}+{w}_{2}{x}_{2}+\cdots +{w}_{n}{x}_{n}\] (1)
Here, \({w}_{1}\), \({w}_{2}\), …, \({w}_{n}\) are the weights assigned to each variable \({x}_{1}\), \({x}_{2}\), …, \({x}_{n}\) in \({{pc}}_{1}\). These weights determine the contribution of each variable to \({{pc}}_{1}\). The weights are chosen to maximize the variance explained by \({{pc}}_{1}\), subject to the weight vector having unit length; all subsequent principal components are uncorrelated with the first.
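As an illustration of how the first principal component can be obtained, the following is a minimal sketch in plain Python. It assumes that the variables are standardized to zero mean and unit variance before the analysis and uses power iteration on the resulting correlation matrix; neither choice is stated in the paper, and the function name `first_principal_component` is hypothetical.

```python
import math


def first_principal_component(rows, iters=500):
    """Return (weights, scores) of the first principal component.

    rows: list of observations, each a list of n numeric variables.
    Assumptions of this sketch (not taken from the paper): variables are
    standardized first, and no variable is constant (non-zero variance).
    """
    m, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (m - 1))
        for j in range(n)
    ]
    z = [[(r[j] - means[j]) / stds[j] for j in range(n)] for r in rows]

    # Correlation matrix of the standardized variables.
    cov = [
        [sum(zr[a] * zr[b] for zr in z) / (m - 1) for b in range(n)]
        for a in range(n)
    ]

    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, i.e. the unit-length weights maximizing the variance.
    w = [1.0 / (j + 1) for j in range(n)]  # slightly asymmetric start
    for _ in range(iters):
        v = [sum(cov[a][b] * w[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]

    # The sign of an eigenvector is arbitrary; orient it so that a
    # higher score means larger values of the input variables overall.
    if sum(w) < 0:
        w = [-x for x in w]

    # Score of each observation: the weighted sum of its variables.
    scores = [sum(w[j] * zr[j] for j in range(n)) for zr in z]
    return w, scores


if __name__ == "__main__":
    # Hypothetical township-level indicators: floors, car rate, AC rate.
    data = [[1.2, 0.20, 0.30], [1.8, 0.35, 0.50],
            [2.4, 0.55, 0.65], [3.1, 0.70, 0.90]]
    weights, index = first_principal_component(data)
    print(weights, index)
```

The returned scores are left on their raw scale, in line with the choice described in the text of not rescaling the index to a fixed range.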
Following established practice in asset-based wealth index construction (Corsi et al., 2012; Jean et al., 2016; Yeh et al., 2020; Chi et al., 2022), we retain the raw scale of the first principal component from PCA. This allows the index to preserve the proportionate variance across units and better reflect relative disparities without artificial compression into a fixed range.
Wealth feature indicators from the imageries
This paper further extracts related wealth proxy variables from national remote sensing and street view imagery, realizing low-cost, large-scale, and high-precision feature extraction of rural household wealth elements (see S1 of Supplementary Materials). A total of 10 major wealth feature characteristics were identified, mainly including housing features and facility features. The specific feature indicator definitions and data sources are as follows in Table S5.
In the housing dimension, this paper primarily identified features such as the base area of rural housing, number of floors, exterior wall decoration types, and rural housing quality, while also counting the number of houses in townships. These variables collectively reflect the scale and quality of rural housing. The base area of the building and the number of floors complement each other and together represent the scale of rural housing construction: larger footprints and taller structures each reflect greater household investment. The scale of rural housing is directly linked to the basic cost of construction. The costs of labor and of construction materials, such as steel, cement, and bricks, are roughly proportional to the size of the house. Therefore, the larger the rural housing, the more wealth is embedded in it.
The type of exterior wall decoration further reflects the financial strength of the rural household. Houses with tiled exteriors are more esthetically pleasing and sought after by many modern rural housing constructions, and the material cost is also higher, hence farmers who opt for tiles usually have more substantial financial resources. In contrast, houses with exposed exterior walls indicate that the rural household does not have sufficient funds to build a better house, reflecting a lower level of wealth accumulation. This paper also uses the quality of rural housing as one of the features reflecting the wealth of farmers. The quality of rural housing has a strong correlation with the disposable income of farmers (Xu et al., 2022). Families with higher income levels are likely to build higher-quality houses. Therefore, the quality of rural housing can be used to predict wealth levels. The total number of houses in townships reflects the aggregation degree of housing in the township and may produce a scale effect on the accumulation of farmers’ wealth.
In the facilities dimension, this paper primarily identified facilities such as cars, motorcycles, and air conditioners. Cars are another important asset related to wealth. A car, along with a house, is an important representation of the economic status of a rural family. Especially in the rural marriage market, farmer families with cars often have a competitive edge. In rural areas, the presence and number of cars can also reflect the wealth level of farmers and villages. Motorcycles are also an important asset in the daily life of farmers, but are slightly less indicative of wealth than cars.
Air conditioning is another important support facility for modern rural life. With global warming and an increase in extreme weather events, air conditioners have become an important guarantee for farmer families to maintain a comfortable living environment and combat heat or severe cold. Operating air conditioners also consumes a large amount of electricity. Air conditioning is not a rigid need, and families that can afford air conditioners typically have a significant income. Additionally, since the air conditioning unit is usually mounted on the exterior wall, it is easily captured by village street view imagery. Therefore, this paper conjectures that air conditioning can be used to predict the wealth level of farmers.
After identifying or extracting the above feature elements, this paper calculates the average value or proportion of the related indicators at the township level. Ideally, wealth-related variables should be calculated at the village level, but there are fewer images at the village scale. This paper chooses to use townships as the unit for calculating wealth. Typically, a township contains an average of more than 20 villages. All wealth-related variables are aggregated at the township level.
In the dimension of averages, there are four indicators: floor height, base area, and quality of rural housing. In the proportion dimension, there are five indicators: car rate, motorcycle rate, air conditioning rate, rate of tiled exterior walls, and rate of exposed exterior walls. In the scale dimension, the total number of houses in each township is obtained using spatial statistics. The proportion indicators are calculated according to Eq. (2):
\[{R}_{v}=\frac{1}{N}\sum_{i=1}^{N}{D}_{i}\] (2)
where \({R}_{v}\) represents the proportion for the corresponding variable. \({D}_{i}\) is a dummy variable that equals 1 when the related element is successfully detected and 0 when it is not detected. \(N\) is the number of houses or images at the township level. Among them, the rates of tiled exterior walls, exposed exterior walls, and air conditioners are calculated based on the number of houses. Since vehicles cannot be identified as belonging to any particular rural house, the car rate and motorcycle rate are calculated based on the number of images.
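The township-level proportion described above (the share of houses or images in which an element is detected) can be sketched as follows. The record layout of `(township_id, detected)` pairs and the function name `township_rates` are assumptions made for illustration only.

```python
from collections import defaultdict


def township_rates(records):
    """Compute the detection rate per township.

    records: iterable of (township_id, detected) pairs, where each pair
    stands for one house or one image (hypothetical layout) and
    `detected` is truthy when the element was detected, i.e. D_i = 1.
    Returns a dict mapping township_id to sum(D_i) / N.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for township, detected in records:
        totals[township] += 1
        hits[township] += 1 if detected else 0
    return {t: hits[t] / totals[t] for t in totals}


if __name__ == "__main__":
    # Four images in township "A" (three with a car), two in "B" (none).
    sample = [("A", 1), ("A", 0), ("A", 1), ("A", 1), ("B", 0), ("B", 0)]
    print(township_rates(sample))
```

Whether a record stands for a house or an image depends on the indicator: houses for wall finishes and air conditioners, images for cars and motorcycles.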
Deep learning from the street view imagery
In this study, two types of models were used to extract wealth-related information from the 1.85 million village street view images. First, we adopted a pretrained housing quality evaluation model developed by Xu et al. (2022), which produces a housing quality index based on labeled street view imagery with a correlation coefficient of 0.87 (see Supplementary Material S1 for details). Second, we independently developed a suite of object detection models to identify visible wealth-related features from rural street view imagery.
These features include doors, windows, cars, motorcycles, air conditioning units, and façade characteristics. The presence of vehicles and air conditioning equipment serves as a proxy for household economic status and modern living standards, while door and window counts are used to infer the number of storeys. Collectively, these elements provide a comprehensive, multidimensional representation of rural housing conditions.
For these tasks, we selected YOLOv5 due to its proven balance between detection accuracy and computational efficiency, particularly in small-object scenarios common in rural environments (Zhao et al., 2019; Ge et al., 2021). The model’s real-time performance and deployment flexibility have made it a widely adopted framework in both academic and applied vision tasks (Jocher, 2020). To improve model precision, we manually annotated a large number of training samples for each target category and trained separate YOLOv5 models for each feature. Regarding the storey height variable, it refers to the number of floors of a building, rather than the vertical height in meters. This is estimated from street view imagery based on a rule-based post-processing step applied to YOLOv5 detection outputs. Specifically, for each detected window or door, we generate a vertical line through the image and count the number of doors/windows it intersects. The maximum count across all such vertical lines is used to infer the number of storeys.
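The rule-based storey estimate can be sketched as below. This is an illustrative reading of the description, not the authors' code: it assumes detections are axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` in pixels and that the vertical line is drawn through the horizontal centre of each box. The text does not specify where the line is placed, and the function name `estimate_storeys` is hypothetical.

```python
def estimate_storeys(boxes):
    """Estimate the number of storeys from door/window detections.

    boxes: list of (x_min, y_min, x_max, y_max) tuples for detected
    doors and windows. For each box, a vertical line is drawn through
    its horizontal centre (an assumption of this sketch) and the boxes
    crossed by that line are counted. The maximum count over all lines
    is returned; an image without detections yields 0.
    """
    best = 0
    for x_min, _, x_max, _ in boxes:
        x_centre = (x_min + x_max) / 2.0
        # A vertical line crosses a box when it lies within the box's
        # horizontal extent, regardless of the box's vertical position.
        crossed = sum(
            1 for bx_min, _, bx_max, _ in boxes
            if bx_min <= x_centre <= bx_max
        )
        best = max(best, crossed)
    return best


if __name__ == "__main__":
    facade = [
        (100, 300, 160, 400),  # ground-floor door
        (105, 150, 155, 220),  # window directly above the door
        (300, 300, 360, 360),  # ground-floor window further right
    ]
    print(estimate_storeys(facade))  # the door column is crossed twice
```

A rule of this kind would count two openings stacked on the same floor as two storeys, and it may miss floors whose openings are occluded.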
A total of six detection models—including those for cars, motorcycles, and air conditioners—were trained on datasets split into training (70%), validation (10%), and testing (20%) subsets. All models were trained under consistent experimental settings on a workstation equipped with an Intel® Core™ i7-10700K CPU and an NVIDIA GeForce RTX 3060 GPU. Model performance was evaluated using standard object detection metrics, including average precision (AP) and F1-score.
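For reference, precision, recall, and F1-score can be computed from detection counts as in the sketch below. The counts in the example are hypothetical. Average precision (AP) summarizes the precision-recall curve over confidence thresholds and is not computed here.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts.

    tp: correct detections, fp: spurious detections, fn: missed targets.
    Precision is the share of detections that are correct; recall is
    the share of true targets that are found. Zero denominators yield
    0.0 rather than raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts: 90 correct, 10 spurious, 30 missed.
    print(detection_metrics(tp=90, fp=10, fn=30))
```

Keeping the two quantities apart matters when reading detection results: high precision means few false detections, while high recall means few missed targets.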
Random forest for establishing the regression relationship between wealth features and labels
This paper uses random forest to train a regression model for inferring family wealth from image feature indicators. A random forest is an ensemble of decision trees; in regression, the final prediction is the average of the outputs of the individual trees (in classification, it is their mode). Random forests have the following advantages: they do not require normalization of the data, can capture nonlinear relationships, and are relatively tolerant of missing values, thereby ensuring a broader range of prediction. In addition, sub-indexes such as floor height and car rate are often non-normally distributed, mainly following long-tail distributions. As a non-parametric method, random forest makes no normality assumption and thus avoids the estimation bias that non-normal data can cause. Therefore, this paper selects random forest as the machine learning model.
This paper uses ten-fold cross-validation for model training. Specifically, the 1,678 townships are randomly divided into 10 groups; each group is used in turn as the validation group, while the other 9 groups serve as the training group. Averaging the evaluation results over the 10 resulting models gives a more robust estimate of generalizable predictive performance. The hyperparameters are set as follows. The number of decision trees is set to 100 (ntree = 100), which means the random forest model contains 100 decision trees. The number of features randomly sampled at each node is set to 1 (nfeature = 1), meaning that one randomly selected feature is evaluated at each node split.
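The ten-fold partition of townships can be sketched with the standard library alone, as below. The function name `k_fold_indices` and the fixed random seed are assumptions for illustration, and the paper does not name its software. If scikit-learn were used, the two stated settings would correspond roughly to `n_estimators=100` and `max_features=1` in `RandomForestRegressor`.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, validation_indices) for k-fold validation.

    Indices 0..n_items-1 are shuffled once with a fixed seed and dealt
    into k nearly equal folds. Each fold serves as the validation set
    exactly once, and the remaining k-1 folds form the training set.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, validation


if __name__ == "__main__":
    for fold_no, (train, val) in enumerate(k_fold_indices(1678), start=1):
        # A regression model would be fitted on `train` and scored on
        # `val` here; the 10 scores are then averaged.
        print(fold_no, len(train), len(val))
```

With 1,678 townships, eight folds contain 168 townships and two contain 167, so every township is used for validation exactly once.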
Results
Performance of image-based feature detection
The six deep learning models developed in this study achieved strong performance (Table 1), with an average overall precision of 94.53%. This indicates that, on average, 94.53% of the detections returned by the models across all categories corresponded to true target features. Among them, the model for exposed exterior wall detection achieved the highest precision (99.43%), demonstrating excellent accuracy in identifying wall materials. The car detection model also performed robustly, with an average precision of 98.29%, while the model for door and window detection (used for estimating storey height) achieved a slightly lower precision of 88.95%, likely due to occlusion, structural variation, or complex image conditions.
The remaining models, including those for motorcycles, air conditioners, and tiled exterior walls, all exceeded 90% in precision, suggesting consistent reliability across a diverse set of visual targets. Overall, these detection models demonstrate strong generalization capabilities and high predictive accuracy under uniform training conditions. These results provide a solid foundation for the downstream construction of wealth-related feature indicators and support the broader goal of linking visual attributes of rural housing with household wealth estimation at scale.
Performance of wealth-related index prediction
To validate the reliability of household facility measurements derived from street view imagery, we conducted a comparative analysis between the image-derived indicators and corresponding values from our ground truth household survey data. As shown in the heatmap below (Fig. 2), the correlation coefficients between survey-based and image-based indicators demonstrate consistently positive and, in many cases, strong associations.
For instance, the Pearson correlation coefficient between image-derived and survey-reported floor height reaches 0.80, and that for toilet rate is 0.61, indicating strong agreement between the two sources. Other key indicators also show moderate correlations, including independent bathroom rate (r = 0.54), cooling facility rate (r = 0.42), and car ownership rate (r = 0.32). These results support the validity of our imagery-based measurements and demonstrate that visual features extracted from street-level imagery can effectively capture various aspects of household infrastructure.
This comparison demonstrates the credibility and robustness of using street view imagery for large-scale measurement of household infrastructure. To further improve the predictive accuracy of each survey-based indicator, we employed a random forest model that integrates all available survey-based indicators as reference variables. This modeling approach enhances the consistency and interpretability of the predicted results. Consequently, our method enables spatially continuous estimation of rural housing conditions and household facilities, particularly valuable in regions where traditional household surveys are unavailable, infrequent, or costly to conduct.
The random forest models demonstrate strong predictive performance for the composite wealth index and its sub-indicators (Table 2). The predicted composite index demonstrates a strong correlation with the actual values (r = 0.85), accompanied by a root mean square error (RMSE) of 0.55 and a coefficient of determination (R² = 0.71), confirming that features extracted from remote sensing imagery and village street view imagery can effectively estimate rural household wealth at the township level. The composite index shows the second-highest correlation among 13 indicators, following only floor height, indicating its robustness in capturing wealth-related information.
Among sub-dimensions, income prediction is less accurate (R² = 0.44, r = 0.66), likely due to the indicator’s reliance on a binary income threshold, which fails to reflect income variability and introduces noise. In contrast, summer monthly electricity consumption performs well (R² = 0.66, r = 0.81), as it is a continuous variable with clear recall and strong links to household economic capacity.
Asset-related indicators also show strong predictive accuracy. Floor height exhibits the highest correlation with survey-based values (r = 0.89, R² = 0.80), supported by its reliable detectability in street view imagery. The presence of flush toilets and independent bathrooms also shows strong correlations (r = 0.83, R² = 0.67 and r = 0.81, R² = 0.66, respectively), highlighting their importance as key indicators of rural housing improvements. The cooling facility rate performs similarly well (r = 0.82, R² = 0.68), largely attributable to the visibility of external air conditioning units in the imagery. In contrast, indicators less directly tied to housing quality yield weaker results; for instance, car ownership demonstrates a lower correlation (r = 0.57, R² = 0.32), likely because survey responses include vehicles owned by migrant family members who do not reside locally.
Overall, these results validate the effectiveness of using machine learning and image-based features to predict rural wealth patterns, providing a scalable and high-resolution alternative to traditional survey-based approaches.
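The three evaluation statistics reported in this section (Pearson's r, RMSE, and R²) can be computed as in the following sketch. The function name `evaluate_predictions` and the toy numbers are illustrative assumptions. R² is computed here as one minus the ratio of residual to total sum of squares; the paper does not state which definition it uses.

```python
import numpy as np


def evaluate_predictions(y_true, y_pred):
    """Return Pearson's r, RMSE and R^2 for predicted versus observed values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    r = np.corrcoef(y_true, y_pred)[0, 1]
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"r": float(r), "rmse": float(rmse), "r2": float(r2)}


# Toy example with made-up values, not data from the study.
metrics = evaluate_predictions([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

As a quick consistency check on reported figures, the squared correlation is a useful reference point: for the composite index, 0.85² ≈ 0.72, which is close to the reported R² of 0.71.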
Spatial distribution and characteristics of rural household wealth
Analysis of the composite wealth index for rural households reveals a national average of −0.06, reflecting relative economic status rather than deficits, with pronounced wealth disparities across regions. Higher wealth concentrations appear along the Yangtze River and southeastern coastal regions, while a bimodal distribution pattern suggests two distinct clusters of wealth levels in the country (Fig. 3).
Fig. 3: Spatial distribution of the composite rural household wealth index across China at the township level. Panel a shows the township-level composite wealth index; panels b–k show housing quality, building base area, floor height, air-conditioner rate, car rate, motorcycle rate, tiled exterior wall rate, number of houses, and old-house rate, respectively.
As shown in Table 3, the average quality score of rural housing across townships nationwide is 5.78. The average floor height is 1.59 floors, and 85.6% of townships have an average floor height below 2 floors, indicating that rural housing in China consists mainly of one- and two-storey buildings. Differences in rural housing quality and floor height are more pronounced along the north-south direction, while building areas are larger in the mid-latitude regions. The average building base area is 107 square meters, close to the 110 square meters reported in the questionnaire statistics. On average, 24% of rural houses have air conditioning, with higher rates concentrated around the Beijing area and along the Yangtze River. Tiled exterior walls are found on 34% of rural houses and exposed exterior walls on 11%; exposed walls are more common in the western regions, whereas the tiling rate shows a less distinct spatial pattern. In terms of transportation, cars were captured in 14% of the images and motorcycles in only 11%. While vehicle presence is often associated with rural wealth, the image-derived car and motorcycle rates should be interpreted cautiously. Differences in photo timing, camera angles, and local parking behaviors may lead to visibility biases, particularly in northern regions where outdoor parking is more common. These limitations reinforce the need to interpret such indicators in combination with other wealth features rather than in isolation.
China’s rural household wealth is markedly heterogeneous. County-level analyses show a bimodal wealth distribution, suggesting polarization into higher and lower wealth zones (Fig. 4d–f). Wealthier areas, particularly in the Yangtze River and southeastern coastal regions, contrast starkly with poorer counterparts in the northwest, northeast, and Tibetan Plateau.
Fig. 4: a County-level composite wealth index; b composite wealth index of the nine major agricultural zones; c village street view imagery at different wealth levels from north to south; d frequency histogram of the county-level composite wealth index; e frequency histogram grouped by the east and west sides of the Hu Line; f frequency histogram grouped by the north and south sides of the Qinling–Huaihe Line.
Geographic disparities are pronounced, with wealth typically greater in the south than in the north and greater in the east than in the west. The Hu Line and the Qinling–Huaihe Line demarcate these disparities, with wealth indices on their east and south sides considerably higher than on their west and north sides. Within China’s nine major agricultural zones, wealth peaks in the Yangtze River’s fertile basin and subsides in the remote Qinghai-Tibet, Northeast, and Inner Mongolia regions.
A granular view, informed by the intelligently interpreted features, confirms that southern regions surpass the north in wealth indices, with homes there averaging higher floor counts and larger base areas (Table 4). Modern construction materials, air conditioning, and vehicle ownership rates further delineate the divide, reflecting the south’s deeper investment in housing and development.
China’s wealthiest rural counties, representing the top 10% in household wealth index, cluster along the Yangtze River and southeast coast, forming economic belts enriched by decades of progressive reform (Fig. 4). Notably, these regions harness the advantages of special economic zones like Xiamen and Shenzhen, blending traditional agriculture with modern enterprises. Models of urbanization, such as the Wenzhou and Putian models, exemplify the seamless integration of rural spaces into the economic fabric without uprooting communities.
The fecund basins around Dongting, Tai, and Poyang Lakes bolster this wealth, supporting rice agriculture that thrives on surplus for both local sustenance and nationwide supply. This prosperity has allowed for significant reinvestment in rural infrastructure and housing, amplifying the wealth of these areas.
Conversely, the bottom 10% of counties, predominantly west of the Hu Line and in the northeast, struggle with geographic and infrastructural barriers. Over half of these counties lie in border areas with limited access to national economic streams and infrastructural support, hampered by rugged plateau terrains and underdeveloped agriculture. The disparity underscores China’s diverse economic landscape and highlights the need for targeted development strategies in its remote rural regions.
Explanatory analysis of natural and economic drivers of rural household wealth
The growth and accumulation of rural household wealth is shaped by a combination of natural and economic factors. This process unfolds within the relatively stable natural geographic environment of rural China, while also being influenced by regionally differentiated economic development patterns in the eastern, central, and western parts of the country since the reform and opening-up era. Accordingly, this study investigates the factors contributing to rural wealth accumulation from both natural and economic perspectives.
In the natural geography dimension, we consider three foundational variables: elevation, average annual temperature, and river network density, all measured at the county level. These variables tend to be stable over time, yet exert strong, long-term effects on agricultural production capacity and regional development. In the economic geography dimension, we include per capita GDP and the urbanization rate, both of which are widely recognized as indicators of regional development stages and have significant explanatory power for differences in household income and wealth. To control for spatial heterogeneity, we also include dummy variables for China’s nine major agricultural zones.
Based on these variables, this article further constructs an OLS regression model to explain rural family wealth, as follows:
\[w = c + {\beta }_{1}h + {\beta }_{2}t + {\beta }_{3}r + {\beta }_{4}g + {\beta }_{5}u + \sum _{i}{\beta }_{D,i}{D}_{i} + e\]
where \(w\) is the composite wealth index, \(c\) is the constant term, \(h\) is the altitude, \(t\) is the average temperature, \(r\) is river density, \(g\) is per capita GDP, \(u\) is urbanization rate, \({D}_{i}\) is the dummy variable for agricultural zone \(i\), and \(e\) is the residual term. The \(\beta\) series are the estimated coefficients. All variables except the urbanization rate \(u\) have been log-transformed. The regression results are shown in Table 5.
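The regression specification can be sketched with ordinary least squares in NumPy. Everything below is illustrative: the data are synthetic, the coefficient values in `true_beta` are invented, and using eight zone dummies with the first zone as the reference category is an assumption, since the paper does not say how the nine zones are encoded.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = c + X @ beta by least squares; return coefficients and R^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # constant term first
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    centered = y - y.mean()
    r2 = 1.0 - (residuals @ residuals) / (centered @ centered)
    return coef, float(r2)


# Synthetic county-level data (hypothetical ranges, not the study's data).
rng = np.random.default_rng(42)
n = 500
log_h = np.log(rng.uniform(10, 4000, n))    # elevation
log_t = np.log(rng.uniform(1, 25, n))       # mean annual temperature
log_r = np.log(rng.uniform(0.05, 2.0, n))   # river network density
log_g = np.log(rng.uniform(1e4, 1e5, n))    # per capita GDP
u = rng.uniform(0.2, 0.9, n)                # urbanization rate (not logged)
zone = rng.integers(0, 9, n)                # nine agricultural zones
D = np.eye(9)[zone][:, 1:]                  # eight dummies, zone 0 as reference

X = np.column_stack([log_h, log_t, log_r, log_g, u, D])
true_beta = np.array([-0.3, 0.5, 0.1, 0.4, 0.8])  # invented for the demo
w = 1.0 + X[:, :5] @ true_beta + 0.05 * rng.normal(size=n)

coef, r2 = fit_ols(X, w)  # coef[0] is c, coef[1:6] are the five slopes
```

Dropping one zone dummy avoids perfect collinearity with the constant term; each remaining dummy coefficient is then read as a shift relative to the reference zone.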
In Model (1), which includes only elevation and temperature, the model already achieves a strong fit (R² = 0.65), with both variables highly significant (p < 0.01). Adding river density in Model (2) does not notably change the model fit, but river density itself remains statistically significant (p < 0.05). The findings suggest that rural areas at higher elevations tend to have lower composite wealth scores—likely due to harsher terrain, weaker agricultural productivity, and less developed infrastructure. In contrast, warmer regions, mainly in southern China, exhibit higher rural wealth levels. These areas benefit from more favorable climatic conditions, longer growing seasons, and stronger population attractiveness. For instance, southern rice-producing regions often support multiple harvests per year, generating greater agricultural surplus and facilitating household wealth accumulation. Warmer climates also attract seasonal migrants from colder regions, such as northeastern China, further enhancing local economic vitality. Areas with dense river networks, such as those surrounding Dongting Lake, Poyang Lake, and Tai Lake in the Yangtze River Basin, show stronger rural development supported by abundant water resources and intensive agriculture.
In the economic dimension, Model (3) adds urbanization rate and per capita GDP to the baseline specification. Both variables exhibit strong and significant positive effects (p < 0.01) on rural wealth. Higher per capita GDP reflects stronger regional economies and greater household disposable income, which translates into greater investment and consumption capacity in rural areas. Urbanization serves as a key driver of income growth, as rural residents benefit from access to urban labor markets. Over the past four decades, rural-to-urban labor migration has played a central role in raising household incomes. When the agricultural zone dummies are added in Model (5), all variables remain significant, and the model fit increases to R² = 0.799, indicating the robust influence of both natural and economic factors on rural wealth accumulation.
In sum, this analysis highlights that rural wealth is shaped by the foundational constraints of geography and the dynamic opportunities of economic development. Natural variables such as elevation, temperature, and river density influence wealth through their effects on agricultural productivity and accessibility. In parallel, regional economic conditions—captured through per capita GDP and urbanization—exert a strong and positive influence on household wealth accumulation.
Discussion
This study presents a comprehensive technical framework for evaluating Chinese rural family wealth through the intelligent interpretation of rural housing assets. By integrating overhead remote sensing imagery with ground-level street view images, and applying deep learning methods, we constructed a finely resolved, township-scale composite index of rural family wealth across the country. To our knowledge, this is the first attempt to generate a national-level wealth map that systematically captures both structural and functional aspects of rural assets.
Our results demonstrate strong overall model performance, with the composite index achieving a Pearson correlation coefficient of 0.85 and a coefficient of determination (R² = 0.71) when compared to survey-based data. Specifically, image-derived indicators such as housing height, independent bathroom rate, flush toilet rate, and cooling facility rate all exhibit correlation coefficients exceeding 0.80, underscoring the model’s effectiveness in capturing infrastructure-based proxies of rural wealth. In terms of household consumption, summer electricity expenditure also shows strong predictive alignment (r = 0.81, R² = 0.66). In contrast, the model performs less effectively in predicting household income (r = 0.66, R² = 0.44), likely because income is less directly observable through visual cues captured in imagery.
Despite the model’s promising results, several limitations remain. Most notably, the current index is developed at the township level, rather than at the more granular village level, which limits its applicability for fine-scale rural governance and policy design. Villages represent the fundamental spatial unit of rural revitalization efforts and the primary interface for policy implementation. While remote sensing imagery can provide house-level precision, the availability of street view images is still limited. In this study, the average number of images per village is approximately five, which may result in inadequate sample representation. Consequently, township-level aggregation was adopted. In future work, as crowdsourced rural street view platforms become more prevalent, it will be feasible to scale this framework to the village level, enabling more detailed wealth diagnostics and localized intervention planning.
Additionally, some indicators, such as car and motorcycle ownership, exhibited relatively lower correlations. This is likely due to two factors: (1) survey-reported ownership may include vehicles belonging to migrant family members who no longer reside locally, thus overestimating local wealth; and (2) street-level imagery may fail to consistently capture mobile assets due to seasonal usage, shared ownership, or occlusions. These limitations highlight the challenges of using static, image-based methods to assess non-permanent or mobile assets. To address this, our composite wealth index integrates a variety of household characteristics, ensuring greater robustness by reducing the weight of any single noisy indicator.
Lastly, while PCA was effective in synthesizing multidimensional wealth indicators and capturing their variance structure, it does not explicitly account for spatial autocorrelation across townships. Future studies may consider employing spatially weighted PCA or geographically weighted PCA to better incorporate the influence of local spatial patterns and enhance the spatial sensitivity of wealth assessments.
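For reference, a non-spatial PCA composite of the kind discussed above can be sketched as follows. This assumes the index is the first principal component of standardized indicators, which the text does not spell out; the function name `pca_composite_index` and the synthetic indicators are hypothetical.

```python
import numpy as np


def pca_composite_index(indicators):
    """Return first-principal-component scores and loadings.

    Columns are standardized first, so the resulting index has mean zero.
    Assumes no column is constant (its standard deviation would be zero).
    """
    X = np.asarray(indicators, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # The sign of a principal axis is arbitrary; orient it so that higher
    # indicator values give a higher index.
    if loadings.sum() < 0:
        loadings = -loadings
    return Z @ loadings, loadings


# Synthetic example: four indicators driven by one latent wealth factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=400)
weights = np.array([0.9, 0.8, 0.7, 0.6])
indicators = latent[:, None] * weights + 0.3 * rng.normal(size=(400, 4))

index, loadings = pca_composite_index(indicators)
```

Each township is treated here as an independent observation, which is exactly the limitation noted above: nothing in this computation uses the townships' locations or neighbors.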
Data availability
The datasets used for training the models and predicting township-level rural wealth indices, as well as the datasets used for explaining spatial wealth differentiation, are publicly available at: https://doi.org/10.6084/m9.figshare.28946042.v1.
References
Chamberlain HR, Darin E, Adewole WA et al. (2024) Building footprint data for countries in Africa: to what extent are existing data products comparable? Comput Environ Urban Syst 110:102104. https://doi.org/10.1016/j.compenvurbsys.2024.102104
Chi G, Fang H, Chatterjee S, Blumenstock JE (2022) Microestimates of wealth for all low-and middle-income countries. Proc Natl Acad Sci USA 119:e2113658119
Corsi DJ, Neuman M, Finlay JE, Subramanian SV (2012) Demographic and health surveys: a profile. Int J Epidemiol 41:1602–1613
Dang H-A, Jolliffe D, Carletto C (2019) Data gaps, data incomparability, and data imputation: a review of poverty measurement methods for data‐scarce environments. J Econ Surv 33:757–797
Deng W, Xu W, Huang Y, Li X (2024) A large-scale multipurpose benchmark dataset and real-time interpretation platform based on Chinese rural buildings. IEEE J Sel Top Appl Earth Obs Remote Sens 17:10914–10928
Deng X, Liang Y, Li X, Xu W (2023) Recognition and spatial distribution of rural buildings in Vietnam. Land 12:2142. https://doi.org/10.3390/land12122142
Elvidge CD, Hsu F-C, Zhizhin M et al. (2020) Indicators of electric power instability from satellite observed nighttime lights. Remote Sens 12:3194
Elvidge CD, Sutton PC, Ghosh T et al. (2009) A global poverty map derived from satellite data. Comput Geosci 35:1652–1660
Fan Z, Zhang F, Loo BP, Ratti C (2023) Urban visual intelligence: Uncovering hidden city profiles with street view images. Proc Natl Acad Sci USA 120:e2220417120
Ge Z, Liu S, Wang F et al. (2021) YOLOX: exceeding YOLO series in 2021. arXiv:2107.08430. https://arxiv.org/abs/2107.08430
Gebru T, Krause J, Wang Y et al. (2017) Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proc Natl Acad Sci USA 114:13108–13113
Gong P, Chen B, Li X et al. (2020) Mapping essential urban land use categories in China (EULUC-China): preliminary results for 2018. Sci Bull 65:182–187. https://doi.org/10.1016/j.scib.2019.12.007
Henderson JV, Storeygard A, Weil DN (2012) Measuring economic growth from outer space. Am Econ Rev 102:994–1028
Henderson V, Storeygard A, Weil DN (2011) A bright idea for measuring economic growth. Am Econ Rev 101:194–199
Huang Y, Xu W, Chen D et al. (2024) Sky-scanning for energy: unveiling rural electricity consumption patterns through satellite imagery’s convolutional features. ISPRS Int J Geo-Inf 13:345. https://doi.org/10.3390/ijgi13100345
Jean N, Burke M, Xie M et al. (2016) Combining satellite imagery and machine learning to predict poverty. Science 353:790–794
Jocher G (2020) YOLOv5 by Ultralytics. https://github.com/ultralytics/yolov5
Knight J, Li S, Wan H (2020) The increasing inequality of wealth in China. In: Sicular T, et al (eds) Changing trends in China's inequality: Evidence, analysis, and prospects. Oxford University Press, New York, pp 109–144. https://doi.org/10.1093/oso/9780190077938.003.0004
Li S, Liang Y, Deng X et al. (2023a) Exploring the spatial heterogeneity of rural development in Laos based on rural building spatial database. Land 12:2008. https://doi.org/10.3390/land12112008
Li W, Sun R, He H et al. (2024) Perceptible landscape patterns reveal invisible socioeconomic profiles of cities. Sci Bull 69:3291–3302. https://doi.org/10.1016/j.scib.2024.06.022
Li X, Chen D, Xu W et al. (2023b) Explainable dimensionality reduction (XDR) to unbox AI ‘black box’ models: a study of AI perspectives on the ethnic styles of village dwellings. Humanit Soc Sci Commun 10:1–13. https://doi.org/10.1057/s41599-023-01505-4
Li X, Gu Y, Deng W et al. (2023) The technical framework of computable village. South China Geogr J 1:22–25
Li X, Xu W, Huang Y et al. (2022) Spatial distribution of rural building in China: remote sensing interpretation and density analysis. Acta Geogr Sin 77:835–851. https://doi.org/10.11821/dlxb202204005
Li X, Zhou Y, Zhu Z et al. (2018) Mapping annual urban dynamics (1985–2015) using time series of Landsat data. Remote Sens Environ 216:674–683. https://doi.org/10.1016/j.rse.2018.07.030
Li Y, Xu W, Chen H et al. (2021) A novel framework based on mask R-CNN and histogram thresholding for scalable segmentation of new and old rural buildings. Remote Sens 13:1070. https://doi.org/10.3390/rs13061070
Li Z, He W, Cheng M et al. (2023c) SinoLC-1: the first 1-meter resolution national-scale land-cover map of China created with a deep learning framework and open-access data. Earth Syst Sci Data 15:4749–4780. https://doi.org/10.5194/essd-15-4749-2023
Microsoft Corporation (2022) GlobalMLBuildingFootprints. https://github.com/microsoft/GlobalMLBuildingFootprints
Ministry of Housing and Urban-Rural Development (2023) China urban-rural construction statistical yearbook. China Statistics Press, Beijing
Pender J, Marré A, Reeder R (2012) Rural wealth creation concepts, strategies, and measures. USDA-ERS Economic Research Report
Sachs JD, Kroll C, Lafortune G et al. (2022) Sustainable development report 2022. Cambridge University Press
Sun Z, Xu R, Du W et al. (2019) High-resolution urban land mapping in China from Sentinel 1A/2 Imagery based on Google Earth Engine. Remote Sens 11:752. https://doi.org/10.3390/rs11070752
Wang Z, Román MO, Kalb VL et al. (2021) Quantifying uncertainties in nighttime light retrievals from Suomi-NPP and NOAA-20 VIIRS Day/Night Band data. Remote Sens Environ 263:112557
Xu W, Chen H, Frias-Martinez E et al. (2019) The inverted U-shaped effect of urban hotspots spatial compactness on urban economic growth. R Soc Open Sci 6:181640. https://doi.org/10.1098/rsos.181640
Xu W, Gu Y, Chen Y et al. (2022) Combining deep learning and crowd-sourcing images to predict housing quality in rural China. Sci Rep 12:19558
Xu W, Li X, Chen H (2018) A comparative research on the rank-size distribution of cities in China and the United States based on urban nighttime light data. Prog Geogr 37(3):385–396
Yang R, Xu Q, Long H (2016) Spatial distribution characteristics and optimized reconstruction analysis of China’s rural settlements during the process of rapid urbanization. J Rural Stud 47:413–424
Yeh C, Perez A, Driscoll A et al. (2020) Using publicly available satellite imagery and deep learning to understand economic well-being in Africa. Nat Commun 11:2583
Zhao M, Zhou Y, Li X et al. (2019) Applications of satellite remote sensing of nighttime light observations: advances, challenges, and perspectives. Remote Sens 11:1971. https://doi.org/10.3390/rs11171971
Zhou Y, Liu Y (2022) The geography of poverty: review and research prospects. J Rural Stud 93:408–416. https://doi.org/10.1016/j.jrurstud.2019.01.008
Acknowledgements
We thank all participants and local officials who supported the village data collection process. We also acknowledge the technical support provided by the research team at the Rural Construction Evaluation Project.
Author information
Contributions
XW and GY primarily wrote the main manuscript text and analyzed the data. LQ and HY prepared all the figures. LX supervised the project. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
All survey data used in this study were anonymized and collected with appropriate institutional permissions. No identifiable personal information was involved.
Informed consent
All survey data used in this study were anonymized and collected with appropriate institutional permissions. No identifiable personal information was involved.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xu, W., Li, Q., Huang, Y. et al. Beyond surveys: high-resolution mapping of rural wealth in China using satellite and street view imagery. Humanit Soc Sci Commun 12, 1023 (2025). https://doi.org/10.1057/s41599-025-05400-y
DOI: https://doi.org/10.1057/s41599-025-05400-y