Abstract
Human mobility modelling has attracted scholarly attention from physics-based methods and social science explanatory approaches. However, there is limited knowledge of the nonlinear relationship of flows and distance in intercity mobility and regional differences in the nonlinear relationship. Focusing on China’s long-distance and large-scale mobility during the Spring Festival, this paper develops a framework to explain the nonlinear relationship. Using the Gradient Boosting Decision Tree (GBDT) model and Tencent Big Data, we find that there are three types of nonlinear relationships, namely plateau (almost zero distance decay parameter), drop (decreasing distance decay parameter) and rebound (increasing distance decay parameter after decreasing). The provincial differences also reveal that the nonlinear relationships depend on the domestic relative location and the intra-provincial urban system. This result shows that the cities in the coastal province enjoy a more inclusive spatial structure, which supports the migration from the periphery of the province. In contrast, the inland cities are concerned with embracing the migrants and settling them down.
Introduction
Human mobility is a long-standing research topic that continues to attract global academic attention (Alessandretti, Aslak, and Lehmann, 2020; Lao et al., 2022; Calafiore et al., 2023; Schläpfer et al., 2021). Early work on universal laws of human mobility was inspired by Newton’s law of gravitation, in the form of the gravity model. Thanks to advances in information communication technology (ICT), scholars can now use big data to identify nuanced mobility patterns at an unprecedentedly fine spatial resolution. The work is currently undertaken from two perspectives (Gu et al., 2023; Haraguchi et al., 2022): physics-based modelling and social science-based explanation. The former seeks universal laws of human mobility inspired by physics, such as scaling laws and network methods (Pappalardo et al., 2023). The latter focuses on explaining social and economic changes, for instance, the impacts of pandemics on mobility patterns (Zhang et al., 2022). These two perspectives have long been separated from each other, but emerging explanatory machine learning has the potential to combine them (Gilpin et al., 2018).
Due to its large scale and social context, China’s mobility patterns have drawn plenty of academic attention (Cui et al., 2020; Shen et al., 2023). Among these patterns, the Chunyun movements around the Spring Festival (the Chinese New Year) are one of the most studied. Large-scale and long-distance travel occurs before and after the Spring Festival, which gives family members a precious annual chance to get together (Yin et al., 2020; Zhu et al., 2022). China’s Chunyun is an important illustration of long-term migration motivated by many variables. The migration of workers from less developed to more developed regions highlights broader patterns in migration and urbanisation in China’s developing cities (Pan and Lai, 2019). Nearly 3 billion passenger trips are made during the busiest period of the Spring Festival. A large number of non-locally registered migrants typically return to their hometowns during the Spring Festival, only to return to the city for work at the year’s start. Because of this, they regularly travel between their hometowns and destination cities as a population of temporary “migrants.” The dynamics, states, and trends of this periodical movement are considerably amplified by the Spring Festival, a significant occasion for the Chinese community (Wu et al., 2016; Cui et al., 2020). Therefore, this research aims to clarify the underlying causes and consequences of these widespread mobility patterns by examining the spatial and temporal aspects of population movements during this time, shedding light on the larger dynamics of urbanisation and migration in the quickly changing Chinese urban landscape.
In the burgeoning research on China’s intercity mobility, big data and machine learning are employed to dissect mobility patterns, but most studies do not account for the uneven development context behind mobility. This paper contributes to the field by introducing nonlinear relationships and inter-provincial differences. First, we identify the nonlinear form of distance decay in China’s mobility using the Gradient Boosting Decision Tree (GBDT). Geographical distance is a crucial factor because of the associated costs and uncertainties that influence population movement (Niedomysl and Fransson, 2014; Wu et al., 2016). This presents new challenges and patterns for population movement research, especially in the context of periodical migration (Song et al., 2010). While distance decay is widely recognised in the literature, whether it takes a linear form is still under debate (Gao et al., 2021). We aim to show a nonlinear relationship between distance and flows during China’s Spring Festival. Second, we bring regional differences into the modelling, which are significant for understanding China’s mobility patterns. For instance, Guangdong province has outward flows before the Spring Festival, while Hubei province shows the opposite. Building on the periodical flow of population movement, our study helps to explain the separation of employment and settlement, or urbanisation, at the national scale.
In the following parts, we first review the literature on China’s mobility. Then, the methodology and data are shown in the third section. The results are analysed in the next. The conclusion is drawn in the final part.
Literature review
Mobility in China’s Urbanization Process
China’s expeditious urbanisation, constituting a pivotal facet of the nation’s socio-economic metamorphosis, has emerged as the principal catalyst propelling internal population mobility. This urbanisation phenomenon, particularly conspicuous in coastal regions, is propelled by their sophisticated industrial foundations and open economic environments, resulting in substantial labour migration from inland areas (Pan and Lai, 2019). This migratory surge augments the economic vitality of coastal areas and profoundly influences the hinterlands’ developmental trajectories. Concurrently, the urbanisation dynamics in China are profoundly shaped by national policies, notably in provincial capitals and other locales endowed with elevated administrative status. Due to their political and economic prominence, these urban centres have assumed pivotal roles as primary destinations for population influx (Shen, 2013).
As economic activities agglomerate within urban centres, yielding a concomitant increase in employment opportunities, a notable escalation in housing prices transpires (Gu et al., 2021). The combination of exorbitant living costs and constraints imposed by the hukou (household registration) system presents formidable obstacles for numerous migrants seeking settlement in urban locales (Chan, 2009). The resultant socio-economic cleavages engendered by this urbanisation trajectory become most conspicuous during the Spring Festival travel rush, when a substantial urban workforce undertakes a mass return to rural domiciles. This annual phenomenon underscores the pronounced disparities between urban and rural domains, as well as between developed and less developed regions (Gu et al., 2023; Wang et al., 2021). The Spring Festival migration, recognised as one of the world’s largest human migrations, not only mirrors the profound shifts in China’s social structure induced by urbanisation but also accentuates the economic and social differentials between urban and rural spheres, thereby posing novel challenges to the nation’s social fabric and prospective developmental trajectories.
Modelling the mobility in urban China
Previous studies relied on survey data on mobility in urban China, such as national census data and the China Migrants Dynamic Survey (Gu et al., 2022; Shen, 2016). The dynamics and granularity of mobility were limited in this research. Scholars can now explore the nuanced mobility in urban China with big data due to the advances in information communication technology. Social media big data (such as Baidu and Tencent migration big data), mobile signalling data and other location-based service data provide great opportunities for researchers to detect inter-city mobility in urban China with higher accuracy and more details, which is beyond the imagination of previous studies on the inter-provinces flow data (Liu et al., 2015).
In the era of big data, there are two main branches of methods used to model mobility in urban China. First, physics-based methods, such as network analysis, are introduced to disentangle the complicated nexus of urban systems (Pan and Lai, 2019; Xu et al., 2017). The field is filled with fresh modelling technologies, such as the Exponential Random Graph Model (Zhang et al., 2020) and the Weighted Stochastic Block Model (Zhang et al., 2020). However, these studies concentrate on universal laws and pay little attention to explaining real-world processes. By contrast, social-science-based methods try to model the complexity of mobility in urban China. They attempt to build a comprehensive theory to explain mobility and then test hypotheses with regressions (Gu et al., 2023). However, the linear assumption limits their ability to capture the less understood aspects of mobility, where nonlinearity is common.
The emerging explanatory machine learning offers a third way of modelling mobility. For example, the Gradient Boosting Decision Tree can show the nonlinear relationship between variables through a partial dependence plot, contradicting the traditional claim that machine learning can only predict rather than explain. Most of this new explanatory machine learning is employed in intra-urban transportation research, for instance in transit-oriented development (TOD) studies. This calls for extending the method to inter-city mobility.
Research design
Conceptual framework
Expanding on the existing body of research, we propose a comprehensive framework designed to delve into the intricacies of periodical mobility during the Spring Festival (Wang et al., 2016). Our approach centres on examining the interaction of periodical mobility with interprovincial spatial configurations, ultimately leading to a model that effectively captures the nuances of interprovincial periodical mobility in China. This novel framework is adept at deciphering the intricate patterns of periodical mobility and the underlying processes within China’s unique provincial landscape, where rapid urbanisation has accelerated human mobility and led to spatially uneven development. In the four decades following the economic reforms, this developmental landscape has evolved into a complex patchwork of provinces, each characterised by its own set of disparities (Fang et al., 2020). This aspect is crucial for the larger understanding of China’s urbanisation dynamics, linking periodical mobility patterns to broader socio-economic transformations.
Within this framework, geographical distance is recognised as a crucial, albeit nonlinear, factor influencing mobility, bringing with it varying degrees of cost and uncertainty that sculpt migration patterns (Zipf, 1946). Such disparities in mobility give rise to diversity, which then exacerbates the spatial divisions between provinces (Cui et al., 2020). The hukou system imposes additional layers of complexity, inducing seasonal migrations between urban and rural areas, a phenomenon most observable during the Spring Festival (Fig. 1).
Acknowledging the intricate relationship between geographical distance and mobility, our study probes provincial disparities, revealing insights into urban-rural dynamics and the larger canvas of regional development (Amini et al., 2014; Liu et al., 2014). In doing so, it sheds light on the ongoing separation of employment and settlement in China. This separation, driven by periodical flows, is crucial for understanding the contours of urbanisation at the national scale. We propose three ‘distance-mobility intensity’ hypotheses to capture the essence of these complex interactions:
The Plateau Hypothesis (A: Plateau) suggests that within a province’s confines, mobility intensity initially increases only marginally with distance, quickly reaching a plateau. This saturation in mobility dynamics over short to moderate distances can be ascribed to the province’s cohesive socio-economic structure and integrated infrastructure.
The Drop Hypothesis (B: Drop) posits that mobility intensity experiences a sharp decline upon encountering provincial boundaries. This ‘boundary effect’ results from a mix of administrative, socioeconomic, and cultural discontinuities that create invisible impediments to movement, further reinforced by the hukou system (Chan, 2014; Chen and Fan, 2016).
The Rebound Hypothesis (C: Rebound) asserts that mobility intensity experiences a revival beyond a certain threshold due to the allure of more economically developed provinces. With their superior economic conditions, job opportunities, and public services, these regions act as magnets for populations from less developed or remote areas, potentially rejuvenating mobility intensity over longer distances.
Our approach, diverging from conventional models, emphasises mobility’s periodical and multifaceted nature, particularly during the Spring Festival migration. It underlines the intricate interplay of inter-provincial distances and migratory flows, integrating socio-economic and cultural factors. By adopting nonlinear assessments, our study expands the understanding of distance decay typologies, enhancing the analysis of spatial trends and informing strategies for optimising metropolitan areas and urban conglomerates.
Data sources and selection for mobility analysis
In this research endeavour, we have harnessed the Tencent Migration Dataset to dissect the intricate patterns inherent in population migration. Originating from Tencent’s expansive software ecosystem, this dataset uniquely captures inter-city population movements by analysing geolocation data derived from smart device users. The dataset’s credibility and reliability are underscored by its extensive utilisation in scholarly research focusing on urban connectivity and population mobility. Serving as the bedrock of our analysis, the Tencent Location Big Data Platform stands out for its precise and comprehensive coverage of migration data.
In anticipation of the 2018 Spring Festival from February 1st to 14th, our research team meticulously collected data encompassing a national scope and beyond. This rigorous data collection effort resulted in approximately 40,289 daily records detailing intricate aspects of population flows, such as origins, destinations, volumes, and timings of migrations. A comprehensive data cleaning process involving deduplication and annual aggregation was then employed, which refined the dataset to 20,155 records. These records meticulously represent the prefectural-level city population flows during the festival period, thus constituting the core of our analysis, focusing on the robustness of urban connectivity during the Spring Festival.
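The cleaning step described above can be illustrated with a minimal sketch. It assumes that each daily record is an exact (date, origin, destination, flow) tuple and that aggregation means summing flows per origin-destination pair over the collection window; the record layout, the sample values and the function name aggregate_flows are hypothetical and are not taken from the Tencent platform.

```python
from collections import defaultdict

# Hypothetical daily records: (date, origin city, destination city, flow index).
daily_records = [
    ("2018-02-01", "Shenzhen", "Wuhan", 12.0),
    ("2018-02-01", "Shenzhen", "Wuhan", 12.0),  # exact duplicate of the row above
    ("2018-02-02", "Shenzhen", "Wuhan", 15.0),
    ("2018-02-01", "Guangzhou", "Changsha", 9.5),
]


def aggregate_flows(records):
    """Drop exact duplicate rows, then sum flows per (origin, destination) pair."""
    unique_rows = set(records)  # deduplication on the full tuple
    totals = defaultdict(float)
    for _date, origin, destination, flow in unique_rows:
        totals[(origin, destination)] += flow
    return dict(totals)


od_flows = aggregate_flows(daily_records)
print(od_flows)
```

Summing is only one possible aggregation rule; a mean over days would give the same ranking of city pairs when every pair is observed on the same number of days.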
For the analytical framework of our study, we adopted variables drawn mainly from gravity models. The dependent variable, denoting the strength of inter-city connectivity (Flow), is derived from the migration data of Tencent, which has the largest number of internet and smartphone users in China (Zhang et al., 2020). The principal independent variable is the geometric distance between city centres (Distance). Additionally, control variables were incorporated, including city population size, economic output (GDP), public service levels (fiscal, technological, and educational expenditures), environmental quality (air quality and PM2.5 levels), population density, and industrial structure (percentage of the secondary sector). These variables were sourced from the ‘China City Statistical Yearbook’, ensuring a comprehensive and robust analysis of inter-city population movements during this pivotal period (Table 1).
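The text does not state how the distance between city centres is computed. One common choice, shown here purely as an assumption, is the great-circle (haversine) distance followed by a logarithmic transformation. The coordinates below are approximate and illustrative, and the function name haversine_km is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates (latitude, longitude), for illustration only.
wuhan = (30.59, 114.31)
guangzhou = (23.13, 113.26)

distance_km = haversine_km(*wuhan, *guangzhou)
log_distance = math.log(distance_km)  # natural log; the base used in the study is not stated
print(round(distance_km), round(log_distance, 2))
```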
Modeling Population Mobility: Techniques and Analytical Processes
In this research, we explore the dynamics of inter-city mobility intensity using decision tree-based ensemble learning, specifically the gradient boosting machine (GBM) introduced by Friedman (2001). We prefer GBM to traditional linear regression models because it handles nonlinear relationships well and is robust on large, complex datasets like ours. GBM combines decision trees with an iterative optimisation process, improving the model’s accuracy and performance over successive iterations (Yang et al., 2024; Leech et al., 2023). This method captures intricate data patterns that linear models often miss (Sprangers et al., 2021; Leech et al., 2023). Partial dependence plots show the marginal effect of each independent variable on the predicted outcome, which addresses the common criticism that machine learning predicts without explaining. They also offer insights into the effects of variable interactions.
We first applied logarithmic transformations during data preprocessing to reduce data skewness. We then fitted the Gradient Boosting Machine (GBM) model, which builds decision trees sequentially, each one fitted to the pseudo-residuals of the current ensemble; this step is crucial for capturing intricate data features and improves predictive precision (Bühlmann and Hothorn, 2007). Hyperparameters were tuned using RandomizedSearchCV, which samples candidate settings at random from the hyperparameter space to identify good model settings (Friedman, 2001).
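The pipeline in this paragraph (log transformation, gradient boosting, randomised hyperparameter search) could look roughly like the following sketch. It assumes scikit-learn, since RandomizedSearchCV is the name of a scikit-learn class; the synthetic data, the two predictors and the search grid are illustrative assumptions and do not reproduce the study's variables or settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)

# Synthetic stand-in for the city-pair table (not the Tencent data).
n_pairs = 300
distance_km = rng.uniform(50, 3000, n_pairs)
population = rng.uniform(0.5, 20, n_pairs)  # millions, hypothetical control variable
flow = population * 1e4 / distance_km * rng.lognormal(0.0, 0.3, n_pairs)

# Logarithmic transformation to reduce skewness in flows and predictors.
X = np.column_stack([np.log(distance_km), np.log(population)])
y = np.log(flow)

# Assumed search space; the paper does not report the ranges it explored.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
    "subsample": [0.7, 0.85, 1.0],
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,        # number of random settings to try
    cv=3,            # 3-fold cross-validation
    scoring="r2",
    random_state=0,
)
search.fit(X, y)
best_model = search.best_estimator_
print(search.best_params_, round(search.best_score_, 3))
```

Random search trades exhaustiveness for speed: only n_iter settings are evaluated, so the result is a good setting within the sampled candidates rather than a guaranteed optimum.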
Our study methodically tested three hypotheses pertaining to urban mobility intensity and distance, utilising the GBM model and partial dependence plots for empirical validation. GBM’s analytical prowess offers a more nuanced and sophisticated interpretation of complex relationships compared to linear regression models, making it particularly suitable for the hypotheses we posit (Pancerasa et al., 2019; Flowerdew, 2010). These hypotheses include the Plateau Hypothesis, which postulates stability in mobility intensity across short to medium distances within provinces; the Drop Hypothesis, suggesting a decline in mobility intensity at provincial borders; and the Rebound Hypothesis, proposing an increase in mobility intensity beyond a certain distance threshold, especially around economically advanced provinces.
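To make the three hypotheses concrete, a partial dependence curve can be labelled segment by segment from its local slope. The sketch below is an illustrative reading of the hypotheses, not the procedure used in the study: the tolerance flat_tol, the function name label_segments and the example curve are all hypothetical.

```python
def label_segments(log_distance, pd_values, flat_tol=0.05):
    """Label each interval of a partial dependence curve by its local slope.

    near-zero slope -> "plateau"; negative slope -> "drop";
    positive slope after an earlier drop -> "rebound" (otherwise "rise").
    """
    labels = []
    seen_drop = False
    for i in range(1, len(log_distance)):
        slope = (pd_values[i] - pd_values[i - 1]) / (log_distance[i] - log_distance[i - 1])
        if abs(slope) <= flat_tol:
            labels.append("plateau")
        elif slope < 0:
            labels.append("drop")
            seen_drop = True
        else:
            labels.append("rebound" if seen_drop else "rise")
    return labels


# Hypothetical curve: flat at short range, falling, then recovering at long range.
grid = [4.0, 5.0, 6.0, 7.0, 8.0]
curve = [2.00, 1.98, 1.20, 0.90, 1.30]
labels = label_segments(grid, curve)
print(labels)
```

The tolerance controls how flat a segment must be to count as a plateau, so the labels are sensitive to that choice and to the spacing of the grid.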
By synergistically integrating these insights with the advanced capabilities of the GBM model and the interpretive power of partial dependence plots, our methodological approach establishes a comprehensive framework for deciphering the complex dynamics of urban mobility intensity as influenced by geographical distance. This approach, distinctly divergent from traditional linear models, facilitates a more nuanced and all-encompassing analysis. It brings to the forefront the pivotal role of distance as a fundamental factor in sculpting urban mobility patterns. This enhanced analytical capability allows for a deeper understanding of the multifaceted interplay between distance and urban mobility, highlighting the intricate relationships that underpin these critical urban dynamics.
Fig. 1: This figure illustrates the conceptual framework used in the study, highlighting the complex dynamics of intercity population flow during China’s Spring Festival. It emphasizes the role of geographical distance and economic centers in shaping migration patterns, showing how these factors collectively contribute to the observed mobility trends.
This figure displays the population outflows from major cities during the 2019 Spring Festival, revealing significant migration trends from coastal regions to central and western provinces. The figure highlights the role of coastal cities as economic hubs that attract labor and the distinctive characteristics of population mobility in these areas.
Results
Spring festival migration: inter-city dynamics and trends
Population migration is instrumental in shaping China’s urbanisation, informatisation, industrialisation, and globalisation. It continues to be a major force driving urbanisation, and understanding its spatial dynamics is essential for interpreting the country’s economic and social transformations. Detailed visualisations of these dynamics disclose significant patterns and trends, facilitating a more thorough analysis (Salazar and Zhang, 2013; Liu and Shen, 2017). Such visual representations provide an all-encompassing perspective of inter-city population movements, emphasising critical trends in major provinces, especially during peak times like the Spring Festival. We analysed the net population inflow and outflow in various cities across the country during the 2019 Spring Festival, as shown in Tables 2 and 3. Table 2 shows the top 20 cities with the highest population outflow, indicating where people left to return to their hometowns. Table 3 lists the top 20 cities with the highest population inflow, highlighting significant increases as people returned home. The data confirm that while a small portion of people travelled for vacations or leisure, the vast majority returned home to celebrate the Spring Festival with their families. This pattern underscores the significance of the Spring Festival as a time for family reunions rather than for tourism.
Figures 2 and 3 depict a pronounced pattern of population movement during the Chinese New Year, mainly from the eastern seaboard to the centre and west. This migration mirrors wider economic trends, where coastal cities have emerged as major population centres due to their advanced urbanisation and employment opportunities. Nationally, inter-city connections centre around principal urban areas, such as Beijing, Shanghai, Guangzhou (including Shenzhen), and Chengdu, forming a diamond pattern focusing on the northern axis (Gu and Shen, 2021; Fig. 2).
Our provincial-level analysis of data from eastern, central, western, and northeastern provinces reveals diverse migration patterns during the Spring Festival (Zhao et al., 2017). Figure 3 shows Guangdong as a significant hub for intra-provincial movement, acting as both a primary source and destination for migrants. In contrast, Hubei, a central province, displays a trend of less outbound but considerable eastward migration, characterised by layered short-distance travel. As depicted in Fig. 3, Sichuan maintains consistent population flows, with over 4 million trips predominantly eastward. Both Hubei and Sichuan exhibit strong internal and adjacent provincial appeal but have limited cross-provincial attraction (Wang et al., 2016). The Northeast, particularly Heilongjiang, presents a distinctive pattern of inward migration, not attracting external employment, which results in substantial return migration during the Spring Festival. This trend is consistent with the central place theory of urban hierarchy.
The OD (origin-destination) population mobility map provides an insightful snapshot of migration trends, yet it falls short of fully capturing the intricate, nonlinear interplay between distance and mobility. This relationship is influenced by a variety of factors, including administrative boundaries, socio-economic elements, cultural contexts, and the complex hukou (household registration) system. Implementing the Gradient Boosting Machine (GBM) model can shed more light on these nonlinear patterns, thereby enhancing our understanding of population migration in China. Combining visual data with theoretical modelling enriches our comprehension of the drivers of population mobility. This integration of data and theory not only aids in revealing underlying patterns but also assists in making informed decisions that align with the country’s dynamic socio-economic and cultural realities.
Figure 4 ranks the importance of variables on inter-city connection strength, with geographical distance accounting for 57.26% of the influence. This aligns with Chai and Zhang (2024), who studied city network mining in China’s Yangtze River Economic Belt, and Xiong and Wang (2023), who emphasised the significance of geographical distance in urban agglomerations. Despite advances in transportation, geographical distance remains a pivotal factor affecting inter-city connections.
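The city rankings in Tables 2 and 3 rest on a simple bookkeeping step: each flow is subtracted from its origin and added to its destination, and cities are then sorted by the resulting net value. A minimal sketch follows, with hypothetical flow values and function names (net_inflow_by_city, top_cities) that are not from the study.

```python
from collections import defaultdict

# Hypothetical aggregated flows keyed by (origin, destination).
od_flows = {
    ("Shenzhen", "Wuhan"): 120.0,
    ("Guangzhou", "Wuhan"): 80.0,
    ("Wuhan", "Shenzhen"): 30.0,
    ("Shenzhen", "Changsha"): 60.0,
}


def net_inflow_by_city(flows):
    """Net inflow per city: total arrivals minus total departures."""
    net = defaultdict(float)
    for (origin, destination), volume in flows.items():
        net[origin] -= volume
        net[destination] += volume
    return dict(net)


def top_cities(net, n, inflow=True):
    """Top n cities by net inflow (inflow=True) or by net outflow (inflow=False)."""
    ranked = sorted(net.items(), key=lambda item: item[1], reverse=inflow)
    return ranked[:n]


net = net_inflow_by_city(od_flows)
top_inflow = top_cities(net, 2, inflow=True)
top_outflow = top_cities(net, 2, inflow=False)
print(top_inflow, top_outflow)
```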
Fig. 4: This figure shows the importance ranking of variables in the GBDT model regarding their influence on intercity connection strength. The results indicate that geographical distance is the most significant factor, accounting for 57.26% of the influence, underscoring the critical role of distance in shaping city connections.
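A percentage ranking of this kind can be obtained from a fitted gradient boosting model by rescaling its relative importances to sum to 100. The sketch below assumes scikit-learn's impurity-based feature_importances_; the paper does not say which importance measure it uses, and the synthetic data and feature names are illustrative only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic predictors standing in for the study's variables (names are hypothetical).
feature_names = ["log_distance", "log_population", "log_gdp"]
X = rng.normal(size=(400, 3))
# The response depends strongly on the first column, weakly on the second.
y = -2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# feature_importances_ is normalised to sum to 1, so multiplying by 100 gives percentages.
importance_pct = 100 * model.feature_importances_
ranking = sorted(zip(feature_names, importance_pct), key=lambda pair: pair[1], reverse=True)
for name, pct in ranking:
    print(f"{name}: {pct:.2f}%")
```

Impurity-based importances describe how much each variable contributes to the model's splits; they say nothing about the direction of the effect, which is why the partial dependence plots are needed alongside them.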
Analysing partial dependence plots: interpreting spatial relationships
Building upon the experimental ideas from the previous section, this study incorporates the fundamental characteristics of inter-city population movement into the forecasting model by employing the Gradient Boosting Machine (GBM) method to generate Partial Dependence Plots (PDP). The purpose of these plots is to reveal patterns in the inter-provincial urban connections during the Spring Festival. We first model the complex relationships of these connections using GBM, followed by a visual interpretation through PDP tools to better understand and explain the model’s outcomes.
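The two-step procedure (fit the GBM, then read off partial dependence) can be sketched directly from the definition of partial dependence: fix the variable of interest at a grid value for every observation, predict, and average. The code below assumes scikit-learn for the model and computes the curve by hand; the synthetic data and the function name partial_dependence_curve are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def partial_dependence_curve(model, X, feature_index, grid):
    """Average prediction when one feature is forced to each grid value in turn."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_index] = value  # overwrite the feature for every row
        averages.append(model.predict(X_mod).mean())
    return np.array(averages)


rng = np.random.default_rng(2)

# Synthetic city-pair data on a log scale (not the study's data).
log_distance = rng.uniform(4, 8, 500)
log_population = rng.uniform(0, 3, 500)
X = np.column_stack([log_distance, log_population])
y = -1.0 * log_distance + 0.8 * log_population + rng.normal(scale=0.1, size=500)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

grid = np.linspace(4.5, 7.5, 7)
pd_curve = partial_dependence_curve(model, X, feature_index=0, grid=grid)
print(np.round(pd_curve, 2))
```

Fitting one such model per origin province and plotting each curve against log distance would give province-specific plots of the kind discussed in this section.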
Figures 5–8 show the partial dependence between urban distance and connection strength, both on a logarithmic scale. This illustrates the variation in spatial connection strength relative to urban distance. Each province, represented as a point of origin, exhibits unique trends, highlighting the diverse relationships between urban distance and population movement. This variability emphasises the nonlinear impact of geographical regions on population flows. The plots demonstrate a general decline in connection strength as urban distance increases. Notably, the initial plateau values are highest in the eastern provinces, diminishing through the central and northeastern areas, and are lowest in the western provinces. The central and northeastern regions, mainly contributing to labour migration, show a pronounced decrease in plateau values over greater distances. Several factors, such as geographic accessibility, economic development, and cultural practices, drive these trends. This is particularly evident in the western regions, which show a longer average journey for population flow, a lower propensity for long-distance travel, and a higher sensitivity to distance.
Fig. 5: This figure depicts the partial dependence of urban connection strength on distance in eastern provinces such as Guangdong, Zhejiang, and Jiangsu. The figure shows a gradual decline in connection strength with increasing distance, reflecting the impact of geographical distance on urban connectivity in these economically developed regions.
Fig. 6: This figure illustrates the partial dependence of urban connection strength on distance in central provinces like Henan, Jiangxi, and Hunan. Despite lower levels of economic development, these regions exhibit intense population mobility due to their large population bases and the significant outflow of labor.
Fig. 8: This figure displays the partial dependence of urban connection strength on distance in western provinces like Gansu, Guangxi, and Guizhou. The figure shows a rapid decline in connection strength with increasing distance, reflecting the low internal and external mobility intensity in these less developed regions.
Figure 5 presents the partial dependence plots for the eastern provinces, illustrating the changes in urban connection strength relative to distance. In provinces such as Guangdong, Zhejiang, and Jiangsu, which are key destinations for China’s migrant population, there is a noticeable gradual decline in urban connections before the Spring Festival. This decline becomes more pronounced at greater distances. Around developed metropolitan areas, a radial connection pattern is observed, whereas less developed areas exhibit point-to-point connections that intensify with increasing distance, signifying a shift in connectivity beyond certain thresholds.
Figure 6 displays the distance partial dependence plots for the central provinces. In regions like Henan, Jiangxi, and Hunan, which are characterised by significant population outflow, there is a noticeable intensity in population movement and distinct plateau phases. These plots reveal that, despite lower economic development, these provinces experience intense population mobility, attributable to large population bases and abundant rural labour. The presence of other appealing provinces leads to a plateau phase, followed by a sharp decline and a subsequent peak. This pattern suggests a strong pull from city clusters that outweighs the cost of distance, especially beyond a value of 7 on the logarithmic distance axis.
In Fig. 7, the partial dependence plots for the northeastern provinces are detailed. Here, Heilongjiang, Jilin, and Liaoning show pronounced plateau phases within short distances. Notably, Jilin and Liaoning exhibit a steep decrease in connection strength at medium distances, likely due to the lack of highly attractive cities within this range, reflecting the unique attraction dynamics of the region.
Finally, Fig. 8 presents the partial dependence plots for the western provinces. These regions have the longest average distances in the initial plateau phase and display the lowest average mobility intensity, marked by a rapid decline in the curve. Provinces such as Gansu, Guangxi, and Guizhou show unique characteristics in population movement, with low internal and external mobility intensity. This is partly due to less developed urban areas, transportation infrastructure, and primary industries, resulting in a lower propensity for provincial mobility.
Spatial visualisation of dependence plots: a provincial perspective
The analysis of the partial dependence plots reveals that eastern China is the main destination of population mobility, while central, western and northeastern China are the main departure points. Guangdong, as a representative of the east, has a city connection intensity that gradually decreases with distance, a pattern that is suitable for analysing mobility hotspots and can serve as a reference for other economically developed regions. Hubei, as an important population exporting province, has a large population base and an abundant rural labour force that shows strong mobility characteristics despite its lower level of economic development, which makes it a useful reference for other exporting provinces. Studying these two provinces helps to understand the impact of different levels of economic development and geographic locations on population mobility and urban connectivity.
In our thorough study of population mobility in Hubei and Guangdong provinces, we incorporated various factors such as mobility patterns, geographical modelling, employment trends, and regional development. These elements are intricately connected to the visual representations in Figs. 9 and 10, enhancing our comprehension of the underlying dynamics. Situated in central China, Hubei Province exhibits a distinctive pattern of population mobility. Particularly around Wuhan, the province experiences a periodical pattern of population movement, more pronounced during festivals like the Chinese New Year (Zhang et al., 2023). This pattern reflects not just cultural practices but also economic necessities. Due to limited local employment opportunities, many residents of Hubei seek jobs in other regions, resulting in a significant outflow. This outflow is balanced by inflows during festive periods, creating a periodical movement pattern.
This figure illustrates the spatial distribution of urban connections within Hubei Province, highlighting Wuhan’s role as a regional hub. The figure shows strong internal connectivity within the province, with limited external influence, reflecting Hubei’s status as a regional rather than a national hub.
This figure presents the spatial distribution of urban connections within Guangdong Province, emphasizing its role as a major economic center in southeastern China. The figure shows a unique corridor of inter-provincial connectivity along the National High-Speed Railway, indicating Guangdong’s significant influence on population mobility across the country.
As indicated in Fig. 9, the spatial distribution of these flows shows that Hubei possesses strong internal connectivity, but its external influence is comparatively limited. The dependence map reveals that cities in or near Hubei, such as Chongqing, Xinyang, and Yueyang, maintain robust connections, underscoring Hubei’s role as a regional hub rather than a national one. This aligns with central place theory, suggesting that while Wuhan’s influence as a central city is significant, it is predominantly local.
In contrast, Guangdong Province, a powerhouse in southeastern China, presents a distinctly different scenario (Fig. 10). As a hub of economic activity and job creation, it draws people from across China. However, Guangdong faces challenges in retaining this labour force, so that flows with Guangdong as the origin (O) significantly exceed those with Guangdong as the destination (D). The spatial visualisation in the figure underscores Guangdong’s extensive inter-provincial connectivity. The province establishes a unique corridor along the National High-Speed Railway (NHSR), fostering substantial inflow from distant regions. This pattern deviates from the expectations of central place theory, as cities in Guangdong, particularly those in the Pearl River Delta, exert influence well beyond their immediate geographic boundaries.
The contrasting patterns of Hubei and Guangdong mirror their respective economic strategies and regional roles. Hubei’s geographic position and economic approach have fostered a more balanced regional population flow. In contrast, Guangdong’s vigorous economic development and infrastructural advancements have created a pull that extends nationally and even interregionally. This disparity is not solely a function of geographic location but also results from economic policies, infrastructure development, and historical factors.
Hubei’s focus on regional connectivity and limited economic expansion has led to more localised patterns of population movement, positioning the province as a regional hub rather than a national attractor. Conversely, Guangdong’s integration into the global economy, especially in cities like Guangzhou and Shenzhen, has catalysed dynamic interactions in population mobility, drawing labour from all over the country.
The findings from visual data and spatial analyses provide a detailed understanding of the complex regional dynamics within China. The intricate patterns of population movement are crucial for developing effective talent attraction policies. In Hubei, where population flow is predominantly intra-provincial, strategies could be developed to boost local employment opportunities and retain the native workforce. The periodical nature of population movements, especially during festival periods, presents a chance to implement policies that encourage year-round economic activity and mitigate the seasonal impact on labour markets (Wang et al., 2016, 2020). In contrast, given Guangdong’s status as a national hub for talent, a distinct approach is required. Policymakers in Guangdong might concentrate on initiatives that improve the quality of life for both local and incoming residents and ensure the provision of housing and social services to support a diverse population. A deep understanding of the regional dynamics of population influx from distant areas is essential for devising targeted talent attraction policies that are in harmony with the province’s economic and demographic realities.
In conclusion, the comprehensive analysis of population mobility patterns in Hubei and Guangdong, grounded in economic development, geographic location, and infrastructure networks, offers invaluable insights for policymakers and urban planners. This analysis assists in accurately defining metropolitan areas and informs the development of effective talent attraction policies. By taking into account the intricate factors driving population movement, regions can refine their development strategies and promote inter-provincial collaboration. Such efforts are key to achieving balanced regional development and fostering economic growth.
Concluding remarks
Human mobility modelling has attracted scholarly attention from both physics-based methods and social science explanatory approaches. These two perspectives both provide deep insights into human mobility, but in different directions. Machine learning is an emerging way to combine the two in studies of human mobility. However, there is limited knowledge of the nonlinear relationship between flows and distance in intercity mobility, as well as of regional differences in that relationship.
Based on LBS big data and the GBDT method, and considering the influence of provincial administrative boundaries, this paper identified three types of human mobility patterns in China during the Spring Festival travel rush: plateau, drop, and rebound. The mechanism behind these three intercity mobility patterns is the inequality of local administrative power under China’s administrative divisions, which results in an uneven distribution of resources. These three mobility patterns apply to most of the provinces examined in this study. However, for other provinces (such as Xinjiang, Tibet, Qinghai, and Ningxia), the data volume is insufficient to accurately depict the patterns. Among the 21 provinces, four (Heilongjiang, Jilin, Guangxi, and Gansu) exhibit a two-stage decline during the drop phase, unlike the direct drop observed in other provinces. This phenomenon may be attributed to locational factors and unique cultural ties, which make Beijing and Guangzhou particularly attractive or concentrated destinations for residents of these provinces. In some remote provinces, different patterns may appear. For the western provinces, the rebound is more significant than in other provinces. In the rebound pattern, there is a peak in intercity mobility after a certain distance is reached, which may be due to the population returning from megacities in other provinces. This seems contrary to expectations given the central government’s continuous advocacy for nearby and in situ urbanisation. With the development of transportation infrastructure, a large number of people are willing to migrate to farther but more developed coastal cities for work (Wang, 2018; Wang et al., 2020), even at the cost of living far from home.
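As a purely illustrative aid, the three patterns named above can be expressed as a simple rule on the slope of a dependence curve: near-zero slope is a plateau, negative slope is a drop, and positive slope after a drop is a rebound. The sketch below assumes a hypothetical threshold `flat_tol` and made-up curve values; it is not the procedure used in this study to identify the patterns.

```python
def label_phases(distances, values, flat_tol=0.01):
    """Label each interval of a dependence curve as plateau, drop, or rebound.

    flat_tol is a hypothetical slope threshold (value units per km),
    not a parameter taken from the study.
    """
    labels = []
    has_dropped = False
    for i in range(1, len(distances)):
        slope = (values[i] - values[i - 1]) / (distances[i] - distances[i - 1])
        if slope < -flat_tol:
            labels.append("drop")
            has_dropped = True
        elif slope > flat_tol and has_dropped:
            # A rise only counts as a rebound once a drop has occurred.
            labels.append("rebound")
        else:
            # Near-zero slopes (and, as a simplification, any rise
            # before the first drop) are treated as plateau.
            labels.append("plateau")
    return labels


# Made-up curve: flat, then falling, then rising again at long distance.
distances = [0, 100, 200, 400, 600, 800]
values = [10.0, 10.0, 9.8, 5.0, 2.0, 6.0]
phases = label_phases(distances, values)
```

The threshold choice determines where the plateau ends, so any real application would need to justify it against the data.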
The findings of this study also have practical significance, placing higher demands on the central and local governments to promote the equalisation of public services and balanced regional development strategies. In addition, since this paper focuses on intercity population mobility in China during the Spring Festival travel rush, it also invites reflection on China’s hukou system. The hukou system is a major cause of the massive Spring Festival travel rush and has led to some social hotspots, such as highway congestion. Behind the hukou system lies the dual urban-rural development pattern that China has yet to break. Whether the hukou system is still necessary warrants further consideration and reflection by scholars. Our analysis shows that there is still room for optimisation in the geographical space of the Spring Festival travel rush, which represents China’s largest-scale population migration behaviour. Behind the massive return migration during the Spring Festival is the plight of “being unable to return home” at other times of the year. We call for deepening the reform of the hukou system and breaking down the barriers between urban and rural areas, so that people can access high-quality employment opportunities and public services and enjoy better lives near their hometowns.
This research has its limitations. First, Tencent’s big data is aggregated at the intercity level and does not show individual passengers’ characteristics or travel purposes. Thus, we are unable to separate tourists from home-goers. Second, the lack of long-term historical observations limits our exploration of the evolution of human mobility during China’s Spring Festival. Third, international comparative studies are required to further explore whether the nonlinear relationship between flows and distance follows a universal law. The different roads home could be different in other countries.
Data availability
All data generated or analyzed during this study are included in this published article and its supplementary information files.
References
Alessandretti L, Aslak U, Lehmann S (2020) The scales of human mobility. Nature 587(7834):402–407
Amini A, Kung K, Kang C et al. (2014) The impact of social segregation on human mobility in developing and industrialized regions. EPJ Data Sci. 3:6. https://doi.org/10.1140/epjds31
Bühlmann P, Hothorn T (2007) Boosting algorithms: Regularisation, prediction and model fitting. Statist. Sci. 22(4). https://doi.org/10.1214/07-STS242
Calafiore A, Samardzhiev K, Rowe F, Fleischmann M, Arribas-Bel D (2023) Inequalities in experiencing urban functions. An exploration of human digital (geo-)footprints. Environ. Plan. B: Urban Anal. City Sci, 0(0). https://doi.org/10.1177/23998083231208507
Chai J, Zhang R (2024) Exploring the spatial correlation network and its formation mechanisms in urban land use performance: A case study of the Yangtze River Economic Belt. Land 13(7):1019. https://doi.org/10.3390/land13071019
Chan KW (2009) The Chinese Hukou System at 50. Eurasia. Geogr. Econ. 50(2):197–221. https://doi.org/10.2747/1539-7216.50.2.197
Chan KW (2014) China’s urbanisation 2020: A new blueprint and direction. Eurasia. Geogr. Econ. 55(1):1–9
Chen C, Fan CC (2016) China’s hukou puzzle: Why don’t rural migrants want urban hukou? China Rev. 16(3):9–39
Cui C, Wu X, Liu L, Zhang W (2020) The spatial-temporal dynamics of daily intercity mobility in the Yangtze River Delta: An analysis using big data. Habitat Int. 106:102174
Fang H, Wang L, Yang Y (2020) Human mobility restrictions and the spread of the Novel Coronavirus (2019-nCoV) in China. J. Public Econ. 191:104272. https://doi.org/10.1016/j.jpubeco.2020.104272
Flowerdew R (2010) Modelling migration with Poisson regression. In Handbook of Research on Social and Organizational Dynamics in the Digital Era (pp. 355–373). IGI Global. https://doi.org/10.4018/978-1-61520-755-8.ch014
Friedman JH (2001) Greedy function approximation: A gradient boosting machine. Ann. Stat. 29(5). https://doi.org/10.1214/aos/1013203451
Gao K, Yang Y, Li A, Qu X (2021) Spatial heterogeneity in distance decay of using bike sharing: An empirical large-scale analysis in Shanghai. Transportation Res. Part D: Transp. Environ. 94:102814
Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2018) Explaining explanations: An overview of interpretability of machine learning. Paper presented at the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)
Gu H, Shen T (2021) Modelling skilled and less-skilled internal migrations in China, 2010–2015: Application of an eigenvector spatial filtering hurdle gravity approach. Popul., Space Place 27(6):e2439. https://doi.org/10.1002/psp.2439
Gu H, Jie Y, Li Z et al. (2021) What Drives Migrants to Settle in Chinese Cities: a Panel Data Analysis. Appl. Spat. Anal. 14:297–314. https://doi.org/10.1007/s12061-020-09358-z
Gu H, Lin Y, Shen T (2022) Do you feel accepted? Perceived acceptance and its spatially varying determinants of migrant workers among Chinese cities. Cities 125:103626. https://doi.org/10.1016/j.cities.2022.103626
Gu H, Shen J, Chu J (2023) Understanding intercity mobility patterns in rapidly urbanising China, 2015–2019: Evidence from longitudinal Poisson gravity modeling. Ann. Am. Assoc. Geograph. 113(1):307–330. https://doi.org/10.1080/24694452.2022.2097050
Haraguchi M, Nishino A, Kodaka A, Allaire M, Lall U, Kuei-Hsien L, Kohtake N (2022) Human mobility data and analysis for urban resilience: A systematic review. Environ. Plan. B: Urban Anal. City Sci. 49(5):1507–1535
Lao X, Deng X, Gu H, Yang J, Yu H, Xu Z (2022) Comparing intercity mobility patterns among different holidays in China: A big data analysis. Appl. Spat. Anal. Policy 15(4):993–1020. https://doi.org/10.1007/s12061-021-09433-z
Leech G, Aitchison L, Panton H (2023) Decision trees compensate for misspecification. arXiv.org. https://doi.org/10.48550/arXiv.2302.04081
Liu Y, Shen J (2017) Modelling skilled and less-skilled interregional migrations in China, 2000–2005. Popul., Space Place 23(4):e2027. https://doi.org/10.1002/psp.2027
Liu Y, Wang F, Xiao Y, Gao S (2014) Uncovering patterns of inter-urban trip and spatial interaction from social media check-in data. PLoS ONE 9(1):e86026. https://doi.org/10.1371/journal.pone.0086026
Liu Y, Liu X, Gao S, Gong L, Kang C, Zhi Y, Shi L (2015) Social sensing: A new approach to understanding our socio-economic environments. Ann. Assoc. Am. Geograph. 105(3):512–530. https://doi.org/10.1080/00045608.2015.1018773
Niedomysl T, Fransson U (2014) On Distance and the Spatial Dimension in the Definition of Internal Migration. Ann. Assoc. Am. Geographers 104(2):357–372. https://doi.org/10.1080/00045608.2013.875809
Pan J, Lai J (2019) Exploration of migration patterns of China’s floating population, 2000–2010. Popul., Space Place 25(2):e2200. https://doi.org/10.1002/psp.2200
Pancerasa M, Sangiorgio M, Ambrosini R, Saino N, Winkler DW, Casagrandi R (2019) Reconstruction of long-distance bird migration routes using advanced machine learning techniques on geolocator data. J. Royal Society Interface 16(155):20190031. https://doi.org/10.1098/rsif.2019.0031
Pappalardo L, Manley E, Sekara V, Alessandretti L (2023) Future directions in human mobility science. Nat. Comput. Sci. 3(7):588–600. https://doi.org/10.1038/s43588-023-00469-4
Salazar NB, Zhang Y (2013) Seasonal lifestyle tourism: The case of Chinese elites. Ann. Tour. Res. 43:81–99. https://doi.org/10.1016/j.annals.2013.04.002
Schläpfer M, Dong L, O’Keeffe K et al. (2021) The universal visitation law of human mobility. Nature 593:522–527. https://doi.org/10.1038/s41586-021-03480-9
Shen J (2013) Increasing internal migration in China from 1985 to 2005: Institutional versus economic drivers. Habitat Int. 39:1–7. https://doi.org/10.1016/j.habitatint.2012.10.004
Shen J (2016) Error analysis of regional migration modeling. Ann. Am. Assoc. Geographers 106(6):1253–1267. https://doi.org/10.1080/24694452.2016.1197767
Shen J, Gu H, Chu J (2023) Unravelling intercity mobility patterns in China using multi-year big data: A city classification based on monthly fluctuations and year-round trends. Comput. Environ. Urban Syst. 102:101954
Song C et al. (2010) Limits of Predictability in Human Mobility. Science 327:1018–1021. https://doi.org/10.1126/science.1177170
Sprangers O, Schelter S, de Rijke M (2021) Probabilistic gradient boosting machines for large-scale probabilistic regression. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining (pp.1510-1520). https://doi.org/10.1145/3447548.3467278
Wang L (2018) High-speed rail services development and regional accessibility restructuring in megaregions: A case of the Yangtze River Delta, China. Transp. Policy 72:34–44. https://doi.org/10.1016/j.tranpol.2018.09.015
Wang L, Acheampong RA, He S (2020) High-speed rail network development effects on the growth and spatial dynamics of knowledge-intensive economy in major cities of China. Cities 105:102772. https://doi.org/10.1016/j.cities.2020.102772
Wang L, Liu H, Liu Q (2021) Research on China’s urban network based on Tencent migration big data. Acta Geographica Sinica 76(4):853–869. https://doi.org/10.11821/dlxb202104006
Wang D, He S (Eds.) (2016) Mobility, sociability and well-being of urban living (pp. 189–230). Berlin: Springer
Wu W, Wang J, Dai T (2016) The Geography of Cultural Ties and Human Mobility: Big Data in Urban Contexts. Ann. Am. Assoc. Geographers 106(3):612–630. https://doi.org/10.1080/00045608.2015.1121804
Xiong Z, Wang X (2023) Multi-scaled city networks based on automotive industry value chain: A case study from Urban Agglomeration in the Middle Reaches of the Yangtze River. Trans Plan Urban Res. https://doi.org/10.1177/27541223231189815
Xu J, Li A, Li D, Liu Y, Du Y, Pei T, Ma T, Zhou C (2017) Difference of urban development in China from the perspective of passenger transport around Spring Festival. Appl. Geogr. 87:85–96. https://doi.org/10.1016/J.APGEOG.2017.07.014
Yang L, Yang H, Cui J, Zhao Y, Gao F (2024) Non-linear and synergistic effects of built environment factors on older adults’ walking behavior: An analysis integrating LightGBM and SHAP. Trans. Urban Data, Sci., Technol. 3(1-2):46–60. https://doi.org/10.1177/27541231241249866
Yin Z, Ouyang L, Wang D (2020) Reverse traffic flows: Visualizing a new trend in Spring Festival travel rush in China. Environ. Plan. A: Econ. Space 52(2):251–254. https://doi.org/10.1177/0308518X19860537
Zhang M, Wang S, Hu T, Fu X, Wang X, Hu Y, Bao S (2022) Human mobility and COVID-19 transmission: a systematic review and future directions. Ann. GIS 28(4):501–514. https://doi.org/10.1080/19475683.2022.2041725
Zhang W, Chong Z, Li X, Nie G (2020) Spatial patterns and determinant factors of population flow networks in China: Analysis on Tencent Location Big Data. Cities 99:102640
Zhang W, Fang C, Zhou L, Zhu J (2020) Measuring megaregional structure in the Pearl River Delta by mobile phone signaling data: A complex network approach. Cities 104:102809. https://doi.org/10.1016/j.cities.2020.102809
Zhang X, Ma W, Sheng SH (2023) Understanding the structure and determinants of economic linkage network: The case of three major city clusters in Yangtze River Economic belt. Front. environ. sci. https://doi.org/10.3389/fenvs.2022.1073395
Zhao J, Zhang K, Wang R (2017) The influence of social network on inter-provincial migration flows in China. J. Geograph. Sci. 27(10):1183–1200. https://doi.org/10.1007/s11442-017-1428-x
Zhu Y, Sun M, Li Q (2022) The spatiotemporal pattern and influencing factors of China’s interprovincial migration network: A complex network analysis. Population, Space and Place, e2418. https://doi.org/10.1002/psp.2418
Zipf GK (1946) The P 1 P 2 D Hypothesis: On the Intercity Movement of Persons. Am Sociol Rev. 11:677. https://doi.org/10.2307/2087063
Acknowledgements
This work was supported by the National Natural Science Foundation of China (42271219) and Peking University (Shenzhen) Future City Lab Techand Open Research Fund [201802].
Author information
Contributions
• Xiaofan Luan: Conception and design of the study, data analysis and interpretation, literature review, interpretation of results, revision of the manuscript, optimization of manuscript structure, editing and proofreading of the manuscript, supervision and guidance, funding acquisition, project administration, provision of resources.
• Hurex Paryzat: Data analysis and interpretation, software programming and code development, literature review, figure drawing and visualization, statistical analysis, interpretation of results, drafting of the manuscript, revision of the manuscript, editing and proofreading of the manuscript.
• Jun Chu: Conception and design of the study, data collection, data analysis and interpretation, experimental design, interpretation of results, drafting of the manuscript, revision of the manuscript, optimization of manuscript structure.
• Hengyu Gu: Revision of the manuscript, optimization of manuscript structure, editing and proofreading of the manuscript, technical support.
• Xinyi Shu: Literature review, drafting of the manuscript, data analysis and interpretation, figure drawing and visualization, statistical analysis, interpretation of results, drafting of the manuscript.
• De Tong: Conception and design of the study, optimization of manuscript structure, supervision and guidance, project administration, funding acquisition, provision of resources.
• Bowen Li: Data analysis and interpretation, software programming and code development, statistical analysis, optimization of manuscript structure, technical support.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical Approval
Ethical approval was not required as the study did not involve human participants or animal experiments.
Informed Consent
Informed consent was not required as the study did not involve human participants.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Luan, X., Paryzat, H., Chu, J. et al. Different roads take me home: the nonlinear relationship between distance and flows during China’s Spring Festival. Humanit Soc Sci Commun 11, 1356 (2024). https://doi.org/10.1057/s41599-024-03779-8
DOI: https://doi.org/10.1057/s41599-024-03779-8