Exploring the predictive ability of the CA–Markov model for urban functional area in Nanjing old city

Hu, Xinyu; Zhu, Wei; Shen, Ximing; Bai, Ruxia; Shi, Yi; Li, Chen; Zhao, Lili

doi:10.1038/s41598-024-69414-3


Article
Open access
Published: 08 August 2024


Xinyu Hu¹, Wei Zhu¹, Ximing Shen¹, Ruxia Bai¹, Yi Shi², Chen Li¹ & Lili Zhao¹

Scientific Reports volume 14, Article number: 18453 (2024)


Abstract

With advancements in sustainable urban development, research on urban functional areas has garnered significant attention. In recent years, point-of-interest (POI) data, with their large volume of information and ease of acquisition, have been widely applied in research on urban functional areas. However, scholars currently focus on identifying urban functional areas, usually relying on data from a single period, whereas the prediction of functional areas has not yet been well validated. Therefore, in this study, we propose a new method based on several years of POI data to predict urban functional areas. Taking Nanjing City, Jiangsu Province, as an example, we first identified the functional area distribution of the old city of Nanjing over several years using POI data and then designed multiple sets of experiments to explore the CA–Markov model’s ability to predict functional areas from various aspects, including overall model accuracy, robustness, and a comparison between predictions and actual situations. The results show that (1) for both mixed and single functional areas, the model’s prediction accuracy is stable across several years, indicating the robustness of the model in predicting urban functional areas. (2) For mixed functional areas, model predictions rely largely on the distribution in the base years used for prediction, leading to inaccurate results; thus, the model is not yet applicable for simulating and predicting mixed functional areas. (3) For single functional areas, or the primary function within an area, the model’s predicted degree of change was close to the actual degree of change, so the results can serve as a reference.


Introduction

The acceleration of urbanization has led to rapid economic growth and continuous improvement in the living standards of residents. However, long-term extensive spatial expansion and the disorderly spread of cities have triggered a series of social, economic, and ecological problems¹, such as air pollution², environmental degradation³, traffic congestion⁴, and a reduction in agricultural land⁵. Urban functional areas, as the basic units of urban development, play a crucial role in the urbanization process. Their scientific delineation and rational allocation can not only solve urban problems but also have considerable importance in optimizing urban spatial structures, rationally arranging spatial elements, and promoting sustainable development^{6,7,8}. Conventional studies of urban functional areas have been mainly based on remote sensing imagery, subjective assessments, surveys, statistics, and land-use data, which typically consume substantial human and material resources and can yield inaccurate outcomes⁹. In the context of informatization, point-of-interest (POI) data can describe the spatial attributes and geographic location information of cities and are characterized by large information content, easy acquisition, and convenient analysis, providing new perspectives for research on urban functional areas. Many researchers have investigated urban functional areas using POI data^{10,11,12,13,14,15}. Huang et al. identified urban functional areas by integrating the POI density and spatial heterogeneity¹⁶. Gao et al. extracted urban functional areas from POI and crowd activities based on social network locations¹⁷. Li et al. identified urban functional areas and their mixed-use degrees based on POI data⁷. Xu et al. determined the characteristics of urban functional distribution by integrating remote sensing images and POI^{18,19}. In these studies, urban functional areas were usually identified and characterized based on POI data from a single time period, thereby facilitating sustainable urban development.

However, with the transition from incremental to stock planning, the configuration of urban functional areas has undergone unprecedented changes. These changes directly affect urban spaces and economic structures, leading to changes in resident lifestyles and urban service functions²⁰. For example, traditional industrial areas may be transformed into innovation and technology parks, whereas residential areas may integrate more commercial and public service facilities to meet the needs of urban residents. Changes in urban functional areas require urban planners and decision-makers to accurately grasp the development trends of these areas to effectively allocate them and devise rational planning strategies. Research that merely identifies urban functional areas is therefore not comprehensive, and some scholars have investigated changes in urban functional areas. Shen et al. explored the evolution of urban functions in the center of Shanghai²¹, whereas Schiavina et al. examined the evolution of land use efficiency in urban functional areas²². Wang et al. investigated urban functional areas and their dynamic changes based on years of traffic smart card data²³, whereas Zhou et al. took Changchun as an example to explore the changes in urban land use functions and expansions against the backdrop of urbanization²⁴. Overall, the current research on urban functional areas primarily focuses on the identification, distribution characteristics, and dynamic changes in these areas, with less attention paid to the simulation and prediction of functional areas. Changes in urban functional areas directly impact urban planning and development; thus, predicting these changes can provide urban planners with a more forward-looking perspective. The prediction of urban functional areas is essential for solving urban problems, addressing future challenges, and promoting sustainable development.

In recent years, advancements in Geographic Information Systems (GIS), big data analytics, and the emergence of new data sources such as POI data have enabled the prediction of urban functional areas. However, despite the support of these technologies and data, the prediction of urban functional areas still faces many challenges, including the accuracy of data, applicability of models, and reliability of prediction results. In this context, the CA–Markov model, which has high international recognition, combines the spatiotemporal prediction capabilities of Cellular Automata (CA) with the quantity prediction capabilities of the Markov model. Based on multi-objective, multi-criteria allocation principles, it has achieved high urban prediction accuracy and gained widespread attention. Numerous studies have shown that the CA–Markov model can effectively coordinate with GIS to create suitable methods for dynamic modeling of spatial and temporal changes in cities^{25,26,27}. This model has been widely applied to predict changes in urban land use^{28,29,30,31,32}. Compared with land use, changes in urban functions are more rapid and complex. Therefore, as a mature model widely used in urban predictions, how capable is the CA–Markov model in predicting urban functional areas?

Hence, this study attempted to assess the capability of the CA–Markov model to predict changes in urban functional areas based on the identification of these areas over several years and the integration of POI data. This study specifically focused on the following: first, identifying the functional areas of the Nanjing old city district over several years based on POI data; second, exploring and evaluating the effectiveness and accuracy of the CA–Markov model in predicting changes in urban functional areas; and finally, discussing the potential of this model in the application of urban functional area predictions. This study aimed to contribute to the sustainable development of cities.

Materials and methods

Study area

Nanjing is the capital of Jiangsu Province and serves as its political, economic, cultural, technological, educational, and financial hub. By the end of 2022, the permanent population of Nanjing reached 9.49 million, and its GDP increased to 1.69 trillion yuan, placing it tenth among Chinese cities. In addition, Nanjing functions as a crucial nexus within the Yangtze River Delta and is representative of large cities in China. Nanjing old city was selected as the research area, and the scope of the study is shown in Fig. 1.

Data sources and processing

The research data for this study were POI locations in Nanjing old city from 2015 to 2023 sourced from the Amap Open Platform (https://lbs.amap.com/). Individual POI data included attributes such as name, category, address, longitude, and latitude. According to the original data crawled from the Amap Open Platform, POI data was categorized into three levels. The primary classification is a major category classification, including major categories such as catering services, accommodation services, government agencies and social organizations, shopping services, and corporate enterprises. The secondary classification is a subtype of the primary classification, taking public services as an example, including government agencies, hospitals, schools, and other intermediate categories. The tertiary classification is a subcategory of the secondary classification, taking schools as an example, including primary schools, secondary schools, universities, and other subcategories. POI data are extensively categorized into multiple levels, leading to data redundancy and overlapping. In addition, the classification standards for POI data do not align with the urban land-use classification standards, making them inconvenient for research and discussion. Consequently, we removed points with lower public recognition and reclassified the POI data to clearly represent the characteristics of urban functional areas. Referencing the “Urban Land Classification and Planning Construction Land Standards (GB50137—2011)” and the “National Economic Industry Classification (GB/T4754—2017)” and following the principles of universality and consistency in POI classification, we reclassified POI data into six major functional categories: business, traffic, industrial, public services, green space, and residential, thereby obtaining POI functional type data for each year³³. The specific classification is shown in Table 1.

Table 1 Functional classification of POI.


Research methodology

Study process

This study was divided into four steps, as shown in Fig. 2. First, POI data spanning several years were obtained from the Amap Open Platform and categorized into six functional types: traffic, industrial, business, public services, residential, and green space. Second, the TF-IDF method was utilized to identify urban functional areas over multiple years. Third, after the identification of urban functional areas over several years, the CA–Markov model was utilized to conduct multiple predictions. Finally, after the completion of predictions, the predicted distribution of functional areas was compared with the actual distribution, comprehensively studying the model’s predictive capability and accuracy in predicting functional areas from various perspectives.

TF-IDF model

Because functional area data could not be directly obtained, this study employed the TF-IDF model to calculate the importance of POI functional points both locally and globally and used the actual contribution of POI functional points to delineate urban functional areas. The TF-IDF model is a statistical method commonly used in information retrieval and data mining to evaluate the importance of a word to a document in a collection or corpus and has been proven to have good classification capabilities for urban functional area identification^{13,14,34,35,36}. TF represents term frequency, whereas IDF represents the inverse document frequency index. In this study, each spatial unit was considered a separate document, with the category of each POI within a document regarded as a word, thereby converting the analysis of the functions of spatial units into calculating the weight of each word in the collection of documents. After multiple experiments, this study empirically adopted a 100 m grid to homogenize and divide the research area, resulting in 4316 spatial units. Our goal was to use the TF-IDF algorithm to measure the contribution of POI functions, thereby identifying the functions within each grid. Drawing on the TF-IDF model, the actual contribution (AC) of POI functions can be calculated to comprehensively reflect the scale and usage characteristics of POI functions. The calculation formula is as follows:

$${f}_{i,j}=\frac{{n}_{i,j}}{\sum_{k} {n}_{k,j}}$$

(1)

$${b}_{i}=\text{log}\frac{|D|}{|\{j:{t}_{i}\in {d}_{j}\}|}$$

(2)

$$A{C}_{i}={f}_{i,j}\times {b}_{i}$$

(3)

where $i$ represents the type of POI function, $j$ represents the functional area (grid unit) in which the POI function type is located, $n_{i,j}$ represents the frequency of occurrence of POI function type $i$ in functional area $j$, $\sum_{k} n_{k,j}$ represents the total frequency of occurrence of all POI function types in that functional area, $f_{i,j}$ represents the proportion of type $i$ in that total, $|D|$ represents the total number of functional areas, $|\{j:{t}_{i}\in {d}_{j}\}|$ represents the number of functional areas containing this POI function type, $b_{i}$ represents the general importance of this POI function type, and $AC_{i}$ represents the actual contribution of the POI function type in functional area $j$.
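As a rough illustration of Eqs. (1)–(3), the sketch below computes the actual contribution for a toy set of three grid units. The function name `actual_contribution`, the toy labels, and the use of the natural logarithm are assumptions made for illustration; the paper does not state the logarithm base or give an implementation.

```python
import math
from collections import Counter


def actual_contribution(grid_pois):
    """Return {grid id: {function type: AC}} following Eqs. (1)-(3).

    grid_pois maps a grid id to the list of POI function labels it contains.
    Natural log is an assumption; the source does not specify the base.
    """
    n_docs = len(grid_pois)  # |D|, the number of grid units
    # Number of grid units that contain each function type at least once.
    doc_freq = Counter()
    for labels in grid_pois.values():
        doc_freq.update(set(labels))

    result = {}
    for grid_id, labels in grid_pois.items():
        counts = Counter(labels)       # n_{i,j}
        total = sum(counts.values())   # sum_k n_{k,j}
        result[grid_id] = {
            func: (n / total) * math.log(n_docs / doc_freq[func])
            for func, n in counts.items()
        }
    return result


# Hypothetical toy data, not taken from the study.
toy = {
    "g1": ["business", "business", "residential"],
    "g2": ["residential", "green space"],
    "g3": ["residential"],
}
ac = actual_contribution(toy)
```

In this toy case, "residential" appears in every grid, so its inverse document frequency is zero and it contributes nothing anywhere, whereas "business" dominates grid `g1`. This is the usual TF-IDF behavior of down-weighting categories that are present everywhere.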

CA–Markov model

Cellular Automata (CA) is a dynamic system characterized by discrete spaces and states. The states of all cells are updated simultaneously according to the transition rules. The state of a cell at any given moment depends on that cell and its neighborhood at the previous moment. The formula is as follows:

$${S}_{t+1}=f\left({S}_{t},N\right)$$

(4)

where S represents the cell state, N represents the cell neighborhood, t and t + 1 represent successive moments in time, and f is the transition rule for the local spatial cells.
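To make the synchronous update in Eq. (4) concrete, here is a minimal sketch of one CA step on a small grid. The 3 × 3 Moore neighborhood and the majority rule are illustrative assumptions only; they are not the transition rule or filter used by IDRISI in this study.

```python
from collections import Counter


def ca_step(grid):
    """One synchronous update S_{t+1} = f(S_t, N) on a 2D list of states.

    Assumed rule (for illustration): a cell adopts the most common state in
    its 3x3 Moore neighborhood, and keeps its own state on ties.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy, so every cell reads the old state
    for r in range(rows):
        for c in range(cols):
            neighborhood = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            counts = Counter(neighborhood)
            best = max(counts.values())
            if counts[grid[r][c]] < best:
                new_grid[r][c] = max(counts, key=counts.get)
    return new_grid


# Hypothetical 3x3 example: one residential cell surrounded by business cells.
before = [
    ["B", "B", "B"],
    ["B", "R", "B"],
    ["B", "B", "B"],
]
after = ca_step(before)
```

Writing into a copy rather than updating in place is what makes the update simultaneous: every cell's new state depends only on the states at the previous moment.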

The Markov model is a stochastic model in the time domain, in which the state of an event at time t transitions to the state at time t + 1, and the state at time t + 1 depends solely on the state at time t. Its essence is to predict the probability of events. The transition matrix is a numerical representation of the likelihood of an event transitioning from its state at time t to its state at time t + 1 and serves as an important quantitative basis for the simulation and prediction results under the Markov model. This formula is expressed as follows:

$${S}_{t+1}={P}_{ij}\times {S}_{t}$$

(5)

P_ij is a state transition matrix, which reflects the probability of each type of urban functional area transforming into other types, calculated through historical data. The formula is as follows:

$${P}_{ij}=\left[\begin{array}{cccc}{P}_{11}& {P}_{12}& \dots & {P}_{1n}\\ {P}_{21}& {P}_{22}& \dots & {P}_{2n}\\ \dots & \dots & \dots & \dots \\ {P}_{n1}& {P}_{n2}& \dots & {P}_{nn}\end{array}\right]$$

(6)

$0 \le P_{ij} \le 1$ $(i, j = 1, 2, \dots, n)$, and each row sums to 1, where n represents the number of functional types and $P_{ij}$ represents the probability of initial type i transitioning to type j; i indexes the rows of the matrix and j indexes the columns. Each row of the matrix therefore gives the probabilities that functional type i transitions to each functional type.
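The following sketch shows one way a transition matrix like Eq. (6) can be estimated by cross-tabulating two classified maps, and how Eq. (5) then projects class shares one step forward. The function names and the eight-cell toy maps are hypothetical; in the study this step is performed by the Markov module in IDRISI, whose internals are not described here. Because rows index the origin type, the projection is computed as a row vector of shares multiplied by the matrix.

```python
def transition_matrix(earlier, later, classes):
    """Estimate P_ij as the share of cells of class i that became class j."""
    index = {c: k for k, c in enumerate(classes)}
    n = len(classes)
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(earlier, later):
        counts[index[a]][index[b]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A class absent from the earlier map gets a row of zeros in this sketch.
        matrix.append([x / total if total else 0.0 for x in row])
    return matrix


def project(shares, matrix):
    """Project class shares one step: next_j = sum_i shares_i * P_ij."""
    n = len(matrix)
    return [sum(shares[i] * matrix[i][j] for i in range(n)) for j in range(n)]


# Hypothetical flattened maps for two dates (eight cells, two classes).
classes = ["business", "residential"]
cells_2017 = ["business"] * 4 + ["residential"] * 4
cells_2020 = ["business", "business", "residential", "residential",
              "residential", "residential", "residential", "business"]

P = transition_matrix(cells_2017, cells_2020, classes)
shares_2020 = [cells_2020.count(c) / len(cells_2020) for c in classes]
shares_2023 = project(shares_2020, P)
```

The Markov step only fixes how much of each class is expected at the next date. Where those cells are placed on the map is the job of the CA component.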

The CA–Markov model, formed by combining the two models, can achieve accurate predictions. Specifically, it leverages the ability of CA to simulate spatial changes in complex systems and the ability of the Markov model to make quantitative predictions over long time series. This study employed the CA–Markov model to predict changes in functional areas. After identifying the functions of the corresponding grids using the POI data, the data were rasterized, reclassified, and ultimately imported into the IDRISI software for predictive analysis (https://clarklabs.org/)³⁷. The prediction process is illustrated here with one experimental set. Using the urban functional area data from 2017 and 2020, we calculated the probability of each functional type transforming into other types, which yielded the functional area transition matrix for 2017–2020. Using this matrix and the 2020 functional area data, we predicted the distribution of functional areas for 2023. The predicted results were then compared with the actual functional area distribution of 2023 to obtain the kappa coefficient, which indicates the accuracy of the CA–Markov model in predicting functional areas.

Designing three sets of experiments

After identifying the distribution of urban functional areas within the study area from 2015 to 2023, three sets of experiments were designed (Table 2) to comprehensively test the predictive ability of the CA–Markov model for urban functional areas. Initially, the forecasted outcomes for the functional areas in 2023 were juxtaposed with actual data to determine the overall accuracy, followed by the application of an identical methodology to determine the predictive accuracies for 2022 and 2021. The robustness of the CA–Markov model was confirmed based on the predictive accuracy obtained for the three experimental groups. Subsequently, the forecast data were imported into ArcGIS Pro for comparative spatial accuracy analysis.

Table 2 Three sets of experiments to verify the model’s ability.


Results

Results of urban functional areas identification

Because functions are mixed within the study region, grid units differ in their functional composition. For instance, in some grids, one function’s actual contribution is absolutely dominant; in others, two functions’ contributions are dominant; and in some cases, the differences in the actual contributions among multiple functions are not significant, so that the grid has a comprehensive character. Based on this, the study first filtered out areas without POI, designating them as no-data zones. In the remaining grid units, when the actual contribution value of a certain function within a grid exceeded 50%, that function was designated as the attribute of the functional area. When the actual contribution of no function within a grid exceeded 50%, two functions with actual contributions between 20 and 50% were designated as a combined functional area. For example, if the actual contributions of commercial and residential functions within a grid were between 20 and 50%, the grid was classified as a commercial-residential functional area. The other cases were classified as integrated functional areas^{6,38}. Following this method, we identified the distribution of mixed functional areas within the Nanjing old city for the years 2017, 2020, and 2023, as specifically shown in Fig. 3.
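The threshold rules described above can be sketched as a small classifier. The function name `classify_grid` and the example values are hypothetical. The sketch also makes two assumptions that the text does not spell out: actual contributions are normalized to shares within each grid before the 50% and 20% thresholds are applied, and a combined label is assigned only when exactly two functions fall in the 20–50% band.

```python
def classify_grid(contrib):
    """Label one grid unit from its per-function actual contributions.

    contrib maps function type -> non-negative actual contribution.
    """
    total = sum(contrib.values())
    if total == 0:
        # Covers grids without POI (and, in this sketch, all-zero contributions).
        return "no data"
    shares = {f: v / total for f, v in contrib.items()}
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    top_func, top_share = ranked[0]
    if top_share > 0.5:
        return top_func  # single functional area
    in_band = [f for f, s in ranked if 0.2 <= s <= 0.5]
    if len(in_band) == 2:
        return "-".join(in_band)  # combined (mixed) functional area
    return "integrated"


# Hypothetical grids, not taken from the study.
single = classify_grid({"business": 60, "residential": 40})
mixed = classify_grid({"business": 45, "residential": 40, "traffic": 15})
integrated = classify_grid(
    {"business": 30, "residential": 30, "traffic": 30, "green space": 10}
)
empty = classify_grid({})
```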

Verification of identification results

After identifying the distribution of functional areas, this paper took the identification results of the functional areas in 2023 as an example, referencing the studies by Ding to validate the accuracy of identification³⁹. The study randomly selected 40 grids, judged the real attributes of functional areas based on Amap, and used a conformity scoring method to evaluate the accuracy of the identification results. A perfect score of 3 points indicates full compliance, and 0 points indicate complete non-compliance. If a single functional area is identified as a mixed functional area, it scores 2; if it is marked as a comprehensive functional area, it scores 1. If a mixed functional area is identified as containing a single functional area from that mix, or another mixed functional area containing a function from that mix, it scores 2; if it is marked as a comprehensive functional area, it scores 1. If a comprehensive functional area is identified as a single functional area or a mixed functional area, it scores 1.

The formula for calculating accuracy is:

$$a=\frac{\sum_{i=1}^{n} {x}_{i}}{\sum_{i=1}^{n} {X}_{i}}\times 100\text{\%}$$

(7)

In the formula, n represents the sample size, X_i is the full score of sample i (3 points), so that the denominator is the total score obtained if every sample were identified with full accuracy, and x_i is the actual score of sample i. The verification results of functional area accuracy are shown in Table 3. Ultimately, by calculation, the overall accuracy of identifying the functional areas in the old city of Nanjing reached 80.83%, indicating the effectiveness of this method in identifying urban functional areas.
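A short worked example of Eq. (7) may help. With 40 sampled grids and a full score of 3 points each, the denominator is 120 points. The score list below is hypothetical (it is not Table 3); it was chosen so that the total is 97 points, which is one way of arriving at the reported 80.83%.

```python
def identification_accuracy(scores, full_score=3):
    """Eq. (7): total awarded score over the maximum possible score, in percent."""
    return sum(scores) / (full_score * len(scores)) * 100


# Hypothetical conformity scores for 40 sampled grids (illustration only).
scores = [3] * 25 + [2] * 9 + [1] * 4 + [0] * 2
accuracy = identification_accuracy(scores)
```

Here the scores sum to 97 out of a possible 120, so the accuracy is 97/120 × 100, or about 80.83%.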

Table 3 Accuracy evaluation of urban functional area.


We considered the complex distribution of functional areas and the fact that the CA–Markov model currently simulates each cell as a single land-use class. Therefore, each mixed functional area was reassigned to the single function with the highest proportion. For example, if 60% of a business-public service functional area was business, the area was allocated as a business functional area. Using this method, we identified the distribution of single functional areas in Nanjing old city, namely the primary functions within the grid, as shown in Fig. 4.

Verification of overall accuracy and robustness in urban functional area prediction

After the functional areas were identified, they were imported into IDRISI for reclassification to facilitate further processing. Based on the functional area data from 2017 and 2020, a mixed-function transition matrix for 2017–2020 was generated using the Markov module to obtain the probability of converting each function type into other functions. Next, we used the functional area data from 2020 combined with the transition matrix to predict the functional area distribution in 2023. This study evaluated the precision of the predictions using the kappa coefficient, which here ranges from 0 to 1 (negative values, indicating agreement worse than chance, are possible in principle). The higher the value, the more precise the results: accuracy is lowest when the value is between 0 and 0.2, the result is acceptable when the value is between 0.4 and 0.6, a value between 0.61 and 0.8 indicates higher precision, and a value greater than 0.8 indicates very high accuracy. We compared the predicted results of the mixed functional areas in 2023 with the actual situation to obtain a kappa value representing the overall accuracy.
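For readers unfamiliar with the kappa coefficient, the sketch below computes the standard Cohen's kappa between an actual and a predicted map, both flattened to lists of cell labels. Using this standard form is an assumption: the text does not say which kappa variant IDRISI reported. The toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two equal-length label lists.

    Undefined (division by zero) when both maps contain a single identical class.
    """
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Agreement expected by chance from the two marginal distributions.
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical six-cell example with three classes.
actual = ["B", "B", "R", "R", "G", "G"]
predicted = ["B", "B", "R", "G", "G", "G"]
kappa = cohen_kappa(actual, predicted)
```

In this example five of six cells agree (observed agreement 5/6) and chance agreement is 1/3, giving a kappa of 0.75, which falls in the "higher precision" band described above.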

Overall accuracy and robustness of mixed functional areas

This study predicted the mixed functional areas for 2023 based on the distributions of 2017 and 2020, with the results shown in Fig. 5. By comparing this with the actual situation of mixed functional areas in 2023, we obtained the overall accuracy of the prediction. We predicted the mixed functional areas for 2022 and 2021 using the same method and obtained their respective predictive accuracies, as shown in Table 4. Overall, the accuracy of the CA–Markov model’s predictions for mixed functional areas in 2021, 2022, and 2023 tended to be stable, confirming its robustness for predicting mixed functional areas.

Table 4 Prediction accuracy of three groups of years in mixed functional areas.


Overall accuracy and robustness of single functional areas

This study predicted the single functional areas for 2023 based on the distributions in 2017 and 2020, with the results shown in Fig. 6. We obtained the overall accuracy of the prediction by comparing this with the actual situation of single functional areas in 2023. We predicted the single functional areas for 2022 and 2021 using the same method and obtained their respective predictive accuracies, as shown in Table 5. Overall, the accuracy of the CA–Markov model’s predictions for single functional areas in 2021, 2022, and 2023 tended to be stable, confirming its robustness in predicting single functional areas.

Table 5 Prediction accuracy of three groups of years in single functional areas.


Spatial accuracy of model

In addition to validating the kappa value and robustness of the model, this study further verified its spatial accuracy. Taking the predicted distribution of functional areas for 2023, based on the functional areas from 2017 and 2020, as an example, we imported the predicted results into ArcGIS Pro and conducted a differential analysis between the actual distribution of functional areas in 2023 and 2020 to obtain the actual changes in functional areas. Subsequently, a differential analysis was performed between the predicted distribution of functional areas in 2023 and the actual distribution in 2020, comparing the actual changes with the predicted changes to further analyze the accuracy of the CA–Markov model in predicting functional areas.

We first analyzed the comparison of mixed functional areas, with specific results shown in Fig. 7. In the figure, the yellow blocks represent the parts that remain the same when the 2 years are compared, indicating areas where the functions have not changed, and the blue blocks represent the parts that differ, showing the changes in the functional areas between the 2 years. It is visually apparent from the figure that the blue area in the right panel is much smaller than that in the left panel, indicating that the predicted changes in the mixed functional areas from 2020 to 2023 are much smaller than the actual changes over the same period. Through further analysis, we obtained a coincidence rate of 0.52 between the actual distribution of mixed functional areas in 2023 and that in 2020, while the coincidence rate between the predicted distribution of mixed functional areas in 2023 and the actual distribution in 2020 was 0.81. In other words, the mixed functional areas actually changed by 48% from 2020 to 2023, whereas the predicted change was only 19%. This is clearly inconsistent with the actual situation: the model’s prediction relies largely on the original mixed-use distribution, while the actual changes are much greater than the model predicts. The model therefore has low predictive accuracy for changes in mixed-use areas, and such predictions are unreliable.
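The coincidence rate used here can be read as the share of grid cells that carry the same label in two maps, and the change rate as one minus that share (for example, 1 − 0.52 = 0.48 and 1 − 0.81 = 0.19). The sketch below illustrates the calculation on hypothetical ten-cell maps. The function name and data are assumptions, and in practice no-data cells would presumably be excluded first.

```python
def coincidence_rate(map_a, map_b):
    """Share of cells with identical labels in two equally sized flattened maps."""
    same = sum(a == b for a, b in zip(map_a, map_b))
    return same / len(map_a)


# Hypothetical flattened maps (ten cells), not the study data.
actual_2020 = ["B", "B", "R", "R", "G", "G", "P", "P", "T", "I"]
actual_2023 = ["B", "R", "R", "B", "G", "P", "P", "G", "T", "I"]
predicted_2023 = ["B", "B", "R", "R", "G", "G", "P", "G", "T", "I"]

actual_rate = coincidence_rate(actual_2020, actual_2023)        # actual persistence
predicted_rate = coincidence_rate(actual_2020, predicted_2023)  # predicted persistence
actual_change = 1 - actual_rate
predicted_change = 1 - predicted_rate
```

A predicted change much smaller than the actual change, as in this toy case, is the same pattern the study reports for mixed functional areas: the prediction stays close to the base-year map.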

Using the same method, we analyzed the comparison of single functional areas, with specific results shown in Fig. 8. As in the previous analysis, yellow blocks represent the parts that remain the same when the 2 years are compared, indicating areas where the functions have not changed, while blue blocks represent the parts that differ, showing the changes in functional areas between the 2 years. The difference between the two panels cannot be discerned by eye. Through further analysis, we found that the overlap rate between the actual distribution of single functional areas in 2023 and that in 2020 was 0.69, while the overlap rate between the predicted distribution of single functional areas in 2023 and the actual distribution in 2020 was 0.73. In other words, the single functional areas actually changed by 31% from 2020 to 2023, and the predicted change over the same period was 27%; the difference between the two was not significant, indicating that the model can predict the degree of change in single functional areas.

In order to explore the specific situation of model prediction of changes in single functional areas, we calculated the proportion of major function prediction errors in single-function areas, as shown in Fig. 9. We found that the highest accuracy in prediction was for business function areas, with an accuracy of around 0.75, while the lowest was for industrial function areas, with an accuracy of around 0.60; the accuracies for the other four types of function areas were around 0.7, and overall, there was no obvious trend of dispersion in the proportion of prediction errors for the six types of function areas.
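Per-function accuracy of this kind can be computed as the share of cells of each actual type that were predicted correctly. The sketch below shows the calculation on hypothetical data whose values were chosen to echo the reported magnitudes for business (about 0.75) and industrial (about 0.60) areas. It is not the authors' procedure or data.

```python
from collections import defaultdict


def per_class_accuracy(actual, predicted):
    """For each actual class, the fraction of its cells predicted correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for a, p in zip(actual, predicted):
        totals[a] += 1
        if a == p:
            hits[a] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical flattened maps (nine cells), for illustration only.
actual = ["business"] * 4 + ["industrial"] * 5
predicted = ["business", "business", "business", "residential",
             "industrial", "industrial", "industrial", "business", "residential"]
class_accuracy = per_class_accuracy(actual, predicted)
```

The prediction error proportion for each function type is then simply one minus its accuracy.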

To further investigate the prediction errors for each type of function, we obtained the spatial prediction results for 2023 in ArcGIS Pro, as shown in Fig. 10. In this figure, gray grids represent the cases where predictions for the six types of functional areas were incorrect, while other colors represent correct predictions. Taking the commercial function area as an example, gray grids indicate instances where the true situation was a commercial function area but was incorrectly predicted, whereas red grids indicate instances where the true situation was a commercial function area and was correctly predicted. The results show that the grids where each type of function area was incorrectly predicted did not exhibit obvious spatial clustering characteristics.

Discussion

Research results compared to other studies

Many scholars have focused on prediction in urban public areas, mainly covering aspects such as urban public facilities, the utilization of urban public spaces⁴⁰, the vitality of urban public spaces⁴¹, urban traffic flow⁴², urban population growth⁴³, urban land use changes³², and urban green space demand²⁷. In contrast, urban function is a key element that reflects urban morphology and influences urban structure, and the development of urban functional areas is closely related to urban spatial structure. This study focused on the identification and distribution characteristics of functional areas using POI data and explored the predictive ability of the CA–Markov model for functional areas, with the aim of supporting reasonable functional zoning and layout decisions, including guiding the layout of various businesses such as catering and entertainment, optimizing urban public spaces, open spaces, and green spaces, and assisting land use planning to improve spatial efficiency. In this way, reasonable urban spatial structures can be created to address urban issues and achieve sustainable development^{44,45}.

Currently, scholars mainly rely on the CA–Markov model to study the evolution patterns and trends of geographic spatial systems, predict future spatial patterns and states, and apply it primarily in areas such as land use change simulation²⁸, ecosystem evolution analysis⁴⁶, urban expansion simulation², resource management and environmental protection⁴⁷. However, fewer scholars have paid attention to the model’s prediction of urban functional areas. The CA–Markov model combines the spatial self-organizing properties of cellular automata with the time series forecasting ability of Markov chains, enabling the model to simultaneously handle complex dynamic changes in space and time³⁰. This characteristic makes the model suitable for addressing issues such as urban functional areas, where spatial distribution evolves over time. Based on identifying functional areas, this study delved into the CA–Markov model’s ability to predict functional areas, thus expanding the model’s applicability.

This study took the old city area of Nanjing as an example, using POI data to identify the distribution of urban functional areas. Based on this, it studied the capability of the CA–Markov model to predict functional areas over several years. The results show that the model reliably predicts the main functions within the functional areas, which is consistent with the findings of Zheng’s study³⁸. Unlike previous studies, this research provided a more comprehensive explanation of the model’s capability to predict urban functional areas through various aspects such as model accuracy, robustness validation, comparative analysis of predicted changes, and prediction outcomes for specific functional areas. We found that the model is robust in predicting functional areas; however, its predictions for mixed functional areas are inaccurate, and it currently cannot support predictions for mixed functional areas. Additionally, this study specifically analyzed the probability of prediction errors for each type of functional area and the spatial occurrences of these errors.

Feasibility analysis

This study used POI data from multiple time points spanning the 9 years from 2015 to 2023. Validation with multi-year data supports the generality of the findings, allows a more comprehensive account of the CA–Markov model’s predictive capabilities for functional areas, and can inform the rational allocation of urban functions.

For the identification of urban functional areas, this study verified accuracy by comparing the identified areas with real functional areas, obtaining an identification accuracy of 0.81. We believe that the identified functional area data can support the exploration of the model’s predictive capabilities.

To assess the applicability of the model in predicting urban functional areas, this study first verified its predictions through robustness and kappa values. Both reflect the accuracy of the model’s predictions to some extent, but they do not fully explain the strengths and significance of the model. Consequently, this study further analyzed the spatial accuracy of the CA–Markov model by comparing the actual distribution of functional areas with the predicted results and assessing each type of functional area in detail, thereby offering a more comprehensive interpretation of the model’s ability to predict urban functional areas.
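For readers unfamiliar with the kappa value, the standard Cohen's kappa compares the observed agreement p_o between two maps with the agreement p_e expected by chance: kappa = (p_o − p_e) / (1 − p_e). The sketch below computes it for two small label arrays. It is a minimal illustration, assuming integer class codes; the function name `cohen_kappa` and the toy data are hypothetical, and the study may have used a different kappa variant or software.

```python
import numpy as np

def cohen_kappa(actual, predicted, n_classes):
    """Cohen's kappa between two categorical maps with integer class codes."""
    confusion = np.zeros((n_classes, n_classes), dtype=float)
    # Rows index the actual class, columns the predicted class.
    np.add.at(confusion, (actual.ravel(), predicted.ravel()), 1)
    total = confusion.sum()
    observed = np.trace(confusion) / total
    # Chance agreement from the row and column marginals.
    expected = (confusion.sum(axis=1) @ confusion.sum(axis=0)) / total**2
    # Undefined when expected == 1 (both maps contain a single class).
    return (observed - expected) / (1.0 - expected)

# Toy example: nine cells, three classes, one cell predicted wrongly.
actual = np.array([0, 0, 1, 1, 2, 2, 1, 1, 0])
predicted = np.array([0, 0, 1, 1, 2, 1, 1, 1, 0])

kappa = cohen_kappa(actual, predicted, n_classes=3)
print(round(kappa, 2))
```

Because kappa is a single global score, it cannot show where errors occur, which is why a cell-by-cell comparison of predicted and actual maps is a useful complement.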

Limitations and future directions

To examine the scientific validity and integrity of the CA–Markov model in predicting urban functional areas, this study set different time spans for validation, conducting three sets of tests to examine the prediction results for time spans of 1, 2, and 3 years, as shown in Table 6. The average accuracy values for the three time spans were similar, so we selected a 3-year time span for the prediction study. Additionally, because the data spanned the 9 years from 2015 to 2023, a time span of 4 years or more would not allow three sets of control experiments to be formed, making the validation results less representative. In future research, as more POI data are collected, predictions over longer time spans can be explored.
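The constraint on the time span can be checked by simple enumeration. The sketch below assumes that each experiment needs three maps separated by the same span: two observed maps to calibrate the transition and a third observed map to validate the prediction. That structure is an assumption made here for illustration, and the function name `experiment_sets` is hypothetical.

```python
def experiment_sets(first_year, last_year, span):
    """List (base, transition, target) year triples separated by `span` years."""
    return [
        (base, base + span, base + 2 * span)
        for base in range(first_year, last_year - 2 * span + 1)
    ]

# Number of possible experiment sets within 2015-2023 for each span.
for span in (1, 2, 3, 4):
    sets = experiment_sets(2015, 2023, span)
    print(span, len(sets), sets)
```

Under this assumption, a 3-year span yields exactly three sets within 2015 to 2023, whereas a 4-year span yields only one, which matches the reasoning above.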

Table 6 Model accuracy results for different time spans.

Furthermore, for single functional areas, the predicted results still require improvement when compared with the actual situation. In reality, a green space function is sometimes surrounded by business functions. In the model’s prediction, when the surrounding area is entirely business, the green space in the middle also has a high probability of being converted into a business function, implying that the surrounding business functions affect the existence of the green space. This follows the rules of the model but does not fully match reality: if a green space within a business area is a park or plaza, it is likely to persist rather than convert simply because it is surrounded by business functions. Therefore, there is room to improve the accuracy of the CA–Markov model’s predictions for functional areas.
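The enclosed green space case can be reproduced with a toy neighborhood rule. The sketch below is only an illustration under stated assumptions: it uses a plain 3×3 majority vote in place of the actual CA–Markov filter and transition rules, and the class codes, the `majority_step` function, and the `protected` mask are hypothetical names introduced here. The mask shows one possible way a constraint could keep a park or plaza from converting; it is not a method proposed in this study.

```python
import numpy as np

BUSINESS, GREEN = 1, 2  # illustrative class codes, not those used in the study

def majority_step(grid, protected=None):
    """One pass of a 3x3 majority-vote rule over interior cells."""
    result = grid.copy()
    rows, cols = grid.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if protected is not None and protected[r, c]:
                continue  # hypothetical constraint: the cell keeps its class
            window = grid[r - 1:r + 2, c - 1:c + 2].ravel()
            neighbours = np.delete(window, 4)  # drop the centre cell itself
            result[r, c] = np.bincount(neighbours).argmax()
    return result

# A single green cell enclosed by business cells.
grid = np.full((3, 3), BUSINESS)
grid[1, 1] = GREEN

converted = majority_step(grid)             # neighbourhood rule alone

mask = np.zeros_like(grid, dtype=bool)
mask[1, 1] = True
kept = majority_step(grid, protected=mask)  # same rule with the cell protected

print(converted[1, 1], kept[1, 1])
```

With the neighborhood rule alone the enclosed cell takes the class of its surroundings, which mirrors the behavior described above; adding location-specific constraints is one direction in which such rules could be brought closer to reality.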

The urban spatial structure is shaped by factors such as population distribution, land use, and transportation systems⁴⁸,⁴⁹,⁵⁰, which interact to determine the form and function of the city. Urban functional areas, as an essential component of urban spatial structure, reflect the primary uses and characteristics of different regions within a city¹⁴, such as business, residential, and industrial areas. Therefore, research on urban functional areas must consider information on population dynamics, land use patterns, and traffic flow. Future studies could use models that incorporate factors such as population shifts, employment changes, traffic volume, road accessibility, and land use change to analyze comprehensively how urban functional areas change. Such models could support suitability atlases for the transformation of each functional area and yield more reasonable and reliable predictions of functional area development, promoting sustainable and efficient city operations.

This study has certain limitations. Amap’s multi-year records are not fully consistent for the same real-world POI: some POIs that reflect urban functions, such as sculptures in parks, were recorded in some years but omitted in others, even though they exist every year. Such omissions could affect the POI functional weights and thereby the identification and prediction results for each year. Nevertheless, Amap is China’s leading provider of digital map content, navigation, and positioning services and is widely recognized and used in the industry, which provides a certain guarantee for the reliability of its data. Because such omissions are unlikely and the proportion of omitted POIs relative to the total number of POIs across years is negligible, we believe they have little impact on the identification of functional areas and the prediction results. In future research, more accurate POI data could further refine the conclusions of this study.

Conclusion

Research on urban functional areas is crucial for promoting sustainable urban development. Identifying urban functional areas and predicting their changes helps in understanding the development of urban functions, which in turn supports the rational allocation of functional areas, the rationalization of urban spatial structures, and the resolution of urban problems. This study first identified the distribution of urban functional areas in the old city of Nanjing over several years and then used the CA–Markov model to predict their changes. The results show that, for mixed or single functional areas, the model’s predictions over several years tend to be stable, and the accuracy of the predictions over many years indicates the robustness of the model in predicting urban functional areas. For mixed functional areas, model predictions rely largely on the distribution of the base years used for prediction, which leads to inaccurate results; the model is therefore not yet applicable to simulating and predicting mixed functional areas. For single functional areas or the primary function within an area, the degree of change predicted by the model was close to the actual degree of change, so the results can serve as a reference.

In summary, this study identified functional areas from POI data and explored the capacity of the CA–Markov model to predict changes in these areas, presenting a standardized and scalable workflow. This approach addresses the challenge of functional area identification in urban functional area research and facilitates the understanding and rational arrangement of urban spatial structures and functional layouts to enhance spatial utilization efficiency. Moreover, in the context of sustainable development, it offers fresh perspectives for forecasting and researching future urban functional areas.

Data availability

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

References

Shao, Z., Bakker, M., Spit, T., Janssen-Jansen, L. & Qun, W. Containing urban expansion in China: The case of Nanjing. J. Environ. Plan. Manag. 63, 189–209 (2020).
Hien, P. D., Men, N. T., Tan, P. M. & Hangartner, M. Impact of urban expansion on the air pollution landscape: A case study of Hanoi, Vietnam. Sci. Total Environ. 702, 134635 (2020).
Mao, D. et al. China’s wetlands loss to urban expansion. Land Degrad. Dev. 29, 2644–2657 (2018).
Lu, J., Li, B., Li, H. & Al-Barakani, A. Expansion of city scale, traffic modes, traffic congestion, and air pollution. Cities 108, 102974 (2021).
Radwan, T. M., Blackburn, G. A., Whyatt, J. D. & Atkinson, P. M. Dramatic loss of agricultural land due to urban expansion threatens food security in the Nile Delta, Egypt. Remote Sens. 11, 332 (2019).
Wang, Z., Ma, D., Sun, D. & Zhang, J. Identification and analysis of urban functional area in Hangzhou based on OSM and POI data. PLoS One 16, e0251988 (2021).
Li, Y., Liu, C. & Li, Y. Identification of urban functional areas and their mixing degree using point of interest analyses. Land 11, 996 (2022).
Hu, Y. & Han, Y. Identification of urban functional areas based on POI data: A case study of the Guangzhou economic and technological development zone. Sustainability 11, 1385 (2019).
Cui, H., Wu, L., Hu, S., Lu, R. & Wang, S. Recognition of urban functions and mixed use based on residents’ movement and topic generation model: The case of Wuhan, China. Remote Sens. 12, 2889 (2020).
Chen, J. et al. Comprehensive strength evaluation system of commercial centres based on multi-source data: A case of Hefei central city. Sci. Rep. 13, 17082 (2023).
Sun, Z., Jiao, H., Wu, H., Peng, Z. & Liu, L. Block2vec: An approach for identifying urban functional regions by integrating sentence embedding model and points of interest. ISPRS Int. J. Geo-Inf. 10, 339 (2021).
Zhang, X., Li, W., Zhang, F., Liu, R. & Du, Z. Identifying urban functional zones using public bicycle rental records and point-of-interest data. ISPRS Int. J. Geo-Inf. 7, 459 (2018).
Zhai, W. et al. Beyond Word2vec: An approach for urban functional region extraction and identification by combining Place2vec and POIs. Comput. Environ. Urban Syst. 74, 1–12 (2019).
Niu, H. & Silva, E. A. Delineating urban functional use from points of interest data with neural network embedding: A case study in Greater London. Comput. Environ. Urban Syst. 88, 101651 (2021).
Wang, T., Wang, H., Wang, Z. & Huang, J. Dynamic risk assessment of urban flood disasters based on functional area division—A case study in Shenzhen, China. J. Environ. Manag. 345, 118787 (2023).
Huang, C., Xiao, C. & Rong, L. Integrating point-of-interest density and spatial heterogeneity to identify urban functional areas. Remote Sens. 14, 4201 (2022).
Gao, S., Janowicz, K. & Couclelis, H. Extracting urban functional regions from points of interest and human activities on location-based social networks. Trans. GIS 21, 446–467 (2017).
Song, J., Lin, T., Li, X. & Prishchepov, A. V. Mapping urban functional zones by integrating very high spatial resolution remote sensing imagery and points of interest: A case study of Xiamen, China. Remote Sens. 10, 1737 (2018).
Xu, S. et al. A new remote sensing images and point-of-interest fused (RPF) model for sensing urban functional regions. Remote Sens. 12, 1032 (2020).
Wang, J., Gao, C., Wang, M. & Zhang, Y. Identification of urban functional areas and urban spatial structure analysis by fusing multi-source data features: A case study of Zhengzhou, China. Sustainability 15, 6505 (2023).
Shen, Y. & Karimi, K. Urban evolution as a spatio-functional interaction process: The case of central Shanghai. J. Urban Des. 23, 42–70 (2018).
Schiavina, M. et al. Land use efficiency of functional urban areas: Global pattern and evolution of development trajectories. Habitat Int. 123, 102543 (2022).
Wang, Z. et al. Identifying urban functional areas and their dynamic changes in Beijing: Using multiyear transit smart card data. J. Urban Plan. Dev. 147, 04021002 (2021).
Zhou, G., Zhang, J., Li, C. & Liu, Y. Spatial pattern of functional urban land conversion and expansion under rapid urbanization: A case study of Changchun, China. Land 11, 119 (2022).
Guan, D. et al. Modeling urban land use change by the integration of cellular automaton and Markov model. Ecol. Model. 222, 3761–3772 (2011).
Riccioli, F., El Asmar, T., El Asmar, J.-P. & Fratini, R. Use of cellular automata in the study of variables involved in land use changes: An application in the wine production sector. Environ. Monit. Assess. 185, 5361–5374 (2013).
Roose, M. & Hietala, R. A methodological Markov-CA projection of the greening agricultural landscape—A case study from 2005 to 2017 in southwestern Finland. Environ. Monit. Assess. 190, 1–13 (2018).
Hamad, R., Balzter, H. & Kolo, K. Predicting land use/land cover changes using a CA-Markov model under two different scenarios. Sustainability 10, 3421 (2018).
Zhang, Z., Hu, B., Jiang, W. & Qiu, H. Identification and scenario prediction of degree of wetland damage in Guangxi based on the CA-Markov model. Ecol. Indic. 127, 107764 (2021).
Fu, F., Deng, S., Wu, D., Liu, W. & Bai, Z. Research on the spatiotemporal evolution of land use landscape pattern in a county area based on CA-Markov model. Sustain. Cities Soc. 80, 103760 (2022).
Huang, Y., Yang, B., Wang, M., Liu, B. & Yang, X. Analysis of the future land cover change in Beijing using CA–Markov chain model. Environ. Earth Sci. 79, 60 (2020).
Karimi, H., Jafarnezhad, J., Khaledi, J. & Ahmadi, P. Monitoring and prediction of land use/land cover changes using CA-Markov model: A case study of Ravansar County in Iran. Arab. J. Geosci. 11, 1–9 (2018).
Luo, G., Ye, J., Wang, J. & Wei, Y. Urban functional zone classification based on POI data and machine learning. Sustainability 15, 4631 (2023).
Miao, R., Wang, Y. & Li, S. Analyzing urban spatial patterns and functional zones using sina Weibo POI data: A case study of Beijing. Sustainability 13, 647 (2021).
Kong, X. et al. CoPFun: An urban co-occurrence pattern mining scheme based on regional function discovery. World Wide Web 22, 1029–1054 (2019).
Sun, Y., Yin, H., Wen, J. & Sun, Z. Urban region function mining service based on social media text analysis. Int. J. Softw. Eng. Knowl. Eng. 31, 563–586 (2021).
Yang, X., Zheng, X.-Q. & Chen, R. A land use change model: Integrating landscape pattern indexes and Markov-CA. Ecol. Model. 283, 1–7 (2014).
Zheng, M., Wang, H., Shang, Y. & Zheng, X. Identification and prediction of mixed-use functional areas supported by POI data in Jinan City of China. Sci. Rep. 13, 2913 (2023).
Ding, Y., Xu, H. & Wang, C. Urban functional area identification based on OSM road network and POI data. Geo Gis 36, 57–63 (2020).
Sun, P., Hou, R. & Lynch, J. Measuring the utilization of public open spaces by deep learning: A benchmark study at the detroit riverfront. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 2228–2237 (2020).
Liu, Y. & Guo, X. A dynamic prediction framework for urban public space vitality: From hypothesis to algorithm and verification. Sustainability 16, 2846 (2024).
Polson, N. G. & Sokolov, V. O. Deep learning for short-term traffic flow prediction. Transp. Res. Part C Emerg. Technol. 79, 1–17 (2017).
Farooq, M. & Muslim, M. Dynamics and forecasting of population growth and urban expansion in Srinagar city–a geospatial approach. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 40, 709–716 (2014).
Yi, D., Yang, J., Liu, J., Liu, Y. & Zhang, J. Quantitative identification of urban functions with fishers’ exact test and POI data applied in classifying urban districts: A case study within the sixth ring road in Beijing. ISPRS Int. J. Geo-Inf. 8, 555 (2019).
Tu, W. et al. Portraying urban functional zones by coupling remote sensing imagery and human sensing data. Remote Sens. 10, 141 (2018).
Liu, J., Zhang, L. & Zhang, Q. The development simulation of urban green space system layout based on the land use scenario: A case study of Xuchang city, China. Sustainability 12, 326 (2019).
Jia, W. et al. Multi-scale spatiotemporal pattern analysis and simulation (MSPAS) model with driving factors for land cover change and sustainable development goals: A case study of Nepal. Remote Sens. 14, 6295 (2022).
Sahebgharani, A., Wiśniewski, S., Borowska-Stefańska, M., Kowalski, M. & Mokoei, K. Analyzing the effect of depopulation on the spatial structure of the city of Łódź, Poland: Development and application of an integrated land use and transportation model. Habitat Int. 143, 102992 (2024).
Hui, E. C., Li, X., Chen, T. & Lang, W. Deciphering the spatial structure of China’s megacity region: A new bay area—The Guangdong-Hong Kong-Macao Greater Bay Area in the making. Cities 105, 102168 (2020).
Wang, L., Dai, Y., Han, L. & Xu, Z. Optimizing urban resource efficiency: A scenario analysis of shared sports facilities in fostering sustainable communities in Nanjing, China. J. Clean. Prod. 468, 143082. https://doi.org/10.1016/j.jclepro.2024.143082 (2024).

Acknowledgements

This work was supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions and the National Natural Science Foundation of China (52378048).

Funding

National Natural Science Foundation of China: 52378048; Priority Academic Program Development of Jiangsu Higher Education Institutions.

Author information

Authors and Affiliations

College of Landscape Architecture, Nanjing Forestry University, Nanjing, 210037, China
Xinyu Hu, Wei Zhu, Ximing Shen, Ruxia Bai, Chen Li & Lili Zhao
School of Architecture, Southeast University, Nanjing, 210096, China
Yi Shi

Contributions

The initial idea and research plan were provided by X.H. and W.Z. Material preparation, data collection and analysis were performed by W.Z., X.S., R.B. and C.L. The first draft of the manuscript was written by W.Z., and all authors commented on previous versions of the manuscript. X.H. and Y.S. contributed more to the revision of the manuscript. X.H. and Y.S. provided project support. X.H., W.Z. and L.Z. contributed to the revision of the manuscript.

Corresponding author

Correspondence to Xinyu Hu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Hu, X., Zhu, W., Shen, X. et al. Exploring the predictive ability of the CA–Markov model for urban functional area in Nanjing old city. Sci Rep 14, 18453 (2024). https://doi.org/10.1038/s41598-024-69414-3

Received: 03 March 2024
Accepted: 05 August 2024
Published: 08 August 2024
Version of record: 08 August 2024
DOI: https://doi.org/10.1038/s41598-024-69414-3
