Introduction

Urbanization is reshaping urban forms at an unprecedented pace. However, the prevailing governance paradigm—centered on efficiency, form, and capital—increasingly reveals risks in practice, such as spatial homogenization and placelessness, which in turn weaken residents’ emotional attachment to the city and undermine public identity1. The concept of collective memory was first introduced in 1925 by French sociologist Maurice Halbwachs2. It is regarded as a group-oriented construct derived from individuals’ association and reconstruction of past experiences, characterized by collectivity, continuity, and constructiveness3. As a product of intra-group relationships, collective memory depends on spatial and locational anchoring for its formation and continuation. Simultaneously, space and place are capable of bearing asynchronous individual memories4, particularly in the context of the internet, which offers a platform for communication that transcends spatial and temporal constraints. In light of this, we advocate for an integrative approach that examines collective memory formed in urban space alongside the representation of urban space within collective memory. By employing the concept of urban collective memory, we aim to reinterpret the interaction between urban space and the public, and to explore its significance for both citizens and the city itself. “The power of place”, as Dolores Hayden aptly noted, is “the power of ordinary urban landscapes to nurture citizens’ public memory”5.

Halbwachs’ discussion on memory and place has garnered extensive attention in the fields of urban planning and architecture, but particularly within geography since the 1990s. Existing studies can be broadly categorized into three thematic areas. First, the theme of “collective memory and space/place” explores how urban space and form act as representations or containers of memory. These studies examine how collective identity and historical consciousness are constructed through specific buildings, spatial configurations, and events, and how they convey the symbols and meanings of collective memory6,7. Meanwhile, as modern technology and popular culture continuously replicate, disseminate, and layer urban representations, old historical scenes coexist with new commercial landscapes, resulting in a multi-layered and parallel accumulation of urban imagery4. Scholars have investigated the role of collective memory in reinforcing place identity through perspectives such as national and identity recognition8, urban memory9, urban regeneration10, and public space11. Second, the theme of “collective memory and landscape” builds upon Guy Debord’s notion in The Society of the Spectacle, which posits that modern social space is shaped by a symbolic system of “spectacle”. The “spectacle” refers to a mediated reality in which social relations are increasingly experienced through images and representations rather than through direct experience. This symbolic landscape reflects the structure of power and the formalization of memory in society12. Researchers have examined the sociological significance of landscapes, ranging from historical and cultural landscapes13(e.g., memorial, war, and nostalgic landscapes), to landscapes of terror14, and everyday landscapes15. Third, studies on “collective memory and ritual/tourism” analyze how collective memory is constructed through performative practices, including rituals, parades, and tourism activities16. 
These strands of research primarily focus on the relationship between collective memory and place identity. Compared to these broader explorations, research on urban collective memory provides a more focused inquiry into the capacity and mechanisms through which urban space bears collective memory. First, the components of urban collective memory are diverse and distinctive. Sabaté and Tironi emphasize the importance of urban memory elements, advocating for their role as sustainable cultural resources that merit protection17. Second, urban collective memory originates from place and acts as a cultural symbol of locality, embodying unique historical and cultural attributes18. These attributes are fundamentally different from those of modern public spaces: contemporary society cannot truly reconstruct the past, but can only evoke it within public space through commemorative acts2,3. Third, urban collective memory evolves with the development of local culture. Regional culture emerges from the combined influences of politics, economy, and society. Only by understanding these local dynamics can we grasp the temporality and locality of urban collective memory19. Across different historical periods, transformations in urban historical culture are often expressed through contemporary urban landscapes, which serve as symbolic markers of urban collective memory. These landscapes can effectively reflect changes in collective memory over time20. Therefore, adopting urban collective memory as an analytical perspective not only reveals residents’ emotional attachment to space but, more importantly, enables the construction of a measurable and operable quantitative link between the built environment, social identity, and historical continuity. Specifically, collective memory serves as an intermediary between material carriers and social representations. 
It can be translated into not only observable spatial symbols, such as paths, nodes, and landmarks, but also computable semantic parameters, including frequency, narrative themes, and contextual associations. These variables can be used to assess how urban spaces bear public identity, evoke historical consciousness, and promote cultural continuity. In this way, urban design and planning can move beyond visible configurations of form and function to integrate the often-overlooked “invisible values” of collective memory, achieving a holistic optimization from physical infrastructure to cultural resilience.

Urban collective memory research related to spatial mapping and visualization can be traced back to the method of urban imageability mapping. In his seminal work, The Image of the City, Kevin Lynch proposed that people form mental representations, or “images”, of their urban environments, consisting of five fundamental spatial elements: paths, nodes, landmarks, districts, and edges21. Respondents were asked to draw hand-sketched maps of specific urban areas based on their memories using these five elements21. These sketching exercises were often accompanied by verbal interviews22, questionnaires23, and cognitive tasks24. However, such studies are time-consuming and generally limited in sample size. This method addressed the question of how individuals externalize their spatial knowledge of familiar environments, providing a foundation for the framework of spatial cognitive structure. Through conducting field studies in cities such as Boston, Lynch later discovered that these mental maps were not solely individual. Due to overlapping experiences among many people, shared urban images could emerge at a population level21. These public images transcend individual cognition and essentially suggest an underlying framework of collective memory. Halbwachs emphasized that groups “inscribe” their experiences into everyday environments, noting that places are containers of memory, and space offers a material anchor for memory3. Thus, once a Lynchian urban image is widely and consistently shared among individuals, it becomes a place memory—a spatial manifestation of social memory.

With the widespread adoption of the internet, social media platforms such as Twitter, Weibo, Dianping, Instagram, and Facebook have provided individuals with communication channels that transcend spatial and temporal constraints25,26,27. These platforms allow users to share space-related events, emotions, and narratives, which become digitally traceable place memories. Hoskins introduced the concept of “digital networked memory”, arguing that the hyper-connectivity of social media enables personal recollections to converge in real time into an online, dynamic archive of collective memory28. These geo-tagged texts, photographs, and videos represent the digital inscription of contemporary urban memory. Goodchild’s concept of citizen sensors further suggests that geo-tagged data generated via social media constitutes spatial information collected by the public, offering a means to quantify and systematize urban spatial perception. The spatial memories captured through social media are not confined to individual cognition; they are shared, transmitted, and amplified at the group level, thereby contributing to the formation of a broader and more aggregated urban collective memory29.

In recent years, the development of emerging technologies, such as artificial intelligence (AI), machine learning (ML), and natural language processing (NLP), has provided efficient tools for processing large-scale data and diverse information. For example, Nikhil Naik and Jade Philipoom30 trained computer vision algorithms to predict street safety perception rankings. Other scholars have employed manually labeled data to train deep convolutional neural networks (DCNNs) to explore the relationship between visual features of the built environment and human spatial perception31,32. Demircan Tas and Rohit Priyadarshi Sanatani33 proposed a location-based aspect-based sentiment analysis (ABSA) method to analyze crowdsourced evaluations of urban environments, significantly improving the accuracy of theme word extraction and aspect sentiment classification. These tools have shown considerable advantages in studies that reveal how aggregated individual data from large populations inform collective public perceptions of urban spaces. With the help of NLP techniques, it becomes possible to extract rich semantic information related to urban space from vast amounts of social media text data, such as tweets, comments, and posts. This data can be used to identify keywords, emotional tones, emotional intensity, and spatial types, thereby facilitating the creation of more detailed “memory maps”.

However, existing research largely focuses on exploring the unidirectional relationship between the public and space, that is, the aggregation of individual perceptions, without delving deeply into the collective identity that arises within a group due to shared urban spaces. This study proposes a new framework based on the spatial cognitive structure of urban imagery. By utilizing “digital networked memory” data from social media platforms and employing NLP and geographic information system (GIS) technologies, we aggregate individual urban spatial imagery information and conduct classification, quantification (element identification), emotional analysis (emotion intensity), and temporal analysis (memory sedimentation). Through the classification, aggregation, and mapping of individual urban memory digital traces, we construct a dynamic, emotionally layered “urban collective memory map”. This method not only meets the requirements of collective memory in terms of shareability, symbolism, and historical context but also overcomes the limitations of traditional oral history and survey methods, which are constrained by small sample sizes and slow updates. It provides a more sensitive and innovative spatial tool for urban space renewal, cultural heritage preservation, and the study of local identity.

Methods

Study area

We selected Nanjing, the capital city of Jiangsu Province, China, as the subject of our analysis. Nanjing has a rich historical and cultural heritage, having been the capital of six ancient Chinese dynasties. It is one of China’s first officially designated historical and cultural cities, boasting one World Heritage Site, 55 national key cultural heritage sites, 114 provincial-level cultural heritage sites, and 347 municipal-level cultural heritage sites34. The old town of Nanjing retains a relatively complete traditional landscape, an integrated street and alley layout, residential architectural style, and some traditional lifestyles, offering a rich urban historical and cultural landscape. In 2010, the Nanjing Municipal Government approved the “Nanjing Historical and Cultural City Protection Regulation”35.

Nanjing covers an area of 6587.04 square kilometers36 and consists of 11 districts: Xuanwu, Qinhuai, Jianye, Gulou, Qixia, Yuhuatai, Jiangning, Pukou, Liuhe, Lishui, and Gaochun37 (Fig. 1). As a modern metropolis blending history with vitality, Nanjing had 65.4 million mobile internet users by 2022, accounting for 99.6% of the population, as reported in the 2022 Jiangsu Province Internet Development Report. Social media is an active component of daily life for Nanjing residents, with Weibo usage accounting for 38.5%38.

Fig. 1: Map of Nanjing’s administrative districts and divisions. Source: author. a Nanjing municipal area. b Jiangsu provincial area.

Data sources

Currently, Weibo is the most widely used open social media platform in China, with a large user base and diverse data types, including geographic check-in functionality. The data for this study was collected from the Sina Weibo client using a web scraping tool, which obtained a total of 72,653 geo-tagged check-in records within the urban area of Nanjing from January 1, 2018, to December 31, 2024. This dataset includes user ID data, check-in location names, types, coordinates (latitude and longitude), and comment data. Due to the large volume of check-in data, which also contained numerous invalid entries, it was essential to filter the collected raw text data to ensure its reliability. Private commercial places such as “apartments”, “food”, “restaurants”, “stadiums”, “research institutions”, “bookstores”, and “hotels” were excluded from further analysis, as they are not directly related to the urban environment. However, locations that provide public services and infrastructure, such as “roads”, “landmarks”, “places”, “scenic spots”, “campuses”, “libraries”, “shopping streets”, and “cultural centers”, were retained, as these are relevant to the urban environment. Additionally, advertisements and unrelated data were removed based on the textual content. Ultimately, 9921 valid check-in records were retained. The spatial distribution of Nanjing based on check-in data is shown in Fig. 2.
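
The screening step described above can be illustrated with a minimal sketch. The retained place types are taken from the text; the record fields (`category`, `text`) and the advertisement keywords are hypothetical and stand in for whatever schema and rules the scraping tool actually produced.

```python
# Place types retained in the text. Every other type (e.g., "hotels",
# "restaurants", "apartments") is dropped by the whitelist check below.
RETAINED = {"roads", "landmarks", "places", "scenic spots", "campuses",
            "libraries", "shopping streets", "cultural centers"}

# Hypothetical advertisement markers; the actual rules are not given in the text.
AD_KEYWORDS = ("discount", "promotion", "buy now")


def is_valid_checkin(record):
    """Keep a record only if its place type is retained and its text is not an ad."""
    category = record["category"].strip().lower()
    if category not in RETAINED:
        return False
    text = record["text"].strip().lower()
    if not text:
        return False
    return not any(keyword in text for keyword in AD_KEYWORDS)


def filter_checkins(records):
    """Return the subset of check-in records that passes the screening."""
    return [r for r in records if is_valid_checkin(r)]


sample = [
    {"category": "scenic spots", "text": "Sunset over Xuanwu Lake"},
    {"category": "hotels", "text": "Nice room"},
    {"category": "roads", "text": "Buy now, big discount!"},
]
valid = filter_checkins(sample)
```

A whitelist is used here so that any place type not explicitly listed as relevant to the urban environment is excluded by default.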

Fig. 2: Spatial distribution of Nanjing’s check-in data. Source: author.

The urban planning and administrative boundary data were sourced from the Nanjing Urban Land Plan (2021-2035)39 and the National Geographic Information Public Service Platform (Tianditu) (https://cloudcenter.tianditu.gov.cn). This data includes planning and construction information on urban boundary lines, urban development, ecological protection, cultural heritage protection, and key spatial nodes, which is used for correspondence and analysis in the collective memory map.

The road network data was obtained from Amap (https://lbs.amap.com/) via web scraping. As China’s leading navigation and mapping service provider, Amap offers highly accurate road network data for China, with advantages in real-time updating, precision, and detail. Based on Nanjing’s road construction standards and actual road conditions, we retained expressways, fast roads, main roads, secondary roads, national highways, and county roads, and excluded other road types to ensure data quality. The data were processed to remove duplicate lines, floating lines, and internal dense lines. The centerlines of the roads were extracted, and a 55 m wide buffer zone was established as the foundational data for “path” map analysis and extraction.

Research idea

This study is based on the spatial cognitive structure framework of the five basic spatial elements of urban imagery (paths, nodes, landmarks, districts, and edges). It analyzes and extracts urban collective memory information from Weibo platform check-in data and maps it onto a spatial map (Fig. 3). By applying natural language processing to the semantic information of Weibo platform check-in data, and combining spatial analysis methods, the study identifies and quantifies the collective memory categories and intensities of these spatial elements. This approach explores their spatial distribution characteristics within urban collective memory and the regional differentiation features. Furthermore, by comparing the findings with urban planning data as a baseline, the study analyzes the relationship between urban collective memory and urban planning, revealing the degree of alignment between collective memory and the actual spatial layout of the city.

Fig. 3: Methodological framework for urban collective memory mapping.

Semantic information recognition based on the BERT model

In the Weibo platform check-in data, the semantic information within the text data contains rich spatial function information and emotional signals. This study adopts BERT (Bidirectional Encoder Representations from Transformers), a deep learning NLP pre-training model proposed by Google AI Research in October 2018. BERT has two versions, BASE and LARGE. This study utilizes the BASE version40. The BERT model performs exceptionally well in natural language processing tasks, particularly in sentiment analysis and semantic classification. It requires minimal adjustment to the output layer, without the need for large-scale task-specific modifications to the model architecture. This allows for accurate interpretation of semantic information in text data through its bidirectional context-aware capability. The study primarily focuses on analyzing the semantic information to identify the spatial elements of nodes and landmarks, as well as the emotional information embedded in the check-in points.

To identify the spatial elements of nodes and landmarks, the research team constructed a dataset of 1000 labeled samples and used it to fine-tune the BERT model. The final performance of the model (accuracy of 0.9377 and F1 score of 0.9478) indicates that the fine-tuned model identifies node and landmark information in the text data accurately. Using the fine-tuned model, we extracted potential urban nodes and landmarks from the 9921 Weibo check-in records, providing a data foundation for subsequent collective memory analysis.

To analyze the sentiment information in the text data from the check-in data, this study fine-tuned the BERT model using the weibo_senti_100k.csv dataset. This dataset contains over 100,000 sentiment-labeled Weibo posts, with ~50,000 positive and 50,000 negative comments, providing a strong foundation for sentiment analysis training. The model’s final performance (with an accuracy of 0.98279 and F1 score of 0.98278) demonstrates its ability to effectively analyze sentiment. Using this fine-tuned BERT model, we identified the sentiment intensity expressed in the text of each check-in point, providing an essential emotional dimension for analyzing the intensity of collective memory.

Urban collective memory intensity evaluation

To quantify the collective memory intensity of each node and landmark, this study integrates four factors as indicators for memory strength analysis: frequency, time span, emotional intensity, and influence level. Frequency reflects the activity level at each check-in point, directly correlating with the spread of collective memory. By calculating the frequency of check-ins at each point, the study identifies key locations that receive frequent attention and interaction on the social media platform. Time span measures the duration of collective memory associated with specific spatial elements, indicating the continuity of memory over time. By analyzing the earliest and latest timestamps of the check-in data, the study determines how long collective memory lasts for each spatial element. Emotional intensity is an important indicator of collective memory, reflecting users’ emotional inclination and emotional investment towards specific spatial elements. Sentiment analysis is employed to quantify the emotional intensity of each check-in point, providing an emotional dimension to memory strength. Lastly, influence level measures the social impact and broad dissemination of the check-in data by counting the number of shares, comments, and likes at each check-in point. This metric reflects the level of engagement and societal influence generated by social media activity. By integrating these four factors, the study effectively quantifies the collective memory intensity for each spatial element, providing a comprehensive analysis of how collective memory manifests in the city and evolves.
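
As an illustration of how the four raw indicators could be derived per place, the following sketch groups check-in records and computes each factor. The record fields are hypothetical, and two choices are assumptions rather than details given in the text: emotional intensity is taken as the mean sentiment score, and time span is measured in days.

```python
from collections import defaultdict
from datetime import date


def build_indicators(records):
    """Aggregate check-in records into four raw indicators per place."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["place"]].append(rec)

    indicators = {}
    for place, recs in grouped.items():
        days = [r["date"] for r in recs]
        indicators[place] = {
            # number of check-ins at this place
            "frequency": len(recs),
            # days between the earliest and latest check-in
            "time_span_days": (max(days) - min(days)).days,
            # mean sentiment score (assumed aggregation rule)
            "emotional_intensity": sum(r["sentiment"] for r in recs) / len(recs),
            # total shares, comments, and likes
            "influence": sum(r["shares"] + r["comments"] + r["likes"] for r in recs),
        }
    return indicators


records = [
    {"place": "Xuanwu Lake", "date": date(2018, 5, 1), "sentiment": 0.9,
     "shares": 2, "comments": 1, "likes": 10},
    {"place": "Xuanwu Lake", "date": date(2024, 5, 1), "sentiment": 0.7,
     "shares": 0, "comments": 3, "likes": 4},
    {"place": "Zhonghua Gate", "date": date(2020, 1, 1), "sentiment": 0.6,
     "shares": 1, "comments": 0, "likes": 2},
]
indicators = build_indicators(records)
```

The sample values are invented for demonstration and do not come from the study's dataset.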

To aggregate these factors, the Entropy Weight Method is employed. This method determines the weight of each factor by measuring the information entropy of the indicators, allowing the comprehensive score of memory strength to be established. First, the data for the frequency, time span, emotional intensity, and influence level indicators are standardized so that all indicators are dimensionless and comparable. Then, the information entropy is calculated to measure the degree of dispersion of each indicator. A smaller entropy value indicates that the indicator provides more information. Based on the entropy values, the weight of each indicator is calculated, with indicators having smaller entropy values being assigned higher weights, as expressed in Eq.(1):

$${w}_{j}=\frac{1-{H}_{j}}{\mathop{\sum }\nolimits_{k=1}^{n}\left(1-{H}_{k}\right)}$$
(1)

In this formula, \({w}_{j}\) represents the weight of the \(j\)-th indicator, \({H}_{j}\) is the entropy value of the \(j\)-th indicator, and \(n\) is the total number of indicators.

Finally, the comprehensive score is obtained by performing a weighted sum of the standardized values of each sample for each indicator, multiplied by their respective weights, as shown in Eq.(2):

$${S}_{i}=\mathop{\sum }\limits_{j=1}^{n}{w}_{j}{x}_{{ij}}^{{\prime} }$$
(2)

In the formula, \({S}_{i}\) represents the comprehensive score of the \(i\)-th sample, and \({x}_{{ij}}^{{\prime} }\) denotes the normalized value of the \(j\)-th indicator for the \(i\)-th sample.
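
The weighting and scoring steps in Eqs. (1) and (2) can be sketched as follows. This is a minimal illustration, not the study's implementation: the text does not specify the standardization, so min-max scaling is assumed, and the sample rows are invented. Each row holds the four indicators in the order frequency, time span, emotional intensity, influence level.

```python
import math


def min_max(column):
    """Scale one indicator column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def entropy_weights(normalized_columns):
    """Return one weight per indicator, following Eq. (1)."""
    m = len(normalized_columns[0])  # number of samples; must be at least 2
    redundancy = []                 # holds 1 - H_j for each indicator
    for col in normalized_columns:
        total = sum(col)
        if total == 0:
            redundancy.append(0.0)  # a constant indicator carries no information
            continue
        probs = [v / total for v in col]
        # zero proportions contribute nothing to the entropy sum
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        redundancy.append(1.0 - entropy)
    denom = sum(redundancy)
    if denom == 0:
        return [1.0 / len(redundancy)] * len(redundancy)  # fall back to equal weights
    return [d / denom for d in redundancy]


def memory_scores(rows):
    """Return (weights, scores), with scores computed as in Eq. (2)."""
    columns = [list(c) for c in zip(*rows)]
    normalized = [min_max(c) for c in columns]
    weights = entropy_weights(normalized)
    scores = [sum(w * col[i] for w, col in zip(weights, normalized))
              for i in range(len(rows))]
    return weights, scores


# Invented sample: [frequency, time span (days), emotional intensity, influence]
rows = [[120, 2100, 0.82, 540],
        [45, 900, 0.65, 120],
        [10, 300, 0.91, 30]]
weights, scores = memory_scores(rows)
```

Because the weights sum to one and each standardized value lies in [0, 1], every score also falls in [0, 1], which makes places directly comparable.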

Cluster analysis

To evaluate the spatial continuity and discontinuity patterns of the Weibo check-in data, this study employs clustering analysis, specifically the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm. DBSCAN is a density-based clustering method that partitions districts based on the density of spatial data and effectively identifies noise points (outliers). It does not require the number of clusters to be specified in advance and can detect clusters of arbitrary shape, which makes it suitable for identifying meaningful “districts” and “edges” in spatial analysis of social media data.

This study utilizes the DBSCAN analysis tool in ArcGIS Pro to identify the spatial clustering characteristics of Weibo check-in data. By using GPS location information from social media check-ins, the spatial distribution of social media activities is classified. The experiment revealed that adjusting the parameters of the minimum number of points and search radius in DBSCAN leads to different spatial granularity classification patterns. After experimentation, this study selected a 1200 m distance radius and 20 points as the minimum clustering size for spatial analysis in the Nanjing metropolitan area. For the analysis of the main urban area of Nanjing, a 500 m distance radius and 5 points were chosen as the minimum clustering size. The clustering results effectively reveal the aggregation and distribution patterns of social media activities in urban space, delineating core and edge areas of social media activity.
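
To make the role of the two parameters concrete, the sketch below implements the basic DBSCAN procedure in plain Python and applies the main-urban-area setting (500 m radius, minimum of 5 points). It is an illustrative stand-in, not the ArcGIS Pro tool used in the study; it assumes coordinates already projected to metres, counts a point as its own neighbour, and uses invented sample points.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be re-labelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Invented check-in coordinates in metres: two dense groups and one outlier.
points = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 50), (200, 50),
          (5000, 5000), (5100, 5000), (5000, 5100), (5100, 5100), (5050, 5050),
          (20000, 20000)]
labels = dbscan(points, eps=500, min_pts=5)
```

A larger radius and minimum point count, such as the 1200 m and 20 points used for the metropolitan area, merge nearby groups into coarser districts and classify more sparse points as noise.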

Kernel density analysis

To further assess the spatial continuity and discontinuity patterns of Weibo check-in data, this study also employs kernel density analysis. Kernel density analysis is a spatial analysis method designed to evaluate the density distribution of point events within a geographic area. This method applies a kernel function weight to each raster cell and calculates the density of surrounding point data, revealing the concentration districts and edges of social media activities. Kernel density analysis helps identify the distribution characteristics of collective memory in urban space, especially in areas with frequent social media activity. In this study, we used a Gaussian kernel function for kernel density calculations. The form of the kernel function is expressed in Eq.(3):

$$f\left(x\right)=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}{K}_{h}\left(x-{x}_{i}\right)$$
(3)

In the formula:

\(f(x)\) represents the density value at raster cell \(x\), indicating the weighted density of social media check-in points at that location.

\(n\) denotes the total number of points in the dataset (i.e., the total number of check-in points).

\(h\) is the bandwidth parameter (also referred to as the smoothing parameter), which controls the influence range of the kernel function and affects the smoothness of the resulting density surface.

\({K}_{h}\left(x-{x}_{i}\right)\) is the Gaussian kernel function used to weight the contribution of each check-in point.

The Gaussian kernel function is calculated in Eq.(4):

$${K}_{h}\left(x-{x}_{i}\right)=\frac{1}{\sqrt{2\pi {h}^{2}}}\exp \left(-\frac{{\left(x-{x}_{i}\right)}^{2}}{2{h}^{2}}\right)$$
(4)

In the formula, \(x\) and \({x}_{i}\) represent the spatial locations of the raster cell and of the \(i\)-th social media check-in point, respectively. The bandwidth parameter \(h\) controls the spatial influence range of the kernel function. Through this kernel function, check-in points closer to the raster cell \(x\) are assigned higher weights, while those farther away receive lower weights.
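
The kernel density calculation can be sketched in a few lines of Python. The sketch evaluates the Gaussian kernel of Eq. (4) on the Euclidean distance between a raster cell and each check-in point and averages the kernel values over the \(n\) points; since \({K}_{h}\) in Eq. (4) already carries the \(1/h\) normalization, no further division by \(h\) is applied. The bandwidth, grid extent, cell size, and sample points are hypothetical, as the text does not report them.

```python
import math


def gaussian_kernel(distance, h):
    """Eq. (4): Gaussian weight for a point at `distance` from the raster cell."""
    return math.exp(-distance ** 2 / (2 * h ** 2)) / math.sqrt(2 * math.pi * h ** 2)


def kernel_density(cell, points, h):
    """Average kernel weight of all check-in points as seen from one raster cell."""
    cx, cy = cell
    total = sum(gaussian_kernel(math.hypot(cx - px, cy - py), h)
                for px, py in points)
    return total / len(points)


def density_surface(points, h, x_range, y_range, cell_size):
    """Evaluate the density at the centre of every cell of a regular grid."""
    surface = []
    y = y_range[0] + cell_size / 2
    while y < y_range[1]:
        row = []
        x = x_range[0] + cell_size / 2
        while x < x_range[1]:
            row.append(kernel_density((x, y), points, h))
            x += cell_size
        surface.append(row)
        y += cell_size
    return surface


# Invented check-in coordinates in metres: three close points and one far away.
points = [(100, 100), (120, 110), (110, 90), (900, 900)]
surface = density_surface(points, h=100, x_range=(0, 1000),
                          y_range=(0, 1000), cell_size=500)
```

In this toy grid, the cell nearest the three clustered points receives a higher density than the cell nearest the single distant point, which is the behaviour used to separate concentration districts from edges.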

Regional collective memory spatial differentiation analysis

Based on the urban community and township boundaries of Nanjing, spatial analysis methods are employed to systematically summarize and quantitatively analyze the spatial distribution of collective memory within the region. The analysis of collective memory involves aggregating memory data from different areas of Nanjing, aiming to reveal the spatial differentiation characteristics of collective memory across various regions and communities, and to quantify regional differences. Specifically, the first step involves aggregating the memory data within the boundaries of Nanjing’s communities and townships, covering multiple dimensions such as social media check-in frequency, emotional intensity, and impact level. Each region’s collective memory is then evaluated based on these indicators. Subsequently, using these evaluation results, the collective memory spatial differentiation index for each region is calculated, and these data are standardized according to regional area to generate a memory distribution map for each region. Through the analysis of the spatial differentiation of collective memory in Nanjing, significant differences in collective memory across different regions are revealed.
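
The aggregation and area standardization can be illustrated with a short sketch. The exact form of the differentiation index is not given in the text, so the sketch assumes a simple one: the sum of memory scores per region divided by the region's area. It also assumes each check-in has already been assigned to a community or township by a spatial join; region identifiers, scores, and areas are invented.

```python
from collections import defaultdict


def memory_density_by_region(checkins, region_area_km2):
    """Sum memory scores per region and divide by the region's area in km2."""
    totals = defaultdict(float)
    for region_id, score in checkins:
        totals[region_id] += score
    # Regions without any check-in get a density of 0.0.
    return {region: totals[region] / area
            for region, area in region_area_km2.items()}


# Invented sample: (region id, memory score of one check-in)
checkins = [("A", 0.8), ("A", 0.6), ("B", 0.9), ("A", 0.1)]
areas = {"A": 3.0, "B": 0.5, "C": 2.0}
density = memory_density_by_region(checkins, areas)
```

Dividing by area keeps large peripheral townships from appearing more memory-rich than compact central communities simply because they cover more ground.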

Results

Urban collective memory map

Based on Lynch’s theory of urban imagery and its five essential elements, the collective memory components reflected in the social media data are classified, quantified, and visualized. Figure 4 presents the collective memory map of Nanjing, drawn from the five elements derived from the analysis of social media data. This map illustrates the distribution patterns of different spatial types within the collective memory system of the city.

Fig. 4: Urban collective memory map. Source: author. a Map of collective memory in the Nanjing metropolitan area. b Enlarged map of collective memory in the Nanjing main urban area.

Urban collective memory path map

Based on Kevin Lynch’s definition, we interpret “paths” as “channels of movement for observers, including roads with distinct characteristics and prominent living attributes”. Figure 5 shows the primary paths in Nanjing identified through social media analysis, color-coded by linear density rank. The “paths” extracted from social media data differ significantly from the hierarchical road classes of the official road network. Path density is regionally patterned, with high-density paths concentrated in block-like clusters. This spatial feature reflects the coupling between urban functional layout and traffic network structure. The paths with the highest linear density of collective memory are concentrated in the central urban area of Nanjing, with Xinjiekou as the core node, and cover major roads such as Zhongshan East Road, Xuanwu Avenue, Zhonghua Road, Central Road, and Zhongshan South Road. Located in the old city, these roads connect the city’s core commercial district and feature high population flow and a dense distribution of iconic buildings. As a result, they carry considerable historical and cultural value. By contrast, some roads in areas such as Jiangbei Center, South City Center, High-tech Sub-center, Lishui Sub-center, and Gaochun Sub-center also exhibit relatively high collective memory density, but their formation mechanisms differ. These roads are more closely linked to Nanjing’s modern urban construction process, in which the strengthening of commercial functions has shaped a new urban spatial image. Since their surrounding areas are mainly residential districts characterized by frequent daily activity, their living attributes are more prominent.

Fig. 5: Urban collective memory path map. Source: author. a Map of collective memory paths in the Nanjing metropolitan area. b Enlarged map of collective memory paths in the Nanjing main urban area.

From the spatial distribution of path collective memory, the memory elements of paths and their surrounding environment are influenced by multiple factors. One key factor is the degree of functional complexity of the path, which refers to the path’s ability to link multiple functions such as transportation, commerce, culture, and tourism. The higher the functional complexity, the more frequent and extensive the movement along the road, leading to deeper and more significant memory points associated with the road. This, in turn, enhances the road’s importance within the cognitive system of urban residents. Therefore, we can conclude that the spatial attributes, functional attributes, and socio-cultural value of paths combine to determine their position and characteristics within the collective memory system.

Urban collective memory node map

Based on Kevin Lynch’s definition, we interpret “nodes” as “conceptual anchor points, places in the urban space that hold significant social or functional meaning, often transportation hubs, concentrations of functions, or cultural heritage sites”. By using the fine-tuned BERT model to classify the textual content of geotagged social media data, and selecting the top 20% of node data based on memory strength assessment scores, 255 key nodes in the collective memory of city residents were identified, as shown in Supplementary Table 1. These key nodes are displayed in Fig. 6.

Fig. 6: Urban collective memory node map. Source: author. a Map of collective memory nodes in the Nanjing metropolitan area. b Enlarged map of collective memory nodes in the Nanjing main urban area.

The concentration of nodes is higher in the central urban area, while the distribution of nodes in other areas is relatively scattered. By analyzing the names and spatial layout of the nodes, and comparing them with the baseline data from the “Nanjing Territorial Spatial Plan (2021–2035)”, significant overlap can be observed between the historical and cultural protection planning map, ecosystem protection map, comprehensive transportation planning map, and innovation space structure map, and the node distribution map. Based on the comparative analysis, the collective memory nodes in Nanjing were divided into four major types: historical and cultural landscape nodes, natural landscape nodes, university education nodes, and public transportation facility nodes. Historical and cultural landscape nodes include landmarks such as the Confucius Temple, Presidential Palace, and Ming Xiaoling Mausoleum. These nodes carry rich historical and cultural information and preserve historical memories. Natural landscape nodes, such as Xuanwu Lake, Purple Mountain, and Wuxiang Mountain, provide recreational and leisure functions, enriching the public life of the city. University education nodes, including campuses such as Southeast University’s Jiulonghu Campus, Nanjing Forestry University’s Xinzhuang Campus, and Nanjing University’s Xianlin Campus, have a unique cultural atmosphere and accumulate rich educational and cultural memories, serving as an important part of the city’s culture. Public transportation facility nodes, including metro stations, bus terminals, railway stations, and airports located near major landmarks, are important symbols of urban modernization. These transportation hubs gather large crowds and enhance people’s perception of the city, becoming integral parts of the city’s collective memory.

In the central urban area, the four types of nodes show a multi-center aggregation pattern, reflecting the diversity of urban functions and the complexity of the spatial structure. Historical and cultural landscape nodes are mainly concentrated in the core of the old city, showcasing the city’s long-standing cultural heritage. Natural landscape nodes are distributed along the city’s ring road, forming green ecological corridors. University campuses are typically located on the outskirts of the city, where they both serve educational functions and enrich local culture. Major transportation hubs sit along important transit corridors such as metro and high-speed rail lines, supporting the city’s mobility. This multi-center, multi-functional spatial pattern creates an intertwined and complementary relationship among the memory elements of different node types. Together, these nodes strengthen the collective memory of both residents and visitors and highlight the city’s rich cultural and functional landscape.

Urban collective memory landmark map

Based on Kevin Lynch’s definition, we interpret “landmarks” as “external reference points that observers cannot enter, serving as important points of reference when perceiving a city”. These landmarks include iconic sculptures, monuments, structures, and buildings. Using the fine-tuned BERT model, text classification was performed on social media data with geographical check-in coordinates. The top 20% of landmarks based on memory strength evaluation scores, totaling 84 landmarks (see Supplementary Table 2), were selected as the key landmarks in the collective memory of city residents, as shown in Fig. 7.
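
The selection of the top 20% of landmarks by memory strength score can be illustrated with a minimal sketch. It assumes that each candidate landmark already has a single numeric score and that the cut-off is rounded up to a whole landmark; the function name `top_share`, the landmark names, and the scores are hypothetical and are not values from this study.

```python
def top_share(scores, percent=20):
    """Return (name, score) pairs for the highest-scoring `percent` of entries."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Rank candidates from the highest to the lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Integer ceiling division: round the cut-off up (an assumption of this sketch).
    keep = (len(ranked) * percent + 99) // 100
    return ranked[:keep]


# Hypothetical memory strength scores, for illustration only.
example_scores = {
    "landmark_a": 0.91, "landmark_b": 0.40, "landmark_c": 0.77,
    "landmark_d": 0.12, "landmark_e": 0.65, "landmark_f": 0.33,
    "landmark_g": 0.58, "landmark_h": 0.05, "landmark_i": 0.49,
    "landmark_j": 0.26,
}

key_landmarks = top_share(example_scores)
print(key_landmarks)
```

With ten hypothetical candidates, the sketch keeps the two with the highest scores.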

Fig. 7: Urban collective memory landmark map.

Source: author. a Map of collective memory landmarks in the Nanjing metropolitan area. b Enlarged map of collective memory landmarks in the Nanjing main urban area.

Landmarks are more densely concentrated in the main urban area, while their distribution in other regions is relatively sparse. By analyzing the names and spatial distribution of landmarks and comparing them with the baseline from the “Nanjing Territorial Spatial Plan (2021–2035)”, it was found that the historical and cultural protection planning maps for the entire city and central urban area overlap significantly with the landmark map. The data on peaks, historical buildings, and nationally, provincially, and municipally protected cultural heritage sites were summarized and overlaid. From this comparative analysis, Nanjing’s collective memory landmarks can be categorized into two main types: natural landmarks and historical/cultural landmarks. Natural landmarks are primarily represented by mountains in the city’s topography, such as the peaks of Purple Mountain and Mufu Mountain. These mountains, serving as prominent visual markers and guides in the city, create significant visual anchor points in the urban space. Due to their grand scale and long-term stability, these mountains become key references for spatial orientation for both residents and visitors. Purple Mountain, as an important carrier of both natural and cultural heritage, further strengthens its position in the city’s collective memory by blending scenic significance with rich humanistic spirit. Historical and cultural landmarks mainly include historic buildings and various levels of cultural heritage protection units. Representative historical buildings include ancient temples such as Qixia Temple, Jinghai Temple, Pilu Temple, Yulian Temple, Qingliang Temple, Chongqing Temple, and Doushuai Temple, as well as city gate ruins from Nanjing’s ancient city walls, such as Zhonghua Gate, Guanghua Gate, Jiefang Gate, Jiqing Gate, Qingliang Gate, Changgan Gate, and Xuanwu Gate. 
These historical landmarks often have a long history and rich cultural significance, bearing the crucial historical memories of Nanjing as a capital during the Six Dynasties and an important town during the Ming and Qing Dynasties. They are continuously reinforced in the collective consciousness of the residents through daily life, festive ceremonies, and tourism experiences.

In terms of spatial distribution, these landmarks exhibit a generally dispersed layout, with relatively large distances between individual landmarks. On the one hand, this dispersion reflects the accumulation and preservation of historical heritage throughout Nanjing’s urban development, particularly within the main urban area, where historical and cultural resources are dense and abundant. On the other hand, it reveals the city’s spatial diversity and layered structure: natural topography and historical-cultural heritage intertwine spatially, forming a complex and multi-dimensional urban memory landscape. Nanjing’s system of collective memory landmarks not only highlights the city’s deep historical and cultural foundations but also embodies the dynamism brought by urban spatial renewal and functional transformation. Together, the natural and historical-cultural landmarks form a diverse and richly textured tapestry of urban memory.

Urban collective memory district map

Based on Kevin Lynch’s definition, we interpret “districts” as “parts of the city that observers mentally enter, characterized by shared and recognizable features”. In this study, kernel density analysis was conducted in ArcGIS Pro on social media check-in data to identify districts with high collective memory intensity through spatial statistical methods. The magnitude of kernel density values reflects the frequency of social activity in different locations and the concentration of collective memory, thereby revealing the spatial distribution characteristics of urban collective memory. The results of the collective memory districts are shown in Fig. 8.
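
To make the kernel density step concrete, the following minimal sketch computes a density surface from check-in points. It is an illustration under stated assumptions rather than the workflow used in this study: the analysis here was run in ArcGIS Pro, whereas the sketch uses plain Python with a quartic kernel, a common choice in GIS software. The bandwidth, cell size, coordinates (assumed to be projected, in metres), and function names are all hypothetical.

```python
import math


def quartic_kernel_density(points, location, bandwidth):
    """Kernel density at `location`, in points per square map unit."""
    cx, cy = location
    total = 0.0
    for px, py in points:
        # Squared distance scaled by the bandwidth (search radius).
        u2 = ((px - cx) ** 2 + (py - cy) ** 2) / bandwidth ** 2
        if u2 < 1.0:
            # 2D quartic kernel; it integrates to 1 over the unit disk.
            total += (3.0 / math.pi) * (1.0 - u2) ** 2
    return total / bandwidth ** 2


def density_grid(points, x_min, y_min, cell_size, n_cols, n_rows, bandwidth):
    """Evaluate the density at the centre of every cell of a regular grid."""
    grid = []
    for row in range(n_rows):
        y = y_min + (row + 0.5) * cell_size
        grid.append([
            quartic_kernel_density(
                points, (x_min + (col + 0.5) * cell_size, y), bandwidth)
            for col in range(n_cols)
        ])
    return grid


# Hypothetical check-ins: a tight group of four points and one isolated point.
checkins = [(480, 510), (520, 490), (500, 500), (510, 530), (2500, 2500)]
grid = density_grid(checkins, x_min=0, y_min=0, cell_size=1000,
                    n_cols=3, n_rows=3, bandwidth=800)
for row in grid:
    print(["%.2e" % value for value in row])
```

In this toy grid, the cell containing the group of four check-ins receives the highest density, the cell with the isolated check-in a lower one, and cells farther than the bandwidth from every point a density of zero.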

Fig. 8: Urban collective memory district map.

Source: author. a Map of collective memory districts in the Nanjing metropolitan area. b Enlarged map of collective memory districts in the Nanjing main urban area.

By comparing the distribution patterns of kernel density and clustering results, districts were delineated using kernel density values ranging from 1 to 90. The kernel density analysis indicates that districts with higher density values largely overlap with Nanjing’s public service center system and fall within the current urban built-up areas. Within these districts, the spatial structure, historical heritage, and social activities jointly shape collective memory. The functional elements within these districts mainly exhibit four distinct characteristics: scenic centers, cultural cores, university areas, and gateway nodes. Scenic centers serve as important landmarks of touristic and leisure-related memory, thanks to their rich natural landscapes and historical-cultural background. Cultural cores are primarily located in the city center, reflecting Nanjing’s deep historical and cultural accumulation. University areas, with their high population mobility and strong academic atmosphere, represent key spaces of collective memory, particularly for younger generations. Lastly, gateway nodes, including transport hubs and commercial centers, embody the collective perception of urban dynamism and mobility.

A closer analysis of the spatial characteristics of collective memory within Nanjing’s main urban area reveals that the old city district exhibits the highest intensity and concentration of collective memory. This is closely related to the district’s rich accumulation of historical and cultural heritage. National, provincial, and municipal-level cultural heritage protection units, along with numerous historic districts, are predominantly located within the old city. These districts include the Yihe Road Historic District, Meiyuan New Village Historic District, Presidential Palace Historic District, Chaotiangong Historic District, Nanbuting Historic District, Confucius Temple Historic District, Hehuatang Historic District, Sanjiaoying Historic District, and the Jinling Machinery Manufacturing Bureau Historic District. These districts not only retain well-preserved historical urban forms and abundant cultural relics, but also embody significant modern historical events and collective memories, making them highly recognizable and strongly resonant places in the urban landscape. This phenomenon indicates that Nanjing’s old city, as the cultural and historical core, possesses significantly higher spatial recognition and collective memory intensity than do newly developed districts. It further suggests that historically significant districts generate stronger collective memories than do non-historic districts.

Urban collective memory edge map

Based on Kevin Lynch’s definition, we interpret “edges” as “continuous breaks” or “linear elements other than paths”. In this study, spatial clustering of social media data was conducted using the DBSCAN algorithm to identify high-density areas of memory distribution, revealing irregular spatial edges within the city. In ArcGIS Pro, point data analysis was carried out using a clustering radius of 1200 m and a minimum cluster size of 20 points for the entire Nanjing metropolitan area, and a radius of 500 m with a minimum of 5 points for the main urban area. Based on the clustering results, the edge-related elements of collective memory were identified for both the metropolitan and main urban areas of Nanjing (Fig. 9).
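
The clustering step can be sketched as follows. This is a minimal plain-Python rendering of the DBSCAN idea, not the ArcGIS Pro tool used in this study. It assumes projected coordinates in metres and counts a point as its own neighbour when checking the minimum cluster size, a convention that may differ from the tool's. The sample points are hypothetical; only the 500 m radius and the minimum of 5 points are taken from the main urban area settings above. How cluster outlines are then read as edge elements is not specified in the text and is not attempted here.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbours(i):
        xi, yi = points[i]
        # Brute-force radius search; the point itself is included.
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Hypothetical check-ins: two dense groups and one isolated point.
sample = [
    (0, 0), (100, 0), (0, 100), (100, 100), (50, 50), (200, 50),
    (5000, 5000), (5100, 5000), (5000, 5100), (5100, 5100), (5050, 5050),
    (20000, 20000),
]
labels = dbscan(sample, eps=500, min_pts=5)
print(labels)
```

In the sample, the two dense groups become separate clusters and the isolated point is labelled as noise.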

Fig. 9: Urban collective memory edge map.

Source: author. a Edge analysis map of the Nanjing metropolitan area, generated with a 1200-m radius and a minimum cluster size of 20 points. b Edge analysis map of Nanjing’s main urban area (zoomed-in), generated with a 500-m radius and a minimum cluster size of 5 points.

The spatial distribution of edge-related collective memory elements in the city exhibits clear topographical dependency and socio-cultural attributes. These elements are primarily concentrated along ridgelines of scenic mountain areas, shorelines of major water systems, and administrative division boundaries. This distribution pattern is influenced not only by natural landform constraints but also by the evolution of urban functions, cultural layering, and residents’ spatial perceptions.

Firstly, from the perspective of natural topography, the ridgelines of scenic areas serve as natural interfaces of the terrain and are often perceived as “edges” in visual, spatial, and psychological terms. Several of Nanjing’s major mountain formations—including Laoshan National Forest Park, Zhongshan Scenic Area, Niushou Mountain Cultural Tourism Zone, Fangshan Scenic Area, and Wuxiang Mountain National Forest Park—are primarily developed on the southern or southeastern slopes where gradients are gentler, facilitating transportation and functional zoning of the scenic sites. This development pattern not only follows principles of accessibility and safety determined by terrain but also reinforces the perception of ridgelines as natural thresholds for entering or exiting scenic areas, thereby strengthening public awareness of “urban edges” and “natural edges” in collective memory.

Secondly, in terms of water systems, the Yangtze River and Qinhuai River—Nanjing’s two most significant water bodies—not only form the ecological backbone of the city but also serve as key edge elements in spatial cognition. The Yangtze River, which runs through the city, naturally divides it into northern and southern sections. Its vast width and the bridges that span it create a strong sense of spatial separation, making it the most direct and prominent edge between the old urban core and the new Jiangbei district. The Qinhuai River, regarded as the cultural cradle and historical landmark of Nanjing, has historically played a central role in urban development and city defense systems. Numerous cultural heritage sites and tourist attractions are located along its banks. The deep cultural legacy of the river has gradually cemented its role as a cognitive baseline and cultural dividing line in the city’s spatial structure, serving as a vital point of reference in residents’ collective memory for recognizing and delineating urban space.

Finally, at the administrative division level, the edge memory elements of Nanjing’s urban core largely overlap with the scope of the old city, with the dividing lines between Gulou District and Jianye District, and between Qinhuai District and Jianye District, serving as typical examples. These administrative edges often reflect pronounced functional differences. Gulou and Qinhuai Districts represent the traditional core areas of Nanjing, characterized by historical culture, education and research institutions, and long-established residential life. By contrast, Jianye District symbolizes the city’s direction of new urban expansion, concentrating functions such as government administration, financial and commercial services, and cultural exhibitions. These functional distinctions reinforce residents’ cognitive understanding of urban divisions and, over time, have been internalized through everyday experience as a collective memory structure of urban edges. Edge memory elements therefore concretely reflect both the socio-spatial and functional differences within the city and the continuity of its historical and cultural identity.

Urban collective memory spatial differentiation map

Using social media text data, the regional analysis map was created, with Fig. 10 showing the differentiation of collective memory strength across different districts. Areas with high collective memory strength are often closely linked to rich historical heritage, rapid modern development, a strong commercial atmosphere, and abundant cultural resources. The distribution of collective memory in Nanjing is mainly concentrated in the central urban districts, including Xuanwu, Gulou, Jianye, Qinhuai, and Yuhuatai Districts.

Fig. 10: Urban collective memory spatial differentiation map.

Source: author. a Map of collective memory spatial differentiation in the Nanjing metropolitan area. b Enlarged map of collective memory spatial differentiation in the Nanjing main urban area.

The areas with the highest collective memory strength in Qinhuai District include Fuguanban, Chaotiangong Street, Yueyahu Street, Zhonghua Gate Street, and Confucius Temple Street, which are home to significant historical and cultural landmarks such as the Confucius Temple and the Qinhuai River Scenic Belt. In Jianye District, Xinglong Street is located near the Nanjing Olympic Sports Center and the Hexi CBD, representing a strong modern urban image. Mochou Lake Street is supported by cultural tourism resources such as Mochou Lake Park, the Memorial Hall to the Victims of the Nanjing Massacre, and the Nanjing Yunjin Research Institute, which carry cultural symbolism. In Gulou District, Hunan Road Street is a traditional commercial area, while the districts of Ninghai Road, Huaqiao Road, and Renmin South Road preserve many historical buildings from the Republic of China era, forming a unique urban landscape. In Xiaguan Street, historical sites like the Ming City Wall, Yijiangmen, and Yuejiang Tower, along with natural landmarks such as Xiushui Park and Xiaotaoyuan, contribute to the area’s rich collective memory. In Xuanwu District, Suojin Village Street is close to Nanjing Forestry University, fostering an academic atmosphere; Xiaolingwei Street is located on the southern slope of Zijin Mountain, rich in historical and natural resources. Xinjiekou Street, located in the heart of the city, is the most prosperous commercial center in Nanjing. Xuanwumen Street, adjacent to Xuanwu Lake and Jiming Temple, is another important landmark area. Meiyuan New Village Street is an essential historical and cultural district with profound cultural heritage. In Yuhuatai District, the China (Nanjing) Software Valley, a national software industry base, is home to numerous high-tech enterprises, reflecting the city’s collective memory in industrial transformation and technological innovation in the new era.

Outside the main urban areas, some suburban regions in Nanjing also exhibit certain collective memory characteristics. In Pukou District, the area under the management of the Pearl Spring Committee is home to Pearl Spring Scenic Area and Pukou Laoshan Forest Park, reflecting a memory feature dominated by natural landscapes. The High-tech Development Zone gathers industrial platforms such as the Pukou Software Park and the Integrated Circuit Design Base. In Lishui District, Shiqiao Street houses the Shiqiao National Film and Television Base, a well-known attraction in Lishui. In Qixia District, the Zijin (Xianlin) Technology and Entrepreneurship Special Community, located near Xianlin University Town, is a distinctive area that integrates university-based technology transformation, innovation, and entrepreneurship incubation. In Gaochun District, the Gaochun County Tuanjiewei Breeding Farm is transitioning from traditional agricultural cultivation to a modern agricultural demonstration area, combining agricultural production with sightseeing experiences. These suburban areas, while not as densely concentrated as the main urban areas, still contribute distinct elements to the city’s collective memory, showcasing the diversity of Nanjing’s urban and rural integration.

Discussion

This study proposes a method for constructing urban collective memory maps. First, by introducing the theory of urban imagery and the spatial cognitive framework of cities, it considers the shared urban spatial cognition formed by the public, including “paths”, “nodes”, “landmarks”, “districts”, and “edges”, as a potential collective memory map of the public urban image. This provides an important theoretical framework for understanding the relationship between urban collective memory and space and for constructing spatial maps of urban collective memory. Secondly, in terms of methodology, traditional urban memory research often relies on qualitative methods such as interviews and surveys21, which have limitations in capturing urban collective memory and mapping it across large areas and large datasets. This study introduces social media data and utilizes machine learning-based natural language processing methods and geographic spatial analysis techniques to efficiently and accurately spatialize urban collective memory over large areas, revealing its spatial distribution. Additionally, the study not only focuses on the types and quantity distribution of collective memory spaces but also incorporates sentiment analysis and temporal analysis, taking into account the impact of emotions and time on memory intensity. In conclusion, the theory and methodology of constructing the urban collective memory map proposed in this study further enhance the scientific rigor and practical value of collective memory research by efficiently and precisely revealing the emotional intensity and spatial distribution of collective memory within urban spaces. It provides significant theoretical support and practical foundations for urban spatial renewal, cultural heritage preservation, the shaping of urban identity, and future urban planning, thereby promoting a comprehensive understanding of urban social functions and the ongoing development of cultural heritage.

The Nanjing case validates the effectiveness of the five-element framework in revealing the mechanisms of collective memory at a macro scale, while simultaneously exposing multiple tensions such as urban evolution, data bias, and planning discrepancies. The path element shows that the main city’s radial roads and complex transportation hubs exhibit the highest memory density, aligning with Lynch’s (1960) concept that paths are central to urban cognition41 and highlighting the close relationship between transportation functions and collective memory. However, areas like the Jiangbei New District and the southern new city, which are more recent strategic corridors, have lower memory values. This suggests that the symbolic significance of transportation infrastructure requires longer-term social practices and event accumulation to solidify into shared memory. It may also reflect a bias in the travel radius of Weibo users, whose activity tends to concentrate in the central urban areas. The node map shows a dual-peak structure in which historical heritage and modern business cores coexist. Emotional weighting indicates that the “emotional plateau” in the Confucius Temple-Qinhuai River area significantly elevates the overall node index, supporting the idea that the coupling of “emotion-symbol-space” is key to nodes becoming anchors of urban memory. This finding is consistent with the research of Zhang et al.42, which points out that historical sites and cultural centers are crucial for shaping collective memory. However, some high-value points, such as Hexi CBD, exhibit extreme emotional polarization, suggesting that “high frequency ≠ high recognition”; this requires further analysis of both positive and negative semantics in subsequent field interviews. The landmark element shows that natural mountains and city wall gates are scattered, forming a “visual” memory network.
Comparing the historical timeline, pre-Ming Dynasty sites are frequently mentioned, while 20th-century industrial heritage appears only sporadically, revealing a gap in the current memory spectrum regarding modern history. This necessitates the active inclusion of symbols of modernity in heritage narratives and urban branding. In the district element, memory patches of the old city’s cultural core, university areas, and scenic spots are clearly separated, indicating Nanjing’s multi-core cultural structure. However, a comparison with land use planning reveals that high-memory districts overlap significantly with high-intensity development zones, which could lead to the risks of “cultural overload-functional conflict”. By contrast, the memory vacuum in the new city sub-center points to a lack of public cultural facilities and narrative spaces. The edge element highlights how the Yangtze River, water systems, and mountain ridges strengthen the “mental barriers” of natural edges, while administrative edges show almost no trace in the Weibo data, indicating that institutional edges have limited impact on public everyday cognition. To enhance identity in cross-river development strategies along the Yangtze, perceptible cross-edge experiences (such as cultural corridors or nightscape routes) need to be created, rather than relying solely on policy texts.

In summary, the mapping results not only validate the applicability of the five-element model but also reveal issues such as the lag in memory accumulation, emotional polarization, and the absence of modern layers in the memory spectrum. For planning practice, the guiding principle should be “fill gaps, reduce overload, connect fractures”. Filling gaps means inserting commemorative activities and cultural nodes in the new city sub-centers, where the memory vacuum points to a lack of public cultural facilities and narrative spaces. Reducing overload means balancing tourism, commerce, and daily life in the old city, so that historical and cultural elements are preserved while everyday spaces remain functional and vibrant. Connecting fractures means creating cross-edge narrative corridors where natural and administrative boundaries meet, to foster a sense of unity and identity across districts. Together, these strategies aim to turn collective memory into a practical tool for spatial governance and cultural resilience.

Despite its strengths, this study also has certain limitations. First, the data is primarily derived from the Weibo platform, which has a user base that skews towards younger and more urbanized populations. As a result, the collective memories of certain groups, such as older adults or low-income populations, may not be fully represented43. Future research could incorporate data from multiple platforms, such as WeChat and Douyin, to further enrich the panoramic view of collective memory. Second, the issue of contextuality in sentiment analysis still needs further refinement. The simplicity of social media language may impact the accuracy of memory space identification and classification. Future studies could integrate more multimodal data, such as images and videos, or employ more advanced deep learning models to further improve the precision of memory space analysis44. Additionally, the time span of this study is relatively short, covering only the years 2018 to 2024, and does not fully explore how collective memory evolves over time. Future research could conduct longitudinal analyses to investigate the dynamic changes in collective memory in relation to social changes, historical events, and cultural evolution45,46.