Semantic segmentation and spatial grid analysis of Chinese heritage landscape photographic compositions with cross-cultural perspectives

Chai, Haoyue; Lu, Shiyu; Ni, Leqian; Zhu, Yuanyuan; Zhou, Yanbing; Xiao, Yinfei; Mo, Zifeng; Zheng, Tianxiang

doi:10.1038/s40494-026-02439-1

Article
Open access
Published: 24 March 2026

npj Heritage Science volume 14, Article number: 176 (2026)

Abstract

Sustainable heritage management requires understanding visitors’ perceptions beyond technological approaches. This study integrates deep learning (DL), data mining, and spatial grid analysis to investigate tourist impressions of Chinese classical gardens, using 4980 images from Dianping.com and 4024 photographs from TripAdvisor. A DL-based semantic segmentation method combined with spatial grid analysis identifies key garden elements and their spatial distributions. Relationships among compositional features (i.e., component combinations, composition layouts, and composition hotspots) are then examined. Perceptual differences between Chinese and Western tourists across northern and southern gardens are also compared. Results show that the presence, spatial coverage, and compositional prominence of garden elements vary by garden type and cultural background, reflecting distinct visitor preferences. Pixel-level analysis of photographic composition provides an objective, fine-grained understanding of landscape preferences, offering practical insights for heritage planning, visitor experience design, preservation policy, and cross-cultural value co-creation.

Introduction

Visual media are reshaping the ways individuals engage with the landscapes they visit. Photographic composition displays identifiable patterns, suggesting that landscape design can be systematically analyzed and evaluated. In cultural heritage and tourism contexts, user-generated photographs (UGPs) provide valuable empirical evidence for examining these visual patterns, revealing how interactions with attractions relate to place, environment, and emotional experience¹. They also serve as important cues that help travelers identify, categorize, and compare different heritage elements, while reflecting perceptions of specific architectural types within destination images². From a sustainability perspective, UGPs offer insight into the visual representation of heritage sites, supporting the evaluation of promotional strategies, the improvement of heritage management, the strengthening of social connections, and the creation of shared value in cross-cultural contexts³. Accordingly, tourists’ interpretations of heritage places play a critical role in shaping destination image⁴ and informing landscape planning⁵.

Chinese classical gardens (CCGs) have long been recognized as important carriers of cultural and historical heritage⁶. The distinctiveness of CCGs lies in their irregular forms and intricate architectural structures, which make their intended meanings challenging to interpret. A growing body of research has sought to incorporate tourists’ memories, emotions, and experiences into the understanding of heritage values⁷. Existing studies address diverse aspects of CCGs, including garden architecture⁸, garden elements⁵, landscape restoration⁹, and landscape preferences^10,11. For example, garden elements are typically defined through observable spatial attributes and color schemes^12,13, such as vegetation, rockery textures, enclosing structures (e.g., walls and screens), and architectural components including traditional windows, doorways, and pathways^5,14,15,16. Thematic color schemes, particularly greens and earth tones, also enhance spatial coherence and support cognitive mapping^13,17. To analyze complex landscape photographic compositions, recent studies have increasingly combined deep learning (DL) and artificial intelligence (AI) to advance photograph-based cultural heritage theory and practice¹⁸, with the aim of preserving, protecting, and rehabilitating cultural heritage. For instance, DL frameworks have been developed in heritage protection for tasks involving image enhancement¹⁹, crack repair²⁰, color degradation²¹, surface impairment^22,23, and damaged murals²⁴, enabling early warnings of cultural asset deterioration²⁵. These techniques generally fall into two categories: object detection and semantic segmentation²⁶. Object detection identifies and localizes specific targets using bounding boxes, operating at the object level (e.g., detecting the presence and position of a car). Applications include extracting scenic spots²⁷, quantifying landscape preferences²⁸, and identifying specific cultural aspects²⁹ and objects²⁸. Semantic segmentation classifies each pixel into predefined categories without distinguishing individual instances (e.g., all pixels belonging to cars may be labeled in blue in a segmentation map), thereby operating at the pixel level. This approach has been successfully applied to tasks such as on-site inspection of remote heritage buildings³⁰.

However, several limitations remain in the existing literature. First, current studies examining landscape photography in relation to visitor perception or destination image have predominantly focused on the observer’s perspective, recruiting participants for either eye-tracking experiments^1,3,8,31 or questionnaire surveys³². Eye-tracking experiments have been extensively applied to understand observers’ visual behaviors, particularly visual attention. Researchers have also integrated other analytical techniques with eye-tracking analysis^1,3. Questionnaire surveys have also been widely used to examine how visual preferences, experiences, cultural identity, and emotions shape perception^33,34,35. However, research adopting the photographer’s perspective within CCG contexts remains limited. Second, research on landscape preferences in photographic compositions has primarily focused on the identification of objects^27,28, spots³, or shapes⁸. Some researchers have linked the photographic compositions to esthetic scale, demonstrating its impact on cultural heritage identity^36,37. However, although pixel-level analysis provides greater analytical granularity than object-based approaches, semantic segmentation has rarely been applied to heritage photographs to examine tourist perception. In particular, automated identification of compositional structures, extraction of irregular elements (e.g., vegetation and water), and assessment of their spatial organization (e.g., coverage ratios) remain insufficiently explored. Third, cultural background and individual characteristics further shape variations in tourist perception across destinations^12,16,38. Yet cross-cultural analyses of tourist perceptions and behaviors across different heritage types remain limited¹⁶.

These gaps motivate the present study, which addresses three research questions: how garden-related compositional elements in heritage photographs can be quantitatively identified using advanced computer vision techniques; how photographic compositions reflect the landscape preferences of content creators; and whether heritage landscape elements evoke consistent perceptions among tourists. Utilizing various quantitative methodologies, specifically DeepLab visual processing, spatial grid analysis, and k-means clustering, we explored the photographic compositions of diverse Chinese gardens through 9004 UGPs sourced from Dianping.com and TripAdvisor.com. This research offers multiple additions to the literature on cultural heritage. First, we integrate DL–based semantic segmentation with spatial grid analysis to identify garden compositional elements and their spatial distributions, enabling examination of the relationships among compositional structures, spatial layouts, visual hotspots, and tourist preferences. Second, we propose an objective, data-driven framework for analyzing UGPs. This framework enables the inference of photographers’ perceptions and landscape preferences while clarifying the relationship between compositional structures and garden design and planning. Third, we differentiate tourists’ perceptions between two groups of tourists with diverse backgrounds and two famous gardens, highlighting how tourists’ perceptions and destination image are shaped by cultural variations and different types of CCGs. Our findings yield significant insights for landscape design in classical gardens, enhance the integration of heritage marketing and tourist engagement, and furnish scientific guidance for developing a more appealing and functional garden environment.

Methods

Research framework

This study leverages the theoretical framework of environmental psychology, DL methodologies, spatial grid analysis, and data mining to decode cross-cultural tourists’ perceptions through UGPs in two famous CCGs. First, DL-based semantic segmentation is applied to each UGP to assign a category label to every pixel, followed by a label-to-component mapping associated with historical gardens. Second, occupancy analysis is conducted to quantify the spatial distribution and proportional coverage of each component within the image. Third, the Rule of Thirds³⁹ is introduced to capture the geometric sense of proportion⁴⁰ in the psychology of content providers, which underpins the grid analysis. Fourth, component-to-cell occupancy metrics and spatial grid analysis are computed to evaluate compositional features, including component combinations, layout structures, and visual hotspots. Figure 1 presents the overall research framework.

Study areas

In alignment with previous research on comparative spatial structures⁴¹ and the significant role of royal gardens in Northern China and private gardens in Southern China⁴², this study selects the Summer Palace (SP) in Beijing and the Humble Administrator’s Garden (HAG) in Suzhou as the focal study areas. The SP, in Northern China, is a representative royal garden renowned for its integration of large-scale natural landscapes, monumental architecture, and imperial cultural symbolism. The HAG, a classical private garden in Southern China, is celebrated for its refined spatial organization, close interaction between architecture and nature, and rich literary and esthetic traditions. Both are counted among China’s ‘Four Great Classical Gardens’ and have been designated as World Cultural Heritage sites by the United Nations Educational, Scientific, and Cultural Organization (UNESCO) (http://whc.unesco.org/en/list/), underscoring the value of natural and cultural sceneries with historical merit. The selection of these two sites ensures data availability, comparability across contrasting garden typologies, and analytical diversity, while maintaining methodological consistency. Consequently, other CCGs are not included in this study.

Data collection

This study incorporated UGPs to analyze tourists’ perceptions of heritage sites, as visual data capture experiential and perceptual dimensions that are often difficult to obtain through scaled questionnaires or interviews⁴³. Dianping.com was selected to represent Eastern tourists due to its extensive repository of photographs related to the selected gardens, exceeding that of comparable platforms such as Mafengwo.cn, Ly.com, and Ctrip.com. TripAdvisor was selected to represent Western tourists because it offers a larger quantity, higher quality, and greater ownership of user-generated CCG photographs than Instagram, Flickr, and Wikimedia Commons. This data selection strategy follows the approach adopted by Sun et al.¹⁶. We developed a web crawler to extract user-generated images from the photo galleries of Dianping and TripAdvisor. To ensure dataset balance and comparability, 2736 photographs were collected for each garden from each platform. This sample size was determined by the minimum number of available photographs across the four site-platform combinations, specifically those from the HAG on TripAdvisor. When the number of available images exceeded this threshold, photographs were randomly sampled.
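The balancing step described above can be sketched as follows. This is a minimal illustration rather than the authors' crawler or sampling code: the function name `balanced_sample`, the fixed seed, and the photo identifiers are all hypothetical, and the pool sizes are toy values rather than the real gallery counts.

```python
import random

def balanced_sample(photos_by_source, seed=0):
    """Randomly down-sample every source to the size of the smallest one."""
    target = min(len(photos) for photos in photos_by_source.values())
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return {
        source: rng.sample(photos, target)  # sampling without replacement
        for source, photos in photos_by_source.items()
    }

# Hypothetical photo identifiers for the four site-platform combinations.
pools = {
    "HAG_D": [f"hag_d_{i}" for i in range(50)],
    "HAG_T": [f"hag_t_{i}" for i in range(20)],  # smallest pool sets the target
    "SP_D": [f"sp_d_{i}" for i in range(40)],
    "SP_T": [f"sp_t_{i}" for i in range(30)],
}
sample = balanced_sample(pools)
```

With the real galleries, the smallest pool (HAG on TripAdvisor) would fix the target at 2736 photographs per combination.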

Data preprocessing

All images underwent manual screening to ensure sampling accuracy for each study site. Certain images were excluded from subsequent analysis due to challenges in recognition, the prominence of human close-ups, or their lack of connection to the places under study. After human verification, the initial dataset of 10,944 photographs was refined to a final collection of 9004 photographs, which comprises 2495 for HAG on Dianping (HAG_D), 2023 for HAG on TripAdvisor (HAG_T), 2485 for SP on Dianping (SP_D), and 2001 for SP on TripAdvisor (SP_T). Prior to analysis, platform-specific visual artifacts were removed. For Dianping images, watermarks were eliminated by cropping the lower 15% of the image area to ensure consistency and avoid interference with pixel-level semantic segmentation.
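As a rough illustration of the watermark-removal step, the sketch below drops the lowest 15% of pixel rows from an image stored as a list of rows. The helper name `crop_bottom` and the rounding rule are assumptions; the text only states that the lower 15% of the image area was cropped.

```python
def crop_bottom(rows, fraction=0.15):
    """Drop the lowest `fraction` of pixel rows from an image given as a list of rows."""
    height = len(rows)
    removed = round(height * fraction)  # assumed rounding rule for odd heights
    return rows[: height - removed]

# Toy 100 x 4 image; every pixel is a placeholder value.
image = [[0] * 4 for _ in range(100)]
cropped = crop_bottom(image)
```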

Semantic segmentation

We introduced DeepLab, a semantic segmentation approach extensively utilized in computer vision⁴⁴, to identify landscape components within each photograph. First, all images were resized to 512 × 512 pixels to conform to the input requirements of the DeepLab architecture. Second, we utilized the pre-trained DeepLab V3+ model available in the TensorFlow Model Zoo, trained on the ADE20K dataset. Third, each standardized image was processed through the model to generate a segmentation map, in which every pixel was assigned a class ID (i.e., label). Fourth, we created a label-to-component mapping associated with historical gardens, as outlined in Table 1. These components are fundamental features of classical gardens^5,15. We further refined the labels to keep those relevant to our task and classified each reserved label into one of ten components. Fifth, segmentation maps were resized back to the original image dimensions to ensure spatial consistency for subsequent analysis. Finally, manual verification was conducted to reduce potential misclassification and enhance labeling accuracy.
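The label-to-component mapping and the resizing of segmentation maps can be sketched as below. The mapping shown is purely illustrative and is not the authors' Table 1: the ADE20K class names are real, but which names belong to which component is an assumption, as are the one-letter component codes (borrowed from the formula subscripts) and the use of nearest-neighbour resizing, which is chosen here only because it never blends categorical labels.

```python
# Hypothetical mapping from ADE20K class names to component codes.
# Labels not listed (e.g. "sky", "person") are discarded as irrelevant.
LABEL_TO_COMPONENT = {
    "door": "A", "windowpane": "A",            # architectural openings
    "building": "B",                           # buildings
    "sculpture": "D",                          # decorative elements
    "wall": "E", "fence": "E",                 # enclosure walls
    "mountain": "M", "hill": "M",              # mountains and hills
    "path": "P", "bridge": "P",                # pathways and bridges
    "rock": "R",                               # rocks and boulders
    "tree": "T", "plant": "T",                 # trees and plants
    "water": "W", "lake": "W", "river": "W",   # water features
}

def to_component_map(label_map):
    """Replace each pixel's class label with its component code (or None)."""
    return [[LABEL_TO_COMPONENT.get(label) for label in row] for row in label_map]

def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grid of categorical values."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

component_map = to_component_map([["tree", "sky"], ["lake", "wall"]])
restored = resize_nearest(component_map, 4, 4)  # back to a larger "original" size
```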

Table 1 Meaning of the 10 fundamental compositional components (inspired by Kiani and Khakzand’s study⁶⁴ and adapted by the authors)

Occupancy statistics

Rather than quantifying architectural attributes such as enclosing walls or vegetation density⁵, which primarily reflect physical design characteristics, this study evaluates semantic components through occupancy analysis from the tourists’ visual perspective. This approach emphasizes the spatial distribution of landscape elements that appear within photographic compositions. The occupancy analysis consists of a presence indicator and a coverage ratio for each landscape component. The presence indicator is the occurrence probability, that is, the proportion of images that contain the component, as detailed in formulas (1)–(10). The coverage ratio is the pixel-level proportion of the component within images that contain it, as detailed in formulas (11)–(20).

$$O{P}_{A}=\frac{Number\,of\,images\,containing\,architectural\,openings}{Total\,number\,of\,images}$$

(1)

$$O{P}_{B}=\frac{Number\,of\,images\,containing\,buildings}{Total\,number\,of\,images}$$

(2)

$$O{P}_{D}=\frac{Number\,of\,images\,containing\,decorative\,elements}{Total\,number\,of\,images}$$

(3)

$$O{P}_{{E}}=\frac{Number\,of\,images\,containing\,enclosure\,walls}{Total\,number\,of\,images}$$

(4)

$$O{P}_{L}=\frac{Number\,of\,images\,containing\,literati\,garden\,furnishings}{Total\,number\,of\,images}$$

(5)

$$O{P}_{M}=\frac{Number\,of\,images\,containing\,mountains\,and\,hills}{Total\,number\,of\,images}$$

(6)

$$O{P}_{P}=\frac{Number\,of\,images\,containing\,pathways\,and\,bridges}{Total\,number\,of\,images}$$

(7)

$$O{P}_{R}=\frac{Number\,of\,images\,containing\,rocks\,and\,boulders}{Total\,number\,of\,images}$$

(8)

$$O{P}_{T}=\frac{Number\,of\,images\,containing\,trees\,and\,plants}{Total\,number\,of\,images}$$

(9)

$$O{P}_{W}=\frac{Number\,of\,images\,containing\,water\,features}{Total\,number\,of\,images}$$

(10)

$$C{R}_{A}=\frac{Number\,of\,pixels\,belonging\,to\,architectural\,openings}{Total\,pixels\,in\,the\,image}$$

(11)

$$C{R}_{B}=\frac{Number\,of\,pixels\,belonging\,to\,buildings}{Total\,pixels\,in\,the\,image}$$

(12)

$$C{R}_{D}=\frac{Number\,of\,pixels\,belonging\,to\,decorative\,elements}{Total\,pixels\,in\,the\,image}$$

(13)

$$C{R}_{E}=\frac{Number\,of\,pixels\,belonging\,to\,enclosure\,walls}{Total\,pixels\,in\,the\,image}$$

(14)

$$C{R}_{L}=\frac{Number\,of\,pixels\,belonging\,to\,literati\,garden\,furnishings}{Total\,pixels\,in\,the\,image}$$

(15)

$$C{R}_{M}=\frac{Number\,of\,pixels\,belonging\,to\,mountains\,and\,hills}{Total\,pixels\,in\,the\,image}$$

(16)

$$C{R}_{P}=\frac{Number\,of\,pixels\,belonging\,to\,pathways\,and\,bridges}{Total\,pixels\,in\,the\,image}$$

(17)

$$C{R}_{R}=\frac{Number\,of\,pixels\,belonging\,to\,rocks\,and\,boulders}{Total\,pixels\,in\,the\,image}$$

(18)

$$C{R}_{T}=\frac{Number\,of\,pixels\,belonging\,to\,trees\,and\,plants}{Total\,pixels\,in\,the\,image}$$

(19)

$$C{R}_{W}=\frac{Number\,of\,pixels\,belonging\,to\,water\,features}{Total\,pixels\,in\,the\,image}$$

(20)
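Formulas (1)–(20) apply the same two templates to each of the ten components, so they can be computed in one pass. The sketch below assumes each image is a grid of one-letter component codes matching the formula subscripts (with `None` for discarded pixels), and that the reported coverage ratio is the per-image ratio averaged over the images containing the component; the function name `occupancy` and the data layout are illustrative choices, not the authors' implementation.

```python
from collections import Counter

COMPONENTS = "ABDELMPRTW"  # subscripts used in formulas (1)-(20)

def occupancy(component_maps):
    """Return (OP, CR): occurrence probability and mean coverage ratio per component."""
    n_images = len(component_maps)
    present = Counter()    # number of images containing each component
    ratio_sum = Counter()  # sum of per-image coverage ratios
    for cmap in component_maps:
        total_pixels = sum(len(row) for row in cmap)
        counts = Counter(px for row in cmap for px in row if px is not None)
        for comp, n_pixels in counts.items():
            present[comp] += 1
            ratio_sum[comp] += n_pixels / total_pixels
    op = {c: present[c] / n_images for c in COMPONENTS}
    # Average only over images that contain the component (assumption).
    cr = {c: ratio_sum[c] / present[c] if present[c] else 0.0 for c in COMPONENTS}
    return op, cr

# Two toy 2 x 2 images.
images = [
    [["T", "T"], ["W", None]],
    [["T", None], [None, None]],
]
op, cr = occupancy(images)
```

In this toy case 'trees and plants' appear in both images and 'water features' in one, so the two measures diverge exactly as the text describes: a component can be rare yet cover a large share of the frames it appears in.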

Component combination identification

Component combinations are operationalized using co-occurrence cardinality, defined as the number of distinct landscape component types (out of the ten predefined categories) identified within a single image. This image-level metric captures the compositional complexity reflected in UGPs. The distribution of co-occurrence cardinality is then computed to determine how many images contain exactly one, two, and up to ten component types. This analysis enables assessment of the diversity and structural richness of landscape elements within photographic compositions.
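A minimal sketch of the co-occurrence cardinality and its distribution follows, assuming each image is a grid of component codes with `None` marking pixels outside the ten categories. The function names and toy data are hypothetical.

```python
from collections import Counter

def cardinality(component_map):
    """Number of distinct component types present in one image."""
    return len({px for row in component_map for px in row if px is not None})

def cardinality_distribution(component_maps):
    """How many images contain exactly 1, 2, ... component types."""
    return Counter(cardinality(cmap) for cmap in component_maps)

# Three toy 2 x 2 images.
images = [
    [["T", "T"], ["W", None]],   # two component types
    [["T", None], [None, None]], # one component type
    [["B", "E"], ["T", "W"]],    # four component types
]
distribution = cardinality_distribution(images)
```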

Composition layout recognition

To examine spatial composition, we incorporate the Rule of Thirds as a grid-based analytical framework. The Rule of Thirds divides an image into three equal horizontal and three equal vertical sections, forming a 3 × 3 grid with four intersection points that represent areas of visual emphasis. The analysis begins by constructing a density distribution map. For each landscape component, we calculate the intersection area between its bounding box and each of the nine grid cells. Bounding boxes are generated directly from the segmentation masks to enable instance-level compositional analysis: for each class (i.e., the labels listed in Table 1), we identify all spatially discrete regions, defined as connected components, within the corresponding segmentation mask. The bounding box of each instance is then projected back to the original image dimensions and constrained to valid pixel coordinates. This ensures that distinct objects belonging to the same class are treated as independent instances in subsequent spatial analysis. To simplify the resulting distribution, we determine each component’s spatial intensity by identifying the single grid cell with which it shares the largest intersection area; this cell is defined as the component’s primary compositional location. The analysis culminates in the identification of the dominant component within each grid cell. This reverses the analytical perspective: rather than determining where a given component is most prominent, we evaluate which component predominates within each cell. For each of the nine cells, all components whose primary spatial intensity falls within that cell are compared, and the component occupying the largest area is designated as dominant. This multi-step procedure establishes a framework for recognizing compositional layouts and underpins the statistical analyses reported in the Results section.
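The layout procedure can be sketched in three small functions: extracting bounding boxes of connected regions, finding each box's primary grid cell, and picking the dominant component per cell. Several details are assumptions rather than facts from the text: 4-connectivity, end-exclusive box coordinates, comparing instances by their overlap area with the primary cell, and all function names.

```python
from collections import deque

def bounding_boxes(cmap, comp):
    """Boxes (top, left, bottom, right), end-exclusive, of 4-connected regions of `comp`."""
    h, w = len(cmap), len(cmap[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r0 in range(h):
        for c0 in range(w):
            if cmap[r0][c0] != comp or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            top, left, bottom, right = r0, c0, r0 + 1, c0 + 1
            while queue:  # breadth-first flood fill over one region
                r, c = queue.popleft()
                top, left = min(top, r), min(left, c)
                bottom, right = max(bottom, r + 1), max(right, c + 1)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if 0 <= nr < h and 0 <= nc < w and not seen[nr][nc] and cmap[nr][nc] == comp:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes

def primary_cell(box, h, w):
    """Return ((grid_row, grid_col), overlap_area) for the 3 x 3 cell with largest overlap."""
    top, left, bottom, right = box
    best = None
    for gr in range(3):
        for gc in range(3):
            cell_top, cell_bottom = gr * h / 3, (gr + 1) * h / 3
            cell_left, cell_right = gc * w / 3, (gc + 1) * w / 3
            overlap_h = max(0, min(bottom, cell_bottom) - max(top, cell_top))
            overlap_w = max(0, min(right, cell_right) - max(left, cell_left))
            area = overlap_h * overlap_w
            if best is None or area > best[1]:
                best = ((gr, gc), area)
    return best

def dominant_components(cmap, components="ABDELMPRTW"):
    """Map each occupied grid cell to the component with the largest area there."""
    h, w = len(cmap), len(cmap[0])
    best = {}  # cell -> (area, component)
    for comp in components:
        for box in bounding_boxes(cmap, comp):
            cell, area = primary_cell(box, h, w)
            if cell not in best or area > best[cell][0]:
                best[cell] = (area, comp)
    return {cell: comp for cell, (_, comp) in best.items()}

# Toy 6 x 6 map: trees in the middle rows, water in the bottom-left corner.
N = None
toy = [
    [N, N, N, N, N, N],
    [N, N, N, N, N, N],
    [N, N, "T", "T", "T", N],
    [N, N, "T", "T", "T", N],
    ["W", "W", N, N, N, N],
    ["W", "W", N, N, N, N],
]
layout = dominant_components(toy)
```

In the toy map the tree region straddles the centre and centre-right cells but overlaps the centre cell more, so the centre cell is its primary compositional location.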

Composition hotspot detection

Composition hotspot detection identifies the locations where tourists tend to take photographs. First, each image was represented as a 10-dimensional vector, with one entry per component, using the one-hot encoding method. Next, k-means clustering was carried out on these vectors to identify image clusters for each source. The elbow technique indicated that the optimal k values were 5, 6, 7, and 5 for HAG_D, HAG_T, SP_D, and SP_T, respectively. Finally, clusters were eliminated based on the predominant co-occurrence components identified in the subsection on component combination identification.
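The clustering step can be illustrated with a small, dependency-free k-means. This sketch reads the encoding as a binary presence vector over the ten components, which is an interpretation, and it swaps in deterministic farthest-point initialisation so the toy run is reproducible; the paper does not state which k-means implementation or initialisation was used, and a library implementation would normally be preferred in practice.

```python
COMPONENTS = "ABDELMPRTW"

def presence_vector(components_in_image):
    """10-dimensional 0/1 vector marking which components an image contains."""
    return [1.0 if c in components_in_image else 0.0 for c in COMPONENTS]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k, iters=50):
    """Plain k-means; returns (labels, inertia). Farthest-point init is an assumption."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centre for every vector.
        labels = [min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors]
        # Update step: move each non-empty centre to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(v, centers[lab]) for v, lab in zip(vectors, labels))
    return labels, inertia

# Toy data: three pond-style images and three courtyard-style images.
images = [{"T", "W"}] * 3 + [{"B", "E"}] * 3
vectors = [presence_vector(img) for img in images]

# Elbow technique: inspect how inertia falls as k grows and pick the bend.
inertias = {k: kmeans(vectors, k)[1] for k in range(1, 4)}
labels, _ = kmeans(vectors, 2)
```

On this toy data the inertia drops to zero at k = 2 and cannot improve further, which is the kind of bend the elbow technique looks for.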

Results

Semantic segmentation and component occupancy

Figure 2 illustrates representative examples of semantic segmentation results. Figure 3 displays the occurrence probability and average coverage ratios of the 10 components from different sources, thereby identifying the constituent composition of the photos and representing the overall landscape preferences. Table 2 reports the occurrence probability of each component, and Fig. 4 displays selected examples featuring the original photographs and their semantic components. Across both gardens, ‘enclosure walls,’ ‘trees and plants,’ and ‘water features’ are the most prominent components and are likely integral to Chinese classical landscape design. These are followed by ‘literati garden furnishings,’ ‘decorative elements,’ and ‘buildings.’ Specifically, according to Fig. 3a and Table 2, ‘trees and plants’ appear in over 95% of the images, frequently serving as the backdrop in tourists’ photographs of the CCGs. Eastern tourists include ‘enclosure walls,’ ‘trees and plants,’ and ‘water features’ more frequently than their Western counterparts, particularly for the SP (see Fig. 4a, b, g, h, t). In contrast, ‘architectural openings,’ ‘mountains and hills,’ ‘pathways and bridges,’ and ‘rocks and boulders’ are photographed more often by Western tourists than by Eastern visitors (see Fig. 4c, e, n, i, p, q, t, v).

**Fig. 2: Examples of UGPs under study, and their segmented images.**

**Fig. 3: Histogram of component occurrence and average coverage ratio across different sources (Source: created by the authors).**

**Fig. 4: Examples of UGPs under study and the segmented semantic components.**

Table 2 Results of the occurrence probability of the 10 fundamental compositional components across different sources

The average coverage ratio in Fig. 3b indicates that ‘trees and plants’ are not only frequently photographed but also compositionally central, suggesting that visitors, particularly domestic travelers, are continually attracted to greenery. Although ‘water features’ occur less frequently than vegetation, their relatively high coverage ratios indicate expansive and open spatial compositions. Certain components, such as ‘mountains and hills,’ ‘pathways and bridges,’ and ‘rocks and boulders,’ exhibit low frequency (see Fig. 3a) yet possess a moderate coverage ratio when they occur. Similarly, ‘buildings’ appear in relatively few of Eastern tourists’ UGPs of SP (see Fig. 3a) but dominate the image when present, owing to their scale and architectural prominence: the component has a substantial coverage ratio of 15% (ranked 2nd) at a moderate occurrence frequency of 856 images (34.31%). Figure 4j–o, r–t may provide insights into these aspects.

Figure 5 presents histograms illustrating the distribution of coverage-ratio intervals for each component across different sources. Most components exhibit either a long-tail or single-dominant distribution pattern. Among them, ‘buildings,’ ‘mountains and hills,’ and ‘pathways and bridges’ are particularly noteworthy. For the ‘buildings’ component, SP_D demonstrates a distinct and relatively even distribution across coverage-ratio intervals (refer to Fig. 5b), indicating that buildings of varying scales—from minor elements to dominant architectural subjects—are consistently represented in these photographs (see Fig. 4i, k, l, m). In contrast, SP_T is concentrated in the lower coverage intervals (below 0.5), with very few Western tourist photographs depicting buildings occupying more than half of the frame (see Fig. 4c, d, v). This suggests that Eastern tourists are more inclined to incorporate buildings of diverse scales into their compositions at SP. Regarding ‘mountains and hills,’ SP_T shows a pronounced concentration in the [0, 0.1) interval, with over 100 images within this range (refer to Fig. 5f). This indicates that international tourists more frequently capture distant mountainous scenery as background elements (see Fig. 4d, u, v). The distribution of ‘pathways and bridges’ further reveals a consistent preference among Western tourists in both gardens (see Figs. 5g and 4c, g, p), suggesting that linear spatial elements play a stronger compositional role in their photographic practices.

**Fig. 5: Histogram of coverage ratio bins for each component across different sources (Source: created by the authors).**

Co-occurrence among multiple components

The co-occurrence of multiple components can substantially enhance the compositional richness and esthetic complexity of garden photographs. Figure 6 presents a histogram illustrating the distribution of component combinations across different sources, thereby revealing tourists’ preferred landscape configurations. Overall, most photographs contain between one and five component types, suggesting a tendency to balance visual simplicity with compositional diversity. Despite certain similarities, notable differences emerge between visitor groups and gardens. For instance, Western tourists were less inclined to capture single-component images in HAG compared with SP. Instead, they more frequently composed photographs featuring three or four components in HAG (see Fig. 6), indicating a preference for more layered spatial compositions in this garden. In contrast, domestic tourists exhibited relatively consistent co-occurrence patterns across both gardens, suggesting stable compositional preferences regardless of site context. However, international tourists demonstrated greater variability between HAG and SP. Specifically, for Western visitors, the distributions of component counts differ markedly between the two gardens: images containing 1, 2, 5, and 6 components are more prevalent in SP than in HAG (see Fig. 4c, q, u, v), whereas photographs featuring 3, 4, 7, and 8 components are more predominant in HAG compared to SP (see Fig. 4e, g, i).

**Fig. 6: Histogram of the co-occurrence of multiple compositional components across different sources (Source: created by the authors).**

Composition layout recognition

Figure 7 presents the histogram of compositional layout across grid positions for different sources. Overall, essential components such as ‘trees and plants,’ ‘water features,’ ‘literati garden furnishings,’ and ‘enclosure walls,’ which visitors frequently mention, are most frequently located in the central and bottom rows of the frame. This distribution suggests that these elements typically occupy the foreground or near-ground spatial zones in tourist photographs. By contrast, the upper row contains fewer components, indicating either reduced visual emphasis or a background function within the composition. Clear cultural distinctions are also evident. Photographs taken by Eastern tourists tend to exhibit more centralized or balanced layouts, whereas Western tourists more frequently produce bottom-heavy compositions, concentrating visual weight in the lower portion of the frame. Further examination reveals that Eastern tourists demonstrate a stronger tendency to utilize upper-row space, particularly in HAG, suggesting an effort to incorporate vertically layered elements into the frame. However, this tendency is less pronounced in SP. In contrast, Western tourists consistently favor positioning components in the center and bottom rows, reinforcing a compositional strategy anchored in foreground emphasis. At the component level, ‘water features’ are frequently arranged in bottom-left positions (see Fig. 4a, f, h, k, n, p, u), indicating a recurring spatial preference in framing water elements. ‘Buildings’ in SP predominantly occupy bottom-row positions (see Fig. 4l, o, r), whereas in HAG they are more often centered within the frame (see Fig. 4a, b, d). Additionally, as shown in Figs. 7 and 4c, g, q, Western tourists tend to position ‘pathways and bridges’ toward the bottom-left or bottom-right areas, further emphasizing directional or linear foreground elements in their compositions.

**Fig. 7: Histogram of compositional layout within each spatial grid across different sources (Source: created by the authors).**

Composition hotspot detection

Table 3 summarizes the clustering results, listing potential photography locations in the third column and corresponding illustrative examples in the final column. These locations represent popular tourist photography hotspots identified using semantic analysis. At HAG, tourists most frequently capture the lotus pond or the Zigzag Bridge area, where these features serve as focal subjects and trees, flora, and enclosure walls form the background. These selections center on symbolically significant and iconic features, consistent with HAG’s literati narrative. Cluster #2 indicates that Western tourists tend to prefer viewpoints featuring notable architecture and water elements, whereas Eastern tourists more often include a wider range of landscape components within courtyard settings. For Eastern tourists, distinct clusters (i.e., Clusters #3 and #5) highlight repeated emphasis on particular architectural and natural attractions. For Western tourists, unique clusters (i.e., Clusters #3, #4, #5, and #6) typically feature ‘buildings,’ ‘decorative elements,’ ‘enclosure walls,’ ‘literati garden furnishings,’ ‘trees and plants,’ and ‘water features’ in varied configurations. Together, these patterns point to greater emphasis on architectural components, pathway variety, and framed views incorporating water and islands. Among Western tourists, the ‘Elegant Bamboo House Courtyard’ (Cluster #4) and ‘Moon Gate Views’ (Cluster #6) are the only clusters in which ‘architectural openings’ emerge as a key element; this component does not appear as a key element in any Eastern tourist cluster. This pattern further indicates that Western tourists are more likely to focus on specific architectural openings and framed views. In summary, Eastern tourists more often produce balanced compositions that foreground literati-associated elements, whereas Western tourists more frequently capture architecturally framed and visually striking perspectives.

Table 3 Image clusters across different sources based on compositional components

In SP, the clusters show greater thematic diversity, although their overall sizes are generally smaller than those in HAG. Approximately half of the clusters include ‘buildings’ (three of seven for Eastern tourists; two of five for Western tourists), suggesting that tourists value iconic structures and enclosed views, potentially shaped by historical education, cultural identity, and national pride associated with the garden. Both Eastern and Western tourists at SP consistently show interest in locations with prominent ‘water features’, including the “Kunming Lake Shoreline,” “Marble Boat,” “Suzhou Street Area,” “Hall of Embracing the Universe & Lakeside Views,” and “Seventeen-Arch Bridge & South Lake Island Approaches.” Taken together, the expansive water body and its associated features appear to be major attractions for both tourist groups. Specifically, the primary cluster for Eastern tourists at SP centers on the expansive “Kunming Lake Shoreline & Willow-lined Walks,” incorporating ‘decorative elements’, ‘trees and plants’, and ‘water features’. In contrast, Western tourists focus more on the “Marble Boat (Qingyan Boat)” cluster, which highlights ‘decorative elements’, ‘literati garden furnishings’, and ‘water features’. Overall, Eastern tourists appear to be drawn to the lake’s broad esthetic appeal and expansive views, whereas Western tourists more often focus on specific, prominent architectural features embedded in the landscape. The “Long Corridor” (Cluster #2 in SP_D) emerges as a key attraction for Eastern tourists, whereas the “Suzhou Street Area” (Cluster #2 in SP_T) draws substantial interest among Western tourists, possibly reflecting its distinctive cultural experience and scenic setting. The “Tower of Buddhist Incense Complex and Views” (Cluster #4 in SP_D) contrasts with the “Temple Complexes on Longevity Hill” (Cluster #3 in SP_T). 
Both tourist groups show interest in religious and architectural complexes on Longevity Hill, but their focal emphases differ subtly. In conclusion, Eastern tourists tend to engage with imperial narratives and themes of harmony, whereas Western tourists are more often drawn to dramatic, visually symbolic sites.

Discussion

The significance of heritage cannot be fully understood without considering human involvement and perception²⁸. Photographic compositions reflect travelers’ perceived preferences and on-site behaviors at a given heritage location. This study applies a quantitative framework to the photographic compositions of UGPs taken by Chinese and Western visitors in northern and southern CCGs, using 4980 images from Dianping and 4024 images from TripAdvisor. The analysis integrates quantitative approaches such as DeepLab, the Rule of Thirds, and k-means clustering. A comparison of component occupancy, compositional layouts, and perceived hotspots highlights differences in visual attention between Chinese and Western tourists, as well as a north–south perceptual disparity across the gardens. Specifically: 1) Tourists preferentially include greenery; among international travelers, attention is more frequently directed toward ‘architectural openings,’ ‘mountains and hills,’ ‘pathways and bridges,’ and ‘rocks and boulders’. 2) Although some components appear infrequently, they can disproportionately influence the photograph; in domestic images, ‘buildings’ show wide variation in coverage ratios. 3) Most photographs incorporate one to five component types to emphasize the garden landscape; domestic tourists exhibit a stable co-occurrence pattern across gardens, whereas overseas travelers capture a broader range of components. 4) Photographs by Eastern travelers tend toward more centralized or balanced compositions, whereas those by Western visitors more often exhibit pronounced bottom-heavy layouts. 5) Eastern and Western travelers appear to favor different photographic hotspots. These results advance understanding of destination-image reconfiguration in heritage landscapes by documenting diverse representations of tourists’ perspectives.
These results also inform strategies to align destination appeal with cross-cultural differences. The findings support sustainable heritage management by enhancing managers’ ability to recognize and respond to shifts in photo-producer perspectives. While HAG and SP serve as illustrative cases, the analytical approach is transferable to other heritage sites. Building on critical heritage studies, this study evaluates how tourists ascribe significance to heritage. By examining photographic composition mechanisms in cross-cultural travel imagery, the study provides new evidence for understanding how heritage assets are appreciated and assigned meaning. First, this study integrates a DL-based semantic segmentation model with spatial grid analysis to identify an expanded set of garden-related compositional components and map their spatial distributions in cultural heritage photographs. This framework enables quantitative links to be established between compositional elements (component co-occurrence, layout patterns, and visual hotspots) and tourists’ photographic preferences. Second, the study provides an objective assessment of UGPs. Its analytical pipeline captures photographers’ viewpoints and landscape preferences while clarifying the potential implications of compositional elements for garden design and planning. Third, the study compares perceptions across two tourist groups with distinct cultural backgrounds and two renowned gardens, highlighting how cultural diversity and CCG type shape tourists’ interpretations.

This study provides theoretical contributions to heritage-perception research. First, we introduce a DL-based semantic segmentation workflow for photographic analysis that identifies a broad set of landscape components and characterizes their compositional arrangements in studies of historic gardens. Specifically, 150 semantic labels are consolidated into 10 garden-element categories to examine the photographic compositions of UGPs in CCGs. We then quantify component co-occurrence, layout patterns, and visual hotspots using presence–coverage indices and their spatial distributions. These presence–coverage metrics differ from most related work, which has relied on architectural drawings or satellite imagery to infer spatial configurations^41,45,46. Although this approach does not directly evaluate architectural authenticity, vegetation condition, or decorative quality, the images capture tourists’ holistic visual impressions of the gardens. The proposed presence–coverage indices and their applications extend the literature on the conceptual representation of destination image. Moreover, pixel-level segmentation enables the automated delineation of landscape components in historic gardens. Compared with object detection approaches used in prior studies^27,28, the proposed semantic analysis delineates pixel-accurate boundaries and shapes. It also accommodates non-rigid or amorphous regions, supports comprehensive scene parsing, improves sensitivity to small or overlapping objects, and provides occupancy grids for geometry-sensitive applications.
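
The label consolidation and presence–coverage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the mapping `LABEL_TO_CATEGORY`, the fine-grained label names, and the two index definitions (presence as the share of photos containing a category, coverage as the mean pixel share among those photos) are assumptions made for illustration, and masks are plain nested lists rather than image arrays.

```python
from collections import Counter

# Hypothetical excerpt of a fine-label -> garden-category mapping; the study's
# actual 150-to-10 consolidation table is not reproduced here.
LABEL_TO_CATEGORY = {
    "tree": "trees and plants",
    "plant": "trees and plants",
    "water": "water features",
    "lake": "water features",
    "building": "buildings",
    "wall": "enclosure walls",
}


def category_coverage(mask):
    """Return {category: pixel share} for one segmentation mask.

    `mask` is a 2-D list of fine-grained label names, one per pixel.
    Unmapped labels (e.g. "sky") are not assigned to any category but
    still count toward the total image area.
    """
    total = sum(len(row) for row in mask)
    counts = Counter(
        LABEL_TO_CATEGORY[label]
        for row in mask
        for label in row
        if label in LABEL_TO_CATEGORY
    )
    return {cat: n / total for cat, n in counts.items()}


def presence_coverage(masks):
    """Aggregate per-photo coverage into (presence, mean coverage) per category.

    presence      = fraction of photos in which the category appears at all
    mean coverage = average pixel share over the photos where it appears
    (both definitions are assumptions for this sketch)
    """
    per_photo = [category_coverage(m) for m in masks]
    result = {}
    for cat in set(LABEL_TO_CATEGORY.values()):
        shares = [c[cat] for c in per_photo if cat in c]
        presence = len(shares) / len(per_photo)
        mean_cov = sum(shares) / len(shares) if shares else 0.0
        result[cat] = (presence, mean_cov)
    return result


# Two toy 2 x 4 "photos"
photo_a = [["tree", "tree", "sky", "sky"],
           ["water", "water", "water", "water"]]
photo_b = [["building", "building", "wall", "wall"],
           ["plant", "plant", "wall", "wall"]]
indices = presence_coverage([photo_a, photo_b])
```

In this toy case greenery is present in both photos but covers only a quarter of each, while water appears in one photo and fills half of it, which mirrors the distinction between how often a component appears and how much of the frame it occupies.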

Second, this study investigates landscape preferences by examining a comprehensive set of compositional elements in historic-garden photographs from the image producers’ perspective. These compositional elements encompass cultural syntagms (i.e., component combinations), structural conventions (i.e., layout patterns), and interaction anchors (i.e., visual hotspots). This producer-oriented perspective differs from most prior studies, which typically infer attention areas from viewers (e.g., visual preferences)^3,8,11,31,36. Because UGPs are created during travel, they provide a complementary lens to observer-centered evaluations and can therefore support cautious inferences about travelers’ shooting intentions. By analyzing UGPs, researchers can characterize photographer–landscape interactions and infer patterns of preference and human–environment engagement¹⁰. Our analysis extends Li and Qiu’s study²⁸ by moving from five landscape types, two-element co-occurrences, and within-garden locations to ten constituent compositions, all-component co-occurrences, and within-photo spatial structures. It also foregrounds core compositional elements (e.g., layouts and hotspots) to link these patterns with tourists’ preferences. Likewise, the scale of co-occurrence combinations is comparable to that reported by Wang et al.⁸ in a different context, who found that integrating intricate building forms can enhance visual attractiveness, whereas excessive building density may induce visual fatigue and discomfort. Building on this line of work, we incorporate a broader range of constituent components beyond buildings using an automated, segmentation-based approach (i.e., semantic segmentation) rather than viewer-based measurements (i.e., eye-tracking).
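
The difference between two-element and all-component co-occurrence can be made concrete with a short sketch. It assumes each photograph has already been reduced to the set of garden-element categories detected in it; the function name `cooccurrence` and the toy data are hypothetical and do not come from the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(photos):
    """Count full component combinations and pairwise co-occurrences.

    `photos` is a list of sets, each holding the category names present
    in one photograph.
    """
    # All-component view: the exact set of categories seen together.
    combos = Counter(frozenset(p) for p in photos)
    # Two-element view: every unordered pair within a photo.
    pairs = Counter()
    for p in photos:
        pairs.update(combinations(sorted(p), 2))
    return combos, pairs


photos = [
    {"buildings", "water features", "trees and plants"},
    {"buildings", "trees and plants"},
    {"buildings", "water features", "trees and plants"},
    {"trees and plants"},
]
combos, pairs = cooccurrence(photos)
```

Here the pair of buildings and greenery occurs in three photos, but only the full-combination count shows that two of those three also contain water, which is the extra information an all-component analysis retains.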

Third, this study proposes a quantitative framework for analyzing tourists’ perceptions in CCGs. Existing methods for examining tourists’ perceptions of cultural heritage have been predominantly qualitative^16,28,47, relying mainly on questionnaires^5,32,33,35,37,48,49,50,51 and interviews^7,51,52, as well as approaches such as netnography⁶ and grounded theory¹⁶; comprehensive quantitative evidence remains limited. We use DL-based recognition techniques to automatically extract salient information from photographs, which may be underrepresented in subjective evaluations, as suggested in other contexts²⁸. We also employ spatial grid analysis to compute cell-occupancy maps, enabling an automated characterization of component spatiality. Our work highlights the value of large-scale datasets and quantitative approaches for decoding travelers’ perceptions⁵² and identifying photographic hotspots, aligning with recent initiatives aimed at integrating tourist perceptions into heritage management² and emphasizing the intersection of heritage landscapes with environmental context, human cognition, and sensory experience^35,38.
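
A cell-occupancy map of the kind mentioned above can be sketched as follows. This is an illustrative reading, not the study's code: it assumes a category-level mask stored as nested lists, a 3 × 3 grid in the spirit of the Rule of Thirds, and occupancy defined as the share of a cell's pixels that belong to the target category. The function names are hypothetical.

```python
def cell_occupancy(mask, target, rows=3, cols=3):
    """Share of each grid cell covered by `target`.

    `mask` is a 2-D list of category names (one per pixel). The image is
    split into rows x cols cells; each pixel is assigned to a cell by
    integer scaling of its coordinates.
    """
    height, width = len(mask), len(mask[0])
    hits = [[0] * cols for _ in range(rows)]
    area = [[0] * cols for _ in range(rows)]
    for y, row in enumerate(mask):
        r = y * rows // height
        for x, label in enumerate(row):
            c = x * cols // width
            area[r][c] += 1
            if label == target:
                hits[r][c] += 1
    # Guard against empty cells when the image is smaller than the grid.
    return [[hits[r][c] / area[r][c] if area[r][c] else 0.0
             for c in range(cols)] for r in range(rows)]


def vertical_profile(occupancy):
    """Mean occupancy of each grid row, ordered top to bottom."""
    return [sum(row) / len(row) for row in occupancy]


# Toy 6 x 6 mask in which water fills the bottom third of the frame.
mask = [["sky"] * 6 for _ in range(4)] + [["water"] * 6 for _ in range(2)]
grid = cell_occupancy(mask, "water")
profile = vertical_profile(grid)
```

A profile that rises toward the last entry, as in this toy mask, is one simple way to flag a bottom-heavy layout for a given component; a roughly flat profile would suggest a more balanced distribution.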

Fourth, this study extends the literature on CCGs by foregrounding cross-cultural destination images captured by Eastern and Western visitors. This cross-cultural comparison identifies both shared regularities and systematic differences in landscape components and their compositions in CCGs, consistent with observations reported in another context⁸. Our results suggest that Eastern tourists tend to seek culturally immersive, context-rich scenes, whereas Western travelers more often favor visually distinctive subjects and object-centered compositions. These patterns are in line with prior evidence^53,54. Cultural psychology suggests that individuals internalize culturally preferred modes of attention through sustained exposure to cultural elements⁵⁵. This “cultural imprinting” may heighten sensitivity to culturally normative cues: individuals with a more independent self-construal tend to privilege focal objects and people, whereas those with a more interdependent self-construal are more attuned to contextual information and to the relationships between focal subjects and their surroundings. In our sample, these tendencies are reflected in Western visitors’ greater emphasis on salient built features (e.g., buildings and walkways) and a preference for more structured viewpoints, which may echo Renaissance-derived conventions of proportion and order. This tendency aligns with the “external perspective” (Ich-Ferne), which foregrounds an object or topic as distinct from its environment. By contrast, Eastern visitors more often adopt dynamic, multidimensional viewpoints that emphasize harmony and fluidity—consistent with Taoist thought and Chinese esthetic traditions—and this tendency is reflected in balanced, often centered compositions (e.g., plants and water). This resonates with the “internal perspective” (Ich-Nähe) in psychology, in which the environment is perceived as interrelated and holistic⁵³, as also suggested by traditional Chinese calligraphy⁵⁴. 
Together, these psychological and cultural lenses help explain how compositional arrangements may mirror tourists’ cultural orientations and affective ties to the places they document⁵³. The results also suggest that service providers and site managers should attend to visitors’ perspectives when interpreting historic-site values⁷. Overall, our framework supports research on heritage communication and cross-cultural engagement.

Fifth, this study addresses a gap in the literature by comparatively evaluating northern and southern CCGs. Globalization can erode regional cultures and contribute to the homogenization of urban landscapes across many countries⁴¹. Accordingly, a careful examination of the distinctive attributes of local cultural heritage landscapes is warranted⁴¹. This study highlights the importance of attending to regional historical and cultural landscapes when interpreting heritage sites. By adopting a cross-cultural lens, we reveal both cognitive diversity and cultural coherence across multiple gardens within destination images. This diversity is reflected in the different co-occurrence patterns of component combinations observed in both gardens, as detailed in the subsection on co-occurrence among multiple components. By contrast, the enclosure walls at HAG and the buildings at SP remain salient in tourists’ collective perception, despite cross-cultural differences between Western and Eastern visitors.

Our findings also have practical implications for heritage management and destination marketing. First, destination marketing organizations (DMOs) can strengthen photography-based campaigns by tailoring content to tourists from different cultural backgrounds. Such culturally targeted strategies can stimulate interest and travel behavior across markets⁵⁶. DMOs can leverage the distinct attributes of northern and southern gardens to develop market-specific promotional materials for Eastern and Western audiences. For example, during peak periods of Western visitation, DMOs could feature under-visited buildings framed by literati garden furnishings and flora to encourage exploration and photography, consistent with Western tourists’ preference for more diverse component combinations. This approach could help redistribute visitor flows, reducing pressure on heavily used viewpoints and mitigating localized environmental impacts. Moreover, our results show that ‘buildings’ form distinct compositional layouts in HAG and SP, enabling site-specific selection of advertising imagery. Decorative elements that attract Western tourists can be accentuated to enhance visual salience in promotional materials, consistent with prior work^5,15. Hotspots identified via semantic segmentation and spatial grid analysis can further guide the selection of vantage points and framing for promotional photography. Overall, recognizing regional and cultural variation in heritage interpretation can support more inclusive and sustainable heritage communication through diverse photographic narratives that engage different audiences.

Second, this study can help DMOs evaluate destination-focused promotional initiatives, improve landscape quality, and refine marketing strategies using evidence from UGPs. Managers can examine how branding and on-site management shape tourists’ impressions not only through official operations and personnel but also through tourists’ own perspectives. In particular, they can assess whether user-generated contributions reinforce favorable destination perceptions⁵⁶, thereby providing an operational way to evaluate whether UGPs cultivate positive destination images. Building on our findings, managers can adjust marketing strategies in light of component co-occurrence, compositional layouts, and hotspots extracted from UGPs. A clearer understanding of photo-producers’ landscape preferences can help identify target markets, inform the design of tourism products and settings aligned with visitors’ expectations, and enhance satisfaction by highlighting distinctive attributes or directing attention to under-visited places²⁸. For example, well-designed garden paths and visually coherent layouts may increase a garden’s appeal, consistent with Liu et al.’s⁵⁷ observation of a close relationship between human activities and the natural environment. Given the high visibility of water across both gardens—and the prominence of buildings among Western tourists at SP—DMOs could consider enhancing lake-based experiences (e.g., boat services) and strengthening architectural presentation at SP (e.g., lantern lighting). More broadly, landscape style and spatial layout should be considered in garden design to improve overall tourist experience and satisfaction and, as previously reported, to reduce stress, improve mood, and encourage physical activity¹⁰. Overall, these results provide managers with an analytical lens for linking marketing actions with tourists’ perceptions.

Third, this study may help local governments develop evidence-informed policies for heritage conservation and enhance tourist experience value through co-creation. By examining cross-cultural landscape preferences and multi-garden design, our study underscores the need for coordinated strategies to manage heritage assets, consistent with prior work⁵⁸. Local governments should improve the perceptual accessibility of diverse heritage expressions and foster cross-cultural engagement among tourists from different cultural backgrounds, thereby supporting conservation and enriching tourist experiences. Governments can promote the photography-relevant features identified here—such as salient architectural components, fine-grained spatial layouts, and effective framing angles—to support conservation and help international tourists overcome communication barriers. This, in turn, calls for local governments to integrate heritage landscapes into urban design as tangible cultural representations, creating perceptual links between global travelers and destination landscapes and, ultimately, strengthening tourists’ co-creation of heritage value.

Fourth, incorporating UGPs into organization-generated photographs may motivate tourists to produce and share higher-quality images online. Tourists may favor value-added offerings that support the creation of UGPs. In turn, these user-led modes of place promotion may generate credible sources of destination imagery and viable alternatives to official representations⁵⁹. This approach directly involves tourists in creating promotional content⁶⁰. Such co-created content may elicit more positive responses during on-site photography, thereby supporting heritage digitization and marketing, informing service optimization, and improving interpretation and display quality—ultimately enhancing tourist satisfaction and loyalty.

This study has several limitations that suggest directions for future research. First, the robustness of our findings may be constrained because the study relied on two platforms (Dianping and TripAdvisor) and a limited sample. Images from other social media platforms (e.g., Flickr, Wikimedia Commons, or Xiaohongshu) and travel websites (e.g., Mafengwo.cn or CTrip) were not included in our sample. Analyzing a more comprehensive dataset can mitigate sampling error⁶¹; therefore, the model should be validated across additional channels, particularly international platforms with abundant content (e.g., Flickr and Wikimedia Commons). Second, because of data availability, we examined only two well-known Chinese gardens, limiting the scope of cross-cultural comparison. Other renowned gardens and additional source markets were not examined; future work could expand to these contexts to further test cross-cultural differences. Third, following Sun et al.’s study¹⁶, Eastern and Western tourist perceptions were proxied using Dianping.com and TripAdvisor.com, respectively. However, images on Dianping.com may not originate exclusively from Chinese tourists, and TripAdvisor.com posts may include contributions from Chinese users. Consequently, proxy-based misclassification may introduce sample bias and affect cross-group comparability. Fourth, although architectural specifics (e.g., engraving patterns, typologies, and decorative features of windows and doors) can enhance destination imagery^6,15, our semantic segmentation pipeline does not capture such fine-grained details. In this instance, DeepLab trained on ADE20K did not support reliable identification of these features. Similarly, some elements (e.g., floors and animals) were excluded from the analysis; future work could incorporate these categories to extend the semantic label set. Fifth, this study assessed the feasibility of applying DeepLab v3+ to cultural-heritage imagery. 
To achieve higher segmentation accuracy, the model should be adapted using domain-specific, human-annotated images; this constitutes a key direction for future work to improve semantic-segmentation precision. Sixth, for the Dianping dataset, we cropped the bottom 15% of each image to reduce watermark interference. Although the main content was largely preserved, cropping may have affected component-level segmentation performance. Future work could explore more robust preprocessing (e.g., vision transformers⁶² and dynamic networks⁶³) to mitigate cropping-related artifacts. Lastly, we focused on semantic segmentation and spatial grid analysis to identify compositional essentials and examine how these features relate to tourist preferences, emphasizing the cognitive dimension of tourists’ destination images. Future studies could examine UGPs from an affective perspective, incorporate content-provider demographics, and apply complementary methods (e.g., color extraction, object detection, sentiment analysis, and esthetic rating) to further evaluate tourist preferences.
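
The watermark-related cropping step can be expressed in a few lines. The sketch below assumes a row-major image stored as a list of pixel rows and rounds the number of removed rows to the nearest integer; the rounding rule and the function name `crop_bottom` are assumptions, since the text only states that the bottom 15% was cropped.

```python
def crop_bottom(rows, fraction=0.15):
    """Drop the bottom `fraction` of rows from a row-major 2-D image."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    height = len(rows)
    # Number of rows to keep after removing the watermark band.
    keep = height - round(height * fraction)
    return rows[:keep]


# Toy 20-row image whose pixel values record the original row index.
image = [[y] * 4 for y in range(20)]
cropped = crop_bottom(image)
```

Because any grid is laid over the cropped frame, the cell boundaries shift upward relative to the original photograph, which is one reason cropping may interact with layout measurements.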

Data availability

Data supporting the findings of this study are available from the following websites (accessed on May 10, 2025): https://www.dianping.com/shop/jXbmjaCBdAMesfbx/photos, https://www.dianping.com/shop/G9K4MX4uAHO4TMjs/photos, https://www.tripadvisor.com/Attraction_Review-g294212-d311538-Reviews-Summer_Palace_Yiheyuan-Beijing.html and https://www.tripadvisor.com/Attraction_Review-g297442-d505717-Reviews-Humble_Administrator_s_Garden-Suzhou_Jiangsu.html.

References

Li, X., Wang, P., Li, L. & Liu, J. The influence of architectural heritage and tourists’ positive emotions on behavioral intentions using eye-tracking study. Sci. Rep. 15, 1447 (2025).
Wei, Y. & Chen, M. Tracing the evolution of tourist perception of destination image: a multi-method analysis of a cultural heritage tourist site. Sustainability 17, 5476 (2025).
Wang, X. M., Che, B. Q., Lou, Q. & Zhu, R. Integrated eye-tracking response surface analysis to optimize the design of garden landscapes. Land 13, 1045 (2024).
Oteros-Rozas, E., Martín-López, B., Fagerholm, N., Bieling, C. & Plieninger, T. Using social media photos to explore the relation between cultural ecosystem services and landscape features across five European sites. Ecol. Indic. 94, 74–86 (2018).
Yang, Z. et al. A study on visual impact assessment of the enclosing wall entourage of Chinese traditional private garden. Environ. Impact Assess. Rev. 105, 107427 (2024).
Zheng, Y., Wei, W., Zhang, L. & Ying, T. Tourist gaze at Chinese Classical Gardens: the embodiment of aesthetics (Yijing) in tourism. J. Hosp. Tour. Res. 48, 353–379 (2024).
Zhang, R. R., Wang, J. N. & Brown, S. ‘The charm of a thousand years’: exploring tourists’ perspectives of the ‘culture-nature value’ of the Humble Administrator’s Garden, Suzhou, China. Landsc. Res. 46, 1071–1088 (2021).
Wang, X. M., Che, B. Q. & Zhu, R. Eye-tracking and psychological analysis: the impact of building shape on visitor visual preference. Buildings 14, 2733 (2024).
Cui, Y. Research on garden landscape reconstruction based on geographic information system under the background of deep learning. Acta Geophys. 71, 1491–1513 (2023).
Su, C., Chen, Y. X., Wang, X. & Song, L. Identifying cultural perceptions by nature language processing method: differences of historical gardens and urban parks in Suzhou, China. ACM J. Comput. Cult. Herit. 18, 7 (2025).
Chen, X., Zhang, Q., Liu, W., Rahimi, R. & Qi, J. Exploring visual attention and perception in hospitality and tourism: a comprehensive review of eye-tracking research. J. Hosp. Tour. Insights (in press) https://doi.org/10.1108/JHTI-06-2024-0621 (2025).
Xu, Y. et al. Towards sustainable heritage tourism: a space syntax-based analysis method to improve tourists’ spatial cognition in Chinese historic districts. Buildings 10, 29 (2020).
Lee, A. H. What does colour tell about tourist experiences? Tour. Geographies 25, 136–157 (2023).
Yang, C. et al. Digital characterization of the surface texture of Chinese classical garden rockery based on point cloud visualization: small-rock mountain retreat. Herit. Sci. 11, 13 (2023).
Wang, Z., Zheng, R., Tang, J., Wang, S. & He, X. The aesthetic imagery of traditional garden door and window forms: a case study of the four major traditional gardens of Lingnan. Buildings 15, 513 (2025).
Sun, T., Li, Y. & Tai, H. Different cultures, different images: a comparison between historic conservation area destination image choices of Chinese and Western tourists. J. Tour. Cult. Change 21, 110–127 (2023).
Ding, M. C., Zhang, J. P., Shen, G. Q., Zheng, Q. Y. & Yuan, H. From photographic images to hierarchical networks? Color associations of a traditional Chinese garden. Color Res. Appl. 48, 735–747 (2023).
Jiang, L. et al. A bibliometric insight into immersive technologies for cultural heritage preservation. Herit. Sci. 13, 126 (2025).
Belhi, A., Bouras, A., Al-Ali, A. K. & Foufou, S. A machine learning framework for enhancing digital experiences in cultural heritage. J. Enterp. Inf. Manag. 36, 734–746 (2023).
Yang, T., Wang, S., Pen, H. & Wang, Z. Automatic identification and inpainting of cracks in mural images based on improved SOM. J. Tianjin Univ. Sci. Technol. 53, 932 (2020).
Ren, H., Sun, K., Zhao, F. H. & Zhu, X. Dunhuang murals image restoration method based on generative adversarial network. Herit. Sci. 12, 39 (2024).
Li, Q. et al. Non-destructive testing research on the surface damage faced by the Shanhaiguan Great Wall based on machine learning. Front. Earth Sci. 11, 1225585 (2023).
Yan, L. A., Chen, Y. L., Zheng, L. & Zhang, Y. Application of computer vision technology in surface damage detection and analysis of shedthin tiles in China: a case study of the classical gardens of Suzhou. Herit. Sci. 12, 72 (2024).
Jia, M. et al. Automatic restoration of Dunhuang murals and process visualization method based on deep learning. Appl. Sci. 15, 1422 (2025).
Cheng, Y., Huang, J., Zhang, Y. & Peng, N. Artificial intelligence in cultural heritage conservation. Chin. J. Nat. 46, 261–270 (2024).
Xia, M., Lu, Z. & Wang, F. Multi-modal social media analytics: a sentiment perception-driven framework in Nanjing Districts. IEEE Access 13, 12603–12622 (2025).
Chen, Y. C., Yu, K. M., Kao, T. H. & Hsieh, H. L. Deep learning based real-time tourist spots detection and recognition mechanism. Sci. Prog. 104, 00368504211044228 (2021).
Li, Y. W. & Qiu, B. Using deep learning approaches to quantify landscape preference of the Chinese Grand Canal: an empirical case study of the Yangzhou ancient canal. Sustainability 16, 3602 (2024).
Li, Y. Y. et al. Detection and recognition of Chinese porcelain inlay images of traditional Lingnan architectural decoration based on YOLOv4 technology. Herit. Sci. 12, 137 (2024).
Liu, Z. et al. Semantic segmentation and photogrammetry of crowdsourced images to monitor historic facades. Herit. Sci. 10, 27 (2022).
Sheng, K. Y., Liu, L., Wang, F., Li, S. N. & Zhou, X. An eye-tracking study on exploring children’s visual attention to streetscape elements. Buildings 15, 605 (2025).
Ortanderl, F. & Bausch, T. Wish you were here? Tourists’ perceptions of nature-based destination photographs. J. Destin. Mark. Manag. 29, 100799 (2023).
Jones, C. E., Theodosis, S. & Lykourentzou, I. The Enthusiast, the interested, the sceptic, and the cynic: understanding user experience and perceived value in location-based cultural heritage games through qualitative and sentiment analysis. ACM J. Comput. Cult. Herit. 12, 1–26 (2019).
Yang, Y., Wang, Z. Y., Shen, H. & Jiang, N. P. The impact of emotional experience on tourists’ cultural identity and behavior in the cultural heritage tourism context: an empirical study on Dunhuang Mogao Grottoes. Sustainability 15, 8823 (2023).
Zhang, Z., Wang, X. & Jiang, M. Empirical study on emotional perception and restorative effects of Suzhou garden landscapes: text mining and statistical analysis. Land 14, 122 (2025).
Svobodova, K., Sklenicka, P., Molnarova, K. & Vojar, J. Does the composition of landscape photographs affect visual preferences? The rule of the Golden Section and the position of the horizon. J. Environ. Psychol. 38, 143–152 (2014).
Yang, W., Chen, Q., Huang, X., Xie, M. & Guo, Q. How do aesthetics and tourist involvement influence cultural identity in heritage tourism? The mediating role of mental experience. Front. Psychol. 13, 990030 (2022).
Wang, Y., Jiang, Y., Ning, X. & Gao, L. Bridging cultural perspectives: developing a sustainable framework for the comparative aesthetic evaluation of eastern and western art. Sustainability 16, 5674 (2024).
Korkmaz, O. Primary perceptual field in visual materials. Soc. Sci. 4, 525–533 (2009).
Gowlett, J. A. The vital sense of proportion: transformation, golden section, and 1:2 preference in Acheulean Bifaces. PaleoAnthropology, 174–187 (2011).
Wu, W., Zhou, K., Li, T. & Dai, X. Spatial configuration analysis of a traditional garden in Yangzhou city: a comparative case study of three typical gardens. J. Asian Archit. Build. Eng. 24, 593–604 (2025).
Jia, S. et al. Conservation and management of Chinese classical royal garden heritages based on 3D digitalization- a case study of Jianxin courtyard in Jingyi garden in Fragrant Hills. J. Cult. Herit. 58, 102–111 (2022).
Cui, R., Cheng, M., Xin, S., Hua, C. & Yao, Y. International tourists’ dark tourism experiences in China: the case of the memorial of the victims of the Nanjing Massacre. Curr. Issues Tour. 23, 1493–1511 (2020).
Huang, Z. H. et al. Predicting microvascular invasion and early recurrence in hepatocellular carcinoma using DeepLab V3+ segmentation of multiregional MR habitat images. Acad. Radiol. 32, 3342–3357 (2025).
Zhou, K., Wu, W., Dai, X. & Li, T. Quantitative estimation of the internal spatio-temporal characteristics of ancient temple heritage space with space syntax models: a case study of Daming Temple. Buildings 13, 1345 (2023).
Lin, Y. J. Spatiotemporal narrative structure of the lingering garden based on traditional Chinese conception of time and space. Landsc. Res. 48, 45–63 (2023).
Zhou, W., Chen, L.-Y. & Chou, R.-J. Important factors affecting rural tourists’ aesthetic experience: a case study of Zoumatang village in Ningbo. Sustainability 13, 7594 (2021).
Szubert, M., Warcholik, W. & Zemla, M. The influence of elements of cultural heritage on the image of destinations, using four Polish cities as an example. Land 10, 671 (2021).
Fan, X. et al. The effect of mixed emotions on the behavioral intentions of tourists at overseas Chinese cultural heritage sites: with place attachment and memorable tourism experience as mediators. J. Tour. Cult. Change (in press) https://doi.org/10.1080/14766825.2025.2465346 (2025).
Lin, Z., Yang, Z. & Yuan, J. Research on the design and image perception of cultural landscapes based on digital roaming technology. Herit. Sci. 12, 397 (2024).
Chen, X. & Shi, Q. Reflections on the reconstruction of architectural heritage in China: a case study of Suzhou. J. Asian Archit. Build. Eng. 1–11 (2025).
Wan, Y. K. P. & Li, J. Chinese tourists’ aesthetic gaze upon urban heritage. Tour. Recreat. Res. 1–16 (2024).
Bao, Y. et al. Aesthetic preferences for eastern and western traditional visual art: identity matters. Front. Psychol. 7, 1596 (2016).
Zhang, B. Reflection and comparison of the eastern and western aesthetics of photography composition. In Proc. 2017 Int. Conf. Sports, Arts, Educ. Manage. Eng. (SAEME 2017) (eds You, Z., Wang, P. & Ma, Y.) 7–11 (Atlantis Press, 2017).
Mavritsaki, E., Chua, S., Allen, H. A. & Rentzelas, P. Cross-cultural differences in attention: an investigation through computational modelling. Brain Organoid Syst. Neurosci. J. 3, 1–7 (2025).
Díaz-Pacheco, A., Guerrero-Rodríguez, R., Alvarez-Carmona, M. A., Rodríguez-González, A. Y. & Aranda, R. Quantifying differences between UGC and DMO’s image content on Instagram using deep learning. Inf. Technol. Tour. 26, 293–329 (2024).
Liu, R., Meng, D., Wang, M., Gong, H. & Li, X. Analysis of spatiotemporal dynamics and driving mechanisms of cultural heritage distribution along the Jiangnan Canal, China. Sustainability 17, 5026 (2025).
Çardak, F. S. Community engagement and heritage awareness for the sustainable management of rural and coastal archaeological heritage sites: the case of Magarsus (Karataş, Turkey). Sustainability 17, 5302 (2025).
Liu, X. W., Nicolau, J. L., Law, R. & Li, C. H. Applying image recognition techniques to visual information mining in hospitality and tourism. Int. J. Contemp. Hosp. Manag. 35, 2005–2016 (2023).
Hou, L. & Pan, X. Aesthetics of hotel photos and its impact on consumer engagement: a computer vision approach. Tour. Manag. 94, 104653 (2023).
Xiang, Z., Du, Q., Ma, Y. & Fan, W. A comparative analysis of major online review platforms: Implications for social media analytics in hospitality and tourism. Tour. Manag. 58, 51–65 (2017).
Leng, Y. et al. Bridging knowledge gap between image inpainting and large-area visible watermark removal. In Proc. AAAI Conf. Artif. Intell. (eds Toby, W., Julie, S. & Zico, K.) 4589–4597 (AAAI Press, 2024).
Niu, L., Zhao, X., Zhang, B. & Zhang, L. Fine-grained visible watermark removal. In Proc. IEEE/CVF Int. Conf. Comput. Vis. (eds Jana, K., Jean, P., Cordelia, S. & Andrew, Z.) 12770–12779 (IEEE Computer Society, 2023).
Kiani, K. & Khakzand, M. The structure, concept and identity of traditional Chinese gardens. Landsc. Hist. 45, 101–120 (2024).

Acknowledgements

The authors express their sincere appreciation to S.G., X.R., S.Z., Z.Z., P.C., Y.S., and X.W. of Jinan University (Shenzhen Campus) for their technical assistance and efforts in data preparation, without which this research could not have proceeded to its present form. This work was supported in part by the Jinan University Shenzhen Campus Funding Program (Grant No. JNSZQH2302), the National Social Science Fund of China (Grant No. 24BGL196), and the National Innovation and Entrepreneurship Training Program for Undergraduates (Grant Nos. 202510559102 and 202510559101).

Author information

These authors contributed equally: Shiyu Lu, Leqian Ni.

Authors and Affiliations

Department of E-Commerce, Jinan University, Shenzhen, China
Haoyue Chai, Shiyu Lu, Leqian Ni, Yanbing Zhou, Yinfei Xiao & Tianxiang Zheng
Department of Hospitality Management, Shenzhen Tourism College, Jinan University, Shenzhen, China
Yuanyuan Zhu
Department of Tourism Management, Shenzhen Tourism College, Jinan University, Shenzhen, China
Zifeng Mo

Authors

Haoyue Chai
Shiyu Lu
Leqian Ni
Yuanyuan Zhu
Yanbing Zhou
Yinfei Xiao
Zifeng Mo
Tianxiang Zheng

Contributions

H.C. contributed to conceptualization and drafted the manuscript. S.L. drafted the manuscript, analyzed the data, and prepared the visualizations. L.N. developed Python code for semantic segmentation and spatial grid analysis. Y.Y.Z. provided resources, including datasets and software. Y.B.Z. analyzed the data and validated the findings. Y.X. contributed to the methodology. Z.M. contributed to model interpretation and resources. T.Z. contributed to manuscript writing, project administration, supervision, and funding acquisition. All authors reviewed the manuscript.

Corresponding author

Correspondence to Tianxiang Zheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chai, H., Lu, S., Ni, L. et al. Semantic segmentation and spatial grid analysis of Chinese heritage landscape photographic compositions with cross-cultural perspectives. npj Herit. Sci. 14, 176 (2026). https://doi.org/10.1038/s40494-026-02439-1

Received: 22 October 2025
Accepted: 06 March 2026
Published: 24 March 2026
Version of record: 24 March 2026
DOI: https://doi.org/10.1038/s40494-026-02439-1

Introduction