Introduction

Industrial heritage comprises buildings, infrastructure, and machinery from the industrial era, which are important aspects of culture, history, and socio-economic development. These heritage sites are not just remnants of past industrial activities; they represent historical narratives and provide valuable insights into technological and cultural changes in different regions. In the coastal areas of Jiangsu Province, China, many hydraulic industrial sites were established between 1895 and 1948, such as dams, canals, and pumping stations1. These sites played an important role in promoting salt production, agricultural irrigation, and local industrialization. However, due to long-term abandonment and environmental exposure, many of them are now neglected and both physically and visually degraded2,3. The importance of protecting these heritage sites is increasingly recognized, not only because of their engineering and functional value but also because of their cultural significance. As urbanization and environmental transformation continue, it is crucial to evaluate and protect the aesthetic and ecological aspects of these heritage landscapes, which embody historical memory and the intersection of engineering and nature. Although the protection of industrial heritage is receiving increasing attention, the role of visual perception in guiding its sustainable reuse and revitalization has not been fully addressed4. Existing methods for assessing industrial heritage sites often focus on the physical integrity of the structure or on overall landscape preferences5,6 and neglect the crucial role of visual perception in heritage conservation7. In addition, hydraulic industrial heritage sites integrate engineering and natural elements, and this complexity poses challenges for evaluating their aesthetic appeal and public participation8.
Although green and blue spaces are often studied for their visual and emotional impact9, there is little research on how industrial heritage landscapes, especially those near waterfront areas10, are perceived in terms of their engineering and natural components. Current evaluation frameworks are also limited in combining objective visual data with subjective human responses. Most studies use either objective methods, such as aerial images or 3D models11,12, or subjective evaluations, such as surveys, but rarely combine the two. In addition, the visual appeal of these sites is dynamic and is often influenced by changing environmental conditions such as decay. A more comprehensive framework is therefore needed to evaluate these heritage landscapes, one that combines objective spatial features with subjective emotional responses.

The role of visual perception in the protection of industrial heritage is receiving increasing attention in the literature13, and research has examined the aesthetic value and cultural significance of heritage sites14. In the past decade, research has shifted from purely qualitative analysis to more data-driven methods15 that integrate spatial data, computer vision technology, and human-centered approaches16,17.

With regard to re-evaluating the value of visual culture, global frameworks for industrial heritage protection, such as that of The International Committee for the Conservation of the Industrial Heritage (TICCIH), emphasize the importance of visual integrity in protecting the functional and aesthetic qualities of industrial landscapes18,19. In China, research on the Grand Canal and other water heritage sites emphasizes the strong correlation between visual elements and public emotional participation, indicating that aesthetics plays a central role in the relationship between the public and industrial heritage20,21.

Advances in computer vision and deep learning have made it possible to analyze and quantify visual elements in industrial heritage landscapes22. Semantic segmentation models such as DeepLabV3+ can perform pixel-level classification of landscape features such as vegetation, water, and building structures23. These tools offer considerable potential for the objective assessment of heritage landscapes24 and supply essential quantitative data for informed conservation decisions25,26.

Waterfront industrial heritage combines natural ecological elements with engineering structures, making it a unique type of heritage that requires a specialized evaluation framework27. Research has shown that the aesthetic and emotional appeal of these sites is often influenced by water and green vegetation, which are associated with restorative experiences and positive psychological responses28.

Despite these technological advances, combining objective visual data with subjective human responses remains challenging. Existing research often relies on static images or small-scale surveys, which limits the immersion and real-world relevance of the assessment. New methods, such as eye tracking and panoramic videos29, are beginning to bridge this gap by providing more dynamic and interactive ways to capture how people experience heritage landscapes.

In the above literature, there are several research gaps that need to be addressed.

  1. Industrial heritage assessment lacks integration between objective and subjective data. Most existing assessments focus on either quantitative image analysis or qualitative sensory data30,31, and few studies have combined the two into a coherent framework for comprehensively understanding the public’s perception of heritage sites.

  2. Limited attention has been paid to water conservancy industrial heritage. Although landscape preference research usually focuses on green or blue spaces32, few studies have addressed the unique features of water conservancy industrial heritage, which integrates human engineering with natural environments such as water bodies.

  3. Dynamic and immersive evaluation methods are lacking33. Traditional methods such as static images or small-scale surveys cannot capture all the dynamic characteristics of these heritage sites34,35.

This article therefore aims to fill these gaps by proposing a comprehensive framework for evaluating the visual quality and reuse potential of coastal hydraulic industrial heritage sites. The study has three main objectives. First, it develops a comprehensive "objective + subjective" visual evaluation framework that combines advanced technologies such as semantic segmentation, drone-based images, and digital twin modeling with subjective public perception data to evaluate the aesthetic and experiential quality of coastal hydraulic industrial heritage sites. Second, it investigates the relationship between visual features and public perception36, exploring how specific visual elements37, such as greenery, water bodies, and building visibility, affect the public’s aesthetic experience and emotional responses. Third, it provides feasible strategies for heritage protection and adaptive reuse38. Based on comprehensive visual and perceptual data, the study offers practical recommendations for the sustainable reuse of hydraulic industrial heritage39 that promote ecological restoration and cultural continuity, and it provides valuable insights for urban planners, policy makers, and heritage professionals40.

Based on the above understanding, the key research questions addressed in this article are:

  1. How can an "objective + subjective" comprehensive visual evaluation framework be developed for coastal hydraulic industrial heritage sites?

  2. What are the relationships and differences between the objective visual composition of these places and the subjective aesthetic perception of the public?

  3. Which visual elements (such as greenery, water, and building visibility) most effectively enhance the aesthetic appeal of hydraulic industrial heritage sites and the public’s emotional participation in them?

Research framework

In the second stage, semantic segmentation is applied to all collected images, including drone aerial photos, eye-level panoramic photos, and short video loops. Panoramic photos were taken on site with a camera such as the Ricoh Theta Z1 at a height of about 1.6 m above the ground to simulate a pedestrian’s perspective41. Because the water conservancy heritage sites lie in different environments, some photos were taken from high mountain slopes and others from relatively flat ground to obtain different perspectives. At the same time, short video loops were extracted from panoramic footage recorded along representative tourist routes42. These multi-source images were processed with the DeepLabV3+ model to extract and quantify six objective indicators: Green Vegetation Index (GVI), Water Surface Index (WVI), Sky Coverage Index (SKVI), Hard Surface Index (HVI), Building Visibility Index (BVI), and Other Artificial Structure Index (OVI)43,44. This step decomposes the landscape elements precisely and provides a comprehensive overview of the visual composition of each location before subjective perception data are integrated in subsequent stages (see Fig. 1).

Fig. 1 Research framework and technology roadmap. Source: authors’ own work.

In the third stage, a multiple linear regression model is employed to examine the relationship between subjective perception scores and the objective indicators obtained through image analysis. This allows for the identification of the most significant spatial and visual variables affecting public aesthetic evaluation, providing empirical support for visually oriented conservation and reuse strategies45.

Method

Research area

This study focuses on eight coastal towns in Jiangsu Province, eastern China (see Fig. 2): Tangzha, Tianxinggang, Haifu, Sanyu, Xinjie, Xinfeng, Qianjin, and Chenjiagang. These towns were key areas for the development of the water conservancy and salt industries during China’s early modernization period (1900–1948). They are representative of early industrial infrastructure and hold significant historical and cultural value, which makes them well suited to this research. Three main reasons justify the selection of this area.

Fig. 2 Visual perception score points of the study area and the 32 water conservancy industrial heritage sites. The base maps of China and Jiangsu Province are sourced from the “Maps” version of the BIGEMAP GIS Office software purchased by the first author (http://www.bigemap.com/). The first author used the software for secondary drawing, so the figure is a composite of the base map and the author’s own drawing. Source: authors’ own work.

The eight towns are representative of the water conservancy and salt industry infrastructure of the early twentieth century. Their sites share three characteristics: (1) strong cultural narratives linked to industrial modernization, (2) well-preserved engineering–landscape integrity, and (3) typological diversity across 32 heritage sites. These attributes provide a robust basis for developing and testing a transferable visual perception–based evaluation framework.

Research design

The study adopts a dual-paradigm approach that integrates computer vision and psychometric assessment to construct a “subjective + objective” visual perception evaluation model. The research process includes image acquisition, semantic segmentation, perceptual questionnaire surveys, and multiple linear regression analysis (See Table 1). The results are presented through visual interpretation of influencing mechanisms. This approach helps identify key visual elements and clarify the relationship between objective spatial structures and public perception, providing a scientific foundation for the renewal and adaptive reuse of industrial heritage. The integrated framework ensures a data-driven approach to understanding how spatial composition influences visual perception and lays the foundation for developing targeted revitalization strategies that align with both cultural values and aesthetic preferences.

Table 1 Research design.

Data collection and analysis

Objective visual data collection and analysis

To capture the visual features of Jiangsu’s coastal water conservancy industrial heritage, data were collected from three sources:

  1. Drone-based aerial photography: from April to June 2024, a DJI Phantom 4 RTK (20 MP, 1-inch CMOS) was flown with RTK correction, achieving a horizontal accuracy of ± 2 cm and a vertical accuracy of ± 3 cm. The flight plan ensured 80% forward and 75% side overlap and was validated against Trimble R10 GNSS ground control points.

  2. Ground images: collected along tourist routes with a Sony α7R IV (61 MP, 24–70 mm f/2.8) at a height of 1.6 m and at intervals of 10–15 m, and geotagged with a Garmin GPSMAP 66sr (± 1.5 m).

  3. Supplementary datasets: orthorectified images from Jiangsu Geographic Information Center (resolution 0.25 m), architectural drawings from Jiangsu Water Resources Bureau, environmental data from National Earth System Science Data Center (10–30 m), and 3D models from Jiangsu Urban Planning and Design Institute (validation deviation less than 5 cm).

The collected images were preprocessed and then input into the DeepLabV3+ semantic segmentation model, which was pre-trained on the ADE20K dataset and then fine-tuned. This enabled the classification and quantification of both natural elements (e.g., vegetation, water bodies, sky) and artificial components (e.g., buildings, paved surfaces, man-made structures). The segmentation outputs served as the foundation for building the objective evaluation index system, which was later compared with subjective perception data to support the visual reuse potential assessment of the heritage landscape.
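As a minimal illustration of how segmentation output can be turned into the objective indices, the sketch below computes each index as the share of labelled pixels in a mask. The class names and the mapping from classes to indices (`CLASS_TO_INDEX`) are assumptions made for this example; the text does not give the authors’ mapping, and a real mask would normally hold numeric class ids rather than strings.

```python
from collections import Counter

# Hypothetical mapping from segmentation classes to the six indices.
# The actual class-to-index assignment used in the study is not given.
CLASS_TO_INDEX = {
    "tree": "GVI", "grass": "GVI", "plant": "GVI",
    "water": "WVI",
    "sky": "SKVI",
    "road": "HVI", "sidewalk": "HVI", "rock": "HVI", "wall": "HVI",
    "building": "BVI",
    "person": "OVI", "car": "OVI", "pole": "OVI",
}
INDICES = ("GVI", "WVI", "SKVI", "HVI", "BVI", "OVI")


def view_indices(mask):
    """Return each index as the share of labelled pixels in a 2-D label mask."""
    counts = Counter()
    total = 0
    for row in mask:
        for label in row:
            index = CLASS_TO_INDEX.get(label)
            if index is None:
                continue  # skip pixels outside the six categories
            counts[index] += 1
            total += 1
    if total == 0:
        return {name: 0.0 for name in INDICES}
    return {name: counts[name] / total for name in INDICES}


# Toy 2 x 5 "image": 4 water, 2 tree, 2 sky, 1 building and 1 car pixel.
toy_mask = [
    ["sky", "sky", "building", "tree", "tree"],
    ["water", "water", "water", "water", "car"],
]
indices = view_indices(toy_mask)
print(indices)
```

Because every index is a proportion of the same pixel total, the six values sum to 1 for any mask that contains at least one labelled pixel.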

Subjective visual perception survey

A structured questionnaire was designed to assess eight dimensions of visual perception: spatial quality, color, texture, uniqueness, historicity, cultural significance, aesthetic appeal, and pleasantness (see Table 2). Both static images and looping videos were used in on-site and online surveys. A total of 3,960 valid responses were collected, including 120 from on-site participants and 3,840 from online participants. Among them, 120 were college students and young teachers. Apart from some local scholars, most of the offline participants were local residents.

Table 2 Description of subjective indicator scores.

A 5-point Likert scale was employed, where 1 indicated “strongly disagree” and 5 indicated “strongly agree.” Respondents rated their perception of each scene based on the eight criteria. The questionnaire demonstrated high internal consistency, with a Cronbach’s α of 0.83. Responses were categorized into three perception levels based on total scores: high (≥ 35), moderate (30–34.9), and low (< 30), and were subsequently used as dependent variables in the modeling analysis.
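The two calculations described here can be sketched as follows: Cronbach’s α from a respondents-by-items score table, and the mapping of a total score (eight items, maximum 40) to the three perception levels using the thresholds stated above. The toy responses are invented for illustration and are not the survey data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def perception_level(total_score):
    """Map a total score over eight items to the level used in the text."""
    if total_score >= 35:
        return "high"
    if total_score >= 30:
        return "moderate"
    return "low"


# Invented ratings: four respondents, eight perception dimensions each.
toy_responses = [
    [4, 4, 5, 4, 4, 5, 4, 4],
    [3, 3, 3, 2, 3, 3, 3, 3],
    [5, 5, 5, 5, 4, 5, 5, 5],
    [2, 2, 3, 2, 2, 2, 3, 2],
]
alpha = cronbach_alpha(toy_responses)
levels = [perception_level(sum(row)) for row in toy_responses]
print(round(alpha, 3), levels)
```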

Data analysis

SPSS was used to process the data and construct multiple linear regression models to examine the influence of six objective indicators on eight dimensions of subjective visual perception (See Table 3). The independent variables included six semantic indices: Green View Index (GVI), Water View Index (WVI), Sky View Index (SKVI), Hardscape View Index (HVI), Building View Index (BVI), and Other Visual Index (OVI). The dependent variables were the perceptual scores for the eight dimensions. The Durbin-Watson (DW) statistic was used to test the independence of residuals, and the Variance Inflation Factor (VIF) was used to diagnose multicollinearity to ensure model robustness. Furthermore, the influence of each variable was visually presented to enhance interpretability and provide guidance for landscape policy and decision-making.

Table 3 Description of objective indicators.
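The study ran these diagnostics in SPSS. Purely as an illustration of what the statistics measure, the following standard-library sketch fits an ordinary least-squares model and computes the Durbin-Watson statistic and the VIF of each predictor. The helper names (`solve`, `ols`, `durbin_watson`, `vif`) and the toy numbers are assumptions for this example and do not reproduce the authors’ analysis.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(xs, y):
    """Least-squares fit with intercept; returns (coefficients, residuals)."""
    rows = [[1.0] + list(r) for r in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid


def durbin_watson(resid):
    """Values near 2 suggest no first-order autocorrelation in residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def vif(xs, j):
    """VIF of predictor j: 1 / (1 - R^2) from regressing it on the others."""
    target = [r[j] for r in xs]
    others = [[v for k, v in enumerate(r) if k != j] for r in xs]
    _, resid = ols(others, target)
    mean = sum(target) / len(target)
    ss_tot = sum((t - mean) ** 2 for t in target)
    return ss_tot / sum(e * e for e in resid)


# Invented data: two predictors (say GVI and WVI) and a perception score.
toy_x = [[0.20, 0.40], [0.35, 0.30], [0.10, 0.55],
         [0.25, 0.20], [0.40, 0.10], [0.15, 0.35]]
toy_y = [3.6, 3.7, 3.9, 3.0, 3.1, 3.3]
beta, resid = ols(toy_x, toy_y)
dw = durbin_watson(resid)
vifs = [vif(toy_x, j) for j in range(2)]
print(beta, dw, vifs)
```

In practice a statistics package would report the same quantities directly; the point of the sketch is only that VIF comes from regressing each predictor on the others, and that the Durbin-Watson statistic compares successive residuals.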

This study aims to combine survey questionnaires with quantitative data obtained from semantic visual segmentation and to make judgements and comparisons using multiple linear regression methods in order to construct an "objective + subjective" visual evaluation framework for the authenticity of industrial heritage in the region. The research results can help researchers and government departments clarify visual quality and scientifically formulate top–down planning and management schemes.

This research used a DeepLabV3+ model pre-trained on the ADE20K dataset and further trained on 2,000 annotated images of coastal cultural heritage landscapes in Jiangsu Province, China. The final model was validated by cross-validation and on a held-out test set. See Supplementary Table 4 and Table 4 for category-specific indicators (accuracy, recall, F1 score). The accompanying chart shows, as a histogram, how the categories differ in accuracy, recall, and F1 score. For example, GVI has the highest accuracy, while BVI’s F1 score is the most balanced. OVI is low on all indicators. Finally, digital twin technology for simulating industrial heritage building images is used to analyze the spatial and historical levels, determine the overall integrity of the heritage space, and better judge the spatial relationships in the heritage segmentation (see Table 4 and Table 5).

Table 4 Evaluation indicators of six main categories.
Table 5 Comparison of accuracy rate, recall rate and F1 score of each category.
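For readers unfamiliar with per-category evaluation, the sketch below shows one common way such indicators are computed from reference and predicted labels. It assumes that the per-class "accuracy" reported in the tables corresponds to precision, which the text does not state; the label lists are invented and unrelated to the study’s test set.

```python
def per_class_metrics(true_labels, pred_labels, classes):
    """Return {class: (precision, recall, f1)} from paired label lists."""
    pairs = list(zip(true_labels, pred_labels))
    metrics = {}
    for cls in classes:
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[cls] = (precision, recall, f1)
    return metrics


# Invented pixel labels: reference annotation versus model prediction.
reference = ["GVI", "GVI", "WVI", "WVI", "BVI", "OVI"]
predicted = ["GVI", "WVI", "WVI", "WVI", "BVI", "BVI"]
scores = per_class_metrics(reference, predicted, ["GVI", "WVI", "BVI", "OVI"])
for name, (precision, recall, f1) in scores.items():
    print(name, round(precision, 3), round(recall, 3), round(f1, 3))
```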

The research methodology is divided into two parts. In the first part, a visual recognition method is used: drone imagery is compared with historical maps to obtain semantic image segmentation of the industrial heritage, and the data are summarized and analyzed to determine the types of industrial heritage landscape. In the second part, multiple linear regression is used to analyze the factors behind the significant regional differences in the degree of industrial heritage protection and renewal. Visual recognition grading is thus combined with the multiple linear regression model to explore the relationship between objective influencing factors and subjective perception and to carry out analytical evaluations.

This study adopts a data-driven, multi-source acquisition and analysis approach to enable precise visual reconstruction and sustainable reuse assessment of hydraulic industrial heritage buildings. The workflow is structured around three core phases (See Fig. 3):

Fig. 3 Flowchart of laser scanning generating digital twin images and visual semantic segmentation analysis. Source: authors’ own work.

Objective visual segmentation acquisition source

Using the research team’s open-source Semantic Segmentation_v1.0 software application, the FCN-based semantic visual segmentation model was run on computing servers equipped with two NVIDIA GTX 1080 Ti graphics processors to produce the final visual segmentation results (see Fig. 4).

Fig. 4 Comparison of visual semantic segmentation. The source images in the comparison were taken on site by the first author.

After these steps, the data and results are used for visual segmentation analysis, an image processing technique that extracts meaningful information and features by segmenting and classifying images and thereby provides a foundation for subsequent applications. As shown in Fig. 4, researchers use the overlapping images generated by image acquisition and laser scanning to segment and annotate different parts of architectural relics based on image color. Image preprocessing denoises, enhances, and standardizes the overlapping images to improve image quality and consistency, which helps the subsequent segmentation and classification operations. Feature extraction uses image processing algorithms to extract features of the architectural remains, such as edges, contours, and textures, which form the basis for segmentation and classification. An appropriate segmentation algorithm is then selected; common choices include threshold segmentation, region growing, and watershed. These algorithms divide the image into regions, each corresponding to a part of the architectural remains. The segmented regions are then classified and annotated, with labels and attributes assigned to each region. For example, walls, columns, doors, and windows can be labeled separately to facilitate subsequent management and analysis. Through visual segmentation analysis, researchers obtain detailed information on the structural features of architectural remains, which provides accurate data support for subsequent protection and restoration work. For example, during repair, the parts to be repaired and the repair methods can be determined from the segmentation results; during monitoring, regular segmentation analysis can promptly detect changes and problems in the building remains so that corresponding protective measures can be taken.
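Two of the classical algorithms named above can be illustrated on a toy grayscale grid: threshold segmentation followed by region growing over 4-connected neighbours. This is a generic sketch and not the study’s pipeline; the pixel values and the threshold of 128 are arbitrary choices for the example.

```python
from collections import deque


def threshold(image, level):
    """Binary mask: 1 where the pixel value reaches the threshold."""
    return [[1 if px >= level else 0 for px in row] for row in image]


def label_regions(binary):
    """Grow 4-connected regions; returns (label grid, number of regions)."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or labels[y][x]:
                continue
            count += 1  # start a new region from this seed pixel
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy grayscale image with two bright areas on a dark background.
toy_image = [
    [200, 210, 30, 40],
    [190, 20, 30, 220],
    [10, 20, 30, 230],
]
mask = threshold(toy_image, 128)
region_labels, region_count = label_regions(mask)
print(region_count)
```

Each labelled region could then be given a class such as wall or column, which is the annotation step described in the paragraph above.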

Objective visual segmentation system

The coastal hydraulic industrial heritage of this region forms a visual scene that combines natural landscapes and human engineering. Reflecting this mix of natural and artificial landscape characteristics, six indicator categories were adopted: the Green Vegetation Index (GVI), the Water Surface Index (WVI), the Sky Cover Index (SKVI), the Hard Surface Index (HVI), the Building Index (BVI), and the Other Artificial Construction Index (OVI). The GVI, WVI, and SKVI are ecological indicators of the natural landscape, and the HVI, BVI, and OVI are artificial indicators that reflect the intensity of construction around hydraulic facilities; together they describe the visual state of the landscape in this region.

As the theoretical basis for the objective indicators, the visual evaluation of coastal industrial heritage buildings requires quantifiable indicators grounded in landscape ecology and heritage conservation principles. Based on the "Patch Corridor Matrix" model (Forman & Godron, 1986) and the ICOMOS Cultural Landscape Guidelines, we established six objective indicators to decode the visual composition of heritage sites. The Green Vegetation Index (GVI) quantifies ecological vitality using normalized vegetation coverage and complies with the ISO 18592:2018 green space indicators. Its theoretical basis is explained in Ulrich’s "Biomass Substitution in Restorative Landscapes" (1983).

The Water Surface Index (WVI) measures hydrological presence by comparing water edge pixels, following the hydrological protocol of the United States Geological Survey. Its theoretical basis is "blue space preference in environmental psychology" (White et al., 2010).

The Sky Coverage Index (SKVI) calculates openness based on the proportion of uncovered sky according to the ASTM E1980 sky landscape coefficient standard.

The Hard Surface Index (HVI) evaluates human impact based on the percentage of impermeable surfaces according to the US Environmental Protection Agency’s stormwater management standards.

The Building Index (BVI) uses structural footprint continuity to assess heritage integrity and is adapted from the ICOMOS Heritage Building Index.

The Other Artificial Construction Index (OVI) tracks intrusive elements based on the visual impact threshold of the International Union for Conservation of Nature and the density of non-heritage objects.

Subjective visual perception system

People’s visual perception experiences vary along multiple dimensions, including the perception of landscape beauty derived from color, form, and psychological perception. Drawing on the literature and the characteristics of coastal areas, this study selects space, color, texture, uniqueness, history, culture, aesthetics, and pleasure as visual perception indicators44,45. These indicators are used to assess the public’s aesthetic preferences for water conservancy industrial heritage landscapes: space, color, and texture capture environmental characteristics, while uniqueness, history, and culture capture heritage characteristics (see Fig. 5).

Fig. 5 Subjective visual perception score. The base map in the upper middle of the figure comes from the “BIGEMAP GIS office” software purchased by the first author. The other photos were taken on site by the first author.

Linking objective and subjective indicators via multiple linear regression

The data are summarized in SPSS using a multiple linear regression model. The input data are compared to determine the relationship46 between subjective and objective visual factors. The subjective visual factors, namely the space, color, texture, uniqueness, history, culture, aesthetics, and pleasure scores, are related to the objective visual factors, namely the GVI, WVI, SKVI, HVI, BVI, and OVI derived from semantic image segmentation, to determine the relationships among the relevant variables.

A total of 120 undergraduate and graduate students from a single university participated in the video-based preference experiment. These participants were aged 18–25 years, with equal gender distribution; evaluation combined offline and online scoring, with online respondents aged 20–65 years. Inclusion criteria included no prior professional training in landscape architecture or visual arts. To enhance external validity, a parallel online survey was distributed via social media platforms, yielding 3,840 valid responses. Both datasets were analyzed separately but compared for consistency in preference patterns.

Clustering and sensitivity analyses

To test sensitivity and allow finer segmentation, we introduced a two-part analytical extension to the baseline visual–perception model.

First, sensitivity analysis was conducted to examine the robustness of the estimated relationships between the six baseline objective indices—Green Vegetation Index (GVI), Water Surface Index (WVI), Sky Coverage Index (SKVI), Hard Surface Index (HVI), Building Visibility Index (BVI), and Other Artificial Structures Index (OVI)—and the subjective Perception Score. Recognizing that OVI conflates heterogeneous elements, we reclassified it into two sub-indices: Dynamic Intrusions (OVI_dyn), representing transient or mobile objects such as pedestrians, vehicles, and boats; and Static Intrusions (OVI_stat), representing fixed artificial elements such as signage, fences, railings, and overhead wires. All variables were expressed as proportions of total visible pixels from semantic segmentation, with SKVI omitted from the regression as a reference category to avoid perfect collinearity in the compositional dataset. Four sensitivity checks were implemented:

  1. OVI reclassification – Comparison of regression models using OVI (combined) versus OVI_dyn and OVI_stat separately.

  2. Normalization scheme – Comparison of standardized coefficients under z-score versus min–max scaling.

  3. Bootstrap confidence intervals (CIs) – Estimation of 95% CIs via 1,000 bootstrap resamples to assess sampling variability.

  4. Leave-one-site-out (LOO) – Sequential omission of each site to evaluate the influence of individual cases.
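The bootstrap and leave-one-site-out checks can be sketched for a simplified case. The example below uses a single predictor and the slope of a simple regression rather than the multi-index model described above, and the site values are invented; the resample count and the percentile interval follow the description in the text, while the function names and the fixed random seed are assumptions.

```python
import random


def slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the slope, resampling sites."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope is undefined
        estimates.append(slope(xs, [y[i] for i in idx]))
    estimates.sort()
    low = estimates[int((1 - level) / 2 * n_boot)]
    high = estimates[int((1 + level) / 2 * n_boot) - 1]
    return low, high


def leave_one_out_range(x, y):
    """Smallest and largest slope when each site is omitted in turn."""
    values = [slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
              for i in range(len(x))]
    return min(values), max(values)


# Invented site-level data: water share (WVI) and mean perception score.
wvi = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45]
score = [3.0, 3.2, 3.1, 3.5, 3.6, 3.8, 3.7, 4.1]
ci_low, ci_high = bootstrap_ci(wvi, score)
loo_min, loo_max = leave_one_out_range(wvi, score)
print(slope(wvi, score), (ci_low, ci_high), (loo_min, loo_max))
```

A narrow leave-one-out range relative to the coefficient itself is what the text means by no single site disproportionately influencing the results.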

Second, a clustering analysis was conducted to identify visually distinct site types. The feature set included the seven indices (GVI, WVI, SKVI, HVI, BVI, OVI_dyn, OVI_stat) standardized to zero mean and unit variance. The k-means algorithm was applied for k = 2–6, with the optimal number of clusters selected based on the highest silhouette score. Cluster separation was visualized using principal component analysis (PCA), and unscaled cluster centroids were tabulated to support interpretation.
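The selection of k by silhouette score can be sketched with the standard library alone. The sketch standardizes the features, runs a basic k-means for k = 2–6, and keeps the k with the highest mean silhouette. The deterministic farthest-point initialization and the two-feature toy data are assumptions made so that the example is reproducible; the study used seven indices, and its initialization is not stated.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(points, k, iters=100):
    """Basic k-means; farthest-point initialization is an assumption."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: dist(p, centers[j]))
               for p in points]
        if new == labels:
            break  # assignments stable: converged
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(sum(dist(p, q) for q, lab in zip(points, labels)
                    if lab == other) / labels.count(other)
                for other in set(labels) if other != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Invented (GVI, WVI) shares for 12 sites forming three visual types.
raw = [[0.10, 0.60], [0.12, 0.62], [0.11, 0.58], [0.09, 0.61],
       [0.50, 0.20], [0.52, 0.22], [0.49, 0.18], [0.51, 0.21],
       [0.30, 0.05], [0.32, 0.04], [0.29, 0.06], [0.31, 0.07]]
features = standardize(raw)
by_k = {k: silhouette(features, kmeans(features, k)) for k in range(2, 7)}
chosen_k = max(by_k, key=by_k.get)
final_labels = kmeans(features, chosen_k)
print(chosen_k, final_labels)
```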

Results

Objective visual segmentation results

Through semantic image segmentation, 14 hydraulic industrial heritage elements (trees/grasses/plants, water surfaces, sky, hard surfaces, hard roads, soft soil roads, sidewalks, hills, rocks, buildings, people, cars, walls, and pillars/signs/shaded areas/streetlights/wire poles) were successfully analyzed. Among these, water surfaces accounted for the highest proportion at 36%, followed by trees/grasses/plants at 19%, buildings at 17%, and the sky at 14%. Hard surfaces, rocks, roads, sidewalks, and walls ranged from 2 to 6%, while people, cars, and pillars accounted for 7%. To explore the objective visual characteristics of hydraulic industrial heritage landscapes, the 14 visual elements were grouped into the six indicators: GVI, WVI, SKVI, HVI, BVI, and OVI. The WVI (36%) and GVI (19%) together accounted for 55%, forming the dominant framework of the heritage landscape. The BVI and SKVI were secondary visual elements, accounting for 17% and 14%, respectively. The OVI accounted for 7%, and the HVI for less than 6%. Based on the pixel ratios of the six objective visual indicators in the panoramic images, we compared the objective characteristics of 30 hydraulic industrial heritage landscapes. The WVI and GVI were widely distributed across the sites, the BVI and HVI were unevenly distributed geographically, and the OVI was evenly distributed. The geographical distribution of the WVI, SKVI, and GVI therefore aligned with the natural landscape characteristics of hydraulic industrial heritage sites near water bodies48. The BVI and HVI reflect the historical and cultural heritage of the water conservancy industry and represent significant artificial modifications in heritage landscapes; they are important indicators for water conservancy landscapes 1, 2, 3, 9, 12, 18, and 19. The OVI is mainly distributed in heritage landscapes with transportation functions, such as sluices and bridges (see Fig. 6).

Fig. 6 Summary of objective semantic visual segmentation.

Results: Sensitivity analysis and clustering

The baseline regression (reference = SKVI) indicated that WVI (β = 0.40, p < 0.001), GVI (β = 0.35, p < 0.001), and BVI (β = 0.25, p < 0.01) were positively associated with Perception Score, whereas OVI (combined) showed a negative association (β = –0.25, p < 0.05) (See Fig. 7).

Fig. 7 Standardized regression coefficients (ref = SKVI) with 95% bootstrap CIs (n = 1,000).

When OVI was decomposed, OVI_dyn retained a stronger negative coefficient (β = –0.28, p < 0.01) than OVI_stat (β = –0.12, n.s.), suggesting that transient intrusions contribute more strongly to adverse visual perceptions than permanent structures (see Fig. 8). Bootstrap 95% CIs confirmed the stability of all significant coefficients (see Fig. 7), and coefficient rankings were consistent under both normalization schemes (see Fig. 9). LOO analysis indicated minimal variation in coefficients when any single site was omitted, demonstrating that no individual site disproportionately influenced the results (see Fig. 10).

Fig. 8

Sensitivity comparison: OVI (combined) vs. OVI_dyn and OVI_stat coefficients.

Fig. 9

Coefficient estimates under z-score vs. min–max normalization.

Fig. 10

Leave-one-site-out coefficient ranges for baseline model.

Clustering analysis yielded an optimal k = 6 (silhouette score = 0.52; See Fig. 11). The unscaled cluster centroids (See Table 6) revealed several visually distinctive site types. For example, Cluster 0 exhibited high WVI and GVI with low OVI_dyn (ecological–open type), Cluster 3 displayed high BVI and HVI with moderate OVI_stat (built–hardscape type), and Cluster 5 was characterized by high OVI_dyn and mixed vegetation/water proportions (activity–intensive type). PCA projection (See Fig. 12) confirmed good separation between these clusters.
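Choosing k by silhouette score and then reporting centroids in the original units can be sketched as below. This is a simplified NumPy-only illustration on synthetic site profiles, assuming Lloyd's k-means with a deterministic farthest-point initialization, which may differ from the k-means variant used in the study. The profiles and noise level are invented for the example and do not reproduce the reported clusters.

```python
import numpy as np


def pairwise_dist(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Euclidean distances between every row of A and every row of B."""
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))


def kmeans(Z: np.ndarray, k: int, n_iter: int = 100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = Z[[0]].astype(float)
    while len(centers) < k:
        nearest = pairwise_dist(Z, centers).min(axis=1)
        centers = np.vstack([centers, Z[nearest.argmax()]])
    for _ in range(n_iter):
        labels = pairwise_dist(Z, centers).argmin(axis=1)
        updated = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return pairwise_dist(Z, centers).argmin(axis=1), centers


def silhouette(Z: np.ndarray, labels: np.ndarray) -> float:
    """Mean silhouette width; points in singleton clusters score 0."""
    D = pairwise_dist(Z, Z)
    ids = np.unique(labels)
    s = np.zeros(len(Z))
    for i in range(len(Z)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue
        a = D[i, same].sum() / (same.sum() - 1)
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return float(s.mean())


# Synthetic stand-in data (assumption): three site profiles over seven indices
# ordered GVI, WVI, SKVI, HVI, BVI, OVI_dyn, OVI_stat, with 10 sites per profile.
profiles = np.array([
    [0.25, 0.45, 0.15, 0.03, 0.05, 0.02, 0.05],
    [0.10, 0.15, 0.10, 0.20, 0.35, 0.02, 0.08],
    [0.18, 0.25, 0.12, 0.08, 0.12, 0.20, 0.05],
])
rng = np.random.default_rng(0)
raw = np.repeat(profiles, 10, axis=0) + rng.normal(0, 0.003, size=(30, 7))
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)  # cluster on scaled data

scores = {k: silhouette(Z, kmeans(Z, k)[0]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
labels, _ = kmeans(Z, best_k)

# Centroids reported in the original (unscaled) units.
unscaled_centroids = np.array([raw[labels == j].mean(axis=0) for j in range(best_k)])

# Two-component PCA projection of the scaled data via SVD.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
projection = Z @ Vt[:2].T
print(best_k, np.round(unscaled_centroids, 2))
```

Clustering on scaled indices and then averaging the raw indices per cluster is one way to obtain "unscaled centroids" that remain interpretable as pixel shares.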

Fig. 11

Silhouette score vs. number of clusters (k = 2–6).

Table 6 Unscaled centroids for k-means clusters (k = 3) on seven objective indices.

Fig. 12

PCA projection of sites colored by k-means cluster (k = 3).

Results of subjective visual perception research

A total of 3840 responses were collected through a Likert-scale questionnaire on the landscapes of the water conservancy industrial heritage sites (24 participants × 5 groups × 32 locations = 3840 responses). Figure 13 shows the average scores of the eight visual perception indicators for the 32 target images. Overall, the total scores of the heritage landscapes range from 11 to 39 points, with Site 29 scoring the highest and Site 24 the lowest. Based on these scores, 13 of the 32 landscapes fall in the high segment (total score > 35 points), 9 in the medium segment (total score > 30 points), and 10 in the low segment (total score ≤ 30 points). In addition, 40% of the heritage sites have a score greater than 35, corresponding to an average of over 4 on each perception index, which indicates that most participants evaluated these sites positively. The chart also shows the results of the eight visual perception indicators for eight urban water conservancy industrial heritage sites. Overall, the highest score is 39 points: Sites 18, 19, and 29 reach the maximum of 39, and Site 24 has the minimum of 22. The satisfaction score was lowest for Site 24 (minimum 1) and highest for Sites 18, 19, and 29 (maximum 6) (See Figs. 13 and 14).
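The response count and the three score segments can be written down as a small sketch. It assumes that the medium band is 30 < score ≤ 35, which is one reading of the thresholds given above. The example totals are hypothetical and are not the study's data.

```python
from collections import Counter


def score_tier(total_score: float) -> str:
    """Assign a site to a segment from its total perception score."""
    if total_score > 35:
        return "high"
    if total_score > 30:  # assumption: medium band is 30 < score <= 35
        return "medium"
    return "low"


# Response count implied by the survey design:
# 24 participants x 5 groups x 32 locations.
n_responses = 24 * 5 * 32

# Hypothetical total scores for a handful of sites (not the study's data).
example_totals = {"A": 39, "B": 36, "C": 33, "D": 30, "E": 22}
tier_counts = Counter(score_tier(s) for s in example_totals.values())
print(n_responses, dict(tier_counts))
```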

Fig. 13

Values of subjective visual perception indicators.

Fig. 14

Subjective visual perception results.

A post-assessment questionnaire analysis of all heritage sites showed that 95% of the participants believed that the panoramic viewing experiment provided a strong sense of presence, indicating that panoramic videos offer a high-quality experience. Additionally, 5% of the participants felt uncomfortable due to the rapid video switching, and some participants who watched the videos on tablets experienced discomfort under varying lighting conditions. After appropriately resting and rewatching the videos, these participants gradually adapted and completed all viewing processes.

Reliability and validity of the subjective visual perception questionnaire

To ensure the reliability and validity of the subjective visual perception questionnaire, this study combined statistical analysis, expert validation, and methodological controls47 (See Table 7).

Table 7 Reliability and validity analysis of the subjective visual perception model.

The reliability of the data was confirmed by Cronbach’s alpha (0.83) and the ICC (0.87), indicating that the questionnaire items are internally consistent and stable over time. The item-total correlations show that all 8 indicators contribute substantially to the total score. Construct validity is also strong: factor analysis reveals two distinct dimensions (environmental and heritage characteristics), which is consistent with the theoretical framework of landscape perception research. The strong correlation between subjective scores and objective image metrics (e.g., texture complexity, r = 0.78) supports convergent validity. Together, the high reliability (α = 0.83) and validity (factor loading ≥ 0.65) of this model support its use in evaluating public perception of water heritage landscapes. The distinction between environmental and heritage features provides a detailed understanding of the drivers of aesthetic preference (for example, the high score of Site 29 is related to its unique historical buildings). The method can be replicated in cross-regional studies to evaluate heritage landscapes in other regions, and it lays the foundation for the subsequent multiple linear regression of subjective scores on objective indicators.
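For readers who want to reproduce the reliability statistics on their own ratings, the sketch below computes Cronbach's alpha and corrected item-total correlations from a respondents × items matrix. The rating matrix is hypothetical, so the printed values are unrelated to the α = 0.83 reported here.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items rating matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1 - item_variances / total_variance))


def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    totals = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], totals - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])


# Hypothetical 5-point ratings (assumption): 6 respondents x 4 items.
ratings = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [3, 2, 3, 3],
], dtype=float)

alpha = cronbach_alpha(ratings)
item_total = corrected_item_total(ratings)
print(round(alpha, 2), np.round(item_total, 2))
```

Alpha approaches 1 when all items move together across respondents, and a low corrected item-total correlation flags an item that contributes little to the total score.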

Regression analysis results

A multiple linear regression analysis was conducted to determine the relationship between the objective visual indicator ratios and the subjective visual perception scores48,49. We constructed nine stepwise regression models, one for each of nine response categories (spatial, color, texture, uniqueness, culture, history, beauty, pleasure, and total), with the six objective indicators as candidate predictors. The Durbin–Watson statistics28 for the nine models are 2.15, 1.9, 1.87, 2.1, 1.9, 2.1, 1.88, 2.1, and 2.01 (all within 1.8–2.2), indicating that the data meet the independence requirement. In all the models, the variance inflation factor (VIF) is less than 5, indicating no multicollinearity issues. Residual histograms show that the residuals are approximately normally distributed, suggesting that regression models can be established using these variables. In the nine stepwise regression models, the GVI, WVI, SKVI, BVI, and HVI are retained as explanatory variables, while the OVI is excluded. As shown in Fig. 15, the GVI is positively correlated with beauty and pleasure, which is consistent with previous GVI results for coastal heritage sites. The WVI is significantly correlated with the subjective visual perception scores, indicating that visible water plays a significant role in the positive perception of heritage sites, with spatial scores increasing as the SKVI and WVI increase. The BVI is positively correlated with texture, uniqueness, culture, and history, suggesting that hydraulic buildings can be considered an important characteristic of heritage features. Overall, the GVI, WVI, and BVI are positively correlated with the total score, indicating that green vegetation, water areas, and buildings play a significant role in visual quality perception. The proportions of the SKVI and HVI are not high and have little impact on the total average score of the eight visual perception indices. The OVI shows no corresponding correlation and is therefore excluded from consideration.
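The two regression diagnostics used above can be computed directly from residuals and predictors. The sketch below is a minimal NumPy illustration with toy inputs. Only the list `reported_dw` comes from the text, and the function names are illustrative.

```python
import numpy as np


def durbin_watson(residuals) -> float:
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    residuals = np.asarray(residuals, dtype=float)
    return float((np.diff(residuals) ** 2).sum() / (residuals ** 2).sum())


def variance_inflation_factors(X) -> np.ndarray:
    """VIF of each predictor: 1 / (1 - R^2) from regressing it on the others."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        design = np.column_stack([np.ones(len(target)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        ss_res = ((target - design @ coef) ** 2).sum()
        ss_tot = ((target - target.mean()) ** 2).sum()
        vifs.append(ss_tot / ss_res)  # equals 1 / (1 - R^2)
    return np.array(vifs)


# Durbin-Watson values as reported for the nine models.
reported_dw = [2.15, 1.9, 1.87, 2.1, 1.9, 2.1, 1.88, 2.1, 2.01]
within_band = all(1.8 <= d <= 2.2 for d in reported_dw)

# Toy inputs (assumption): alternating residuals and two uncorrelated predictors.
dw_toy = durbin_watson([1, -1, 1, -1])
vif_toy = variance_inflation_factors([[1, 1], [1, -1], [-1, 1], [-1, -1]])
print(within_band, dw_toy, vif_toy)
```

Uncorrelated predictors give a VIF of 1, and the value grows as a predictor becomes more predictable from the others, which is why a threshold such as 5 is used as a screen.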

Fig. 15

Results of subjective and objective multiple linear regression analyses.

The value of a dual-track indicator system

Integrating subjective perception and objective spatial metrics addresses a long-standing methodological gap in visual assessment research—namely, the disconnection between human experience and spatial data49. This study bridges that divide by establishing a perception–structure mapping model, which facilitates a more nuanced understanding of how visual elements influence public appreciation and cultural recognition of heritage sites.

In summary, this evaluation system—grounded in interdisciplinary theory and enhanced by robust technological applications—provides the analytical foundation for subsequent regression modeling and visual perception optimization, contributing to the broader goals of sustainable heritage reuse and cultural landscape enhancement.

Discussion

This study demonstrates that integrating semantic segmentation with public visual perception surveys offers a robust and scalable approach for assessing the sustainable reuse potential of coastal hydraulic industrial heritage. Beyond replicating technical processes50, these findings contribute to both theoretical and applied heritage research by reinforcing the role of visual aesthetics—particularly green vegetation (GVI) and water features (WVI)—as primary drivers of public preference. The proposed “objective + subjective” evaluation framework advances existing aesthetic preservation research by explicitly quantifying the relationship between physical landscape elements and perceived scenic beauty. This methodological innovation responds to global heritage conservation calls (e.g., ICOMOS 2021)51 for integrating technological precision with human-centered interpretation, thus bridging a gap between computational heritage documentation and affective engagement studies52.

Theoretical implications

From a theoretical standpoint, the results align with and extend landscape preference theories, confirming that blue–green infrastructure is not only ecologically beneficial but also culturally and emotionally valued in industrial heritage contexts. Our findings are consistent with studies in European canal systems and Japanese waterfront heritage that highlight water–vegetation synergies as key to public satisfaction53. However, the Jiangsu case reveals a higher predictive weight of water (β = 0.34, p < 0.001) relative to vegetation (β = 0.28, p < 0.001), suggesting a distinctive “hydraulic identity” that could inform cross-regional comparative studies. While BVI showed moderate influence, SKVI and OVI were not significant predictors (p > 0.05), indicating that openness and background artificial structures contribute minimally to aesthetic perception in this heritage typology54.

Practical implications

From a practical perspective, the framework provides actionable insights for policymakers and urban planners. Increasing visible water and vegetation coverage could be prioritized in revitalization projects, while preserving distinctive architectural elements to maintain heritage identity. These recommendations can guide design interventions in other coastal industrial heritage contexts, such as adaptive reuse of maritime dockyards or canal-side industrial complexes. Moreover, embedding public perception into decision-making challenges traditional top-down planning and supports bottom-up strategies that align with community aesthetic values, thereby improving acceptance and implementation efficiency.

International comparisons

Comparable heritage revitalization efforts in other coastal contexts reinforce the applicability of our framework while revealing culturally specific variations in aesthetic drivers. For example, in the adaptive reuse of the Royal Albert Dock, Liverpool (UK), water proximity was found to be the primary visual attractor, echoing our findings on WVI dominance, but architectural heritage (brick warehouses) exerted a stronger influence than vegetation, reflecting the docklands’ industrial character55. In Amsterdam’s Eastern Docklands (Netherlands), regeneration projects incorporated extensive blue–green corridors; however, design emphasis leaned toward open sky views and waterfront accessibility56, suggesting a higher SKVI contribution than observed in Jiangsu. In Venice’s Porto Marghera (Italy), heritage-led redevelopment integrated water features with cultural programming, but public preference surveys indicated that historical building textures (BVI) carried more symbolic weight than natural elements57. These cases suggest that while blue–green synergy is globally relevant, its relative weight in public preference shifts according to heritage narratives, architectural typologies, and urban form. Applying this framework cross-regionally could generate comparative datasets to identify universal and context-specific priorities in coastal industrial heritage reuse.

Methodological limitations and applicability

Several limitations must be acknowledged. First, semantic segmentation accuracy was occasionally reduced by vegetation occlusion of building façades, a challenge also reported in urban streetscape analyses. This may limit precision for sites with dense canopy cover. Second, UAV-based panoramic imaging captures visual composition effectively but may underrepresent other sensory dimensions such as soundscapes or tactile qualities. Third, the public perception survey was dominated by respondents from Jiangsu, introducing potential cultural bias; visual preferences may differ significantly in contexts where industrial heritage holds different symbolic meanings.

Despite these limitations, the framework is transferable to other large-scale linear heritage types—such as railways, aqueducts, and coastal defense systems—and adaptable to heritage sites with dynamic environmental changes. Further, integrating emerging tools such as eye-tracking in VR walkthroughs or AI-driven scene interpretation could refine the capture of visual attention and adapt the method for more volatile heritage contexts.

Future research directions

Future research should focus on three priorities:

Cross-regional validation – Apply the framework to diverse international coastal heritage contexts to test generalizability and reveal cultural variation in visual drivers58.

Longitudinal monitoring – Evaluate whether interventions based on the framework sustain visual quality and public appeal over time59.

Technological integration – Incorporate augmented reality (AR) interaction, crowd-sourced imagery, and multi-sensory data to capture more holistic heritage experiences60.

By bridging computational analysis and human-centered perception, this approach supports a global vision for heritage reuse: transforming industrial relics from “abandoned land” into cultural–ecological assets, harmonizing historical memory with contemporary urban and environmental needs.

Conclusion

This study developed and applied an "objective + subjective" comprehensive visual evaluation framework to assess the reuse potential of 32 coastal water conservancy industrial heritage sites in Jiangsu Province, China. By combining drone-based panoramic imaging, digital twin modeling, and semantic segmentation (DeepLabV3+), we quantitatively extracted six key visual indicators (GVI, WVI, SKVI, HVI, BVI, OVI) and linked them to the eight dimensions of public visual perception through large-scale surveys. The results indicate that water surface visibility (WVI) and vegetation coverage (GVI) are the main predictors of scenic beauty and pleasure, while building visibility (BVI) contributes to perceived cultural value and uniqueness. In contrast, the influence of sky coverage (SKVI) and hard surfaces (HVI) is limited, and other artificial structures (OVI) typically do not contribute (See Fig. 15).

The theoretical contribution of this article lies in establishing an empirical connection between objectively measured visual composition and subjective aesthetic preferences in the context of coastal industrial heritage, which advances the theory of heritage conservation. This framework links technological precision with a people-centered interpretation, responding to the global trend of heritage conservation advocating for comprehensive and multidimensional evaluation.

The practical contribution of this article lies in specific design guidelines for heritage planners engaged in global industrial heritage protection and reuse: prioritizing blue-green integration, protecting historically unique structures, and enhancing ecological functions and cultural identity. The method can be transferred to other large linear heritage types, such as canals, railways, and waterfront industrial corridors, and adapted to dynamic environmental conditions.

In terms of international relevance, although the framework was developed for Jiangsu’s water heritage, it is consistent with the patterns observed in adaptive reuse projects in Liverpool, Amsterdam, and Venice, indicating that it is broadly applicable and can be adjusted to different cultural contexts. Comparative applications can help identify which visual drivers are shared across regions and which are context-specific.

Looking ahead, further research should test the cross-cultural generalizability of these findings, use longitudinal monitoring to track visual quality after intervention, and integrate emerging technologies such as VR eye tracking and AI-assisted scene interpretation to capture richer and more diverse public perceptions.

By combining computational image analysis with participatory perception research, this work provides a replicable pathway for transforming abandoned industrial relics into culturally vibrant and ecologically resilient landscapes, contributing to local revitalization and the sustainability of global heritage.