Abstract
Suzhou’s modern masonry buildings hold substantial historical significance, yet they face escalating risks of deterioration due to regional climate fluctuations and anthropogenic influences. Prompt detection of these issues is essential for effective conservation and restoration. This study integrates UAV technology and deep learning for pathology detection, focusing on five categories: material loss (ML), discoloration and deposits (DD), cracks (CR), surface spalling (SS), and biological invasion (BI). The method integrated 3D scanning and oblique photogrammetry to enable automated facade analysis, effectively addressing the limitations of manual inspection. A case study on Soochow Hospital demonstrated its effectiveness, using over 1200 facade images, with 781 for detection. The model achieved mAP@50 scores of 78% (ML), 47.1% (DD), 48.3% (CR), and 52.2% (SS), meeting conservation needs. This approach ultimately provides valuable technical support for the preservation of Suzhou’s masonry buildings and offers new insights into the conservation of modern masonry heritage.
Introduction
Gray bricks are widely used in traditional Chinese architecture and carry profound cultural significance1. As a nationally recognized historical and cultural city, Suzhou is home to numerous modern masonry heritage buildings, including schools, churches, and residences. These buildings embody unique historical narratives and cultural identities reflective of the region’s architectural evolution. Recognizing the cultural and historical importance of these structures, both national and local governments have designated many of them as protected heritage sites, implementing active conservation measures to safeguard their architectural integrity and preserve their cultural significance for future generations (Fig. 1).
However, Suzhou’s humid climate renders masonry building materials particularly susceptible to moisture infiltration2. As a result, some heritage sites exhibit significant deterioration, highlighting an urgent need for proactive restoration efforts.
This study conducted field investigations on samples of modern masonry historical buildings in Suzhou, referencing the Chinese code Stone and Brick Collection’s Disease and Illustration3 and the international ICOMOS illustrated glossary on deterioration patterns of stone4 to identify and classify the common pathologies observed in these structures. The key pathologies documented fall into five categories: material loss (ML), discoloration and deposits (DD), cracks (CR), surface spalling (SS), and biological invasion (BI) (Table 1). These categories provide a basis for targeted conservation strategies to mitigate further degradation and enhance preservation outcomes for Suzhou’s valuable architectural heritage.
Studying architectural pathologies is essential to ensuring the sustainable preservation of historical buildings throughout their entire lifecycle. The process of detecting architectural pathologies can be likened to medical diagnosis, comprising key steps such as conducting a condition survey (analogous to taking a patient’s medical history), identifying the type and distribution of pathologies (diagnosis), selecting appropriate repair measures (treatment), implementing monitoring and intervention strategies (control), and predicting future pathology development (prognosis)5.
Traditional methods for detecting architectural pathologies typically require experts to conduct comprehensive on-site surveys, a process that is often time-intensive and resource-demanding. This manual approach presents several inherent challenges: reliance on subjective judgment during manual inspections, restrictions to visible areas only, difficulty in accurately quantifying the severity of pathologies, high time and labor costs, and challenges in monitoring changes in pathologies over time6.
a. Reliance on subjective judgment: Traditional manual inspections of heritage pathologies depend heavily on the expertise and subjective interpretation of inspectors. Variability in individual experience and perception can lead to inconsistencies in identifying and assessing pathology types and severity, resulting in data that may lack objectivity and reproducibility.
b. Restrictions to visible areas only: Manual inspections are typically limited to easily accessible or visible sections of a structure, meaning that concealed or hard-to-reach areas may go unchecked. Consequently, potential hidden or subtle pathologies may remain undetected, increasing the risk of undiagnosed structural issues that could worsen over time.
c. Difficulty in accurately quantifying severity of pathologies: Accurately quantifying the extent and severity of pathologies is a complex task in manual inspection, where estimations of damage size, depth, or area often rely on visual approximation rather than precise measurement. This lack of quantitative precision can hinder efforts to evaluate the progression of pathologies over time and complicate comparisons between different areas or structures, limiting the ability to prioritize areas for conservation accurately.
d. High time and labor costs: Manual inspection of architectural pathologies is labor-intensive, requiring teams of experts to spend considerable time on-site assessing each building in detail. This approach not only increases costs but also constrains the frequency and scale of inspections that can be conducted, particularly for large-scale heritage sites or in locations with limited accessibility.
e. Challenges in monitoring changes over time: Detecting changes in pathologies over time is crucial for understanding their progression and determining appropriate conservation measures. Manual methods make it difficult to establish a reliable baseline or systematically track minor changes, as assessments may lack consistency across inspection periods. Furthermore, slight alterations in environmental conditions or inspector perspectives can introduce variation, hindering efforts to monitor and manage pathologies effectively across multiple time intervals.
With the growing demand for cultural heritage preservation and the significant advancements in computer vision and data processing, artificial intelligence (AI), particularly deep learning, has become a transformative tool in heritage analysis7. This study applies deep learning technology to the detection and analysis of pathologies in the gray brick of modern historical buildings in Suzhou. By integrating AI-driven techniques, we aim to explore how deep learning can enhance the accuracy and efficiency of pathology detection, thereby supporting more effective and scalable preservation strategies.
The interdisciplinary field combining artificial intelligence (AI) and the detection of pathologies in historical buildings has undergone significant evolution, reflecting advancements in computational power, data availability, and algorithmic complexity6.
Convolutional Neural Networks (CNNs) have become the predominant tool for image-based pathology detection. For instance, in 2022, Samhouri introduced a CNN model for detecting external damage in historic structures, including erosion, material loss, stone discoloration, and sabotage8. The team led by Mehmet Ergün Hatir developed a deep learning model based on Artificial Neural Networks (ANN) in 2020, designed for detecting weathering types in historical stone monuments9. In 2021, the same team applied the Mask R-CNN algorithm to identify and map pathologies observed in archeological sites and monasteries, enabling the intelligent analysis and classification of damage types and severity in masonry materials10. In 2019, Wang conducted research on the application of deep learning techniques, particularly Fast R-CNN, to the rapid identification, localization, segmentation, and evaluation of surface damage in ancient buildings within the Forbidden City11. Similarly, Tawfik Masrour in 2019 employed a transfer learning approach with pre-trained Deep Convolutional Neural Network (DCNN) models to detect pathologies in old buildings. The study highlighted that previous research primarily focused on crack-related damages, often neglecting other significant pathologies affecting surface structures, such as weathering, concrete carbonation, friable plaster, water infiltration, and scaling12. In 2022, Dimitrios Loverdos demonstrated the effectiveness of the U-Net model in detecting pathologies in architectural heritage, achieving high accuracy in identifying two types of damage: brick weathering and cracks13.
However, the need for faster detection in real-time applications led to the development of single-stage object detection models, such as YOLO (You Only Look Once)14. Unlike two-stage models, these architectures directly predict object classes and bounding boxes in a unified framework, balancing detection speed and accuracy.
Since the introduction of YOLO, one-stage target detection methods have garnered significant attention in computer vision research15. Unlike two-stage target detection methods, such as R-CNN, which involve an initial region proposal step followed by category prediction, YOLO consolidates target localization and category prediction in a single forward pass of the network. This streamlined approach eliminates the need for an additional region proposal network, substantially enhancing detection speed and computational efficiency. These advantages make YOLO particularly well-suited for large-scale heritage pathology detection, where rapid and accurate processing of extensive datasets is essential.
In recent years, the YOLO algorithm has undergone significant development, and its various iterations have achieved notable success in the study of architectural heritage conservation, particularly in the identification of pathologies in brick walls, wooden structures, and roof tiles16,17,18,19. Ma used the YOLOv5s model to detect cracks in wooden structures of ancient buildings, demonstrating that this model can identify cracks and similar issues more quickly and efficiently than traditional manual detection methods16. Similarly, in 2023, Idjaton utilized YOLOv5 to detect and identify pathologies such as masonry spalling, cracks, and color changes18. Li applied YOLOv4 to detect gray brick damage on the Shanhaiguan Plain Great Wall, enabling quick identification of damaged areas without altering the appearance and facilitating timely repairs20. Furthermore, in the same year, Yang conducted a pathology analysis of gray bricks in Macao, employing YOLOv4 to identify various deterioration types and highlighting the efficacy of deep learning models in pathology detection19. In 2024, Yan utilized YOLOv4 to analyze and identify tile pathologies in the classical gardens of Suzhou, thereby expanding the application scope of computer vision in cultural relics protection17. Karimi proposed a system based on YOLOv7 that provides an automatic visual inspection solution for tile pathology in Portuguese historical buildings21. Additionally, Zou proposed an improved YOLOv8-seg segmentation model, which was applied to pathology detection and visualization analysis of a masonry tower in Guangdong Province22. These studies underscore the feasibility and practicality of the YOLO algorithm in cultural relics protection and offer valuable insights for research on architectural heritage materials preservation (Table 2).
While progress has been made in developing these models, further exploration and validation of their continued application in heritage conservation are still needed.
Current detection tasks based on YOLO series algorithms generally present results in the form of rectangular bounding boxes. However, given the irregular and complex boundaries characteristic of many pathologies, rectangular boxes frequently capture extraneous information. Moreover, this simplified bounding approach imposes notable limitations on subsequent quantitative analyses, particularly in accurately representing the shape contours and precise areas of pathologies. As a result, it restricts detailed quantitative assessment and constrains the depth of analysis for heritage pathology data. While existing research has made strides in recognizing and detecting pathology images, many studies lack a systematic approach to visualizing the pathology distribution across building facades, thus failing to effectively integrate the spatial distribution information of different pathologies. Zou also asserts that this shortcoming results in a lack of comprehensive visual representation of pathology data22.
To overcome these limitations, this study advances beyond the traditional use of rectangular bounding boxes in object detection by prioritizing instance segmentation and employing precise contour annotation techniques to accurately capture the boundaries and areas of pathologies.
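The gap between a rectangular box and the true pathology area can be illustrated with a small numerical sketch. The example below is not taken from the study; it uses a hypothetical thin diagonal crack on a 100 x 100 pixel tile, and the helper name `bbox_and_mask_area` is an assumption introduced only for illustration.

```python
import numpy as np


def bbox_and_mask_area(mask: np.ndarray) -> tuple[int, int]:
    """Return (bounding-box area, mask area) in pixels for a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0, 0
    # Tight axis-aligned box around all foreground pixels.
    box_area = int((ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1))
    return box_area, int(mask.sum())


# Hypothetical crack: one pixel wide, running along the tile diagonal.
crack = np.eye(100, dtype=bool)
box_area, mask_area = bbox_and_mask_area(crack)
overestimate = box_area / mask_area  # how many times the box inflates the area
```

In this toy case the box covers the whole tile while the crack itself occupies 100 pixels, so any area statistic derived from the box would be inflated a hundredfold. Real pathologies are less extreme, but the same effect motivates contour-level annotation.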
Furthermore, high-resolution facade images obtained via drone technology are integrated with point cloud models, facilitating rapid and detailed visualization of pathology distribution. Recent advancements have seen the integration of deep learning models with complementary technologies to enhance detection capabilities. Multimodal methods, such as combining visual data with thermal imaging and hyperspectral imaging, have provided a more holistic assessment of pathologies23,24,25. Additionally, drone technology equipped with high-resolution cameras and AI-driven models has enabled large-scale, high-precision inspections of heritage sites, addressing accessibility challenges in complex structures26.
In summary, deep learning-based pathology detection has emerged as a prominent research focus in the field of heritage conservation. Initially centered on detecting single types of pathologies, this domain has gradually evolved to encompass the classification and identification of multiple types of material damage. Correspondingly, pathology datasets have expanded to include diverse materials such as red brick, gray brick, stone, rammed earth, tiles, ceramics, and wood. The selection of models has also undergone significant development, transitioning from two-stage object detection methods like CNN, R-CNN, and Mask R-CNN to single-stage detection methods such as YOLO, reflecting advancements in detection efficiency and applicability.
This approach not only enhances detection accuracy but also significantly improves the spatial representation of pathologies, offering a more comprehensive and efficient framework for analyzing and preserving heritage. The objectives of this research are to (1) enhance detection adaptability and efficiency, (2) improve the spatial representation of pathologies for comprehensive analysis, and (3) provide a scalable and systematic framework for the sustainable preservation of Suzhou’s masonry heritage buildings and similar architectural sites. This approach aspires to set a new standard in heritage pathology detection and analysis, supporting more effective conservation strategies.
Methods
Computer vision technology offers valuable new tools for detecting pathologies in architectural heritage, with core applications including image classification, object detection, and image segmentation27. This section presents the methodological framework used to detect and visualize pathologies in masonry historical buildings, combining deep learning techniques with UAVs to enable comprehensive mapping of pathology distribution.
In this study, we established a specialized pathology dataset for gray brick materials in modern buildings across Suzhou. The images were captured between September 2023 and June 2024, focusing on the facades of modern brick buildings under natural daylight. In total, 1000 images were collected, each capturing various types of pathologies on a building facade. Field photographs were used as the primary data source, with representative samples carefully selected to capture the diversity of pathology expressions in modern heritage. The preprocessing phase included image cleaning to remove blurred or low-quality images, followed by labeling with Labelme to prepare for instance segmentation28 (Fig. 2).
For pathology detection, we employed the YOLOv8-seg model, configured to train for 100 epochs. Model training was conducted on NVIDIA A40 GPUs for computational efficiency. Model performance was evaluated using standard metrics, including precision, recall, and mean Average Precision at a 50% IoU threshold (mAP@50), to assess the accuracy and reliability of the pathology detection.
YOLOv8-seg extends the traditional YOLOv8 framework by integrating instance segmentation capabilities29, enabling it to precisely identify and segment object boundaries rather than merely providing bounding boxes. This architecture is designed to enhance detection efficiency, accuracy, and segmentation quality, making it well-suited for applications like pathology detection in heritage conservation. The YOLOv8-seg model workflow consists of four primary components: Input, Backbone, Neck, and Detection Head, which together facilitate better image feature extraction and intelligent detection and classification30 (Fig. 3).
a. Input: The input component preprocesses raw images through resizing, normalization, and data augmentation (e.g., rotation, flipping, scaling). This step ensures that the model can generalize better across diverse and complex images, especially those with irregular pathology boundaries.
b. Backbone: The Backbone, built with convolutional layers and CSP (Cross Stage Partial) structures, extracts multi-level features efficiently by promoting feature reuse and reducing computation. Spatial Pyramid Pooling (SPP) near the end allows the model to handle varying object scales and capture richer contextual features, which is essential for precise pathology segmentation.
c. Neck: The Neck enhances Backbone features via upsampling, downsampling, and additional CSP blocks. This multi-scale fusion captures both fine details and global context, enabling the model to detect pathologies of different sizes and textures.
d. Detection Head: The Detection Head produces pixel-level segmentation maps using specialized output layers. With upsampling and CSP-enhanced convolutional layers, it refines region boundaries and generates accurate segmentation masks for quantitative analysis.
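The Input stage described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the framework's actual pipeline: it squashes the image to a square with nearest-neighbour sampling instead of aspect-preserving padding, the 640-pixel default is assumed, and the function names are hypothetical. The point it shows is that, for instance segmentation, any geometric augmentation must be applied to the image and its mask together.

```python
import numpy as np


def resize_nearest(arr: np.ndarray, size: int) -> np.ndarray:
    """Resize an H x W (x C) array to size x size by nearest-neighbour sampling."""
    h, w = arr.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return arr[rows][:, cols]


def preprocess(image: np.ndarray, mask: np.ndarray, size: int = 640,
               flip_vertical: bool = False) -> tuple[np.ndarray, np.ndarray]:
    """Resize, normalize to [0, 1], and optionally flip image and mask together."""
    image = resize_nearest(image, size).astype(np.float32) / 255.0
    mask = resize_nearest(mask, size)
    if flip_vertical:
        # The same flip is applied to both arrays so the labels stay aligned.
        image = image[::-1].copy()
        mask = mask[::-1].copy()
    return image, mask


# Hypothetical facade photo (all white) with a pathology along its top edge.
photo = np.full((480, 640, 3), 255, dtype=np.uint8)
label = np.zeros((480, 640), dtype=np.uint8)
label[0, :] = 1
out_image, out_mask = preprocess(photo, label, size=64, flip_vertical=True)
```

After the vertical flip, the labeled row sits at the bottom of the mask, matching the flipped image.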
To further validate the model’s effectiveness in real-world heritage conservation, we applied it to facade images of the historical site of Soochow Hospital, captured using drone technology. The pathology detection results were then projected onto orthophotos of the building facade generated from the 3D point cloud models. This integration enabled precise spatial mapping of pathologies across the facade, enhancing the visual and quantitative understanding of degradation patterns. Ultimately, this study combines UAV technology with the YOLOv8-seg deep learning models to develop an automated process for detecting gray brick pathologies in modern Suzhou buildings. In this study, we captured high-resolution images of Suzhou’s masonry buildings using a DJI Mavic 2 Pro drone, equipped with a Hasselblad camera featuring a 1-in. CMOS sensor. The camera was set to ISO 100 to reduce noise, with the aperture ranging from f/5.6 to f/8 for daylight. The drone followed an S-shaped flight path at a maximum altitude of 25 m, ensuring 70% image overlap for comprehensive facade coverage and seamless stitching. By leveraging the capabilities of deep learning, this approach streamlines the detection processes, emphasizing the critical role of high-resolution image detection, automated remote batch processing, and quantitative data analysis in heritage conservation.
Results
Gray brick pathologies dataset for Suzhou’s modern masonry historical buildings
A comprehensive dataset is crucial for training and validating deep learning models31. In this section, we introduce the “Gray Brick Pathologies Dataset,” specifically compiled for Suzhou’s modern masonry historical buildings. The dataset includes five typical pathologies identified in modern masonry historical buildings in Suzhou.
Our team collected a total of 1000 images of brick pathologies, capturing a range of times, locations, and climate conditions to ensure diverse representations. Following a preliminary quality screening, 895 high-quality images were selected for further processing (Table 3). Using LabelMe, our team annotated and processed these images, emphasizing precise boundary annotation to enhance the dataset’s ability to represent pathologies with complex shapes and irregular distributions.
After the annotation process, the dataset was verified by two experienced heritage experts and five historic building conservation specialists, ensuring high accuracy and consistency in the annotations. The final dataset was then converted and divided into training and validation sets in a ratio of approximately 4:1, yielding 721 training samples and 174 validation samples. These steps ensure a high-quality and diverse dataset, providing a solid basis for subsequent model training and validation.
Evaluation metrics for pathologies detection model performance
The evaluation metrics for the experiment encompass both detection box (Box) and segmentation mask (Mask) metrics. Precision indicates the model’s accuracy in detecting pathologies, with values approaching 1 reflecting high accuracy and low false positive rates. Recall measures the model’s ability to identify actual pathology samples, with values close to 1 signifying strong detection and few false negatives. Mean Average Precision (mAP) serves as a comprehensive evaluation metric for both object detection and image segmentation tasks. Specifically, mAP@50 indicates average precision at an Intersection over Union (IoU) threshold of 0.5, requiring an IoU of at least 0.5 between predicted and ground truth regions (boxes or masks). Moreover, mAP@50-95 extends this by calculating average precision across IoU thresholds from 0.50 to 0.95, providing a more detailed view of the model’s accuracy across varying levels of overlap. Given the class imbalance in the pathologies dataset, with certain pathology types being more prevalent than others, this study prioritizes mAP@50 as the primary performance metric. This focus enables a balanced evaluation of the model’s ability to accurately detect pathologies across the diverse classes represented in the dataset, ensuring a practical approach to assessing the model’s effectiveness in heritage conservation applications.
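The building blocks of these metrics can be written out explicitly. The following is a minimal sketch rather than the evaluation code used in the study: the toy masks and the counts passed to `precision_recall` are hypothetical, and the full mAP computation (the area under the precision-recall curve, averaged over classes) is omitted.

```python
import numpy as np


def mask_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over Union of two binary masks of the same shape."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union else 0.0


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: a 6 x 6 ground-truth patch and a prediction shifted down by one row.
truth = np.zeros((10, 10), dtype=bool)
truth[2:8, 2:8] = True
pred = np.zeros((10, 10), dtype=bool)
pred[3:9, 2:8] = True

iou = mask_iou(pred, truth)        # 30 shared pixels / 42 pixels in the union
is_true_positive = iou >= 0.5      # counts as a match at the mAP@50 threshold

# Hypothetical tallies over a validation set.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
```

A prediction is tallied as a true positive at mAP@50 only when its IoU with a ground-truth instance of the same class reaches 0.5; precision and recall are then computed from those tallies.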
In the validation of pathology detection, three versions of the YOLOv8-seg model (small, medium, and large) were each trained for 100 epochs. Among these, the YOLOv8-seg-large model demonstrated the highest performance, achieving an mAP@50 of 48.6% for the object detection task and 45.9% for the instance segmentation task, outperforming the small and medium versions (Table 4) (Fig. 4). To further examine the impact of training data volume on model performance, the study also trained the YOLOv8-seg-large model using the complete dataset as well as subsets containing 80%, 60%, and 40% of the data. Results showed a positive correlation between the volume of training data and model performance, indicating that increased data diversity and volume contribute to enhanced detection accuracy and segmentation precision. This trend underscores the importance of a comprehensive dataset for achieving reliable pathology detection in heritage conservation applications.
These results indicate the model’s capability to accurately locate and identify pathologies in Suzhou’s gray brick heritage buildings. Specifically, in the object detection task, the mAP@50 metrics for various types of pathologies were as follows: material loss (ML) at 75.6%, discoloration and deposits (DD) at 43.1%, cracks (CR) at 46.5%, surface spalling (SS) at 48.5%, and biological invasion (BI) at 29.1%. For the instance segmentation task, the mAP@50 scores for these pathologies were: material loss (ML) at 75.6%, discoloration and deposits (DD) at 41.6%, cracks (CR) at 42.5%, surface spalling (SS) at 47%, and biological invasion (BI) at 22.8% (Table 5). These results reveal relatively strong performance in detecting and segmenting cracks and material loss, underscoring the model’s effectiveness in identifying these pathology types.
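As a consistency check, the overall mAP@50 values are the unweighted mean of the per-class scores listed above. The short sketch below reproduces that arithmetic using only the figures reported in the text; the variable and function names are illustrative.

```python
# Per-class mAP@50 values (%) as reported in the text.
box_ap50 = {"ML": 75.6, "DD": 43.1, "CR": 46.5, "SS": 48.5, "BI": 29.1}
mask_ap50 = {"ML": 75.6, "DD": 41.6, "CR": 42.5, "SS": 47.0, "BI": 22.8}


def mean_ap(per_class: dict[str, float]) -> float:
    """Unweighted mean over classes, rounded to one decimal place."""
    return round(sum(per_class.values()) / len(per_class), 1)


box_map50 = mean_ap(box_ap50)    # matches the reported 48.6 for detection
mask_map50 = mean_ap(mask_ap50)  # matches the reported 45.9 for segmentation
```

Because the mean is unweighted, the weak BI score pulls the overall figure down as much as the strong ML score lifts it, regardless of how many instances each class has.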
However, lower performance metrics for discoloration, biological invasion, and certain other pathologies indicate the challenges posed by these conditions, which may require further refinement in data annotation or model training. This variation in detection accuracy among pathology types highlights the complexities inherent in detecting different forms of damage and provides valuable insights into areas where the YOLOv8-seg model could be improved. The findings underscore the potential of this model for enhancing pathology detection in Suzhou’s gray brick masonry buildings and inform future optimization strategies to achieve even more robust results (Table 6).
a. Refinement of Label Classification: Gray brick variability (color/texture/pattern) stemming from material sources and firing processes necessitates fine-grained labeling. Future severity grading (e.g., mild/severe) could further enhance diagnostic precision by encoding progression stages.
b. Enhancing Sample Diversity: Co-occurring pathologies (e.g., spalling with discoloration) induce feature interference due to overlapping spatial distributions. Future research should incorporate additional samples featuring overlapping pathologies to improve the model’s robustness and accuracy.
c. Standardization of Annotation Guidelines: Consistent labeling is crucial for effective model training, particularly when dealing with pathologies that exhibit irregular boundaries. To achieve uniformity, this study established explicit annotation guidelines to ensure that annotators adhere to a unified standard when handling ambiguous boundaries.
In summary, addressing these challenges requires continued research efforts focused on model algorithm and structural improvements, optimization of dataset annotation techniques, and the enhancement of methods for capturing complex pathology features. These steps will facilitate a deeper model understanding of gray brick pathologies and enable more precise and reliable detection, ultimately advancing conservation efforts for heritage buildings.
To improve pathology detection precision and adaptability, the learning rate was reduced (lr0 = 0.005, lrf = 0.0001) for stable convergence. High-resolution inputs (imgsz = 1280) and data augmentation techniques, including vertical flipping, mixup, and scaling, enhanced robustness. The improved model achieved an mAP@50 of 51.8% for the instance segmentation task, an increase of 5.9 percentage points over the original model (Table 7).
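A training configuration along these lines might be expressed as follows, assuming the Ultralytics Python package was used (the text names the model and parameters but not the training script). The values of `lr0`, `lrf`, `imgsz`, and `epochs` come from the text; the augmentation magnitudes for `flipud`, `mixup`, and `scale` are not given in the source and are placeholders, as are the dataset path argument and the `train` wrapper.

```python
# Values stated in the text.
stated = {"lr0": 0.005, "lrf": 0.0001, "imgsz": 1280, "epochs": 100}

# Placeholder magnitudes: the text names these augmentations but not their values.
assumed = {"flipud": 0.5, "mixup": 0.1, "scale": 0.5}

train_overrides = {**stated, **assumed}


def train(data_yaml: str, weights: str = "yolov8l-seg.pt"):
    """Launch training; data_yaml is a hypothetical path to a dataset config."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected alone

    model = YOLO(weights)
    return model.train(data=data_yaml, **train_overrides)


# Reported segmentation mAP@50 before and after tuning (%).
baseline, improved = 45.9, 51.8
gain_points = round(improved - baseline, 1)               # absolute gain
relative_gain = round(100 * (improved - baseline) / baseline, 1)  # relative gain
```

The gain of 5.9 is an absolute difference in percentage points; relative to the 45.9% baseline it corresponds to roughly a 12.9% improvement.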
Moreover, this study also compared the model’s instance segmentation performance with YOLOv9e-seg, YOLOv10l-seg, and YOLOv11l-seg. YOLOv8l-seg offers significant advantages in terms of lightweight design and inference efficiency. In contrast, YOLOv9e-seg performs weaker overall, particularly in detecting BI types, with higher model complexity and inference time (Table 8). YOLOv10l-seg demonstrates a balanced performance, with mAP@50 of 52.6% (Box) and 47.8% (Mask), and achieves the highest Mask mAP@50 of 76.3% for ML types, but still requires improvement in BI detection (Table 9). YOLOv11l-seg performs between YOLOv8l-seg and YOLOv10l-seg, excelling in ML and DD types, but showing limited detection ability for BI types (Table 10). Overall, YOLOv8l-seg is the most balanced model in terms of performance and inference efficiency, making it the optimal choice in this study, despite YOLOv10l-seg and YOLOv11l-seg showing better results in certain pathology types.
Manual verification of pathology data analysis for the historical site of Soochow Hospital
Recent studies have indicated that combining the results of pathology detection models with UAV technologies can facilitate multi-height data collection and detection for buildings22,32. For this study, the historical site of Soochow Hospital at Soochow University was chosen as a case study due to its significant historical background and contemporary relevance. Built in 1922, the building initially served as the inpatient wing of Soochow Hospital, founded by John A. Snell, who was also instrumental in establishing Suzhou’s first modern brick factory, the Soochow Brick & Tile Company, in 192133. This period marked a notable increase in brick production efficiency, which contributed to the development of modern architecture in Suzhou.
Today, the historical Soochow Hospital site functions as a teaching facility at Soochow University. Its gray brick facade and four-story structure make it an ideal candidate for automated pathology detection and analysis. This study involved comprehensive 3D scanning of the historical site of Soochow Hospital using UAV oblique photography with the DJI Mavic 2 Pro. Following an S-shaped flight path, 55 images of the building facade were captured, resulting in a detailed orthophoto projection that provides essential data for subsequent pathology analysis (Fig. 5).
Based on the east facade image of Soochow Hospital, an equal-sized segmentation was performed, resulting in a total of 1200 image tiles, of which 781 are brick images that were input into the model for analysis. We used a lower confidence threshold (conf = 0.2) and a relaxed Intersection over Union threshold (IoU = 0.3). This approach boosts recall, capturing more potential targets even at lower confidence levels, which is especially beneficial for complex or ambiguous regions in facade images. Although this may increase false positives, it ensures broader target detection and enhances overall coverage. Traditionally, the manual creation of pathology distribution maps requires a gradual assessment and annotation of each pathology area, a process that takes ~5.8 h. In contrast, the model-based detection approach allows for rapid, automated output and processing of detection images, reducing the total time to just 1.1 h, thus demonstrating a substantial time-saving advantage. The model’s time efficiency was further highlighted by its per-image speed: preprocessing took ~2.7 ms, inference 570.5 ms, and post-processing only 2.3 ms. After manual verification, and based on a 50% overlap criterion between the predicted bounding box and the ground truth, 684 images were identified as having valid detections. This resulted in an effective detection rate of 87.6%, with a false detection rate of 4.4% and a miss detection rate of 8% (Table 11). These results underscore the model’s accuracy and efficiency, highlighting its potential for streamlined and reliable pathology detection in heritage conservation applications.
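The headline figures in this paragraph follow from simple ratios, reproduced in the sketch below using only numbers stated in the text. The variable names are illustrative, and the derived time saving and speed-up factor are computed here rather than reported by the authors.

```python
# Counts and timings stated in the text.
brick_tiles = 781        # tiles passed to the model
valid_detections = 684   # tiles confirmed by manual verification
manual_hours = 5.8       # manual mapping of the facade
model_hours = 1.1        # model-based workflow

effective_rate = round(100 * valid_detections / brick_tiles, 1)

# The three reported rates should account for every tile.
false_rate, miss_rate = 4.4, 8.0
rates_total = round(effective_rate + false_rate + miss_rate, 1)

# Derived quantities (not reported in the text).
time_saved_pct = round(100 * (manual_hours - model_hours) / manual_hours, 1)
speedup = round(manual_hours / model_hours, 1)
```

The effective detection rate evaluates to 87.6%, and the three rates sum to 100%, so the reported breakdown is internally consistent. The workflow time drops by about 81%, or a factor of roughly 5.3.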
Following pathology detection, further quantitative analysis becomes crucial to understand the extent and impact of various forms of damage. Leveraging the outputs of the YOLOv8l-seg model, including bounding boxes, segmentation masks, and confidence scores, key information regarding the distribution and area of pathologies can be obtained (Fig. 6). This foundational data allows for a comprehensive statistical evaluation of pathologies, encompassing metrics such as the area of the detected damage.
The pathology detection conducted on the facade of the historical site of Soochow Hospital yielded significant quantitative data regarding the various types of damage present. Among these pathologies, discoloration and deposits (DD) accounted for 49.35%, making them the most significant concern, followed by biological invasion (BI), which constituted 39.72%. Surface spalling represented 10.81%, while material loss was minimal at 0.1%, and cracks accounted for only 0.01%.
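One way such proportions could be derived from segmentation masks is sketched below. It assumes the percentages are shares of the total detected pathology area (the reported values sum to roughly 100%), which the text does not state explicitly; the function name, class ordering, and toy masks are hypothetical.

```python
import numpy as np

CLASSES = ["ML", "DD", "CR", "SS", "BI"]


def area_shares(masks: list[np.ndarray], class_ids: list[int]) -> dict[str, float]:
    """Percent of the total detected pathology area attributed to each class."""
    area = np.zeros(len(CLASSES))
    for mask, cid in zip(masks, class_ids):
        area[cid] += np.count_nonzero(mask)  # pixel count of this instance
    total = area.sum()
    if total == 0:
        return {name: 0.0 for name in CLASSES}
    return {name: float(100 * a / total) for name, a in zip(CLASSES, area)}


# Toy example: one DD instance of 6 pixels and one BI instance of 4 pixels.
dd_mask = np.zeros((4, 4), dtype=bool)
dd_mask[:2, :3] = True
bi_mask = np.zeros((4, 4), dtype=bool)
bi_mask[2:, :2] = True
shares = area_shares([dd_mask, bi_mask], class_ids=[1, 4])
```

Converting pixel counts into physical areas would additionally require the ground sampling distance of the orthophoto, which is not given here.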
To further assess the severity and spatial distribution of these pathologies, each detected pathology is assigned a severity weight, determined by expert evaluation, to reflect its potential impact on the building structure. The severity weights are: material loss (ML) = 0.8, discoloration and deposits (DD) = 0.6, cracks (CR) = 1, surface spalling (SS) = 0.7, and biological invasion (BI) = 0.9. Using the severity weights and area proportions of each pathology, we generate a heatmap that reflects both the extent and damage severity (Fig. 6). The area proportion of each pathology mask is first calculated, with larger masks contributing more to the overall damage severity distribution map. The intensity for each region is then determined by multiplying the mask’s area proportion by its severity weight. By integrating both the spatial extent and severity of each pathology, the heatmap offers a comprehensive view of the building’s condition. This allows for quick identification of the most severely damaged areas, helping prioritize conservation and repair efforts.
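The weighting scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a mask's area proportion is measured relative to the whole image and that intensities of overlapping masks are summed. The severity weights are those given in the text; the function names and toy masks are hypothetical.

```python
# Severity weights as stated in the text.
SEVERITY = {"ML": 0.8, "DD": 0.6, "CR": 1.0, "SS": 0.7, "BI": 0.9}


def damage_heatmap(detections, height, width):
    """Accumulate (area proportion x severity weight) over each mask's pixels.

    `detections` is a list of (label, mask) pairs, where each mask is a
    height x width grid of 0/1 values.
    """
    heat = [[0.0] * width for _ in range(height)]
    image_area = height * width
    for label, mask in detections:
        area = sum(sum(1 for v in row if v) for row in mask)
        # Assumption: proportion is taken relative to the full image area.
        intensity = (area / image_area) * SEVERITY[label]
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    heat[y][x] += intensity
    return heat


def hottest_cell(heat):
    """Return the (row, column) of the highest intensity."""
    cells = ((y, x) for y in range(len(heat)) for x in range(len(heat[0])))
    return max(cells, key=lambda p: heat[p[0]][p[1]])


# Toy 2x4 facade: a large DD patch with a small crack inside it.
detections = [
    ("DD", [[1, 1, 0, 0], [1, 1, 0, 0]]),
    ("CR", [[0, 1, 0, 0], [0, 0, 0, 0]]),
]
heat = damage_heatmap(detections, height=2, width=4)
peak = hottest_cell(heat)
```

In this toy case the crack pixel inside the discolored patch receives the highest intensity, which matches the intent of highlighting areas where extensive and severe damage coincide.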
Pathology diagnosis results analysis and cause assessment
The surroundings of the eastern facade of Soochow Hospital are open, with no obstruction from trees or nearby structures, resulting in strong ultraviolet exposure, especially on the southeastern side between and beneath the windows. Five pathologies have been identified through diagnostic analysis:
a. Material loss primarily originates from synergistic environmental and anthropogenic factors, where wind/rain erosion coupled with physical impacts induces progressive detachment of brick units and mortar. Once cracks form, the thermal fluctuations between day and night reduce the adhesive strength of the surface, causing further detachment, while retained moisture promotes salt crystallization and moss growth, accelerating void formation.
b. Surface spalling manifests as localized flaking, predominantly occurring at brick joints on the southeastern exterior walls. This deterioration shares causative factors with material loss, including environmental exposure and human impact. The underlying mechanism involves the decline in surface cohesion due to thermal variation, while accumulated moisture, salt crystallization, and moss growth further exacerbate the spalling phenomenon.
c. Discoloration and deposits manifest as a brownish-yellow chromatic alteration across the eastern facade, with particular concentration around drainage pipes, beneath window sills, and near the plinth. This phenomenon arises from the synergistic interaction of airborne pollutants, retained rainwater, and acid-base reactions. Prolonged cyclic exposure to meteorological agents (wind/rain/UV radiation) coupled with anthropogenic factors (combustion emissions, physical abrasion) drives the deterioration process.
d. Cracks appear as fissures on the surface of the gray bricks, primarily concentrated on the first-floor exterior walls. These are caused by moisture infiltration, dust accumulation, and the growth of moss and small plants in humid areas. Ultraviolet radiation accelerates the aging of the bricks, promoting the formation of cracks. Moreover, daily and seasonal temperature fluctuations lead to expansion and contraction of the materials, generating internal stresses that result in long-term cracking.
e. Biological invasion is primarily attributed to the growth and penetration of plants such as ivy and moss, which are distributed on the eastern facade, particularly near the plinth, at cracks, and along edges. Natural factors like rainfall and wind promote water accumulation in pre-existing cracks, forming moist, acidic environments that corrode the masonry and create a soil-like layer that supports plant growth. Additionally, the accumulation of dust and organic matter in wall fissures contributes to the formation of a substrate that supports vegetation such as ivy.
Monitoring strategies and preventive protection recommendations
Monitoring forms the basis of conservation work, offering scientifically obtained information on how architectural deterioration is distributed and how it evolves over time. It plays a key role in supporting decision-making, identifying areas that require priority attention, and making better use of limited resources. To monitor the facade of Soochow Hospital more systematically, this study developed a dynamic framework integrating deep learning and UAV oblique photogrammetry. By collecting images every three months, the system helps track both the appearance and development of deterioration. To ensure practical usability, this study further develops an interactive platform based on Gradio, which allows users to upload images, adjust confidence and Intersection over Union (IoU) thresholds, and receive real-time detection outputs, including deterioration type, severity, and spatial distribution (Fig. 7). The platform uses asynchronous processing to remain stable and responsive even when used by multiple people at the same time.
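A platform of this kind could be wired together roughly as shown below. This is a hedged sketch, not the authors' platform: `summarize`, `build_app`, and the weights path `best.pt` are hypothetical names, the severity output mentioned above is omitted, and the sketch assumes the public Gradio (`Interface`, `Slider`, `queue`) and Ultralytics (`YOLO.predict`, `Results.plot`) interfaces.

```python
def summarize(class_ids, confidences, names):
    """Group detections by class name, keeping a count and the top confidence."""
    summary = {}
    for cid, conf in zip(class_ids, confidences):
        entry = summary.setdefault(names[int(cid)], {"count": 0, "max_conf": 0.0})
        entry["count"] += 1
        entry["max_conf"] = max(entry["max_conf"], round(float(conf), 3))
    return summary


def build_app(weights_path="best.pt"):
    """Build the demo UI; `best.pt` is a placeholder for trained weights.

    Imports are local so that `summarize` can be used without gradio or
    ultralytics installed.
    """
    import gradio as gr
    import numpy as np
    from ultralytics import YOLO

    model = YOLO(weights_path)

    def detect(image, conf, iou):
        # Gradio delivers RGB arrays; Ultralytics treats numpy input as BGR.
        bgr = np.ascontiguousarray(image[..., ::-1])
        result = model.predict(bgr, conf=conf, iou=iou, verbose=False)[0]
        annotated = np.ascontiguousarray(result.plot()[..., ::-1])  # back to RGB
        stats = summarize(
            result.boxes.cls.tolist(), result.boxes.conf.tolist(), result.names
        )
        return annotated, stats

    demo = gr.Interface(
        fn=detect,
        inputs=[
            gr.Image(type="numpy", label="Facade image"),
            gr.Slider(0.0, 1.0, value=0.2, step=0.05, label="Confidence threshold"),
            gr.Slider(0.0, 1.0, value=0.3, step=0.05, label="IoU threshold"),
        ],
        outputs=[gr.Image(label="Detections"), gr.JSON(label="Summary")],
    )
    # Enabling the queue lets requests from several users be handled in turn.
    return demo.queue()
```

Calling `build_app().launch()` would start the interface locally; the slider defaults mirror the conf = 0.2 and IoU = 0.3 settings used in the case study.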
Within the conservation framework, management functions as the critical intermediary linking monitoring and intervention phases. The core function of management, as emphasized in this study, is to translate monitoring data into actionable conservation strategies and to guide subsequent intervention procedures. According to the Risk Management Guide for Cultural Heritage published by ICCROM, monitoring provides data for management, while management decisions, in turn, inform the adjustment of monitoring protocols, forming a dynamic and iterative feedback loop34.
Based on this principle, this research argues that effective management strategies should be informed by analysis of deterioration severity and progression trends to accurately identify high-risk and priority areas. Strategies should be adjusted in response to temporal changes: for example, prioritizing rapidly expanding deterioration while applying low-intervention strategies to less severe areas enables differentiated and adaptive management. In terms of resource allocation, this study recommends integrating multiple criteria (including historical significance, functional use, and severity of deterioration) to construct a hierarchical conservation framework. This framework ensures the efficient deployment of repair resources, with particular emphasis on the continued use and preservation of high-value architectural zones. Consequently, near-modern brick and stone buildings can continue to fulfill cultural, educational, and public service functions.
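One way such prioritization might be expressed is sketched below. The scoring rule (severity weight × latest damaged area × growth factor) is a hypothetical illustration, not a formula from this study, and it covers only severity and progression; historical significance and functional use would need additional, expert-defined terms. Zone names and area values are invented.

```python
# Severity weights reused from the case study.
SEVERITY = {"ML": 0.8, "DD": 0.6, "CR": 1.0, "SS": 0.7, "BI": 0.9}


def growth_rate(areas):
    """Relative change between the first and last survey.

    Assumption: a pathology that newly appears (first area of 0) is treated
    as having doubled, i.e. a rate of 1.0.
    """
    first, last = areas[0], areas[-1]
    if first == 0:
        return 1.0 if last > 0 else 0.0
    return (last - first) / first


def rank_zones(zones):
    """Order zones by a hypothetical priority score, highest first."""
    scored = []
    for zone in zones:
        rate = max(growth_rate(zone["areas"]), 0.0)  # ignore shrinkage
        score = SEVERITY[zone["pathology"]] * zone["areas"][-1] * (1.0 + rate)
        scored.append((zone["name"], score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented quarterly area measurements (square metres) for three zones.
zones = [
    {"name": "window sill", "pathology": "DD", "areas": [4.0, 4.0, 4.0]},
    {"name": "plinth", "pathology": "BI", "areas": [2.0, 2.5, 3.0]},
    {"name": "first floor", "pathology": "CR", "areas": [0.1, 0.15, 0.2]},
]
ranking = [name for name, _score in rank_zones(zones)]
```

In this toy example the expanding biological invasion at the plinth ranks first, ahead of the larger but stable discoloration beneath the window sill.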
This study emphasizes that effective intervention strategies should prioritize structural safety risks and adopt reversible materials and minimally invasive techniques to preserve the integrity and authenticity of architectural heritage. The selection of materials and techniques should take into account their physical and chemical compatibility with original building components, as well as their long-term durability. Additionally, this study advocates for environment-oriented interventions, such as the removal of invasive vegetation and the pruning of obstructive trees, to prevent further physical deterioration of the building facade.
Ultimately, this study proposes a feedback system that interconnects monitoring, management, intervention, and re-evaluation processes. Such a closed-loop framework is intended to enhance the efficiency of heritage conservation and to support the long-term stability and sustainable utilization of Suzhou masonry architectural heritage.
Discussion
This study demonstrates the potential of deep learning technology in heritage pathology detection, particularly for Suzhou’s modern masonry buildings. We created the first dataset tailored to five common pathology types in gray brick structures: material loss (ML), discoloration and deposits (DD), cracks (CR), surface spalling (SS), and biological invasion (BI). We then employed the YOLOv8l-seg model, training it on 895 images for 100 epochs. The model achieved an mAP@50 of 54.5% for object detection and 51.8% for instance segmentation. The mAP@50 for material loss reached 78%, demonstrating the potential for intelligent detection to support manual visual inspection and offering a new approach to automated building pathology detection.
The proposed deep learning-based framework for heritage conservation incorporates automated detection, high-resolution image analysis, quantitative pathology data evaluation, and remote batch processing. This study also emphasized the spatial visualization of pathologies across building facades. Using the east facade of the Soochow Hospital site at Soochow University as a case study, high-resolution images captured by drones were employed for automated detection. After manual verification, this approach achieved an effective detection rate of 87.6%, providing enhanced support for visualizing pathology distribution. Compared to traditional manual inspection, this method significantly improves efficiency, making it especially suitable for large-scale pathology screening tasks in heritage conservation.
This study also establishes an interactive visual platform based on YOLOv8l-seg and Gradio to support deterioration monitoring and data sharing. It also emphasizes the close interdependence of monitoring, management, and intervention: monitoring provides a data foundation, management drives strategic implementation, and intervention results feed back into both monitoring and policy adjustments, forming an integrated conservation cycle.
One of the primary challenges identified in this study is the variability in pathology characteristics, particularly in color, texture, and form. Gray bricks, depending on their material composition and exposure history, exhibit significant differences in visual features, which can complicate pathology classification. This suggests that an expanded dataset encompassing a broader range of conditions and environmental factors could further enhance model accuracy. Specifically, additional samples collected across different lighting, weather, and seasonal conditions could improve the model’s ability to generalize and perform effectively under varying circumstances. A more extensive dataset would also enable the model to learn subtle distinctions between similar pathologies, thereby reducing misclassification and increasing reliability.
Another notable challenge lies in the model’s ability to accurately detect certain pathologies, particularly those with subtle or complex features such as mild discoloration or early biological invasion. While the YOLOv8 model demonstrated robust detection for distinct pathologies, it faced limitations when detecting less visible or overlapping damage. This highlights the need for algorithmic refinements, possibly by incorporating hybrid approaches that combine YOLO with other segmentation techniques or feature extraction methods designed to capture finer details. By integrating additional deep learning layers or multi-scale feature extraction, future models could potentially better capture these nuanced characteristics, further improving detection accuracy.
The use of high-resolution drone images for automated detection presents practical considerations as well. While high-resolution imagery enhances the model’s ability to identify pathologies accurately, it also generates substantial data volumes that can strain processing resources and slow down analysis. To address this, future research could explore optimized UAV flight patterns, targeted image capturing strategies, and more efficient data processing techniques. Implementing techniques such as image stitching and selective high-resolution cropping could reduce processing times without compromising detail, enabling faster and more manageable data collection in field applications.
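The idea of selective cropping can be illustrated with a small tiling helper. This is a sketch under stated assumptions: the facade image is cut into equal-sized tiles (as in the case study), and only tiles that intersect a region flagged in a coarse first pass are kept for full-resolution detection. The function names, image size, and region of interest are hypothetical.

```python
def tile_boxes(width, height, tile):
    """Return (x1, y1, x2, y2) crops covering the image in row-major order.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    boxes = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            boxes.append((x, y, min(x + tile, width), min(y + tile, height)))
    return boxes


def intersects(a, b):
    """True if two (x1, y1, x2, y2) boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def select_tiles(boxes, regions):
    """Keep only tiles that overlap at least one region of interest."""
    return [box for box in boxes if any(intersects(box, r) for r in regions)]


# Hypothetical 4000 x 3000 facade image cut into 1000-pixel tiles.
all_tiles = tile_boxes(4000, 3000, 1000)
# A single region flagged by a coarse, low-resolution pass (made up).
flagged = [(1500, 500, 2500, 900)]
selected = select_tiles(all_tiles, flagged)
```

Here only 2 of the 12 tiles would be passed to the detector at full resolution, which shows how a coarse pre-screening step could reduce the processing load.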
Moreover, while the proposed deep learning-based approach offers significant advantages over manual inspection, practical challenges related to deployment in diverse heritage contexts remain. Although the model was effective in detecting pathologies on Suzhou’s gray brick facades, heritage buildings constructed from other materials, such as timber, concrete, or tile, present distinct degradation patterns. Future research could benefit from adapting or retraining the model to accommodate these material-specific characteristics. Additionally, developing adaptable models or expanding datasets to include other material types would broaden the applicability of deep learning in heritage conservation across different architectural styles and materials.
The findings also emphasize the value of spatial visualization of pathologies, which provides an intuitive and comprehensive view of a building’s condition. In practice, these visualizations could be integrated with Geographic Information Systems or building information models to support decision-making in heritage management. Such integrations could facilitate a more data-driven approach to prioritizing conservation efforts, allowing heritage managers to monitor and assess damage over time, allocate resources effectively, and address critical areas more promptly.
Lastly, this study underscores the necessity of a standardized approach to annotation and labeling. Developing a consistent annotation protocol was crucial in this study to ensure data reliability and model training accuracy. However, as the field of AI-driven heritage conservation grows, the establishment of shared labeling standards for architectural pathologies would benefit the broader research community by enabling more consistent data-sharing and cross-project comparability. Such standards could foster collaboration and accelerate advancements in heritage pathology detection technology.
In summary, while this study demonstrates substantial progress in automating heritage pathology detection, further advancements are necessary to refine model accuracy, broaden applicability, and address practical deployment challenges. Continued efforts in algorithm optimization, data quality enhancement, and practical validation across diverse building materials will enable deep learning to play an increasingly central role in preserving cultural heritage assets. Through these initiatives, deep learning can support scalable, accurate, and efficient conservation strategies, preserving valuable architectural history against environmental and structural degradation.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to their involvement in an ongoing research project. Public access is restricted until the research findings are fully published to ensure the integrity of the ongoing analysis and conclusions. Once the study is complete and the results are disseminated, the data may be made available upon reasonable request from the corresponding author.
Abbreviations
- ML: material loss
- DD: discoloration and deposits
- CR: cracks
- SS: surface spalling
- BI: biological invasion
References
Zhao, P., Zhang, X., Qin, L., Zhang, Y. & Zhou, L. Conservation of disappearing traditional manufacturing process for Chinese grey brick: field survey and laboratory study. Constr. Build. Mater. 212, 531–540 (2019).
Yonghui, L., Xie, H., Wang, J. & Li, X. Experimental study of the isothermal sorption properties of late Qing and 1980s grey bricks in Wujiang, Suzhou, China. Front. Archit. Res. 2, 483–487 (2013).
National Standards—National public service platform for standards information [Internet]. [cited 13 Oct 2024]. Available from: https://std.samr.gov.cn/gb/search/gbDetailed?id=71F772D7F28ED3A7E05397BE0A0AB82A#
Vergès-Belmin, V. ICOMOS-ISCS: Illustrated Glossary on Stone Deterioration Patterns. English-French Version (ICOMOS, 2008).
Soleymani, A., Jahangir, H. & Nehdi, M. L. Damage detection and monitoring in heritage masonry structures: systematic review. Constr. Build. Mater. 397, 132402 (2023).
Mishra, M. & Lourenco, P. B. Artificial intelligence-assisted visual inspection for cultural heritage: state-of-the-art review. J. Cult. Herit. 66, 536–550 (2024).
Galantucci, R. A., Lasorella, M. & De Fino, M. A rapid pipeline for periodic inspection and maintenance of architectural surfaces. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. XLVIII-M-2–2023, 621–628 (2023).
Samhouri, M., Al-Arabiat, L. & Al-Atrash, F. Prediction and measurement of damage to architectural heritages facades using convolutional neural networks. Neural Comput. Appl. 34, 18125–18141 (2022).
Hatir, M. E., Barstuğan, M. & İnce, İ. Deep learning-based weathering type recognition in historical stone monuments. J. Cult. Herit. 45, 193–203 (2020).
Hatir, M. E., İnce, İ. & Korkanc, M. Intelligent detection of deterioration in cultural stone heritage. J. Build. Eng. 44, 102690 (2021).
Wang, N. et al. Automatic damage detection of historic masonry buildings based on mobile deep learning. Autom. Constr. 103, 53–66 (2019).
Deep Convolutional Neural Networks with Transfer Learning for Old Buildings Pathologies Automatic Detection | SpringerLink [Internet]. [cited 15 Oct 2023]. Available from: https://link.springer.com/chapter/10.1007/978-3-030-36671-1_18.
Loverdos, D. & Sarhosis, V. Automatic image-based brick segmentation and crack detection of masonry walls using machine learning. Autom. Constr. 140, 104389 (2022).
Viswanatha, V., Chanda, R. K. & Ramachandra, A. C. Real time object detection system with YOLO and CNN models: a review. J. Xian Univ. Archit. Technol. 14, 144–151 (2022).
Kang, C. H. & Kim, S. Y. Real-time object detection and segmentation technology: an analysis of the YOLO algorithm. JMST Adv. 5, 69–76 (2023).
Ma, J. et al. Complex texture contour feature extraction of cracks in timber structures of ancient architecture based on YOLO algorithm. Adv. Civ. Eng. 2022, e7879302 (2022).
Yan, L., Chen, Y., Zheng, L. & Zhang, Y. Application of computer vision technology in surface damage detection and analysis of shedthin tiles in China: a case study of the classical gardens of Suzhou. Herit. Sci. 12, 72 (2024).
Idjaton, K. et al. Detection of limestone spalling in 3D survey images using deep learning. Autom. Constr. 152, 104919 (2023).
Yang, X., Zheng, L., Chen, Y., Feng, J., Zheng, J. Recognition of damage types of Chinese gray-brick ancient buildings based on machine learning—taking the Macau World Heritage buffer zone as an example. Atmosphere. 14. https://www.mdpi.com/2073-4433/14/2/346 (2023).
Li, Q. et al. Non-destructive testing research on the surface damage faced by the Shanhaiguan Great Wall based on machine learning. Front. Earth Sci. 11 [cited 13 Sep 2024]. https://www.frontiersin.org/journals/earth-science/articles/10.3389/feart.2023.1225585/full (2023).
Karimi, N., Mishra, M. & Lourenco, P. B. Deep learning-based automated tile defect detection system for Portuguese cultural heritage buildings. J. Cult. Herit. 68, 86–98 (2024).
Zou, J. & Deng, Y. Intelligent assessment system of material deterioration in masonry tower based on improved image segmentation model. Herit. Sci. 12, 252 (2024).
Seo, H., Raut, A. D., Chen, C. & Zhang, C. Multi-label classification and automatic damage detection of masonry heritage building through CNN analysis of infrared thermal imaging. Remote Sens. 15, 2517 (2023).
Ali, R. & Cha, Y.-J. Attention-based generative adversarial network with internal damage segmentation using thermography. Autom. Constr. 141, 104412 (2022).
Sabato, A., Dabetwar, S., Kulkarni, N. N. & Fortino, G. Noncontact sensing techniques for AI-aided structural health monitoring: a systematic review. IEEE Sens. J. 23, 4672–4684 (2023).
Peng, X., Zhong, X., Zhao, C., Chen, A. & Zhang, T. A UAV-based machine vision method for bridge crack recognition and width quantification through hybrid feature learning. Constr. Build. Mater. 299, 123896 (2021).
Zhang, K., Mea, C., Fiorillo, F. & Fassi, F. Classification and object detection for architectural pathology: practical tests with training set. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. XLVIII-2/W4-2024, 477–484 (2024).
Russell, B. C., Torralba, A., Murphy, K. P. & Freeman, W. T. LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157–173 (2008).
Bai, R., Wang, M., Zhang, Z., Lu, J. & Shen, F. Automated construction site monitoring based on improved YOLOv8-seg instance segmentation algorithm. IEEE Access 11, 139082–139096 (2023).
Vijayakumar, A., Vairavasundaram, S. YOLO-based object detection models: a review and its applications. Multimed. Tools Appl. [cited 24 Sep 2024]. https://doi.org/10.1007/s11042-024-18872-y (2024).
Fiorucci, M. et al. Machine learning for cultural heritage: a survey. Pattern Recognit. Lett. 133, 102–108 (2020).
Osco, L. P. et al. A review on deep learning in UAV remote sensing. Int. J. Appl. Earth Observ. Geoinf. 102, 102456 (2021).
Dailey, D. The journey of Dr. John A. Snell: a reflection of the Chinese missions in transition. Methodist Hist. 49, 195 (2011).
Guide to Risk Management | ICCROM [Internet]. [cited 10 Mar 2025]. https://www.iccrom.org/publication/guide-risk-management
Acknowledgements
This research was funded by the National Key R&D Program of China (2021YFE0200100), Natural Science Foundation-Basic Research Program of Jiangsu Province (BK20240814), Natural Science Foundation of the Jiangsu Higher Education Institutions of China (24KJB560018) and the Suzhou Science and Technology Development Program (2022SS52). The authors would like to express their gratitude to the China–Portugal Joint Laboratory of Cultural Heritage Conservation Science. Special thanks to Professors Zhongqing Wang and Min Cao from the Soochow University School of Computer Science and Technology for their expert guidance and support throughout this research. We are also deeply grateful to graduate students Yutao He and Ziyin Zeng for their dedicated assistance with model training, which was crucial to the success of this project. Additionally, we appreciate the students from the Soochow University School of Architecture, supervised by Shiruo Wang in 2021, for completing the model scanning work.
Author information
Contributions
X.C. and S.W. conceptualized the study. X.C., J.H. and S.W. developed the methodology. The investigation was carried out by J.H. and S.W. using resources provided by X.C. and S.W. J.H. and S.W. performed data curation. X.C., J.H. and S.W. wrote the original draft of the manuscript, which was subsequently reviewed and edited by X.C. and S.W. X.C. and S.W. supervised the project and acquired funding. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, X., He, J. & Wang, S. Deep learning-driven pathology detection and analysis in historic masonry buildings of Suzhou. npj Herit. Sci. 13, 197 (2025). https://doi.org/10.1038/s40494-025-01783-y
DOI: https://doi.org/10.1038/s40494-025-01783-y