Abstract
Identification of structural wooden components is crucial for heritage architecture conservation, as it elucidates the utilization patterns of forest resources and the evolution of human civilization. This paper proposes a computer vision-based in-situ identification method for wooden components of Chinese heritage architectures, using a dataset comprising 4050 images from 63 components of nine buildings. The optimal algorithm, RepLKNet, trained on coniferous xylarium specimens, achieves 96.67% identification accuracy, with sample precisions of 98.33%, 93.33%, and 90% at 50%, 70%, and 90% confidence thresholds, respectively. A minimum of 25 xylarium specimens and 1500 images per genus ensures test accuracy >90%. The impact of structural deterioration (decay and cracks) on accuracy is also evaluated. Cracks significantly reduce the recognition accuracy for historical components, particularly when they span >30% of the image, and latewood integrity is critical to identification. The proposed method advances structural preservation strategies and preventive maintenance practices in heritage architecture.
Introduction
Wooden heritage architectures serve as crucial material embodiments of human ingenuity, encapsulating profound historical, artistic, and scientific significance1. Owing to the intrinsic properties of wood, environmental conditions, and anthropogenic influences, wooden heritage architectures suffer from various types and degrees of deterioration with age, including cracks and decay, which compromise their structural integrity and safety. According to the Principles for the Conservation of Wooden Built Heritage2 and the Chinese national standard GB/T 50165-2020 - Technical Standard for Maintenance and Strengthening of Historical Timber Buildings3, the conservation and reinforcement of heritage wooden structures must prioritize preserving the original form, structure, materials, and craftsmanship. Therefore, during the restoration or replacement of original wooden components, the timber used should, as far as practicable, be of the same species as the original. Consequently, precise identification of wooden components is essential for the effective maintenance and safeguarding of heritage wooden structures4. Additionally, a comprehensive understanding of the timber species used in the wooden frameworks of historic buildings is necessary to understand timber selection and utilization across different epochs and regions, thereby offering insights into how the use of forest resources in construction has evolved over the course of human civilization5.
Currently, wood identification in wooden heritage architectures predominantly relies on genus-level identification using the wood anatomy method6,7. This involves destructive sampling of wooden components, damaging the selected elements. In particular, the samples are first transported to a laboratory, where sample preparation, softening, sectioning, staining, microscopic observation, and characteristic analysis are performed sequentially. This procedure is not only highly specialized and labor-intensive but also time-consuming, often resulting in a prolonged identification cycle8. For historical wooden structures, a comprehensive sampling of all components increases the on-site workload significantly. Moreover, owing to the unique historical value of heritage structures, extensive destructive sampling is often impractical. Additionally, the accuracy of this method depends significantly on the expertise and subjective judgment of appraisers, introducing potential variability and bias into the identification results.
In-situ wood identification in heritage architecture remains a significant challenge despite the development of advanced techniques such as DNA barcoding9,10,11,12 and chemical fingerprinting13. Although DNA barcoding is promising, it requires a laboratory environment and involves complex procedures, such as sampling, sample processing, and nucleic acid extraction. These requirements make it unsuitable for in-situ, sampling-free identification of wood components14. Further, wood components in historical buildings often undergo deterioration processes, such as decay and degradation, that alter their chemical composition. This reduces the effectiveness of chemical fingerprint markers because the original chemical profiles can no longer be relied upon15. Consequently, sampling-free, efficient, and accurate wood identification methods should be developed specifically for application in the field of heritage conservation.
In recent years, computer vision technology has been applied to wood identification, showcasing its significant potential in this task, driven by rapid advancements in hardware and artificial intelligence algorithms. This approach is characterized by its portability, accuracy, speed, and cost-effectiveness16,17,18. Existing research on computer vision-based wood identification has been predominantly focused on hardwoods, including Dalbergia, Pterocarpus, and Swietenia, which are listed under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES)16,18, as well as other commercially valuable timber species19,20. In contrast, softwood identification has not been researched adequately21,22. This method typically relies on the construction of a large-scale dataset, which is a complicated task in the context of wood identification owing to the unique cultural value of heritage architecture and the limited availability of its wooden components. Image data of wooden heritage architecture is often extremely scarce, compromising the data volume requirements essential for effective computer vision-based identification23. Additionally, compared to modern wooden materials, the surfaces of wooden heritage architectures frequently exhibit defects, such as decay, cracking, and aging. These imperfections hinder image acquisition significantly and reduce the accuracy of component recognition24,25. Although computer vision methods have been employed for detecting deterioration (e.g., cracking) in historic timber structures26, to our knowledge, this approach remains unexplored for wood species identification in heritage architectures.
Traditional Chinese architecture has a long and illustrious history and is characterized by the construction of wood structures, often in combination with earth, masonry, and stone. In particular, wood structures stand out as one of the most distinctive and defining features of traditional Chinese architecture. The selection, processing, and utilization of wooden heritage architecture reflect the profound technical and cultural wisdom of ancient craftsmen, who leveraged the inherent properties of wood adeptly. This not only demonstrates the advanced wood construction technologies of the time but also embodies the social consciousness and aesthetic value of the era. Historical literature and field research have consistently indicated that the Pinaceae family is extensively utilized in Chinese wooden heritage architectures, with a notably high prevalence6,27,28,29. In this context, this study focuses on four genera within the Pinaceae family—Abies, Larix, Picea, and Pinus—which are commonly found yet frequently misidentified in such wooden heritage architectures. These genera are selected as the primary research subjects to explore the feasibility of in-situ wood identification using computer vision technology. By leveraging this innovative approach, this study aims to develop a low damage, efficient, and accurate method for wood identification in heritage architecture, thereby contributing to the preservation and comprehension of these invaluable cultural treasures.
To address the challenges of large-scale image collection and database construction for wooden components in heritage architecture, we screen and determine an optimized deep learning-based identification model. The model is initially trained using an image database of xylarium wood specimens. Subsequently, a test dataset comprising images of wooden heritage architectures is created to evaluate the performance of the model.
The objectives of this study are as follows:
- To investigate the viability of employing models trained on xylarium wood specimens for component identification in wooden heritage architectures using image recognition methods.
- To explore the minimum sample size required for xylarium wood specimens to provide adequate statistical support for tree species identification in future dataset expansions.
- To propose an image-based grading criterion for wood cracks and decay, and to evaluate the impact of these defects on model accuracy using a deterioration simulation method.
Methods
The process proposed in this study is illustrated in Fig. 1. It consists of four steps—dataset creation, model construction, on-site image capture, and wood identification. Further, model performance is evaluated corresponding to varying degradation levels.
Data preparation
The training dataset in this study originates from Zheng et al.22, constructed using xylarium wood specimens from the Wood Specimen Resource Center of the National Forestry and Grassland Administration. Transverse end surfaces of the specimens are polished using 240, 400, 600, and 800 iterations of sanding to obtain clear surfaces for image collection. The cross-sectional images with resolution of 2048 × 2048 pixels in PNG format, each representing 6.35 × 6.35 mm of tissue, are obtained using iWood30. Wood defects, including surface cracks, blue staining, and knots, are avoided during image collection of xylarium wood specimens. The training dataset includes four genera of the Pinaceae family, i.e., Abies, Larix, Picea, and Pinus; 481 xylarium wood specimens; and 37709 images in aggregate distributed across 25 provinces of China22. Detailed species information and the origins of the xylarium wood specimens are presented in Supplementary Tables S1 and S2.
Additionally, the image data are targeted and cleaned according to the standardized selection criteria for cross-sectional images suitable for softwood identification22. Figure 2 depicts examples of training samples.
The test dataset includes 63 sampling components obtained from nine heritage architectures in China and contains 4050 images without any wood defects. As evidenced in Table 1, the selected architectural sites include Jiexiu Houtu Temple in Shanxi (JHT, constructed in 457 AD), Pagoda of Fogong Temple in Shanxi (PFT, constructed in 1056 AD), Chongshan Temple in Shanxi (CT, constructed in 1383 AD), the Forbidden City (FC, constructed in 1420 AD), Dahui Temple in Beijing (DT, constructed in 1513 AD), Chunyang Palace in Shanxi (CP, constructed in 1573 AD), Wanshou Temple in Beijing (WT, constructed in 1577 AD), Financial Street in Beijing (FS, constructed in 1912 AD), and Xuanwu Hospital in Beijing (XH, constructed in 1958 AD). The specifications of the comprehensive dataset are listed in Table 2. All components are identified using traditional wood anatomy methods, including sample preparation, slicing, staining, microscopic feature observation, and final verification by experienced wood anatomy experts based on the IAWA list31. Detailed information regarding the wooden components is in Supplementary Table S3.
Wooden components of heritage architectures require specialized sanding procedures, distinct from those for standard xylarium wood specimens, to minimize structural damage. In-situ specimen preparation involves sequential polishing of transverse end surfaces using a specialized 1-cm-diameter grinder over 180, 240, 400, 600, and 800 sanding iterations. This removes a surface layer 0.5–1 mm thick from the wooden components, ensuring optimal visibility of their anatomical features while maintaining structural integrity. Image acquisition is performed using iWood, which is specifically designed to minimize damage to heritage wooden components. All samples are identified at the genus level using computer vision models, and the identification results are compared with those of the traditional wood anatomy method.
Deterioration classification
The primary forms of deterioration observed in wooden heritage architectures include surface changes32, cracks25, mechanical deformation33, mechanical damage, insect infestation33, decay34, and biological growth. Certain deterioration patterns, particularly insect infestation, decay, and cracking, may become more pronounced in the collected data. Given the prevalence and structural significance of key degradation mechanisms in wooden components, cracks and decay are selected for detailed analysis.
Wood identification accuracy is observed to be influenced significantly by the degree of deterioration. Although previous studies have established comprehensive grading systems for decay states35, these classifications are primarily designed for macroscopic assessment and, therefore, are inadequate for the analysis of small regions captured in individual images. To address this limitation, a comprehensive five-level classification system is implemented based on crack and decay characteristics. The status and description of crack and decay classification are listed in Table 3, and varying degrees of cracks and decay are illustrated in Fig. 3.
Calculation of degradation area and degradation simulation
Traditional image-area quantification typically relies on manual pixel value analysis of target regions36, which is extremely time-consuming. The inherent morphological irregularities of wood deterioration patterns hinder accurate area quantification using conventional methods. Semiautomated annotation techniques offer a viable solution for enhancing computational efficiency while maintaining the precision of measurement.
For deterioration data, we use the interactive semiautomatic annotation tool (ISAT)37 for semiautomatic labelling. This method is used in conjunction with the segment anything model (SAM)38 to identify instances of deterioration types rapidly and accurately. Instance segmentation outputs are processed using a threshold-based binarization algorithm, performing deterioration area quantification via pixel analysis using numpy.sum (threshold = 255) in Python. This computational pipeline enables the precise calculation of deterioration area percentages based on the distribution of white pixels over segmented regions. The methodological workflow is illustrated in Fig. 4, and the distribution of the samples and image numbers over different deterioration levels is presented in Table 4.
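The pixel-counting step can be illustrated with a minimal sketch. It assumes the instance segmentation has already been exported as a single-channel binary mask in which deteriorated pixels have the value 255; the function name `deterioration_area_percent` and the toy mask are hypothetical and are not taken from the original pipeline.

```python
import numpy as np


def deterioration_area_percent(mask, threshold=255):
    """Percentage of pixels in a binarized mask that count as deteriorated."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        raise ValueError("expected a single-channel 2-D mask")
    # White pixels (>= threshold) mark the segmented deterioration region.
    white_pixels = np.sum(mask >= threshold)
    return 100.0 * float(white_pixels) / mask.size


# Hypothetical 4 x 4 mask in which a 2 x 2 block is segmented as deteriorated.
demo_mask = np.zeros((4, 4), dtype=np.uint8)
demo_mask[1:3, 1:3] = 255
demo_percent = deterioration_area_percent(demo_mask)
print(f"{demo_percent:.1f}% of the image is deteriorated")
```

The resulting percentage is the quantity that a grading scheme such as the one in Table 3 would take as input; the grade boundaries themselves are not reproduced here.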
The dataset comprises 860 images of cracks and 530 images depicting decay in wood cross-sections, as detailed in Table 4. Although the latter group is smaller, its distribution over severity levels (c1–c4) exhibits greater uniformity compared to that of crack images, which predominantly cluster at the c1 and c2 levels with significant underrepresentation at c3 and c4. To investigate the impact of cracking on wood identification accuracy, a subset of 3403 cross-sectional images obtained from 52 samples was selected for simulated crack generation at the c3 and c4 severity levels. Subsequently, all simulated images were used as test data and directly identified using the trained model. As illustrated in Fig. 3, the crack features typically appear to manifest as irregular black patterns in the cross-sectional images. The simulation process employs an instance segmentation region and replicates the crack morphology by setting the pixels of the instance segmentation region to zero while preserving surrounding areas, as demonstrated in Fig. 5.
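The crack simulation described above amounts to blacking out a segmented region while leaving all other pixels untouched. The following sketch shows one way this could be done with NumPy; the helper names `simulate_crack` and `crack_fraction`, the random test image, and the band-shaped mask are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def simulate_crack(image, region_mask):
    """Return a copy of `image` with pixels inside `region_mask` set to zero."""
    image = np.asarray(image)
    region = np.asarray(region_mask, dtype=bool)
    if region.shape != image.shape[:2]:
        raise ValueError("mask and image must share height and width")
    cracked = image.copy()
    cracked[region] = 0  # works for grayscale (H, W) and colour (H, W, C) images
    return cracked


def crack_fraction(region_mask):
    """Fraction of the image area covered by the simulated crack."""
    return float(np.mean(np.asarray(region_mask, dtype=bool)))


# Hypothetical 10 x 10 colour image with no zero-valued pixels.
rng = np.random.default_rng(0)
img = rng.integers(1, 256, size=(10, 10, 3), dtype=np.uint8)

# Hypothetical crack: a vertical band covering 40% of the image.
mask = np.zeros((10, 10), dtype=bool)
mask[:, 3:7] = True

cracked = simulate_crack(img, mask)
print(f"simulated crack covers {crack_fraction(mask):.0%} of the image")
```

Because the function returns a copy, the same source image can be reused to generate several severity levels by varying the mask.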
Model construction for wood identification
The resurgence of large-kernel models in computer vision, facilitated by advancements in computational hardware, has yielded superior predictive accuracy in recent studies39. Conventional deep learning models exhibit receptive fields with limited effectiveness39,40. In contrast, large-kernel architectures provide substantially expanded receptive fields that approximate human perceptual characteristics more closely. The deterioration features of wood affect the abilities of different identification models to varying degrees. A large receptive field is more likely to detect the length correlation between wood features, facilitating accurate judgment. As a representative implementation, RepLKNet39 exemplifies the large-kernel convolutional neural network architecture, with its structural configuration illustrated in Fig. 6.
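To make the receptive-field argument concrete, the sketch below computes the theoretical receptive field of a stack of convolutional layers using the standard recurrence (each layer adds (kernel - 1) times the accumulated stride). The layer configurations, including the 31 x 31 kernel, are illustrative values and are not taken from the model configuration used in this study; the theoretical receptive field is also only an upper bound on the effective one.

```python
def receptive_field(layers):
    """Theoretical receptive field (in input pixels) of stacked conv layers.

    `layers` is an iterable of (kernel_size, stride) pairs, dilation 1 assumed.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # growth contributed by this layer
        jump *= stride             # spacing between adjacent outputs, in input pixels
    return rf


# Five stacked 3 x 3 convolutions versus a single large-kernel convolution.
small_stack = receptive_field([(3, 1)] * 5)
large_kernel = receptive_field([(31, 1)])
print(small_stack, large_kernel)
```

Under these assumptions, fifteen stacked 3 x 3 layers are needed to match the theoretical receptive field of one 31 x 31 layer.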
The structure of RepLKNet is relatively simple. Following the input of the image data, the image is processed by the stem module, which consists of two convolutional layers and two depth-wise separable convolutions. The four subsequent stages include a preponderance of RepLK Blocks and ConvFFN modules. The majority of large kernels are reflected in the RepLK Block. Finally, the model performs downsampling using the transition module.
Maximum mean discrepancy
The maximum mean discrepancy (MMD) is a common statistical tool used to determine the similarity between two datasets41. It is defined as follows:
\[\mathrm{MMD}(P,Q)={\left\Vert \frac{1}{m}\sum_{i=1}^{m}\Phi ({x}_{i})-\frac{1}{n}\sum_{j=1}^{n}\Phi ({y}_{j})\right\Vert }_{\mathcal{H}}\qquad (1)\]
In Eq. (1), \(P\) and \(Q\) denote the two datasets to be compared and are represented by \(X=\{{x}_{1},{x}_{2},\ldots ,{x}_{m}\}\) and \(Y=\{{y}_{1},{y}_{2},\ldots ,{y}_{n}\}\), respectively. \(\mathcal{H}\) represents the reproducing kernel Hilbert space (RKHS). \(\Phi\) denotes the feature-mapping function into the RKHS.
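In practice, the feature map is rarely evaluated explicitly; the squared MMD is instead estimated from kernel evaluations between samples. The sketch below shows a biased estimate with a Gaussian kernel. The kernel choice, the bandwidth parameter `gamma`, and the synthetic feature vectors are assumptions made for illustration; the kernel and settings used in the study are not stated in the text.

```python
import numpy as np


def gaussian_kernel(a, b, gamma):
    """Pairwise Gaussian kernel values exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd_squared(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between samples x (m, d) and y (n, d)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (
        gaussian_kernel(x, x, gamma).mean()
        + gaussian_kernel(y, y, gamma).mean()
        - 2.0 * gaussian_kernel(x, y, gamma).mean()
    )


# Synthetic stand-ins for extracted image features (not real data).
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 4))
similar = rng.normal(0.0, 1.0, size=(200, 4))
shifted = rng.normal(3.0, 1.0, size=(200, 4))

mmd_similar = mmd_squared(reference, similar, gamma=0.25)
mmd_shifted = mmd_squared(reference, shifted, gamma=0.25)
print(f"same distribution: {mmd_similar:.4f}, shifted distribution: {mmd_shifted:.4f}")
```

Values close to zero indicate that the two feature sets are hard to tell apart, which is the reading applied to the dataset comparison in the Results.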
Evaluation standards
Identification performance is evaluated in terms of the accuracy of the model, which is defined as follows:
\[\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\qquad (2)\]
A true positive (TP) indicates that the target is identified correctly as a positive sample, and a false positive (FP) suggests that the target is identified as a positive sample even though it is a negative sample. A false negative (FN) indicates that the target is identified to be a negative sample despite actually being a positive sample. A true negative (TN) suggests that the target is correctly identified as a negative sample.
However, accuracy alone does not fully reflect model performance in this context, as the identification of a wooden component typically requires multiple image acquisitions per sample. To address this, we introduce confidence metrics that quantify model performance at the sample level. Sample confidence and sample precision are mathematically defined by Eqs. (3) and (4), respectively. A classification is considered to be correct when the sample confidence value exceeds the empirically determined thresholds of 0.5, 0.7, or 0.9.
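The sample-level metrics can be illustrated with a short sketch. It rests on one plausible reading of the definitions, consistent with the results reported later (for example, 54 of 60 samples, i.e., 90%, exceeding the 0.9 threshold): sample confidence is taken as the fraction of a sample's images assigned to its reference genus, and sample precision as the fraction of samples whose confidence exceeds a threshold. The function names and the toy records are hypothetical.

```python
from collections import defaultdict


def sample_confidence(records):
    """Map each sample ID to the fraction of its images predicted as the true genus.

    `records` is an iterable of (sample_id, true_genus, predicted_genus) per image.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for sample_id, true_genus, predicted_genus in records:
        totals[sample_id] += 1
        correct[sample_id] += int(predicted_genus == true_genus)
    return {sid: correct[sid] / totals[sid] for sid in totals}


def sample_precision(confidences, threshold):
    """Fraction of samples whose confidence exceeds `threshold`."""
    values = list(confidences.values())
    return sum(c > threshold for c in values) / len(values)


# Hypothetical per-image predictions for three samples of ten images each.
records = (
    [("A", "Pinus", "Pinus")] * 10
    + [("B", "Larix", "Larix")] * 8 + [("B", "Larix", "Pinus")] * 2
    + [("C", "Abies", "Abies")] * 4 + [("C", "Abies", "Picea")] * 6
)
conf = sample_confidence(records)
for t in (0.5, 0.7, 0.9):
    print(f"threshold {t}: sample precision {sample_precision(conf, t):.2%}")
```

Raising the threshold can only keep or lower the sample precision, which matches the ordering of the values reported at the 0.5, 0.7, and 0.9 thresholds.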
Experimental configuration
The aforementioned steps are implemented on a workstation (CPU: Intel Xeon Silver 4210R @ 2.4 GHz, RAM: 64 GB, and GPUs: NVIDIA GeForce RTX 3090). All model implementations are based on Python 3.8, CUDA 11.8, and PyTorch 2.0.
Results
Validation of dataset distribution of wooden components and xylarium specimens
Wooden component data of heritage architectures are predominantly influenced by sample degradation levels. In contrast, intact wooden components exhibit consistent anatomical characteristics. Based on this observation, we hypothesize that the distributions of xylarium wood specimens and wooden component datasets may be similar. To validate this hypothesis, MMD testing is conducted to quantify the distributional differences between the two aforementioned datasets. The features of both datasets are extracted using a pre-trained ResNet model. Subsequently, MMD testing is performed, and the results are presented in Table 5. To visualize the distribution differences between the datasets clearly, t-distributed stochastic neighbor embedding is used to project the features into a two-dimensional space and plot them in a coordinate system. The resulting visualization is depicted in Fig. 7.
As MMD approaches zero, the distributional similarity between the two datasets increases. The MMD values for all four genera (Abies, Larix, Picea, and Pinus) are observed to be less than 0.006, indicating a low degree of distributional difference. This observation is further supported by the feature distribution visualization depicted in Fig. 7. A small subset of Abies and Larix data points in the wooden component dataset are observed to exhibit divergence from the xylarium wood specimen dataset. In contrast, the distributions of Picea and Pinus exhibit a nearly complete overlap. Specifically, the xylarium wood specimen dataset encompasses the feature space of the wooden component dataset completely for these two genera. Given the small degree of distributional difference, a wood identification model trained on xylarium specimens exhibits strong potential for application in wooden component identification in heritage architectures.
In-situ wood identification of components in heritage architectures
The development of computer vision-based identification methods for wooden heritage architectures presents significant challenges owing to two primary constraints—difficulty of implementing destructive sampling owing to unique cultural values and the scarcity of relevant wooden components. To address these limitations, we develop optimized wood identification models using xylarium wood specimen image databases. The corresponding models exhibit robust generalizability based on validation using unknown modern wood samples22. They are subsequently adapted and applied to the wood identification of components in heritage architectures.
Comparative analysis is performed on five deep learning models: two traditional models (ResNet and SeResNet), two reparameterization-style models (RepVGG and RepLKNet), and a large kernel-style model (ConvNeXt). As summarized in Table 6, RepLKNet exhibits the best performance, with remarkable generalizability across our dataset. Surprisingly, although ResNet is less accurate than RepLKNet, it outperforms RepVGG, which performed better in xylarium wood specimen validation22; this indicates the strong robustness of ResNet. Meanwhile, ConvNeXt, which uses large kernels similar to those of RepLKNet, delivers mediocre performance among the five models, with a testing accuracy of only 94.42%.
Statistical analysis of model performance metrics (mean precision, mean recall, and mean F1 scores) in this study reveals significant variations, primarily attributable to the higher identification error rates for Picea and Abies. Within the wooden components of heritage architectures, these two genera are less commonly used, resulting in uneven data distribution and limited sample size in our experimental dataset. In contrast, all models exhibit high identification accuracy for frequently used timber materials (e.g., Pinus and Larix), thereby validating the effectiveness of the proposed identification method.
Figure 8 presents the confusion matrix and the corresponding confidence levels for the classification outcomes obtained using RepLKNet. Notably, the model achieves recall rates exceeding 99% for Pinus and Larix (Fig. 8a), i.e., the two most commonly used genera in Chinese heritage architecture. In contrast, the classification performances for Abies and Picea yield lower recall rates of 58.08% and 83.52%, respectively. This performance discrepancy can be primarily attributed to misclassifications associated with specific samples, as detailed in Table 7.
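For readers who wish to reproduce this kind of analysis, per-genus recall, precision, and F1 score can be derived directly from a confusion matrix, as sketched below. The counts in the example matrix are invented for illustration and are not the values shown in Fig. 8; rows are assumed to hold the reference genus and columns the predicted genus.

```python
import numpy as np


def per_class_metrics(confusion):
    """Return (recall, precision, f1, accuracy) from a square confusion matrix.

    Rows are true classes and columns are predicted classes. Classes with no
    true or no predicted samples would produce NaN and need separate handling.
    """
    cm = np.asarray(confusion, dtype=float)
    true_positive = np.diag(cm)
    recall = true_positive / cm.sum(axis=1)
    precision = true_positive / cm.sum(axis=0)
    f1 = 2.0 * precision * recall / (precision + recall)
    accuracy = true_positive.sum() / cm.sum()
    return recall, precision, f1, accuracy


genera = ["Abies", "Larix", "Picea", "Pinus"]
# Hypothetical image counts, chosen only to illustrate the calculation.
demo_cm = [
    [60, 0, 40, 0],
    [0, 99, 0, 1],
    [0, 5, 80, 15],
    [0, 1, 0, 199],
]
recall, precision, f1, accuracy = per_class_metrics(demo_cm)
for name, r, p in zip(genera, recall, precision):
    print(f"{name}: recall {r:.2%}, precision {p:.2%}")
print(f"overall accuracy {accuracy:.2%}")
```

The toy matrix shows how a high overall accuracy can coexist with a low recall for a genus that contributes relatively few images.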
In practical identification scenarios, the reliability of a model cannot be evaluated based solely on its accuracy. Comprehensive performance analysis requires individual sample confidence calculations, as shown in Fig. 8b. As presented in Table 8, among the 60 samples without deterioration, 54 (90%) correspond to confidence levels exceeding 90%, while the sample precision increases to 93.33% when a 70% confidence threshold is applied. In addition, only two of the 41 Pinus wood samples correspond to confidence levels below 90%. Using a 50% confidence threshold, which is a commonly adopted and relatively lenient criterion, the model achieves a sample classification accuracy of 98.33%. Only the confidence level of MY02 is lower than this requirement. Given the high sample precision obtained in this study, the RepLKNet model exhibits strong potential for application in the identification of wooden heritage architectures. This scheme effectively addresses a key limitation of computer vision methods, namely the need for large-scale image databases, while achieving rapid in-situ wood identification for heritage architectures, reducing reliance on destructive sampling, and shortening the identification cycle.
Minimum number of samples and images required to establish an effective wood identification model
Although image data from xylarium wood specimens is demonstrated to be effective for wood identification, this methodology typically requires the collection of a substantial number of specimens. In this context, the limited availability of xylarium wood specimens is a major hindrance. A critical research question concerns the minimum specimen and image requirements for achieving reliable wood identification accuracy (>90%). Previous studies have investigated the required numbers of xylarium wood specimens and images separately15, but we believe that the two should be interconnected.
Before determining the minimum number of xylarium wood specimens required for the proposed method, an adequate image dataset must be established. The number of images that can be acquired varies significantly with specimen dimensions, necessitating initial categorization of specimens by genus and evaluating corresponding image quantities. Subsequently, the xylarium wood specimens are selected in sequence, a model is established on this basis, and the final results are derived, as illustrated in Fig. 9. The analysis reveals a positive correlation between specimen quantity and model accuracy, achieving optimal performance (96.2%) at 40 xylarium wood specimens per genus. Notably, the accuracy consistently exceeds 90% when the number of xylarium wood specimens per genus exceeds 20, whereas the rate of precision improvement diminishes beyond this threshold. Consequently, 20 xylarium wood specimens per genus are established as the minimum requirement for reproducible results in this study.
Although model accuracy generally improves with increasing volume of training data, this relationship is not strictly linear, particularly for anisotropic wood materials. The trend observed in Fig. 9 deviates from the expected correlation between image quantity and model accuracy, warranting further investigation to establish a definitive relationship. To examine this relationship systematically, comparative analyses are conducted using 20, 25, 30, and 35 xylarium specimens, each combined with training datasets comprising 1500, 2000, 2500, 3000, and 3500 images. Given the variability in physical size among xylarium wood specimens, individual specimens may yield fewer images than the calculated average. In such cases, e.g., when selecting a 2000-image dataset from 20 xylarium wood specimens, specimens with limited image availability (<100 images) are supplemented using random image selection from other xylarium wood specimens within the same genus to maintain dataset integrity.
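The subset construction for one genus can be sketched as follows. The sketch assumes that each selected specimen contributes up to the average quota (total images divided by specimen count) and that any shortfall is filled with randomly chosen unused images from the other selected specimens; whether the top-up images come only from the selected specimens is an assumption, and the function name `build_genus_subset` and the toy inventory are hypothetical.

```python
import random


def build_genus_subset(images_by_specimen, n_specimens, n_images, seed=0):
    """Pick `n_specimens` specimens and `n_images` images for a single genus."""
    rng = random.Random(seed)
    specimens = rng.sample(sorted(images_by_specimen), n_specimens)
    quota = n_images // n_specimens  # average number of images per specimen

    chosen, leftover = [], []
    for specimen in specimens:
        images = list(images_by_specimen[specimen])
        rng.shuffle(images)
        chosen.extend(images[:quota])    # take up to the quota
        leftover.extend(images[quota:])  # keep the rest available for top-up

    shortfall = n_images - len(chosen)
    if shortfall > len(leftover):
        raise ValueError("not enough images in the selected specimens")
    # Fill the gap left by image-poor specimens with random unused images.
    chosen.extend(rng.sample(leftover, shortfall))
    return specimens, chosen


# Hypothetical inventory: specimen "s2" has fewer images than the quota of 3.
inventory = {
    "s1": [f"s1_{i}" for i in range(5)],
    "s2": [f"s2_{i}" for i in range(2)],
    "s3": [f"s3_{i}" for i in range(6)],
}
picked_specimens, picked_images = build_genus_subset(inventory, n_specimens=3, n_images=9)
print(len(picked_images), "images from", sorted(picked_specimens))
```

Fixing the seed keeps the selection reproducible, which matters when the same specimen sets are compared across several image budgets.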
As depicted in Fig. 10, the number of xylarium wood specimens used to train the model directly affects identification accuracy. When the number of images in each category is controlled and different numbers of xylarium wood specimens are compared, the overall accuracy increases with both the number of specimens and the number of images. Notably, for 2000 and 2500 images, the models trained on 25 xylarium wood specimens marginally outperform those trained on 30 xylarium wood specimens. However, this relationship is reversed at higher image quantities, with the 30-specimen model exhibiting superior accuracy. Analysis of fixed specimen quantities reveals that models trained on 20 or 25 xylarium wood specimens attain performance plateaus at ~2000 images, suggesting that this image volume sufficiently captures representative features of these specimen sets. Beyond this threshold, improving the accuracy using only additional images without incorporating new xylarium wood specimens becomes difficult. The 20-specimen model exhibits overfitting beyond 2500 images, despite achieving accuracy exceeding 90%. Based on these findings, we recommend a minimum dataset of 1500 images obtained from 25 xylarium wood specimens per genus to maintain an accuracy exceeding 90% while preventing overfitting.
Effect of deterioration in wooden components on identification accuracy
During prolonged service, wooden components of heritage architectures exhibit inherent susceptibility to hygrothermal fluctuations, sustained creep deformation under mechanical loads, and microbiological degradation mechanisms, cumulatively manifesting as characteristic deterioration patterns, including crack propagation, dimensional instability, and biodeterioration24.
In this subsection, the confidence levels of the deteriorated samples are systematically evaluated and compared with those of the non-deteriorated specimens, as illustrated in Figs. 11 and 12. The test data are derived from the degradation data presented in Table 4. Given the limited sample sizes for deterioration levels c3 and c4, simulated data are employed for extensive validation. Samples with severe cracks (c3 and c4) exhibit substantially reduced confidence levels, which significantly lowers the identification accuracy of the model, as listed in Table 9. In contrast, the decayed samples exhibit minimal confidence reduction. These findings align with previous research that identified tracheid morphology transitions between earlywood and latewood as primary identification features22. The current study further emphasizes the critical role of the anatomical feature integrity of latewood within growth rings for accurate wood identification.
Discussion
In this study, a deep learning-based in-situ wood identification framework is proposed for wooden components of heritage architectures. The proposed model is trained on an image database comprising images of xylarium wood specimens and directly applied to wood identification, effectively avoiding the database construction limitations. Simultaneously, the model is successfully applied for the rapid in situ identification of wood in heritage architectures, reducing the reliance on destructive sampling and shortening the identification cycle. The optimal algorithm, RepLKNet, achieves an accuracy of 96.67%. The sample precisions are 93.33% and 90% at 70% and 90% confidence levels, respectively. This proves the viability of employing models trained on xylarium wood specimens for component identification in wooden heritage architectures using image recognition methods. However, certain specimens exhibit low identification confidence scores, which requires further discussion.
During the identification of Picea, 22 images are erroneously identified as Pinus, whereas the other eight images are incorrectly identified as Larix. Sample WA05 exhibits the lowest confidence level (<70%) among Picea samples, and 16 images in aggregate are incorrectly identified as Pinus (Table 7). This is attributed to the small difference between wood structure characteristics of Pinaceae wood in cross-sectional images, and the similarity between the gradual trends of variation in the morphology of tracheids during the transition from earlywood to latewood for Picea and Pinus. Therefore, from the perspective of traditional wood anatomy, Picea and Pinus need to be identified using the main characteristics of the cross-field pitting pattern in the radial section.
Abies exhibits the lowest recall rate of only 58.08%. The erroneous data for Abies mainly originates from two samples, MY02 and MY13. Analysis of identification results revealed that Abies specimens were predominantly misclassified as Picea. Previous computer vision studies have demonstrated that the key diagnostic features distinguishing Abies from Picea lie in tracheid morphology during the transition between earlywood and latewood and the growth rings and their adjacent tracheids22. However, as shown in the optical micrographs in Supplementary Figs. S1 and S2, Abies and Picea display very similar tracheid morphology and growth ring patterns in cross-sections, which can lead to confusion in computer vision-based identification.
The minimum dataset required to construct an effective model is observed to be 25 xylarium wood specimens and 1500 images per genus, which ensures real-world test accuracies exceeding 90%. These minimum specimens and images should preferentially exclude deteriorated samples, and no strict constraint on the number of images per specimen is needed. Some existing studies trained models on a large number of xylarium wood specimens but collected very few images per specimen19. Such restricted image sampling may represent specimen characteristics inadequately because collectors cannot reliably determine whether a small number of images captures the full features of a specimen. Moreover, this methodology requires a substantial number of xylarium wood specimens and is challenging to replicate. Because xylarium wood specimens with well-documented taxonomic backgrounds are difficult to obtain, we believe that the principle of “collecting as many images as possible” should be followed to reduce the number of specimens that must be collected. Although acquiring an excessive number of images could theoretically induce overfitting, the practical threshold at which this occurs is high and scales with the number of specimens. This image acquisition principle enhances the reproducibility of computer vision-based wood identification methods and ensures consistent and reliable application outcomes.
Wood deterioration is observed to affect the identification accuracy. Wood decay refers primarily to the biochemical degradation of wood mediated by microbial enzymatic activity42. Among these microorganisms, fungi are the most significant degradative agents and are categorized as white-rot and brown-rot fungi according to their distinct decay mechanisms43. Research on the decomposition of coniferous wood has established the following characteristic patterns: brown-rot fungi selectively depolymerize cellulose while leaving a modified lignin matrix, whereas white-rot fungi degrade lignin and cellulose simultaneously through oxidative enzymatic systems44. Decay fungi grow less extensively and more slowly in latewood than in earlywood because latewood has narrower cell lumina, thicker walls, and higher density45. Moreover, when the surfaces of wooden components of heritage architectures are sanded, the stronger latewood cells are exposed more easily than their earlywood counterparts because of the shallow sanding depth. For these reasons, latewood cells within the growth rings of wooden components are better preserved and more clearly presented. This explains the limited influence of decay on identification accuracy.
Cracks have a particularly pronounced impact on identification accuracy, which can be attributed primarily to the compromised integrity of latewood features. When the cracked region exceeds 30% of the image area, the recognition accuracy of the model decreases significantly. Most cracks in wooden components of heritage architectures are drying shrinkage cracks, which are caused by the shrinkage and swelling behavior and the anisotropy of wood46. Wood is a natural anisotropic material that absorbs and desorbs moisture, so it dries and wets as the environmental temperature and humidity change. At the microscopic level, drying cracks often appear first in wood ray tissues because these comprise mostly thin-walled cells with low strength that are not closely connected with the surrounding cells. When wood shrinks, ray cells undergo significant deformation and are damaged by drying stresses, resulting in cracking47,48. The cell walls of latewood are usually thicker, and the amount of drying deformation is larger than that in earlywood cells. The ray tissue of latewood is therefore the first to be destroyed by shrinkage stress, which produces small cracks. Subsequently, these cracks gradually expand along the ray tissues, increasing in depth and width49,50. As a result, cracks form frequently within the latewood regions, compromising the identification accuracy of the model.
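The 30% criterion can be expressed as a simple screening step before identification. The sketch below assumes that a binary crack mask (1 for crack pixels, 0 otherwise) is already available for each image from some segmentation or annotation step; the mask source, the function names, and the decision to treat exactly 30% as acceptable are assumptions made for illustration.

```python
CRACK_AREA_LIMIT = 0.30  # area share above which accuracy is reported to degrade


def crack_fraction(mask):
    """Share of pixels marked as crack in a binary mask given as rows of 0/1."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask contains no pixels")
    cracked = sum(1 for row in mask for pixel in row if pixel)
    return cracked / total


def is_reliable(mask, limit=CRACK_AREA_LIMIT):
    """True when the cracked area does not exceed the limit."""
    return crack_fraction(mask) <= limit


def make_mask(rows, cols, cracked_rows):
    """Hypothetical helper: a mask whose first `cracked_rows` rows are crack."""
    return [[1] * cols if r < cracked_rows else [0] * cols for r in range(rows)]


if __name__ == "__main__":
    print(is_reliable(make_mask(10, 10, 3)))  # 30% cracked, not above the limit
    print(is_reliable(make_mask(10, 10, 4)))  # 40% cracked, above the limit
```

Images flagged by such a check could be excluded from the vote for a component or routed to a separate treatment, rather than being allowed to lower the confidence of an otherwise well-preserved sample.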
Future work should focus on optimizing the efficiency and accuracy of the proposed method, particularly for components with varying types and levels of deterioration, while extending its applicability from softwood to hardwood components. Specifically, we intend to enhance the identification model by integrating an image inpainting algorithm to address the insufficient identification accuracy for severely deteriorated images (e.g., severely cracked wood) and thus improve the analytical capability of the model for degraded images. We also intend to systematically collect images of hardwood components commonly used in heritage architecture, such as Phoebe, Quercus, and Ulmus, to expand the dataset and thereby strengthen the generalization capability of the proposed model across scenarios. This approach aims to establish adaptive conservation protocols for historic timber structures, ensuring scientific preservation and targeted restoration strategies across diverse material conditions.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Zhu, H. et al. Wood-derived materials for green electronics, biological devices, and energy applications. Chem. Rev. 116, 9305–9374 (2016).
International Council on Monuments and Sites. Guidelines for the Conservation of Wooden Built Heritage https://jianzhuyichan.tongji.edu.cn/info/1007/1543.htm (2017).
Sichuan Institute of Building Research & Zhongke Construction. Technical Standard for Maintenance and Reinforcement of Wooden Structures in Ancient Buildings. GB/T 50165-2020 https://zjj.sm.gov.cn/xxgk/fgwj/jsbz/202011/t20201113_1589718.htm (2020).
Jiang, X., Yin, Y. & Liu, B. Current status development and prospect of wood identification technology. China Wood Ind. 24, 36–39 (2010). in Chinese.
Jiao, L. et al. Ancient plastid genomes solve the tree species mystery of the imperial wood “Nanmu” in the Forbidden City, the largest existing wooden palace complex in the world. Plants People Planet 4, 696–709 (2022).
Gasson, P. How precise can wood identification be? Wood anatomy’s role in support of the legal timber trade, especially CITES. IAWA J. 32, 137–154 (2011).
Dong, M. et al. Wood used in ancient timber architecture in Shanxi Province, China. IAWA J. 38, 182–200 (2017).
Dormontt, E. E. et al. Forensic timber identification: It’s time to integrate disciplines to combat illegal logging. Biol. Conserv. 191, 790–798 (2015).
Hartvig, I., Czako, M., Kjær, E. D., Nielsen, L. R. & Theilade, I. The use of DNA barcoding in identification and conservation of rosewood (Dalbergia spp.). Plos One 10, e0138231 (2015).
Jiao, L. et al. DNA barcode authentication and library development for the wood of six commercial Pterocarpus Species: the critical role of Xylarium specimens. Sci. Rep. 8, 1945 (2018).
Yu, M. et al. DNA barcoding of vouchered xylarium wood specimens of nine endangered Dalbergia species. Planta 246, 1165–1176 (2017).
Lu, Y. et al. DNA methods for identifying wood in ancient timber architecture. Chin. J. Wood Sci. Technol. 37, 12–18 (2023). in Chinese.
Domínguez-Delmás, M. Seeing the forest for the trees: new approaches and challenges for dendroarchaeology in the 21st century. Dendrochronologia 62, 125731 (2020).
Jiao, L., Lu, Y., He, T., Guo, J. & Yin, Y. DNA barcoding for wood identification: global review of the last decade and future perspective. IAWA J. 41, 620–643 (2020).
Traoré, M., Kaal, J. & Cortizas, A. M. Chemometric tools for identification of wood from different oak species and their potential for provenancing of Iberian shipwrecks (16th-18th centuries AD). J. Archaeol. Sci. 100, 62–73 (2018).
He, T. et al. Developing deep learning models to automate rosewood tree species identification for CITES designation and implementation. Holzforschung 74, 1123–1133 (2020).
Ravindran, P., Thompson, B. J., Soares, R. K. & Wiedenhoeft, A. C. The XyloTron: flexible, open-source, image-based macroscopic field identification of wood products. Front. Plant Sci. 11, 1015 (2020).
Ravindran, P., Costa, A., Soares, R. & Wiedenhoeft, A. C. Classification of CITES-listed and other neotropical Meliaceae wood images using convolutional neural networks. Plant Methods 14, 25 (2018).
Ravindran, P., Owens, F. C., Wade, A. C., Shmulsky, R. & Wiedenhoeft, A. C. Towards sustainable North American wood product value chains, part I: computer vision identification of diffuse porous hardwoods. Front. Plant Sci. 12, 758455 (2022).
Ravindran, P. & Wiedenhoeft, A. C. Comparison of two forensic wood identification technologies for ten Meliaceae woods: computer vision versus mass spectrometry. Wood Sci. Technol. 54, 1139–1150 (2020).
Fabijańska, A., Danek, M. & Barniak, J. Wood species automatic identification from wood core images with a residual convolutional neural network. Comput. Electron. Agric. 181, 105941 (2021).
Zheng, C. et al. Deep learning-based species identification of gymnosperm xylem: The practice in digital forestry. Comput. Electron. Agric. 237, 110581 (2025).
Russakovsky, O. et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vision. 115, 211–252 (2015).
Stratigaki, M. Autofluorescence for the visualization of microorganisms in biodeteriorated materials in the context of cultural heritage. ChemPlusChem 89, e202400170 (2024).
Li, X., Qian, W. & Chang, L. Analysis of the density of wooden components in ancient buildings by micro-drilling resistance, using information diffusion. BioResources 14, 5777–5787 (2019).
Zhang, L., Xie, Q., Wang, H., Han, J. & Wu, Y. Deep-learning-based crack identification and quantification for wooden components in ancient Chinese timber structures. Struct. Control Health Monit. 1, 9999255 (2024).
Yin, Y. et al. Research on the identification of wood species used for wooden structures in Southeastern Shanxi Province. World Antiquity 04, 33–36 (2010). in Chinese.
Zhang, Q. An analysis of the history of the palace of compassion and tranquility complex from the aspect of wood species of timber members, The Forbidden City. Herit. Architecture 04, 1–12 (2020). in Chinese.
Li, S. et al. Research on the identification and configuration of wood components for the main hall of Jianshui Zhilin Temple. Sci. Conserv. Archaeol. 32, 91–98 (2020). in Chinese.
He, T. et al. iWood: an automated wood identification system for endangered and precious tree species using convolutional neural networks. Sci. Silvae Sin. 57, 152–159 (2021). in Chinese.
IAWA Committee. IAWA List of microscopic features for softwood identification. IAWA J. 25, 1–70 (2004).
Tan, Y. et al. Inspection and evaluation of wood components of ancient buildings in the South-Three Courts of the Forbidden City. BioResources 17, 962–974 (2022).
Ma, X. et al. 3D structural deformation monitoring of the archaeological wooden shipwreck stern investigated by optical measuring techniques. J. Cult. Herit. 59, 102–112 (2023).
Venugopal, P., Junninen, K., Linnakoski, R., Edman, M. & Kouki, J. Climate and wood quality have decayer-specific effects on fungal wood decomposition. For. Ecol. Manag. 360, 341–351 (2016).
China Academy of Forestry Wood Industry Research Institute, et al. LY/T 2014-2024 Non-destructive testing method and defects classification for wooden components of ancient buildings. National Forestry and Grassland Administration https://std.samr.gov.cn/hb/search/stdHBDetailed?id=1E5DB4EC8381C2FFE06397BE0A0A7B72 (2024) (in Chinese).
Ji, M., Zhang, W., Wang, G., Wang, Y. & Miao, H. Online measurement of outline size for Pinus densiflora dimension lumber: maximizing lumber recovery by minimizing enclosure rectangle fitting area. Forests 13, 1627 (2022).
Ji, S. & Zhang, H. ISAT with segment anything: an interactive semi-automatic annotation tool v1.10 https://github.com/yatengLG/ISAT_with_segment_anything (2025).
Kirillov, A. et al. Segment anything. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 4015–4026 (IEEE, 2023).
Ding, X., Zhang, X., Han, J. & Ding, G. Scaling up your kernels to 31x31: revisiting large kernel design in CNNs. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 11963–11975 (IEEE, 2022).
Ding, X. et al. UniRepLKNet: a universal perception large-kernel ConvNet for audio video point cloud time-series and image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 5513–5524 (IEEE, 2024).
Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B. & Smola, A. A kernel method for the two-sample-problem. In Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, 513–520 (MIT Press, 2007).
Ayrilmis, N., Kaymakci, A. & Güleç, T. Potential use of decayed wood in production of wood plastic composite. Ind. Crop Prod. 74, 279–284 (2015).
Green, F. & Highley, T. L. Mechanism of brown-rot decay: paradigm or paradox. Int. Biodeterior. Biodegrad. 39, 113–124 (1997).
Blanchette, R. A. Wood decay: a submicroscopic view. J. For. 78, 734–737 (1980).
Besma, B., Ahmed, K. & Yves, B. Effects of biodegradation by brown-rot decay on selected wood properties in eastern white cedar (Thuja occidentalis L.). Int. Biodeterior. Biodegrad. 87, 87–98 (2014).
Chen, Y., Huang, A., Meng, S. & Ni, L. Research progress in influencing factors to wood deformation and cracking and anti-cracking measures. World For. Res. 37, 71–76 (2024). in Chinese.
Wang, H. H. & Youngs, R. L. Drying stress and check development in the wood of two oaks. IAWA J. 17, 15–30 (1996).
Yamamoto, H., Sakagami, H., Kijidani, Y. & Matsumura, J. Dependence of microcrack behavior in wood on moisture content during drying. Adv. Mater. Sci. Eng. 2013, 802639 (2013).
Gao, Y. et al. The formation mechanism of microcracks and fracture morphology of wood during drying. Dry. Technol. 41, 1268–1277 (2023).
Sakagami, H. Microcrack propagation in transverse surface from heartwood to sapwood during drying. J. Wood Sci. 65, 33 (2019).
Acknowledgements
This study was supported financially by the National Key Research and Development Program of China (Grant No. 2023YFF0906301), the National Science & Technology Fundamental Resources Investigation Program (Grant No. 2023FY101400), and the China Scholarship Council (CSC) (Grant No. 202403270010). The authors would like to express their gratitude to Professor Xiaomei Jiang of the Research Institute of Wood Industry, Chinese Academy of Forestry, for her valuable academic advice, and to Professor Yongping Chen, Professor Juan Guo, Mrs. Mingkun Xu, Mr. Yonggang Zhang and Mr. Yu Sun of the Research Institute of Wood Industry, Chinese Academy of Forestry, for their technical support. The academic advice from Professor Jianjun Mei, Professor Geoffrey Lloyd, Dr. Sally Church and Mr. John Moffett of the Needham Research Institute, Cambridge, United Kingdom, is gratefully acknowledged.
Author information
Contributions
C.Z., L.J., H.T. and Y.Y. designed the experiments. C.Z., T.L. and Y.L. prepared the samples for imaging and imaged the specimens. T.L., Y.L., S.L. and C.Z. curated the collected dataset. L.J., R.Y., H.Z. and Y.Y. provided research sources. C.Z. developed the deep learning models. C.Z., Y.Y., S.L. and L.J. analyzed the results. C.Z., L.J. and Y.Y. wrote, reviewed and edited the paper. L.J. and Y.Y. conducted project administration and supervision. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zheng, C., Jiao, L., He, T. et al. Deep learning-based in-situ identification of coniferous wood components in heritage architectures of China. npj Herit. Sci. 13, 546 (2025). https://doi.org/10.1038/s40494-025-02120-z
DOI: https://doi.org/10.1038/s40494-025-02120-z