Introduction

Chronic wounds pose a significant healthcare burden worldwide, both in terms of patient morbidity and healthcare costs. Older individuals face a heightened risk of developing chronic wounds due to the natural slowdown of wound healing processes that comes with ageing, a risk compounded by the higher prevalence of cardiovascular diseases and diabetes in older age groups1. The economic cost of wound care represents up to 4% of the total health budget in developed countries1,2. With the ageing population worldwide, the demand for products and technologies to support better care will increase. The financial burden includes direct costs for wound care supplies, treatments, and healthcare professionals’ interventions, alongside indirect costs arising from productivity loss, disability, and reduced quality of life for patients and caregivers. The major types of chronic wounds are pressure, venous, arterial and diabetic ulcers.

The wound healing process is complex and involves several phases. Proper monitoring, assessment and documentation of the wound’s evolution over time are essential to allow the adjustment of the applied treatment according to the wound’s progression3. However, wound assessment and measurement rely on visual examination, which can be highly subjective and varies with the clinician’s experience.

Wound evolution is assessed not only based on its dimensions, but also considering the proportion of the tissue types present in the wound bed region. The Red/Yellow/Black model allows the discrimination of tissues according to the phase of the healing process they are in, providing important information for evaluating wound healing. It associates red with granulation tissue, yellow with slough (devitalised tissue not ready to heal), and black with necrotic (eschar) tissue4.

This work addresses the challenge of effective chronic wound bed characterisation by proposing a fully automated framework based on state-of-the-art deep learning architectures. The primary contributions of this study are summarised as follows:

  • A novel, private dataset for detailed differentiation of granulation, slough, and eschar tissues in chronic wounds;

  • Quantitative inter-rater agreement analysis, highlighting the subjectivity inherent to tissue assessment and establishing a human performance baseline to benchmark automated approaches;

  • Detailed investigation of knowledge transfer, using both convolutional neural networks (CNNs) and transformer-based models, from the simpler task of open wound segmentation to the more complex problem of tissue segmentation via fine-tuning;

  • Development of a reliable and robust fully automated pipeline for potential deployment in clinical practice, improving wound care through reduced subjectivity and increased reproducibility of healing assessments.

These findings advance the field of automated chronic wound assessment, providing a pathway toward more consistent, objective and clinically applicable decision-support tools.

Related work

Recently, several research works have focused on automating different tasks of the wound assessment process, to alleviate the burden on healthcare professionals and make wound status evaluation more reproducible3,5. Many research lines pertain to the extraction of relevant properties from wound images, including the identification of the wound’s aetiology and location6, the outline and measurement of the wound region7, the recognition of healing complications8 and periwound skin alterations9, and the characterisation of the tissues inside the wound bed region.

Wound tissue differentiation is a challenging task in clinical practice, with its inherent subjectivity affecting the reproducibility of the wound bed composition estimated by clinical experts. Considering intra- and inter-rater agreement studies, an analysis was performed in10 using 58 wound images from the Swift private dataset, where four tissues were annotated independently by five clinicians with a one-week interval. Although intra-observer agreement reached moderate to high levels, the inter-observer agreement recorded was lower, being considered weak for epithelial (0.389) and devitalised tissue (0.591) and moderate for necrotic (0.759) and granulation tissue (0.765). It was also verified that the clinicians’ visual estimation overestimated epithelial and necrotic tissue and underestimated devitalised and granulation tissue compared to the proportions calculated through their notes. Moreover, the error distribution between the visual estimate and the calculated proportion had high variability for all tissue types, with standard deviations of 38% and 39%.

Motivated by the variability observed for this task, many works use machine learning (ML) and computer vision algorithms to automate it and increase its reliability. Some methodologies simplify wound bed characterisation into a classification problem, recognising the presence or absence of specific tissue types in the images11. For tissue segmentation, the simplest ML approaches focus on colour recognition. In12, the different regions are segmented through direct recognition of the pixels of each colour. In contrast, the methodology followed in13 considers a white reference marker and separates the regions through a thresholding process based on the HSV (hue, saturation, value) colour space. An overall accuracy of 75% and accuracies of 76.2%, 63.3% and 75.1% for granulation, slough and necrotic tissues, respectively, were reported, comparable with the agreement between the different experts who annotated the ground truth (0.65-0.85)13. In14, the authors use a Convolutional Neural Network (CNN) to directly distinguish tissue types in images of pressure ulcers. Many algorithms developed for tissue differentiation first determine the open wound region. In15, the wound is divided into the largest number of regions with a homogeneous tissue type using the k-means algorithm. Then, the tissue in each region is identified using a Support Vector Machine (SVM), based on colorimetric, topological and morphological properties. The work in16 presents an end-to-end system that uses superpixels (Spx) generated with Simple Linear Iterative Clustering (SLIC) as input to feed different neural network models (U-Net, SegNet and FCN-Net) for automatic segmentation of diabetic foot ulcers (DFU) and tissue differentiation. The best method, Spx-FCN32, outperforms classical Fully Convolutional Network (FCN) models, significantly improving performance in all metrics (accuracy 92.68% and Dice 75.74%). In17, popular decoder models are compared for tissue segmentation. A total of 2836 images of pressure ulcers, annotated using SLIC superpixel pre-processing, were used, obtaining an accuracy of 99.57% with the DeepLabV3 architecture.

The individual analysis of pixels in the wound region, without prior grouping into uniform tissue sub-regions, is also well represented in the literature. Most works use ML models to identify the tissue in each pixel. In18, colour and texture features are extracted from the open wound region and fed to Bayesian models and SVMs to determine the type of tissue corresponding to each pixel. Some studies also sample the open wound area by dividing the region of interest into fixed-size patches19,20. In19, a CNN model obtained an average Dice of 91.38%, while20 used a multidimensional CNN model, reporting an accuracy of 99.55%. The works of García-Zapirain et al.14,21 evaluate the impact of prior segmentation of the wound region on the results achieved for tissue differentiation. The performance achieved using a single CNN to directly recognise the tissue type of each pixel in the entire image is similar to that achieved using separate networks to segment the wound region and the tissues within it, while allowing for faster analysis of the wound, which encourages the use of a single neural network for both tasks14. Although the HSI colour space is less sensitive to lighting and provides good contrast between the three main types of tissues, the RGB representation provides essential information for their correct discrimination. The YCrCb space is also presented as an alternative with relevant information for this task, with the related YDbDr space identified in other works22 as one of the most promising representations. In23, an approach for segmenting the wound boundary and classifying the type of wound tissue using GANs is proposed, namely a conditional GAN (c-GAN), with a chronic wound dataset from eKare Inc. The authors evaluated the impact of the number of images used for training and the number of epochs considered, obtaining a Dice score of 90% for the best combination. Finally, in10, the AutoTissue model was trained with 17,000 anonymised image-annotation pairs from the Swift Dataset and tested with 383 images of category 2 pressure ulcers and arterial, venous and diabetic ulcers, returning an average intersection over union (IoU) of 71.92%.

Despite the emergence of several approaches for tissue differentiation, it remains an underdeveloped task, mostly due to the lack of annotated public datasets. Most works focus on open wound segmentation, driven by the Foot Ulcer Segmentation (FUSeg)24 and the Diabetic Foot Ulcer (DFUC)25 challenges, as well as the availability of the Medetec26, AZH FU27 and Wseg28 datasets. For this task, besides methodologies similar to the ones applied for wound tissue segmentation, more complex approaches leveraging state-of-the-art models, such as vision transformers29,30, and the combination of different datasets31 have emerged, showing potential to further improve the performances attained.

Methods

This study proposes an automated wound bed characterisation pipeline for tissue segmentation and relative tissue proportion estimation. To establish a benchmark for automated methods, human performance on the described tasks is evaluated through agreement studies, described in Section "Open wound and tissue segmentation agreement studies", which assess the consistency of manual annotations among different raters and their alignment with the obtained consensus ground truth. Following this, the proposed pipeline, illustrated in Fig. 1, is described. The methodology, further explained in Section "Tissue segmentation", begins by pre-processing the input images through the selection of the wound region of interest. To obtain this region, the wound and marker bounding boxes are used, which may derive from ground truth (GT) annotations or from a detection model introduced in Section "Tissue segmentation". The processed wound images are then provided to the tissue segmentation models to extract the delineation of each tissue across the image. As an optional post-processing step, the open wound segmentation masks, coming either from the GT annotations or from a previously developed open wound segmentation model32, may be used to refine the model outputs and restrict the tissue regions to the wound bed. Finally, the percentage of each tissue is calculated.

Fig. 1

Pipeline for automatic tissue segmentation in chronic wounds.
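
To make the data flow of Fig. 1 concrete, a minimal sketch of the pipeline stages is given below. The function names are illustrative and not taken from the released code; the model objects are passed in as callables, and the crop_wound_roi and refine_and_quantify helpers are sketched in the pre- and post-processing subsections further on.

```python
import numpy as np

def characterise_wound_bed(image: np.ndarray, detector, tissue_model, wound_model):
    """Illustrative composition of the Fig. 1 pipeline (hypothetical names)."""
    # 1. Detection: wound and reference-marker bounding boxes (or GT annotations).
    wound_box, marker_box = detector(image)
    # 2. Pre-processing: padded square crop around the wound, resized and normalised.
    roi = crop_wound_roi(image, wound_box, marker_box)
    # 3. Tissue segmentation: per-pixel labels for granulation, slough and eschar.
    tissue_pred = tissue_model(roi)
    # 4. Optional post-processing: restrict predictions to the open wound mask
    #    and compute the relative tissue percentages.
    wound_mask = wound_model(roi)
    return refine_and_quantify(tissue_pred, wound_mask)
```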

Dataset

In this work, a private Wounds dataset comprising images from several healthcare units, captured using different mobile devices (smartphones), was considered. For data acquisition, a study protocol was submitted to and approved by the different Health Ethical Committees, and informed consent was obtained from both the healthcare professionals and the patients involved. All experiments were performed in accordance with relevant guidelines and regulations. The images were acquired by healthcare professionals who were instructed to centre the wound in the image, to include a 2 \(\times\) 2 centimetre (cm) reference marker, consisting of coloured patches (blue, green, yellow and white), in the same plane as the wound, and to capture at least 4 cm of perilesional skin. A custom-developed mobile application was used to facilitate image acquisition and ensure consistency33,34. The coloured patches were included for potential future research into colour correction methods, aimed at improving the robustness of the models to variations in lighting and imaging conditions. Most of the images were acquired using mid-range devices, both Android and iOS, though some came from low-end or high-end devices. Moreover, the professionals were advised to use good natural light conditions and evenly distributed lighting when possible, to avoid using the camera flash to prevent reflections on wound tissues, and to take images parallel to the wound bed, aligned with the patients’ head-to-toe orientation. Concerning the mask annotations of the wound and its three tissue components (granulation, slough and eschar), 121 out of 307 images were manually annotated by three wound specialists (nurses with different years of experience), who first drew the boundaries separately, whereas the remaining images were annotated by only one specialist. To establish a robust ground truth mask for the wound and each tissue, the regions where the specialists’ annotations intersected, corresponding to the majority agreement, were identified. In cases where consensus was uncertain, the specialists collaboratively refined the mask, ensuring the reliability of the annotated masks. The annotation process was conducted using a custom-built labelling tool developed in-house, as illustrated in Fig. 2. Fig. 3 shows examples of images from the Wounds dataset and corresponding annotations.
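
As a minimal illustration of how a consensus mask could be derived from the three separate annotations, the sketch below applies a majority vote (at least two of three raters), in line with the majority agreement described above; the collaborative refinement of uncertain regions remains a manual step and is not modelled here.

```python
import numpy as np

def consensus_mask(mask_a, mask_b, mask_c, min_votes=2):
    """Majority-vote consensus of three binary annotation masks (sketch)."""
    votes = (mask_a > 0).astype(np.uint8) + (mask_b > 0).astype(np.uint8) \
            + (mask_c > 0).astype(np.uint8)
    # Pixels kept in the consensus are those marked by at least `min_votes` raters.
    return (votes >= min_votes).astype(np.uint8)
```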

Fig. 2

Labelling tool developed in-house for chronic wound segmentation and characterisation.

Fig. 3

Examples of images and tissue masks from the Wounds dataset (red - granulation, green - slough, blue - eschar).

Two examples of the masks resulting from the specialists’ annotations are shown in Fig. 4. The level of agreement between them is depicted in different colours for the open wound (yellow) and each tissue (granulation - red, slough - green and eschar - blue). In each case, the strongest colour refers to pixels where the three annotators agreed, the middle shade represents pixels where two annotators were in agreement, and the softest colour represents pixels delineated by only one of the specialists.

Fig. 4

Illustrative examples of masks annotated by three experts. The yellow masks correspond to the open wound; red, green, and blue masks stand for granulation, slough and eschar tissues, respectively.

The Wounds dataset comprises 307 images from 104 wounds of different types, as detailed in Table 1. Most wounds correspond to pressure ulcers (59), but venous, arterial and diabetic foot ulcer images are also represented in the dataset. Among the pressure ulcers, all categories are present, with the majority belonging to categories 2, 3 and 4. The dataset contains images of all skin phototypes, with the majority from patients with phototypes 2 and 3. The dataset split into training (75%) and test (25%) sets, detailed in Supplementary Material S1, was designed to incorporate representative examples from diverse wound types, body locations and skin phototypes. Moreover, the proportion of images containing each type of tissue was maintained between the training and test sets. The dataset contains wounds with areas between 0.01 and 160 square centimetres (average area of 13.65 \(cm^2\), with a standard deviation of 21.78 \(cm^2\)), and with width and height up to 13 centimetres. In the original images, the open wound area covers up to 25% of the total image area. Statistics of the percentages of tissues inside the wound in the Wounds dataset are presented in Supplementary Material S1.

Table 1 Distribution of the Wounds dataset, with the number of wounds (#W) and images (#I) per wound type.

Moreover, the publicly available AZH FU dataset27 was also considered in our experiments, as a crucial component of the pretraining phase of our models. This dataset is composed of 1010 images from 889 patients, split in a proportion of 80:20 for training and testing, together with the corresponding ground truth wound segmentation masks.

Open wound and tissue segmentation agreement studies

Two distinct agreement analyses were conducted to assess the reliability and consistency of the wound and tissue segmentation masks provided by the experts. Firstly, we evaluated inter-rater agreement to measure the level of agreement among the three specialists. Secondly, we compared each rater individually against the consensus masks described previously and used in our experiments. To this end, the subset of 121 images from the dataset (Section "Dataset") that was annotated by the three specialists was utilised.

In both cases, the variability in boundary annotations was quantified using the agreement measure defined in Eq. (1), which corresponds to the IoU, where \(D_1\) and \(D_2\) refer to the regions annotated by different specialists in the first study and, in the second case, to the regions annotated by each specialist and the consensus.

$$\begin{aligned} IoU (Agreement) = \frac{|D_1 \cap D_2|}{|D_1 \cup D_2|} \times 100 \end{aligned}$$
(1)
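
For reference, a minimal implementation of Eq. (1) for a pair of binary masks could look as follows (NumPy arrays are assumed):

```python
import numpy as np

def iou_agreement(mask_a, mask_b):
    """Pairwise agreement (Eq. 1) between two binary masks, in percent."""
    intersection = np.logical_and(mask_a > 0, mask_b > 0).sum()
    union = np.logical_or(mask_a > 0, mask_b > 0).sum()
    return 100.0 * intersection / union if union > 0 else 0.0
```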

Additionally, to measure the annotation variability among the three raters and also in relation to the consensus, the Shrout and Fleiss intraclass correlation coefficient (ICC)35 was computed, employing a two-way mixed effects model, given that the raters assessed the same set of samples in the dataset. The ICC describes how strongly units in the same group resemble each other.
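
The ICC computation is illustrated below with the pingouin package as one possible implementation (an assumption for illustration, not necessarily the toolkit used in this study); the proportion values are placeholders, and the ICC3 row corresponds to the Shrout and Fleiss two-way mixed-effects, single-rater model.

```python
import pandas as pd
import pingouin as pg

# Long-format table: one row per (image, rater) pair with the estimated
# granulation proportion; the values below are placeholders for illustration.
ratings = pd.DataFrame({
    "image": ["img1"] * 3 + ["img2"] * 3 + ["img3"] * 3,
    "rater": ["R1", "R2", "R3"] * 3,
    "granulation_pct": [62.0, 58.5, 60.1, 35.2, 40.0, 37.8, 80.3, 77.9, 82.5],
})

icc = pg.intraclass_corr(data=ratings, targets="image",
                         raters="rater", ratings="granulation_pct")
print(icc.loc[icc["Type"] == "ICC3", ["Type", "ICC", "CI95%"]])
```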

The first agreement analysis (inter-rater agreement) comprised the agreement in wound and tissue mask annotations between each pair of raters and among all three, as well as the ICC obtained for the tissue proportion estimations. It is worth noting that, in this case, the proportion of each tissue was obtained by counting the pixels belonging to the corresponding tissue type relative to the sum of the three possible tissues (granulation, slough and eschar) within the wound bed, even though the open wound may contain other tissues. Regarding the second agreement study (raters’ agreement with consensus), the agreement of each rater’s annotations with the ground truth masks was computed, and the ICC was measured considering the IoU between the tissue masks annotated by each rater and the consensus.

Tissue segmentation

To automate the identification of the various tissues within the wound bed and determine their respective proportions, a tissue segmentation model was developed. This development comprised two parts. The first aimed to evaluate the feasibility of the proposed methodology; thus, GT annotations of the open wound, in terms of bounding boxes and segmentation masks, were used to avoid error propagation at each step. These annotations were used, respectively, in the pre-processing of the images to crop the wound region of interest (ROI) (Section "Pre-processing") and in the post-processing of the tissue masks (Section "Post-processing"). In the second part, instead of using these GT annotations, previously developed wound detection and wound segmentation models were incorporated into a streamlined pipeline, illustrated in Fig. 1, for seamless deployment in the real world. Moreover, in both parts, the previously trained wound segmentation models were used as a baseline for the fine-tuning process of the tissue segmentation models.

Pre-processing

Before entering the tissue segmentation models, the images were standardised through a pre-processing step consisting of a cropping operation centred on the wound region. As previously stated, the experiments comprised two phases. To obtain the region of interest around the wound, in the first phase, the GT segmentation masks were used to extract the bounding boxes of the wound and reference marker, preventing error propagation during model development. In contrast, in the streamlined pipeline, these bounding boxes were obtained through a detection model based on RetinaNet with a MobileNetV2 backbone33, which reported mAP@0.75 IoU values of 0.39 and 0.95 for wound and marker recognition, respectively. In the Wounds dataset, a padding margin equal to 25% of the reference marker’s largest side was then applied to the cropped images to preserve contextual information, whereas, in the case of the AZH FU dataset, a tolerance of 30% of the bounding box size was considered. The two padding percentages were determined empirically. For both datasets, the cropped region was then enforced to a square shape with dimensions of \(320\times 320\) pixels for consistency, and pixel intensity normalisation was applied.
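
A simplified sketch of this pre-processing step is shown below, assuming bounding boxes in (x0, y0, x1, y1) pixel coordinates and a plain division by 255 standing in for the actual intensity normalisation.

```python
import cv2
import numpy as np

def crop_wound_roi(image, wound_box, marker_box, out_size=320, pad_frac=0.25):
    """Square crop around the wound, padded by 25% of the marker's largest side."""
    x0, y0, x1, y1 = wound_box
    mx0, my0, mx1, my1 = marker_box
    pad = int(pad_frac * max(mx1 - mx0, my1 - my0))
    x0, y0 = max(x0 - pad, 0), max(y0 - pad, 0)
    x1, y1 = min(x1 + pad, image.shape[1]), min(y1 + pad, image.shape[0])
    # Enforce a square region centred on the padded wound box.
    side = max(x1 - x0, y1 - y0)
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    sx, sy = max(cx - side // 2, 0), max(cy - side // 2, 0)
    crop = image[sy:sy + side, sx:sx + side]
    crop = cv2.resize(crop, (out_size, out_size))
    return crop.astype(np.float32) / 255.0  # placeholder intensity normalisation
```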

Implementation details

The study on tissue segmentation in chronic wounds encompassed both CNN and transformer-based models using different training approaches. The investigated models were based on the DeepLabV3+ architecture36, with a ResNet5037 backbone pre-trained on ImageNet38 (DeepLabV3-R50), and on the SegFormer architecture39 (SegFormer-B0). While DeepLabV3+ uses a CNN with an encoder-decoder design for segmentation, SegFormer uses lightweight transformers to capture global context without positional encodings. Specifically, we compared models trained only for tissue segmentation with models pre-trained for open wound segmentation and subsequently fine-tuned for our task. Regarding the wound segmentation models, besides being trained on the Wounds dataset, the impact of using other datasets during their training was investigated, so the AZH FU dataset was also employed. Therefore, in our work, a total of four models concerning open wound segmentation were explored as pre-trained models for tissue segmentation, namely DeepLabV3+ and SegFormer-based models trained either only on the Wounds dataset or on both the Wounds and AZH FU datasets. These models are described in detail in32 and their performance on the Wounds dataset is reported in Table 2.

Table 2 Open wound segmentation results (in %) on the Wounds dataset, using models developed in32.
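
While the models in this study were implemented with MMSegmentation, a rough, illustrative instantiation of comparable architectures using two widely available open-source libraries is sketched below; the checkpoint names, the class count (background plus the three tissues) and the libraries themselves are assumptions for illustration, not the authors' exact configuration.

```python
import segmentation_models_pytorch as smp
from transformers import SegformerForSemanticSegmentation

NUM_CLASSES = 4  # background, granulation, slough, eschar (assumed encoding)

# CNN branch: DeepLabV3+ with a ResNet-50 encoder pre-trained on ImageNet.
deeplab = smp.DeepLabV3Plus(
    encoder_name="resnet50",
    encoder_weights="imagenet",
    in_channels=3,
    classes=NUM_CLASSES,
)

# Transformer branch: SegFormer-B0 (MiT-B0 encoder) with a freshly initialised head.
segformer = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0", num_labels=NUM_CLASSES
)
```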

To optimise the model configuration, a grid search hyperparameter tuning process combined with a stratified 3-fold cross-validation procedure was employed. For training, a maximum of 200 epochs was established, using early stopping with a patience of 10 epochs. Considering the impact on computational demands, two image dimensions (\(224\times 224\) and \(320\times 320\) pixels) were explored, adopting batch sizes of 16 and 32. Moreover, the Adam optimiser with learning rate (LR) values of \(10^{-4}\) and \(10^{-3}\) was considered. With respect to the loss functions, for the DeepLabV3-R50 model, a combination of Dice loss and Cross Entropy (with equal weights) was employed, while for SegFormer-B0 only Cross Entropy was used. Both training and evaluation of the models were performed using the open-source semantic segmentation toolbox MMSegmentation v1.2.140 on PyTorch v1.13.1+cu116. The experiments used a workstation with four NVIDIA Tesla A100 and V100 GPUs.
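
A minimal re-implementation of the equally weighted Dice plus cross-entropy objective used for DeepLabV3-R50 is sketched below; the exact MMSegmentation loss configuration may differ.

```python
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, num_classes=4, eps=1e-6):
    """Equally weighted sum of cross-entropy and soft Dice loss.

    logits: (N, C, H, W) raw scores; target: (N, H, W) integer labels.
    """
    ce = F.cross_entropy(logits, target)
    probs = torch.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = 1.0 - ((2.0 * intersection + eps) / (cardinality + eps)).mean()
    return ce + dice  # equal weighting of the two terms
```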

Post-processing

In order to obtain the final tissue masks, we explored the effect of intersecting the masks predicted by the tissue segmentation models with the open wound masks, hence eliminating possible artefacts outside the segmented wound. In the first case, the GT open wound masks annotated by the specialists were considered, whereas in the full pipeline, the SegFormer-B0 wound segmentation model trained exclusively on the Wounds dataset (Dice 91.55% - Table 2) was employed to obtain the wound masks. Finally, the percentage of each tissue inside the open wound, relative to the sum of all three tissues, was calculated.
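
The intersection and proportion computation can be sketched as follows, assuming integer-coded predictions (0 = background, 1 = granulation, 2 = slough, 3 = eschar; this label encoding is an assumption for illustration).

```python
import numpy as np

def refine_and_quantify(tissue_pred, wound_mask):
    """Keep tissue predictions inside the open wound and compute percentages."""
    refined = np.where(wound_mask > 0, tissue_pred, 0)
    counts = np.array([(refined == c).sum() for c in (1, 2, 3)], dtype=float)
    total = counts.sum()
    percentages = 100.0 * counts / total if total > 0 else np.zeros(3)
    return refined, percentages  # granulation, slough, eschar (in %)
```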

Evaluation details

An ablation study was conducted to evaluate the influence of different training processes and processing strategies on the performance of the tissue segmentation model. Different architectures were compared, namely a CNN (DeepLabV3-R50) and a transformer-based model (SegFormer-B0). Additionally, the study investigated the effect of fine-tuning a model initially trained for open wound segmentation, as well as the influence of the different datasets utilised for pretraining the models. The assessment also included an investigation into the impact of the described post-processing operations.

To assess the segmentation performance and compare it with other state-of-the-art approaches, we used the IoU (Eq. (1)) and the Dice coefficient (Eq. (2)), where \(D_1\) represents the consensus (GT) mask and \(D_2\) the model’s prediction. The IoU quantifies the overlap between the predicted and GT segmentation masks and ranges from 0 (no overlap) to 1 (perfect overlap); the Dice coefficient is a measure of the similarity between two masks, with values closer to 1 indicating higher agreement.

$$\begin{aligned} Dice = \frac{2 | D_1 \cap D_2 |}{|D_1|+|D_2|} \end{aligned}$$
(2)

To evaluate the estimated tissue proportions, the mean absolute error (MAE) was adopted. MAE is defined in Eq. (3), with n representing the total number of samples, and \(y_i\) and \({\hat{y}}_i\) the true and predicted values of the i-th sample, respectively.

$$\begin{aligned} MAE = \frac{1}{n} \sum _{i=1}^{n} |y_i - {\hat{y}}_i| \end{aligned}$$
(3)
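
For completeness, straightforward implementations of Eqs. (2) and (3) are given below (binary masks and proportion arrays as NumPy inputs are assumed):

```python
import numpy as np

def dice_score(gt_mask, pred_mask):
    """Dice coefficient (Eq. 2) between a consensus mask and a prediction."""
    intersection = np.logical_and(gt_mask > 0, pred_mask > 0).sum()
    total = (gt_mask > 0).sum() + (pred_mask > 0).sum()
    return 2.0 * intersection / total if total > 0 else 1.0

def mean_absolute_error(y_true, y_pred):
    """MAE (Eq. 3) between true and estimated tissue proportions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.abs(y_true - y_pred).mean()
```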

Results and discussion

Open wound and tissue segmentation agreement studies

The findings of the inter-rater and rater/consensus agreement analysis are described in this section. This analysis established a baseline for human performance on the task, providing a benchmark to compare the effectiveness of the proposed tissue characterisation framework. Firstly, the inter-rater variability was examined to assess the consistency among the experts; then, the results of the alignment between each expert and the consensus ground truth were evaluated, to understand the accuracy of the manual annotations.

Inter-rater agreement

Table 3 provides an analysis of the open wound and tissue boundary agreement among each pair of experts and among all three raters. Regarding open wound annotations, the mean agreement between any two raters ranges from 78.5% to 81.2%, declining to 71.5% when all three raters are considered. Notably, specialists 1 and 3 exhibit greater alignment, demonstrating the highest mean, minimum, and maximum agreement values. This trend is reflected in the tissue agreement. Upon examination of the mean and standard deviation values, the same pair of experts demonstrates the highest agreement for slough and eschar. In terms of granulation, a similar mean agreement is observed between pair 1 (raters 1 and 2) and pair 2 (raters 1 and 3), with a difference of 0.5%. However, pair 1 exhibits a lower standard deviation compared to pair 2, with a difference of 1.8%. Overall, agreement at the tissue level is considerably lower, with average values mostly between 50% and 60% and standard deviations above 20%. Among the three tissue types, necrotic tissue appears to be the most easily identifiable due to its strong colour and well-defined contours, while granulation and slough present comparable levels of difficulty. However, it is important to note that the number of images containing eschar is considerably lower, which could potentially influence this observation. Notably, analysis of the minimum agreement values reveals cases with 0% agreement for granulation when rater 2 is involved, indicating that this specialist annotated regions with no overlap with those delineated by the other raters. Figure 5 shows examples of cases with low inter-rater agreement for each tissue.

Table 3 Inter-rater agreement (%) for wound and tissue segmentation across rater pairs and the complete set.
Fig. 5

Examples of samples with low inter-rater agreement. The masks delineated by the raters for open wound (yellow), granulation (red), slough (green) and eschar (blue) report agreement values of 4.98%, 8.86%, 0.16% and 0.08%, respectively.

The examination of tissue proportions within the wound bed also offers insights into inter-rater variability, captured by the ICC. For the calculation of this statistic, only images containing annotations from the three specialists were considered for each tissue. There is excellent agreement among the experts41, with granulation and eschar tissues exhibiting the highest level of consensus (0.931 and 0.932, respectively), while for slough it is slightly lower (0.910), though still classified as excellent agreement.

Compared with previous studies10, where ICC values of 0.765, 0.591 and 0.759 were reported for granulation, slough and eschar, respectively, our study indicates a higher level of consistency in expert annotations, surpassing those by approximately 0.17, 0.32, and 0.17, respectively. Similarly, when compared to another inter-reader variability study13, our findings indicate a higher level of agreement among experts regarding tissue annotations in a larger proportion of images. Specifically, when considering all three raters, the study in13 reports average agreement values of 17.8% and 24.5% for slough and eschar, respectively, largely contrasting with our values of 43% and 63%. Conversely, the agreement values for granulation are very similar between the two studies. Regarding open wound boundary delineation, the agreement among experts in13 is slightly higher than that in Table 3, with higher minimum values and a lower standard deviation. The higher tissue agreement observed in our study may be attributed to the fact that our specialists are all nurses, albeit with varying years of experience. In contrast, other studies10,13 involve healthcare professionals with diverse backgrounds, potentially contributing to discrepancies in their annotations. In addition, previous studies10,13 were based on only around 50 images, whereas our study analysed the agreement based on 121 images. This lends greater confidence to our proposed approach, as robust ground truth data are important for training accurate and reliable segmentation models.

Raters agreement with consensus

The results concerning each rater’s agreement with the GT (consensus) annotations of the wound and tissues may be found in Table 4. As we are comparing the masks annotated by each rater with the corresponding consensus annotation, only images where both the expert and the consensus had annotations for a specific class (wound or tissue) were considered; therefore, a different number of supporting images is found for each rater. In terms of wound annotations, we observe high agreement between the masks delineated by each rater and the generated consensus mask (GT), with values above 80% for the three specialists. Overall, as was the case in the previous analysis (Section "Inter-rater agreement"), although the agreement between clinicians and consensus at the wound level was good, their agreement at the tissue level decreased, with average values ranging from around 57.9% to 79.6%, reflecting the inherent complexity of the problem. Comparing the agreement between each of the three experts and the GT (consensus) masks, we found that both for open wound annotations and for tissue annotations, except for eschar tissue, rater 1 achieved the highest mean, minimum, and maximum agreement values with the consensus masks, showing the greatest proximity to the GT masks used in our experiments. In terms of tissues, eschar also achieved the highest agreement, which, apart from the smaller number of images, may result from the easier colour-based differentiation of its boundaries.

Table 4 Rater agreement with consensus segmentation for open wound and tissues (%).

The agreement between each expert’s masks and the consensus (GT) masks was also quantified using the ICC applied to the IoU values for each tissue type. The high ICC values obtained for granulation (0.803) and slough (0.883) tissues demonstrate good reliability of the experts’ annotations relative to the consensus. Nevertheless, for eschar tissue, we find an opposite trend to that seen in Table 4, as, in this case, this tissue reveals the lowest consistency, although a moderate ICC value is still observed (0.581). Besides inter-rater variance, the ICC also measures how consistent the IoU values with the consensus masks are across raters. Therefore, as a lower number of images was available for this tissue, the estimations were more prone to fluctuations that may have impacted the ICC calculations but were not reflected in the average agreement values found in Table 4.

Overall, the results for the inter-rater and rater/consensus agreement analysis demonstrate the variability in visual estimation, reflecting the tissue differentiation challenge. To reduce ambiguity and provide objectivity in this task, a framework for deep learning–based tissue segmentation and proportion estimation was presented and its results are discussed next.

Tissue segmentation

The results of the model optimisation experiments are presented first, followed by an ablation study to select the best architectures, fine-tuning methods and post-processing operations, and to assess their impact on the performance of the proposed framework regarding tissue segmentation and proportion estimation.

Hyperparameter tuning

To determine the optimal hyperparameter configuration for each model, we compared the performance in terms of the IoU metric achieved across experiments using different combinations of image sizes (\(224\times 224\) and \(320\times 320\)), batch sizes (16 and 32), and learning rates (\(10^{-4}\) and \(10^{-3}\)). The average cross-validation results are depicted in Supplementary Material S2. The most effective hyperparameter combination for each model (highlighted therein) was chosen for the ablation study.

Ablation study

We performed an ablation study to investigate the impact on the overall performance of different architectures, of fine-tuning models pre-trained on a simpler task from the same domain with different datasets, and of post-processing operations. The results are presented in Table 5 and demonstrate the effectiveness of different training strategies in improving the segmentation accuracy of chronic wound tissues. Open wound segmentation models fine-tuned for tissue segmentation consistently outperformed the others across all classes. Leveraging prior knowledge in open wound segmentation has proven advantageous across all tissue types. By initially training the model on this specific domain, which presents a relatively simpler task, we provided the model with a solid foundation. Subsequently, this knowledge was successfully applied to tackle the more intricate challenge of segmenting the different regions within the wound. Furthermore, the effectiveness of the post-processing step, wherein tissue predictions are intersected with the open wound mask, is also confirmed, as this approach positively influenced the results for each tissue class.

By inspecting Table 5, the DeepLabV3-R50 model pre-trained for open wound segmentation and fine-tuned for tissue segmentation with the Wounds dataset, with an input size of \(320\times 320\), batch size of 32 and LR equal to \(10^{-4}\), emerges as the top-performing approach, yielding the highest mIoU and mDice scores: 62.95% and 76.82%. Despite the limited representation of necrotic tissue within the dataset, the model achieved a Dice score of approximately 83% for this class. Granulation exhibited a performance of 81.33% for the same metric. These outcomes are not surprising, given that both components typically exhibit well-defined boundaries. Slough emerged as the most challenging category, with the model returning a Dice of 66.13%, while none of the other models attained values above 68%.

The performance of our best model falls short of the AutoTissue approach proposed in10, which achieves a mIoU of 71.92%. The works in16 and17 employed superpixel-based approaches, dividing the image into fixed regions for tissue classification. The first study reported an accuracy of 92.68% and a Dice of 75.74%, while the latter achieved precision, recall and accuracy scores of 99.15%, 99.15% and 99.57%, respectively. Moreover, the authors in17 excluded images where annotators could not reach a consensus. In20 and19, the authors also performed tissue classification in patches of \(5\times 5\) pixels, with the former yielding an accuracy of 99.55%, specificity of 98.06% and sensitivity of 95.66%, and the latter reporting a classification accuracy of 92.01% and an average total weighted Dice of 91.38%. These fixed-region methodologies reduce the error compared to our proposed pixel-wise approach. Furthermore, the results obtained by these approaches may not be directly comparable, as each method is tailored to the specific characteristics and requirements of its own dataset and is evaluated on that particular dataset. Variables such as image resolution, diversity of wounds, and annotation protocols can vary between datasets, influencing the performance of the algorithms. Therefore, while these approaches may achieve impressive results within their respective datasets, their applicability and generalisability to other datasets, including ours, may be limited.

Overall performance of the proposed framework

Table 5 presents the results obtained for the complete pipeline, thus incorporating both the detection and segmentation models. Since the open wound detection model failed to detect wounds in a number of samples, only 65 out of the initial 78 images contained in the test set were utilised for segmentation evaluation. With the exception of the SegFormer-B0 model pre-trained for open wound segmentation on the Wounds dataset, we can find a positive impact of incorporating domain knowledge regarding a simpler task (open wound segmentation) when training our models, as the fine-tuned models exhibit higher performance than the models trained only for tissue segmentation. Similarly to the results verified in Table 5 concerning the GT-based pre-processing approach, the DeepLabV3-R50 model pre-trained for open wound segmentation on the Wounds dataset emerged as the top-performing model, achieving the highest mIoU and mDice scores. The combined approach of detection and segmentation demonstrated notable robustness, with only a minor 2.5% reduction in overall segmentation performance for the best-performing model (mDice decreasing from 76.82% to 74.38%) relative to the results in Table 5. Moreover, comparing the outcomes for each tissue, the trend persists, with slough tissue being the most difficult to identify due to its more challenging colouring, presenting the lowest Dice score (64.67%). In this case, granulation and eschar tissues achieved similar results, corresponding to Dice scores of 79.45% and 79.01%, respectively.

Table 5 Results (%) of tissue segmentation on the test set of the Wounds dataset for the ablation study and proposed framework approaches. Wound Inters. denotes whether the predicted tissue masks are intersected with the corresponding open wound masks.

Statistical comparisons were performed on the two top-performing models in terms of mean Dice, namely the two DeepLabV3-R50 models pre-trained for open wound segmentation and fine-tuned for tissue segmentation. Both models were initially trained with the Wounds dataset, with one additionally pre-trained using the AZH FU dataset. The comparisons were conducted at a significance level of 95% (\(p<0.05\)). To identify the most appropriate test, the normality of the IoU and Dice distributions was assessed using the Shapiro-Wilk test. Given the non-normality of the data and the paired nature of the samples, the Wilcoxon signed-rank test was employed, and no significant difference between the models under consideration was found. To further compare the performance of the models, a visual inspection was conducted, as well as an analysis of the impact of their corresponding predicted tissue masks on the final goal of the framework: tissue proportion estimation.
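
The statistical procedure can be reproduced with SciPy as sketched below; the per-image Dice values are placeholders, since only the test protocol (Shapiro-Wilk for normality, followed by a paired Wilcoxon signed-rank test at the 5% level) is taken from the text.

```python
import numpy as np
from scipy import stats

# Placeholder per-image Dice scores of the two fine-tuned DeepLabV3-R50 models
# evaluated on the same test images (paired samples).
dice_model_a = np.array([0.78, 0.81, 0.66, 0.90, 0.72, 0.85])
dice_model_b = np.array([0.75, 0.83, 0.60, 0.88, 0.70, 0.84])

# Normality check of each score distribution (Shapiro-Wilk).
p_norm_a = stats.shapiro(dice_model_a).pvalue
p_norm_b = stats.shapiro(dice_model_b).pvalue

# Paired, non-normal data: Wilcoxon signed-rank test at the 5% significance level.
p_wilcoxon = stats.wilcoxon(dice_model_a, dice_model_b).pvalue
print(f"Shapiro p-values: {p_norm_a:.3f}, {p_norm_b:.3f}; "
      f"Wilcoxon p-value: {p_wilcoxon:.3f}; significant: {p_wilcoxon < 0.05}")
```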

Examples of the predictions from the top-performing models are depicted in Fig. 6, where, for simplicity’s sake, the DeepLabV3-R50 model pre-trained for open wound segmentation on the Wounds dataset is referred to as “model A” and the DeepLabV3-R50 model pre-trained for open wound segmentation on both the AZH FU and the Wounds datasets corresponds to “model B”. The top row showcases satisfactory predictions, with mDice values of 66.75% vs 56.32%, and 92.21% vs 86.25% for models A and B, respectively. In the middle row, segmentations with mean Dice scores of 31.40% vs 64.61%, and 52.3% vs 49.43% are presented. Lastly, the bottom row displays flawed instances, with mDice of 56.14% vs 65.88%, and 54.78% vs 26.37%. These examples show the importance of visual examination. Despite the higher mean Dice scores in the last row, predictions in the middle row appear more sensible: the example on the left contains minor misclassifications of eschar as the other tissues, which penalise the overall result; on the right, there is granulation tissue misidentified as slough by both models, a reasonable error given its lighter colour, and the necrotic tissue is not recognised within the wound bed, leading to a low Dice score for this class. The bottom examples illustrate poor predictions: the left sample displays excessive slough segmentation in both models, likely attributable to the lighter tone caused by reflections. Additionally, there is minimal tissue identification in the centre of the wound in the example on the right, and model B completely misidentifies the granulation and necrotic tissues. Comparing the two approaches, model A not only achieves higher mDice scores in the provided examples but also consistently delivers more visually coherent results. The superior performance of model A implies that pretraining on a domain-specific dataset (the Wounds dataset) is more beneficial than combining it with additional datasets (AZH FU) that may introduce variability without significant benefit.

Fig. 6

Examples of tissue predictions generated by the pipeline for the test set images. The first row demonstrates accurate predictions, while the second row illustrates cases where predictions are understandable (upon visual examination) but not optimal. The third row depicts instances of unsatisfactory predictions. Red represents granulation, green slough and blue eschar. Model A stands for the DeepLabV3-R50 model pre-trained for open wound segmentation on the Wounds dataset and model B for the DeepLabV3-R50 model pre-trained for open wound segmentation on the AZH FU and Wounds datasets.

The MAE of the tissue proportions calculated from the tissue masks generated by the DeepLabV3-R50 tissue segmentation models, previously trained for open wound segmentation on the AZH FU and/or Wounds datasets, was computed for the test images, as shown in Table 6. Model A, pre-trained solely on the Wounds dataset, consistently demonstrated lower errors for all tissue types, as well as lower standard deviations, indicating reduced variability compared to model B. This further supports the conclusion drawn from the visual examination.

No reports of the MAE for tissue proportion estimation were found in the literature, preventing direct comparisons of our results. A valuable follow-up study would involve collecting visually estimated wound tissue proportions from specialists in real-world settings, hence evaluating the reproducibility of the automated approach compared to expert assessment.

Table 6 Mean absolute error of tissue proportions obtained by the pipeline in the test set. The models compared are the DeepLabV3-R50 tissue segmentation models, pre-trained for open wound segmentation on the AZH FU and/or Wounds datasets.

Conclusions

This work demonstrates the potential of an automated approach for chronic wound tissue segmentation and tissue proportion estimation by leveraging deep learning and domain knowledge. This study introduces a novel and comprehensive tissue segmentation dataset, encompassing a wide range of wound types and appearances. Furthermore, an inter-rater agreement analysis confirmed the high annotation quality of the constructed dataset, while also highlighting the inherent complexity of tissue delineation and providing important context for interpreting the models’ performance. Using this dataset, and after exploring different training conditions, a DeepLabV3-R50 model first pre-trained for a simpler task (open wound segmentation) and then fine-tuned for tissue segmentation on a private Wounds dataset was selected to predict the tissue regions and subsequently estimate their percentage within the open wound. This resulted in Dice scores of 79.45%, 64.67% and 79.01% for granulation, slough and eschar tissues, respectively, and MAE values for proportion estimation of (\(14.33\pm 16.05\))%, (\(14.31\pm 15.28\))% and (\(8.84\pm 5.29\))% for the same tissues. The impact of the explored conditions on the final performance emphasises the importance of considering different factors, including network architecture, fine-tuning strategies, dataset diversity and post-processing operations, in optimising the performance of the models.

Future work will focus on clinical validation and exploring a data-centric approach to improve the quality of the considered dataset by identifying images with lower quality and under-represented sample types, ensuring a more representative data coverage.