Abstract
Gingival inflammation grade is a well-established index in periodontitis. The aim of this study was to develop a deep learning network utilizing a novel feature extraction method for the automatic assessment of gingival inflammation. T-distributed Stochastic Neighbor Embedding (t-SNE) was utilized for dimensionality reduction. A convolutional neural network (CNN) model based on DenseNet was developed for the identification and evaluation of gingival inflammation. To enhance the performance of the deep learning (DL) model, a novel teeth removal algorithm was implemented. Additionally, Grad-CAM++ was applied to generate heatmaps for computer visual attention analysis. The mean Intersection over Union (MIoU) for the identification of gingivitis was 0.727 ± 0.117. The accuracy rates for the five inflammation grades were 77.09%, 77.25%, 74.38%, 73.68% and 79.22%, and the Area Under the Receiver Operating Characteristic curve (AUROC) values were 0.83, 0.80, 0.81, 0.81 and 0.84, respectively. The attention ratio on gingival tissue increased from 37.73% to 62.20%, and within 8 mm of the gingival margin it rose from 21.11% to 38.23%; the overall attention ratio on the gingiva increased from 51.82% to 78.21%. The proposed DL model with a novel feature extraction method provides high accuracy and sensitivity for identifying and grading gingival inflammation.
Introduction
Periodontitis is a chronic infectious disease that affects up to 90% of the world's population1. Alveolar bone resorption caused by periodontitis is considered one of the leading causes of tooth loss. Gingival inflammation is the initial stage of periodontal tissue destruction and persists throughout the development of periodontal disease. Gingival inflammation grade is a well-established clinical index2. Gingival inflammation is the key symptom of periodontal disease and is the basis for its diagnosis, as well as for determining the treatment plan and the prognosis of the disease. The various indices of gingival inflammation, such as the Gingival Index (Löe and Silness), the Bleeding Index and the Modified Gingival Index, are subjective and depend on the examiner's judgement, and accurately recording and tracking them is very time-consuming.
With the progress of medical image processing technology, the integration of machine learning, computer vision, and deep neural networks has proven increasingly efficacious, facilitating the automated and precise detection of various diseases3. Deep learning (DL) technology, particularly the convolutional neural network (CNN), is renowned for its outstanding feature extraction capabilities and has been widely used in image categorization4. Krois et al. measured periodontal bone loss on panoramic radiographs using DL models5, while Lee et al. used a deep CNN to identify and categorize fractured dental implants6.
Algorithms are a central component of a DL model. A substantial proportion of previous investigations employed CNNs as the basic DL architecture, relying on standard algorithms7,8. Standard DL methods make it relatively easy to distinguish between different diseases; however, accurate discrimination of subtle features within the same disease remains challenging.
Algorithmic development is a potent means of both identifying disease and extracting a variety of features to comprehensively discern distinctions among seemingly similar characteristics9. Lyu constructed a novel benchmark based on topological and geometrical analysis of retinal blood vessels to facilitate the analysis of ocular fundus pathology10. These improvements promise not only heightened accuracy but also a more broadly applicable framework for clinical decision-making. However, among the CNN algorithms applied to periodontitis analysis, there is a notable dearth of novel algorithms aimed at refining feature extraction.
In this study, a CNN-based gingival inflammation grading algorithm was developed, which works primarily by evaluating the gingival contour and removing the teeth from oral images. The model was trained on annotated intraoral images using a U-Net architecture with a DenseNet encoder, coupled with Grad-CAM++ techniques. t-SNE was employed for dimensionality reduction and visualization of the graded images, and the diagnostic and grading performance of four CNN models was then compared.
Materials and methods
Dataset collection
The protocol and research project were approved by the Ethics Committee of School and Hospital of Stomatology, Wuhan University. All methods were conducted in accordance with the approved guidelines and regulations, and written informed consent was obtained from each participant and/or their guardian(s). The collection comprises images of gingiva in various states of health and inflammation to ensure representative coverage. In this study, a total of 826 intraoral photos (8214 teeth) were gathered. Dentists captured all intraoral images with a digital camera (Canon EOS Rebel T3, 4272 × 2848 pixels; approximately 3 gigabytes in total). The camera settings were f/25, ISO 200, shutter speed 1/160, auto white balance, and a flash output power of 1/4.
In terms of inclusion and exclusion criteria, patients of all ages were eligible, including smokers, pregnant and lactating women, and patients treated for stable systemic disease; patients with active systemic disease, patients taking antibiotics, patients receiving anticancer therapy, patients with excessive gingival hyperplasia, and patients with extensive melanin pigmentation or abnormally coloured gingiva were excluded. The demographics of the 826 patients were as follows: sex (420 men and 406 women); age (10.7% under 20, 16.3% 20–30, 19.5% 30–40, 24.3% 40–50, and 29.2% above 50).
The intraoral image dataset was saved in a standardized format (JPEG, 100% quality). The photos were preprocessed with normalization, denoising, and resizing algorithms to automatically disregard extraneous information and limit the effect of noise on the DL algorithm. The dataset was then randomly divided into two parts (Fig. 1a): training (80%) and validation (20%), with the disease grades distributed randomly between the two sets.
The framework of the deep learning model for grading gingival inflammation from intraoral photos. (a) The input images are normalized, denoised, and resized before being divided into training and validation sets. The framework is then optimized by training on the training set and evaluating on the validation set with the CNN-based model. Finally, the deep learning-based algorithm generates the gingival status visualization. (b) The CNN model is made up of dense blocks and transition layers. Images enter the first convolution layer, and the resulting features are passed to the first dense block. Each dense block contains several layers of the same feature-map size, and each layer is densely connected to all preceding layers. Two transition layers between adjacent blocks adjust the feature-map sizes through convolution and pooling. The final dense block generates feature maps, which are fed into global average pooling and a softmax classifier to obtain the final classification results.
Hardware systems
In this study, the model was created with TensorFlow 2.011 and trained and evaluated on an NVIDIA Tesla V100 GPU. Additionally, the Adam optimizer, a stochastic gradient descent variant, was used to fit the parameters of all segmentation models, each of which employed binary cross-entropy as its loss function.
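As a rough illustration, the optimizer and loss configuration described above might look as follows in TensorFlow 2.x; the model constructor and dataset names are placeholders, not the authors' code.

```python
import tensorflow as tf

# Hypothetical constructor standing in for the paper's U-Net/DenseNet
# segmentation model, which is not public.
model = build_segmentation_model()

# Adam (a stochastic-gradient-descent variant) with binary cross-entropy,
# as described for all segmentation models in this study.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=["accuracy"])

# `train_ds` / `val_ds` are assumed tf.data pipelines of (image, mask) pairs.
model.fit(train_ds, validation_data=val_ds, epochs=100)
```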
Gingival inflammation labeling
Six dental specialists scored the images. During the graders' training, an additional dataset of 30 pictures was used for testing, and graders who did not achieve a κ value of 0.70 or more did not pass the training. A κ value of 0.6 to 0.8 indicates high agreement, whereas 0.8 to 1.0 implies nearly complete agreement. As a result, three of the dentists were eligible to serve as graders and classify the photographs.
All photographs were categorized as having different gingival statuses based on a modified gingival index (mGI)12 ranging from 0 (healthy gingiva) to 4 (severe gingival inflammation). Personal medical history and the initial diagnosis were documented in the medical files for all photos, and graders were required to review and confirm all information before classifying the images using labeling software (LabelBox, Labelbox Inc., CA).
Deep learning model establishment
The DenseNet algorithm, a representative CNN-based DL algorithm, was used as the main platform for grading gingival inflammation in this study, as shown in Fig. 1b. The DenseNet architecture was used to train the model after randomization. The fundamental structure of DenseNet in this study consists of dense blocks and transition layers. The final block is followed by global average pooling and passed through a softmax classifier to obtain the final result. Dense connections are formed by directly linking the outputs of all preceding layers13. Specifically, each layer is concatenated with all previous layers in the channel dimension (their feature maps have the same size) and acts as the input to the next layer. A transition layer connects two neighboring dense blocks, and its pooling reduces the size of the feature maps.
Here, each composite function combines batch normalisation (BN), rectified linear units (ReLU), pooling and convolution (Conv). DenseNet hyperparameters, such as the learning rate, number of dense blocks, growth rate, and batch size, were tuned on the validation findings to develop the best-performing model. In the DL model, each dense layer has the structure BN + ReLU + 1 × 1 Conv + BN + ReLU + 3 × 3 Conv. The Adam optimizer was applied, with the batch size adjusted first and then the learning rate. For more complex tasks such as image classification, a smaller batch size is usually used to ensure that the model can fully learn the detailed features of each sample, so the batch size was set to 16 in this study. For the learning rate, the study started with a large value and decreased it over the iterations until the optimal rate was found; the final learning rate was set to 0.0001.
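For concreteness, here is a minimal Keras sketch of one such dense layer (BN + ReLU + 1 × 1 Conv + BN + ReLU + 3 × 3 Conv) and a transition layer; the growth rate and compression factor are illustrative, as the paper does not report them.

```python
import tensorflow as tf
from tensorflow.keras import layers

def dense_layer(x, growth_rate=32):
    # Bottleneck: BN + ReLU + 1x1 Conv, then BN + ReLU + 3x3 Conv.
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(4 * growth_rate, 1, use_bias=False)(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(growth_rate, 3, padding="same", use_bias=False)(y)
    # Dense connectivity: concatenate the input with the new features,
    # so every layer sees the outputs of all preceding layers.
    return layers.Concatenate()([x, y])

def transition_layer(x, compression=0.5):
    # BN + ReLU + 1x1 Conv + 2x2 average pooling halves the feature-map size
    # between adjacent dense blocks.
    channels = int(x.shape[-1] * compression)
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(channels, 1, use_bias=False)(y)
    return layers.AveragePooling2D(2)(y)
```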
Method processing for gingival margin morphology
The wave-like margin of inflamed gingiva is significantly wider than the scalloped margin of healthy gingiva, indicating that swollen gingiva may have a larger cross-sectional area than the ideal gingival contour. The boundary line between the gingiva and the teeth is first located (Fig. 2a), and the high and low points of the gingiva are labeled on the greyscale picture (Fig. 2e,f). The inflammatory gingival line and the normal gingival line are then generated based on the inflammatory and physiological characteristics of the gingival profile, respectively (Fig. 2b,c and g). The midpoints between the high and low points are calculated from the inflammatory gingival line and the physiological gingival line, respectively, and composited (Fig. 2d,h). The gingival inflammatory features are captured from the composite map, focusing on the difference between the physiological and pathological lines (Fig. 2i,j). The final heatmap shows the gingival margin inflammation (Fig. 2k).
The process of feature extraction and gingival margin inflammation assessment with the new method. (a) Convert the RGB image of gingival inflammation to a greyscale image; (e) adjust the curve contrast; (b) find the boundary line between the gingiva and the teeth; (f) label the high and low points of the gingiva on the greyscale picture; (c, g) generate the inflammatory gingival line and the normal gingival line based on the inflammatory and physiological characteristics of the gingival profile, respectively; (d) calculate the midpoints between the high and low points from the inflammatory and physiological gingival lines; (h) composite the two sets of midpoints; (i) capture the gingival inflammatory features from the composite map; (j) focus on the difference between the physiological and pathological lines; (k) the final heatmap shows the gingival margin inflammation.
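To make the midpoint comparison above concrete, here is a minimal NumPy/SciPy sketch that assumes the margins have already been traced as y-coordinate profiles; detecting the "high and low points" via find_peaks and the synthetic margin curves are our interpretation, not the authors' code.

```python
import numpy as np
from scipy.signal import find_peaks

def margin_midpoints(margin_y):
    # High points (interdental papillae) are local maxima of the traced
    # margin curve; low points (gingival zeniths) are local minima.
    highs, _ = find_peaks(margin_y)
    lows, _ = find_peaks(-margin_y)
    n = min(len(highs), len(lows))
    # Midpoint between each adjacent high/low pair along the margin.
    return (margin_y[highs[:n]] + margin_y[lows[:n]]) / 2.0

# Synthetic stand-ins for the traced margins (the paper's tracing step
# is image-based and not reproduced here).
x = np.linspace(0, 4 * np.pi, 400)
ideal_y = np.sin(x)                    # scalloped physiological margin
inflamed_y = 1.3 * np.sin(x) + 0.2     # wider, swollen margin

# Larger differences between the two midpoint sets indicate more swelling.
swelling = np.abs(margin_midpoints(inflamed_y) - margin_midpoints(ideal_y))
```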
Teeth removal and gingiva segmentation method
A new algorithm for removing teeth was created to achieve region-of-interest extraction. U-Net14 can generate a mask for each object in a picture, allowing it to distinguish the positions and shapes of numerous objects. Improved U-Net models15,16 were therefore used as the basis for the automatic segmentation of the teeth and gingiva in this study.
The following algorithm was used to segment the tooth and gingival contours. The photo was first converted to grayscale (Fig. 3a,b) and then into a pseudocolored image (Fig. 3c). After increasing the contrast by 25%, the pixels in the top 20% of brightness were computed, and the color range was set to extract those pixels with a 20% tolerance. The same tooth contour algorithm was then applied to the pseudocolored image. The overlapping region of the tooth contours from the grayscale and pseudocolored pictures was identified as the final tooth contour, and the complete tooth structure was eliminated, leaving just the gingiva (Fig. 3d). The tooth contour was further constricted (Fig. 3e) to locate the central points of the teeth, principally by drawing perpendicular lines inward from the contour margins. The segmented gingiva was obtained by bisecting the angles extending outward from the gingival papilla of the tooth contour (Fig. 3f). The tooth positions were determined using the previously acquired tooth center points. Finally, all assigned data were applied to the original image, resulting in a segmented image containing only the gingiva (Fig. 3g). Attention is thereby focused on the gingival tissue, reducing tooth interference (Fig. 3h).
Segmentation by U-Net with a DenseNet encoder. (a–c) The input image is converted to grayscale and then into a pseudocolored image. (d–f) The tooth contour algorithm is applied to both the grayscale and pseudocolored images. The overlapping region of the two tooth contours is identified as the final tooth contour, leaving just the gingival tissue. The tooth contour is further constricted to pinpoint the central points of the teeth. The segmented gingiva is then formed by bisecting the angles extending outward from the gingival papilla of the tooth contour. (g) Finally, all assigned data are applied to the original picture, resulting in a segmented image containing just gingival tissue. (h) Attention is focused on the gingival tissue, reducing tooth interference.
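One plausible NumPy/OpenCV reading of the brightness-thresholding step is sketched below; the paper does not specify the exact contrast adjustment, colormap, or tolerance semantics, so those details are assumptions.

```python
import cv2
import numpy as np

def tooth_mask(img_bgr, top_percent=20, tolerance=0.20, contrast=1.25):
    # Grayscale pass: +25% contrast, then keep the brightest 20% of pixels
    # with a 20% tolerance on the brightness cutoff (our interpretation).
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gray = np.clip(gray * contrast, 0, 255)
    cutoff = np.percentile(gray, 100 - top_percent)
    mask_gray = gray >= cutoff * (1 - tolerance)

    # Pseudocolor pass: apply a colormap (JET chosen arbitrarily here) and
    # repeat the brightness test on its luminance channel.
    pseudo = cv2.applyColorMap(gray.astype(np.uint8), cv2.COLORMAP_JET)
    lum = cv2.cvtColor(pseudo, cv2.COLOR_BGR2GRAY).astype(np.float32)
    mask_pseudo = lum >= np.percentile(lum, 100 - top_percent) * (1 - tolerance)

    # Final tooth contour = overlap of the two masks.
    return mask_gray & mask_pseudo

# Usage sketch: blank out the detected teeth, leaving only the gingiva.
# gingiva_only = img_bgr.copy(); gingiva_only[tooth_mask(img_bgr)] = 0
```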
t-distributed Stochastic Neighbor Embedding evaluation for classification performance
To identify the DL model that yields the most separable features, the new algorithms were evaluated using t-distributed Stochastic Neighbor Embedding (t-SNE) visualization. t-SNE preserves in its mapping the proximity of similar groups while placing dissimilar groups at greater distances. Its principle is as follows: pairwise similarities between data points in the high-dimensional space are modeled as Gaussian probabilities, while those in the low-dimensional embedding follow a t-distribution; a cost function then penalizes mismatches, reducing the dimensionality while retaining local structure. Point clouds of different colors reflect varying degrees of gingival tissue inflammation. Thus, by applying the best algorithm to the developed model, the various degrees of gingival inflammation can be better differentiated.
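For reference, a minimal scikit-learn sketch of this visualization; `features` (penultimate-layer CNN activations, one row per image) and `grades` (mGI labels 0–4) are assumed inputs, not variables from the paper.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Project the high-dimensional CNN features to 2D; perplexity is illustrative.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

# Color each point by its inflammation grade to inspect cluster separability.
plt.scatter(emb[:, 0], emb[:, 1], c=grades, cmap="viridis", s=8)
plt.colorbar(label="mGI grade")
plt.show()
```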
Attention visualization improvement for the CNN
The Gradient-weighted Class Activation Mapping (Grad-CAM) visualization tool was used to explain the computer attention heatmap of the neural network. The methodological structure is as follows: the input images were first processed through numerous convolution layers, and the feature maps of the final convolutional layer were pooled by global average pooling. The pooled results were then flattened into one dimension and passed to a fully connected layer, and the classification results were predicted with the softmax activation function. The class activation map was then computed as the weighted sum of the final convolutional layer's feature maps, using the per-category weights.
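The weighted-sum computation described above corresponds to plain Grad-CAM; a minimal TensorFlow sketch (not the Grad-CAM++ weighting used in the study, and assuming a Keras model with a known final-convolution layer name) is:

```python
import tensorflow as tf

def grad_cam(model, image, conv_layer_name, class_idx):
    # Auxiliary model exposing both the last conv feature maps and the logits.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[tf.newaxis])
        score = preds[0, class_idx]
    grads = tape.gradient(score, conv_out)
    # Global-average-pool the gradients to get per-channel weights, then
    # form the weighted sum of the feature maps and keep positive evidence.
    weights = tf.reduce_mean(grads, axis=(1, 2))
    cam = tf.reduce_sum(conv_out[0] * weights[0], axis=-1)
    return tf.nn.relu(cam).numpy()
```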
Confusion matrices
The confusion matrix diagram depicts the classification results of this algorithm and of two professional dentists. Each element (x, y) of a confusion matrix gives the empirical probability that class y was predicted when class x was the ground truth.
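A short sketch of this row-normalization with scikit-learn; `y_true` and `y_pred` are the ground-truth and predicted grades (assumed arrays).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred, labels=list(range(5)))
# Row-normalize so element (x, y) is the empirical probability of
# predicting grade y when grade x is the ground truth.
cm_prob = cm / cm.sum(axis=1, keepdims=True)
```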
Five-fold cross validation for diagnostic validation
Five-fold cross-validation was used to assess the performance of the AI-assisted categorization of gingival inflammation. In short, the photos were evenly and randomly split into five datasets. Four of the datasets were used to train and fit a deep learning diagnostic model, while the remaining one was used to evaluate the model and compute its performance indices. Five diagnostic models were thereby created, and performance indicators were recorded for each.
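A skeleton of this five-fold protocol using scikit-learn's KFold; `images`, `labels`, the epoch count, and `build_model` are placeholders, not the authors' pipeline.

```python
import numpy as np
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []
for train_idx, test_idx in kf.split(images):
    model = build_model()  # hypothetical constructor for the DL classifier
    model.fit(images[train_idx], labels[train_idx], batch_size=16, epochs=50)
    # Evaluate on the held-out fold and record the performance indices.
    fold_scores.append(model.evaluate(images[test_idx], labels[test_idx]))
print(np.mean(fold_scores, axis=0))  # average metrics across the five models
```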
Evaluation of our DL model performance
Model performance was assessed on two tasks: (1) identifying gingival inflammation in the input image; and (2) categorizing the level of gingival inflammation in the input image. For detection performance, the Mean Intersection over Union (MIoU) metric was used to compare the AI model's detection accuracy with that of two dentists17.
The MIoU is a widely recognized semantic segmentation standard that computes the ratio between the intersection and union of two sets: in this study, the ground truth (the actual gingival inflammation area) and the predicted segmentation result (the gingival inflammation regions detected by the AI model or the dentists). The MIoU can therefore be expressed as the number of true positives (intersection) divided by the total of true positives, false negatives, and false positives (union).
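Expressed directly on binary masks, this definition is a few lines of NumPy; `pred_masks` and `gt_masks` are assumed lists of boolean arrays.

```python
import numpy as np

def iou(pred, gt):
    # IoU = TP / (TP + FP + FN): intersection over union of binary masks.
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

# MIoU: mean IoU over all evaluated images.
miou = np.mean([iou(p, g) for p, g in zip(pred_masks, gt_masks)])
```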
In terms of classification performance, the area under the ROC curve (AUC) with 95% confidence intervals was used to evaluate the algorithm's diagnostic performance. Additionally, performance indicators such as sensitivity, specificity, and accuracy were calculated for the DL model.
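In a one-vs-rest setting, these per-grade metrics could be computed as below; `labels` (true grades) and `scores` (softmax outputs of shape n_samples × 5) are assumed inputs.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

pred = scores.argmax(axis=1)
for g in range(5):
    # AUROC for grade g versus all other grades.
    auc = roc_auc_score((labels == g).astype(int), scores[:, g])
    # Sensitivity, specificity, and accuracy from binarized predictions.
    tp = np.sum((pred == g) & (labels == g))
    fn = np.sum((pred != g) & (labels == g))
    tn = np.sum((pred != g) & (labels != g))
    fp = np.sum((pred == g) & (labels != g))
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    print(f"Grade {g}: AUROC={auc:.2f}, Se={sens:.2%}, Sp={spec:.2%}, Acc={acc:.2%}")
```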
Evaluation of different DL model performances
The performance of this DenseNet-based model for detecting gingival inflammation was also compared with three previously presented methods: ResNet, Inception-v3 and EfficientNet. Here, a fine-tuning process was performed directly on pre-trained models, following methods similar to those of Gulshan et al.18: the layer weights were unfrozen gradually in phases, with the initial few layers unfrozen last. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and AUROC for identifying gingival inflammation were evaluated for each DL model.
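A rough sketch of such a phase-wise unfreezing schedule, assuming Keras' ImageNet-pretrained DenseNet121; the block-name prefix, epochs, and learning rates are illustrative rather than the study's settings.

```python
import tensorflow as tf

# Phase 1: freeze the pretrained backbone and train only the new head.
base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg")
base.trainable = False
model = tf.keras.Sequential([base, tf.keras.layers.Dense(5, activation="softmax")])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)  # `train_ds` is an assumed tf.data pipeline

# Phase 2: unfreeze the last dense block (Keras names its layers with a
# "conv5" prefix) and continue training at a lower learning rate; earlier
# blocks would be unfrozen in later phases, the first layers last.
for layer in base.layers:
    layer.trainable = layer.name.startswith("conv5")
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```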
Statistical analysis
For detection performance, the differences in MIoU between the AI model and the dentists were assessed; parametric data were evaluated using paired t-tests. A P value < 0.05 was considered statistically significant. The statistical analysis was performed using SPSS software, version 19.0 (Chicago, IL, USA).
For classification performance, the differences in AUC, sensitivity, specificity and accuracy between grades were assessed with GraphPad Prism version 9.5.0 for Windows (GraphPad Software, San Diego, CA), with a significance threshold of P < 0.05.
Results
t-distributed stochastic neighbor embedding
The t-SNE visualization of the DL algorithms is shown in Fig. 4. In the standard model, the feature dots are completely mixed together (Fig. 4a), indicating that the features extracted by the DL model were too scattered to support classification. In contrast, after the improved method was applied, the DL model formed well-defined clusters for the gradations of gingival inflammation: dots with the same gradation lie close to each other, although a small number of dots remained mixed between features.
t-SNE visualization of the feature extraction for gingival inflammation gradation after applying the gingival margin algorithm. The feature dots are completely mixed together in the standard model (a). After the improved method is applied, the DL model forms well-defined clusters for the gradations of gingival inflammation (b). The new algorithm was also applied to train three other models, ResNet (c), Inception-v3 (d) and EfficientNet (e), and their classification features were extracted. Different degrees of gingival inflammation are shown as point clouds of different colors: points with the same degree of inflammation cluster together, while points with different degrees disperse. The t-SNE revealed that DenseNet gained more separable features for each category than the other three algorithms, with clear clustering and less interspersion between features.
The visualization results of ResNet (Fig. 4c), Inception-v3 (Fig. 4d) and EfficientNet (Fig. 4e) revealed that samples of the same category were more dispersed in space and that different categories could intersect. The DenseNet results showed that samples of the same category clustered more tightly, while different categories were more separated with less crossing. The red dots, concentrated in the lower middle area, represent the most severe gingival inflammation, while the black dots, concentrated in the upper left corner, show healthy gingival tissue; the green and yellow dots in the center left and right reflect moderate gingival inflammation. The t-SNE visualization showed that the DenseNet algorithm obtained more separable features for each category than the other three algorithms.
Heatmap for computer visual attention
Figure 5 illustrates Grad-CAM++ applied to DenseNet, with red zones indicating more concentrated attention. For the direct output images, the localization range of the heatmap included both the gingival tissue and the teeth (Fig. 5b). After optimization with the novel algorithm, the heatmap showed the red area focused on gingival tissue (Fig. 5c). A contour (attention density) map was generated for attention analysis from the heatmap data (Fig. 5d). As a proportion of the overall visual field, attention on all gingival tissue increased from 37.73% to 62.20% (Fig. 5e); attention within 8 mm of the gingival margin increased from 21.11% to 38.23% (Fig. 5f); and attention on the teeth decreased from 31.10% to 19.07% (Fig. 5g). The overall gingival attention ratio increased from 51.82% to 78.21% (Fig. 5h).
The heatmaps for images with different degrees of gingival inflammation (mGI0–mGI4). A red area implies more concentrated attention. For the direct output pictures, the heatmap's localization range covers both gingival tissue and teeth, and the attention is distracted (a, b). The heatmap after optimization demonstrates that the red region is focused on gingival tissue, improving attention to the gingiva (c). The attention improvement can be calculated from the 3D attention density plot data (d): as a proportion of the overall visual field, attention on all gingival tissue increased from 37.73% to 62.20% (e); attention within 8 mm of the gingival margin increased from 21.11% to 38.23% (f); attention on the teeth decreased from 31.10% to 19.07% (g). The total attention ratio on the gingival tissue rose from 51.82% to 78.21% (h).
Performance of teeth removal DL model for grading gingival inflammation
The ability of the DL model to grade gingival inflammation is shown in Table 1. Grade IV had the best accuracy (79.22%), with a sensitivity of 83.75% and a specificity of 69.33%. The highest specificity (70.64%) was found for Grade 0, along with 83.08% sensitivity and 77.09% accuracy. For Grades I, II, and III, the accuracy exceeded 70%. In addition, the AUROC values for gingival inflammation grades 0, I, II, III, and IV were 0.83, 0.80, 0.81, 0.81, and 0.84, respectively (Fig. 6a–e).
Gingival inflammation identification performance of the proposed deep learning algorithm and the dentists. (a–e) Receiver operating characteristic (ROC) curves for the different degrees of gingival status, generated from the fivefold cross-validation of our model; they depict the deep learning model's ability to discriminate between different degrees of gingival inflammation. Gingival inflammation grades 0, I, II, III, and IV had AUROC values of 0.83, 0.80, 0.81, 0.81, and 0.84, respectively. k1 to k5 denote the ROC curves obtained from the five test runs of the fivefold cross-validation. AUC, area under the ROC curve. (f–h) Confusion matrices for our algorithm and two professional dentists identifying five degrees of gingival inflammation. Our algorithm distinguishes gingival inflammation better than human diagnosis.
Confusion matrices
The confusion matrices illustrating the identification of five degrees of gingival status by the algorithm and by two professional dentists are displayed in Fig. 6f–h. The predicted classes are located on the horizontal axis and the actual classes on the vertical axis. The results showed similar misclassification trends between the DL model (Fig. 6f) and the human dentists (Fig. 6g,h): Grades I and II, which tended to be confused in the dentists' identification, also tended to be confused in the DL analysis. The DL algorithm had a stronger ability to distinguish the inflammation in Grades 0 and IV.
Performance of teeth removal DL model for detecting gingival inflammation
Two dentists detected gingival inflammation the first time with MIoUs of 0.681 ± 0.238 and 0.679 ± 0.231, respectively. After one week, the MIoUs of the same two dentists marking the same images a second time were 0.676 ± 0.232 and 0.678 ± 0.226, respectively. The AI model had a higher MIoU (0.727 ± 0.117) and produced identical results after one week. A paired t-test revealed no significant difference in the detection of gingival inflammation between the DL model and the human experts (P > 0.05).
Performance of different DL models for gingival inflammation detection
DenseNet surpassed all three existing methods for identifying gingival inflammation (Fig. 7a), consistent with the t-SNE result. The algorithm with DenseNet achieved the highest accuracy (76.32%), PPV (84.28%) and NPV (61.81%) among the four DL models (Fig. 7b,c and f), with 82.03% sensitivity and 69.11% specificity (Fig. 7d,e). ResNet achieved an accuracy of 73.60%, with 77.77% sensitivity and 64.75% specificity; Inception-v3 achieved an accuracy of 74.56%, with 78.71% sensitivity and 65.75% specificity; and EfficientNet achieved an accuracy of 72.72%, with 76.23% sensitivity and 65.25% specificity. The AUC of the DenseNet algorithm was 0.82, whereas the AUCs of the ResNet, Inception-v3, and EfficientNet models were 0.79, 0.80 and 0.77, respectively (Fig. 7g).
Performance comparison of various deep learning models (DenseNet, ResNet, Inception-v3 and EfficientNet) for detecting gingival inflammation with the novel method. The sensitivity, specificity, PPV, NPV, accuracy, and AUROC of identifying gingival inflammation were evaluated for the four DL models (a–g). DenseNet surpassed the other three methods, achieving the highest accuracy (76.32%) among the four DL models, with 82.03% sensitivity and 69.11% specificity. The AUC of our algorithm was 0.82, whereas the AUCs of the ResNet, Inception-v3, and EfficientNet models were 0.79, 0.80 and 0.77, respectively.
Discussion
Periodontitis is widely regarded as the sixth most common disease in the world. Timely identification and classification of gingivitis are pivotal to delivering proactive treatment and interventions for periodontal disease. Nevertheless, accurate classification of subtle features within the same disease poses a major challenge, mainly due to limited feature extraction. To overcome this limitation, an automated DL system for grading gingival inflammation was proposed, incorporating a new teeth removal method. The method focuses on the morphology of the gingival contour and removes the teeth, which are not relevant to the assessment of gingival inflammation. These incremental adjustments to the DL model have a significant impact, enabling grading and improving the model's overall performance. The algorithm demonstrated a high degree of accuracy and efficiency in disease classification, validated through a comprehensive evaluation encompassing cluster analysis, five-fold cross-validation, confusion matrix analysis, and AUROC assessment.
To extract more features while improving the accuracy and efficiency of the algorithm, a U-Net architecture was combined with a DenseNet encoder, coupled with Grad-CAM++ techniques. Xu et al. proposed a sequential-level feature DL algorithm trained by locating the centroid of the lesion area and extending a fixed-size region outward along an enlarged ring19; this modification increased DenseNet's diagnostic accuracy from 60% to 66.3%. Inspired by their work, we sought to enhance feature extraction through the refinement of novel algorithms, aiming to capture subtle differences within each feature. The approach incorporates Grad-CAM++, which offers improved visual explanations of CNN model predictions; this improvement is evident in better object localization and in elucidating multiple object instances within a single image, compared with current state-of-the-art methods.
To date, existing studies have treated entire intraoral images as homogeneous, undifferentiated objects, resulting in a relatively scattered focus on the image. Our algorithm adopts a comprehensive approach, analyzing the color and shape of the gums through the clustering of scatter points in distinct areas, while simultaneously performing graded marking, contributing to a more nuanced understanding of intraoral images.
In addition to enhancing DL models, various other methods have been shown to increase AI diagnostic performance. Pohl et al. utilized LogOdds for non-convex interpolation between space-conditioned distributions20. Li et al. recognized inflammation using contrast-limited adaptive histogram equalization and a gray-level co-occurrence matrix21. These approaches, with varying emphases on regions, edges, and clustering, underscore that comprehensive image processing can indeed enhance the diagnostic capabilities of AI models. In this study, the primary objective of teeth removal is to minimize interference with the diagnosis of gingival inflammation, directing computer attention to the gingiva and thereby increasing detection accuracy. Following image processing, the model is fed with the resultant image, wherein the overlapping area of the tooth contours is identified as the definitive tooth contour. This process eliminates all tooth structures, leaving only the gingiva, so that in the optimized heatmap the gingival tissue becomes the focus of attention. The proportion of the visual field devoted to all gingival tissue and to the region within 8 mm of the gingival margin increased from 37.73% to 62.20% and from 21.11% to 38.23%, respectively; consequently, the overall attention ratio on the gums rose from 51.82% to 78.21%.
It is also worth noting that gingival inflammation was not only identified but also graded. In previous studies, gingivitis was simply separated from healthy gingiva, but its severity was not evaluated22,23. t-SNE was employed in this study for scoring gingival inflammation: the data were dimension-reduced and visualized so that similar samples are represented by nearby points, while dissimilar samples are mapped to distant points with high probability24. The results indicated that the different gingival conditions appear as five clusters of distinct features, which is more intuitive and straightforward to interpret.
In addition, the t-SNE visualizations of four conventional algorithms (DenseNet, ResNet, Inception-v3, and EfficientNet) were compared under the new algorithm to find the best performer in grading different levels of gingival inflammation. The DenseNet visualization demonstrated that samples of the same category were more closely associated, while different categories were more dispersed and less crossed. DenseNet was therefore chosen in this study to achieve better performance; it has fewer parameters and is more efficient and effective25. The optimized DL model outperformed the dentists in terms of MIoU (0.727 ± 0.117). With this DenseNet, the accuracy for all five degrees of inflammation exceeds 70%, and the sensitivity for all five degrees exceeds 80%. In DenseNet, the output feature maps of each layer serve as input to all subsequent layers, strengthening feature propagation; the dense connection mode also improves the backpropagation of the gradient, alleviating gradient vanishing and model degradation and making the network easier to train.
Moreover, the performance of our model was compared with the three previously presented methods. DenseNet offers the following advantages over the other algorithms: fewer parameters, fewer computations, stronger resistance to overfitting, and stronger generalization. The gingival margin feature extraction and tooth removal algorithms developed in this study were integrated with DenseNet, achieving superior results. The algorithm with DenseNet achieved the highest accuracy (76.32%), with 82.03% sensitivity and 69.11% specificity; the AUC of our algorithm was 0.82, whereas the AUCs of the ResNet, Inception-v3, and EfficientNet models were 0.79, 0.80 and 0.77, respectively. Together with the t-SNE visualization results, this validates the rationality of selecting DenseNet.
Although the proposed model showed high accuracy and reliability, some limitations remain: (1) The dataset in this study is restricted, since the photos were obtained from a single collection; expanding the dataset with data from numerous regions across the world may further validate the algorithm and increase its generalization. (2) Some special images, such as those showing excessive gingival hyperplasia or melanin pigmentation, may mislead the diagnosis and cannot yet be automatically excluded from the analysis; one way to solve this problem is to further improve the deep learning methods at the framework and algorithmic levels.
In terms of practicality, this model is promising; for example, it can help dentists make rapid diagnoses and improve their diagnostic accuracy. Furthermore, as the approach handles images efficiently and drastically reduces diagnostic time, it may be used in the future for large-scale periodontal disease screening.
In summary, based on a large dataset of intraoral images, a deep learning model with high accuracy and sensitivity was created for identifying and grading gingivitis. In the future, a system that assists dentists in making graded diagnoses and allows patients to self-grade will be developed, promoting public oral health. This approach can offer cost-effective screening for periodontal disease.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to patient privacy, but the design and algorithm are available from the corresponding author on reasonable request.
Abbreviations
- CNN: Convolutional neural network
- t-SNE: t-distributed stochastic neighbor embedding
- DL: Deep learning
- BN: Batch normalization
- ReLU: Rectified linear units
- Conv: Convolution
- Grad-CAM++: Gradient-weighted Class Activation Mapping (optimized)
- MIoU: Mean Intersection over Union
- PPV: Positive predictive value
- NPV: Negative predictive value
- AUC: Area under the ROC curve
References
Pihlstrom, B. L., Michalowicz, B. S. & Johnson, N. W. Periodontal diseases. Lancet 366(9499), 1809–1820 (2005).
Caton, J. G. et al. A new classification scheme for periodontal and peri-implant diseases and conditions—Introduction and key changes from the 1999 classification. J. Periodontol. 89(Suppl 1), S1-s8 (2018).
Janowczyk, A. & Madabhushi, A. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. J. Pathol. Inform. 7(1), 29 (2016).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).
Krois, J. et al. Deep learning for the radiographic detection of periodontal bone loss. Sci. Rep. 9(1), 8495 (2019).
Lee, D.-W., Kim, S.-Y., Jeong, S.-N. & Lee, J.-H. Artificial intelligence in fractured dental implant detection and classification: evaluation using dataset from two dental hospitals. Diagnostics 11(2), 233 (2021).
Aquino, D. A., Klipfel, A. A., Brosnan, C. F. & Norton, W. T. The 70-kDa heat shock cognate protein (HSC70) is a major constituent of the central nervous system and is up-regulated only at the mRNA level in acute experimental autoimmune encephalomyelitis. J. Neurochem. 61(4), 1340–1348 (1993).
Andrade, K. M., Silva, B. P. M., de Oliveira, L. R. & Cury, P. R. Automatic dental biofilm detection based on deep learning. J. Clin. Periodontol. 50(5), 571–581 (2023).
Mun, Y., Paik, I., Shin, S. J., Kwak, T. Y. & Chang, H. Yet Another Automated Gleason Grading System (YAAGGS) by weakly supervised deep learning. NPJ Digit Med. https://doi.org/10.1038/s41746-021-00469-6 (2021).
Lyu, X., Cheng, L. & Zhang, S. Y. The RETA benchmark for retinal vascular tree analysis. Sci. Data https://doi.org/10.1038/s41597-022-01507-y (2022).
Abadi, M. et al. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI’16), Savannah GA, USA, November 2-4, 2016, Proceeding, 265–283 (eds. Kimberly, K. & Timothy, R.) (USENIX Association, 2016)
Lobene, R. A modified gingival index for use in clinical trials. Clin. Prevent Dent. 8, 3–6 (1986).
Huang, G., Liu, Z., Pleiss, G., Van Der Maaten, L. & Weinberger, K. Q. Convolutional networks with dense connectivity. IEEE Trans. Pattern Anal. Mach. Intell. 44(12), 8704–8716 (2019).
Ghosh, S., Chaki, A. & Santosh, K. Improved U-Net architecture with VGG-16 for brain Tumor segmentation. Phys. Eng. Sci. Med. 44(3), 703–712 (2021).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18 (ed. Ronneberger, O.) (Springer, 2015).
Lee, C. T. et al. Use of the deep learning approach to measure alveolar bone level. J. Clin. Periodontol. 49(3), 260–269 (2022).
Everingham, M., Van Gool, L., Williams, C. K., Winn, J. & Zisserman, A. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88, 303–338 (2010).
Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316(22), 2402–2410 (2016).
Xu, Y. et al. Deep sequential feature learning in clinical image classification of infectious keratitis. Engineering 7(7), 1002–1010 (2021).
Pohl, K. M. et al. Logarithm odds maps for shape representation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2006: 9th International Conference, Copenhagen, Denmark, October 1–6, 2006 Proceedings, Part II 9 (ed. Pohl, K. M.) (Springer, 2006).
Li, W. et al. A gingivitis identification method based on contrast-limited adaptive histogram equalization, gray-level co-occurrence matrix, and extreme learning machine. Int. J. Imaging Syst. Technol. 29(1), 77–82 (2019).
Chen, Y. & Chen, X. Gingivitis identification via GLCM and artificial neural network. In Medical Imaging and Computer-Aided Diagnosis: Proceeding of 2020 International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD 2020) (ed. Chen, Y.) (Springer, 2020).
Rana, A. et al. Automated segmentation of gingival diseases from oral images. In 2017 IEEE Healthcare Innovations and Point of Care Technologies (HI-POCT) (ed. Rana, A.) (IEEE, 2017).
Linderman, G. C. & Steinerberger, S. Clustering with t-SNE Provably. SIAM J. Math. Data Sci. 1(2), 313–332 (2019).
Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 4700–4708 https://doi.org/10.1109/CVPR.2017.243 (2017).
Acknowledgements
This study was supported by Natural Science Foundation of Hubei Province of China (2021CFB466), Medical Backbone Talents Foundation of Wuhan City of China (2020-55), Nursing Research Foundation of Wuhan University (030-600400020), Central Higher Education and Teaching Fund (2024ZG147), and Clinical Medicine Education Research Project of Hubei Provincial Health Commission (HBJG-220005).
Funding
This study was supported by Natural Science Foundation of Hubei Province of China (2021CFB466), Medical Backbone Talents Foundation of Wuhan City of China (2020-55), Nursing Research Foundation of Wuhan University (030-600400020), Central Higher Education and Teaching Fund (2024ZG147), and Clinical Medicine Education Research Project of Hubei Provincial Health Commission (HBJG-220005).
Author information
Contributions
All authors gave their final approval and agree to be accountable for all aspects of the work. XYB, CW and DY composed the manuscript; DY and CW conceived and designed the study; XYB and DY designed the algorithms and collected data; XYB and SHL analyzed the data; XXW and JXY critically revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wen, C., Bai, X., Yang, J. et al. Deep learning based approach: automated gingival inflammation grading model using gingival removal strategy. Sci Rep 14, 19780 (2024). https://doi.org/10.1038/s41598-024-70311-y