Introduction

The close proximity of the maxillary posterior teeth to the maxillary sinus and the use of sinus lift procedures for dental implants have made the maxillary sinus an important anatomical site1,2. However, the primary ostium, the main pathway for drainage of the maxillary sinus, lies in an unfavorable location3. In addition to its difficult location, the primary ostium is also susceptible to blockage during inflammation4. The accessory ostium [AO], also called the Girade’s orifice, is one of the important anatomical variations of the maxillary sinus5. AOs may occur unilaterally or bilaterally, as solitary or multiple apertures, between the uncinate process and the inferior turbinate6,7. AOs tend to occur more frequently in the posterior fontanelle, the part of the lateral nasal wall covered only by mucoperiosteum8.

Studies using computed tomography [CT], cone beam computed tomography [CBCT], endoscopy, and cadaveric analysis have shown wide variation in the reported prevalence of AO9,10. Some studies have reported an association between AO and sinus pathologies3,10. The presence of an AO increases the ventilation of the sinus; however, it also permits reverse drainage from the middle meatus back into the sinus11. This reverse drainage causes a reduction in the level of nitric oxide and a buildup of mucus in the maxillary sinus, leading to pathologies such as retention cysts, mucosal thickening, and maxillary sinusitis12,13. Recent studies have shown that CBCT can image the anatomy of the sinonasal structures and the AO with precision and at a lower radiation dose10,14.

Artificial intelligence models have been explored for the detection and segmentation of anatomical structures in the craniofacial region15,16. Deep learning models, which are a subset of machine learning and artificial intelligence, have shown promising results in interpreting medical images when combined with residual neural networks17,18,19. Experts suggest that the use of AI makes the radiology workflow more efficient by reducing image reading time, speeding up disease detection, and improving diagnostic accuracy20.

Tasks such as segmentation of the maxillary sinus and the upper airway space, as well as detection of sinus pathology, have been achieved with high accuracy using deep learning models21,22,23. Deep learning models have also shown promising results in the detection of nasal septal deviation and fractures of the nasal bones16,24.

However, no research has explored the use of neural networks for the detection of AO in radiographic images. To fill this knowledge gap, we conducted a study to determine the accuracy of a deep learning model in the detection of AO in coronal CBCT images.

Materials and methods

We conducted a retrospective cross-sectional study using the radiology archives of the University Dental Hospital, Sharjah, United Arab Emirates, between 1st January 2024 and 30th June 2024. The CBCT scans had been acquired for various diagnostic purposes using a Planmeca Viso 7 unit (Finland) at 95 kilovoltage peak (kVp), 5 milliampere (mA), and 0.2-millimeter (mm) resolution. Two examiners, each with 10 years of clinical experience, screened 3278 CBCT scans to identify 856 scans acquired with a large field of view (FOV) covering the region from the base of the mandible to the cranial base (20 × 17 cm).

CBCT scans of patients below the age of 18 years were excluded, as were scans of patients with a history of trauma or tumors in the sinonasal region, congenital deformities of the sinonasal region, cleft palate, nasal polyps, choanal atresia, acute sinusitis, or severe nasal septal deviation (deviated septum contacting the lateral nasal wall).

The two examiners analyzed the 856 CBCT scans for AO. In case of disagreement between the examiners, a third examiner with equal experience adjudicated the presence of AO. Inter-rater reliability was calculated, and 10% of the scans were re-evaluated by each examiner after 2 weeks to obtain intra-rater reliability.

The examiners scrolled through the coronal CBCT sections from the mesial side of the maxillary first premolar to the distal side of the maxillary second molar and cropped the image from the coronal section at the site of the AO. To maintain uniformity in cropping, the medial boundary was set at the nasal septum, the lateral boundary at the vertical line crossing the middle of the sinus, the inferior boundary one centimeter below the level of the hard palate, and the superior boundary at the level of the cribriform plate (Fig. 1).

Fig. 1

Coronal CBCT image showing the supero-inferior and medio-lateral boundaries used for cropping. The yellow arrow points to the AO.

The images were saved in the Joint Photographic Experts Group (JPEG) format and labeled with the patient’s hospital registration number followed by a side indicator (R/L for right or left) and the letter ‘A’. Example: 706788RA. The examiners obtained 227 such images from the 856 CBCT scans. Since this was the maximum number of images obtainable from our radiology archives, we followed convenience sampling. We then obtained 227 coronal images without AO from the CBCT scans, following the same boundaries, and labeled them with the letter ‘N’ after the registration number. Example: 65097LN. To maintain uniformity, all the ‘N’ images were cropped from the coronal section coinciding with the medial aspect of the maxillary first molar. The outline of the methodology (data preprocessing, image augmentation, and image classification) followed in our study is shown in Fig. 2.

Fig. 2

Flowchart of the steps followed in the present study. In the first step, coronal CBCT images were preprocessed. This was followed by image augmentation (rotation, width shift, zoom). In the next step, fine-tuning of the base model and custom layers was carried out. In the last step, the classification output (normal = without AO, abnormal = with AO) of the model was obtained.

The images were first segregated into two separate folders (data cleaning) based on the presence or absence of AO. As part of preprocessing, the images were resized and subjected to a sharpening filter using ‘ImageFilter’, a module of the Python Imaging Library (PIL) (Fig. 3).

Fig. 3

Showing (A) the original image, (B) the preprocessed image after applying the ImageFilter sharpening, and (C) a snapshot of the ImageFilter code in PIL.
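For illustration, a minimal sketch of this preprocessing step is given below; the file paths and the 224 × 224 target size are assumptions for the example, not values reported above.

```python
from PIL import Image, ImageFilter

# Load a cropped coronal CBCT image (path is illustrative)
img = Image.open("cropped/706788RA.jpg")

# Resize to a fixed input size for the network (224 x 224 is assumed here)
img = img.resize((224, 224))

# Apply PIL's built-in sharpening filter, as described above
img = img.filter(ImageFilter.SHARPEN)

img.save("preprocessed/706788RA.jpg")
```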

The custom dataset had a total of 454 images in two classes (227 normal images and 227 accessory ostium images). Among the 227 accessory ostium images, 118 were from the right side and 109 from the left side, whereas 114 normal images were obtained from the right side and 113 from the left side. To avoid overfitting of the model, a data augmentation technique was used to create 1260 images (630 normal and 630 accessory ostium) from 420 images in the training dataset. The remaining 34 images were kept for testing the model. The overall distribution of images for training, validation, and testing is presented in Table 1.

Table 1 The distribution of images for training, validation, and testing of the classification model.

The ImageDataGenerator class from the tensorflow.keras.preprocessing.image package was used to increase the volume of training data through transformations such as rotation, width and height shifts, and zoom of the images (Fig. 4).

Fig. 4

Showing a sample of the original image (A) and augmentation using rotation (B), width shift (C), and zoom (D).

The parameters of “ImageDataGenerator” are shown in Fig. 5. The rotation range was set at 7 (this parameter specifies the range of degrees [0-180] within which the images can be rotated). Rotation augments the model’s robustness to orientation changes25,26. The width shift range was set at 0.2, and the height shift range was set at 0.2 (these parameters specify the range of horizontal and vertical shifts [as a fraction of the image size] that can be applied to the images). Width shift enhances the model’s robustness to object positioning and reduces overfitting to specific object locations25,26. The zoom range was set at 0.2 (this parameter specifies the range of zoom factors [as a fraction of the original image size] that can be applied to the images). Object size variations reduce overfitting of the model to specific object sizes25,26. Horizontal flip was set to false, meaning it was disabled (this parameter specifies whether the images should be flipped horizontally [mirrored]). Fill mode was set to nearest (this parameter specifies how to fill the newly created pixels when applying transformations [rotation, shifting]); in our study, the nearest-neighbour interpolation method was used.

Fig. 5

Showing the parameters set in ‘ImageDataGenerator’ for augmentation of images.
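A minimal sketch of the generator configuration described above is shown below; the rescaling step, folder paths, and target size are assumptions for the example and are not taken from Fig. 5.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings as described above
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # assumed normalization to [0, 1]
    rotation_range=7,        # rotate within +/- 7 degrees
    width_shift_range=0.2,   # horizontal shift up to 20% of image width
    height_shift_range=0.2,  # vertical shift up to 20% of image height
    zoom_range=0.2,          # zoom in/out by up to 20%
    horizontal_flip=False,   # mirroring disabled
    fill_mode="nearest",     # fill new pixels by nearest-neighbour interpolation
)

# Stream augmented batches from the two class folders (paths are illustrative)
train_generator = train_datagen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="binary"
)
```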

Three pre-trained models, Visual Geometry Group of the University of Oxford-16 layers [VGG16], MobileNetV2, and ResNet101V2, were used as base models. They were chosen due to their performance in previous deep learning studies in the sinonasal region, well-established architectures, strong feature extraction capabilities, availability of pre-trained weights, and balance between accuracy and computational costs16. The performance of all models was analyzed (Table 2 and Fig. 6), and ResNet101V2 was selected as the base model.

Table 2 Performance parameters of pre-trained models VGG16, MobileNetV2 and ResNet101V2.
Fig. 6

Showing loss curves and accuracy curves of pretrained base models (VGG16, MobileNetV2).

We used a fine-tuning approach in our study (Fig. 7). In the initial steps, the input data were pre-processed and augmented. The base model was then frozen, and a classification layer was added over it. Training and evaluation of the base model were then carried out, followed by unfreezing some of the top layers. The whole model was retrained at a lower learning rate and then evaluated. The fine-tuning was repeated if an improvement in performance was observed, and this cycle continued until no further improvement in the performance metrics was noticed. The main idea of the proposed fine-tuning framework was to gradually increase the number of layers that are unfrozen and tuned. To avoid overfitting, L1 regularization, also known as Lasso regularization, was used; it adds a penalty term to the model’s loss function, which encourages the model to reduce the magnitude of its weights27. The model was trained with the following hyperparameters: 20 epochs, batch size 32, Adam optimizer with a learning rate of 1e-5, binary cross-entropy loss function, and a sigmoid activation function for the top classification layer.
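A minimal sketch of this setup is given below. It assumes a 224 × 224 input size, a single 128-unit dense layer with an L1 penalty of 0.001, 30 unfrozen top layers, and a val_generator built analogously to the train_generator above; these specific values and names are illustrative, as they are not reported in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Load ResNet101V2 pre-trained on ImageNet, without its top classifier
base_model = tf.keras.applications.ResNet101V2(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base_model.trainable = False  # freeze the base model initially

# Add a custom classification head with L1 (Lasso) regularization
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l1(0.001)),  # assumed size/penalty
    layers.Dense(1, activation="sigmoid"),  # binary output: AO vs. normal
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Initial training of the classification head (batch size is set in the generator)
model.fit(train_generator, validation_data=val_generator, epochs=20)

# Fine-tuning: unfreeze the top layers, keep the rest frozen,
# recompile at a low learning rate, and retrain
base_model.trainable = True
for layer in base_model.layers[:-30]:
    layer.trainable = False
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_generator, validation_data=val_generator, epochs=20)
```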

The analysis was performed on a local workstation running Ubuntu with an Intel(R) Core(TM) M-5Y71 CPU @ 1.20 GHz (up to 1.40 GHz) and 8 GB RAM. The system was built in Python using the Keras deep learning framework with TensorFlow as the backend.

Fig. 7

Flow chart showing the steps in the training and fine-tuning of the model used in the study. Training and evaluation of the base model were carried out, followed by unfreezing some of the top layers. The whole model was then retrained and evaluated. If the performance metrics improved, the cycle was continued; the cycle was stopped when no further improvement was exhibited by the model.

Statistical analysis

The inter-rater and intra-rater reliability were evaluated using Cohen’s kappa test. The performance of the model was evaluated in terms of accuracy, precision, F1-score, and area under the curve (AUC).
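For reference, the standard definitions of these metrics, expressed in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), are: precision = TP/(TP + FP), recall = TP/(TP + FN), and F1-score = 2 × (precision × recall)/(precision + recall).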

Results

In the present study, the examiners analyzed 856 CBCTs and found AOs in 207 scans with an estimated prevalence of 24.18%. Among the 207 CBCT scans, 20 showed bilateral AOs [40 AOs], 98 showed right unilateral AOs, and 89 showed left unilateral AOs. Therefore, the total number of AOs was 227.

The inter-rater reliability between the two examiners for the detection of AO was 0.87, indicating almost perfect agreement. The intra-rater reliability for examiners 1 and 2 was 0.91 and 0.95, respectively.

The evaluation of the performance metrics of the classification model revealed a training accuracy of 99% and a validation accuracy of 81% (Fig. 8). The training and validation loss are shown in Fig. 9. The formula used for calculating accuracy = (TN + TP)/(TP + FP + TN + FN) [TN = true negative, TP = true positive, FP = false positive, FN = false negative].

Fig. 8

Training accuracy and validation accuracy curves of the classification model ResNet101V2.

Fig. 9

Training loss and validation loss curves of the classification model ResNet101V2.

The test accuracy and test loss on the unseen dataset were 0.81 and 0.51, respectively (Fig. 10).

Fig. 10

Screenshot showing accuracy of the classification model on unseen data (data not used for training).

The classification report of the model (in terms of accuracy, precision, and F1-score) and the confusion matrix are shown in Figs. 11 and 12, respectively. The AUC value, found to be 0.87, is shown in Fig. 13.
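A minimal sketch of how such a report can be generated is shown below; the model and test_generator names follow the earlier sketches, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

# Predicted probabilities on the unseen test set
# (assumes the test generator was created with shuffle=False)
y_prob = model.predict(test_generator).ravel()
y_pred = (y_prob >= 0.5).astype(int)  # threshold the sigmoid output at 0.5
y_true = test_generator.classes       # 0 = normal, 1 = accessory ostium

print(classification_report(y_true, y_pred,
                            target_names=["normal", "accessory ostium"]))
print(confusion_matrix(y_true, y_pred))
print("AUC:", roc_auc_score(y_true, y_prob))
```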

Fig. 11

Screenshot showing classification report of the model ResNet101V2 used in our study. Macro-averaged (macro avg) indicates the metrics with equal contribution from all classes. Weighted-averaged (weighted avg) indicates the metrics with contributions from individual classes weighted by their size.

Fig. 12

Confusion matrix for the ResNet101V2 classifier showing the actual and predicted values.

Fig. 13

ROC-AUC curve of each class. Class 0 indicates normal images and Class 1 indicates images with accessory ostium.

Discussion

Studies have revealed that the prevalence of AOs is as high as 30% in patients with chronic sinusitis and 10–20% in healthy individuals, suggesting a strong link between the existence of AO and sinus pathologies6,28,29. In the present study, the prevalence of AOs in the CBCT scans of healthy individuals was estimated at 24.18%.

Some researchers believe that AO leads to the re-entry of mucus that has drained out of the maxillary sinus through the primary ostium30. This complication associated with AO is known as “two-hole syndrome”31. AO-linked mucus recirculation has been associated with chronic sinusitis32. The close proximity of the sinus to the posterior teeth makes sinus pathology important to dental professionals.

In the present study, the inter-rater agreement for the detection of AO was 0.87. A previous study in the same region, using two observers for the detection of AO in CBCT scans, reported a similar inter-rater agreement (0.83)10. However, a slightly lower inter-rater agreement (0.67) was reported in a study conducted at the University of Hong Kong1.

In the present study, we used the ImageDataGenerator to augment the training data through rotation, shifting, and zooming of the images. A recently published study using CBCT images in different planes of the maxillofacial region also used ImageDataGenerator with the following settings: rotation of 15 degrees, height and width shift of 0.1, and zoom by a factor of 0.533.

In general radiology, ImageDataGenerator has been used to generate a large number of chest X-ray images for the detection of COVID-19 using deep convolutional neural networks (DCNN)34. Flipping, rotation, and translation are the common methods used for augmentation of CT images35. Similar augmentation methods were applied by the ImageDataGenerator used in our study to increase the data pool.

Recent studies have revealed that deep learning models exhibit good performance metrics in image classification and segmentation36,37. In the present study, we used the ResNet-101V2 classification model, which achieved a test accuracy of 81%. Recently published studies revealed that ResNet-101 showed higher accuracy than ResNet-50 and ResNet-152 in the classification of chest X-rays for COVID-19-related changes38. In another study, ResNet showed the best performance in the classification of dental radiographs39. Another recent study using ResNet-101V2 for the detection of furcal bone loss showed a test accuracy of 91%40. The validation accuracy of the classification model used in our study was 81%, and the probability of misclassification was 19% (Fig. 14).

Fig. 14

Screenshot of the model classification showing an example of misclassification. An actual abnormal image (with AO) was predicted as normal (without AO), as highlighted by the yellow circle.

A probable factor for misclassification in our model is the relatively small dataset. Another reason could be the wide variations in the anatomical position of the lateral nasal wall and the AO in the coronal CBCT images41,42,43.

Some recently published studies have used ResNet in the radiographic evaluation of the sino-nasal region16,44. A pretrained ResNet along with a Swin transformer showed 99% accuracy in detecting boundaries of maxillary sinus pathologies in CBCT scans44. Similarly, another study on the classification of sinus pathologies in CT scans using ResNet showed an accuracy of 95%45.

In the present study, the ResNet-101V2 classifier showed an AUC value of 0.87 in the detection of AO. Version 2.00 of ResNet101 uses pre-activation residual units, in which batch normalization and the activation function are applied before the weight layers, leading to better generalization compared to version 1.0046. Version 2.00 also produces a more normalized and regularized output signal, leading to reduced overfitting46.
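To illustrate the idea, a simplified sketch of a pre-activation residual unit is shown below; the actual ResNet101V2 units use bottleneck (1 × 1, 3 × 3, 1 × 1) convolutions, so this two-convolution form is a didactic simplification rather than the exact architecture.

```python
from tensorflow.keras import layers

def preact_residual_block(x, filters):
    """Pre-activation residual unit (ResNet V2 style): batch normalization
    and ReLU are applied before each convolution, unlike the original
    ResNet, where they follow the convolution."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Identity shortcut; assumes x already has `filters` channels
    return layers.Add()([shortcut, y])
```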

Though there are no studies exploring the accuracy of deep learning models in the detection of AO, one recent study used ResNet-101V2 for the detection of nasal septal deviation in coronal CBCT images16. The AUC value of the classifier model used in that study (0.83) was slightly lower than in our study16. Slightly higher AUC values (0.92) were reported when ResNet-101 was used to detect sinusitis in paranasal sinus [PNS] radiographs47. The mild variation in the AUC could be due to differences in the region of interest [ROI] and the quantity of training data in these studies.

We carried out fine-tuning of the classification model in our study. Fine-tuning improves the speed and computing efficiency of the AI model48. Since our dataset was comparatively small, we used the transfer learning technique, in which a model initially trained on a large dataset has its learned features re-adapted for training on a different, smaller dataset48.

In the present study, we used L1 regularization, also known as Least Absolute Shrinkage and Selection Operator (LASSO) regularization, to reduce overfitting. A recently published study used L1 regularization in ResNet to construct a compact model for reading ECG signals49.
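For reference, the standard form of the L1 penalty adds the sum of the absolute values of the weights, scaled by a regularization factor λ, to the loss: total loss = original loss + λ × Σ|wi|, where wi are the model weights; a larger λ shrinks more weights toward zero.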

Most of the recently published research on the application of deep learning models in the sino-nasal region focuses on the classification of sinus pathologies, detection of a deviated nasal septum, and detection of concha15,16,50. We have attempted to pioneer AI-based detection of AO in CBCT scans and were able to develop a model with an accuracy of 0.81 and an AUC of 0.87. Using the present study as a foundation, we can further develop our model to detect AO in three-dimensional CT scans.

However, there are some limitations to our study. Firstly, we used two-dimensional cropped coronal sections from the CBCT scans rather than 3D CBCT volumes. The major challenges in developing a classification model for 3D CBCT scans are (1) complex 3D anatomical representations, (2) higher computing resource requirements, and (3) higher computational costs51.

The other limitation is the small dataset, owing to the limited availability of large-FOV CBCT scans, which are not frequently made in a dental imaging setup. The main disadvantage of a small dataset is overfitting52. Overfitting reduces the generalizability and transferability of the classification model, causing poor performance on newer datasets52. The generalizability of our model is further limited because our data were obtained from a single hospital. Future studies involving three-dimensional CBCT scans and larger datasets from different hospitals can be carried out with different deep learning models to further support our findings.

Conclusion

ResNet-101V2 showed good accuracy in the detection of AO in coronal CBCT images. The findings of the present study can provide a base for future AI-based imaging studies on AO and other sino-nasal variations.