CystNet: An AI driven model for PCOS detection using multilevel thresholding of ultrasound images

Moral, Poonam; Mustafi, Debjani; Mustafi, Abhijit; Sahana, Sudip Kumar

doi:10.1038/s41598-024-75964-3

Article
Open access
Published: 23 October 2024

Scientific Reports volume 14, Article number: 25012 (2024)

Abstract

Polycystic Ovary Syndrome (PCOS) is a widespread endocrine disorder affecting women of reproductive age, characterized by excess androgens and a variety of associated symptoms, including acne, alopecia, and hirsutism. It involves the presence of multiple immature follicles in the ovaries, which can disrupt normal ovulation and lead to hormonal imbalances and associated health complications. Routine diagnostic methods rely on manual interpretation of ultrasound (US) images and clinical assessments, which are time-consuming and prone to errors. Therefore, implementing an automated system is essential for streamlining the diagnostic process and enhancing accuracy. By automatically analyzing follicle characteristics and other relevant features, this research aims to facilitate timely intervention and reduce the burden on healthcare professionals. The present study proposes an advanced automated system for detecting and classifying PCOS from ultrasound images. Leveraging Artificial Intelligence (AI) based techniques, the system examines affected and unaffected cases to enhance diagnostic accuracy. Pre-processing of the input images incorporates techniques such as image resizing, normalization, augmentation, the Watershed technique, and multilevel thresholding for precise image segmentation. Feature extraction is performed by the proposed CystNet technique, followed by PCOS classification using both fully connected layers with 5-fold cross-validation and traditional machine learning classifiers. The performance of the model is evaluated using a range of metrics, including AUC score, accuracy, specificity, precision, F1-score, recall, and loss, along with a detailed confusion matrix analysis. The model achieved an accuracy of $96.54\%$ with a fully connected classification layer under 5-fold cross-validation. 
Additionally, it has achieved an accuracy of $97.75\%$ when employing an ensemble ML classifier. This proposed approach could be suggested for predicting PCOS or similar diseases using datasets that exhibit multimodal characteristics.

Introduction

Polycystic Ovary Syndrome is a disorder that affects a large number of women worldwide. It causes ovaries to produce a high number of immature eggs, which turn into cysts, leading to enlarged ovaries and increased androgen production^1,2. PCOS was first identified by Stein and Leventhal in 1935³, characterized by symptoms such as hirsutism, chronic amenorrhea, anovulation, and enlarged ovarian cysts etc. Although Polycystic Ovarian Syndrome was recognized earlier, it wasn’t officially included in the International Classification of Diseases (ICD-10) by the World Health Organization under the code ’E28.2’ until 1990⁴. In 2009, the definition of PCOS was revised to include hyperandrogenism. Hyperandrogenism and polycystic ovary morphology (PCOM)⁵ are fundamental aspects of PCOS, a condition of ovarian dysfunction⁶. PCOM is defined by an ovarian volume of 10 ml or 25 follicles per ovary with 8 MHz transducer frequencies. Among the innumerable health challenges faced by women in the 21st century⁷, PCOS has emerged as a prevalent and significant concern impacting women of reproductive age (18–44). Approximately one in fifteen women worldwide are affected by PCOS^8,9, with prevalence rates in India ranging from 3.7 to 22.5%, higher in urban areas than rural areas¹⁰, possibly due to lifestyle factors and stress¹¹. Estimates suggest that approximately 120 million women, or 4.4% of the global female population, are affected by PCOS. In India, Ramamoorthy et al.¹² found that 10% of young girls suffer from PCOS, which is associated with a high rate of miscarriages and infertility cases¹⁰. 
It is linked to various psychological and metabolic issues, including hirsutism, irregular menstrual cycles, sudden weight gain, thyroid issues, type 2 diabetes, depression, excessive hair growth, alopecia, oily skin, acne, high blood pressure, sexual dissatisfaction^2,13,14, and metabolic issues such as hypertension, hyperinsulinemia, abdominal obesity, and dyslipidemia, all of which diminish the quality of life. A family history of PCOS significantly increases the risk, with 24%-32% of patients likely to develop the condition¹⁵. It can also lead to cancers in the breast¹⁶ or uterus during reproductive age. Low follicle-stimulating hormone (FSH) and luteinizing hormone (LH) levels combined with high prolactin levels prevent follicle growth and maturation in ovaries afflicted by PCOS. Normally, with the right levels of FSH and LH, a single follicle grows to about 20 mm in diameter and is ready for ovulation¹⁷. In polycystic ovaries, follicles stop growing at 5–7 mm during ovulation and remain immature. These immature follicles secrete a hormone that thickens the uterine lining, leading to spotting or excessive bleeding due to prolonged estrogen production.

However, early diagnosis through standardized approaches can lead to effective management with long-term, symptom-focused treatments. Because the diagnostic criteria vary, diagnosing PCOS is challenging, but the Rotterdam criteria¹⁸ are a widely accepted method. According to the Rotterdam Consensus Criteria (2003)¹⁹, PCOS is diagnosed if at least two of the following criteria are met: oligo-anovulation, clinical or biochemical signs of excess androgen activity, and polycystic ovaries, i.e. ovaries that have a volume of at least 10 $\text {cm}^{3}$ or contain 10 or more follicles. Ultrasound diagnostics use frequencies between 2 and 15 MHz, with sound waves sent to the object and reflected back as electrical pulses displayed as grayscale images. The high noise and low contrast of US images make it difficult to detect polycystic ovary follicles accurately; these follicles appear spherical and clustered in a necklace-like pattern. In contrast, in 2006 the Androgen Excess and PCOS Society required the presence of hyperandrogenemia, ovulation disorder, and PCOM for diagnosis⁴. Each diagnostic criterion has distinct clinical implications, such as skin manifestations from excessive androgen, endometrial hyperplasia, infertility from ovulation disorders, and risk of ovarian hyperstimulation syndrome (OHSS) from PCOM. Ultrasound imaging is a primary tool for early detection of PCOS²⁰, providing vital information on the number, volume, and position of follicles. Ultrasound is preferred over CT and MRI due to its low cost, accessibility, safety, and real-time results^21,22, but its interpretation is time-consuming, prone to human error, and reliant on the availability of skilled radiologists, particularly in less developed regions. Consequently, many women remain undiagnosed and untreated. 
These challenges underscore the need for intelligent computer-aided systems to support gynecologists. Traditional methods, which involve image processing and machine learning, are complex and less effective, while deep learning methods, despite their accuracy and their ability to overcome the limitations of manual examination, are computationally demanding^1,23,24. An integrated machine learning approach could improve diagnostic performance and reduce the computational complexity of identifying PCOS from ultrasound images²⁵. AI-based approaches are also showing promising results in other ultrasound imaging²⁶ applications, such as thyroid^27,28,29 and breast cancer detection³⁰, further enhancing diagnostic accuracy.

Unlike traditional approaches, Haider et al.³¹ proposed a method that incorporates full contextual information surrounding the face from the provided dataset. Leveraging InceptionNet V3 for deep feature extraction, they employed attention mechanisms to refine these features. Subsequently, the features were passed through transformer blocks and multi-layer perceptron networks to predict various emotional parameters simultaneously. Their model excelled in predicting arousal, valence, emotional expression classification, and action unit estimation, achieving significant performance on the MTL Challenge validation dataset. Aziz et al.³² introduced IVNet, a novel approach for real-time breast cancer diagnosis using histopathological images. Transfer learning with CNN models like ResNet50, VGG16, etc., aims for feature extraction and accurate classification into grades 1, 2, and 3. IVNet achieves a commendable classification rate. Validation and statistical analysis confirm its efficacy. A user-friendly GUI aids real-time cell tracking, facilitating treatment planning. IVNet serves as a reliable decision support system for clinicians and pathologists, specially in resource-constrained settings. The study conducted by Kriti et al.³³ evaluated the performance of four pre-trained CNNs named ResNet-18, VGG-19, GoogLeNet, and SqueezeNet for classifying breast tumors in ultrasound images. The proposed CAD system uses GoogLeNet and a convolutional autoencoder for deep feature extraction, followed by correlation-based and fuzzy feature selection, with the final classification done using an ANFC-LH classifier. This system aids radiologists in diagnosis and serves as a training tool for radiology students. A smart feature extraction method based on Convolutional Autoencoders for semiconductor manufacturing was utilized by Maggipinto et al.³⁴, particularly focusing on predicting etch rates using Optical Emission Spectroscopy (OES) data. 
Traditional Machine Learning algorithms struggle with the complexity of OES data, prompting the adoption of Convolutional Neural Networks (CNNs) for feature extraction. The proposed method surpasses conventional techniques like PCA and statistical moments, offering precise etch rate predictions without domain-specific knowledge. Multipath Convolutional Neural Network (M-CNN) for feature extraction and Machine Learning (ML) classifiers for severity classification of Diabetic Retinopathy (DR) using Fundus images was employed by Gayathri et al.³⁵. Evaluation is conducted across multiple databases using Support Vector Machine, Random Forest, and J48 classifiers. Results indicate that the M-CNN network combined with the J48 model performs optimally. The proposed technique offers a promising solution for automated DR diagnosis, with potential applications in predicting other retinal diseases, thus improving retinal healthcare monitoring.

With about 70% of PCOS cases undiagnosed worldwide, Gopalakrishnan et al.³⁶ presented an automated PCOS detection and classification system using ultrasound images. The system preprocesses images with a Gaussian low pass filter, segments them using multilevel thresholding, and extracts features with the GIST-MDR technique, achieving 93.82% accuracy with the Support Vector Machine (SVM) classifier. Alamoudi et al.³⁷ conducted a study combining ovarian ultrasound images and clinical data, employing a deep learning model for PCOM detection achieving 84.81% accuracy, and fusion models combining image and clinical data with 82.46% accuracy. The study underscores the importance of clinical data in PCOS detection and highlights the potential of automated models to accelerate diagnosis and mitigate associated risks. An Improved Fruit Fly Optimization-based Artificial Neural Network (IFFOA-ANN) for classifying normal and abnormal follicles in ultrasound images was introduced by Nilofer et al.³⁸, enhancing previous adaptive k-means clustering methods. The technique improves image quality, segments follicles, and extracts features using statistical Grey Level Co-occurrence Matrix (GLCM), with the ANN trained on these features. The IFFOA-ANN method achieves 97.5% accuracy, offering a reliable automated classification system that improves the diagnosis and treatment of infertility. Suha³⁹ proposed a hybrid machine-learning method for PCOS detection using 594 ovary ultrasound images. It employs a CNN with transfer learning for feature extraction, specifically the VGGNet16 model, followed by a stacking ensemble model with XGBoost as the meta-learner. This approach effectively combines deep learning and traditional machine learning, outperforming existing techniques in both accuracy and execution time. Maheswari et al.⁴⁰ conducted a study for PCOS detection using ultrasound images, emphasizing the importance of image processing in improving computer system performance. 
It employs adaptive histogram equalization to remove noise and extracts features relevant to PCOS. A novel approach called Furious Flies is proposed for feature identification, addressing the drawbacks of conventional algorithms. Three stages (attraction-based ROI selection, follicle selection, and follicle identification) are employed. Classification is performed using a Naive Bayesian classifier and artificial neural networks, enabling early detection of PCOS. Despite these advancements, several gaps remain. The limited dataset sizes may not represent the full spectrum of PCOS variations, potentially affecting model robustness and generalizability. There is often no discussion of potential biases in the datasets used for training, which could impact model performance on unseen data. Many studies focus on diagnosing specific conditions like PCOM but do not expand to classify other types of ovarian cysts, limiting their clinical utility. Additionally, manual diagnosis of polycystic ovary morphology in ultrasound images by specialists introduces subjectivity and variability, impacting diagnostic accuracy and consistency. Table 1 presents a comprehensive summary of the literature study and its key findings. Addressing these gaps is crucial for developing more robust, generalizable, and clinically useful diagnostic systems. Within the framework of this study, we have introduced several pivotal contributions to advance the domain of deep learning-based image analysis:

1. Image Pre-processing Techniques for Improved Diagnostic Precision: The study introduces a comprehensive approach to pre-processing and segmenting ultrasound images for PCOS detection, incorporating multiple techniques such as image resizing, normalization, augmentation, the Watershed technique, multilevel thresholding, and morphological processing. This process supports precise identification of follicles and cysts and contributes to the overall diagnostic accuracy, reducing manual errors and time consumption.
2. Innovative Feature Extraction using the CystNet Hybrid Model: The proposed CystNet technique for feature extraction is a hybrid model that integrates InceptionNet V3^41,42 and Convolutional Autoencoder^43,44 deep learning approaches. By leveraging the strengths of both InceptionNet V3, known for its efficiency in handling varied spatial hierarchies, and Autoencoders, which excel in unsupervised learning, CystNet effectively extracts and highlights critical features from ultrasound images. This dual approach enhances the ability of the model to distinguish subtle differences between affected and unaffected cases, thereby improving the reliability and robustness of the diagnostic process.
3. High-Accuracy Classification Using DenseNet and ML Classifiers: The classification phase of the proposed system employs a dense layer, i.e. a fully connected (FC) layer, alongside traditional machine learning classifiers with rigorous 5-fold cross-validation, demonstrating exceptional diagnostic performance.

The remainder of this study is structured as follows. Section 2 details the research methodology, encompassing dataset description, image segmentation, feature extraction, and PCOS classification. Section 3 then analyzes the experimental results. Finally, Section 4 summarizes the key findings of the study and outlines potential future research directions.

Table 1 Analysis of literature findings

Methodology

This section comprises a comprehensive overview of the dataset utilized for training and testing the diagnosis model, followed by image preprocessing which includes normalization, augmentation and segmentation. Moreover, this section discusses the proposed model for diagnosing PCOS using ultrasound images and classifying PCOS and non-PCOS ovaries. The overall framework of this study is visually presented in Fig. 1, outlining the various phases involved in the research.

Dataset description

A dataset obtained from Kaggle⁵² is utilized for training and testing our models, initially comprising 1,924 images for training and 1,932 for testing. However, due to significant overlap between these sets, the test set is discarded, and the training set is utilized exclusively. This set is then split into new training and test sets. It includes ultrasound images labelled as ’INFECTED’ (781 images with cystic ovaries) and ’NOT INFECTED’ (1,143 images with healthy ovaries). The given ’INFECTED’ and ’NOT INFECTED’ classes uniquely identify individuals suffering from PCOS and those who are not, respectively, making this classification method highly relevant for real-time medical systems in accurately diagnosing PCOS. Figure 2 shows sample images from both categories.

This study has also incorporated the PCOSGen Dataset⁵³, gathered from various online sources. This dataset includes 3,200 healthy and 1,468 unhealthy samples, divided into training and test sets, which have been medically annotated by a gynaecologist in New Delhi, India. Additionally, the Multi-Modality Ovarian Tumor Ultrasound (MMOTU) image dataset⁵⁴ is utilized, containing 1,639 US images from 294 patients.

Dataset preprocessing

The data preprocessing step is crucial for preparing the ultrasound images for training and testing the diagnosis model. It involves several sub-processes, namely image resizing and normalization, image augmentation, and image segmentation, which are described below:

Image resizing and normalization

All ultrasound images are resized to a uniform dimension of 224$\times$224 pixels. This standardization ensures that the input dimensions are consistent across all images, which is crucial for the convolutional neural network (CNN) to process them efficiently. Uniform resizing also helps in reducing computational load and memory requirements during model training. After resizing, the pixel values are normalized using min-max normalization⁵⁵. This involves scaling each pixel value $I$ to a range of 0 to 1 using the formula:

$$\begin{aligned} I_{\text {norm}} = \frac{I - I_{\text {min}}}{I_{\text {max}} - I_{\text {min}}} \end{aligned}$$

(1)

where $I$ is the original pixel value, $I_{\text {min}}$ is the minimum intensity value, and $I_{\text {max}}$ is the maximum intensity value (usually 255 for 8-bit images). This scaling reduces the variance of the inputs and makes model training faster and more stable. It also ensures that the pixel values are on a similar scale, preventing any single feature from dominating the learning process.
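As an illustration of Eq. (1), the following minimal sketch normalizes a small grayscale patch stored as nested Python lists. The function name `min_max_normalize` is hypothetical, and taking $I_{\text {min}}$ and $I_{\text {max}}$ from the image itself (rather than fixing them at 0 and 255) is an assumption of this sketch, not a detail stated in the study. In practice an array library would typically be used instead of plain lists.

```python
def min_max_normalize(image):
    """Scale pixel values to [0, 1] following Eq. (1).

    Assumption: I_min and I_max are taken from the image itself.
    """
    flat = [p for row in image for p in row]
    i_min, i_max = min(flat), max(flat)
    if i_max == i_min:
        # Constant image: avoid division by zero (a choice made for this sketch).
        return [[0.0 for _ in row] for row in image]
    scale = i_max - i_min
    return [[(p - i_min) / scale for p in row] for row in image]


patch = [[0, 64],
         [128, 255]]
normalized = min_max_normalize(patch)
```

The darkest pixel maps to 0, the brightest to 1, and every other value falls linearly in between.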

Image augmentation

Image augmentation⁵⁶ is essential for enhancing the diversity of training data, thereby improving the ability of the model to generalize and perform well on unseen samples. It includes applying various transformations to the original images, such as rotation, flipping, zooming, and shifting. These operations introduce variations in the dataset, simulating real-world scenarios and ensuring robustness in the predictions of the model. Figure 3 depicts the workings of augmentation operations visually and these augmentation operations are discussed below to provide a comprehensive overview of the transformations applied to the images during the preprocessing stage:

Rotation: Each image $I$ is rotated by an angle $\theta$ randomly selected from the range $[-90^\circ , 90^\circ ]$. This transformation is represented mathematically as:
$$\begin{aligned} I_{\text {rotated}} = R(\theta ) \cdot I \end{aligned}$$
(2)
where $R(\theta )$ is the rotation matrix for angle $\theta$.
Flipping: Horizontal and vertical flipping are applied, represented as:
$$\begin{aligned} I_{\text {flipped\_horizontal}} = F_h \cdot I \end{aligned}$$
(3)
$$\begin{aligned} I_{\text {flipped\_vertical}} = F_v \cdot I \end{aligned}$$
(4)
where $F_h$ and $F_v$ are the horizontal and vertical flip operators, respectively. This effectively doubles the dataset size.
Zooming: Zoom transformations are applied by scaling the image by a factor $z$ randomly chosen from a range $[Z_{\text {min}}, Z_{\text {max}}]$:
$$\begin{aligned} I_{\text {zoomed}} = Z(z) \cdot I \end{aligned}$$
(5)
where $Z(z)$ is the zoom transformation matrix.
Shifting: Random width $\Delta _x$ and height $\Delta _y$ shifts, up to 20% of the total dimensions, are applied:
$$\begin{aligned} I_{\text {shifted}} = T(\Delta _x, \Delta _y) \cdot I \end{aligned}$$
(6)
where $T(\Delta _x, \Delta _y)$ is the translation matrix for shifts $\Delta _x$ and $\Delta _y$.
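The flip and shift operators above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the function names are hypothetical, vacated pixels are filled with zeros (the study does not state a fill mode), and rotation and zoom are omitted because they require interpolation, which an image library would normally provide.

```python
import random


def flip_horizontal(image):
    """Mirror the image left-to-right (F_h in Eq. 3)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top-to-bottom (F_v in Eq. 4)."""
    return [row[:] for row in image[::-1]]


def shift(image, dx, dy, fill=0):
    """Translate by dx columns and dy rows (T in Eq. 6).

    Pixels moved outside the frame are dropped; vacated pixels get `fill`
    (zero fill is an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def random_shift(image, max_fraction=0.2, rng=random):
    """Shift by a random offset of at most `max_fraction` of each dimension."""
    h, w = len(image), len(image[0])
    max_dx, max_dy = int(w * max_fraction), int(h * max_fraction)
    return shift(image, rng.randint(-max_dx, max_dx), rng.randint(-max_dy, max_dy))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
mirrored = flip_horizontal(image)
shifted = shift(image, dx=1, dy=0)
```

Each transformed copy keeps the label of its source image, which is what allows augmentation to enlarge the training set without new annotation.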

Image segmentation

Image segmentation is a crucial process in image analysis, where an image is partitioned into multiple segments or regions to simplify and change the representation of the image into an interpretable format. It is particularly important for medical imaging, as it allows for the precise identification and isolation of relevant structures, such as identifying cysts in ultrasound images for diagnosing PCOS. In this study, the segmentation process involves several steps: noise filtering, contrast enhancement, binarization, multilevel thresholding, and morphological processing. The process used for segmentation is visualized step-by-step in Fig. 4.

Noise Filtering: Noise filtering³⁹ aims to smooth out irregularities in the image caused by noise. One common technique is Gaussian blur, which applies a convolution operation between the image and a Gaussian kernel. Mathematically, this is represented as:
$$\begin{aligned} I_{\text {blurred}}(x,y) = \sum _{i=-k}^{k} \sum _{j=-k}^{k} I(x+i, y+j) \times G(i,j) \end{aligned}$$
(7)
where $i$ and $j$ are the horizontal and vertical distances from the center of the kernel, $I_{\text {blurred}}(x,y)$ is the blurred pixel value at coordinates $(x, y)$, $I(x+i, y+j)$ are the neighboring pixel values in the original image, and $G(i, j)$ are the Gaussian kernel values, given by:
$$\begin{aligned} G(i,j) = \frac{1}{2\pi \sigma ^2} \exp \left( -\frac{i^2 + j^2}{2\sigma ^2}\right) \end{aligned}$$
(8)
The Gaussian filter has a mean of $\mu = 0$ and a standard deviation of $\sigma = 0.85$; these values are fixed during preprocessing to control the amount of smoothing applied.
Contrast Enhancement: Contrast enhancement³⁹ techniques aim to improve the visual quality of an image by increasing the intensity differences between pixels. Histogram equalization is a common method that redistributes pixel intensities to achieve a more uniform histogram. Mathematically, the transformation function T is computed from the image histogram H and its cumulative distribution function CDF:
$$\begin{aligned} T(i) = \frac{L-1}{N \times M} \sum _{j=0}^{i} H(j) \end{aligned}$$
(9)
where $T(i)$ is the transformed intensity value, $L$ is the number of intensity levels, and $N \times M$ is the total number of pixels in the image.
Watershed Technique: The watershed algorithm⁴⁰ interprets the intensity gradient of the image as a topographic surface. It computes the gradient magnitude $\Vert \nabla I\Vert$ of the image, in which abrupt changes in intensity indicate potential object boundaries. Regions of low gradient act as catchment basins, and the ridges of high gradient that separate them are taken as the segment boundaries.
Binarization: This operation simplifies the image by converting grayscale intensity values to binary values, effectively separating the foreground from the background. Otsu’s thresholding³⁹ is a widely used technique that automatically computes the optimal threshold value $T$ separating the two classes of pixel intensities. It does so by maximizing the between-class variance $\sigma _B^2$, which is equivalent to minimizing the intra-class variance. Mathematically, it can be expressed as:
$$\begin{aligned} T = \arg \max _t (\sigma _B^2(t)) \end{aligned}$$
(10)
In this equation, $\arg \max$ returns the value of $t$ that maximizes the between-class variance $\sigma _B^2$.
Multilevel Thresholding: Multilevel thresholding³⁶ partitions an image into multiple segments by applying thresholding operations with different threshold values $T_1, T_2, \ldots , T_n$. This technique is useful for segmenting images with complex intensity distributions. For a grayscale image, this process can be represented mathematically as follows:
$$\begin{aligned} \text {Segment}_i = \{p \in \text {Image} \mid T_{i-1} \le \text {Intensity}(p) < T_i \} \end{aligned}$$
(11)
where $\text {Segment}_i$ is the region segmented by the $i^{th}$ threshold, $T_{i-1}$ and $T_i$ are consecutive threshold values, and $\text {Intensity}(p)$ denotes the intensity of the pixel $p$ in the image.
Morphological Processing: Morphological operations³⁹ manipulate the shape and structure of objects in the image. Erosion, dilation, opening, and closing are common morphological operations. Mathematically, these operations are defined using structuring elements (kernels) and applied to binary images to modify object boundaries and remove noise. For instance, erosion E and dilation D can be expressed as:
$$\begin{aligned} E(A) = \bigcap _{i=1}^{n} (A \ominus B_i) \end{aligned}$$
(12)
$$\begin{aligned} D(A) = \bigcup _{i=1}^{n} (A \oplus B_i) \end{aligned}$$
(13)
where $A$ is the input binary image, $B_i$ are the structuring elements, and $\oplus$ and $\ominus$ denote the dilation and erosion operations, respectively.
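Several of the segmentation steps above can be sketched in plain Python on a toy image. The sketch covers Gaussian smoothing (Eqs. 7 and 8), Otsu binarization (Eq. 10), multilevel thresholding (Eq. 11), and erosion and dilation with a single square structuring element; histogram equalization and the watershed step are omitted. All function names are hypothetical. The kernel renormalization, the edge clamping, the 3$\times$3 structuring element, and the thresholds 50 and 150 are assumptions made for illustration, not settings reported in the study.

```python
import math
from bisect import bisect_right


def gaussian_kernel(k=1, sigma=0.85):
    """(2k+1) x (2k+1) kernel from Eq. (8), renormalized to sum to 1 (assumption)."""
    norm = 2 * math.pi * sigma ** 2
    raw = [[math.exp(-(i * i + j * j) / (2 * sigma ** 2)) / norm
            for j in range(-k, k + 1)]
           for i in range(-k, k + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def gaussian_blur(image, k=1, sigma=0.85):
    """Eq. (7); neighbours outside the image are clamped to the nearest edge pixel."""
    g = gaussian_kernel(k, sigma)
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for i in range(-k, k + 1):
                for j in range(-k, k + 1):
                    yy = min(max(y + i, 0), h - 1)
                    xx = min(max(x + j, 0), w - 1)
                    out[y][x] += image[yy][xx] * g[i + k][j + k]
    return out


def otsu_threshold(image, levels=256):
    """Return the integer t maximizing the between-class variance (Eq. 10)."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2  # proportional to sigma_B^2
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t


def multilevel_segment(image, thresholds):
    """Label each pixel with the index of its intensity band (Eq. 11)."""
    ts = sorted(thresholds)
    return [[bisect_right(ts, p) for p in row] for row in image]


def _morph(binary, size, combine):
    """Apply `combine` (min or max) over a size x size window; outside pixels count as 0."""
    h, w, r = len(binary), len(binary[0]), size // 2
    return [[combine(binary[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                     for yy in range(y - r, y + r + 1)
                     for xx in range(x - r, x + r + 1))
             for x in range(w)]
            for y in range(h)]


def erode(binary, size=3):
    return _morph(binary, size, min)


def dilate(binary, size=3):
    return _morph(binary, size, max)


image = [[10, 12, 11, 200],
         [11, 10, 210, 205],
         [12, 198, 202, 201],
         [10, 11, 12, 10]]
t = otsu_threshold(image)
binary = [[1 if p > t else 0 for p in row] for row in image]
labels = multilevel_segment(image, [50, 150])
```

Applying `dilate(erode(mask))` to a binary mask gives a morphological opening, which removes isolated foreground specks smaller than the structuring element while largely preserving bigger regions.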

Dataset division

The dataset from Kaggle is divided into two sets: 70% for training and 30% for testing. The training set is utilized to train the classifier, allowing it to learn patterns and features indicative of PCOS, while the testing set evaluates the performance of the model and ensures its ability to generalize and accurately diagnose PCOS on unseen data.
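A 70/30 division of this kind can be sketched as follows. The helper name `train_test_split`, the fixed seed, and the use of a plain random shuffle are assumptions of this sketch; the study does not state whether the split was stratified by class.

```python
import random


def train_test_split(samples, test_fraction=0.3, seed=42):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 1,924 images remain after the original test set is discarded.
image_ids = list(range(1924))
train_ids, test_ids = train_test_split(image_ids)
```

With 1,924 images this yields 1,347 training and 577 test samples, and no image appears in both sets.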

Feature extraction

The process of feature extraction transforms raw data into a set of informative attributes which reduces the complexity of the data by focusing on the most relevant aspects, enhancing the performance and efficiency of models. InceptionNet V3 and Convolutional Autoencoder have been used for feature extraction in our approach. These methods integrate to form a comprehensive model named CystNet, which combines the strengths of both techniques to complete the feature extraction process, represented in Fig. 5. Each method is described in detail below:

Feature extraction using InceptionNetV3

Unlike traditional CNN models that use specific receptive field sizes in different layers for feature extraction, the Inception module employs kernels of various sizes (1$\times$1, 3$\times$3, and 5$\times$5) in parallel⁴¹. These parallel features are then depth-wise stacked to produce the output of the module. Additionally, a 3$\times$3 max-pooled version of the input is included in the feature stack. This combined output delivers rich, multi-perspective feature maps to the subsequent convolutional layer, contributing to the effectiveness of the Inception module in applications such as medical imaging, in our case with PCOS ultrasound images.

InceptionNet V3⁴², a variant of the Inception architecture, has been employed in this study for feature extraction. The model retains three Inception modules and one grid size reduction block, followed by Max Pooling and Global Average Pooling layers to decrease the output dimensions. It is a 48-layer deep convolutional neural network capable of recognizing intricate patterns and features in medical images. It employs convolution kernel splitting to break down large convolutions into smaller ones, such as dividing a 3$\times$3 convolution into 3$\times$1 and 1$\times$3 convolutions. The implementation provided by Keras, pre-trained on ImageNet, is used. After convolution, the channels are aggregated and non-linear fusion is performed, enhancing the adaptability of the network to different scales, preventing overfitting, and enhancing spatial feature extraction. With input matrix $A_X$ given in equation 14, the convolutional operation followed by ReLU activation, max-pooling, and global average pooling is represented as follows:

$$\begin{aligned} A_X = \left[ \begin{array}{ccc} P_{11} & \cdots & P_{1y} \\ P_{21} & \cdots & P_{2y} \\ \vdots & \ddots & \vdots \\ P_{x1} & \cdots & P_{xy} \end{array}\right] \times \left[ \begin{array}{ccc} Q_{11} & \cdots & Q_{1y} \\ Q_{21} & \cdots & Q_{2y} \\ \vdots & \ddots & \vdots \\ Q_{x1} & \cdots & Q_{xy} \end{array}\right] \end{aligned}$$

(14)

$$\begin{aligned} A_X = \sum _{i=0}^{x-1} \sum _{j=0}^{y-1} P_{(x-i),(y-j)} Q_{(i+1),(j+1)} \end{aligned}$$

(15)

$$\begin{aligned} \text {ReLU}(x) = \max (0,x) \end{aligned}$$

(16)

$$\begin{aligned} X^{(i)} = \text {ReLU}(W^{(i)} * X^{(i-1)} + b^{(i)}) \end{aligned}$$

(17)

$$\begin{aligned} \text {MaxPooling}(X)_{ij} = \max _{(m,n) \in \mathcal {R}_{ij}} X_{mn} \end{aligned}$$

(18)

$$\begin{aligned} X_{\text {pooled}}^{(i)} = \text {MaxPooling}(X^{(i)}) \end{aligned}$$

(19)

$$\begin{aligned} \text {GlobalAveragePooling}(X) = \frac{1}{M \times N} \sum _{i=1}^{M} \sum _{j=1}^{N} X_{ij} \end{aligned}$$

(20)

$$\begin{aligned} X_{\text {features}} = \text {GlobalAveragePooling}(X^{(n)}) \end{aligned}$$

(21)

In these equations, $W^{(i)}$ is the weight tensor of the $i^{th}$ convolutional layer, $X^{(i)}$ is the output feature map after the $i^{th}$ layer, $b^{(i)}$ is the bias vector, $\mathcal {R}_{ij}$ is the pooling window associated with output position $(i, j)$, and $X_{\text {features}}$ is the final feature vector obtained after applying global average pooling, while $M$ and $N$ represent the height and width of the spatial dimensions of the feature map $X$, respectively.
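To make the activation and pooling operations of Eqs. (16), (18), and (20) concrete, the sketch below applies them to a single 4$\times$4 feature map stored as nested lists. The function names are hypothetical, and the non-overlapping 2$\times$2 pooling window is an assumption chosen for illustration. It does not reproduce the actual InceptionNet V3 layers.

```python
def relu(feature_map):
    """Element-wise max(0, x), as in Eq. (16)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping size x size max pooling (stride equal to size), cf. Eq. (18)."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y + dy][x + dx]
                 for dy in range(size) for dx in range(size))
             for x in range(0, w - size + 1, size)]
            for y in range(0, h - size + 1, size)]


def global_average_pool(feature_map):
    """Mean over all spatial positions of one channel, as in Eq. (20)."""
    h, w = len(feature_map), len(feature_map[0])
    return sum(map(sum, feature_map)) / (h * w)


fmap = [[-1.0, 2.0, 0.5, -3.0],
        [4.0, -2.0, 1.0, 1.5],
        [0.0, 0.0, -1.0, -1.0],
        [3.0, -5.0, 2.0, 6.0]]
activated = relu(fmap)
pooled = max_pool(activated)            # [[4.0, 1.5], [3.0, 6.0]]
feature = global_average_pool(pooled)   # 3.625
```

Global average pooling collapses each channel to one number, so a stack of $C$ feature maps becomes a feature vector of length $C$ regardless of the spatial size of the maps.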

Feature extraction using convolutional autoencoder

An autoencoder can utilize various types of neurons, but convolutional kernels are particularly effective for 2D data, making them ideal for handling images. Convolutional autoencoders share a structural similarity with CNNs, including convolution filters and pooling layers⁴⁴. Another benefit of CNNs is their ability to maintain and utilize the spatial information inherently present in images⁴³. Moreover, the convolutional autoencoder converts 2D data representations into 1D arrays. For example, it transforms the positions of m pins, given as pairs of coordinates (p, q) in $\mathbb {R}^{m \times 2}$, into a 1D format and ensures that the input and output nodes have the same dimensionality, allowing for a direct comparison between the input and output images⁴³. This comparison is utilized as an objective function to learn the parameters of the autoencoder. Since the learning process is label-independent, convolutional autoencoders are considered unsupervised methods. Equation 22 is used to express the latent encoding of the $i^{th}$ feature map in the encoder for a single-channel input p.

$$\begin{aligned} l_i = \sigma (p * w_i + b_i) \end{aligned}$$

(22)

where $*$ represents 2D convolution, $w_i$$\rightarrow$ convolutional filter of the $i^{th}$ feature map in the encoder, $b_i$$\rightarrow$ corresponding bias, and $\sigma$$\rightarrow$ activation function. The decoder reconstructs the image using:

$$\begin{aligned} q = \sigma \left( \sum _{i \in L} l_i * \bar{w}_i + b\right) \end{aligned}$$

(23)

where b$\rightarrow$ bias per input channel, L$\rightarrow$ set of latent feature maps, and $\bar{w}_i$$\rightarrow$ learnable de-convolution filters. The objective function minimized during training is the mean squared error (MSE), defined as:

$$\begin{aligned} O(\theta ) = \frac{1}{2m} \sum _{k=1}^m (p_k - q_k)^2 \end{aligned}$$

(24)

where $\theta$$\rightarrow$ parameters of the autoencoder, m$\rightarrow$ number of images in the dataset, $p_k$$\rightarrow$ the $k^{th}$ input image in the dataset, and $q_k$$\rightarrow$ the corresponding reconstructed image produced by the autoencoder.
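The reconstruction objective can be sketched as follows. The sketch assumes that the normalizing constant is twice the number of images m, that each image is supplied as a flattened list of pixel values, and that the squared difference is summed over all pixels; the function name is hypothetical.

```python
def reconstruction_objective(inputs, reconstructions):
    """Mean squared reconstruction error with a 1/(2m) factor, m = number of images."""
    m = len(inputs)
    total = 0.0
    for p_k, q_k in zip(inputs, reconstructions):
        # Squared difference between input image p_k and its reconstruction q_k.
        total += sum((a - b) ** 2 for a, b in zip(p_k, q_k))
    return total / (2 * m)


# Two flattened 2-pixel "images" and their reconstructions (toy values).
loss = reconstruction_objective([[0.0, 1.0], [1.0, 1.0]],
                                [[0.0, 0.0], [1.0, 0.5]])
```

Because the objective depends only on the input and its reconstruction, no class labels are needed, which is why the training is unsupervised.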

Image classification

After feature extraction, two approaches are applied for image classification: classification using a dense layer and classification using ML classifiers. Both are explained below:

Classification using dense layer

In the Dense Layer classification, a fully connected neural network is used to perform the final classification of the extracted features. The network architecture typically consists of multiple dense layers, each followed by activation functions and dropout layers to prevent overfitting. The operations or blocks are described below:

Input Layer: The input to the dense network is a feature vector x extracted from the previous stages. If the feature vector has n features, it is represented as:

$$\begin{aligned} x = [x_1, x_2, \ldots , x_n] \end{aligned}$$

(25)

Dense Layers: Each dense layer performs a linear transformation followed by a non-linear activation function. For a given layer l, the transformation can be expressed as:

$$\begin{aligned} Z^l = W^l a^{(l-1)} + b^l \end{aligned}$$

(26)

where $a^{(l-1)}$$\rightarrow$ activation from the previous layer, $W^l$$\rightarrow$ weight matrix of layer l, and $b^l$$\rightarrow$ bias vector of layer l. The activation function applied to the linear transformation is typically a Rectified Linear Unit (ReLU), defined as:

$$\begin{aligned} a^l = \text {ReLU}(Z^l) = \max (0, Z^l) \end{aligned}$$

(27)

Dropout Layers: Dropout layers are used during training to prevent overfitting by randomly setting a fraction p of the input units to zero. Mathematically, this can be expressed as:

$$\begin{aligned} a_{\text {dropout}}^l = a^l \odot d^l \end{aligned}$$

(28)

where $\odot$$\rightarrow$ element-wise product, and $d^l$$\rightarrow$ binary mask vector whose entries are sampled independently from a Bernoulli distribution with keep probability $1 - p$, so that each unit is zeroed with probability p.

Output Layer: The final layer is a dense layer or fully connected layer with a sigmoid activation function that outputs the probability of the positive class for binary classification:

$$\begin{aligned} \hat{y} = \sigma (W^{\text {out}} a^L + b^{\text {out}}) \end{aligned}$$

(29)

where $W^{\text {out}}$ and $b^{\text {out}}$ are the weights and biases of the output layer, $a^L$$\rightarrow$ activation from the last hidden layer, and $\sigma$$\rightarrow$ sigmoid function, which is calculated as:

$$\begin{aligned} \sigma (z) = \frac{1}{1 + e^{-z}} \end{aligned}$$

(30)
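The forward pass described by Eqs. 25 to 30 can be sketched in plain Python. Layer sizes, weights, and function names below are hypothetical, and the dropout mask is applied exactly as in Eq. 28; deep learning frameworks such as Keras additionally rescale the kept units by 1/(1 − p) during training, which is omitted here.

```python
import math
import random


def dense(weights, bias, a_prev):
    """Eq. 26: z = W a + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, a_prev)) + b
            for row, b in zip(weights, bias)]


def relu(z):
    """Eq. 27 applied element-wise."""
    return [max(0.0, v) for v in z]


def dropout(a, drop_rate, rng):
    """Eq. 28: each unit is zeroed independently with probability drop_rate."""
    mask = [0.0 if rng.random() < drop_rate else 1.0 for _ in a]
    return [v * d for v, d in zip(a, mask)]


def sigmoid(z):
    """Eq. 30."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, hidden_layers, out_weights, out_bias, drop_rate=0.0, rng=None):
    """Eq. 29: probability of the positive class for one feature vector x."""
    a = x
    for weights, bias in hidden_layers:
        a = relu(dense(weights, bias, a))
        if drop_rate > 0.0 and rng is not None:  # dropout is active only in training
            a = dropout(a, drop_rate, rng)
    z_out = sum(w * v for w, v in zip(out_weights, a)) + out_bias
    return sigmoid(z_out)


# One hidden layer with two units (toy weights).
hidden = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])]
p = predict_proba([1.0, 2.0], hidden, out_weights=[2.0, 2.0], out_bias=-3.0)
```

At inference time `drop_rate` is left at zero, so the prediction is deterministic.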

Model Compilation: During the training process, the model parameters are optimized using the Adam optimizer, which dynamically adjusts the learning rate for efficient convergence. The optimization aims to minimize the loss function, specified as the binary cross-entropy, which quantifies the difference between predicted and actual labels. Additionally, the training incorporates rigorous 5-fold cross-validation to ensure robustness and generalization to unseen data. In 5-fold cross-validation, the dataset is divided into five subsets, and the model is trained on four subsets while the remaining one is held out for validation as depicted in Fig. 6. This process iterates five times, with each subset used once as the validation data. Mathematically, the loss function can be expressed as:

$$\begin{aligned} \text {Loss} = -\frac{1}{n} \sum _{i=1}^n \left[ o_i \log (P_i) + (1 - o_i) \log (1 - P_i) \right] \end{aligned}$$

(31)

where n$\rightarrow$ number of samples, $o_i$$\rightarrow$ true label of the $i^{th}$ sample, and $P_i$$\rightarrow$ predicted probability of the positive class. Additionally, the k-fold cross-validation process is mathematically represented as:

$$\begin{aligned} \text {CV} = \frac{1}{K} \sum _{k=1}^K E_k \end{aligned}$$

(32)

where K$\rightarrow$ number of folds, and $E_k$$\rightarrow$ performance metric obtained on the $k^{th}$ fold during validation.
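Equations 31 and 32 can be illustrated with a short sketch. The clipping constant `eps`, the contiguous (unshuffled, unstratified) fold assignment, and the function names are assumptions for illustration; the study does not state how the folds were drawn.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Eq. 31: mean binary cross-entropy over n samples."""
    total = 0.0
    for o_i, p_i in zip(labels, probs):
        p_i = min(max(p_i, eps), 1.0 - eps)  # avoid log(0)
        total += o_i * math.log(p_i) + (1 - o_i) * math.log(1.0 - p_i)
    return -total / len(labels)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx); every sample is validated exactly once."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cross_validation_score(fold_metrics):
    """Eq. 32: average of the per-fold metric E_k over K folds."""
    return sum(fold_metrics) / len(fold_metrics)


folds = list(k_fold_indices(10, 5))  # 5 folds of 2 validation samples each
```

In practice the per-fold metric `E_k` would be obtained by training a fresh model on `train` and evaluating it on `val` for each pair in `folds`.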

Classification using ML algorithms

In addition to the deep learning approach, traditional machine learning classifiers are also employed for the classification of PCOS. These classifiers include Naïve Bayes (NB), k-Nearest Neighbors (K-NN), Random Forest (RF), and Adaptive Boosting (ADB) where the features extracted by the CystNet model serve as the input for these classifiers. Each of the classifiers is briefly discussed below:

1.
K-Nearest Neighbor: The K-NN⁵⁷ classifier is a straightforward, instance-based learning algorithm used for classification tasks. It classifies a new instance by identifying the K closest training instances (neighbors) using a distance metric, commonly the Euclidean distance. The Euclidean distance between two instances $x_i$ and $x_j$ with m features is given by:
$$\begin{aligned} d(x_i, x_j) = \sqrt{\sum _{l=1}^m (x_{il} - x_{jl})^2} \end{aligned}$$
(33)
The class label $\hat{y}$ of the new instance is then determined by the majority vote of its K nearest neighbors, where $y_i$ is the label of the $i^{th}$ neighbor:
$$\begin{aligned} \hat{y} = \text {mode}\{y_i | i = 1, 2, \ldots , K\} \end{aligned}$$
(34)
2.
Naïve Bayes: Naïve Bayes⁵⁷ is a probabilistic classifier grounded on Bayes’ theorem, operating with an assumption of feature independence. Given a feature vector $x = (x_1, x_2, \ldots , x_n)$ and a class $M_k$, the classifier selects the class with the highest posterior probability, computed using the formula:
$$\begin{aligned} P(M_k | x) = \frac{P(x | M_k) \cdot P(M_k)}{P(x)} \end{aligned}$$
(35)
This calculation involves the likelihood $P(x | M_k)$, the prior probability $P(M_k)$, and the marginal likelihood P(x).
3.
Random Forest: An RF⁵⁷ classifier is an ensemble learning method used for classification tasks, where it constructs multiple decision trees and outputs the class that is the mode of the classes predicted by the individual trees. Mathematically, for a new instance x, the prediction $\hat{y}$ is the mode of the predictions from B trees:
$$\begin{aligned} \hat{y} = \text {mode}\{h_b(x) | b = 1, 2, \ldots , B\} \end{aligned}$$
(36)
where $h_b(x)$ is the prediction of the $b^{th}$ tree. The best split $s^*$ at each node is found by minimizing the impurity, such as the Gini impurity:
$$\begin{aligned} s^* = \arg \min _s \sum _{k \in \{L, R\}} \frac{|D_k|}{|D|} \cdot \text {Gini}(D_k) \end{aligned}$$
(37)
where $D_L$ and $D_R$ are the left and right splits of the node, respectively.
4.
Adaptive Boosting: The ADB⁵⁸ classifier combines multiple weak models to create a strong classifier. It iteratively trains models, giving more weight to misclassified samples. The final strong classifier H(x) is the sign of a weighted sum of the individual weak classifiers $h_t(x)$:
$$\begin{aligned} H(x) = \text {sign}\left( \sum _{t=1}^T \alpha _t h_t(x) \right) \end{aligned}$$
(38)
where t is the index of the individual weak classifiers, T is the total number of weak classifiers, and $\alpha _t$ is the weight assigned to the $t^{th}$ weak classifier.
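The decision rules in Eqs. 33, 34, 36, 37, and 38 can be sketched in plain Python. The function names and toy data are hypothetical, labels in the boosting example are assumed to be encoded as +1 and −1, and training of the trees and weak learners is not shown; a library such as scikit-learn would normally be used for the full classifiers.

```python
import math
from collections import Counter


def euclidean(x_i, x_j):
    """Eq. 33: Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def knn_predict(train_x, train_y, query, k=3):
    """Eq. 34: majority label among the k nearest training instances."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(left_labels, right_labels):
    """Quantity minimized in Eq. 37: size-weighted Gini impurity of a split."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) \
        + (len(right_labels) / n) * gini(right_labels)


def forest_vote(tree_predictions):
    """Eq. 36: mode of the predictions of the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def adaboost_predict(alphas, weak_outputs):
    """Eq. 38: sign of the alpha-weighted sum of weak outputs in {-1, +1}."""
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= 0 else -1


# Toy 2-D features with labels 0 (unaffected) and 1 (affected).
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [6, 5]]
train_y = [0, 0, 0, 1, 1]
label = knn_predict(train_x, train_y, query=[5, 6], k=3)
```

With an even K or an even number of trees, ties are possible; this sketch resolves them by the order in which labels are first encountered, which is an arbitrary choice.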

Experimental results

Performance evaluation metrics

To assess the performance of the model, several performance metrics are used. These metrics provide a comprehensive evaluation of the model’s effectiveness in accurately classifying the ultrasound images and diagnosing PCOS. The foundation for these metrics is the confusion matrix (CM)⁵⁹, which captures the model’s predictions against the actual classifications. The terms of the confusion matrix are defined below:

True Positive ($\textit{TP}_{os}$): This occurs when the classifier correctly identifies a patient as having PCOS.
True Negative ($\textit{TN}_{eg}$): This occurs when the classifier correctly identifies a patient as not having PCOS.
False Positive ($\textit{FP}_{os}$): This occurs when the classifier incorrectly identifies a patient as having PCOS when they do not.
False Negative ($\textit{FN}_{eg}$): This occurs when the classifier incorrectly identifies a patient as not having PCOS when they actually do.

Based on this matrix, several metrics have been calculated and are briefly discussed below:

1.
Accuracy: The ratio of correctly predicted instances to the total instances⁵⁹. Mathematically,
$$\begin{aligned} \text {Accuracy} = \frac{TP_{os} + TN_{eg}}{TP_{os} + TN_{eg} + FP_{os} + FN_{eg}} \end{aligned}$$
(39)
2.
Precision: It indicates the accuracy of the positive predictions made by the model⁵⁹. Mathematically,
$$\begin{aligned} \text {Precision} (P_e) = \frac{TP_{os}}{TP_{os} + FP_{os}} \end{aligned}$$
(40)
3.
Recall: It measures the ability of the model to identify all relevant instances in the dataset⁵⁹. Mathematically,
$$\begin{aligned} \text {Recall} (R_e) = \frac{TP_{os}}{TP_{os} + FN_{eg}} \end{aligned}$$
(41)
4.
F1 Score: The harmonic mean of precision and recall⁵⁹. Mathematically,
$$\begin{aligned} \text {F1 score} = \frac{2 \times P_e \times R_e}{P_e + R_e} \end{aligned}$$
(42)
5.
Specificity: It measures the ability of the model to correctly identify negative instances⁵⁹. Mathematically,
$$\begin{aligned} \text {Specificity} (s_e) = \frac{TN_{eg}}{TN_{eg} + FP_{os}} \end{aligned}$$
(43)
6.
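ROC-AUC: The area under the receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate across classification thresholds, summarizes the ability of the model to distinguish between classes⁵⁹.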

Each metric offers a unique perspective on the performance of the model, capturing different aspects of the classification task. Accuracy, Precision, Recall, F1 Score, Specificity, and ROC-AUC were chosen because they collectively measure the model’s ability to correctly identify both positive and negative instances, the trade-off between precision and recall, and the overall discriminative power of the model.
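The metrics in Eqs. 39 to 43 follow directly from the four confusion-matrix counts, as the sketch below shows. The counts and scores used are hypothetical and are not results from this study; the AUC is computed with the standard pairwise-ranking formulation, which is one of several equivalent ways to obtain it.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Eqs. 39-43 from true/false positive/negative counts (no zero-division guard)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts for a 100-image test set.
metrics = confusion_metrics(tp=40, tn=50, fp=5, fn=5)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Reporting recall and specificity together is useful in a diagnostic setting because they describe the two error types separately: missed PCOS cases and healthy patients flagged as affected.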

Result and discussion

The proposed framework for Polycystic Ovary Syndrome (PCOS) detection is implemented using an 8th-generation Intel Core i7 processor, 8 GB RAM, and Python programming tools. The results demonstrate the efficacy of the system in accurately diagnosing PCOS from ultrasound images, with substantial improvements over traditional methods. The necessity for precise PCOS diagnosis lies in its significant impact on women’s reproductive health, affecting approximately 15% of reproductive-aged women globally. Early detection is vital to manage associated symptoms such as acne, alopecia, hirsutism, and infertility, thereby enhancing the quality of life and reducing long-term health risks. Our study introduces an advanced automated system that utilizes AI techniques, including Machine Learning, Transfer Learning, and Deep Learning, to detect and classify PCOS from ultrasound images. The image pre-processing techniques, such as image resizing, normalization, augmentation, division, and segmentation, contribute significantly to the accurate identification of follicles and cysts, reducing the likelihood of manual errors. Furthermore, the proposed CystNet hybrid model for feature extraction integrates InceptionNet V3 and Convolutional Autoencoder approaches, effectively highlighting critical features from the ultrasound images and capturing subtle differences between affected and unaffected cases. This model extracts 2048 features from InceptionNet V3 and 1024 features from a convolutional autoencoder, which are then rescaled and concatenated, resulting in a total of 3072 features for the subsequent classification stage. The classification phase of the system is divided into two approaches: Approach A, which uses a dense layer, and Approach B, which uses ML classifiers.
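The fusion of the two feature sets can be sketched as follows. The text does not specify the rescaling method, so min-max rescaling to [0, 1] is assumed here purely for illustration, and the function names are hypothetical.

```python
def min_max_rescale(features):
    """Rescale one feature vector to [0, 1] (assumed rescaling method)."""
    lo, hi = min(features), max(features)
    if hi == lo:
        return [0.0 for _ in features]  # constant vector: avoid division by zero
    return [(v - lo) / (hi - lo) for v in features]


def fuse_features(inception_features, autoencoder_features):
    """Rescale each feature vector separately, then concatenate them."""
    return min_max_rescale(inception_features) + min_max_rescale(autoencoder_features)


# Toy vectors standing in for the two extracted feature sets.
fused = fuse_features([0, 5, 10], [2, 4])
```

Rescaling before concatenation keeps one branch from dominating the other when their raw value ranges differ; the length of the fused vector is simply the sum of the two input lengths.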

Table 2 Comparison table for approach A with various DL models on Kaggle PCOS US images.

Table 3 Comparison Table for Approach B with Various ML Models on Kaggle PCOS US Images.

In Approach A, the system employs a dense (fully connected) layer for classification, as detailed in Table 2. CystNet achieved an accuracy of 96.54%, a precision of 94.21%, a recall of 97.44%, an F1-score of 95.75%, and a specificity of 95.92% on the Kaggle PCOS US images. These metrics indicate a high level of diagnostic precision and reliability, outperforming other deep learning models like InceptionNet V3, Autoencoder, ResNet50, DenseNet121, and EfficientNetB0. The accuracy and loss graphs in Fig. 7 further illustrate the robust training and validation process for Approach A, with minimal overfitting observed. Approach A was also evaluated on two additional datasets: it achieved an accuracy of 94.39% on the PCOSGen dataset and 95.67% on the MMOTU dataset. These results indicate that Approach A remains reliable across different data sources.

Approach B integrates machine learning classifiers for the final classification phase after feature extraction with CystNet. The Random Forest classifier led the performance, achieving an accuracy of 97.75%, a precision of 96.23%, a recall of 98.29%, an F1-score of 97.19%, and a specificity of 97.37% on the Kaggle PCOS US images, as represented in Table 3. This approach outperforms traditional methods and other classifiers such as Adaptive Boosting, K-Nearest Neighbors, and Naive Bayes. Figures 8 and 9 depict the confusion matrix and AUC curves, respectively, further validating the diagnostic accuracy and reliability of both approaches. Approach B was also evaluated on the additional datasets: it attained an accuracy of 96.12% on the PCOSGen dataset and 97.23% on the MMOTU dataset. These outcomes indicate that Approach B maintains high diagnostic accuracy across different medical datasets.

Table 4 Comparison of findings across previous studies.

Table 5 Comparison table with different division of train and test set for approach A and approach B on Kaggle PCOS US images.

Table 6 Comparison table with different number of folds for approach A and approach B on Kaggle PCOS US images.

A brief comparison with previous studies, summarized in Table 4, indicates that our approach surpasses existing methods in terms of accuracy and reliability, emphasizing its potential for medical application. The recent systematic review by Arora et al.⁶⁴ highlights various machine learning algorithms for PCOS diagnosis, observing the challenges and limitations of current techniques in capturing the complexity of the syndrome. Paramasivam et al.⁶² developed a Self-Defined CNN (SD_CNN) for PCOS classification, achieving a notable accuracy of 96.43% using a Random Forest Classifier. While their model is effective, our approach incorporates a more comprehensive feature extraction process, leading to higher accuracy and robustness. A CNN-based automation for PCOS diagnosis with a focus on model interpretability using the Grad-CAM technique was presented by Galagan et al.⁶⁵. Moreover, Kermanshahchi et al.⁶⁶ introduced a machine learning-based model for PCOS detection on a specialized dataset. While their approach emphasizes transparency in decision-making, the integration of multiple AI techniques in our proposed approach enhances its generalizability across diverse datasets, making it more suitable for real-world clinical settings. Additionally, the consistent performance across different divisions of training and test sets in Table 5 and across different numbers of cross-validation folds in Table 6 highlights the robustness of our system.

The sophistication of the proposed solution primarily arises from the integration of multiple AI techniques, including ML, TL, and DL. The CystNet hybrid model, which combines InceptionNet V3 with a Convolutional Autoencoder, involves extensive computation for feature extraction and integration. Unlike existing approaches, which often rely on single-stage or less integrated pipelines, our method’s novelty lies in its comprehensive feature extraction and hybrid model integration, which enhances feature representation and classification performance. Despite the advancements, the current solution has limitations. One major limitation is that the dataset used for training and evaluation originates from a single source, which may not fully represent the variability found in diverse populations. This could affect the generalizability and performance of the model in real-world clinical settings. Additionally, the computational demands of the CystNet model may limit its practical deployment on devices with lower processing power. Future research should incorporate multi-source datasets to enhance model robustness. Real-time deployment and integration into clinical workflows also pose challenges, necessitating further development in terms of computational efficiency and user-friendly interfaces for healthcare professionals. Nevertheless, the experimental results underscore the potential of the proposed framework to improve PCOS diagnosis through automated image analysis and classification techniques. By streamlining the diagnostic process and improving accuracy, the framework holds promise in facilitating timely interventions and reducing the burden on healthcare professionals, ultimately benefiting women’s reproductive health and well-being.

Conclusion and future work

Diagnosing Polycystic Ovary Syndrome is crucial due to its significant impact on the reproductive health of women, affecting approximately 15% of reproductive-aged women globally. In this study, a dataset obtained from Kaggle, consisting of ultrasound images labeled as 'INFECTED' (cystic ovaries) and 'NOT INFECTED' (healthy ovaries), is used. These images are augmented using various transformations, such as rotation, flipping, zooming, and shifting, and segmented using the Watershed technique, multilevel thresholding, and morphological processing to ensure precise image segmentation. An advanced automated system is developed using AI techniques, including ML, TL, and DL, to detect and classify PCOS, with the proposed CystNet hybrid model integrating InceptionNet V3 and Convolutional Autoencoder to effectively extract critical features from the ultrasound images. During the classification phase, both a dense layer and traditional ML classifiers are employed to enhance the robustness of the classification process. With a 5-fold cross-validation process, the dense layer approach achieved an accuracy of 96.54%, precision of 94.21%, recall of 97.44%, and specificity of 95.92%, while the RF classifier achieved an accuracy of 97.75%, precision of 96.23%, recall of 98.29%, and specificity of 97.37%. The results highlight the potential of the proposed framework to streamline the diagnostic process, reducing manual errors and time consumption while facilitating timely interventions for women’s reproductive health. Future work could focus on incorporating multi-source datasets from diverse populations to improve the model’s generalizability. Moreover, enhancing computational efficiency and developing user-friendly interfaces are crucial steps to ensure the practical usability of the system.

Data availability and access

The PCOS ultrasound dataset is available at: https://www.kaggle.com/datasets/anaghachoudhari/pcos-detection-using-ultrasound-images.

References

Khan, I. U. et al. Deep learning-based computer-aided classification of amniotic fluid using ultrasound images from Saudi Arabia. Big Data Cogn. Comput. 6, 107 (2022).
Gopalakrishnan, C. & Iyapparaja, M. Active contour with modified otsu method for automatic detection of polycystic ovary syndrome from ultrasound image of ovary. Multimed. Tools Appl. 79, 17169–17192 (2020).
Stein, I. F. & Leventhal, M. L. Amenorrhea associated with bilateral polycystic ovaries. Am. J. Obstet. Gynecol. 29, 181–191 (1935).
World Health Organization. The ICD-10 classification of mental and behavioural disorders: Clinical descriptions and diagnostic guidelines Vol. 1 (World Health Organization, London, 1992).
Azziz, R. et al. The androgen excess and pcos society criteria for the polycystic ovary syndrome: The complete task force report. Fertil. Steril. 91, 456–488 (2009).
Laven, J. S., Imani, B., Eijkemans, M. J. & Fauser, B. C. New approach to polycystic ovary syndrome and other forms of anovulatory infertility. Obstet. Gynecol. Surv. 57, 755–767 (2002).
Dewailly, D. et al. Definition and significance of polycystic ovarian morphology: A task force report from the androgen excess and polycystic ovary syndrome society. Hum. Reprod. Update 20, 334–352 (2014).
Nandipati, S. C. & Ying, C. X. Polycystic ovarian syndrome (pcos) classification and feature selection by machine learning techniques. Appl. Math. Comput. Intell. 9, 65–74 (2020).
Morgante, G., Cappelli, V., Di Sabatino, A., Massaro, M. & De Leo, V. Polycystic ovary syndrome (pcos) and hyperandrogenism: The role of a new natural association. Minerva Ginecol. 67, 457–463 (2015).
Ganie, M. A. et al. Epidemiology, pathogenesis, genetics & management of polycystic ovary syndrome in India. Indian J. Med. Res. 150, 333–344 (2019).
Bharathi, R. V. et al. An epidemiological survey: Effect of predisposing factors for pcos in Indian urban and rural population. Middle East Fertil. Soc. J. 22, 313–316 (2017).
Ramamoorthy, S., Senthil Kumar, T., Md. Mansoorroomi, S. & Premnath, B. Enhancing intricate details of ultrasound pcod scan images using tailored anisotropic diffusion filter (tadf). In Intelligence in Big Data Technologies—Beyond the Hype: Proceedings of ICBDCC 2019, 43–52 (Springer, 2021).
Palomba, S., Piltonen, T. T. & Giudice, L. C. Endometrial function in women with polycystic ovary syndrome: A comprehensive review. Hum. Reprod. Update 27, 584–618 (2021).
Kałużna, M. et al. Effect of central obesity and hyperandrogenism on selected inflammatory markers in patients with pcos: A whtr-matched case-control study. J. Clin. Med. 9, 3024 (2020).
Kahsar-Miller, M. D., Nixon, C., Boots, L. R., Go, R. C. & Azziz, R. Prevalence of polycystic ovary syndrome (pcos) in first-degree relatives of patients with pcos. Fertil. Steril. 75, 53–58 (2001).
Holste, G. et al. End-to-end learning of fused image and non-image features for improved breast cancer classification from mri. In Proceedings of the IEEE/CVF international conference on computer vision, pp. 3294–3303 (IEEE, 2021).
Gopalakrishnan, C. & Iyapparaja, M. A detailed research on detection of polycystic ovary syndrome from ultrasound images of ovaries. Int. J. Recent Technol. Eng. 8, S11 (2019).
Wang, R. & Mol, B. W. J. The Rotterdam criteria for polycystic ovary syndrome: Evidence-based criteria?. Hum. Reprod. 32, 261–264 (2017).
Rachana, B. et al. Detection of polycystic ovarian syndrome using follicle recognition technique. Global Trans. Proc. 2, 304–308 (2021).
Balen, A. H., Laven, J. S., Tan, S.-L. & Dewailly, D. Ultrasound assessment of the polycystic ovary: International consensus definitions. Hum. Reprod. Update 9, 505–514 (2003).
Sitheswaran, R. & Malarkhodi, S. An effective automated system in follicle identification for polycystic ovary syndrome using ultrasound images. In 2014 international conference on electronics and communication systems (ICECS), pp. 1–5 (IEEE, 2014).
Deng, Y., Wang, Y. & Chen, P. Automated detection of polycystic ovary syndrome from ultrasound images. In 2008 30th annual international conference of the IEEE engineering in medicine and biology society, pp. 4772–4775 (IEEE, 2008).
Liu, S. et al. Deep learning in medical ultrasound analysis: A review. Engineering 5, 261–275 (2019).
Khan, I. U. et al. Amniotic fluid classification and artificial intelligence: Challenges and opportunities. Sensors 22, 4570 (2022).
Zhou, Z. et al. Robust mobile crowd sensing: When deep learning meets edge computing. IEEE Network 32, 54–60 (2018).
Yadav, N., Dass, R. & Virmani, J. A systematic review of machine learning based thyroid tumor characterisation using ultrasonographic images. J. Ultrasound 27, 209–224 (2024).
Yadav, N., Dass, R. & Virmani, J. Despeckling filters applied to thyroid ultrasound images: A comparative analysis. Multimed. Tools Appl. 81, 8905–8937 (2022).
Yadav, N., Dass, R. & Virmani, J. Deep learning-based cad system design for thyroid tumor characterization using ultrasound images. Multimed. Tools Appl. 83, 43071–43113 (2024).
Dass, R. & Yadav, N. Image quality assessment parameters for despeckling filters. Proc. Comput. Sci. 167, 2382–2392 (2020).
Virmani, J. et al. Assessment of despeckle filtering algorithms for segmentation of breast tumours from ultrasound images. Biocybern. Biomed. Eng. 39, 100–121 (2019).
Haider, I., Tran, M.-T., Kim, S.-H., Yang, H.-J. & Lee, G.-S. An ensemble approach for multiple emotion descriptors estimation using multi-task learning. arXiv preprint arXiv:2207.10878 (2022).
Aziz, S., Munir, K., Raza, A., Almutairi, M. S. & Nawaz, S. Ivnet: Transfer learning based diagnosis of breast cancer grading using histopathological images of infected cells. IEEE Access 11, 127880–127894 (2023).
Kriti, V. J. & Agarwal, R. Deep feature extraction and classification of breast ultrasound images. Multimed. Tools Appl. 79, 27257–27292 (2020).
Maggipinto, M., Masiero, C., Beghi, A. & Susto, G. A. A convolutional autoencoder approach for feature extraction in virtual metrology. Proc. Manuf. 17, 126–133 (2018).
Gayathri, S., Gopi, V. P. & Palanisamy, P. Diabetic retinopathy classification based on multipath cnn and machine learning classifiers. Phys. Eng. Sci. Med. 44, 639–653 (2021).
Gopalakrishnan, C. & Iyapparaja, M. Multilevel thresholding based follicle detection and classification of polycystic ovary syndrome from the ultrasound images using machine learning. Int. J. Syst. Assur. Eng. Manag. https://doi.org/10.1007/s13198-021-01203-x (2021).
Alamoudi, A. et al. A deep learning fusion approach to diagnosis the polycystic ovary syndrome (pcos). Appl. Comput. Intell. Soft Comput. 2023, 9686697 (2023).
Nilofer, N. et al. Follicles classification to detect polycystic ovary syndrome using glcm and novel hybrid machine learning. Turk. J. Comput. Math. Educ. 12, 1062–1073 (2021).
Suha, S. A. & Islam, M. N. An extended machine learning technique for polycystic ovary syndrome detection using ovary ultrasound image. Sci. Rep. 12, 17123 (2022).
Maheswari, K., Baranidharan, T., Karthik, S. & Sumathi, T. Modelling of f3i based feature selection approach for pcos classification and prediction. J. Ambient. Intell. Humaniz. Comput. 12, 1349–1362 (2021).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826 (IEEE, 2016).
Pan, Y. et al. Fundus image classification using inception v3 and resnet-50 for the early diagnostics of fundus diseases. Front. Physiol. 14, 1126780 (2023).
Polic, M., Krajacic, I., Lepora, N. & Orsag, M. Convolutional autoencoder for feature extraction in tactile sensing. IEEE Robot. Autom. Lett. 4, 3671–3678 (2019).
Lee, H., Kim, J., Kim, B. & Kim, S. Convolutional autoencoder based feature extraction in radar data analysis. In 2018 Joint 10th international conference on soft computing and intelligent systems (SCIS) and 19th international symposium on advanced intelligent systems (ISIS), pp. 81–84 (IEEE, 2018).
Patil, S. D., Deore, P. J. & Patil, V. B. An intelligent computer aided diagnosis system for classification of ovarian masses using machine learning approach. Int. Res. J. Multidiscip. Technovation 6, 45–57 (2024).
Rahman, W. et al. Multiclass blood cancer classification using deep cnn with optimized features. Array 18, 100292 (2023).
Jung, Y. et al. Ovarian tumor diagnosis using deep convolutional neural networks and a denoising convolutional autoencoder. Sci. Rep. 12, 17024 (2022).
Bhosale, S., Joshi, L. & Shivsharanan, A. Pcos (polycystic ovarian syndrome) detection using deep learning. Int. Res. J. Modern. Eng. Technol. Sci. https://doi.org/10.1109/IC3I59117.2023.10397615 (2022).
Khamparia, A. et al. Diagnosis of breast cancer based on modern mammography using hybrid transfer learning. Multidimens. Syst. Signal Process. 32, 747–765 (2021).
Dewi, R., Adiwijaya, W. U. N. & Jondri. Classification of polycystic ovary based on ultrasound images using competitive neural network. J. Phys. Conf. Ser. 971, 012005 (2018).
Chen, M., Shi, X., Zhang, Y., Wu, D. & Guizani, M. Deep feature learning for medical image analysis with convolutional autoencoder neural network. IEEE Trans. Big Data 7, 750–758 (2017).
Choudhari, A. Pcos detection using ultrasound images. https://www.kaggle.com/datasets/anaghachoudhari/pcos-detection-using-ultrasound-images (2024). [Online].
Handa, P. et al. Pcosgen-train dataset. https://zenodo.org/records/10430727 (2023). [Online].
Zhao, Q. et al. A multi-modality ovarian tumor ultrasound image dataset for unsupervised cross-domain semantic segmentation. arXiv preprint arXiv:2207.06799 (2022).
Zhao, Z., Kleinhans, A., Sandhu, G., Patel, I. & Unnikrishnan, K. Capsule networks with max-min normalization. arXiv preprint arXiv:1903.09662 (2019).
Perez, L. & Wang, J. The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621 (2017).
Lanjewar, M. G., Panchbhai, K. G. & Charanarur, P. Lung cancer detection from ct scans using modified densenet with feature selection methods and ml classifiers. Expert Syst. Appl. 224, 119961 (2023).
Taherkhani, A., Cosma, G. & McGinnity, T. M. Adaboost-cnn: An adaptive boosting algorithm for convolutional neural networks to classify multi-class imbalanced datasets using transfer learning. Neurocomputing 404, 351–366 (2020).
Japkowicz, N. & Shah, M. Performance evaluation in machine learning. Mach. Learn. Radiat. Oncol. Theory Appl. 41–56 (2015).
Nakhua, H., Ramachandran, P., Surve, A., Katre, N. & Correia, S. An ensemble approach for ultrasound-based polycystic ovary syndrome (pcos) classification. Educ. Adm. Theory Pract. 30, 14589–14597 (2024).
Bedi, P., Goyal, S., Rajawat, A. S. & Kumar, M. An integrated adaptive bilateral filter-based framework and attention residual U-Net for detecting polycystic ovary syndrome. Decis. Anal. J. 10, 100366 (2024).
Paramasivam, G. B. & Ramasamy Rajammal, R. Modelling a self-defined CNN for effectual classification of PCOS from ultrasound images. Technol. Health Care 1–17.
Chitra, P. et al. Classification of ultrasound PCOS image using deep learning based hybrid models. In 2023 second international conference on electronics and renewable systems (ICEARS), 1389–1394 (IEEE, 2023).
Arora, S., Vedpal & Chauhan, N. Polycystic ovary syndrome (PCOS) diagnostic methods in machine learning: a systematic literature review. Multimed. Tools Appl. 1–37 (2024).
Galagan, R., Andreiev, S., Stelmakh, N., Rafalska, Y. & Momot, A. Automation of polycystic ovary syndrome diagnostics through machine learning algorithms in ultrasound imaging. Appl. Comput. Sci. 20, 194–204 (2024).
Kermanshahchi, J., Reddy, A. J., Xu, J., Mehrok, G. K. & Nausheen, F. Development of a machine learning-based model for accurate detection and classification of polycystic ovary syndrome on pelvic ultrasound. Cureus 16 (2024).

Acknowledgements

This research work is part of the project supported by DST under the PURSE 2022 scheme.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi, 835215, India
Poonam Moral, Debjani Mustafi, Abhijit Mustafi & Sudip Kumar Sahana

Contributions

All authors contributed substantially to preparing the article, provided critical feedback, and helped shape the research. Poonam Moral designed the model, performed the experiments, and analyzed the data. She thoroughly investigated the dataset used in this work, took a lead in conceptualizing the idea and writing the original draft, and is the corresponding author. Debjani Mustafi contributed to the design and implementation of the research, the conceptualization of the proposed method, and the framing and drafting of the manuscript. She provided the overall planning of the research, conceived the original idea, and reviewed the paper. Abhijit Mustafi aided in interpreting the results and analysis, contributed to the conceptualization of the proposed method and to the final version of the manuscript, and reviewed the entire paper. Sudip Kumar Sahana provided a critical review of the paper and took an active part in the collection and analysis of the datasets.

Corresponding author

Correspondence to Poonam Moral.

Ethics declarations

Competing interests

The authors have no conflicts of interest to declare that are relevant to the content of this manuscript.

Ethical and informed consent for data used

The datasets employed in this study were obtained from publicly available sources, and their utilization adheres to ethical standards and guidelines. As these datasets are publicly accessible, specific informed consent was not required.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Moral, P., Mustafi, D., Mustafi, A. et al. CystNet: An AI driven model for PCOS detection using multilevel thresholding of ultrasound images. Sci Rep 14, 25012 (2024). https://doi.org/10.1038/s41598-024-75964-3

Received: 07 August 2024
Accepted: 09 October 2024
Published: 23 October 2024
Version of record: 23 October 2024
DOI: https://doi.org/10.1038/s41598-024-75964-3

This article is cited by

Feature fusion context attention gate UNet for detection of polycystic ovary syndrome
- Yuvaraj Natarajan
- Sri Preethaa K. R.
- Shyamala Devi M.
Scientific Reports (2025)

Automated high precision PCOS detection through a segment anything model on super resolution ultrasound ovary images
- S. Reka
- T. Suriya Praba
- Rengarajan Amirtharajan
Scientific Reports (2025)

Nanomaterial-enhanced biosensors for polycystic ovarian syndrome diagnosis and pathophysiological insights
- Bakr Ahmed Taha
- Marwa Amin Al-Rawi
- Norhana Arsad
Microchimica Acta (2025)

Artificial intelligence in polycystic ovarian syndrome management: past, present, and future
- Jinyuan Wang
- Ruxin Chen
- Li Tang
La radiologia medica (2025)

Advancement in early diagnosis of polycystic ovary syndrome: biomarker-driven innovative diagnostic sensor
- Aniket Nandi
- Kamal Singh
- Kalicharan Sharma
Microchimica Acta (2025)