Introduction

Stroke is a condition in which anomalies in the circulation or vascular anatomy of the brain cause the death of brain cells. Adults and the elderly are the main populations affected. Stroke causes more than 5.5 million deaths annually [1]. Two-thirds of these deaths occur in low-income nations, and 40% of those affected are younger than 70 years. Stroke not only increases fatalities but also leaves many survivors disabled and in need of assistance with everyday tasks [2]. A stroke strikes every 40 s, and one life is lost every 4 min [3].

Traditional diagnostic methods, including clinical assessments and imaging modalities such as CT [4] and MRI [5], are effective but often constrained by accessibility, cost, and the need for specialized personnel. Integrating recent model-training techniques [6] into medical imaging has shown promise for improving the performance of stroke detection models. This matters because stroke is now a leading cause of death worldwide.

Deep learning (DL) [7] models, particularly convolutional neural networks (CNNs) [8], have achieved high performance in analysing complex images, providing automated and precise identification of stroke-related abnormalities. These models can potentially reduce diagnostic times and augment the capabilities of healthcare professionals by identifying subtle signs of stroke that might be missed by human observers [9, 10].

The primary limitation of using DL for brain stroke detection is the dependency on large, annotated datasets for training robust models. Additionally, the performance of these models is hindered by variability in patient demographics, scanner types, and imaging conditions, leading to potential biases and reduced generalizability [11, 12]. To overcome these issues, a novel S3ET-NET is proposed for detecting brain stroke.

A Gaussian bilateral (GB) filter pre-processes the MRI images to reduce noise in the input images. The GhostNet model extracts the relevant features from the enhanced images. The salp shuffled shepherd optimization (S3O) algorithm identifies the optimal features that are most suitable for training the model. The EfficientNet model classifies the images as normal, ischemic stroke (IS), or hemorrhagic stroke (HS).

The remainder of this paper is organized as follows. Section 2 reviews the related work; Sect. 3 gives a comprehensive explanation of the proposed S3ET-NET; Sect. 4 discusses the experimental outcomes; and Sect. 5 presents the conclusion and future work.

Related work

Researchers have examined many strategies for MRI brain image recognition, classification, and segmentation. Some of the most recent works are summarized in this section.

In [13], brain stroke images are segmented and classified using two different DL models: LeNet for classification and an autoencoder-decoder for segmentation. The study evaluates the model on 406 images from datasets in NIfTI format. The pixel-wise classification model scores 85% on the main performance metrics, and the overall classification model is 96% accurate.

In [14], a proportional-elaborated, high-depth, pixel-based comprehensive clustering approach is implemented that prioritizes interhemispheric imbalance in the assessment of lesions. The interhemispheric difference establishes the threshold value for the hierarchically clustered superpixels, resulting in an accurate and automated lesion diagnosis process.

In [15], a DL-based system predicts stroke severity from real-time electroencephalogram (EEG) sensor data. Using raw EEG data and a CNN-bidirectional LSTM model, the system predicted stroke with 94% accuracy, a low false positive rate (FPR) of 6%, and a false negative rate (FNR) of 5.7%.

In [16], a stroke detection model was developed using a hybrid OzNet algorithm, motivated by the need for in-depth analysis. Its peak accuracy on its own was 87.47%, whereas the OzNet-mRMR-NB variant reached 98.42% accuracy and recognized strokes from brain CT scans with a higher success rate than the other models.

In [17], an automatic DL-based transfer method is created for accurate cerebral hemorrhage prediction from NCCT brain images. The approach combines ResNet-50 with a dense layer. To evaluate the model, 1164 NCCT brain scans from 62 patients at the Kalinga Institute of Medical Science in Bhubaneswar were used. The approach classifies each scan as either normal or hemorrhagic.

In [18], a 3D U-Net was used to improve the accuracy of stroke segmentation. The study offers an interpretation of the data along with recommendations for additional research. The Jaccard index or another score coefficient was used as the evaluation metric. The work addresses the identification of computed tomography images and the segmentation of ischemia through the selection of suitable hyperparameters.

In [19], DL and machine learning (ML) models are recommended for predicting brain stroke (BS). Several models were used for the classification tasks in the study, including Random Forest (RF), Decision Tree (DT), Logistic Regression, AdaBoost, Extreme Gradient Boosting (XGBoost), K-Neighbors, SVM with a linear kernel, Naive Bayes, and deep neural networks. The RF algorithm outperformed the other machine learning classifiers, reaching 99%.

In [20], highly dense MRI features were used to predict BS. The DL models were trained on two different datasets, one for ischemic stroke and one for hemorrhagic stroke. The best BS prediction results were obtained with the multiparameter model.

In [21], a novel lightweight model was developed for Alzheimer’s disease detection. The model uses the vision transformer (ViT) technique to overcome a limitation of traditional convolutional neural networks, namely their inability to capture global features. It includes an adaptive token fusion technique, which removes unnecessary tokens and improves computational speed, and it also performs randomized learning regularization. The diagnostic accuracy of the proposed model shows a 3% improvement over the normal cognitive model.

In [22], transformer models are applied to tuberculosis (TB) datasets of CT and X-ray images to overcome two major shortcomings of disease classification: computational cost and overfitting. The authors introduced a patch reduction task that removes unnecessary tokens to improve efficiency, and they used a token fusion technique to avoid unnecessary calculations. They also proposed a randomized classifier to counter overfitting when applying large pretrained models. The model was evaluated on both private and public datasets, improving adaptability and readability for more efficient and accurate TB diagnosis.

The related works use various deep learning methods that focus on accuracy in classifying and segmenting brain strokes. Similarly, the proposed network aims to enhance accuracy by using advanced segmentation and classification techniques. However, a significant restriction is the lack of direct comparison and integration among these diverse techniques, which hinders the development of a more robust and generalized model. Additionally, there is insufficient focus on real-world applicability, scalability, and generalization across different populations and imaging modalities. The variability in datasets, imaging techniques, and evaluation metrics further complicates a holistic assessment. To overcome these issues, a novel deep learning-based S3ET-NET is proposed for BS detection.

The motivation for each model in the pipeline is as follows:

GhostNet is a lightweight Convolutional Neural Network (CNN) architecture designed to generate more features with fewer parameters and lower computation, while maintaining high accuracy. It eliminates unnecessary convolution filters that don’t add unique information. It is more suitable for deployment on low-power devices. Despite being lightweight, it captures fine-grained variations important for detecting small lesions in MRI.

Since GhostNet generates diverse features, the most relevant and discriminative ones must be selected to remove redundancy. This is done by the salp shuffled shepherd optimization algorithm, which combines the salp swarm and shuffled shepherd optimization algorithms. The combined algorithm runs faster and uses less memory. It keeps only useful patterns and discards noisy features to provide better classification.

The compound scaling strategy of EfficientNet plays the major role in the classification process, improving accuracy without overfitting or adding unnecessary complexity. It is well-suited to limited data, since it uses fewer parameters and provides high precision. It also performs well on robust data. The squeeze-and-excitation blocks in EfficientNet help the network focus on important regions, which leads to better accuracy.

Proposed method

This research proposes a novel deep learning-based S3ET-NET model for the early detection of BS from MRI images. The dataset used in this study consists of MRI images taken from Kaggle. A GB filter pre-processes the MRI images to reduce noise in the input images. The relevant features are extracted using the GhostNet model, and the optimal features are chosen using the S3O algorithm. The EfficientNet model then classifies the images as normal, IS, or HS. Fig. 1 shows the overall workflow of the proposed method.

Fig. 1

Overall workflow of the proposed S3ET-NET. Pre-processing is done by a Gaussian bilateral filter to reduce noise while keeping important edges. Features are extracted using GhostNet and optimized using salp shuffled shepherd optimization. Finally, images are classified as normal, ischemic stroke, or hemorrhagic stroke using the deep learning-based EfficientNet.

Dataset description

The dataset has been carefully selected for stroke research and consists of diverse medical images. In this research, 615 brain MRI images collected from Kaggle are used. The images are categorized into three groups: the ‘Normal’ category has 399 images, the ‘Ischemic stroke’ category has 30 images, and the ‘Hemorrhagic stroke’ category has 186 images. The dataset is described in Table 1. Each image measures 224 × 224 pixels. This uniform size gives the proposed model standard input dimensions and aids efficient processing and analysis.

Table 1 Dataset description.

The classification dataset for the proposed model was sourced from various medical databases, including clinical and public health sources. It was prepared by normalizing pixel values and dividing the dataset into training and validation sets. The ‘Normal’ category contains normal brain scans, while the ‘Ischemic’ and ‘Hemorrhagic’ categories represent the two main types of stroke. This careful preparation and categorization provides a robust and varied learning base, enhancing the model’s accuracy and generalizability. To process the image data, a dictionary is first created to map class names to integer labels, along with empty lists to store the images and labels. Next, the directory for each class is opened, and each image file is loaded and converted to an array. The image array is appended to the list of images and the associated label to the list of labels. The list of images is then converted to a NumPy array, and the pixel values are normalized to the range [0, 1]. The label list is converted to a one-hot encoded NumPy array. The resulting image and label arrays are then ready for training.
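The preparation steps described above can be illustrated with a minimal sketch. It is not the authors’ code: it uses plain Python lists instead of NumPy arrays and small in-memory toy images instead of files read from class directories, and the class names, integer codes, and function names are assumptions made only for illustration.

```python
# Hypothetical class names; the integer codes are an arbitrary choice.
CLASS_TO_LABEL = {"normal": 0, "ischemic": 1, "hemorrhagic": 2}


def normalize(image):
    """Scale 8-bit pixel values (0-255) into the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def one_hot(label, num_classes):
    """Return a one-hot list with 1.0 at index `label`."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def prepare(samples):
    """samples: list of (class_name, image) pairs; image is rows of 0-255 ints."""
    images, labels = [], []
    for class_name, image in samples:
        images.append(normalize(image))
        labels.append(one_hot(CLASS_TO_LABEL[class_name], len(CLASS_TO_LABEL)))
    return images, labels


# Two toy 2x2 "images" standing in for loaded MRI slices.
toy = [("normal", [[0, 255], [51, 102]]), ("hemorrhagic", [[255, 255], [0, 0]])]
images, labels = prepare(toy)
```

In a real pipeline the toy list would be replaced by images loaded from disk, and the nested lists by arrays, but the label mapping, normalization, and one-hot steps stay the same.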

Data pre-processing

In this section, the Gaussian bilateral (GB) filter is used to enhance image quality and reduce noise in the input images during pre-processing. The GB filter removes noise effectively while offering better edge preservation and smoothing. It is used here to diminish the additive noise in MRI scans; a plain Gaussian filter, depending on its settings, may cause non-uniform image distortion. Gaussian smoothing produces a more visually pleasing result by reducing high-frequency information and softening abrupt transitions. The image is blurred according to a Gaussian distribution, and in practice the Gaussian function is estimated with a discrete approximation. Noise is introduced into the images during acquisition, and this noise can cause false detections. Therefore, to identify stroke lesions correctly, the noise must be eliminated, and the proposed filtering can greatly improve image quality. The bilateral filter output for an input image I and a guidance image G is given in Eq. (1).

$$f\left(t\right) = \sum_{v} W_{t,v}^{g}\left(G\right) I_{v}$$
(1)

Here, \(f\left(t\right)\) is the output of the bilateral filter at position \(t\), and \(W_{t,v}^{g}\left(G\right)\) is the bilateral filter weight between the pixels at positions \(t\) and \(v\), which depends on the guidance image \(G\) and is expressed in Eq. (2).

$$W_{t,v}^{g}\left(G\right) = \frac{1}{L_{t}} \exp\left[ -\left\| \frac{t - v}{-\sigma_{s}^{2}} \right\|^{2} \right]$$
(2)

where \(L_{t}\) represents the normalizing factor. In Eq. (2), the Gaussian spatial kernel is \(\exp\left[ -\left\| \frac{t - v}{-\sigma_{s}^{2}} \right\|^{2} \right]\), and \(\sigma_{s}^{2}\) controls the influence of the differences between the pixels. The GB kernel is expressed in Eq. (3). \(G^{-}\) is obtained from Eqs. (1) and (3), and \(\exp\) is the range kernel.

$$GB_{kernel} = W_{t,v}^{gb}$$
(3)
$$fin\left(o\right) = \sum_{v} W_{t,v}^{gbf}$$
(4)

The GB filter’s final output \(fin\left(o\right)\) is given by Eq. (4). The DL-based GhostNet takes the noise-free images as input to extract the crucial features needed to categorize brain stroke cases.
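To make the idea of edge-preserving smoothing concrete, the following is a minimal sketch of a standard bilateral filter in pure Python. It is an illustration under stated assumptions rather than the filter used in the paper: the image serves as its own guidance, the spatial and range kernels use the common \(\exp(-x^{2}/2\sigma^{2})\) form, and the parameter names `radius`, `sigma_s`, and `sigma_r` are hypothetical.

```python
import math


def bilateral_filter(image, radius=1, sigma_s=1.0, sigma_r=25.0):
    """Edge-preserving smoothing of a 2-D grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weighted_sum, norm = 0.0, 0.0  # norm plays the role of L_t
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # skip neighbours outside the image
                    # Spatial kernel: nearby pixels get larger weights.
                    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    # Range kernel: similar intensities get larger weights.
                    diff = image[ny][nx] - image[y][x]
                    rng = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    weighted_sum += spatial * rng * image[ny][nx]
                    norm += spatial * rng
            output[y][x] = weighted_sum / norm
    return output


# A sharp vertical edge: the filter should leave it essentially intact.
edge = [[0, 0, 200, 200] for _ in range(4)]
smoothed = bilateral_filter(edge, radius=1, sigma_s=1.0, sigma_r=10.0)
```

Because the range kernel gives almost zero weight to pixels on the other side of a large intensity jump, flat regions are averaged while the edge between them is preserved.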

Feature extraction

GhostNet is a neural network architecture designed for efficient inference on embedded and mobile hardware, reducing model size and computational complexity while maintaining accuracy. Rather than using ordinary convolution as in traditional networks, GhostNet uses depth-wise separable convolution (DWConv). A simple kernel is applied to the intrinsic features of each channel to form a two-stage convolutional layer. The architecture was developed mainly for image recognition and classification. For large input images, CNNs create a local receptive field for each hidden layer instead of using fully connected layers, which would take a very long time. A ghost unit is deployed in the CNN to reduce computation, enrich the features, and extract low-level features at multiple scales.

In particular, consider the parameters of a convolutional layer, \(H \in \mathbb{R}^{R_{s} \times R_{s} \times I_{s-1} \times K_{s}}\), where \(K_{s}\) is the number of kernels and \(I_{s-1}\) denotes the number of source channels. The function of the \(s\)th layer is determined by the linear model in Eq. (5), which combines the weights with the layer’s input. An input tensor \(P_{t-1} \in \mathbb{R}^{N_{s-1} \times N_{s-1} \times E_{s-1}}\) is given. As a result, an output feature map \(G_{r} \in \mathbb{R}^{J_{s} \times J_{s} \times E_{s}}\) is produced, which indicates the positions of the features that have been identified in the given image.

$$G_{s} = H * H_{s-1} + b_{s}$$
(5)

Equation (6) states that the input features are projected onto the active receptive field while convolutional kernels with gradients are applied to the input images. In this case, the spatial values are represented as \(B' = B' - \left[L_{t-1}/2\right]\), where \(t\), \(L\), \(B'\), and \(L'\) denote the indices across the spatial distributions of the input or output values and the weights, respectively.

$$c_{i}^{u,v,w} = \sum_{l',k',r'} F_{i}^{u,v',w'} * X_{i-1}^{u+u',\,v+v',\,w'} + bias_{i}^{i}$$
(6)

Several simple operations are denoted as \(Q_{r} \in P^{E_{s} \times E_{s} \times I_{s-1} \times K_{s}}\); the final feature maps \(H_{r}\) are obtained as ghosts by transforming a set of intrinsic feature maps \(G_{r}\). Integration of the equation results in \(L'_{r} L_{r}\). Thus, to determine the precise \(L_{r}\) and to generate \(M\) ghost features, several low-cost linear operations are applied to every intrinsic feature of \(H_{r}\) in map format, as indicated by the following equation:

$$H_{r}^{u,v} = \phi_{u,v}\left({S'}_{r}^{u}\right), \quad \forall u = 1, \dots, L'_{l}, \quad \forall v = 1, \dots, M'_{l}$$
(7)

where \(\phi_{u,v}\) is the \(v\)th linear function, which derives the \(v\)th ghost feature map \(H_{r}^{u,v}\) from the \(u\)th intrinsic feature map, and \({S'}_{r}^{u} \in H_{l}^{u}\) is the \(u\)th intrinsic feature map. In total, \(K_{r} = M * L'_{r}\) feature maps are thus obtained. The next block receives the size of each paired layer that is merged, as well as the volume of each point layer. Finally, the training process successfully extracts diverse features from BS images.
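The ghost-feature idea in Eq. (7), where a few intrinsic feature maps are expanded with cheap linear operations, can be sketched as follows. This is a toy illustration, not GhostNet itself: the two cheap operations (an identity map and a 1 × 3 mean filter) and all function names are assumptions, whereas the real network learns depth-wise convolution kernels.

```python
def identity(fmap):
    """Cheap op 1: keep the intrinsic map unchanged."""
    return [list(row) for row in fmap]


def smooth_rows(fmap):
    """Cheap op 2: 1x3 mean filter along each row, with replicated edges."""
    out = []
    for row in fmap:
        padded = [row[0]] + list(row) + [row[-1]]
        out.append([(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
                    for i in range(len(row))])
    return out


def ghost_features(intrinsic_maps, cheap_ops):
    """Apply every cheap op to every intrinsic map and stack the results."""
    return [op(fmap) for fmap in intrinsic_maps for op in cheap_ops]


# Two toy intrinsic maps and M = 2 cheap ops give 2 * 2 = 4 output maps.
intrinsic = [[[1.0, 2.0, 3.0]], [[0.0, 6.0, 0.0]]]
ghosts = ghost_features(intrinsic, [identity, smooth_rows])
```

The number of output maps equals the number of cheap operations times the number of intrinsic maps, which mirrors the relation \(K_{r} = M * L'_{r}\) stated above.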

Feature selection

In this research, the S3O method is employed to select the optimal features for brain stroke detection from MRI sequences. The S3O algorithm is an advanced metaheuristic optimization technique that combines the behavior of salps with shepherd optimization mechanisms. It is inspired by the chain-like movement of salps in the ocean. The salps are divided into leaders and followers: leaders guide the movement, and followers adjust their positions based on the leaders. The shepherd mechanism helps guide the salps to promising regions of the search space and prevents premature convergence.

Notations:

  1. \(D = \left\{b_{1}, b_{2}, \dots, b_{x}, \dots, b_{N}\right\}\): an MRI dataset with N images.

  2. \(Q = \left(q_{1}, q_{2}, q_{t}, q_{s}\right)\): extracted grids or patches from images.

  3. \(R_{g}\): position of the \(g\)th salp in feature space.

  4. \(t_{op}^{\left[i\right]}\): true output for sample i.

  5. \(Q^{\left[i\right]}\): model prediction for sample i.

  6. \(k\): number of samples for fitness evaluation.

  7. \(\gamma^{th}\): dimension index.

To apply S3O to accurate brain stroke feature selection, the follower update equation is modified. The algorithmic phases of the proposed S3O are as follows:

Algorithm

Input: Feature space dimension, population size, max iterations.

Output: Optimal feature subset.

  1. The salp population is initialized as \(R_{g}\ \left(g = 1, 2, \dots, n\right)\), where \(R_{g}\) denotes the position of a salp in the n-dimensional space.

  2. The distance between the source and the salp position is measured to construct the fitness function.

    $$\text{Fitness}\left(R_{g}\right) = \frac{1}{k} \sum_{i=1}^{k} \left( t_{op}^{\left[i\right]} - Q^{\left[i\right]} \right)^{2}$$
    (8)

  3. Identify the best salp as the leader, and update the leader position. The updated position of the leading salp is:

    $$d_{1}^{\gamma} = \begin{cases} r^{\gamma} + g_{1}\left(\left(up^{\gamma} - jo^{\gamma}\right) g_{2} + jo^{\gamma}\right), & g_{3} \ge 0 \\ r^{\gamma} - g_{1}\left(\left(up^{\gamma} - jo^{\gamma}\right) g_{2} + jo^{\gamma}\right), & g_{3} < 0 \end{cases}$$
    (9)

    Here, \(r\) represents the position of the brain stroke in the \(\gamma^{th}\) dimension. The designations \(jo\), \(up\), and \(d_{1}^{\gamma}\) correspond to the lower bound, the upper bound, and the position of the first salp, respectively. Among the coefficients, \(g_{1}\) is the one typically employed to balance exploitation and exploration.

  4. For each follower, update the position:

    $$d_{z}^{\gamma} = \frac{1}{2} jo^{2} + f_{o} j$$
    (10)

    Using the average update,

    $$d_{Q}^{\gamma + 1} = \frac{1}{2}\left[ d_{Q}^{\gamma} + d_{Q - 1}^{\gamma} \right]$$
    (11)

  5. The movement vector update is given by

    $$d_{Q}^{\gamma + 1} = \sigma * r \circ \left(d_{j}^{\gamma} - d_{Q}^{\gamma}\right) + \rho\, r \circ \left(d_{Q}^{\gamma} - d_{Q}^{\gamma}\right)$$
    (12)
    $$d_{Q}^{\gamma + 1} = \frac{2 r \circ \left(\rho + \sigma\right)}{2 r \circ \left(\rho + \sigma\right)}$$
    (13)

    Here, \(d_{Q}^{\gamma + 1}\) represents the updated position of the follower salp, \(r\) is the random number used in the movement, \(\gamma\) represents the follower salp index, \(\rho\) and \(\sigma\) are the coefficients that control the influence of the different terms in the movement vector, and the \(z^{th}\) position in the dimension is represented as \(D_{k}^{\gamma + 1}\).

  6. Repeat until the maximum number of iterations is reached or the search converges. Return the salp with the best fitness as the optimal feature subset.
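The fitness, leader, and follower steps above can be sketched in a few lines of Python. This is a simplified illustration, not the authors’ S3O implementation: it covers only the salp-swarm part (Eqs. 8, 9, and 11), omits the shepherd step, draws \(g_{2}\) and \(g_{3}\) uniformly from [0, 1] and switches sign at 0.5 as is common in salp swarm implementations (an assumption), and clamps positions to the bounds. All function names are hypothetical.

```python
import random


def fitness(targets, predictions):
    """Mean squared error between true outputs and predictions (Eq. 8)."""
    k = len(targets)
    return sum((t - q) ** 2 for t, q in zip(targets, predictions)) / k


def update_leader(best, lower, upper, g1, rng):
    """Move the leader around the best-known position (Eq. 9)."""
    new_position = []
    for r, lo, up in zip(best, lower, upper):
        g2, g3 = rng.random(), rng.random()
        step = g1 * ((up - lo) * g2 + lo)
        # Assumption: g3 is uniform in [0, 1], so 0.5 splits the two cases.
        value = r + step if g3 >= 0.5 else r - step
        new_position.append(min(max(value, lo), up))  # clamp to the bounds
    return new_position


def update_follower(current, previous):
    """Average a follower with the salp ahead of it in the chain (Eq. 11)."""
    return [0.5 * (c + p) for c, p in zip(current, previous)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
leader = update_leader([0.5, 0.5], [0.0, 0.0], [1.0, 1.0], g1=1.0, rng=rng)
follower = update_follower([0.0, 1.0], [1.0, 1.0])
```

For feature selection, a continuous position such as `leader` would still need to be mapped to a subset of features, for example by thresholding each coordinate; the article does not specify that mapping, so it is left out here.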

The step-by-step procedure of the proposed method is shown in Fig. 2.

Fig. 2

Overall process of the proposed S3O algorithm. The flowchart begins with parameter initialization and brain stroke identification. Leader and Follower positions are updated iteratively to explore possible stroke regions. Regions are classified as stroke or non-stroke based on the fitness threshold. A shuffling and update mechanism refines the search until the termination condition is met, after which the final stroke position is reported.

Augmentation techniques

The extracted images are taken as input in the augmentation phase, where they are augmented to increase the number of images available for training. Shifting, flipping, shearing, and random rotation are applied. Random rotation turns the image clockwise or counterclockwise by some number of degrees, which changes the location of the object in the image. Image shift is a geometric transformation that moves every object in the image to a new location. Shearing displaces the image along the horizontal or vertical axis, so that the transformed image takes the shape of a parallelogram. Images can be flipped in both the vertical and horizontal directions, and additional images are produced by rotating in multiples of 90 degrees.
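A few of these transformations can be sketched on a toy image stored as a list of rows. The function names and the zero fill value are assumptions made for illustration; shearing and arbitrary-angle rotation need interpolation and are omitted.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels` (0 <= pixels <= width), padding with `fill`."""
    width = len(image[0])
    return [[fill] * pixels + list(row[:width - pixels]) for row in image]


toy = [[1, 2],
       [3, 4]]
augmented = [flip_horizontal(toy), flip_vertical(toy),
             rotate_90_clockwise(toy), shift_right(toy, 1)]
```

Each transformed copy keeps the label of the original image, so the training set grows without new annotations.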

Classification

In this section, the selected images are classified using EfficientNet. The noise-free images are fed into the deep learning-based EfficientNet to classify the BS cases. The EfficientNet model aims to deliver exceptional performance on image classification tasks while using minimal computational power. To balance model dimensions with performance, it combines compound scaling with an efficient model design. The network consists of repeated blocks made up of activation, batch normalization, convolutional, and pooling layers. The general structure of the deep learning-based EfficientNet is shown in Fig. 3.

Fig. 3

Proposed Architecture of Efficient Net. The model begins with a 3 × 3 convolution layer followed by multiple Mobile Bottleneck (MB Conv) layers with varying kernel sizes (1 × 1, 3 × 3, and 5 × 5) across three parallel branches. These branches progressively extract multiscale features. The outputs are then merged and passed through 1 × 1 convolution, pooling, and fully connected (FC) layers. The final FC layer produces the classification output.

EfficientNet raises the model’s computational efficiency by reducing the dimensionality and floating-point computing cost. Compound scaling, which simultaneously expands the network’s depth, width, and resolution, allows many versions of EfficientNet to be created. The scale is adjusted using the scaling coefficients \(l, m, n\) and a user-defined scaling parameter \(\phi\). Scaling the three interrelated model hyperparameters, depth \(l\), width \(m\), and resolution \(n\), with the weighted scheme defined in Eq. (14) is known as triple scaling.

$$l = d^{\phi}, \quad m = e^{\phi}, \quad n = f^{\phi}$$
(14)
$$d \ge 1, \quad e \ge 1, \quad f \ge 1$$

where the constants \(d, e, f\) determine how the network’s depth, width, and resolution are scaled. Larger and more powerful models are produced by higher \(\phi\) values, whereas smaller and more efficient models are produced by lower values. A grid search optimizes the coefficients \(d, e, f\) under the following constraint:

$$d^{1} e^{2} f^{2} \approx 2$$
(15)

With \(d \ge 1, e \ge 1, f \ge 1\), EfficientNet obtains optimal values for \(d, e, f\) under the constraint specified in Eq. (15). Varying the coefficient \(\phi\) in Eq. (14) yields the scaled versions of EfficientNet. EfficientNet classifies BS features with a high success rate.
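As a worked example of Eqs. (14) and (15), the sketch below computes the depth, width, and resolution multipliers for a given \(\phi\). The default constants 1.2, 1.1, and 1.15 are the values reported for the baseline network in the original EfficientNet paper; the article does not state which values S3ET-NET uses, so they are an assumption here, as are the function names.

```python
def compound_scale(phi, d=1.2, e=1.1, f=1.15):
    """Return the (depth, width, resolution) multipliers l, m, n of Eq. (14)."""
    return d ** phi, e ** phi, f ** phi


def constraint_value(d=1.2, e=1.1, f=1.15):
    """Left-hand side of Eq. (15); it should be close to 2."""
    return d * e ** 2 * f ** 2


depth, width, resolution = compound_scale(phi=1)
budget = constraint_value()
```

Because the constraint keeps \(d \cdot e^{2} \cdot f^{2}\) near 2, each unit increase in \(\phi\) roughly doubles the computational cost of the scaled network.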

Results and discussion

In this phase, the experiments were carried out with a deep learning implementation in Python. The brain MRI images of the collected dataset are classified into three categories: normal, IS, and HS. The data come from a publicly available dataset, which also contains MR images gathered from healthy participants.

Fig. 4

Experimental result of the proposed S3ET-NET. Each row illustrates the processing pipeline for a brain MRI image: the original input, the pre-processed image after Gaussian bilateral filtering, the extracted feature map using GhostNet, and the final classification result. The model accurately differentiates between Normal, Ischemic Stroke, and Hemorrhagic Stroke cases based on the extracted deep features.

The outcomes of the developed S3ET-NET are displayed in Fig. 4. The patient’s MRI images are taken as input and passed to pre-processing. The pre-processed images are then passed to the feature extraction phase. Finally, the extracted features are used to produce the classification output: normal, IS, or HS.

Performance analysis

The evaluation metrics are computed from four basic quantities: True Positive (\(Tu{P}_{V}\)), True Negative (\(Tu{N}_{G}\)), False Positive (\(Fe{P}_{V}\)), and False Negative (\(Fe{N}_{G}\)).

Accuracy:

The proportion of total correct predictions (both positive and negative) out of all predictions made.

$$\:Accuracy=\frac{Tu{P}_{V}+Tu{N}_{G}}{Tu{P}_{V}+Tu{N}_{G}+Fe{P}_{V}+Fe{N}_{G}}\times\:100$$
(16)

Precision:

The proportion of correctly predicted positive samples out of all predicted positives. It reflects how many of the predicted positives are correct.

$$\:Precision=\frac{Tu{P}_{V}}{Tu{P}_{V}+Fe{P}_{V}}$$
(17)

Recall:

The proportion of actual positive samples that were correctly identified. It measures how well the model captures true positives.

$$\:Recall=\frac{Tu{P}_{V}}{Tu{P}_{V}+Fe{N}_{G}}$$
(18)

Specificity:

The proportion of actual negative samples that were correctly classified. It measures how well the model avoids false positives.

$$Specificity = \frac{Tu{N}_{G}}{Tu{N}_{G} + Fe{P}_{V}}$$
(19)

F1-Score:

The harmonic mean of precision and recall. It balances the trade-off between false positives and false negatives.

$$\:F1-Score=\frac{2}{\frac{1}{\left(Recall\right)}+\frac{1}{\left(Precision\right)}}$$
(20)
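Equations (16) to (20) translate directly into code. The sketch below is illustrative only: the function name and the counts passed to it are hypothetical and are not taken from the paper’s results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (16)-(20) from the four confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn) * 100,  # percentage
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 / (1 / recall + 1 / precision),  # harmonic mean
    }


# Hypothetical counts for one class treated as "positive".
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
```

For a three-class problem, the counts are obtained per class by treating that class as positive and the other two as negative.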
Table 2 Performance metrics of the proposed Model.

Table 2 shows the effectiveness of the proposed model in classifying brain MRI images as normal, ischemic stroke, or hemorrhagic stroke. The proposed S3ET-NET attains a 99.41% accuracy rate. The bold values represent the highest result for each metric.

For Normal cases, the model achieves high accuracy (99.41%) with strong precision (96.12%) and recall (96.44%), indicating that the model reliably distinguishes healthy patients with minimal false positives or negatives.

The Ischemic Stroke class, typically more challenging due to subtle lesion appearances, still achieves a high accuracy (99.42%) along with balanced precision (97.24%) and recall (96.35%), demonstrating the model’s robustness in detecting this stroke type.

The model performs slightly better on Hemorrhagic Stroke, with an accuracy of 99.35% and an F1-score of 98.93%, indicating a strong balance between precision (98.25%) and recall (97.33%).

Across all classes, specificity remains above 96%, further confirming that the model avoids misclassifications between stroke types. These results suggest that S3ET-Net generalizes well and maintains consistent performance across all classes, without favoring the majority class.

The average performance across all classes is computed with macro averaging, which gives equal weight to each class. These macro averages represent the model’s overall balance across classes, regardless of class size. Table 3 shows the average performance across all classes.
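The effect of macro averaging can be seen in a small sketch. The confusion matrix below is hypothetical and is not taken from the paper; rows are true classes and columns are predicted classes, and the function names are assumptions.

```python
def per_class_recall(confusion):
    """Recall of each class: diagonal entry divided by its row total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def macro_average(values):
    """Unweighted mean, so every class counts equally."""
    return sum(values) / len(values)


# Hypothetical counts with an imbalanced class distribution.
confusion = [[8, 2, 0],
             [1, 4, 0],
             [0, 0, 5]]
recalls = per_class_recall(confusion)
macro_recall = macro_average(recalls)
overall_accuracy = sum(confusion[i][i] for i in range(3)) / sum(map(sum, confusion))
```

Here the macro-averaged recall and the overall accuracy differ because the largest class does not dominate the macro average.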

Table 3 Average performance across all Classes.
Fig. 5

Accuracy of the proposed S3ET-NET. Training and testing accuracy over 100 epochs. The model learns quickly within the first 20 epochs and achieves stable testing accuracy above 95%, indicating strong generalization.

Fig. 5 shows the accuracy curves, plotted over a predefined accuracy range and 100 epochs. The accuracy of the proposed model for BS diagnosis increases as training proceeds through more epochs.

Fig. 6

Loss of proposed S3ET-NET. Training and testing loss over 100 epochs. Both losses decrease steadily, indicating effective learning, with minimal overfitting and stable convergence.

Fig. 6 plots the loss against the epochs and shows that the loss decreases as the proposed model trains for more epochs. The proposed model correctly distinguishes between the different brain stroke classes using brain MRI data. After 100 training epochs, it shows a minimal error rate and a detection accuracy of 99.41%.

Fig. 7

Confusion matrix of the proposed S3ET-NET. Normalized confusion matrix showing high classification accuracy across all stroke types, with minimal misclassification and strong model performance.

Fig. 7 shows the multi-class confusion matrix for the proposed model. The proposed S3ET-NET achieves a minimal error and better accuracy in brain stroke detection.

To evaluate the classification performance of the proposed S3ET-Net model, a Receiver Operating Characteristic (ROC) curve was generated. The ROC curve illustrates the trade-off between the true positive rate (sensitivity) and the false positive rate (1-specificity) across various threshold settings.

As shown in Fig. 8, the orange curve represents the mean ROC performance of the model, while the shaded region indicates ± 1 standard deviation, reflecting the variability in performance across different runs or subsets. The Area Under the Curve (AUC) was approximately 97% ± 2%, indicating the model’s excellent discriminative ability. The curve’s proximity to the top-left corner suggests that S3ET-Net achieves a high recall with a very low false positive rate, demonstrating strong robustness and generalization capability.
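For readers unfamiliar with how an ROC curve and its AUC are obtained, the sketch below computes both for a binary problem using the trapezoidal rule. It is illustrative only: the labels and scores are made up, the function names are assumptions, and a three-class model such as S3ET-Net would need a one-vs-rest treatment that the article does not detail.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs found by sweeping the decision threshold.

    Requires at least one positive (1) and one negative (0) label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels (1 = stroke) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
```

Repeating this computation over several runs or folds and averaging the curves gives the mean curve and the standard deviation band of the kind reported above.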

Fig. 8

ROC curve for the proposed S3ET-Net model with ± 1 standard deviation. The model demonstrates strong classification performance with an AUC of approximately 97% ± 2%.

Ablation study

To evaluate the effectiveness of individual components within the proposed classification pipeline, an ablation study was conducted by systematically replacing or removing the GhostNet, S3O, and EfficientNet modules.

GhostNet was selected for its ability to extract rich feature representations using fewer parameters and lower computational complexity, making it well-suited for capturing fine-grained variations in stroke MRI images. The S3O algorithm was employed to optimize the feature selection process by identifying the most relevant features from the extracted set. Finally, EfficientNet was used for its scalable architecture and high classification accuracy in distinguishing between different stroke types.

The primary objective of this study is to understand how each module contributes to the overall performance of the proposed model. By selectively removing each component, we measured the impact on key evaluation metrics such as accuracy, precision, recall, specificity, and F1-score. The experiments were conducted on a publicly available brain stroke MRI dataset. The results, summarized in Table 4, clearly demonstrate the importance of each module in enhancing the model’s classification performance.
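The evaluation metrics named above can all be derived from a multi-class confusion matrix. The sketch below shows one way to do so, treating each class in turn as the positive class. The function name, the class order, and the counts in the toy matrix are assumptions for illustration and are not the values reported in this study.

```python
def per_class_metrics(cm):
    """Compute overall accuracy and per-class metrics.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k missed
        fp = sum(cm[i][k] for i in range(n)) - tp   # others labelled as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        })
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return accuracy, results


# Hypothetical counts; rows and columns ordered as normal, IS, HS.
toy_cm = [
    [50, 0, 0],
    [2, 45, 3],
    [1, 4, 45],
]
accuracy, metrics = per_class_metrics(toy_cm)
print(f"accuracy = {accuracy:.4f}")
for name, m in zip(["normal", "IS", "HS"], metrics):
    print(name, {key: round(value, 4) for key, value in m.items()})
```

Reporting the metrics per class in this way makes it visible when an underrepresented class performs worse than the overall accuracy suggests.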

Table 4 Ablation study of the proposed S3ET-Net model, analyzing the impact of each module (GhostNet, S3O, EfficientNet) on performance metrics. The full configuration outperforms all ablated versions across accuracy, precision, recall, specificity, and F1-score, confirming the critical role of each component.

Training configuration

The S3O was initialized with a population of 30 and run for 100 iterations. The search space of size 1024 was defined by the GhostNet-extracted features. A linearly decaying control parameter $g_1$ guided the balance between exploration and exploitation. The EfficientNet classifier was trained with the Adam optimizer (learning rate = 0.0001), a batch size of 16, and categorical cross-entropy loss. A dropout of 0.3 was applied to the fully connected layers. Training was conducted for 50 epochs, and the learning rate was adjusted using a ReduceLROnPlateau strategy. The model input was resized to 224 × 224 × 3 pixels. These settings are summarized in Table 5.
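The two schedules mentioned in this configuration can be illustrated with a short pure-Python sketch: a linear decay for the control parameter, and a reduce-on-plateau rule for the learning rate. The population size, iteration count, learning rate, batch size, dropout, and epoch count are taken from the text; the decay endpoints (1.0 to 0.0), the reduction factor, the patience, and the minimum learning rate are hypothetical values chosen only for illustration, and the scheduler class is a simplified stand-in rather than the library callback itself.

```python
# Values stated in the text.
CONFIG = {
    "s3o_population": 30,
    "s3o_iterations": 100,
    "feature_dim": 1024,
    "learning_rate": 1e-4,
    "batch_size": 16,
    "dropout": 0.3,
    "epochs": 50,
}


def linear_decay(iteration, max_iter, start=1.0, end=0.0):
    """Linearly interpolate from start (iteration 0) to end (iteration max_iter).

    The start and end values are assumptions; the paper does not state them.
    """
    return start + (end - start) * iteration / max_iter


class PlateauScheduler:
    """Simplified reduce-on-plateau logic (illustrative, not the Keras callback)."""

    def __init__(self, lr, factor=0.5, patience=3, min_lr=1e-6):
        self.lr = lr
        self.factor = factor      # hypothetical reduction factor
        self.patience = patience  # hypothetical number of stalled epochs
        self.min_lr = min_lr      # hypothetical lower bound
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Update the learning rate from the latest validation loss."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


# Control parameter at the start, middle, and end of the S3O run.
g1_values = [linear_decay(t, CONFIG["s3o_iterations"]) for t in (0, 50, 100)]

# Learning rate after a short, made-up sequence of validation losses.
scheduler = PlateauScheduler(CONFIG["learning_rate"], patience=2)
for loss in [1.0, 0.9, 0.95, 0.96]:
    current_lr = scheduler.step(loss)
```

A large control parameter early in the search favours exploration of the feature space, and its decay shifts the search toward exploitation; the plateau rule lowers the learning rate only when the validation loss stops improving.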

Table 5 Overview of key hyperparameters used for feature selection with S3O and classification with EfficientNet.

Comparative analysis

The proposed S3ET-NET was compared with existing models to verify that it produces results with high accuracy. The performance of the proposed model with various deep learning classifiers, such as Link Net, Mobile Net, Google Net, and the proposed Ghost Net, was analyzed, and the results are shown in Table 6.

Table 6 Comparative analysis of existing models.

The developed model attains an accuracy of 99.41%, showing better performance than the traditional DL networks. Compared with Link Net, Mobile Net, and Google Net, the proposed Ghost Net improves detection accuracy by 1.16%, 1.94%, and 3.14%, respectively.

Table 7 Comparative analysis between existing DL networks and the proposed model.

The comparison in Table 7 covers existing algorithms that were trained on MRI scans to predict BS. The proposed S3ET-NET outperforms ResNet50, OzNet-mRMR-NB, and DNN in accuracy by 3.20%, 5.22%, and 4.21%, respectively. With an accuracy of 99.41%, the proposed model is more effective than these previously reported approaches.

The dataset is publicly available on Kaggle. Implementation details have been described in the manuscript, and code snippets will be made available upon request for research purposes.

Conclusion

A novel DL-based S3ET-NET model is developed for the early detection of BS from MRI brain images. A GB filter pre-processes the MRI images to reduce the noise distortion in the input images. The Ghost Net model is employed for deriving suitable features from the improved images. The S3O algorithm is used to choose the optimal features from the derived features. The Efficient Net model is used for classifying brain stroke cases, including normal, IS, and HS. According to the results, the proposed S3ET-NET attains an accuracy of 99.41%. Compared with Link Net, Mobile Net, and Google Net, the proposed Ghost Net improves detection accuracy by 1.16%, 1.94%, and 3.14%, respectively. The proposed S3ET-NET outperforms ResNet50, OzNet-mRMR-NB, and DNN in accuracy by 3.20%, 5.22%, and 4.21%, respectively. Although the proposed model works well, it has some limitations in handling data imbalance. While the dataset serves as a useful resource for training stroke classification models, it has several limitations. These include a strong class imbalance, particularly the underrepresentation of Ischemic stroke cases, a limited number of labelled images, and the absence of clinical metadata. This imbalance may bias the model toward the dominant classes (Normal and Haemorrhagic), resulting in poor performance on Ischemic cases. In the future, data augmentation and advanced DL models will be added to the developed system to increase the success rate of brain stroke detection.