Introduction

The brain contains billions of cells, each with a specialized function. Normally, cells divide to form new ones as they are needed for proper functioning1, and the new cells replace the old ones. When damaged or aged cells do not die as they should, and new cells keep forming although they are not needed, the surplus cells build up into a lump of tissue known as a tumor2. A tumor inside the brain can disrupt the sensitive functions that the affected region controls, causing symptoms such as memory loss and headaches. Because of its location and its capacity to spread, this type of tumor is difficult to treat3.

Not all brain tumors are cancerous: they can be either malignant or benign. A benign tumor has a uniform structure and does not contain cancer cells, whereas a malignant tumor has a non-uniform structure and contains cancer cells4. Gliomas, a class of brain tumors whose incidence is growing, are an increasing cause of human mortality. There are two kinds of gliomas: high-grade gliomas and low-grade gliomas. The former are characterized by a high growth rate, while the latter grow more slowly. For patients with these types of tumors, survival is usually not more than 2 years5.

To save lives, a brain tumor must be diagnosed in its early stages, with high accuracy and speed. Magnetic resonance imaging (MRI) can be used to visualize suspicious regions in the brain6, and tumors are identified by interpreting these complex medical images. Unfortunately, this interpretation usually requires experts and is time-consuming and challenging; for instance, manual segmentation by multiple experts can take a long time7. The outcome also varies between observers: a different doctor may produce a different segmentation of the same image. Automatic segmentation is therefore one of the most important tools for detecting brain tumors. Unfortunately, a fully reliable automatic method is not yet available8.

Many research projects are investigating the use of artificial intelligence to detect brain tumors, which could reduce the manual effort involved in diagnosis. The precise location and size of a brain tumor are among the most important factors to be analyzed9. Computed tomography (CT) and MRI are the imaging techniques commonly used to examine the structure of the brain and of a tumor. CT scans expose the human body to harmful radiation, whereas MRI provides an accurate depiction of the anatomical details of the brain without it10. MRI uses a combination of magnetic fields and radio waves to create detailed images. Because these images are complex, specialists must analyze them thoroughly to ensure the quality of the diagnosis11.

MRI is commonly used in the diagnosis and treatment of various brain disorders, such as schizophrenia and Alzheimer’s disease. Tissue atrophy is a common biomarker for these and other neurological conditions12, so quantifying it is important for their diagnosis and treatment. For instance, segmentation is needed to measure the change in brain structures13. The precise localization of abnormal tissue and of the surrounding healthy structures during treatment is also very important14. Qualitative and quantitative analysis of tissue atrophy is also performed in clinical trials, which usually study the effects of a treatment on a group of patients15.

The choice of a machine learning technique for detecting brain tumors is largely determined by the implementation goals. Recurrent Neural Networks, whose architecture features interconnected neurons with time-delayed inputs and reusable outputs, are suited to tasks such as time series analysis, which is not typically relevant for brain tumor detection from static images. Although reinforcement learning holds potential for healthcare, its application in brain tumor detection is currently limited by challenges such as data structuring, reward definition, and computational demands. Therefore, other machine learning paradigms are more commonly employed.

There are many factors to consider when choosing a deep learning or machine learning approach for detecting brain tumors. The number of features extracted from the medical images, the amount of data collected, and the class distribution all affect the effectiveness of these methods. For instance, high-dimensional feature spaces may require dimensionality reduction techniques, while limited datasets may benefit from data augmentation or transfer learning strategies. Furthermore, imbalanced datasets, where certain tumor types are under-represented, may necessitate specialized training procedures to avoid biased predictions. Because deep neural networks can learn spatial hierarchies of features, they have been applied successfully to brain tumor detection. Architectures such as ResNet, VGG, and Inception, pretrained on other image recognition tasks, can be fine-tuned for detecting brain tumors; such transfer learning helps address the challenge of limited dataset sizes. Data augmentation can also increase the diversity of the training data and the robustness of the model. The optimal approach depends on the attributes of the data and on the goal of the project, such as tumor classification or segmentation. In this paper, we present a method for designing and implementing a deep learning system that detects brain tumors. The method is based on ensemble learning, which can improve the model’s performance, and uses particle swarm optimization (PSO) to attain the best result.

Related works

Several studies have explored deep learning architectures for brain tumor analysis, demonstrating promising results. Kamnitsas et al.16 introduced a 3D deep architecture that outperformed previous methods by capturing both regional and contextual information within the tumor region.

Axel et al.17 employed a two-pathway CNN approach, incorporating Maxout layers and a softmax output. This model achieved efficient performance, requiring only approximately 20 min for analysis.

Pereira et al.18 utilized two distinct CNN architectures for classifying high-grade and low-grade tumors. Their approach involved analyzing 2D patches from four MRI channels, followed by post-processing to remove non-tumor regions. This method achieved an estimated processing time of around 10 min.

Havaei et al.19 presented a two-network architecture that considers inter-network connectivity. Their process included a post-processing step to remove flat patches near the skull and utilized a GPU for brain segmentation, achieving a processing time of approximately three minutes.

Shaik et al.20 proposed MANet, a multi-level attention network incorporating cross-channel and spatial attention, as well as temporal dependencies. Evaluated on BraTS and Figshare datasets, MANet demonstrated strong performance, achieving 96.51% accuracy on both datasets. These studies highlight the potential of deep learning for improving brain tumor analysis and diagnosis.

Proposed methodology for detecting glioma

This study aims to classify different tumor types using an imbalanced MRI image database. This database, while extensive, is less complex than those used by some AI tools designed for similar tasks. The study’s objective is to compare the performance of various deep learning networks for this classification task. To minimize training and implementation time, facilitating potential clinical applications, a streamlined network architecture was chosen. This “smart network” approach aims to provide an efficient solution suitable for everyday clinical use. The primary goal is to develop a system capable of accurate voxel-level segmentation. This process can be performed on various tissues, such as the necrotic center and oedema. Due to the varying intensity and boundaries of the images, it is very challenging for the system to identify infiltrated tumors, such as glioma.

Hence, in recent years, MRI has been used as a primary screening tool to assess the presence of a tumor and localize it, which helps the radiologist analyze its nature. Combining a deep belief network (DBN), a support vector machine (SVM), local binary patterns (LBP), independent component analysis (ICA), and particle swarm optimization (PSO) may increase the complexity of implementation. However, manual interpretation of MRI images for tumor segmentation can be time-consuming, complex, and prone to errors, potentially impacting treatment strategies, and existing methods may also lack efficacy for certain tumor types. In contrast, the proposed approach introduces a novel deep belief network and probabilistic neural network architecture designed for efficient and accurate tumor segmentation, addressing the limitations of traditional methods. To cope with this problem, we use a DBN together with an SVM to learn the glioma texture features for accurate classification. Our proposed framework is illustrated in Fig. 1.

Fig. 1

Deep learning and support vector machine fusion architecture for brain tumor classification.

Formulation of dataset and pre-processing function:

We collect the data set Es for the proposed model, which then performs brain MR tissue segmentation. The method is then applied to real brain MR images of 512 × 512 pixel size. After further processing, the resulting dataset is shown in the following equation.

$$E_{s} = \{ e_{1}, e_{2}, e_{3}, \ldots, e_{n} \}$$
(1)

The collection of images from the database is denoted Es, and the individual MR images are denoted e1, e2, e3, …, en. Preprocessing improves the quality of images before they are analyzed by human observers or machine vision systems. In particular, it reduces the noise level and improves the signal-to-noise ratio. For better signal-to-noise ratio and clarity, we used a modified version of the sigmoid function. Unnecessary features are removed so that the data can be processed properly. Besides the tumor (glioma) region itself, the preprocessed data retain information about other regions or features present in the image. These data are then fed into the feature mining steps to ensure that the extraction is accurate.
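The exact form of the modified sigmoid is not specified here. As a minimal sketch, assuming a logistic intensity mapping controlled by a gain and a cutoff (the names `gain` and `cutoff` and their default values are illustrative assumptions, not the settings used in this work), the enhancement step might look as follows:

```python
import math


def sigmoid_enhance(image, gain=10.0, cutoff=0.5):
    """Stretch the contrast of an 8-bit grayscale image with a logistic curve.

    `gain` and `cutoff` are hypothetical parameters chosen for illustration.
    """
    enhanced_image = []
    for row in image:
        new_row = []
        for pixel in row:
            x = pixel / 255.0                                  # normalize to [0, 1]
            y = 1.0 / (1.0 + math.exp(-gain * (x - cutoff)))   # logistic mapping
            new_row.append(round(255 * y))                     # back to 8-bit range
        enhanced_image.append(new_row)
    return enhanced_image


img = [[0, 64, 128], [160, 200, 255]]
enhanced = sigmoid_enhance(img)
```

Intensities below the cutoff are pushed toward black and those above it toward white, which increases the contrast around mid-gray values while preserving the ordering of the intensities.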

Texture feature extraction and dimension reduction for grading glioma detection

Feature extraction reduces the original data by identifying informative features that discriminate one sample from another. Intensity and texture features help in identifying brain tumors. MRI scans are used to identify different types of brain tumors, and radiologists use these images to distinguish tumors from healthy brain tissue. Texture features are useful for this purpose and can also be used to design an algorithm that discriminates between different types of brain tumors. However, classification often relies on a large number of features, which lengthens the computation and increases the complexity of the process, so it is important to reduce the number of features. Linear dimensionality reduction with PCA is not capable of separating the discriminative information embedded in a very high-dimensional feature space. In this work, Independent Component Analysis (ICA) is therefore used to reveal the hidden factors that underlie sets of random variables, and the reduced feature set acts as the input to the DBN-SVM tumor detection and classification.

In the detection of brain tumors, the DBN can automatically learn abstract features at various levels and obtain nonlinear feature descriptions. Here, the features are extracted as local binary patterns (LBP). The LBP is a local texture feature that takes advantage of the distribution and brightness of local patterns. Owing to its invariance to rotation and illumination, the LBP can achieve a high recognition rate. On its own, however, its recognition rate is limited by the high dimension, sparsity, and poor noise robustness of its features, which can prevent the network from reaching its global optimum. To improve the recognition rate and reduce the dimensionality, independent component analysis is used to improve the local gradient information and reduce the encoding length. The LBP + ICA features are then input into the DBN, which allows the system to recognize independent local texture features and ensures that the DBN obtains the most accurate and diverse information from brain images. Under changes in light intensity and rotation angle, these texture features still give a good recognition effect on the image.

Local binary pattern analysis

The local binary pattern (LBP) is a local descriptor, an operator that is invariant to variation in the illumination of an image. To describe texture features, the pixel neighborhood is defined first; it is usually characterized by the radius of the local neighborhood and by the number of pixels sampled uniformly on it.

The pixel’s feature information is computed by taking into account its neighborhood and the overall image set. The distribution of its features is shown in the formula (2).

$$Y(e_{v}) = y(e_{v}, e_{0}, \ldots, e_{q - 1})$$
(2)

The gray value of the central pixel is denoted \(e_{v}\), and the gray values of the q pixels on the annular neighborhood are denoted \(\{e_{i} \mid i = 0, \ldots, q - 1\}\). A feature measure is obtained by subtracting the gray value of the central pixel from the gray values of its neighbors. Assuming that these differences are independent of \(e_{v}\), the distribution factorizes as follows:

$$Y = \overline{y}(e_{v})\, y(e_{0} - e_{v}, e_{1} - e_{v}, \ldots, e_{q - 1} - e_{v})$$
(3)

Here \(\overline{y}(e_{v})\) carries the overall gray-level information of the image and the second factor carries the texture information. Variation in illumination changes \(\overline{y}(e_{v})\) considerably, whereas the differences \(\{e_{i} - e_{v} \mid i = 0, \ldots, q - 1\}\) vary much less. The gray-level factor is therefore discarded, and each gray difference between the center and the annular neighborhood is replaced by its sign, \(d(x) = 1\) for \(x \ge 0\) and \(d(x) = 0\) otherwise. This new measure of texture, which is not sensitive to illumination, is defined in Eq. (4):

$$Y(e_{v}) = y(d(e_{0} - e_{v}), d(e_{1} - e_{v}), \ldots, d(e_{q - 1} - e_{v}))$$
(4)

The value of the LBP pattern remains the same irrespective of the variations in its illumination.

$$LBP_{Q,T} = \sum\limits_{o = 0}^{q - 1} {d(e_{o} - e_{v} ) \times 2^{o} }$$
(5)
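The LBP code of Eq. (5) can be illustrated with a short Python sketch. It assumes the simplest configuration, eight neighbours at radius 1 on the pixel grid, with the sign function taken as 1 for non-negative differences; the neighbour ordering is a convention chosen here for illustration and is not specified in the text.

```python
def lbp_code(image, r, c):
    """LBP code of the interior pixel (r, c) using its 8 neighbours (Eq. 5)."""
    # Neighbours visited clockwise from the top-left corner (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[r][c]
    code = 0
    for o, (dr, dc) in enumerate(offsets):
        diff = image[r + dr][c + dc] - center
        d = 1 if diff >= 0 else 0        # sign of the gray difference
        code += d * (2 ** o)             # weight by a power of two
    return code


def lbp_image(image):
    """LBP codes for all interior pixels of a 2D list of gray values."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)                       # 241 for this patch
brighter = [[p + 50 for p in row] for row in patch]
same_code = lbp_code(brighter, 1, 1)               # unchanged by a uniform shift
```

Adding a constant to every pixel leaves all the differences unchanged, so the code is the same for `patch` and `brighter`, which is the illumination invariance described above.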

ICA analysis

To understand ICA, it helps to compare it with PCA. The main difference between the two is that PCA decorrelates the pixel data, whereas ICA separates the data into one vector for each independent source that is mixed into the pixels. In medical images, the signals of several tissues overlap within a pixel; with ICA, we can obtain a separate vector for each feature or signal. The model shown in Fig. 2 is used for the linear mixing and un-mixing of images. It assumes that the number of sensors is equal to or greater than the number of sources and that the noise level does not depend on the sensors. The measured quantity is the intensity of a tissue cluster. A cluster appears in medical images through the combination of its T1 and T2 relaxation times, and the intensity of the signal is determined by this combination. The intensity of the cluster is computed by taking into account the tissue included in each pixel and its relaxation times. The relaxation times are computed using the data sets y1, y2, and y3. Since the sources of the signal are independent, the background noise generated by the images is not significant.

Fig. 2

(a) The relationship between the data sets and their Independent Components is separated into three components. These components are as independent as possible. (b) The relationship between the relaxation times and the cluster of Independent Components is shown. Each tissue cluster has its relaxation times.

We use this feature extraction to find a linear representation of non-Gaussian data in which the components are statistically independent. ICA is a linear dimension reduction strategy that transforms a dataset into columns of independent components, and it is an important tool for extracting the required features from a dataset and making them independent. Starting from the LBP features, we assume that we observe n linear mixtures \(y_{1}, y_{2}, \ldots, y_{n}\) of independent components (which carry multiple tumor characteristics). Each mixture is a linear combination of the elements of a random vector, as expressed in Eq. (6).

$$y_{i} = b_{i1} u_{1} + b_{i2} u_{2} + \cdots + b_{in} u_{n}, \quad \text{for all } i.$$
(6)

The LBP + ICA model treats every mixture \(y_{i}\), as well as every independent component \(u_{j}\), as a random variable rather than as a proper time signal; the observed values \(y_{i}(T)\) are samples of this random variable. Without loss of generality, both the mixtures and the independent components can be assumed to have zero mean. If this does not hold, the observed variables can be centered by subtracting the sample mean, which makes the model zero-mean. It is convenient to use vector-matrix notation instead of sums, which makes the expressions more compact and the calculations simpler.

Figure 2a shows the relationship between the data sets and their Independent Components. These components are as independent as possible. Let \(y\) denote the random vector whose elements are the mixtures \(y_{1}, y_{2}, \ldots, y_{n}\), and likewise let \(u\) denote the random vector whose elements are the independent components \(u_{1}, u_{2}, \ldots, u_{n}\). Let B denote the matrix with elements \(b_{ij}\). Lower-case letters denote vectors and upper-case letters denote matrices. All vectors are treated as column vectors, so \(y^{T}\), the transpose of \(y\), is a row vector. In this notation, the mixing model becomes

$$y = Bu$$
(7)

Occasionally we need the columns of the matrix B; denoting them by \(b_{i}\), the model can also be written as

$$y = \sum\limits_{i = 1}^{n} {b_{i} } u_{i}$$
(8)

The ICA statistical model is a framework used to analyze data packets containing tumor characteristics. It describes how the observed data are generated by mixing the independent components. Figure 2b shows the relationship between the relaxation times and the cluster of Independent Components; each tissue cluster has its own relaxation times. The independent components \(u_{i}\) cannot be directly observed, and the mixing matrix B is also assumed to be unknown. All we observe is the random vector \(y\), and both B and \(u\) must be estimated from it under assumptions that are as general as possible. The starting point is the assumption that the components \(u_{i}\) are statistically independent.

In this work, we also assume that the independent components have non-Gaussian distributions, which makes it possible to estimate the matrix B. After estimating B, we compute its inverse, say X, and obtain the independent texture feature vector as

$$u = Xy$$
(9)

where \(u\) is the independent texture feature vector to be extracted. Only \(u\) is passed to our proposed framework as a seed for the learning process for effective detection of glioma.
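The relation between the mixing model of Eq. (7) and the un-mixing step of Eq. (9) can be checked numerically with a small Python sketch. It assumes a known, hypothetical 2 × 2 mixing matrix so that X can be taken directly as the inverse of B; in practice B is unknown and has to be estimated from the data by an ICA algorithm, which is not shown here.

```python
def matvec(M, x):
    """Multiply matrix M (list of rows) by column vector x."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


B = [[2.0, 1.0],          # hypothetical mixing matrix (assumed known here)
     [1.0, 3.0]]
u = [0.5, -1.0]           # independent components (illustrative values)

y = matvec(B, u)          # Eq. (7): observed mixtures y = B u
X = inverse_2x2(B)        # un-mixing matrix X = B^{-1}
u_recovered = matvec(X, y)  # Eq. (9): u = X y
```

With an exact inverse the recovered vector equals the original components up to floating-point error; with an estimated B, the components are in general recovered only up to scaling and ordering.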

Learning by optimized DBN With SVM

Dimensionality reduction plays a crucial role in our brain tumor detection approach, addressing computational challenges and enhancing the performance of the ensemble Support Vector Machine classifier. Our objective is to distill valuable insights from high-dimensional MRI data, enabling the ensemble SVM to achieve improved accuracy and faster processing compared to an SVM trained on the full dataset. To accomplish this, we employ a Deep Belief Network. However, training large-scale deep learning models for brain tumor detection presents computational and memory limitations. Therefore, we utilize a distributed training strategy for the DBN based on Apache Spark to mitigate these challenges. The DBN’s architecture, featuring a greedy layer-wise unsupervised pre-training followed by supervised fine-tuning, effectively learns hierarchical representations of the MRI data, capturing intricate patterns relevant to tumor detection.

Each building block of the DBN is a restricted Boltzmann machine (RBM), a generative stochastic neural network that uses ideas from statistical mechanics to learn a probability distribution over a given data set. The network is composed of two layers, a visible layer and a hidden layer. The former receives the input data, while the latter represents the learned features. The model is an undirected graphical model with connections between all visible units and all hidden units. It has three sets of parameters: the connection weights, the visible unit biases, and the hidden unit biases. The theoretical basis for our proposed learning process is the energy function, from which the probability distribution over the joint state of the hidden and visible units is derived.

$$K(c,l) = - \sum\limits_{j = 1}^{iA} {h_{j} c_{j}} - \sum\limits_{j = 1}^{iA} {\sum\limits_{i = 1}^{uC} {l_{i} D_{ij} c_{j}} } - \sum\limits_{i = 1}^{uC} {o_{i} l_{i}}$$
(10)

Here \(c_{j}\) and \(l_{i}\) are the binary states of visible unit j and hidden unit i, \(h_{j}\) and \(o_{i}\) are their biases, and \(D_{ij}\) is the weight of the connection between them. The numbers of hidden and visible units are uC and iA, respectively. The joint distribution over the visible and hidden units, with normalizing constant B, is:

$$U(c,l) = \frac{1}{B}e^{ - K(c,l)}$$
(11)

This work addresses the challenge of imbalanced datasets in brain tumor detection using a Deep Belief Network based approach. Specifically, our DBN model iteratively refines a generator and discriminator. Given initial generator parameters, the model generates synthetic data. This data, along with real MRI data, is used to train the discriminator. The discriminator’s feedback then guides the generator to produce increasingly realistic synthetic samples. This iterative process continues until both generator and discriminator reach stable, optimal performance. The optimized generator then creates a substantial volume of synthetic data that mirrors the distribution of real MRI data, effectively balancing the initially imbalanced training set.

In brain tumor detection, accurate differentiation between tumor and normal cases is paramount. The balanced training data generated by the DBN allows for more precise classifier training, mitigating the negative effects of class imbalance. To validate the effectiveness of our DBN-based approach, we employ Support Vector Machines, a common and effective classifier for medical image analysis. SVMs identify an optimal hyperplane to separate data points, maximizing the margin between different classes. By addressing the imbalance issue at the data level, the SVM classifier can achieve more accurate and efficient brain tumor detection results, leveraging its strengths in handling small, balanced datasets.

The following two conditional distributions describe the relationship between the visible texture-feature units and the hidden units, where iA and uC are the numbers of visible and hidden units, \(h_{j}\) and \(o_{i}\) are the visible and hidden biases, and \(D_{ij}\) is the connection weight:

$$U(c|l) = \prod\limits_{j = 1}^{iA} \frac{\exp \left( h_{j} c_{j} + \sum\limits_{i = 1}^{uC} l_{i} D_{ij} c_{j} \right)}{\sum\nolimits_{\tilde{c}_{j}} \exp \left( h_{j} \tilde{c}_{j} + \sum\limits_{i = 1}^{uC} l_{i} D_{ij} \tilde{c}_{j} \right)}$$
(12)
$$U(l|c) = \prod\limits_{i = 1}^{uC} \frac{\exp \left( o_{i} l_{i} + \sum\limits_{j = 1}^{iA} l_{i} D_{ij} c_{j} \right)}{\sum\nolimits_{\tilde{l}_{i}} \exp \left( o_{i} \tilde{l}_{i} + \sum\limits_{j = 1}^{iA} \tilde{l}_{i} D_{ij} c_{j} \right)}$$
(13)

Because there are no connections within a layer, the units of one layer are conditionally independent given the other layer. For binary units, each factor therefore reduces to a logistic function:

$$U(c_{j} = 1|l) = \frac{1}{1 + \exp \left( - h_{j} - \sum\limits_{i = 1}^{uC} l_{i} D_{ij} \right)}$$
(14)

The network parameters are estimated by maximum likelihood, which also helps us understand the design of the proposed network. The log-likelihood of a visible vector c is:

$$Q(\theta ) = \log U(c) = \log \sum\limits_{l} e^{ - K(c,l)} - \log \sum\limits_{c,l} e^{ - K(c,l)}$$
(15)

where θ denotes the parameters to be learned for the texture features. The gradient of the log-likelihood with respect to θ is:

$$\frac{\partial Q(\theta )}{\partial \theta } = \sum\limits_{l} U(l|c) \left( - \frac{\partial K(c,l)}{\partial \theta } \right) - \sum\limits_{c,l} U(c,l) \left( - \frac{\partial K(c,l)}{\partial \theta } \right)$$
(16)

To simplify the notation, we write the two expectations as:

$$\left\langle \theta \right\rangle_{data} = \sum\limits_{l} U(l|c) \left( - \frac{\partial K(c,l)}{\partial \theta } \right)$$
$$\left\langle \theta \right\rangle_{model} = \sum\limits_{c,l} U(c,l) \left( - \frac{\partial K(c,l)}{\partial \theta } \right)$$
(17)

For the energy function of Eq. (10), the derivatives with respect to the individual parameters are:

$$- \frac{\partial K(c,l)}{\partial D_{ij}} = l_{i} c_{j} , \quad - \frac{\partial K(c,l)}{\partial h_{j}} = c_{j} , \quad - \frac{\partial K(c,l)}{\partial o_{i}} = l_{i}$$
(18)

The first expectation, \(\left\langle \theta \right\rangle_{data}\), is easy to compute, but the second, \(\left\langle \theta \right\rangle_{model}\), runs over all possible configurations of the visible and hidden units, which makes it hard to evaluate exactly. To reduce this computational burden, the proposed framework approximates the model expectation with k sampling steps, as in contrastive divergence, and splits the training data into mini-batches so that it runs faster. The update rule for each mini-batch is given below.

$$\theta = \theta + \varepsilon \Delta \theta = \theta + \varepsilon \left( \left\langle \theta \right\rangle_{data} - \left\langle \theta \right\rangle_{model} \right)$$
(19)

Here \(\varepsilon\) is the learning rate of the framework we are proposing for detecting glioma. For a mini-batch of L samples, the parameter updates are:

$$\Delta D_{ij} = \frac{1}{L}\sum\limits_{x = 1}^{L} \left( l_{(x),i}^{(0)} c_{(x),j}^{(0)} - l_{(x),i}^{(k)} c_{(x),j}^{(k)} \right)$$
(20)
$$\Delta h_{j} = \frac{1}{L}\sum\limits_{x = 1}^{L} \left( c_{(x),j}^{(0)} - c_{(x),j}^{(k)} \right)$$
(21)
$$\Delta o_{i} = \frac{1}{L}\sum\limits_{x = 1}^{L} \left( l_{(x),i}^{(0)} - l_{(x),i}^{(k)} \right)$$
(22)
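The mini-batch updates of Eqs. (19) to (22) can be sketched in plain Python as follows. This is an illustrative reading of the update rules, not the distributed implementation used in this work: it assumes binary units, uses hidden-unit probabilities for the statistics, and takes `D[i][j]` as the weight between hidden unit i and visible unit j, `h` as the visible biases, and `o` as the hidden biases. The function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(c, D, o):
    """Probability that each hidden unit is on, given visible states c."""
    return [sigmoid(o[i] + sum(D[i][j] * c[j] for j in range(len(c))))
            for i in range(len(o))]


def visible_probs(l, D, h):
    """Probability that each visible unit is on, given hidden states l (cf. Eq. 14)."""
    return [sigmoid(h[j] + sum(D[i][j] * l[i] for i in range(len(l))))
            for j in range(len(h))]


def sample(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]


def cd_step(batch, D, h, o, eps, k=1, rng=None):
    """One mini-batch update with k sampling steps; returns new (D, h, o)."""
    if rng is None:
        rng = random.Random(0)
    L = len(batch)
    n_hid, n_vis = len(o), len(h)
    dD = [[0.0] * n_vis for _ in range(n_hid)]
    dh = [0.0] * n_vis
    do = [0.0] * n_hid
    for c0 in batch:
        l0 = hidden_probs(c0, D, o)
        ck, lk = c0, l0
        for _ in range(k):                       # k alternating sampling steps
            ck = sample(visible_probs(sample(lk, rng), D, h), rng)
            lk = hidden_probs(ck, D, o)
        for i in range(n_hid):
            for j in range(n_vis):
                dD[i][j] += (l0[i] * c0[j] - lk[i] * ck[j]) / L   # Eq. (20)
        for j in range(n_vis):
            dh[j] += (c0[j] - ck[j]) / L                          # Eq. (21)
        for i in range(n_hid):
            do[i] += (l0[i] - lk[i]) / L                          # Eq. (22)
    new_D = [[D[i][j] + eps * dD[i][j] for j in range(n_vis)] for i in range(n_hid)]
    new_h = [h[j] + eps * dh[j] for j in range(n_vis)]
    new_o = [o[i] + eps * do[i] for i in range(n_hid)]
    return new_D, new_h, new_o                                    # Eq. (19)


batch = [[1, 0, 1, 0], [1, 1, 0, 0]]        # two toy binary feature vectors
D0 = [[0.0] * 4 for _ in range(2)]          # 2 hidden units, 4 visible units
D1, h1, o1 = cd_step(batch, D0, [0.0] * 4, [0.0] * 2, eps=0.1)
```

In a full training loop this step would be repeated over many mini-batches and epochs, and the learned hidden activations of one layer would serve as the visible data of the next.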

The notation \((\cdot)_{(x),i}^{(k)}\) denotes the i-th element of the x-th sample in the mini-batch after k sampling steps. The structure of the texture features is learned layer by layer. Once the parameters have been learned from the collected data, labeled samples are used to tune the network for tumor detection. In this study, we used various techniques to analyze the brain images. We divided the brain tissue into four categories: normal, gray matter, tumor-infected, and cerebrospinal fluid. The detected tumors were categorized into three classes: meningioma, glioma, and pituitary, taking their size, location, and impact into account. Furthermore, we use PSO to optimize the result obtained by our proposed framework so that the tumor is predicted accurately in minimum time.

Particle Swarm optimization

Every particle in the swarm moves through the search space, and its velocity is determined by its own previous best position and by the experience of its neighbors. In the same way, labeled samples were adapted to glioma characteristics to achieve greater accuracy and higher performance in minimum time.

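In an n-dimensional search space, each particle i represents a candidate set of glioma characteristics (grades and positions) with position \(U_{i}\) and velocity \(E_{i}\); the positional coordinates are used to calculate the fitness. At iteration o, the texture elements are moved to new positions as follows:

$$E_{i} (o + 1) = \omega \times E_{i} (o) + v_{1} \times \varphi_{1} \times \left( Q_{i} - U_{i} (o) \right) + v_{2} \times \varphi_{2} \times \left( Q_{b} - U_{i} (o) \right)$$
(23)
$$U_{i} (o + 1) = U_{i} (o) + E_{i} (o + 1)$$
(24)

Here \(\omega\) is the inertia weight, \(v_{1}\) and \(v_{2}\) are the acceleration coefficients, \(\varphi_{1}\) and \(\varphi_{2}\) are random numbers drawn uniformly from [0, 1], \(Q_{i}\) is the best position found so far by particle i, and \(Q_{b}\) is the best position found so far by the whole swarm.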
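The velocity and position updates of Eqs. (23) and (24) can be sketched as a small Python routine. The sketch follows the standard PSO formulation, in which the first attraction term points to the particle's own best position and the second to the best position of the swarm. The fitness function, swarm size, coefficient values, and search bounds below are illustrative assumptions, and the fitness is minimized.

```python
import random


def pso(fitness, dim, n_particles=20, iters=100, omega=0.7, v1=1.5, v2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` over `dim` dimensions; returns (best position, best fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds
    U = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    E = [[0.0] * dim for _ in range(n_particles)]       # initial velocities
    pbest = [u[:] for u in U]                           # personal best positions
    pbest_fit = [fitness(u) for u in U]
    b = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[b][:], pbest_fit[b]        # swarm best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                phi1, phi2 = rng.random(), rng.random()
                E[i][d] = (omega * E[i][d]
                           + v1 * phi1 * (pbest[i][d] - U[i][d])
                           + v2 * phi2 * (gbest[d] - U[i][d]))    # Eq. (23)
                U[i][d] += E[i][d]                                # Eq. (24)
            f = fitness(U[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = U[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = U[i][:], f
    return gbest, gbest_fit


def sphere(x):
    """Toy fitness used only for illustration: squared distance from the origin."""
    return sum(v * v for v in x)


best, best_fit = pso(sphere, dim=2)
```

For the framework described here, the toy `sphere` function would be replaced by a fitness that scores a candidate parameter setting, for example the validation error of the classifier (negated accuracy if accuracy is to be maximized).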

The advantage of PSO over other optimization schemes is its simplicity. PSO explores the search space of a given problem to find the settings or parameters required to maximize a particular objective. The parameter space of the proposed framework is mapped to a one-dimensional fitness space, which gives a single fitness value for each parameter setting.

Result and discussion

The experiments were carried out in the Python programming language. The libraries used include PyTorch and Keras, which are used for data augmentation and feature extraction, respectively. Table 1 provides further details about the other tools and specifications.

Table 1 System environment used to run the model.

Dataset description

The goal of this experiment is to collect and analyze brain images from a self-made dataset consisting of 3,064 MRI slices from 23 patients (available at https://figshare.com/articles/dataset/brain_tumor_dataset/1512427). The 5063 images collected for this study represent various types of brain tumors, such as meningioma, glioma, and pituitary tumor. Each image has an in-plane resolution of 512 × 512 pixels and was collected and processed in a non-destructive manner. The data were taken from 6 mm thick slices. The normal and abnormal images are different by a factor of one millimeter. Figure 3 shows sample images of these two categories. FLAIR (fluid-attenuated inversion recovery) is an MRI sequence that suppresses the signal from fluid.

Fig. 3

Sample images of (a) Low-grade glioma, (b) high-grade glioma.

Training and testing for detecting brain tumor

Our brain tumor detection approach involves two key stages: feature extraction and ensemble learning. Feature extraction addresses the high dimensionality of MRI data, while ensemble learning enhances detection accuracy. For feature extraction, we utilize a Deep Belief Network composed of stacked Restricted Boltzmann Machines. The first RBM is trained on the input MRI data, and subsequent RBMs are trained on the activations of the preceding layer, learning hierarchical representations. The distributed DBN’s hyperparameters are optimized using a validation set to ensure optimal performance for brain tumor detection.

The ensemble learning component employs a multi-layer Support Vector Machine architecture. The decision to add another layer to the ensemble is based on a confidence measure. If a sufficient proportion of patterns are classified with low confidence, another SVM layer is added and trained. Confidence is assessed using the posterior probabilities generated by each SVM. A prediction threshold is employed to ensure that classifications are made only when the model exhibits high certainty. This approach allows for a more robust and reliable detection process. For clarity and reproducibility, we report only the optimal hyperparameters for both the distributed DBN and the multi-layer ensemble SVM. This streamlined approach aims to maximize detection accuracy while minimizing computational burden, a crucial consideration in medical image analysis.
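The layer-adding rule described above can be sketched as a small decision function. The threshold values are not reported here, so `threshold` and `max_low_fraction` below are hypothetical parameters, and the posterior probabilities are assumed to be available as one list of class probabilities per pattern.

```python
def low_confidence_fraction(posteriors, threshold):
    """Fraction of patterns whose highest class posterior is below `threshold`."""
    low = sum(1 for p in posteriors if max(p) < threshold)
    return low / len(posteriors)


def needs_another_layer(posteriors, threshold=0.8, max_low_fraction=0.2):
    """Decide whether to add and train another SVM layer (illustrative rule)."""
    return low_confidence_fraction(posteriors, threshold) > max_low_fraction


# Posterior probabilities over three classes for four patterns (toy values).
posteriors = [[0.90, 0.05, 0.05],
              [0.40, 0.35, 0.25],
              [0.10, 0.85, 0.05],
              [0.50, 0.30, 0.20]]
fraction = low_confidence_fraction(posteriors, threshold=0.8)   # 0.5
add_layer = needs_another_layer(posteriors)                     # True
```

Here two of the four patterns fall below the confidence threshold, so under these assumed settings a further SVM layer would be trained on the low-confidence patterns.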

To test the accuracy of the DBN classifier, the location of the tumor in each image must be labeled and saved to a file. A labeling tool, written in Python with a Qt graphical interface, is used for this purpose. After the labeling process, 75% of the database’s images are randomly selected as training samples. Because one class is more frequent than the others, images are selected from each class so that the training set is not dominated by a single class. The proposed model is then trained on the data. The parameters of the proposed model are as follows:
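One possible reading of the class-aware 75/25 selection is a stratified random split, sketched below. The function name `stratified_split`, the fixed seed, and the rounding rule are assumptions made for illustration; the paper does not specify how the per-class selection was implemented.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split sample indices so each class keeps the same train/test ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random selection within the class
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test
```

Splitting within each class keeps the tumor-type proportions of the training and test sets close to those of the full dataset.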

  • Learning rate: 0.0003

  • Momentum constant: 0.912

  • Batch size: 1

  • Number of iterations: 170,000

  • Softmax neurons: 3

  • Max-pool kernel size: 2

  • Max-pool stride: 2
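For reference, the listed values can be gathered into a single configuration object, as in the sketch below. The class `TrainingConfig` and the two helper functions are hypothetical. The momentum update shown is the classical form, which is an assumption because the text does not say which variant was used, and the pooling formula assumes no padding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the parameter list above.
    learning_rate: float = 0.0003
    momentum: float = 0.912
    batch_size: int = 1
    iterations: int = 170_000
    softmax_neurons: int = 3
    pool_kernel: int = 2
    pool_stride: int = 2


def pooled_size(input_size, kernel, stride):
    """Output length of an unpadded max-pool along one axis."""
    return (input_size - kernel) // stride + 1


def momentum_step(weight, velocity, gradient, config):
    """One classical SGD-with-momentum update for a single scalar weight."""
    velocity = config.momentum * velocity - config.learning_rate * gradient
    return weight + velocity, velocity
```

With these settings, one pooling stage would halve a 512-pixel axis to `pooled_size(512, 2, 2)` pixels.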

Table 2 shows the statistical attributes of some images. These are useful in identifying similar pictures and extracting information about their intensity level distribution. Texture features such as mean, standard deviation, kurtosis, energy, and smoothness are calculated from the intensity histogram, using the probability of occurrence of each intensity level in the histogram bins.

Table 2 Statistical features for a few images.
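The histogram-based features can be computed as in the following sketch. It uses common textbook definitions: energy as the sum of squared bin probabilities and smoothness as 1 − 1/(1 + σ²) with unnormalized variance. These definitions are assumptions, since the exact formulas used for Table 2 are not given in the text, and the function name is illustrative.

```python
import math


def histogram_features(pixels, levels=256):
    """First-order texture features from the intensity histogram of an image.

    pixels is a flat list of integer intensities in the range [0, levels).
    """
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    p = [c / total for c in counts]  # probability of each intensity level

    mean = sum(z * p[z] for z in range(levels))
    variance = sum((z - mean) ** 2 * p[z] for z in range(levels))
    fourth_moment = sum((z - mean) ** 4 * p[z] for z in range(levels))

    return {
        "mean": mean,
        "std": math.sqrt(variance),
        # Kurtosis is the normalized fourth central moment; 0.0 for a flat image.
        "kurtosis": fourth_moment / variance ** 2 if variance > 0 else 0.0,
        "energy": sum(q * q for q in p),
        "smoothness": 1.0 - 1.0 / (1.0 + variance),
    }
```

A uniform image gives zero standard deviation, zero smoothness, and an energy of 1, while an image with widely spread intensities gives lower energy and smoothness close to 1.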

The remaining 25% of the dataset is used to test the proposed DBN-SVM model. The three parameters of the model are returned to classify the detected tumor. These parameters are shown in Figs. 4 and 5.

Fig. 4

(a) Original image, (b) Enhanced image, (c) Skull-stripped image, (d) ICA image, (e) Tumor region, (f) Area extracted tumor region.

Fig. 5

Images containing different types of tumors: (a) Meningioma, (b) Glioma, (c) Pituitary and detected and classified brain tumors: (d) Meningioma, (e) Glioma, (f) Pituitary tumor.

Performance analysis

Performance metrics are used to measure the speed and accuracy of a system. The proposed system is evaluated on several of these criteria, including recall, F-measure, and accuracy.
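These measures can be derived from a multi-class confusion matrix, as in the sketch below. The functions assume that rows are true classes and columns are predicted classes; this layout and the function names are illustrative and do not describe the paper's evaluation code.

```python
def per_class_metrics(confusion, k):
    """Recall, precision, specificity, and F-measure for class k.

    confusion[i][j] is the number of samples of true class i predicted as j.
    """
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(row[k] for row in confusion) - tp
    tn = total - tp - fn - fp

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f_measure": f_measure,
    }


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total
```

Recall here is the same quantity as sensitivity, so the per-class values can be compared directly with the sensitivity and specificity figures reported for the proposed method.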

Class labels are assigned to detected tumors using a minimum probability value of 0.8. If a tumor’s probability value is less than 0.8, it is not classified. It was observed that 26 samples were unclassified. 48 samples were also unclassified. There were also 36 misclassified samples. Table 3 shows the three confusion matrices that were used to set the threshold values for the classification of a tumor.

Table 3 Confusion Matrix with different threshold values.

Table 4 shows that the number of unclassified samples decreases as the decision threshold is lowered. This reduction has a positive effect on the statistical measures used to evaluate image classification. The table shows that 86.42% of the images are properly classified when the decision threshold is set at 0.8. Even when the probability is low, the classifier can still detect a tumor. Unclassified and misclassified samples belong to different classes. For instance, the specificity value of the “glioma” subtype is higher than the values of the other two. This makes it difficult for the machine to learn this class effectively.

Table 4 Performance metrics with different threshold values.

The proposed method for identifying brain tumors using the DBN and PSO is compared with other methods such as ANFIS and the back-propagation classifier. The performance measures are presented in Fig. 6. The proposed technique exhibited a significant improvement in detection rate compared to the other methods.

Fig. 6

Comparison of accuracies in different classifiers.

Figure 7 compares the proposed method with existing approaches. The proposed method has a sensitivity of 97.81, a specificity of 98.3 percent, and an accuracy of 99.21 percent, which is higher than that of the conventional approaches. Figure 8 shows the comparison of performance with other related research, where the proposed method also performs better than the current strategies in this area. Table 5 shows the comparative analysis with state-of-the-art works.

Fig. 7

Comparisons of existing and proposed methodologies21,22,23,24,25.

Fig. 8

Comparison of performance with other related research26,27,28,29,30,31,32.

Table 5 Comparative analysis with state-of-the-art works.

This paper introduces a hybrid model, in contrast to existing research that predominantly focuses on single-faceted approaches. Previous studies, such as that of Moldovanu et al.33, have primarily explored individual techniques, including conventional Convolutional Neural Networks, CNNs with pre-training, or combinations of pre-trained CNNs with traditional Machine Learning algorithms. This work diverges from these singular strategies by proposing a hybrid architecture that integrates multiple methodologies.

Table 6 shows the real-world testing or validation in an actual clinical environment. In the context of brain tumor detection, the proposed method demonstrates improved performance compared to baseline classifiers. Specifically, using features extracted by our Convolutional Neural Network led to enhanced accuracy for both Support Vector Machines and Random Forests, while K-Nearest Neighbors performance remained constant. This suggests that the CNN-derived features provided richer, more discriminative information for the SVM and RF classifiers, enabling them to learn more effectively.

Table 6 Real-world testing or validation in an actual clinical environment.

Furthermore, this approach is designed for computational efficiency, making it well-suited for brain tumor detection scenarios involving medium-sized datasets and resource-constrained environments. By combining advanced feature extraction with the SVM classifier, we achieve a balance between accurate detection and practical resource utilization. This method not only rivals but also has the potential to outperform more complex, computationally demanding deep learning models in certain situations. This underscores the value of carefully designed machine learning techniques as a resource-efficient alternative, particularly when deep learning models might prove excessively resource-intensive.

Brain tumor segmentation faces many challenges. For instance, different kinds of brain tumors vary in morphology and size. Furthermore, MRI images may suffer from artifacts, noise, and distortion, which can degrade segmentation performance. Deep learning models require a large amount of annotated data to be trained properly, and accurate annotated data are difficult to acquire for medical applications. In this paper, we present an advanced segmentation method, designed specifically for MRI images, for detecting and distinguishing brain tumors. It can analyze the size and shape of the tumor, calculate its area, accurately measure its diameter, and support a detailed analysis of the image.
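As an illustration of the area and diameter measurements, the sketch below derives both from a binary segmentation mask. The pixel spacing is an assumed input (it would normally come from the scan metadata), the diameter is the equivalent-circle diameter rather than a measured maximum extent, and the function names are hypothetical.

```python
import math


def tumor_area_mm2(mask, pixel_spacing_mm=(1.0, 1.0)):
    """Area of the segmented region: pixel count times the area of one pixel.

    mask is a 2D list in which truthy values mark tumor pixels.
    """
    pixels = sum(1 for row in mask for value in row if value)
    return pixels * pixel_spacing_mm[0] * pixel_spacing_mm[1]


def equivalent_diameter_mm(area_mm2):
    """Diameter of the circle that has the same area as the segmented region."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)
```

The equivalent diameter is a convenient single-number summary, but for elongated or irregular tumors it can differ noticeably from the longest axis of the region.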

Conclusion

The paper presents a method to classify brain tumors using deep learning and SVM fusion. The method applies a DBN-based architecture to a T1-weighted contrast-enhanced MRI database. It combines the advantages of PSO and deep learning to improve the accuracy of classifying brain tumors, and it uses both LBP and ICA features. The proposed model classifies gliomas with 99.21% accuracy and also achieves the highest sensitivity and specificity. The proposed network is easier to implement than a pre-trained network and can be run on modern personal computers. Because few resources are required for implementation, it can also be used on conventional machines. Smaller networks also matter because they make it possible to use the algorithm on mobile platforms. The paper argues that the proposed algorithm can contribute to the development of highly accurate, non-destructive computer-aided diagnosis systems for gliomas. In the future, studies will be carried out on the implementation of the proposed algorithm using Mask R-CNN. These will then be followed by the use of state-of-the-art transfer learning models.

Building upon previous work on brain tumor classification using a CNN based on VGG-16 and DBN-based machine learning, this project advances towards precise tumor localization using Mask R-CNN. The previous CNN model effectively distinguished between scans with and without tumors, but lacked the ability to pinpoint the tumor’s location. This limitation is addressed by employing Mask R-CNN, which not only classifies the presence of a tumor but also provides a pixel-level segmentation mask, accurately outlining the tumor’s boundaries within the MRI scan.

This enhanced approach is being applied to a more comprehensive dataset, enabling a more rigorous performance evaluation. The analysis will focus on the model’s ability to differentiate between various types of brain lesions, a crucial aspect of accurate diagnosis. While this current dataset serves as a valuable starting point, future research will prioritize incorporating a more diverse range of brain lesions. This expansion will encompass a wider spectrum of tumor types, sizes, locations, and imaging characteristics, better reflecting the complexities encountered in real-world clinical settings. By addressing these challenges, the aim is to develop a robust and reliable brain tumor detection system suitable for practical diagnostic applications. Furthermore, future work will explore incorporating additional clinical data, such as patient demographics and medical history, to further enhance the model’s diagnostic capabilities.

Mask R-CNN, a state-of-the-art instance segmentation model, enhances the object detection capabilities of Faster R-CNN by adding a mask generation branch. While Mask R-CNN has seen limited use in brain tumor segmentation, this study adapts and refines it for MRI analysis, making it more accessible to medical professionals. Our modified Mask R-CNN replaces the standard backbone with a ResNet architecture integrating an attention mechanism and Feature Pyramid Network. This combination enhances feature extraction from MRI images, crucial for accurate brain tumor identification. This automated approach addresses the limitations of manual brain tumor identification, which is time-consuming, requires specialized expertise, and carries the risk of human error. Our method aims to improve the efficiency and accuracy of brain tumor identification by automatically segmenting the tumor regions within MRI images.

This innovative approach leverages the strengths of Mask R-CNN, the attention mechanism, and the FPN, resulting in a more powerful segmentation model. Specifically, the ResNet backbone with integrated attention mechanism and FPN extracts rich feature representations from the MRI images. These features are then passed to the Region Proposal Network to generate potential tumor regions. RoIAlign refines the features from these proposals, ensuring precise alignment with the original image. Finally, two parallel processes occur: classification and bounding box regression of potential tumor regions using a fully connected layer, and precise tumor segmentation using a Fully Convolutional Network. This study contributes both a novel adaptation of Mask R-CNN for brain tumor segmentation and an automated solution for a critical medical task, improving efficiency and reducing reliance on manual expertise. The automated nature of our method eliminates the subjectivity inherent in manual identification, providing consistent and objective results. The tumor types considered in this study include glioma, meningioma, astrocytoma, and neuroblastoma. While the use of transfer learning with pre-trained models is mentioned, further details on their integration and contribution to the framework are needed.