Abstract
Astrocytomas are among the most prevalent primary brain tumors and are classified into four grades by the World Health Organization. Accurate grading is essential for guiding treatment, as therapeutic strategies depend heavily on tumor grade. This paper presents a new preoperative classification method for astrocytomas, addressing the issue of data scarcity in medical imaging. This work leverages an advanced statistical modeling approach based on stochastic differential equations to analyze post-contrast T1-weighted brain MRI images that require minimal data and offer rapid processing times. In this method, the alpha-stable nature of MRI images is represented by applying a fractional Laplacian filter, and the parameters of the resulting alpha-stable distribution are fed to classifiers to detect the grade of astrocytomas. The method is implemented in both 1D and 2D processing modes, with customized preprocessing for each. Three classification algorithms were evaluated: support vector machine, K-nearest neighbor, and random forest. In the three-class classification task (Grades II–IV), the support vector machine exhibited superior performance, achieving accuracy, sensitivity, and specificity of 98.49%, 98.42%, and 99.23% in 2D mode, and 93.52%, 93.23%, and 96.72% in 1D mode. The results indicate that the proposed framework has the potential to significantly enhance preoperative grading of astrocytomas.
Introduction
The brain is one of the most vital organs in the human body, responsible for coordinating thought, memory, language, behavior, and motor function1. It comprises nerve cells, or neurons, along with supporting tissue known as glial cells. Among the various types of glial cells, astrocytes play a critical role in providing nutrition to neurons. However, astrocytes have the potential to become cancerous, leading to the formation of astrocytic brain tumors, known as astrocytomas2,3. Astrocytomas represent one of the most prevalent forms of primary brain tumors, characterized by their rapid growth potential and varying degrees of malignancy, ranging from benign to highly aggressive forms. The World Health Organization (WHO) classifies astrocytomas into four grades based on their growth rate, tendency to infiltrate surrounding brain tissue, and molecular characteristics. These grades are further grouped into low-grade (grades I and II) and high-grade (grades III and IV) tumors, reflecting the severity of their malignancy. Astrocytic brain tumors occur across all age groups, with a higher prevalence in men than in women2,4,5. Low-grade astrocytomas are often initially benign but may progress to malignancy over time, while high-grade astrocytomas are malignant from their onset. The treatment and prognosis for these tumor grades vary significantly, with low-grade astrocytomas generally associated with favorable outcomes, whereas high-grade astrocytomas often carry a poor prognosis4. Accurate grading of astrocytomas is critical because different grades necessitate distinct treatment approaches. Tumor grading typically relies on various pathological features; however, histopathological analysis, the gold standard for diagnosis, can sometimes result in ambiguity6. This diagnostic process requires invasive procedures such as biopsy or surgery, both of which are expensive and pose risks to the patient7.
Currently, magnetic resonance imaging (MRI) is the most widely used non-invasive technique for assessing brain tumors due to its high sensitivity in detecting and localizing small brain lesions4,6. Conventional MRI sequences play a key role in the monitoring, diagnosis, and characterization of astrocytomas. However, these techniques alone are insufficient for reliably determining tumor grades because high-grade and low-grade astrocytomas often exhibit overlapping features on conventional MRI scans2,4,5. This limitation underscores the necessity for more advanced diagnostic tools to improve the accuracy of tumor grading.
Despite advances in the diagnosis and treatment of brain tumors, astrocytomas remain a significant cause of mortality. Early detection and accurate preoperative tumor grading are essential for tailoring treatment strategies and improving patient survival rates2,3. However, the limitations of current diagnostic methods highlight a critical need for approaches that enhance the accuracy of MRI interpretation and facilitate tumor grade classification. To address this challenge, computer-based analysis methods, including traditional machine learning (ML) and deep learning (DL) techniques, have emerged as promising tools3. These AI-driven approaches aim to enhance diagnostic accuracy by leveraging large datasets and automated feature extraction. A summary of recent studies in this domain is presented in Table 1.
As demonstrated in the table, most ML-based studies have focused on differentiating glioma grades using various features and classification algorithms4,7,8,9,10,11. ML techniques, such as support vector machines, random forests, and decision trees, offer advantages such as interpretability and efficiency in handling structured data. However, their performance heavily depends on handcrafted feature selection and labeled datasets, which may limit their generalizability12.
Similarly, DL-based studies have predominantly employed convolutional neural networks to achieve this objective5,6,13,14,15,16,17. Unlike traditional ML, DL methods automatically extract hierarchical features from raw data, reducing the need for manual feature engineering. This capability has led to significant breakthroughs in medical image analysis. However, DL approaches require large-scale, high-quality training datasets, which are often scarce, particularly for brain tumors. Additionally, these models are computationally expensive, prone to overfitting, and demand powerful processors due to the extensive data volumes required for training.
These limitations raise concerns about the reliability and practicality of current methods for preoperative tumor grading. Furthermore, the misclassification of tumor grades may lead to severe and irreversible consequences for patients3.
This study explores the challenges associated with brain tumor grading and proposes a statistical modeling approach to address the limitations of traditional ML and DL methods. The central hypothesis is that statistical modeling can enhance the accuracy and reliability of preoperative tumor grading by more effectively capturing the stochastic properties inherent in medical imaging data. By integrating statistical techniques, this research seeks to improve the precision and robustness of brain tumor grading while mitigating the shortcomings of traditional ML and DL frameworks.
Statistical modeling offers a robust foundation for a range of image processing techniques, particularly in medical imaging. By accounting for the stochastic properties of medical images, statistical models can construct tailored descriptive frameworks. A common objective of such models is to determine the probability density function (pdf) of the dataset. These models aim to select an appropriate distribution that fits the dataset and estimate the parameters of this distribution as relevant features or biomarkers under varying conditions. A fundamental principle of statistical modeling is that when data is accurately represented, the extracted statistical features can support numerous applications, including classification, segmentation, and noise reduction18,19. In recent years, various statistical modeling methods have been developed to analyze brain MRI images for different purposes. Table 1 summarizes these studies, most of which have concentrated on two primary objectives: the segmentation or classification of normal brain tissue in healthy individuals and the segmentation of tumor regions in patients with brain tumors. Techniques such as mixture models and Markov models have been widely used to achieve these goals20,21,22,23,24,25,26,27.
Despite extensive research, a significant gap exists in applying statistical modeling techniques specifically for brain tumor grading. To the best of our knowledge, no prior study has explored the use of statistical modeling for this purpose. Furthermore, no research has investigated the application of the specific statistical method proposed in this study for analyzing brain MRI images. To address this gap, we propose a new stochastic differential equation (SDE)-based modeling framework, referred to as the innovation model, for analyzing MRI images containing astrocytic brain tumors. This framework leverages the probabilistic nature of MRI data to construct dedicated models for brain tumor classification and grading. By determining the underlying pdf and selecting an appropriate statistical distribution, the SDE approach enables the estimation of distribution parameters, providing a customized framework for precise and reliable tumor grading.
Given the constraints of the dataset, which is limited to astrocytoma grades II, III, and IV28, this study focuses on extracting informative features from the proposed model and utilizing these features to classify tumors into the three grade categories. The quantitative evaluation of classifier performance serves to validate the model’s capability for pre-surgical grading of these tumors.
The rest of this paper is organized as follows: Section “Materials and methods” presents the proposed statistical model for grading astrocytomas. Section “Results” discusses the results of the modeling and classification. Sections “Discussion” and “Conclusion” provide the discussion and conclusion, respectively.
Materials and methods
In this section, the dataset will be introduced, followed by an elucidation of the innovation model and the defining characteristics of stable distributions. The proposed model for grading astrocytomas will then be described.
Dataset
The dataset utilized in this study is provided by the University of California, San Francisco, and is publicly available at www.cancerimagingarchive.net28. It includes patients who underwent preoperative brain MRI at a single center between 2015 and 2021, in all of whom astrocytic brain tumors and their grade were confirmed by histopathologic methods on the basis of WHO criteria. All MRIs were conducted with a 3.0 Tesla MRI imaging system (Discovery 750, GE Healthcare, Waukesha, Wisconsin, USA) with a dedicated eight-channel head coil. The imaging protocol included 3D post-contrast T1-weighted sequences. To ensure consistency in analysis, all images underwent skull stripping for the removal of non-brain tissues, a process conducted by the data collection team. While astrocytomas are classified into four grades, the dataset employed in this study is restricted to grades II, III, and IV. Grade I tumors are therefore excluded from the current analysis.
The demographic and clinical characteristics of the patient cohort enrolled in this study, along with the total number of tumor-containing slices, are summarized in Table 2.
Innovation model and stable distributions
The innovation model assumes that a stochastic process can be regarded as the response of an SDE to white noise, which is not necessarily Gaussian. This concept is expressed by the following equation36:
\(Ls = w\) (1)
where L is a differential operator applied to the process s, also referred to as the whitening operator, and w is the white noise that drives the process. The operator L must be invertible, with inverse \(L^{-1}\). The inverse operation can be expressed as follows36:
\(s = L^{-1}w\) (2)
The white noise w is an independent and identically distributed (i.i.d.) stationary process, commonly referred to as the innovation process. According to the innovation model, the integration-like operator \(L^{-1}\) shapes the white noise and yields the correlation characteristics of the process s, while the sparsity structure and statistical properties of s are determined by w. The discrete counterpart of Eq. (2), in which the increment process u is associated with the process s, can be described as follows36:
\(u = L_{d}s\) (3)
where \(L_{d}\) represents the discrete form of L. It is a finite-difference-like operator with a sequence of weights \(d_{L}\in \ell_{1}(\mathbb{Z}^{d})\). Unser et al.36 demonstrated that if \(s=L^{-1}w\) is a stochastic process and L is a spline-admissible operator, then \(u=L_{d}s\) is stationary and nearly decoupled. Therefore, stochastic processes characterized by independent stationary increments can be effectively modeled within this framework, including stable fractal processes that have a fractal nature and long-range dependencies (LRD). These processes exhibit self-similarity in their pdf, where the degree of self-similarity is determined by a parameter called the Hurst exponent (0 < H < 1). These processes, also known as fractional Lévy stable motion (fLsm), have found extensive applications in stochastic modeling. The increments of fLsm conform to an alpha-stable distribution and are independent when they follow a symmetric alpha-stable (sαs) distribution37.
Moreover, most natural images can be viewed as discrete realizations of an underlying stochastic process. Consequently, a discretized innovation model can be used to recover the random noise that generates an image. Under the fLsm model, identifying a spline-admissible, scale-invariant operator L, discretizing it, and applying it to an image yields the sαs white noise of that image. Fractional Laplacians are one such family of operators. In previous studies, the isotropic polyharmonic spline was employed to discretize these operators. The discretized form in the frequency domain is as follows38:
where ω = (ω1, ω2) is the two-dimensional frequency vector and γ is the fractional Laplacian order.
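The idea of whitening by a frequency-domain filter can be sketched in a few lines. The example below is only an illustration under a simplifying assumption: it uses the symbol of the ordinary 5-point finite-difference Laplacian, \(4\sin^2(\omega_1/2)+4\sin^2(\omega_2/2)\), raised to the power γ/2, rather than the isotropic polyharmonic-spline discretization used in the study. The function name `fractional_laplacian_whiten` and the random test image are hypothetical.

```python
import numpy as np

def fractional_laplacian_whiten(image, gamma):
    """Filter an image with a discrete fractional Laplacian of order gamma.

    Assumption (not the study's filter): the frequency response is
    (4 sin^2(w1/2) + 4 sin^2(w2/2)) ** (gamma / 2), i.e. the symbol of the
    5-point finite-difference Laplacian raised to gamma / 2.
    Boundaries are treated as periodic because the filter is applied via the FFT.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    # Angular frequencies of the DFT grid, in radians per sample
    w1 = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]
    w2 = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
    symbol = 4.0 * np.sin(w1 / 2.0) ** 2 + 4.0 * np.sin(w2 / 2.0) ** 2
    response = symbol ** (gamma / 2.0)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * response))

rng = np.random.default_rng(0)
demo_image = rng.random((32, 32))          # stand-in for an MRI slice
whitened = fractional_laplacian_whiten(demo_image, gamma=1.2)
```

For γ = 2 this filter reduces to the familiar (negated) discrete Laplacian with periodic boundaries, which gives a simple sanity check. For any γ > 0 the response is zero at ω = 0, so the output has zero mean.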
Alpha-stable distributions are characterized by four parameters: The characteristic exponent, α (0 < α ≤ 2), determines the tail decay rate, with smaller values indicating heavier tails. The skewness parameter, β (-1 ≤ β ≤ 1), indicates the distribution’s asymmetry. The scale parameter, σ (σ > 0), determines the width or dispersion of the pdf. The location parameter, µ (µ ∈ R), quantifies the distribution’s shift from the peak, analogous to the mean in a Gaussian distribution39. In alpha-stable distributions, as α approaches 2, the distribution behavior becomes increasingly similar to that of Gaussian distributions, and when α = 2, a Gaussian distribution is achieved. It is worth mentioning that alpha-stable distributions generally do not have closed-form expressions for their pdfs, except for specific values of α. In these distributions, moments of order less than α are finite. This implies that distributions with α < 2 lack a second-order moment or variance and other higher-order moments. Consequently, alternative statistical procedures, such as fractional lower-order moments (FLOM), would be employed instead. The calculation for FLOM in sαs distributions is as follows40:
where Γ represents the gamma function, 0 < p < α, and \(\sigma\) denotes the scale parameter of the distribution. Furthermore, for sαs distributions, there are closed-form expressions for the absolute logarithmic moments, expressed as follows41:
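The role of α and of logarithmic moments can be illustrated with a small simulation. The sketch below is not the estimation procedure of this study. It assumes the parametrization in which an sαs variable has characteristic function \(\exp(-|\sigma t|^{\alpha})\), draws samples with the Chambers-Mallows-Stuck method, and recovers α from the standard identity \(\mathrm{Var}(\log|X|) = \frac{\pi^2}{6}\left(\frac{1}{\alpha^2}+\frac{1}{2}\right)\). The function names are hypothetical.

```python
import numpy as np

def sample_sas(alpha, sigma, size, rng):
    """Draw symmetric alpha-stable samples (Chambers-Mallows-Stuck method).

    Assumed parametrization: characteristic function exp(-|sigma * t| ** alpha).
    """
    u = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size)   # uniform angle
    w = rng.exponential(1.0, size)                     # unit exponential
    if alpha == 1.0:
        return sigma * np.tan(u)                       # Cauchy case
    x = (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
         * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return sigma * x

def estimate_alpha_log_moments(samples):
    """Estimate alpha from the variance of log|X| (valid for symmetric stable data)."""
    samples = np.asarray(samples, dtype=float)
    log_abs = np.log(np.abs(samples[samples != 0]))
    inv_alpha_sq = 6.0 * np.var(log_abs) / np.pi ** 2 - 0.5
    inv_alpha_sq = max(inv_alpha_sq, 0.25)             # enforce alpha <= 2
    return 1.0 / np.sqrt(inv_alpha_sq)

rng = np.random.default_rng(0)
estimates = {}
for true_alpha in (1.0, 1.5, 2.0):
    draws = sample_sas(true_alpha, 1.0, 200_000, rng)
    estimates[true_alpha] = estimate_alpha_log_moments(draws)

# With alpha = 2 the samples are Gaussian with variance 2 * sigma**2
gaussian_draws = sample_sas(2.0, 1.0, 200_000, rng)
```

Because the estimator uses only log|X|, it stays well defined even when the variance is infinite (α < 2), which is the motivation for logarithmic and fractional lower-order moments in this setting.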
The proposed model for grading astrocytomas
The block diagram of the proposed method is presented in Fig. 1, and its steps will be detailed in this section. In the innovation model, by assigning a specific whitening operator to each stochastic process, its productive innovation process can be revealed. Given the stochastic and non-stationary nature of MRI images, the innovation model can effectively describe them. This operator can be applied to whiten medical images, as the power spectral density of such images follows the form \(1/\left\| \omega \right\|^{\gamma }\), which indicates LRD and self-similarity in these images. The relation between the fractional Laplacian order (γ) and the Hurst exponent (H) and α for a self-similar process is as follows, where d is the dimension of the process36:
To whiten MRI images, the discrete version of the fractional Laplacian operator given in Eq. (4) was employed. The anticipated outcome is a whitened image whose samples form an i.i.d. random vector with a zero-mean sαs histogram. To determine the optimal order of the whitening operator for each image, the algorithm starts with an initial value for the parameter α, as per Eq. (7). Subsequently, the value of α is adjusted until an approximately zero-mean sαs pdf is achieved. Since alpha-stable distributions do not have closed-form pdf expressions, the regression-type estimation method proposed by Koutrouvelis et al.42 was employed to estimate the parameters (α, β, σ, µ) of the pdf.
In this study, post-contrast 3D T1-weighted images (T1W + C) were selected, and only those slices containing astrocytic brain tumors were utilized for each subject. The images in the dataset were previously skull-stripped, so their background is entirely black, with a grayscale value of zero. The proposed algorithm, however, requires a non-zero background. Consequently, the images were processed in two distinct ways. In the first method, only the grayscale values of the brain were extracted as a one-dimensional (1D) vector. The vector was then smoothed using a Gaussian filter with a standard deviation of 1. This specific standard deviation was chosen based on empirical evaluations, which demonstrated its effectiveness in reducing noise while preserving critical anatomical boundaries, such as tumor edges. Preliminary experiments indicated that lower standard deviation values were insufficient for noise suppression, whereas higher values resulted in excessive smoothing that compromised the integrity of fine structural details. Following this, the innovation model was applied to the vector, and the parameters of the alpha-stable histogram were extracted. These parameters were subsequently employed as distinctive features for training the classifiers. In the second method, for two-dimensional (2D) analysis, the black background was initially refined to the edges of the brain using the Sobel edge detection algorithm. Gaussian noise with a mean of zero and a standard deviation of 0.001 was then added exclusively to the residual background in order to achieve a non-zero background. Subsequently, the innovation model was applied, and the parameters of the alpha-stable distribution and of the whitened image were extracted. Figure 2 depicts a sample brain MRI slice along with a histogram in both 1D and 2D modes. It also illustrates a whitened image in 2D mode and the whitened histogram in both 1D and 2D modes.
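The two preprocessing paths can be sketched as follows. This is a simplified illustration, not the MATLAB pipeline used in the study: it assumes that brain pixels are exactly those with intensity greater than zero, it omits the Sobel-based refinement of the background, and the helper names and the synthetic slice are hypothetical.

```python
import numpy as np

def gaussian_smooth_1d(values, sigma=1.0):
    """Smooth a 1D vector with a truncated Gaussian kernel (reflective padding)."""
    radius = int(np.ceil(4 * sigma))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()                       # unit gain keeps constants unchanged
    padded = np.pad(values, radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

def prepare_1d(slice_2d):
    """1D mode: keep only brain intensities (assumed > 0) and smooth with sigma = 1."""
    brain_values = slice_2d[slice_2d > 0].astype(float)
    return gaussian_smooth_1d(brain_values, sigma=1.0)

def prepare_2d(slice_2d, rng, noise_std=0.001):
    """2D mode: replace the zero background with zero-mean Gaussian noise."""
    out = slice_2d.astype(float).copy()
    background = slice_2d == 0
    out[background] = rng.normal(0.0, noise_std, size=int(background.sum()))
    return out

rng = np.random.default_rng(0)
demo_slice = np.zeros((16, 16))
demo_slice[4:12, 4:12] = rng.uniform(50, 200, (8, 8))   # synthetic "brain" region

vector_1d = prepare_1d(demo_slice)
image_2d = prepare_2d(demo_slice, rng)
```

In the 1D path the spatial arrangement of the pixels is discarded, whereas the 2D path keeps it and only perturbs the background, which is one plausible reason the two modes yield different features.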
Following the implementation of the proposed algorithm, a total of 12 features were extracted in both 1D and 2D modes, as detailed in Table 3. Some of these features represent parameters of distributions, while others were extracted from the histograms and whitened images. Subsequently, the extracted features from both the 1D and 2D modes were utilized independently to train three distinct classifiers, including Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), all of which have demonstrated strong performance in comparable medical classification tasks8,43. These classifiers were then applied to classify astrocytomas.
The SVM was implemented with a Radial Basis Function (RBF) kernel, while the RF classifier was constructed using 100 decision trees. The KNN classifier was optimized by selecting an appropriate value for k (set to 3) and further evaluated using three distance metrics: Euclidean, Mahalanobis, and Cityblock. To ensure consistent scaling across features, a normalization step was performed prior to classifier training. The classification performance was assessed using the 10-fold cross-validation technique.
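A comparable evaluation setup can be sketched with scikit-learn. This is an illustrative analogue, not the study's MATLAB implementation: the feature matrix is synthetic (12 features, three classes), the normalization is assumed to be z-scoring, and only the Cityblock variant of KNN is shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 12 extracted features and the three tumor grades
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_redundant=0, n_classes=3, class_sep=2.0,
                           random_state=0)

classifiers = {
    "SVM (RBF)": SVC(kernel="rbf"),
    "KNN (k=3, cityblock)": KNeighborsClassifier(n_neighbors=3, metric="manhattan"),
    "RF (100 trees)": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 10-fold stratified cross-validation; scaling is fitted inside each training fold
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_accuracy = {}
for name, clf in classifiers.items():
    model = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
```

Placing the scaler inside the pipeline means its statistics are computed from the training folds only, which avoids leaking information from the held-out fold into the normalization step.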
Furthermore, to assess the importance of the extracted features, SHAP (Shapley additive explanations) values were computed. This analysis offered insights into the contributions of individual features to the decision-making processes of the classifiers, thereby identifying the most influential features in the astrocytoma classification task44.
The experiments were conducted on a computational system equipped with an NVIDIA GeForce MX330 GPU, a Core i7-1165G7 processor with a base clock speed of 2.80 GHz, and 16.0 GB of RAM operating at a frequency of 3200 MHz. This configuration provided sufficient computational resources for the tasks performed during the experiments. All simulations and data processing were carried out using MATLAB 2021a. Additionally, Python 3.8 was employed for the SHAP analysis. The computational time for the 1D mode, spanning preprocessing to classification, was recorded as 1.67 min. In comparison, the 2D mode required 20.4 min for the same sequence of operations. On a per-slice basis, this corresponds to an average processing time of 0.042 s in the 1D mode and 0.48 s in the 2D mode.
A summary of the relevant parameters and their definitions utilized in this study is provided in Supplementary Table S1, available online.
(a) Post-contrast T1-weighted MRI image of an astrocytoma. (b) Histogram corresponding only to the brain region in part (a). (c) The whitened histogram in 1D mode, obtained from the histogram in part (b). (d) The whitened image in 2D mode. (e) A histogram of the MRI image with a non-zero background in 2D mode. (f) The whitened histogram in 2D mode, obtained from the histogram in part (e).
Results
In this section, we will examine the results of the modeling process, the significance of various features, and the grading of astrocytomas.
Performance analysis of the proposed model
To validate the proposed model, this study employed a subset of goodness-of-fit (GOF) methods, which are comprehensively described in the subsequent sections.
One of the most commonly used tools for assessing the stability and heavy-tailed properties of data is the Quantile-Quantile (Q-Q) plot. This graphical technique compares the distribution of a sample dataset to a theoretical distribution, typically a normal distribution45. In this study, Q-Q plots were generated for a sample of whitened images corresponding to various tumor grades in both 1D and 2D modes.
Figure 3 illustrates the Q-Q plots before and after the application of the proposed model. The pre-application Q-Q plots lack a specific or well-defined shape, indicating an absence of clear distributional structure in the data. In contrast, the post-application Q-Q plots exhibit distinct S-shaped patterns, suggesting that the majority of the data points are concentrated near zero, with the remaining points distributed along the tails of the theoretical distribution line. This characteristic S-shape is indicative of a stable distribution. Consequently, the Q-Q plots provide strong evidence that the whitened images conform to a stable distribution after the implementation of the proposed model, thereby validating the model’s efficacy in achieving the desired statistical properties.
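The S-shape described above can be reproduced numerically. The sketch below is a hypothetical illustration on simulated data, not on the study's images: it builds normal Q-Q points for a Gaussian sample and for a Cauchy sample (an sαs distribution with α = 1), after a robust standardization based on the median and interquartile range.

```python
from statistics import NormalDist

import numpy as np

def normal_qq_points(samples):
    """Return (theoretical normal quantiles, robustly standardized sample quantiles)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    probs = (np.arange(1, n + 1) - 0.5) / n          # plotting positions
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    # The interquartile range of a standard normal is about 1.349
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    standardized = (x - np.median(x)) / iqr * 1.349
    return theoretical, standardized

rng = np.random.default_rng(0)
theo_gauss, samp_gauss = normal_qq_points(rng.standard_normal(2000))
theo_heavy, samp_heavy = normal_qq_points(rng.standard_cauchy(2000))
```

Plotting `samp_gauss` against `theo_gauss` gives an almost straight line, while `samp_heavy` against `theo_heavy` stays close to the line near zero and bends sharply away at both ends, which is the S-shaped signature of heavy tails.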
To further evaluate the performance of the proposed model, the chi-squared distance test was utilized as a complementary quantitative analysis. This statistical measure assesses the similarity between two distributions, with a chi-squared distance close to zero indicating a better GOF between the observed data and the model’s predictions46. Accordingly, this test was conducted to examine the similarity between the distribution of the whitened images and the stable distribution.
The results, summarized in Table 4, indicate that the distribution of the whitened images in both 1D and 2D modes conforms to a stable distribution at a significance level of 0.05. Specifically, in 1D mode, the chi-squared test yielded a mean distance value of 0.00053 with a standard deviation of 0.00053 across all tumor grades (p-value > 0.05). In 2D mode, the mean chi-squared distance was 0.0044, with a standard deviation of 0.0025 across all tumor grades (p-value > 0.05). These results indicate a close fit between the whitened image distributions and the stable distribution in both processing modes.
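For reference, one common definition of the chi-squared distance between two histograms is \(\frac{1}{2}\sum_i (p_i-q_i)^2/(p_i+q_i)\), computed after normalizing both to unit sum. The sketch below assumes this definition, which may differ from the exact variant used in the study. The function name and example histograms are hypothetical.

```python
import numpy as np

def chi_squared_distance(p, q, eps=1e-12):
    """Chi-squared distance between two histograms (0 = identical, 1 = disjoint)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                     # normalize counts to probabilities
    q = q / q.sum()
    # eps guards against division by zero in bins that are empty in both histograms
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

observed = np.array([5, 20, 50, 20, 5])
model = np.array([6, 19, 50, 19, 6])
distance = chi_squared_distance(observed, model)
```

With this normalization the distance is bounded between 0 and 1, so values of the order of 0.001 correspond to histograms that are nearly indistinguishable bin by bin.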
Features importance analysis
In this study, SHAP values were utilized to evaluate the significance of individual features. SHAP offers a unified framework for interpreting predictions generated by machine learning models. It is based on cooperative game theory and the concept of the Shapley value, which quantifies each feature’s contribution to a specific prediction. This is achieved by examining all possible combinations of features and their corresponding marginal contributions. SHAP values ensure an equitable distribution of the difference between an individual prediction and the dataset’s average prediction. This framework upholds key properties, including efficiency, symmetry, and additivity, thereby providing a rigorous and consistent approach to interpreting feature importance in machine learning models44.
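The Shapley value itself can be computed exactly when the number of features is small, which makes the definition concrete. The sketch below is a toy illustration, not the SHAP library: `toy_value` is a hypothetical value function with three features and one interaction term, and each feature's Shapley value is obtained by averaging its marginal contribution over all subsets of the other features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding feature i
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

def toy_value(subset):
    """Hypothetical model output given the set of 'present' features."""
    base = {0: 2.0, 1: 1.0, 2: 0.5}
    total = sum(base[j] for j in subset)
    if 0 in subset and 1 in subset:
        total += 1.0                    # interaction between features 0 and 1
    return total

phi = shapley_values(toy_value, 3)
```

The interaction term is split equally between features 0 and 1 (symmetry), and the three values sum to the difference between the full-model output and the empty-set output (efficiency). Exact enumeration scales exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.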
Figures 4 and 5 present the mean SHAP values associated with tumor grades II, III, and IV for the 1D and 2D modes, respectively. In the 1D mode, the sαs parameters significantly contribute to the model’s ability to distinguish between tumor grades. However, their impact becomes even more pronounced in the 2D mode. This enhanced influence is likely attributed to the 2D mode’s ability to leverage spatial information and structural details within the images—key elements for achieving accurate classification. The higher SHAP values associated with the sαs parameters in the 2D mode further underscore their critical role in enhancing the model’s predictive accuracy and robustness. This observation aligns with the performance metrics, which demonstrate that the 2D mode achieved superior classification accuracy compared to the 1D mode.
Classification results
The classification results are summarized in Table 5, highlighting the effectiveness of the proposed method in classifying astrocytic brain tumors into three grades (II, III, and IV) prior to surgery, with particularly strong performance observed in the 2D mode. Among the algorithms tested, the SVM classifier demonstrated the highest accuracy in 1D mode, achieving a value of 93.52%. In 2D mode, the SVM classifier further outperformed other methods, achieving an impressive accuracy of 98.49%. Notably, the KNN classifier (with k = 3 and the Cityblock distance metric) also exhibited strong performance in 2D mode, achieving an accuracy of 98.45%.
To further evaluate the diagnostic performance of the proposed model in differentiating tumor grades, Receiver Operating Characteristic (ROC) curve analyses were conducted across all 10 test folds, and confusion matrices were generated, as illustrated in Figs. 6 and 7, respectively. In 1D mode, the area under the ROC curve (AUC) achieved an impressive value of 0.99 for each tumor grade. Meanwhile, in 2D mode, the AUC reached a perfect score of 1. These results underscore the high diagnostic accuracy and robust capabilities of the proposed model in distinguishing tumor grades. Furthermore, sensitivity, specificity, precision, F1-score, and false discovery rate (FDR) metrics were computed for both 1D and 2D modes using the SVM classifier, which achieved the highest accuracy among the tested classifiers. The detailed results of these performance metrics are presented in Table 6. Collectively, these findings highlight the reliability and efficacy of the proposed model in accurately classifying astrocytic brain tumor grades, emphasizing its potential clinical applicability.
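The per-class metrics listed above follow directly from a multi-class confusion matrix by treating each grade in turn as the positive class. The sketch below shows the computation. The confusion matrix is made up for illustration and does not reproduce the study's results.

```python
import numpy as np

def per_class_metrics(confusion):
    """One-vs-rest metrics from a confusion matrix (rows = true, columns = predicted)."""
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp            # true class i predicted as something else
    fp = cm.sum(axis=0) - tp            # other classes predicted as class i
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    fdr = fp / (tp + fp)                # false discovery rate = 1 - precision
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "fdr": fdr,
        "accuracy": tp.sum() / cm.sum(),
    }

# Hypothetical counts for grades II, III, IV (illustration only)
example_cm = [[48, 2, 0],
              [3, 45, 2],
              [0, 1, 49]]
metrics = per_class_metrics(example_cm)
```

Averaging each per-class array gives the macro-averaged figure. Note that the false discovery rate is simply the complement of precision, so the two always carry the same information.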
t-SNE visualization of model classification performance
T-distributed Stochastic Neighbor Embedding (t-SNE) is a widely utilized dimensionality reduction technique, particularly effective for the visualization of high-dimensional data. Its primary advantage lies in its ability to preserve local structures while capturing complex manifold patterns within the data. These characteristics render t-SNE highly suitable for a range of applications, including image processing and pattern recognition47,48.
To evaluate the classification performance and explore feature distribution, t-SNE was employed to project the test set predictions into a two-dimensional space. Figure 8 illustrates the resulting visualizations for both the 1D and 2D processing modes.
These plots demonstrate the degree of separability among the predicted tumor grades (II, III, and IV) within the transformed feature space. Notably, the 2D SDE representation yields more compact and distinctly separated clusters compared to the 1D representation, corresponding with the observed higher classification accuracy (98.49% versus 93.52%). This enhanced cluster separation underscores the effectiveness of the 2D SDE approach in improving feature discrimination and, consequently, classification performance.
Overall, the t-SNE visualizations support the robustness of the proposed model in distinguishing between astrocytoma grades. However, the presence of some overlapping regions indicates potential areas for improvement, particularly in refining the decision boundaries to further reduce misclassification rates.
Discussion
While recent studies on computational brain tumor grading have predominantly focused on ML- and DL-based methods, this study introduces an advanced statistical modeling approach that addresses many of the limitations inherent in these methods. The proposed method is based on SDE, commonly referred to as the innovation model. This represents a relatively new approach in statistical modeling. The proposed model eliminates the statistical dependency of pixel intensities in MRI images by employing a differential operator, known as the whitening operator. When this operator is applied directly to the image in the frequency domain as a digital filter, it whitens the image, transforming the pdf of the MRI images into a zero-mean symmetric alpha-stable distribution. Due to the inherent statistical differences between MRI slices, particularly those containing tumors, the sαs distributions exhibit distinct statistical parameters. These parameters provide valuable insights and can serve as informative features for various applications. The model was implemented in both 1D and 2D modes, with specific preprocessing steps tailored to the unique requirements of each dimensionality.
The study underscores the effectiveness of this approach by demonstrating its utility in tasks related to tumor grading. The classification outcomes highlight the model’s ability to accurately differentiate between grades of astrocytomas based on MRI images. This is particularly evident in the 2D processing mode, where the model achieved an impressive accuracy of 98.49% using the SVM classifier. The preoperative tumor grading capability of the proposed method is highly promising, particularly given its significant advantages. These include minimal data processing time—0.042 s per slice in the 1D mode and 0.48 s per slice in the 2D mode—as well as low training data requirements, addressing critical challenges associated with deep learning approaches. Deep learning methods often require substantial computational resources, prolonged processing times, and extensive training datasets, challenges that are especially problematic given the scarcity of medical image datasets.
Furthermore, compared to traditional machine learning methods, the proposed model demonstrates a high degree of accuracy while requiring a remarkably low number of features—only 12 in both 1D and 2D modes. This efficiency mitigates the curse of dimensionality often encountered in ML-based methods, emphasizing the model’s potential for practical application. Notably, the results indicate that the 2D mode outperforms the 1D mode in tumor grade discrimination due to its ability to extract a richer set of statistical features.
Additionally, SHAP analysis indicates that the alpha-stable distribution parameters of the modeled MRI images serve as biologically meaningful features for distinguishing tumor grades, particularly in the 2D mode. The consistently positive SHAP values associated with these parameters suggest their strong discriminative power in the classification model, indicating that the heavy-tailed characteristics captured by the innovation model contain pathologically relevant information about tumor heterogeneity.
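The heavy-tailed behavior mentioned above is governed by the characteristic exponent α of the SαS distribution, whose characteristic function has modulus exp(−(c|t|)^α) for a scale c, so that log(−log|φ(t)|) is linear in log|t| with slope α. The sketch below estimates α from that slope, in the spirit of the regression-type estimator of Koutrouvelis listed in the references. It is a simplified illustration, not the estimator used in this study: the function name `estimate_alpha` and the frequency grid are assumptions, and the samples are assumed to be centered with a scale close to one.

```python
import numpy as np


def estimate_alpha(samples, t_grid=None):
    """Estimate the characteristic exponent of roughly unit-scale SaS samples.

    Uses the relation log(-log|phi(t)|) = alpha*log|t| + alpha*log(c),
    with phi replaced by the empirical characteristic function.
    """
    x = np.asarray(samples, dtype=float).ravel()
    if t_grid is None:
        # Hypothetical grid; it suits data whose scale is near one.
        t_grid = np.linspace(0.1, 1.0, 10)
    t_grid = np.asarray(t_grid, dtype=float)

    # Empirical characteristic function evaluated at each t.
    ecf = np.exp(1j * np.outer(t_grid, x)).mean(axis=1)

    # Least-squares line through (log t, log(-log|ecf|)); the slope is alpha.
    y = np.log(-np.log(np.abs(ecf)))
    slope, _intercept = np.polyfit(np.log(t_grid), y, 1)
    return float(slope)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(estimate_alpha(rng.standard_normal(20000)))  # Gaussian case, alpha = 2
    print(estimate_alpha(rng.standard_cauchy(20000)))  # Cauchy case, alpha = 1
```

Gaussian and Cauchy samples are convenient sanity checks because they are the SαS cases α = 2 and α = 1; smaller estimates correspond to heavier tails.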
The proposed model achieved an AUC of 0.99 in the 1D mode and 1 in the 2D mode. Additionally, key classification metrics, including sensitivity, specificity, precision, and F1-score, were evaluated for both modes, and all exceeded 90%. These results indicate that the model performs reliably across both operational modes, making it well-suited for high-precision medical applications.
Moreover, the FDR was assessed for both the 1D and 2D modes, yielding low values of 6.5% and 1.5%, respectively. These results show that the model produces few false-positive identifications, which is critical for clinical decision-making. Collectively, these findings support the central hypothesis that statistical modeling enhances the accuracy and reliability of preoperative brain tumor grading by capturing the stochastic properties inherent in medical imaging data.
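For reference, the metrics discussed above can be computed per class from a multi-class confusion matrix in a one-vs-rest fashion, with FDR = FP / (TP + FP) = 1 − precision. The following sketch shows the arithmetic; the function name `per_class_metrics` and the confusion matrix are made up for illustration and do not reproduce the results of this study.

```python
import numpy as np


def per_class_metrics(confusion):
    """One-vs-rest metrics from a confusion matrix.

    Rows are true classes and columns are predicted classes.
    """
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but belonging elsewhere
    fn = cm.sum(axis=1) - tp  # belonging to the class, but predicted elsewhere
    tn = cm.sum() - tp - fp - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "fdr": fp / (tp + fp),
    }


if __name__ == "__main__":
    # Illustrative three-class matrix (e.g. Grades II, III, IV); not real data.
    example = [[8, 1, 1],
               [0, 9, 1],
               [1, 0, 9]]
    for name, values in per_class_metrics(example).items():
        print(name, np.round(values, 3))
```

Averaging the per-class values gives macro-averaged figures; whether the values reported here are macro- or micro-averaged is not assumed by this sketch.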
Finally, the effectiveness of the proposed model in whitening MRI images was evaluated through goodness-of-fit tests, as detailed in the results section. These tests confirmed the model’s ability to achieve a statistically significant fit to the whitened data in both 1D and 2D modes.
Three previous studies that used the same dataset were identified, each employing a distinct methodology for three-class tumor grade classification. Usuzaki et al.29 implemented the Variable Vision Transformer (vViT), a multimodal deep learning model, and achieved an accuracy of 84%. This approach leveraged the transformer architecture’s ability to capture long-range dependencies in data, which is particularly advantageous for complex medical imaging tasks. However, the computational complexity and high resource requirements of transformer-based models can be significant limitations, especially in resource-constrained medical environments.
Similarly, Wu et al.30 employed an end-to-end multi-task deep learning (MDL) pipeline, achieving an accuracy of 83.7%. The multi-task framework enabled the model to learn multiple related tasks simultaneously, potentially enhancing generalization. Nevertheless, the performance improvement over single-task models was marginal, and the increased complexity of the pipeline posed challenges in implementation and hyperparameter tuning.
In another study, Vale et al.31 utilized ResNet50 in combination with a weighted random sampler and data augmentation, obtaining an accuracy of 62.26% for three-class classification. While ResNet50 provided a robust baseline, the relatively low accuracy suggests that the model struggled to capture the variability and complexity of the dataset, despite the use of data augmentation and sampling strategies.
In contrast, the proposed method in this study demonstrated superior performance, achieving accuracies of 93.52% in 1D mode and 98.49% in 2D mode. These results underscore the effectiveness of the proposed approach in capturing the underlying patterns and features of the dataset, particularly in 2D mode, where the model’s ability to leverage spatial information significantly enhanced its predictive capabilities.
The proposed method presents several notable advantages, particularly its capacity to address the long-range dependence and self-similarity inherent in medical imaging data, features that are often overlooked by traditional ML and DL approaches. This capability is reflected in the improved classification accuracy achieved by the model. The method is also computationally efficient and scalable, making it well-suited for real-world clinical applications where computational resources and processing time are limited. However, the approach relies on the quality and preprocessing of the input data, and further validation on larger and more diverse datasets is necessary to establish its generalizability and robustness.
Regardless of the dataset type, previous studies employing deep learning methods for tumor classification have reported their highest accuracy rates in binary classification tasks. Notably, Han Trong et al.32 and Shargunam et al.13 both achieved 99% accuracy in classifying tumors as low- or high-grade, demonstrating the effectiveness of deep learning for relatively straightforward binary classification problems. In contrast, the present study addresses the more complex challenge of three-class tumor classification, achieving an accuracy of 98.49%. This performance is comparable to, and in some cases exceeds, the results of studies focused on simpler binary classification frameworks. Among multi-class studies, Lo et al.6 and Gilanie et al.5 reported accuracy rates of 97% and 96%, respectively, for three-class and four-class classification using deep learning. These comparisons further underscore the competitive performance of the method proposed in this study.
The findings of this study underscore the exceptional performance of the proposed method in the preoperative classification of astrocytomas, demonstrating high accuracy even in the context of complex multi-class classification tasks. This is particularly noteworthy given the inherent challenges of medical image analysis, including variations in tumor morphology, imaging artifacts, and inter-patient heterogeneity, all of which can adversely affect diagnostic precision. The robustness and reliability exhibited by the method further emphasize its clinical utility, offering a promising pathway toward more accurate, automated, and efficient preoperative assessment of astrocytoma cases.
Moreover, the proposed approach provides meaningful support for clinicians by enabling precise tumor grading prior to surgical intervention. This diagnostic accuracy facilitates the development of timely and individualized treatment plans, thereby accelerating the therapeutic process and improving overall patient outcomes. In addition, by reducing reliance on resource-intensive and complex diagnostic procedures, the method presents a cost-effective alternative for clinical implementation. Its high computational efficiency enhances its suitability for deployment in real-world medical environments, establishing the method as a practical and effective tool for evidence-based clinical decision-making.
Conclusion
This study introduces a new methodology for grading astrocytic brain tumors using statistical modeling based on stochastic differential equations (SDE) applied to post-contrast T1-weighted brain MRI images. This model, employed for the first time in MRI analysis, assigns a symmetric alpha-stable (SαS) random process to each tumor-containing MRI slice. Goodness-of-fit analysis confirms that the SDE model fits MRI data effectively. The model was applied in both 1D and 2D configurations, with specific preprocessing steps for each. Features were extracted from the alpha-stable distributions and whitened images, and three classification algorithms were evaluated for tumor grade differentiation. The SVM achieved the highest accuracy, with 93.52% in 1D mode and 98.49% in 2D mode, showing comparable or superior performance to existing methods. The proposed feature extraction approach, grounded in SDE-based statistical modeling, provides a robust and interpretable set of parameters that enhance the grading of astrocytomas. Notably, the SαS distribution parameters contribute strongly to the classifier’s predictions, as indicated by the SHAP analysis, underscoring their potential for integration into computer-aided diagnosis systems.
This method also addresses the challenge of limited medical data: it requires less data and fewer features than deep learning approaches while maintaining fast processing times. By reducing the need for large datasets, it offers a promising alternative to deep learning techniques.
Subsequent research should aim to expand the methodological framework by incorporating additional tumor cohorts, diverse MRI acquisition protocols, and a wider array of brain MRI applications. Such efforts would enhance the robustness and clinical relevance of the proposed approach, thereby strengthening its contributions to medical image analysis. Additionally, the integration of statistical modeling techniques with deep learning architectures—particularly through the combination of stochastic differential equation-based methods and convolutional neural networks (SDE-CNN)—offers significant potential for improving tumor grading accuracy and advancing the development of automated diagnostic systems.
Data availability
The data utilized in this study are publicly available through The Cancer Imaging Archive (TCIA) and can be accessed at the following link: https://www.cancerimagingarchive.net/collection/ucsf-pdgm.
References
George, N. & Manuel, M. A four grade brain tumor classification system using deep neural network. In 2nd Int. Conf. Signal. Process. Commun. ICSPC 2019 - Proc., 127–132. https://doi.org/10.1109/ICSPC46172.2019.8976495 (2019).
Raisi-Nafchi, M., Faeghi, F., Zali, A., Haghighatkhah, H. & Jalal-Shokouhi, J. Preoperative grading of astrocytic supratentorial brain tumors with diffusion-weighted magnetic resonance imaging and apparent diffusion coefficient. Iran. J. Radiol. 13 (2016).
Saluja, S., Trivedi, M. C. & Saha, A. Deep CNNs for glioma grading on conventional MRIs: Performance analysis, challenges, and future directions. Math. Biosci. Eng. 21, 5250–5282 (2024).
Chen, B. et al. Differentiation of low-grade astrocytoma from anaplastic astrocytoma using radiomics-based machine learning techniques. Front. Oncol. 11, 1–7 (2021).
Gilanie, G., Bajwa, U. I., Waraich, M. M. & Anwar, M. W. Risk-free WHO grading of Astrocytoma using convolutional neural networks from MRI images. Multimed. Tools Appl. 80, 4295–4306 (2021).
Lo, C., Chen, Y., Weng, R. & Hsieh, K. L. Intelligent glioma grading based on deep transfer learning of MRI radiomic features. Appl. Sci.
Kumar, A. et al. Machine-learning-based radiomics for classifying glioma grade from magnetic resonance images of the brain. J. Pers. Med. 13, 920 (2023).
Vijithananda, S. M. et al. Texture feature analysis of MRI-ADC images to differentiate glioma grades using machine learning techniques. Sci. Rep. 13 (2023).
Tian, Z. et al. Glioblastoma and anaplastic astrocytoma: Differentiation using MRI texture analysis. Front. Oncol. 9 (2019).
Dong, F. et al. Differentiation between pilocytic Astrocytoma and glioblastoma: A decision tree model using contrast-enhanced magnetic resonance imaging-derived quantitative radiomic features. Eur. Radiol. 29, 3968–3975 (2019).
Rizky, T., Mutig, G. & Hardi, I. Integrating explainable artificial intelligence and light gradient boosting machine for glioma grading. Inf. Health 2, 1–8 (2025).
Murphy, P. & Murphy, K. K. P. A probabilistic perspective. Chance encounters: Probability in … (2012).
Shargunam, S. & Rajakumar, G. An efficient glioma classification and grade detection using hybrid convolutional neural network-based SVM model. Imaging Sci. J. 72, 1–22 (2023).
Gutta, S., Acharya, J., Shiroishi, M. S., Hwang, D. & Nayak, K. S. Improved glioma grading using deep convolutional neural networks. Am. J. Neuroradiol. 42, 233–239 (2021).
Naser, M. A. & Deen, M. J. Brain tumor segmentation and grading of lower-grade glioma using deep learning in MRI images. Comput. Biol. Med. 121, 103758 (2020).
Shoeibi, A. et al. Diagnosis of schizophrenia in EEG signals using dDTF effective connectivity and new pretrained CNN and transformer models. In Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14674 LNCS (Springer, 2024).
Goceri, E. An efficient network with CNN and transformer blocks for glioma grading and brain tumor classification from MRIs. Expert Syst. Appl. 268, 126290 (2025).
Amini, Z. & Rabbani, H. Classification of medical image modeling methods: A review. Curr. Med. Imaging Rev. 12, 1–1 (2016).
Tajmirriahi, M. & Amini, Z. Modeling of seizure and seizure-free EEG signals based on stochastic differential equations. Chaos Solitons Fractals 150, 111104 (2021).
Bian, Z. MR brain tissue classification based on the spatial information enhanced Gaussian mixture model. Technol. Health Care 30, S81–S89 (2022).
Cheng, N., Cao, C., Yang, J., Zhang, Z. & Chen, Y. A spatially constrained skew Student’s-t mixture model for brain MR image segmentation and bias field correction. Pattern Recognit. 128, 108658 (2022).
Pravitasari, A. A. et al. A Bayesian neo-normal mixture model (Nenomimo) for MRI-based brain tumor segmentation. Appl. Sci. 10, 1–19 (2020).
Pravitasari, A. A. et al. Bayesian spatially constrained Fernandez-Steel Skew Normal Mixture model for MRI-based brain tumor segmentation. In AIP Conf. Proc. Vol. 2194 (2019).
Peis, I. et al. MRI brain segmentation using hidden Markov random fields with alpha-stable distributions. In 2016 IEEE Nucl. Sci. Symp. Med. Imaging Conf. Room-Temperature Semicond. Detect. Work. NSS/MIC/RTSD 2016 (2017).
Xia, Y., Ji, Z. & Zhang, Y. Brain MRI image segmentation based on learning local variational Gaussian mixture models. Neurocomputing 204, 189–197 (2016).
Chua, A. S. et al. Handling changes in MRI acquisition parameters in modeling whole brain lesion volume and atrophy data in multiple sclerosis subjects: Comparison of linear mixed-effect models. NeuroImage Clin. 8, 606–610 (2015).
Balafar, M. A. Gaussian mixture model based segmentation methods for brain MRI images. Artif. Intell. Rev. 41, 429–439 (2014).
Calabrese, E. et al. The University of California San Francisco preoperative diffuse glioma MRI (UCSF-PDGM) (Version 4) [Dataset]. Cancer Imaging Arch. https://doi.org/10.7937/tcia.bdgf-8v37 (2022).
Usuzaki, T. et al. Grading diffuse glioma based on 2021 WHO grade using self-attention-base deep learning architecture: Variable vision transformer (vViT). Biomed. Signal. Process. Control 91, 106001 (2024).
Wu, X. et al. Biologically interpretable multi-task deep learning pipeline predicts molecular alterations, grade, and prognosis in glioma patients. NPJ Precis. Oncol. 8, 1–14 (2024).
Vale, P., Boer, J., Oliveira, H. P. & Pereira, T. Deep learning models to predict brain cancer grade through MRI analysis. In Proc. - IEEE Symp. Comput. Med. Syst., 153–157. https://doi.org/10.1109/CBMS61543.2024.00033 (2024).
Han Trong, T., Van, N. & Vu Dang, L. H. High-performance method for brain tumor feature extraction in MRI using complex network. Appl. Bionics Biomech. (2023).
Togao, O. et al. Gamma distribution model of diffusion MRI for the differentiation of primary central nerve system lymphomas and glioblastomas. PLoS One 15, 1–15 (2020).
Salas-Gonzalez, D. et al. Parameterization of the distribution of white and grey matter in MRI using the α-stable distribution. Comput. Biol. Med. 43, 559–567 (2013).
Vadaparthi, N., Srinivas, Y. & Penumatsa, S. V. Unsupervised medical image segmentation on brain MRI images using skew Gaussian distribution. In Int. Conf. Recent. Trends Inf. Technol. ICRTIT 2011, 1293–1297. https://doi.org/10.1109/ICRTIT.2011.5972371 (2011).
Unser, M. & Tafti, P. D. Stochastic models for sparse and piecewise-smooth signals. IEEE Trans. Signal. Process. 59, 989–1006 (2010).
Mandelbrot, B. B. & Van Ness, J. W. Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10, 422–437 (1968).
Tajmirriahi, M., Amini, Z., Hamidi, A., Zam, A. & Rabbani, H. Modeling of retinal optical coherence tomography based on stochastic differential equations: Application to denoising. IEEE Trans. Med. Imaging. 40, 2129–2141 (2021).
Chen, J., Chen, H., Cai, X., Weng, P. & Nie, H. Parameter estimation of stable distribution based on zero-order statistics. In AIP Conf. Proc. Vol. 1864 (2017).
Das, S. & Pan, I. Fractional Order Signal Processing: Introductory Concepts and Applications 83–96 (Springer, 2012). https://doi.org/10.1007/978-3-642-23117-9.
Véhel, J. L. et al. Explicit and combined estimators for stable distributions parameters. HAL Id: hal-01791934 (2021).
Koutrouvelis, I. A. Regression-type estimation of the parameters of stable laws. J. Am. Stat. Assoc. 75, 918–928 (1980).
Du, P. et al. Predicting histopathological grading of adult gliomas based on preoperative conventional multimodal MRI radiomics: A machine learning model. Brain Sci. 13 (2023).
Molnar, C. Interpretable Machine Learning. A Guide for Making Black Box Models Explainable (2020).
Wilk, M. B. & Gnanadesikan, R. Probability plotting methods for the analysis of data. Biometrika 55, 1–17 (1968).
Forero, M. G., Arias-Rubio, C. & González, B. T. Analytical comparison of histogram distance measures. In Lect. Notes Comput. Sci. (including Subser. Lect Notes Artif. Intell. Lect Notes Bioinformatics) 81–90 (2019).
Stevens, F., Carrasco, B., Baeten, V. & Fernández Pierna, J. A. Use of t-distributed stochastic neighbour embedding in vibrational spectroscopy. J. Chemom. 38, 1–11 (2024).
Bdaqli, M. et al. Diagnosis of Parkinson disease from EEG signals using a CNN-LSTM model and explainable AI. In Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14674 LNCS (Springer, 2024).
Funding
This research was funded by the Vice-Chancellery for Research and Technology at Isfahan University of Medical Sciences under Grant Number 3402464.
Author information
Contributions
Conceptualization, M.R.N. and Z.A.; methodology, M.R.N., Z.A., and M.T.; software, M.R.N. and M.T.; validation, M.R.N., Z.A., M.T., and H.R.; formal analysis, M.R.N., Z.A., and M.T.; resources, M.R.N.; data curation, M.R.N.; writing—original draft preparation, M.R.N., Z.A., and M.T.; writing—review and editing, M.R.N., Z.A., M.T., and H.R.; supervision, Z.A., M.T., and H.R.; project administration, Z.A. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Raisi-Nafchi, M., Tajmirriahi, M., Rabbani, H. et al. Stochastic differential equation modeling approach for grading astrocytomas on brain MRI images. Sci Rep 15, 22835 (2025). https://doi.org/10.1038/s41598-025-06144-0