Abstract
Accurate brain tumor classification from magnetic resonance imaging (MRI) is critical for early diagnosis and effective clinical decision-making. Although recent CNN–Transformer hybrid models have shown promising performance, most approaches rely primarily on single-modal spatial information, limiting their ability to capture complementary spectral features, model tumor heterogeneity, and generalize across datasets. To address these challenges, this paper proposes MM-FD-ConvFormer, a multimodal frequency-aware deformable CNN–Transformer network for robust brain tumor classification with enhanced interpretability. The proposed model integrates three complementary modalities: (1) spatial MRI representations from original images, (2) frequency-domain MRI representations obtained via Fourier or wavelet transforms to capture texture and intensity variations, and (3) multi-scale contextual features for modeling global dependencies. A ConvNeXt V2 backbone is employed to extract discriminative spatial features, while a parallel lightweight ConvNeXt-based branch processes frequency-domain inputs. These features are subsequently fused and refined using a Swin Transformer V2 to capture long-range contextual relationships. To effectively integrate heterogeneous modalities and adapt to irregular tumor boundaries, a deformable cross-modal attention mechanism is introduced, enabling dynamic and shape-aware feature fusion. Final classification is performed on a unified multimodal representation, with an optional uncertainty-aware prediction head to improve reliability. The proposed model is evaluated using multiple public datasets, including the Kaggle Brain Tumor MRI and Figshare datasets for training, with external validation on the clinically relevant BraTS 2020/2021 dataset and optional testing on TCIA/REMBRANDT to assess cross-dataset generalization. Extensive experiments demonstrate that MM-FD-ConvFormer consistently outperforms standard CNN baselines, advanced transformer-based models, and hybrid approaches in terms of accuracy, macro-F1 score, and AUC. Furthermore, qualitative analyses using Grad-CAM, attention map visualization, and weakly supervised pseudo-segmentation provide interpretable insights into tumor localization and model decision-making. Overall, MM-FD-ConvFormer offers a robust, interpretable, and generalizable solution for automated brain tumor classification in real-world clinical imaging applications.
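To make the two-branch pipeline concrete, the following is a minimal PyTorch sketch of the design described above: a spatial branch, a lightweight frequency branch fed with the log-magnitude Fourier spectrum, and a cross-modal attention fusion stage followed by a classification head. All module names, channel sizes, and the dense (non-deformable) attention are illustrative assumptions, not the authors' implementation; in the full model, ConvNeXt V2 and Swin Transformer V2 backbones would replace the small stand-in encoders.

```python
import torch
import torch.nn as nn

class SmallEncoder(nn.Module):
    """Stand-in CNN encoder (a ConvNeXt V2 backbone would be used in practice)."""
    def __init__(self, in_ch=1, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, kernel_size=4, stride=4),  # patchify stem
            nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
            nn.GELU(),
        )
    def forward(self, x):
        return self.net(x)  # (B, dim, H/4, W/4)

class CrossModalFusion(nn.Module):
    """Simplified cross-attention between spatial and frequency tokens;
    a dense stand-in for the paper's deformable cross-modal attention."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
    def forward(self, spatial_tokens, freq_tokens):
        # Spatial tokens query the frequency tokens; residual + norm.
        fused, _ = self.attn(spatial_tokens, freq_tokens, freq_tokens)
        return self.norm(spatial_tokens + fused)

class MMFDConvFormerSketch(nn.Module):
    def __init__(self, num_classes=4, dim=64):
        super().__init__()
        self.spatial_branch = SmallEncoder(1, dim)
        self.freq_branch = SmallEncoder(1, dim)  # lightweight frequency branch
        self.fusion = CrossModalFusion(dim)
        self.head = nn.Linear(dim, num_classes)
    def forward(self, x):
        # Frequency-domain input: log-magnitude of the centered 2-D FFT.
        freq = torch.fft.fft2(x)
        freq = torch.log1p(torch.abs(torch.fft.fftshift(freq)))
        s = self.spatial_branch(x).flatten(2).transpose(1, 2)     # (B, N, dim)
        f = self.freq_branch(freq).flatten(2).transpose(1, 2)     # (B, N, dim)
        fused = self.fusion(s, f)
        return self.head(fused.mean(dim=1))  # global pool -> class logits

model = MMFDConvFormerSketch(num_classes=4)  # e.g. glioma/meningioma/pituitary/no-tumor
logits = model(torch.randn(2, 1, 224, 224))
print(logits.shape)  # torch.Size([2, 4])
```

Note that the deformable attention proposed in the paper would learn sampling offsets concentrated around irregular tumor boundaries rather than attending densely to all frequency tokens as the `nn.MultiheadAttention` stand-in does here.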
Data availability
The datasets used in this study are publicly available and can be accessed via References 33 to 37.
Abbreviations
- AUC: Area Under the Curve
- ANOVA: Analysis of Variance
- BraTS: Brain Tumor Segmentation Dataset
- CAM: Class Activation Mapping
- CNN: Convolutional Neural Network
- ConvNeXt: Convolutional Next Architecture
- DNN: Deep Neural Network
- F1-score: Harmonic Mean of Precision and Recall
- FD: Frequency Domain
- GAN: Generative Adversarial Network
- GNN: Graph Neural Network
- Grad-CAM: Gradient-weighted Class Activation Mapping
- IoU: Intersection over Union
- K-fold: K-Fold Cross Validation
- Macro-F1: Macro-Averaged F1 Score
- LSTM: Long Short-Term Memory
- MRI: Magnetic Resonance Imaging
- MM-FD-ConvFormer: Multimodal Frequency-Domain Convolutional Transformer
- MODIS: Moderate Resolution Imaging Spectroradiometer
- NIH: National Institutes of Health
- ROC: Receiver Operating Characteristic
- SD: Standard Deviation
- SHAP: SHapley Additive exPlanations
- SOTA: State of the Art
- Swin: Shifted Window Transformer
- TCIA: The Cancer Imaging Archive
- TPR: True Positive Rate
- U-Net: U-shaped Neural Network
- ViT: Vision Transformer
- XAI: Explainable Artificial Intelligence
References
Li, Z., Wan, Y., Xu, L., Su, W., Wang, P. & Zhou, X. A diffusion model and few-shot learning framework for EEG abnormality classification and cerebral infarction assessment. IEEE Trans. Comput. Soc. Syst. https://doi.org/10.1109/TCSS.2026.3665229.
Jin, L. et al. Frequency-aware spatial–temporal attention explainable network for EEG decoding. IEEE J. Biomed. Health Inform. 29 (10), 7175–7185. https://doi.org/10.1109/JBHI.2025.3576088 (2025).
Jiang, M. et al. Frequency-aware diffusion model for multi-modal MRI image synthesis. J. Imaging. 11 (5), 152 (2025).
Ke, L. et al. Brain tumor classification from MRI images using a multi-scale channel attention CNN integrated with SVM. Sci. Rep. 16, 6297. https://doi.org/10.1038/s41598-026-36164-3 (2026).
Shinde, R. U., Sangolagi, V. A., Patil, M. B., Mhetre, V. & Kulkarni, S. Scalable and robust CNN models for brain tumor detection in healthcare applications. Proc. Copyr. 346, 353 (2024).
Sahoo, J. R., Nanda, S. K. & Panda, G. Brain tumor detection using transformer-based EfficientB0Net. SN Comput. Sci. 7 (1), 80 (2026).
Anand, V., Khajuria, A., Pachauri, R. K. & Gupta, V. Multi-class classification of brain tumors using optimized CNN and transfer learning techniques. Sci. Rep. 16, 4709. https://doi.org/10.1038/s41598-025-34806-6 (2026).
Tang, Z., Liao, X., Liao, B., Shen, C. & Zhang, Y. MRI brain tumor classification using RFENet three-branch model with SwishReLU. Biomed. Signal. Process. Control. 113, 108893 (2026).
Pacal, I. & Banerjee, T. Towards accurate and interpretable brain tumor diagnosis: T-FSPANNet with tri-attribute and pyramidal attention-based feature fusion. Biomed. Signal. Process. Control. 113, 108852 (2026).
Balamurugan, A. G., Srinivasan, S., Monica, P., Mathivanan, S. K. & Shah, M. A. Robust brain tumor classification by fusion of deep learning and channel-wise attention mode approach. BMC Med. Imaging. 24 (1), 147 (2024).
Prasad, A. Y., Tanaka, K., Krishnamoorthy, R. & Thiagarajan, R. Robust brain tumor detection and classification from multichannel MRI using deep learning. Dev. Neurobiol. 85 (3), e22991 (2025).
Nassar, S. E., Yasser, I., Amer, H. M. & Mohamed, M. A. A robust MRI-based brain tumor classification via a hybrid deep learning technique. J. Supercomput. 80 (2), 2403–2427 (2024).
Ganesh, S., Kannadhasan, S. & Jayachandran, A. Multi-class robust brain tumor with hybrid classification using DTA algorithm. Heliyon 10 (1). https://doi.org/10.1016/j.heliyon.2023.e23610 (2024).
Zhang, J. et al. EFF_D_SVM: a robust multi-type brain tumor classification system. Front. Neurosci. 17, 1269100 (2023).
Shah, H. A. et al. A robust approach for brain tumor detection in magnetic resonance images using finetuned EfficientNet. IEEE Access. 10, 65426–65438 (2022).
Hasan, N., Ahmed, M. F., Nasif, M. A., Haq, M. R. & Rahman, M. Hybrid feature extraction approach for robust brain tumor classification: HOG, GLCM, and artificial neural network. In: Proc 6th Int Conf Electrical Engineering and Information & Communication Technology (ICEEICT). IEEE, pp 1292–1297 (2024).
Babu Vimala, B. et al. Detection and classification of brain tumor using hybrid deep learning models. Sci. Rep. 13 (1), 23029 (2023).
Agarwal, M. et al. Deep learning for enhanced brain tumor detection and classification. Results Eng. 22, 102117 (2024).
Sharif, M. I., Khan, M. A., Alhussein, M., Aurangzeb, K. & Raza, M. A decision support system for multimodal brain tumor classification using deep learning. Complex. Intell. Syst. 8 (4), 3007–3020 (2022).
Lerousseau, M., Deutsch, E. & Paragios, N. Multimodal brain tumor classification. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2020 (eds Crimi, A. & Bakas, S.), Lecture Notes in Computer Science, vol. 12659 (Springer, Cham). https://doi.org/10.1007/978-3-030-72087-2_42 (2021).
Usha, M. P., Kannan, G. & Ramamoorthy, M. Multimodal brain tumor classification using convolutional Tumnet architecture. Behav. Neurol. 2024, 4678554 (2024).
Rohini, A. et al. Multimodal hybrid convolutional neural network based brain tumor grade classification. BMC Bioinform. 24 (1), 382 (2023).
Ullah, M. S. et al. BrainNet: a fusion assisted novel optimal framework of residual blocks and stacked autoencoders for multimodal brain tumor classification. Sci. Rep. 14 (1), 5895 (2024).
Razzaghi, P., Abbasi, K., Shirazi, M. & Rashidi, S. Multimodal brain tumor detection using multimodal deep transfer learning. Appl. Soft Comput. 129, 109631 (2022).
Khan, M. A. et al. Multimodal brain tumor detection and classification using deep saliency map and improved dragonfly optimization algorithm. Int. J. Imaging Syst. Technol. 33 (2), 572–587 (2023).
Lilhore, U. K. et al. AG-MS3D-CNN: multiscale attention-guided 3D convolutional neural network for robust brain tumor segmentation across MRI protocols. Sci. Rep. 15 (1), 24306 (2025).
Lilhore, U. K., Simaiya, S., Prasad, D. & Guleria, K. A hybrid tumour detection and classification based on machine learning. J. Comput. Theor. Nanosci. 17 (6), 2539–2544 (2020).
Dalal, S. et al. An efficient brain tumor segmentation method based on adaptive moving self-organizing map and fuzzy K-mean clustering. Sensors 23 (18), 7816 (2023).
Simaiya, S., Lilhore, U. K., Walia, R., Chauhan, S. & Vajpayee, A. An efficient brain tumour detection from MR images based on deep learning and transfer learning model. In: Proc Int Conf IoT, Communication and Automation Technology (ICICAT). IEEE, pp 1–5 (2023).
Sayah, A. et al. Enhancing the REMBRANDT MRI collection with expert segmentation labels and quantitative radiomic features. Sci. Data. 9 (1), 338 (2022).
Henry, T. et al. Brain tumor segmentation with self-ensembled, deeply supervised 3D U-Net neural networks: a BraTS 2020 challenge solution. In MICCAI Brainlesion Workshop, 327–339 (Springer, 2020). arXiv:2011.01045.
Kaggle Brain Tumor MRI Dataset. Available online: https://www.kaggle.com/datasets/masoudnickparvar/brain-tumor-mri-dataset (2023).
Figshare Brain Tumor Dataset. Available online: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427 (2015).
BraTS 2021 Dataset. The Cancer Imaging Archive. Available online: https://www.cancerimagingarchive.net/analysis-result/rsna-asnr-miccai-brats-2021/ (2021).
TCIA REMBRANDT Dataset. The Cancer Imaging Archive. Available online: https://www.cancerimagingarchive.net/collection/rembrandt/ (2022).
Acknowledgements
The authors would like to acknowledge the Deanship of Graduate Studies and Scientific Research, Taif University for funding this work.
Author information
Authors and Affiliations
Contributions
Anto Lourdu Xavier Raj Arockia Selvarathinam contributed to conceptualization, methodology design, model development, experimental implementation, and manuscript drafting. Umesh Kumar Lilhore contributed to conceptualization, supervision, formal analysis, results interpretation, and critical revision of the manuscript. Roobaea Alroobaea contributed to data curation, experimental validation, and performance analysis. Majed Alsafyani contributed to dataset preparation, experimental support, and result verification. Abdullah M. Baqasah contributed to literature review, comparative analysis, and manuscript editing. Sultan Algarni contributed to statistical analysis, result visualization, and discussion refinement. MD Monish Khan contributed to supervision, project administration, funding acquisition, and final manuscript approval.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Arockia Selvarathinam, A.X., Lilhore, U.K., Alroobaea, R. et al. MM-FD-ConvFormer: multimodal frequency-aware deformable CNN–Transformer network for robust brain tumor classification. Sci. Rep. (2026). https://doi.org/10.1038/s41598-026-43616-3
DOI: https://doi.org/10.1038/s41598-026-43616-3


