Abstract
Early detection of maize leaf diseases is essential to prevent yield losses. Existing vision-based models struggle in real-world environments because of data imbalance, lighting variations, and limited interpretability. This study presents MaizeFormerX, a lightweight Vision Transformer designed for cross-domain, explainable maize disease detection in resource-limited settings. MaizeFormerX employs multi-scale patch embeddings and a Cross-Scale Attention Fusion (CSAF) module to capture both fine lesion textures and larger disease patterns. The CSAF output is processed by a transformer encoder stack that uses multi-head self-attention to model long-range dependencies. Robust preprocessing and dataset-specific augmentations were applied to improve feature extraction and address class imbalance in the Dataverse, Tanzania, and Plagues Maiz datasets. For interpretability, Grad-CAM was used for pixel-level saliency mapping in an efficient web application. When benchmarked against MobileViT, EfficientFormer, TinyViT, and Swin Transformer, MaizeFormerX achieved 97.8% accuracy on Dataverse, 97.5% on Tanzania, and 96.9% on Plagues Maiz, outperforming Swin Transformer V2 by 2–3%. Cross-domain testing yielded 88.9% accuracy when the model was trained on Dataverse and tested on Tanzania, surpassing baseline performance by 3–6%. Class-wise analysis revealed F1 scores over 98% for the Healthy and MLB classes with 6× augmentation, and over 97% for MSV. Ablation studies showed that the cross-scale attention module is important for maintaining a high Matthews correlation coefficient (MCC) under domain shift. This study introduces a precise, explainable, and efficient image-based method for classifying maize diseases, which could support more targeted crop management, reduce unnecessary agrochemical use, and promote sustainable maize production in future decision-support environments.
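The abstract reports class-wise F1 scores and MCC. As a minimal sketch of how these two metrics are commonly computed from a confusion matrix, the Python below uses the standard per-class F1 definition and the multi-class generalization of MCC. The function names, the class order (Healthy, MLB, MSV), and the confusion-matrix values are hypothetical illustrations, not results or code from the study.

```python
from math import sqrt


def per_class_f1(cm):
    """Per-class F1 from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp     # predicted k, truly another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def multiclass_mcc(cm):
    """Multi-class Matthews correlation coefficient from a confusion matrix."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                   # total number of samples
    c = sum(cm[k][k] for k in range(n))               # correctly classified samples
    t = [sum(cm[k]) for k in range(n)]                # true count per class
    p = [sum(cm[i][k] for i in range(n)) for k in range(n)]  # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = sqrt((s * s - sum(pk * pk for pk in p)) *
               (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


# Hypothetical confusion matrix (rows = true class, columns = predicted class),
# with an assumed class order of Healthy, MLB, MSV. Values are made up.
example_cm = [
    [50, 0, 0],
    [1, 48, 1],
    [0, 2, 48],
]

if __name__ == "__main__":
    print("Per-class F1:", per_class_f1(example_cm))
    print("MCC:", multiclass_mcc(example_cm))
```

Unlike accuracy, MCC draws on every cell of the confusion matrix, so it is less easily inflated by a dominant class. This is one reason it is often reported alongside accuracy for imbalanced or domain-shifted test sets.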
Data availability
The datasets used in this study are publicly available from Dataverse (https://doi.org/10.7910/DVN/LPGHKK), Tanzania (https://doi.org/10.17632/fkw49mz3xs.1), and Plagues Maiz (https://figshare.com/articles/MaizePD/10314539/3). All code, preprocessing pipelines, and experimental configurations used in this work are available at https://github.com/rezaul-h/MaizeFormerX/.
Funding
This research received no specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
Conceptualization: Md Mostafizur Rahman, Md Najmul Gony, Muhammad E. H. Chowdhury, M Murugappan; Data Curation: Md Mostafizur Rahman, Md Shafiq Ullah, Md Habibur Rahman, Sd Maria Khatun Shuvra; Literature Review: S M Masfequier Rahman Swapno; Formal Analysis: M Murugappan, Rezaul Haque, Md. Redwan Ahmed, Md Habibur Rahman; Methodology: Mnr, Muhammad E. H. Chowdhury, M Murugappan, V Saravanan; Supervision: M Murugappan, Muhammad E. H. Chowdhury, Md Habibur Rahman; Software: Md Mostafizur Rahman, Md Najmul Gony, Md Shafiq Ullah, S M Masfequier Rahman Swapno; Validation: Gomesh Nair, Md. Redwan Ahmed, Md Habibur Rahman, Rezaul Haque, S M Masfequier Rahman Swapno; Visualization: Sd Maria Khatun Shuvra, Md Shafiq Ullah, Md Habibur Rahman; Writing - Original Draft: Mnr, Md Najmul Gony, Muhammad E. H. Chowdhury; Review & Revise Draft: M Murugappan, Gomesh Nair, V Saravanan, S M Masfequier Rahman Swapno.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Rahman, M.M., Gony, M.N., Ullah, M.S. et al. MaizeFormerX: a lightweight vision transformer with cross-scale attention for explainable maize leaf disease diagnosis. Sci Rep (2026). https://doi.org/10.1038/s41598-026-44550-0
DOI: https://doi.org/10.1038/s41598-026-44550-0