Abstract
Deep learning applications for early and accurate disease detection in tomato plants have been extensively developed to support agricultural automation. However, resource-constrained agricultural edge environments impose a fundamental trade-off between computational efficiency and diagnostic accuracy. This paper proposes an evaluation framework that assesses the deployment potential of seven architectures representing standard, efficient, and hybrid CNN designs: ShuffleNetV2, MobileNetV3-Small, SqueezeNet, MobilePlantViT, DenseNet121, ResNet50, and VGG16. The architectures are compared on explainability, computational efficiency, and diagnostic performance. Experiments on a tomato-disease subset of the PlantVillage dataset yield three main findings. First, MobilePlantViT offers the best balance between efficiency and performance among the evaluated models. Second, we propose the perturbation stability score (PSS), a metric for quantitatively assessing the explanations produced by XAI methods (Grad-CAM, SHAP, and LIME) and identifying the best option for edge devices. Third, we measure inference on CPU to better reflect real deployment scenarios and find that the hybrid design effectively leverages parallel computing. Based on these findings, MobilePlantViT is the most suitable of the evaluated architectures for applications that must run on resource-limited edge devices while achieving high diagnostic accuracy (above 99.5%).
Data availability
The datasets generated and/or analysed during the current study are tomato subsets created from two publicly available datasets (PlantVillage and PlantDoc). We enhanced the quality of the PlantDoc dataset by working with experts to identify and crop regions containing disease-specific symptoms and to remove irrelevant image content. For long-term preservation and ease of access, copies of the datasets are stored in public repositories and are available at the following links: https://www.kaggle.com/datasets/cthngon/tomato-plantvillage-datasets, https://www.kaggle.com/datasets/cthngon/tomato-only.
Acknowledgements
This work was supported by project B2026-MHN01.01 and the Posts and Telecommunications Institute of Technology (PTIT).
Funding
This work was supported by the project B2026-MHN01.01 and PTIT.
Author information
Contributions
T.M. Hoang developed the main concept. T.M. Hoang and A.T. Pham conceived the experiments and revised the manuscript. V.H. Bui, V.S. Nguyen, and D.T. Doan conducted the experiments, and H.A. Dang analyzed the results. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hoang, TM., Bui, VH., Nguyen, VS. et al. A comprehensive evaluation of lightweight deep learning models for tomato disease classification on edge computing environments. Sci Rep (2026). https://doi.org/10.1038/s41598-026-42439-6
DOI: https://doi.org/10.1038/s41598-026-42439-6