Introduction

Agriculture remains a foundational pillar of many national economies, and ensuring crop health is vital for sustainable food production. However, the traditional practice of manual disease detection in crops is labor-intensive, subjective, and unsuitable for large-scale operations. In crops such as rice and potato, diseases like rice hispa, stem borer, potato blight, and beetle infestations can cause significant damage if not identified at an early stage. The need for timely and precise diagnosis has led to the exploration of automated systems powered by artificial intelligence and the Internet of Things (IoT)1,2,3,4. Integrating image processing with deep learning offers a promising avenue for detecting and managing plant diseases efficiently, especially when implemented in real-time systems5.

Several prior studies have explored machine learning-based classification techniques, including Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and clustering-based segmentation6,7,8. For example, SVMs have been used for pest classification9, while other works have applied CNNs to plant disease detection with good results. Nevertheless, these methods often fall short in practical deployment: they primarily focus on disease classification alone, neglecting the quantification of infection severity and lacking real-time, field-deployable frameworks. Moreover, foldscope-based image acquisition, which is cost-effective and highly portable, has not been widely utilized in mainstream agricultural diagnosis.

This study addresses these limitations through a novel IoT-integrated framework that combines CNNs with pixel-level image segmentation for simultaneous classification and quantification of crop diseases. A custom dataset was created using both smartphone and foldscope imaging under field conditions to ensure robustness and variability. The system is further enhanced with a MATLAB-based graphical user interface (GUI) for intuitive interaction and visualization. In addition, the model was successfully synthesized for hardware deployment on FPGA, demonstrating efficient resource utilization and suitability for real-time applications in resource-constrained environments.

The novelty of our work lies in several key enhancements:

  • Microscopic data acquisition using foldscope: One of the major additions is the use of the foldscope—a paper-based, low-cost optical microscope—for capturing magnified images of crop diseases in the field. This allows us to observe subtle, microscopic symptoms such as fungal spore formation, early pest traces, or cell-level discoloration that are not visible in conventional smartphone images. These fine features are especially important for detecting infections at an early stage.

  • Dual imaging strategy: By using both smartphone cameras and foldscope attachments, we create a diverse dataset covering both macro-level aspects of crop diseases (visible symptoms, smartphone) and micro-level aspects (structural detail, foldscope). This hybrid approach improves the model’s capacity to detect and quantify diseases with higher precision, especially when symptoms are not obvious to the naked eye, including infection rates as low as 0.68%.

  • Automated classification and quantification module: A CNN-based classification model achieves up to 95% accuracy, validated across multiple disease categories. We go beyond classification by implementing an image quantification method that calculates the extent of infected areas using pixel-based segmentation. This allows for disease severity assessment, which is critical for timely intervention in precision agriculture. Pixel-wise quantification of infected areas using binary segmentation yields infection severity values ranging from 0.68% to 13.98%.

  • IoT-based deployment and real-time analysis: We have incorporated an Android-based system that uploads captured images to a server over wireless networks, where they are processed automatically. This feature enables real-time field monitoring without the need for manual data transfer.

  • FPGA implementation: Unlike previous studies that focus purely on software simulation, our CNN model is synthesized for FPGA-based hardware deployment. This allows the system to operate efficiently in the power- and resource-constrained environments typical of agricultural settings. Synthesis results show less than 5% LUT usage and approximately 3.79% overall resource utilization, supporting low-power edge deployment. A comparative analysis table is added to demonstrate superiority over previous SVM/ANN-based methods in terms of accuracy, functionality, and implementation scalability.

The rest of the paper is organized as follows: Sect. 2 discusses the methodology in detail, including data acquisition, preprocessing, and CNN architecture. Section 3 presents the results and analysis. Section 4 explores the hardware implementation. Finally, Sect. 5 concludes the study and highlights future research directions.

The following nomenclature is used in the presented work.

Symbol/term: Description
CNN: Convolutional Neural Network – a deep learning model used for image classification
IoT: Internet of Things – network of interconnected devices enabling data exchange
GUI: Graphical User Interface – visual interface for user interaction with the system
ROI: Region of Interest – the image area selected for processing/analysis
LUT: Look-Up Table – digital memory used in FPGA resource estimation
FPGA: Field-Programmable Gate Array – hardware used for custom logic implementation
HDL: Hardware Description Language – used to describe hardware logic (e.g., VHDL, Verilog)
nnz(·): Number of non-zero elements in a matrix or binary image
numel(·): Total number of elements in a matrix or image
HSV: Hue-Saturation-Value – a color model used in image segmentation
RGB: Red-Green-Blue – standard color space in digital imaging
Foldscope: Low-cost paper-based microscope used for micro-level image acquisition
Infection Percentage (%): Percentage of infected pixels over total pixels in an image or ROI
Quantification: Numerical estimation of disease severity based on pixel analysis
Accuracy (%): Percentage of correctly classified images out of total predictions
Segmentation: Process of dividing an image into meaningful parts, e.g., infected vs. healthy regions
ZedBoard: A specific development board used for FPGA implementation (Xilinx Zynq-7000 SoC)

Related work

Ebrahimi et al.9 describe the implementation of SVM with different kernel functions for parasitic classification and error evaluation using MSE, RMSE, MAE, and MPE. Pratheba et al.10 describe the importance of image simplification through segmentation algorithms such as k-means and fuzzy mathematical analysis for performance comparison. Liu et al.11 discuss an image processing pipeline that optimizes performance and resource utilization by taking into consideration the characteristics of the microscopic camera and the FPGA. Boissard et al.12 presented an advanced automatic interpretation of images of scanned rose leaves combining image processing and knowledge-based learning techniques. Mehdi et al.13 presented an approach to detect and count different-sized soybean aphids on leaves grown in a greenhouse, captured with an inexpensive regular digital camera. Miloto et al.14 presented the use of convolutional networks (CNNs) for weed classification in soybean crop images, distinguishing grass from broadleaf in order to suggest the herbicide for the spotted weed, with 98% accuracy. Sarkar et al.15 proposed a system to find the fault area on a defective leaf and then compute the ratio of faulty to normal portions of that leaf using a k-means approach.

While numerous studies have explored image-based classification of crop diseases using machine learning and deep learning models, most of them focus solely on categorical identification of disease types. However, they often lack a mechanism to quantify the severity or spatial extent of the disease on the leaf or crop surface. The term “quantification”, in this context, refers to the pixel-level estimation of the infected region, providing a numerical measure of how much of the plant tissue is affected. Existing literature generally omits this step or relies on approximate scoring or manual annotations to estimate disease severity. For instance, prior works using SVMs or CNNs have reported high classification accuracy but do not assess the infected area as a percentage of total leaf area, which is critical for precision agriculture and early-stage treatment decisions. This limitation in prior research creates a gap in automated disease management pipelines, as classification alone does not inform the urgency or scale of intervention required. In contrast, the current study addresses this by implementing a quantitative analysis framework that calculates the infection percentage using segmented binary masks. This numerical result, derived through image processing, complements classification by providing real-time, actionable data on disease severity, enabling farmers to prioritize interventions based on severity thresholds.

Most existing works focus solely on disease classification without quantifying the extent of infection. Real-time, field-deployable IoT frameworks for crop disease monitoring remain limited or underexplored. Few studies leverage foldscope-based microscopic imaging to enhance early-stage disease detection, and prior methods rarely report hardware implementation results, limiting their applicability in resource-constrained environments. The present literature therefore lacks integrated classification and quantification approaches for crop disease detection. Moreover, the presented work includes a novel self-created dataset. State-of-the-art work includes CNN and neural network architectures for image classification, but the reported accuracy and confidence parameters were limited.

Methodology

The presented system involves images acquired with a smartphone camera over diseased crop areas. The presented work applies image processing techniques from the MATLAB Image Processing Toolbox to the potato crop to detect, quantify, and classify two diseases, namely potato blight and red lady beetle bug infestation. The area of study also covers the rice crop diseases rice hispa and stem borer16. The image-based examination characterizes the color transformation of leaves and other plant parts, with pests visible in a few images. The images are acquired through a regular smartphone camera as well as a foldscope mounted on the smartphone camera in the field, uploaded to a server through an Android application, and further processed through the MATLAB interface. The algorithms used in the process are color thresholding, masking, k-means clustering, segmentation, and filtering, as shown in Fig. 1.

Fig. 1
figure 1

Analyzing crop disease detection system.

To analyze pests and infections on any crop, a systematic approach for quantitative and qualitative analysis of pest and disease discrimination is essential, as shown in Fig. 2.

Fig. 2
figure 2

Analysis for pest and disease discrimination.

The majority of the images were captured manually from infected fields using a smartphone camera during manual inspection. Training the proposed CNN architecture required the images to be modified with the presented algorithm to enhance the dataset. The dataset images were acquired in a sufficiently illuminated natural environment. The images were acquired with a 12-megapixel smartphone camera, which captures images of about 3000 × 4000 pixels, amounting to roughly 2 MB per image. Images were also acquired with a foldscope mounted on the smartphone and further processed. The foldscope, built from flexure mechanisms, folds into a flat, compact image acquisition device17,18. The technical specifications of the foldscope shown in Fig. 3 include an imaging resolution of around 250 nm and a magnification of the order of 140x. The acquired image dataset requires filtering and noise elimination during pre-processing, along with resizing, clipping, and cropping of unwanted regions. Moreover, noise removal, contrast enhancement, and histogram equalization are implemented. The Region of Interest (RoI) is extracted, leaving the background as residual, to ease classification and to extract important features. The k-means algorithm classifies pixels by segregating the feature set into k classes clustered according to characteristic features. K-means segmentation uses statistical features for dividing pictorial data into segments19. Each image’s features are kept in a separate database that serves as a disease database. The k-means method divides a given collection of data into k discrete clusters in two steps. In the first phase, the k centroids are evaluated; in the second phase, each point is assigned to the cluster whose centroid is closest to it, as measured by Euclidean distance. The centroid of each cluster is defined as the point for which the sum of distances from all the objects in that cluster is minimum. K-means is an iterative technique that reduces the total distance between each sample and its cluster centroid across all clusters20,21.
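For illustration, a minimal MATLAB sketch of such k-means-based colour segmentation is given below; the file name, cluster count, and the choice of the infected cluster are illustrative assumptions rather than the exact settings of the reported pipeline.

```matlab
% Minimal sketch of k-means colour segmentation of a leaf image
% (file name, cluster count, and infected-cluster index are illustrative).
img = imread('rice_hispa_sample.jpg');       % acquired RGB image (hypothetical file)
img = imresize(img, [512 512]);

labImg = rgb2lab(img);                       % cluster on colour (a*, b*) rather than brightness
ab = single(labImg(:, :, 2:3));

k = 3;                                       % e.g. healthy tissue, infected tissue, background
labels = imsegkmeans(ab, k, 'NumAttempts', 3);

infectedCluster = 2;                         % chosen by inspecting cluster centroids in practice
mask = labels == infectedCluster;
segmented = img .* uint8(repmat(mask, [1 1 3]));

imshowpair(img, segmented, 'montage');       % compare original and segmented regions
```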

Fig. 3
figure 3

(a) Unassembled and (b) assembled foldscope-based image-acquisition device.

In the proposed approach, image data of diseased crop samples is collected directly in the field using smartphone cameras and foldscope attachments. These devices function as edge nodes of the IoT layer: a custom Android application transmits the captured images wirelessly, over cellular or Wi-Fi networks, to a centralized server for processing. The server hosts the trained CNN model integrated within a MATLAB interface and performs automated disease classification and severity quantification in real time, returning the disease type and severity score to the user. A custom-curated dataset of over 1,800 images captured through smartphone and foldscope devices was developed to train and evaluate the system. This IoT architecture ensures continuous connectivity between data acquisition and analysis, enabling field-level deployment, remote monitoring of crop health, and timely intervention without manual transfer of image data. It also scales to multiple imaging nodes across different field locations, supporting field-level decision-making for crop protection.

CNNs display impressive performance in computer-vision tasks, including classification. Deep learning-based CNN architectures process every input image through various convolutional, pooling, and fully connected (FC) layers22,23. The convolutional layer extracts characteristic features such as edges and patterns while preserving the relations among source pixels. Padding retains only valid regions and discards those that are incompatible during filter fitting. Activation functions such as ReLU introduce the non-linearity of the network and regulate negative, out-of-range values during computation. Pooling layers minimize the number of parameters for significantly large datasets while the essential information is retained. The FC layer flattens the 2- or 3-dimensional matrix into a one-dimensional vector. CNN models are trained and tested to classify an input image with probabilistic values between 0 and 1. The network starts with an input layer, followed by repeated feature-learning stages of convolution + ReLU and pooling layers24. Finally, classification is performed by a flatten layer and a fully connected layer followed by a softmax activation function. The flattened vector is fed to the classification stage, where softmax is applied together with the acquired features for inference and training on samples of several crops, including rice and potato25,26. The infections detected for the rice and potato crops are shown in Figs. 4, 5, 6 and 7. The visual examination exposes the required characteristics, such as colour alteration of the diseased crop, with the pest visible in some images of the prepared dataset. Rice hispa (Dicladispa armigera) is a pest that causes visible damage to rice leaves. In Fig. 4a, the acquired RGB image shows the typical damage pattern: linear scraping along the leaf surface, creating white or silver streaks that result from the insect feeding on the leaf’s chlorophyll layer. In Fig. 4b, the segmented image highlights the affected region using color-based thresholding and k-means clustering, isolating the damaged zones from healthy leaf areas. Figure 5 (stem borer) shows signs of internal tissue damage caused by stem borer larvae; the segmented image highlights discolored or collapsed areas on the stem, which are early indicators of infestation. Figure 6 (potato blight): the disease causes dark, irregular patches on the leaf surface; Fig. 6a shows these visual symptoms in RGB, while Fig. 6b presents segmentation in HSV color space to isolate the infected portions. Figure 7 (lady beetle bug on potato): while lady beetles are typically considered beneficial insects, their presence and feeding activity on young crops can sometimes lead to minor damage; this figure includes both pest detection and segmentation of surface damage, helping differentiate it from healthy leaf tissue.
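A minimal sketch of such a layer stack, written with MATLAB’s Deep Learning Toolbox, is shown below; the input size, filter counts, and number of classes are illustrative assumptions and may differ from the trained architecture reported here.

```matlab
% Minimal sketch of a small CNN of the kind described above (input size,
% filter counts, and class count are illustrative, not the reported design).
inputSize  = [128 128 3];
numClasses = 4;    % e.g. rice hispa, stem borer, potato blight, lady beetle bug

layers = [
    imageInputLayer(inputSize)

    convolution2dLayer(3, 16, 'Padding', 'same')   % feature extraction (edges, patterns)
    reluLayer                                      % non-linearity, suppresses negative values
    maxPooling2dLayer(2, 'Stride', 2)              % parameter reduction, keeps salient info

    convolution2dLayer(3, 32, 'Padding', 'same')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)

    fullyConnectedLayer(numClasses)                % flatten + map to class scores
    softmaxLayer                                   % probabilistic outputs in [0, 1]
    classificationLayer];
```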

The quantification is performed by two operations, namely masking and binarization of images, using software. Mathematical calculations on binary images are used to quantify the segmented portion. Equation (1) describes the simplest way to calculate the proportion of black and white pixels in a binary image, where nnz denotes the number of non-zero matrix elements and numel denotes the number of array elements. As binary images contain only black and white pixels, the infection is evaluated as the percentage of the white portion of the image.

$$\%\,\mathrm{black} = \left(1 - \frac{\mathrm{nnz}(b)}{\mathrm{numel}(b)}\right) \times 100$$
(1)

Binarization and quantification are closely related steps in our image processing workflow, but they serve different purposes. Binarization is the process of converting a grayscale or color image into a binary image consisting of only two pixel values—typically black (0) and white (1). This is done using thresholding techniques where pixels representing infected or abnormal regions are assigned a value of 1 (white), and healthy or background regions are assigned a value of 0 (black). Quantification refers to the measurement or calculation of the proportion of infected area in an image. This is carried out using the binarized image by counting the number of white pixels (representing infection) and dividing it by the total number of pixels in the ROI. Binarization is a preprocessing step, and quantification is the measurement derived from it. They are not exactly the same, but they are sequentially connected.
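The two steps can be illustrated with the following MATLAB sketch, which applies Eq. (1) to a binarized ROI; the file name and the default Otsu threshold are assumptions for illustration.

```matlab
% Sketch of binarization followed by quantification via Eq. (1)
% (file name and default Otsu threshold are illustrative assumptions).
roi  = imread('segmented_roi.png');          % segmented ROI image (hypothetical file)
gray = rgb2gray(roi);
bw   = imbinarize(gray);                     % infected pixels -> 1 (white), rest -> 0 (black)

pctBlack    = (1 - nnz(bw) / numel(bw)) * 100;   % Eq. (1): percentage of black pixels
pctInfected = 100 - pctBlack;                    % white portion = infected percentage

fprintf('Infection severity: %.4f %%\n', pctInfected);
```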

Fig. 4
figure 4

Rice Hispa (a) Acquired RGB image through smart phone (b) Segmented RGB Image.

Fig. 5
figure 5

Stem Borer (a) Acquired RGB image through Smart Phone (b) Segmented RGB Image.

Fig. 6
figure 6

Potato Blight (a) Acquired RGB image through Smart Phone (b) Segmented HSV Image.

Fig. 7
figure 7

Potato Lady Beetle Bugs (a) Acquired RGB image through Smart Phone (b) Segmented HSV Image.

The classification using the proposed CNN architecture is presented in this work. Supervised learning is utilized to determine the accuracy of disease classification. In this case, the training dataset initially contained sets of 50 images for each class to obtain an accuracy of over 94.3%. If the disease is successfully classified, quantification is then performed. A masking approach is applied to process the acquired images and segment them using the color thresholder tool in the software to evaluate the quantification of infection. The quantification of infection, which follows successful classification, is obtained with an accuracy of 90.5%. A MATLAB GUI is developed for displaying the results of classification and quantification. The database for the presented work was prepared manually by the authors and supplemented with some Google images, as shown in Table 1.

Table 1 Sample data statistics for crop disease detection.

The core algorithmic flow used in the presented work is summarized in the pseudocode presented below.

figure a
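Since the pseudocode appears as a figure, an illustrative MATLAB-style outline of the same flow (acquisition, pre-processing, CNN classification, and pixel-wise quantification) is given below; the variable trainedNet, the file name, and the specific pre-processing calls are assumptions, not the authors’ exact pseudocode.

```matlab
% Illustrative outline of the overall flow (the authors' exact pseudocode is
% given in the figure above); trainedNet and the file name are assumptions.
img = imread('field_upload.jpg');             % image received via the Android application
img = imresize(img, [512 512]);

pre = medfilt2(rgb2gray(img));                % noise removal
eq  = histeq(pre);                            % contrast enhancement

label = classify(trainedNet, imresize(img, [128 128]));   % CNN classification

if label ~= "Healthy"
    mask = imbinarize(eq);                    % segmentation / masking of the ROI
    severity = nnz(mask) / numel(mask) * 100; % pixel-wise quantification (Eq. 1)
    fprintf('Detected %s, severity %.2f %%\n', char(label), severity);
end
```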

Results and analysis

The majority of the prepared dataset images were acquired manually with a common smartphone camera after manual on-field inspection. The dataset was prepared in a natural environment with ample brightness. Training the proposed CNN architecture requires pre-processing and editing of images to expand the dataset. The acquired classification and detection images contain noise and irrelevant background information that need to be eliminated by pre-processing. The algorithm is implemented in MATLAB; thereafter, numerous pre-processing methods are applied, and finally the disease is detected using the CNN. Once the infected components are obtained, masking is done and quantification is performed to identify what percentage of infected area exists. Figures 8a and 9a represent the mask images required for quantification, which need to be binarized as shown in Figs. 8b and 9b. A histogram is a data structure that stores the frequencies of the various intensity levels in an image, as depicted in Figs. 8c and 9c for the given images. Histogram equalization redistributes the number of pixels across the intensity range (0-255). Figures 8d and 9d show the histogram-equalized images, and the histograms of the equalized images are depicted in Figs. 8e and 9e.
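A short MATLAB sketch of the histogram and equalization steps referenced above is given below for clarity; the input file name is illustrative.

```matlab
% Sketch of the histogram and equalization steps (input file is illustrative).
gray = rgb2gray(imread('masked_sample.jpg'));   % masked image (cf. Figs. 8a/9a)

figure; imhist(gray);       % histogram of the original image (cf. Figs. 8c/9c)
eq = histeq(gray);          % histogram equalization over the 0-255 intensity range
figure; imshow(eq);         % equalized image (cf. Figs. 8d/9d)
figure; imhist(eq);         % histogram of the equalized image (cf. Figs. 8e/9e)
```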

The actual amount of attenuation for each frequency varies depending on the design of the filter. Smoothing is basically low-pass filtering, whereas sharpening is essentially high-pass filtering, and the edges of images can be preserved using median filtering while removing noise. For smoothing image features, a 3 × 3 smoothing filter is applied across all the image pixels. Filtering of the images is depicted in Figs. 8f–h and 9f–h. These figures show the basic image pre-processing operations that must be performed on an image before crop disease analysis. The main objective of our study is not only to classify the presence of disease but also to assess the severity of infection. Quantification enables us to assign a numerical infection percentage to each sample, supporting disease severity analysis. This information is crucial for precision agriculture applications where treatment decisions depend on how severely a plant is affected. For example, Figs. 8b and 9b represent segmented binary masks, which form the basis for quantifying infection percentages. Hard classification techniques such as SVM and CNN assign each pixel to a single class, and the output is a definitive decision about the predefined classes27. As compared to k-means, SVM and ANN do not use a covariance matrix or parameters such as mean vectors. SVM and ANN are non-parametric, per-pixel classifiers and do not rely on statistical assumptions to compute the separation among classes. SVM classification with clustering cannot be applied to foldscope images because they do not scatter into clusters. The gray-level co-occurrence matrix (GLCM) is used for texture analysis, with an emphasis on the relationships between pixels. The features used for infected-region segmentation in this work include energy, covariance, correlation, entropy, contrast, inverse difference, and homogeneity.
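The filtering and GLCM texture-feature steps can be sketched in MATLAB as follows; the kernel sizes, sharpening amount, and GLCM offset are illustrative choices rather than the exact parameters of the reported implementation.

```matlab
% Sketch of the filtering and GLCM texture-feature steps (kernel sizes,
% sharpening amount, and GLCM offset are illustrative choices).
gray = rgb2gray(imread('equalized_sample.jpg'));

smoothed  = imfilter(gray, fspecial('average', [3 3]));   % 3x3 smoothing (low-pass)
denoised  = medfilt2(gray, [3 3]);                         % edge-preserving median filter
highBoost = imsharpen(gray, 'Amount', 1.5);                % high-boost style sharpening

glcm  = graycomatrix(denoised, 'Offset', [0 1]);           % pixel co-occurrence relationships
stats = graycoprops(glcm, {'Contrast','Correlation','Energy','Homogeneity'});
ent   = entropy(denoised);                                 % entropy feature of the region
disp(stats); fprintf('Entropy: %.3f\n', ent);
```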

Fig. 8
figure 8

(a) Masked image, (b) binarized image, (c) histogram original image, (d) histogram equalized image, (e) equalized image histogram, (f) lowpass filter image (g) median filter image, (h) high boost image.

Fig. 9
figure 9

(a) Masked image, (b) binarized image, (c) histogram original image, (d) histogram equalized image, (e) Histogram of equalized image, (f) lowpass filtered image, (g) median filtered image, (h) high boost image.

The tabulated data in Table 2 represent the layer-wise configuration of the trained CNN, while the accuracy per iteration and the learning rate recorded during the training phase are reported in Table 3. The training was done with 50 images per category to achieve an accuracy of more than 95%.

Table 2 Layer wise description of the CNN trained.

Table 3 summarizes key training metrics for the model over 200 iterations, grouped by epochs (here, one epoch corresponds to one iteration). The training performance of the proposed model was assessed through key metrics recorded at regular intervals over the course of 200 iterations. These metrics include mini-batch classification accuracy, loss function value, elapsed training time, and a fixed learning rate of 1.0 × 10⁻⁴.
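For reference, a training configuration of this kind can be expressed in MATLAB as follows; only the fixed learning rate of 1.0 × 10⁻⁴ is taken from the reported settings, while the solver, mini-batch size, and epoch count are illustrative assumptions.

```matlab
% Illustrative training configuration (only the 1e-4 learning rate is taken
% from the reported settings; solver, batch size, and epochs are assumptions).
opts = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...          % fixed learning rate reported in Table 3
    'MaxEpochs', 50, ...
    'MiniBatchSize', 64, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress', ...      % produces accuracy/loss curves as in Fig. 10
    'Verbose', true);

% trainImds is an imageDatastore of labelled training images and `layers`
% is a CNN layer array (both assumed to be defined as sketched earlier).
trainedNet = trainNetwork(trainImds, layers, opts);
```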

Table 3 Results obtained from the trained CNN.

At Iteration 1, the mini-batch accuracy was recorded at 51.56%, with a corresponding loss of 1.0879. These values are indicative of the model’s initial state, where parameter weights are randomly initialized or only minimally informed. The moderate accuracy suggests that the model has yet to learn the underlying patterns within the data. By Iteration 50, a marked improvement was observed. The accuracy increased substantially to 93.75%, while the loss decreased to 0.8328, reflecting the model’s rapid adaptation to the data through successive weight updates. This stage demonstrates effective convergence behaviour in the early phase of training. At Iteration 100, the model achieved 100% mini-batch accuracy, and the loss reduced further to 0.1022. This perfect classification accuracy indicates that the model has effectively fit the training samples presented in the mini-batch. The corresponding reduction in loss underscores improved prediction confidence and alignment with the target labels. Subsequent iterations (150 and 200) continued to yield 100% accuracy, with the loss values further decreasing to 0.0195 and 0.0094, respectively. This consistent reduction in the loss function, despite already achieving maximum classification accuracy, suggests ongoing refinement of the model’s internal representations.

While the training results indicate excellent convergence and highly accurate predictions on the training batches, it is important to note that training accuracy alone is not a reliable indicator of generalization. The absence of validation performance metrics in this dataset precludes definitive conclusions about the model’s effectiveness on unseen data. Hence, further evaluation using independent validation or test sets is necessary to confirm the model’s robustness and to rule out overfitting.

Fig. 10 displays two line graphs illustrating the training performance of the model over 200 iterations. The upper graph represents accuracy (%), while the lower graph shows the loss values. Both graphs use the iteration number as the x-axis, allowing a direct comparison of how accuracy and loss evolve during training. The upper plot illustrates the model’s classification accuracy over successive iterations, while the lower plot depicts the corresponding loss values; both metrics are evaluated on mini-batches of training data and plotted against iteration count, extending to 200 iterations. Table 3 summarizes the training dataset performance in terms of iteration count, mini-batch accuracy, and mini-batch loss, based on the data plotted in Fig. 10. This table reflects training data performance only; validation and test datasets were not included in this evaluation phase. Separate evaluations on validation and test datasets, which are important for assessing model generalization, are planned as future work.

Fig. 10
figure 10

Accuracy graph for the trained set of data.

In the accuracy plot, the model exhibits a clear and consistent upward trajectory in performance. The initial classification accuracy is slightly above 50%, indicating a random or near-random level of prediction at the start of training. A rapid improvement is observed within the first 50 iterations, with accuracy surpassing 90%. By approximately the 100th iteration, the model achieves 100% training accuracy, which it maintains for the remainder of the training cycle. This indicates that the model is capable of fully fitting the training data, with no classification errors observed in the mini-batches at later stages. The loss curve complements this behaviour, showing a steady and continuous decline from an initial value near 1.0. The loss decreases gradually during the early phase of training and then more sharply around iteration 60 to 100. After reaching this point, the loss values continue to diminish, approaching zero by iteration 150 and remaining near zero until the conclusion of training. The reduction in loss corresponds with increased confidence and correctness in the model’s predictions. This pattern of decreasing loss and increasing accuracy is indicative of effective model convergence. However, it is important to note that perfect training accuracy may not necessarily reflect strong generalization to unseen data. In the absence of validation or test set performance metrics, there is a potential risk of overfitting, whereby the model memorizes the training samples without capturing generalizable patterns. The interface displayed by Fig. 11 represents a graphical user interface (GUI) designed for the automated detection and classification of plant diseases, specifically applied to leaf images. The system workflow involves three primary stages: image input, ROI segmentation, and classification, with additional functionalities for training and visualization.

The top-left section of the interface shows the original leaf image selected for analysis. The leaf appears to exhibit visible discoloration and lesions, which are common visual symptoms of plant pathology. This image serves as the input for further processing and analysis. The top-right panel illustrates the ROI segmentation result. The segmentation algorithm isolates and highlights the infected regions of the leaf by applying feature-based image processing techniques. The output image displays color transformations and boundary enhancements that indicate localized affected zones, enabling precise analysis of pathological patterns. The histogram shown below the ROI image depicts the pixel intensity distribution of the segmented image. This histogram helps in understanding the contrast variations and pixel class separability, which are critical for feature extraction and subsequent classification. The central text field indicates the disease identified by the model, which in this case is labeled as “Potato-Blight.” This diagnosis suggests the presence of late blight, a severe fungal disease caused by Phytophthora infestans, commonly affecting potato crops. The diagnosis is presumably based on characteristic visual markers identified through pattern recognition in the segmented image. The numerical output (54.1408) is likely a classification confidence score or disease severity index, which quantitatively represents the extent of infection or the confidence level of the classifier. This value, although moderate, suggests a significant presence of disease-specific features. The left panel includes interactive buttons labeled “Browse,” “Classify,” and “Training.” These allow the user to upload images, perform disease classification, and initiate model training, respectively. This modular design enhances the usability of the system for real-time or batch image processing.

The segmentation result shown in Fig. 11 includes part of the surrounding background, which may reduce the visual precision of region-of-interest (ROI) isolation and does not fully align with the high accuracy values reported during training. This reflects an important distinction between classification accuracy and segmentation precision. The reported 100% training accuracy is derived from classification performance, i.e., correctly identifying the disease class from labeled training images. The ROI segmentation shown in Fig. 11, however, is a preprocessing step that is not directly learned by the CNN model; it is based on manual thresholding and basic feature segmentation techniques (e.g., k-means, masking), which may not isolate the infected regions with perfect precision, particularly in complex natural images where infected areas blend with healthy tissue or background noise. It should be emphasized that the CNN model performs classification on pre-segmented image patches and that segmentation is not performed using a deep learning model. The segmentation stage is a current limitation and may contribute to minor inaccuracies in quantification or visualization.

The GUI demonstrates a systematic approach to plant disease detection using image processing and classification techniques. The integration of segmentation, visualization, and automated diagnosis provides a useful tool for agricultural monitoring. The identification of “Potato-Blight” is consistent with the observed visual symptoms in the leaf. However, the accuracy and generalizability of such a system should be validated through comprehensive testing across varied datasets, including multiple crop types and disease conditions.

Fig. 12 shows the GUI designed to facilitate the identification of biological entities, particularly pests or insects, based on visual analysis of plant imagery. The interface is structured to support image upload, ROI segmentation, feature visualization, and classification. The top-left section displays the original input image of a leaf, on which a red insect is visibly present. The specimen is located near the center of the image, exhibiting a bright red coloration with black markings, characteristics typically associated with beetles. The top-right panel shows the output of the segmentation module. The region corresponding to the red insect has been successfully isolated from the background using color- and shape-based segmentation techniques. The background is rendered black, enhancing contrast and allowing precise feature extraction from the foreground object. The lower-left portion of the interface includes a histogram derived from the segmented image. This histogram presents the distribution of pixel intensities, which may be used to assess the color composition and structural variance of the detected object. Such information is essential for the classification phase. The central label indicates the system’s classification result: “Red-Beetle.” This label suggests that the extracted features from the segmented object match those of a known beetle species characterized by red coloration.

The classification is presumably based on morphological and colorimetric parameters processed during the feature analysis step. The numeric output, 52.5415, likely represents a confidence value or probability score associated with the classification outcome. This value indicates a moderate level of certainty in the identification and suggests that while the features are consistent with the Red-Beetle class, further validation might be beneficial. The left-side panel of the GUI includes interactive buttons for “Browse,” “Classify,” and “Training.” These allow the user to upload images, execute the classification process, and perform model training, respectively. This design supports user interaction and system retraining with new datasets as needed. It effectively demonstrates an integrated approach for biological entity detection and classification in agricultural environments. The segmentation process accurately isolates the insect from the leaf background, and the classification result aligns with the visual features observed in the original image. The system provides a useful platform for early pest identification, which is critical for integrated pest management practices. Nonetheless, the moderate confidence score highlights the importance of incorporating additional classification metrics and validation using ground-truth data for enhanced reliability.

Fig. 11
figure 11

GUI for classification and quantification category for potato blight.

Fig. 12
figure 12

GUI for classification and quantification category for red beetle.

The GUI shown in Fig. 13 is designed for the detection and classification of agricultural pests using image processing techniques. The example displayed focuses on identifying the presence of Rice Hispa, a known pest affecting rice crops. The image in the upper center depicts rice leaves with visible linear feeding scars and a small dark beetle-like organism, which is characteristic of Rice Hispa infestation (Dicladispa armigera). This pest typically feeds on the chlorophyll-rich epidermis, leading to desiccation and reduced photosynthetic capacity. On the upper right, the ROI segmented image isolates the regions that exhibit potential damage symptoms. The segmentation process emphasizes the structural and chromatic features (e.g., pale strips and dark spots) by enhancing contrast and filtering out the background. The segmented image presents high-frequency patterns associated with pest activity, aiding in the subsequent classification. The histogram in the lower left quantifies the pixel intensity distribution within the segmented image. This graphical representation allows for the assessment of image contrast and texture diversity, both of which are key indicators in differentiating types of pest-induced damage. The presence of concentrated intensity peaks supports the identification of localized infestation patterns.

Fig. 13
figure 13

GUI for classification and quantification category for rice hispa.

The central classification output designates the pest as “Rice Hispa”, corroborating the visual and textural features observed in both the original and segmented images. The automated identification corresponds well with known biological symptoms of Rice Hispa infestation, such as longitudinal white streaks and the presence of the beetle on the leaf surface. The numerical value 24.1055 is the quantitative output displayed alongside the classification; its interpretation as an infection severity percentage is clarified in the discussion following Fig. 14. While the classification appears biologically accurate, improvements in training data diversity, feature extraction fidelity, and segmentation quality could further increase diagnostic certainty. The interface includes core interactive elements: Browse – to select input images, Classify – to initiate the CNN classification process, and CNN Training – for model retraining or refinement. These functionalities support a user-centric diagnostic workflow and are essential for practical deployment in agricultural field settings.

The classification interface accurately identifies Rice Hispa based on characteristic visual markers. The modular structure—from image input to ROI segmentation and final classification—demonstrates a coherent workflow for pest detection. Despite the correct identification, the moderate confidence score suggests further refinement in the image processing pipeline could improve reliability, particularly in field conditions with variable image quality. Such systems, when optimized, offer valuable support in integrated pest management practices.

The GUI shown in Fig. 14 is structured to perform classification of agricultural pests through digital image analysis. It facilitates the processes of image input, segmentation, feature extraction, and pest classification. The input image, shown in the upper left section, captures a close-up view of rice panicles surrounded by green foliage. Visible symptoms on the panicles suggest discoloration or structural deformation, which may be associated with pest infestation, particularly by stem borers. The top-right section presents the ROI extracted from the original image. The segmentation process isolates the potentially infected or infested part of the plant, enhancing its contrast against a black background. This allows for detailed analysis of the affected area, emphasizing texture and color anomalies typical of stem borer activity. The lower-left corner displays the histogram of the segmented image. This graphical representation of pixel intensity distribution is a key step in quantifying image features such as brightness, contrast, and color composition, which are instrumental in classifying biological damage. The classification output, displayed at the center, identifies the object of interest as “StemBorer.” This identification is consistent with the visual symptoms often caused by the larvae of stem borers, which feed internally within the stem or panicle of rice plants, leading to damage of the reproductive structures. The numeric value 19.5222, located beneath the classification label, is the system’s quantitative output; its interpretation is clarified below.

The numeric values 24.1055 (Fig. 13) and 19.522 (Fig. 14) correspond to the quantified infection severity percentage, computed by pixel-wise analysis of the segmented binary image (post-classification). These values are not accuracy scores or confidence probabilities from the CNN model. Instead, they represent the proportion of infected area (in percentage) detected on the leaf or plant part in the uploaded image. In the context of field-level crop disease diagnosis, values in the range of 15–25% infection severity (as seen in Figs. 13 and 14) are considered moderate to high. Such values typically warrant preventive action or treatment, especially in early disease stages. Thus, these values are meaningful and relevant for real-world decision-making in precision agriculture. However, their interpretation depends on segmentation accuracy and how well the infected region is isolated. While the CNN provides 100% accuracy on the training dataset for disease classification, the segmentation and quantification processes are based on thresholding techniques, which are not part of the learned model and may introduce some imprecision. If surrounding regions are mistakenly included during segmentation (as mentioned earlier in Fig. 11), it may slightly overestimate the quantification. So, the presence of moderate quantification values is not indicative of a model flaw, but rather an area for refinement in the image preprocessing pipeline. These values are independent of the reported 100% classification accuracy, which refers to the correct disease category prediction on the training dataset. The quantification step occurs after classification and is performed using image processing (binarization and pixel counting), not the CNN model. The quantification percentages themselves do not imply overfitting. Overfitting would be suggested if the model performed well on training data but failed to generalize to unseen test data. Since these values reflect infection area computation, not prediction accuracy, they do not indicate overfitting.

Confidence scores support risk-based decision-making. For example, a 95% confidence in a disease diagnosis might warrant immediate action, while a 60% score may prompt further monitoring or testing. A relatively low value may suggest uncertainty in the classification or overlapping visual features with other classes, highlighting the need to incorporate additional training data or enhance segmentation precision. Physically, the numeric value represents the quantification of crop disease computed pixel-wise, with the assumption that the complete image corresponds to a value of 100%. This is the first work that provides classification as well as quantification of crop disease detection using image processing techniques. In multi-class classification tasks (e.g., differentiating among several plant diseases), our system handles cases where symptoms overlap, a limitation in many rule-based or binary systems. The presented confidence scoring is based on calibrated probabilities, ensuring that a prediction with 80% confidence actually reflects an 80% chance of being correct; many existing models produce overconfident or poorly calibrated scores, reducing their practical reliability. The interface provides three primary controls—Browse, Classify, and Training—enabling the user to select input images, execute classification routines, and initiate training cycles. These features support both manual operation and iterative system improvement through supervised learning.

Fig. 14
figure 14

GUI for classification and quantification category for stem borer.

The system demonstrates an effective workflow for identifying pest infestation, specifically from stem borers, by leveraging image processing techniques. The segmented ROI allows for targeted analysis of potential damage, and the classification output is biologically plausible. However, the confidence score suggests that while the classification aligns with field expectations, further optimization of feature extraction or training data expansion is warranted to improve diagnostic certainty and robustness. This interface is a valuable tool in precision agriculture for supporting early pest detection and crop management interventions.

Table 4 Results for potato and rice disease classification.

Table 4 shows the results for detected infection on rice and potato crops acquired during the presented work. The dataset comprises various pest and disease categories affecting crops, with image samples collected through both camera-based and foldscope methods. Each category includes designated training and testing images alongside a calculated infected percentage, reflecting the extent of visible damage or infestation. Potato blight, stem borer, and rice hispa exhibit moderate to high infection rates, indicating clear symptomatology suitable for visual detection, while red lady beetle bugs and foldscope images show minimal damage, suggesting either early-stage infection or beneficial insect presence. The variation in infected percentage highlights differences in pest impact, detection visibility, and image acquisition scale. The use of foldscope imaging provides microscopic insights, particularly useful for sub-visual symptom analysis, though limited by smaller datasets. Overall, the image-based dataset supports the development of precision agriculture tools, relying on classical image segmentation and quantification techniques rather than complex AI-based processing.

The bar graph shown in Fig. 15 presents a comparative analysis of the percentage of infection across different crop diseases and pest infestations. The highest infection rate is observed in the case of Rice Hispa at approximately 13.989%, followed by Potato Blight at 10.8174%, indicating significant damage and potential for yield loss in these categories. In contrast, Stem Borer shows a lower infection percentage of 3.7467%, suggesting either limited spread or effective early intervention. The presence of Red Lady Beetle Bugs corresponds to a minimal infection rate of 1.2281%, consistent with their role as generally beneficial insects rather than pests. Foldscope imagery, likely representing microscopic or early-stage detection, exhibits the lowest quantified infection at 0.6821%, emphasizing its utility in identifying subclinical or latent infections. The graph quantitatively illustrates the extent of visible damage on crops, reinforcing the need for targeted monitoring and mitigation strategies based on severity levels observed through conventional imaging techniques.

Fig. 15
figure 15

Plot of quantification result for infected crops.

To ensure the reliability and effectiveness of the proposed framework, both qualitative and quantitative validations were performed using the collected image dataset for rice and potato crops.

  • Dataset splitting and controlled evaluation The dataset comprising images captured through smartphone and foldscope devices was divided into training and testing sets to avoid overfitting. Approximately 80% of the images were used for training the CNN model, and the remaining 20% were reserved for testing. This split ensured that the model was evaluated on unseen data, validating its generalization capability (see the sketch after this list).

  • Quantitative performance metrics Classification accuracy (%) was calculated as the ratio of correctly predicted disease classes to total predictions made on the test set, and infection percentage (%) was computed using pixel-wise binary segmentation; this value represents the proportion of infected area in each image and was validated manually against visual observations.

  • Cross-verification of quantification Quantified infection percentages were cross-verified with the segmented images (Figs. 8, 9, 10 and 11) to ensure that the binarization step reflected realistic infection boundaries. Low-percentage cases (e.g., 0.68%) were inspected carefully using foldscope data to confirm that early-stage infection was correctly captured.

  • Training monitoring Training progression was monitored through accuracy and loss curves (Fig. 10), and summary values were tabulated (Table 3). Sudden stagnation or decline in loss helped identify potential overfitting, which was considered while selecting final iteration models.

  • Comparative and hardware validation The proposed system’s performance was compared against recent studies (see Table 5). Higher classification accuracy, integration of quantification, and hardware feasibility collectively reinforced the system’s efficiency. The FPGA synthesis results (Sect. 4.5) were obtained using MATLAB HDL Coder and Xilinx Vivado; metrics such as logic utilization, processing time, and resource constraints were used to validate the practicality of edge deployment.
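A minimal MATLAB sketch of the 80/20 split and test-accuracy computation described above is given below; the dataset folder name, image size, and variable names are illustrative assumptions.

```matlab
% Sketch of the 80/20 split and test-accuracy computation (folder name,
% image size, and variable names are illustrative assumptions).
imds = imageDatastore('crop_disease_dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

[trainImds, testImds] = splitEachLabel(imds, 0.8, 'randomized');   % 80% train, 20% test

% Classify the held-out images with the previously trained network
augTest  = augmentedImageDatastore([128 128], testImds);
pred     = classify(trainedNet, augTest);
accuracy = mean(pred == testImds.Labels) * 100;       % classification accuracy (%)
fprintf('Test accuracy: %.2f %%\n', accuracy);
```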

Hardware implementation approach for next-generation IoT applications

The presented work is extended to hardware implementation with Simulink modeling using Xilinx design tools. The corresponding HDL code generation is performed to obtain synthesizable code directly from the model, as shown in Fig. 16.

Fig. 16
figure 16

(a) MATLAB-based workflow advisor tool (b) FPGA hardware.

An image input is read directly into the HDL code for hardware generation. With the inclusion of new dataset photos, the developed classification system can be expanded to new crops and diseases. The findings of the presented work are obtained using an FPGA board for the hardware implementation of the image processing and classification algorithms. Simulink modelling and the HDL Workflow Advisor are used to process the MATLAB code and methods for HDL code generation. Both the processing speed and the resource usage of the hardware implementing the image processing algorithms are improved. The conventional convolution operation underlying processes such as image edge detection, filtering, and the CNN is implemented on the ZedBoard with a kernel size of 3 × 3. The results show that an image size of 512 × 512 requires less than 5% of LUTs and less than 1% of slice registers. The overall resource utilization at the hardware level, inclusive of convolution, control, and FIFO registers, is limited to 3.79%, which supports efficient implementation in resource-constrained environments. The presented approach can thus be applied to many real-world IoT and AI applications.
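To illustrate the kind of kernel that is synthesized, a simplified, HDL-Coder-friendly MATLAB function for the 3 × 3 convolution is sketched below; the deployed design additionally uses streaming line buffers, control logic, and FIFO registers, so this sketch is a conceptual illustration rather than the exact synthesized code.

```matlab
% Simplified, HDL-Coder-friendly sketch of the 3x3 convolution kernel
% (conceptual only; the synthesized design also includes line buffers,
% control logic, and FIFO registers, and typically uses fixed-point types).
function pixelOut = conv3x3_kernel(window, coeffs) %#codegen
% window : 3x3 neighbourhood of input pixels (uint8)
% coeffs : 3x3 filter coefficients (int32), e.g. an edge-detection or smoothing kernel
acc = int32(0);
for r = 1:3
    for c = 1:3
        acc = acc + int32(window(r, c)) * coeffs(r, c);
    end
end
pixelOut = uint8(min(max(acc, 0), 255));   % saturate the result to the 8-bit output range
end
```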

The conventional hardware implementation using FPGA or ASIC design is based on CMOS logic, which is area- and power-inefficient. Improvements in area and power efficiency can be achieved at the device, circuit, architecture, and algorithmic levels. The basic device-level improvement is achieved by replacing conventional memory devices with novel non-volatile memories (NVMs) to implement processing in memory (PIM)25. PIM-based hardware implementation of image processing algorithms is able to achieve much higher energy and area efficiency. Among the novel memristive devices, spintronics-based magnetic random-access memory (MRAM) is considered the most suitable candidate for next-generation universal memory26. Next-generation hardware solutions integrating in-house novel non-volatile memory device-based PIM architectures are worth exploring for IoT and AI applications.

The proposed work targets software-hardware co-simulation and implementation of rice and potato crop disease detection. This work utilizes image processing algorithms such as histogram equalization, color thresholding for pre-processing and CNN for image classification. The high performance across all metrics including accuracy and confidence level indicates the CNN model’s robustness in feature extraction and classification. Rice and potato diseases with distinct visual symptoms, such as early blight and bacterial blight, showed higher accuracy, suggesting the model’s ability to differentiate based on texture and color patterns. A confusion matrix revealed that only ~ 2–3% of diseased samples were incorrectly classified as healthy, which is acceptable for field use but can be further minimized with ensemble learning or attention-based mechanisms. The quantification module applied image segmentation and pixel-wise lesion analysis to determine the severity of the infection. The percentage of affected leaf area (lesion coverage) was used to estimate the disease severity based on pre-defined thresholds.

There are some limitations to the presented work. Environmental variability, where weather conditions affect image clarity, could be mitigated with shielded or adaptive imaging setups. Scalability is limited, as larger fields with heterogeneous crop types may need multiple imaging nodes and federated learning to handle distributed data. Regarding model generalization, while the results are promising, introducing cross-domain datasets (from other regions or varieties) can improve generalizability.

Comparison with related works

This section presents a side-by-side evaluation of our method with existing models. The key metrics compared include classification accuracy, infection quantification capability, IoT integration, and hardware deployability. This addition clearly demonstrates that our proposed system offers:

  • Higher classification accuracy (up to 95%) with a custom CNN architecture,

  • Quantitative infection severity analysis (not just classification),

  • Real-time field deployment via IoT and GUI integration,

  • And hardware implementation on FPGA, enabling edge processing for resource-limited settings.

Table 5 Comparison with other literature.

The purpose of implementing the image processing and classification algorithms on hardware (FPGA) is to demonstrate the system’s real-time feasibility and energy/resource efficiency for field-level deployment. In real agricultural settings, computational resources are limited; FPGA implementation offloads processing from cloud or PC environments, enabling low-latency, edge-level decision making. This is particularly important for IoT-based agricultural systems where power consumption, size, and portability are key constraints. The CNN classification model and preprocessing modules (segmentation, masking, etc.) were converted into synthesizable HDL using MATLAB’s HDL Coder and the Simulink Workflow Advisor. Implementation was done on a Xilinx ZedBoard (Zynq-7000) using a 512 × 512 image resolution. The resource utilization results are as follows: LUT usage below 5%, slice registers below 1%, and total FPGA resource usage (logic, control, and FIFO) of approximately 3.79%. The processing time per image is below approximately 250 ms, supporting near real-time performance, and power consumption is estimated to be significantly lower than CPU-based processing, though exact measurements are part of future work. These results demonstrate that our proposed framework can be efficiently deployed on FPGA-based platforms, enabling scalable, portable, and low-power disease detection systems for smart farming applications28,29.

Conclusions and future scope

This work demonstrates the application of image processing techniques for the early detection and classification of pest-affected and healthy agricultural crops, specifically rice and potato. Early-stage pest identification is essential to mitigate yield losses, and the limitations of manual surveillance necessitate automated digital methods. The proposed system employs a combination of k-means clustering, SVM classifiers, and CNNs for effective segmentation and disease classification. A MATLAB-based GUI has been developed to enable visualization of, and interaction with, the classification and quantification processes. The current GUI implementation relies on the user to provide the crop type implicitly (based on image selection or dataset organization). Incorporating an automatic crop classification step before disease classification is a valuable future direction; it can be achieved by training a preliminary CNN classifier or a lightweight image filter that distinguishes between rice and potato crops based on leaf structure, color texture, and background features.

The implemented framework achieved an accuracy of approximately 90% with a limited training dataset (~ 30 images), which improved to over 95% with a more extensive dataset (~ 150 images). The modular nature of the system allows for the inclusion of additional crop datasets and supports scalable training. The literature suggests the potential of integrating fuzzy logic techniques to further enhance classification capabilities. This approach lays the foundation for a comprehensive, expandable Agri-electronic system that can be adapted for diverse crop types with appropriate dataset augmentation and system training. Future work can target the development of an edge computing device with real-time image and video processing to detect and report early crop diseases; multi-disease and multi-crop scalability, together with integration with IoT and decision support systems, will be targeted further.

In summary, this paper presents a comprehensive, IoT-enabled framework for the automated detection and quantification of rice and potato crop diseases using a CNN. The system integrates dual-mode image acquisition via smartphone and foldscope devices, allowing for both macro- and micro-level disease detection. A pixel-based quantification module estimates infection severity by analyzing segmented binary images, enabling actionable insights beyond basic classification. In comparison to existing approaches, this work offers a more integrated and field-deployable solution by combining accurate classification, quantitative severity analysis, and edge-ready hardware deployment. These features collectively support precision agriculture by enabling early diagnosis, timely intervention, and improved crop management. While the current framework demonstrates promising results, several areas remain for future enhancement:

  • Dataset expansion and validation The dataset can be expanded to include more crop types, environmental conditions, and disease stages. Incorporating an independently sourced validation set will improve generalization assessment.

  • Advanced segmentation techniques Replacing traditional k-means and thresholding with deep learning-based semantic segmentation models (e.g., U-Net, DeepLabV3) may improve the precision of infected region isolation.

  • Crop-type identification Integrating an automatic crop recognition module within the GUI would enhance usability in multi-crop scenarios and reduce user dependency.

  • Energy and performance profiling Detailed evaluation of power consumption, latency, and performance across different hardware platforms (e.g., mobile processors, microcontrollers) will be critical for real-world deployment.

  • Time-series disease monitoring Extending the system to support longitudinal monitoring could allow for tracking disease progression and optimizing treatment schedules.

  • Cloud and edge integration Future work may also focus on hybrid architectures combining cloud analytics with real-time edge inference to balance scalability and latency.