Introduction

The continuation of life in nature largely depends on insects. Insects pollinate plants, ensuring the continuation of plant diversity1. This, in turn, supports sustainable agriculture, which enables living beings to obtain food and maintain a high quality of life. As shown in Fig. 1, the goals of sustainable agriculture include nutrition, protecting the future, protecting quality food, and ensuring food safety. In addition, many economic, cultural, and social improvements are among the targets2.

Fig. 1

Goals of sustainable agriculture.

Ensuring and protecting plant diversity is a crucial aspect of sustainable agriculture. This is facilitated by establishing technological infrastructures, such as plant protection and early disease diagnosis systems, which utilize advanced technologies like the Internet of Things (IoT)3. While beneficial insects contribute to plant diversity, harmful insects pose a threat to it4. These destructive insects must be controlled through agricultural pest management to minimize their damage. Eliminating harmful insects through such control measures preserves plant diversity, quality, and sustainability and prevents economic losses. Figure 2 shows images of the jute plant as grown and of a product made from it after processing.

Fig. 2

Jute production and product sample images.

Given the observed effects of harmful insects, it is necessary to determine how to combat them effectively and to act promptly. Pest control is carried out with internationally accepted methods such as integrated pest management (IPM) and area-wide (AW) pest management5. Many different methods can be used against insects in the pest category, such as biological and chemical control, pesticide use, heat application, or cooling. The critical point in the control phase is choosing the correct method. An incorrectly chosen method may even increase the number of harmful insects rather than eliminate them. While some insects thrive in hot environments, others survive and reproduce by adapting to cold environments. In addition, a pesticide applied to pesticide-resistant pests can create a more comfortable breeding ground instead of destroying them. To prevent such situations, the insect species to be combated must be identified correctly. This identification is the first step and the most critical point in the fight against harmful insects. Only then can the most suitable control method be selected and applied. An incorrectly chosen control method may fail to destroy the harmful insect species and cause additional damage to the plant species.

Considering the harmful effects of insects, the importance of accurate and rapid detection becomes evident. Two different methods can be followed to detect harmful insects. The first is to obtain expert support to determine the species. This method may not be accessible in every region: a shortage of experts and the remoteness of some areas limit its general usability. The second method is recognition using artificial intelligence systems. Given the success of artificial intelligence systems, especially in image processing, it is possible to identify species from insect images.

Identifying harmful insect species, now framed as a problem that artificial intelligence can solve, is an essential step towards ensuring biodiversity and sustainable agriculture. The issues this study intends to solve step by step can be listed as follows:

  • Collection of images of harmful insects

  • Separation into categories by pre-processing

  • Creating the appropriate artificial intelligence model

  • Extracting features

  • Implementation of optimization

  • Classification

  • Deciding on the type of pest and its performance.

Artificial intelligence-based methods are frequently used to classify plant diseases. Classifications of diseases on tomato6, tea7, rice8, and apricot9 leaves are just a few examples. However, more studies are needed on detecting harmful insects in plants. Therefore, this study is original in terms of both its subject and the methods used. Artificial intelligence offers significant potential in data analysis and pattern recognition, especially for solving complex problems like detecting harmful insects. Such methods have become essential tools for increasing productivity in agriculture by enabling accurate diagnosis and early warning systems.

Related work

When research on sustainable agriculture is examined, it is revealed that effects such as infertile soils, environmental pollution, quality deterioration, and decreased profitability endanger sustainable agriculture10. Just as there are studies on pest detection, there are also studies on plant diseases11. Ebrahimi et al.12 proposed an image-processing-based model for automatic pest detection. They determined the threshold value for pixel segmentation by using the histogram graph created with the Otsu method to remove the background from the image obtained after applying the gamma operator. From the new images obtained, hue (H), saturation (S), and intensity (I) components were obtained using HSI (hue, saturation, intensity) color decomposition as features. They made a classification using SVM to detect the pest.

In recent years, deep learning algorithms have been frequently used in studies to identify pest species. Trufelea et al.13 determined the interspecific similarity and intraspecific variability of four Phyllocephalidae insect species with the SSD deep learning method, which extends VGG-16 with additional convolution layers. As a performance indicator, they reached an intersection over union (IoU) of 70.2%.

Maican et al.14 used transfer learning with MobileNet-SSD-v2 to detect pests. For pest detection on corn plants, they selected five insect species (Coleoptera), four of which are harmful (Anoxia, Diabrotica, Opatrum, and Zabrus) and one of which is beneficial (Coccinella sp.). A mAP of 0.892 was achieved with MobileNet-SSD-v2-Lite.

Meena et al.15 studied the detection of plant diseases and harmful insects with ready-made deep learning models, covering 38 different diseases, nine different harmful insect species, and four different weed species. They achieved 99.06% accuracy in disease detection with DenseNet, 97.56% accuracy in weed detection by fine-tuning the Hyperparameter Search 2D layer, and 88.75% accuracy in pest detection with MobileNet.

Yang et al.16 created a dataset to detect pests in tea plants and designed a model based on the Yolov7-small deep learning model. Tested on a dataset of 782 images, the model detected the pests on the tea plant with a 93.2% accuracy rate.

Vedhamuru et al.17 proposed a Feature Pyramid Dilation Residual Convolution neural network (FDPRC) to detect pests in tomato plants. After training and testing the model, they detected pests with a 93.46% accuracy rate.

In addition to using images, some academic studies classify insects by analyzing their vocalizations. One of these is the study by Dhanaraj et al.18, who proposed a deep learning technique based on Multi-Layer Perceptron (MLP) models. They also tested models such as DenseNet, VGG16, YOLO, and ResNet50 on the same dataset, and their proposed MLP model outperformed them with a 99.78% accuracy rate.

Talukder et al. created the jute pest dataset and conducted their studies using the relevant dataset. The jute plant is essential because it is the most used plant fiber after cotton and is an important source of agricultural income in many countries19. Like other agricultural products, jute can be attacked by pests, and as a result, economic losses may occur. This paper suggests a model for detecting and classifying pests harmful to the jute plant.

Contribution and novelty

This study used a 17-class dataset to detect jute pests. For faster model development, instead of training a model from scratch, features obtained from six different architectures accepted in the literature were classified using five different classifiers. Among these architectures, DarkNet53 and DenseNet201 achieved the highest performance, and these two models served as the base for the proposed model. The features obtained from the DarkNet53 and DenseNet201 models are concatenated so that different features of the images are used together. Because various features of the same image were combined, the developed model achieved more successful results. At the same time, DarkNet53 and DenseNet201 were used only for feature extraction; these models were not trained, which allowed this phase to be completed quickly. However, the combined features may contain redundant features, and due to their high dimensionality, efficient feature selection is required to improve classification performance. For feature selection, this study uses the Mountain Gazelle Optimizer (MGO), a novel meta-heuristic optimization technique proposed by Abdollahzadeh et al.20 and inspired by the social behavior and group dynamics of mountain gazelles. Meta-heuristics can provide successful and efficient solutions when they balance the exploration and exploitation phases, which are crucial for efficiently navigating the search space and converging on optimal solutions. The four different mechanisms in the base design of MGO provide a strong balance between the exploration and exploitation phases. To verify the performance of MGO, Abdollahzadeh et al.20 conducted experiments in their original study on a total of 52 benchmark functions (seven unimodal, 16 multimodal, and 29 CEC2017 functions) and on seven real-world engineering optimization problems. They then compared the results obtained from MGO with nine widely used metaheuristic optimization algorithms, namely Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), Farmland Fertility Algorithm (FFA), Tunicate Swarm Algorithm (TSA), Gravitational Search Algorithm (GSA), Moth-Flame Optimization Algorithm (MFO), Sine Cosine Algorithm (SCA), Multi-Verse Optimizer (MVO) and Whale Optimization Algorithm (WOA), and showed that MGO gives more successful results than the others. These advantages, and the fact that MGO has not yet been applied to agricultural image classification problems, especially pest detection, are the primary motivations for choosing MGO in this study. Using MGO for feature selection eliminated unnecessary features, and an accuracy rate of 96.779% was achieved. This accuracy was reached without a data augmentation step, which also allowed the developed model to run faster. Compared with the studies examined in the literature, the proposed model achieved more successful results without data augmentation.

Organization of paper

The second part of this proposed study on classifying pests harmful to jute plants includes information about the dataset, CNN architectures, and classifiers used. In the third section, metaheuristic MGO and the developed method are discussed. The fourth part includes information about the performance values calculated after applying the model to the images. The fifth part gives the performance results obtained from applying the model to the images in detail. The final section assesses the proposed model’s scientific contribution, results, and potential for future studies.

Backgrounds

This section examines the dataset used in the paper and pre-trained models and classifiers used in the study to detect jute pests. The summary flow chart of the paper is given in Fig. 3.

Fig. 3

Rough flow diagram of the study.

Jute pest dataset, deep models and classifiers

Jute plants, like many plants, can be damaged by insects. Fighting against these insects is essential regarding the jute plant’s production of quality fiber and its economic contribution. Therefore, detecting these pests will increase the quantity and quality of the product. Using web resources, Talukder et al.19,21 brought together the insects that damage the jute plant. They completed the dataset by removing possible light, environment, perspective, and image distortions in the collected pest images. Since the number of images in the test folder was minimal in this study, the images in the train and test folders were combined. As a result of these operations, the dataset consists of a total of 6828 images. Then, 20% of this data is reserved for testing. There are 17 different pest classes in the created dataset. The images used in the study are in .jpg format and are of various sizes. Since pre-trained models automatically resized the images, original-size images were used. The species in the dataset are given in Fig. 4.

Fig. 4

Jute Pest Dataset.
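The 80/20 hold-out described above can be sketched as follows. This is a minimal illustration only: the shuffling, the seed, the rounding rule, and the function name `split_indices` are assumptions, since the text states only that 20% of the 6828 images were reserved for testing.

```python
import random


def split_indices(n_images, test_fraction=0.2, seed=0):
    """Shuffle image indices and hold out a fraction of them for testing."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # the seed is a hypothetical choice
    # Assumption: the test-set size is rounded to the nearest integer.
    n_test = round(n_images * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(6828)
print(len(train_idx), len(test_idx))
```

With 6828 images this rounding gives 1366 test images, which agrees with the test-set size reported for the confusion matrices later in the paper.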

Figure 5 visually shows the total number of images and sample pest types for all classes in the dataset.

Fig. 5

Jute Pest Dataset sample images for every class.

Six deep models accepted in the literature were used as baselines for comparing the performance of our hybrid model developed for detecting jute pests. In addition, the two models that showed the highest performance among them were used as the basis for the model we created. The models used in the study are ShuffleNet22, MobileNetV223, VGG1924, EfficientNetb025, DarkNet5326, and DenseNet20127. Five different classifiers were used to classify the features obtained from these deep models and from the developed model: Support Vector Machine (SVM)28, Fine Tree (FT)29, Linear Discriminant (LD)30, k-Nearest Neighbors (KNN)31, and Ensemble Subspace KNN (ES)32.

The mountain gazelle optimizer (MGO)

Abdollahzadeh et al.20 introduced the Mountain Gazelle Optimizer (MGO) algorithm, a novel metaheuristic optimization technique derived from mountain gazelles’ social behavior and group dynamics. The optimization process in MGO consists of four main stages: Territorial Solitary Males, Maternity Herds, Bachelor Male Herds, and Migration to Search for Food. Implementing these four primary stages concurrently executes the algorithm’s exploitation and exploration phases.

Within the bounds of the search space, MGO begins with an initial population generated at random. Each gazelle in the population is a candidate solution. Each gazelle (X(i)) has the potential to join one of the three groups mentioned in the main stages above. An adult male gazelle represents the global optimum solution. The four main stages of the algorithm are presented in detail below.

Territorial solitary males (TSM)

Male gazelles establish solitary territories when they reach adulthood. These territories are defined as in the following equations:

$$TSM={male}_{g}-\left|\left({Rndi}_{1}\times BH-{Rndi}_{2}\times X\left(t\right)\right)\times F\right|\times {C}_{r}$$
(1)
$$BH= {X}_{rnda}\times rnd1+ {M}_{pr} \times rnd2, rnda=\left\{\frac{N}{3},\dots ,N\right\}$$
(2)
$$F={N}_{1}(D)\times exp\left(2-Iter\times \left(\frac{2}{MaxIter}\right)\right)$$
(3)
$${C}_{r}=\left\{\begin{array}{c}\left(a+1\right)+rnd3,\\ a\times {N}_{2}\left(D\right),\\ rnd4(D)\\ {N}_{3}\left(D\right)\times {{N}_{4}\left(D\right)}^{2}\times \text{cos}\left(\left(rnd4\times 2\right)\times {N}_{3}\left(D\right)\right),\end{array}\right.$$
(4)

In Eq. 1, maleg is a vector that contains the position of the best global candidate solution. Rndi1 and Rndi2 are random integers with values of 1 or 2. BH is the coefficient vector for the young male herd, calculated by Eq. 2. F is calculated using Eq. 3. Cr is a coefficient vector, calculated by Eq. 4, that is used to increase the algorithm's search capability20.

In Eq. 2, Xrnda denotes a candidate solution selected at random from the population within the interval rnda. Mpr represents the average of N/3 randomly selected members of the population. Here, N is the number of gazelles in the population. Also, rnd1 and rnd2 are random numbers between 0 and 1. In Eq. 3, N1 represents a random number drawn from the standard normal distribution. MaxIter is the maximum number of iterations and Iter is the current iteration number. Exp is the exponential function. In Eq. 4, rnd3 and rnd4 are randomly selected numbers between 0 and 1. N2, N3, and N4 are random numbers drawn from the normal distribution with the dimensionality of the problem. a is calculated as in the following equation20:

$$a=-1+Iter\times \left(\frac{-1}{MaxIter}\right)$$
(5)
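Equations 1 to 5 can be sketched in code as below. This is an illustrative reading rather than the reference implementation of MGO: picking one of the four rows of Eq. 4 uniformly at random, treating rnd4 element-wise, averaging ceil(N/3) randomly sampled gazelles for Mpr, and all function names are assumptions.

```python
import math
import random


def coefficient_vector(it, max_iter, dim, rng):
    """Return one row of Eq. 4, with a computed from Eq. 5."""
    a = -1 + it * (-1 / max_iter)  # Eq. 5
    n2 = [rng.gauss(0, 1) for _ in range(dim)]
    n3 = [rng.gauss(0, 1) for _ in range(dim)]
    n4 = [rng.gauss(0, 1) for _ in range(dim)]
    rnd3 = rng.random()
    rnd4 = [rng.random() for _ in range(dim)]
    rows = [
        [(a + 1) + rnd3] * dim,
        [a * v for v in n2],
        rnd4,
        [n3[j] * n4[j] ** 2 * math.cos(rnd4[j] * 2 * n3[j]) for j in range(dim)],
    ]
    # Assumption: one of the four rows is picked uniformly at random.
    return rng.choice(rows)


def young_male_herd(population, rng):
    """Eq. 2: BH = X_rnda * rnd1 + M_pr * rnd2."""
    n, dim = len(population), len(population[0])
    third = math.ceil(n / 3)
    x_rnda = population[rng.randint(third, n) - 1]  # 1-based index in {N/3, ..., N}
    sample = rng.sample(population, third)           # assumed reading of M_pr
    m_pr = [sum(g[j] for g in sample) / third for j in range(dim)]
    rnd1, rnd2 = rng.random(), rng.random()
    return [x_rnda[j] * rnd1 + m_pr[j] * rnd2 for j in range(dim)]


def tsm(x, male_g, population, it, max_iter, rng):
    """Eq. 1: candidate position for a territorial solitary male."""
    dim = len(x)
    bh = young_male_herd(population, rng)
    scale = math.exp(2 - it * (2 / max_iter))
    f = [rng.gauss(0, 1) * scale for _ in range(dim)]  # Eq. 3
    cr = coefficient_vector(it, max_iter, dim, rng)
    rndi1, rndi2 = rng.randint(1, 2), rng.randint(1, 2)
    return [male_g[j] - abs((rndi1 * bh[j] - rndi2 * x[j]) * f[j]) * cr[j]
            for j in range(dim)]


rng = random.Random(0)
population = [[rng.random() for _ in range(4)] for _ in range(6)]
male_g = population[0]  # stand-in for the best gazelle found so far
new_position = tsm(population[1], male_g, population, it=1, max_iter=100, rng=rng)
print(new_position)
```

The scale factor in Eq. 3 shrinks from about exp(2) at the first iteration to 1 at the last, so the step taken around maleg narrows as the search proceeds.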

Maternity herds (MH)

Maternal herds play a crucial role in the life cycle of mountain gazelles as they give birth to strong male gazelles. Male gazelles may also contribute to the possession of females and the birth of gazelles. This is formulated using the following equation20:

$$MH=\left(BH+{C}_{2r}\right)+\left({Rndi}_{3}\times {male}_{g}- {Rndi}_{4} \times {X}_{rnd}\right)\times {C}_{3r}$$
(6)

In Eq. 6, BH is calculated as in Eq. 2. C2r and C3r are randomly chosen coefficient vectors calculated independently by Eq. 4. Rndi3 and Rndi4 are random integers with values of 1 or 2. Xrnd is a randomly selected gazelle from the population.
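A minimal sketch of Eq. 6 follows. It takes BH and the two coefficient vectors as inputs rather than recomputing them; the function name `maternity_herd` and the example values are hypothetical.

```python
import random


def maternity_herd(bh, male_g, x_rnd, c2r, c3r, rng):
    """Eq. 6: MH = (BH + C2r) + (Rndi3 * male_g - Rndi4 * X_rnd) * C3r."""
    rndi3, rndi4 = rng.randint(1, 2), rng.randint(1, 2)  # random integers in {1, 2}
    return [(bh[j] + c2r[j]) + (rndi3 * male_g[j] - rndi4 * x_rnd[j]) * c3r[j]
            for j in range(len(bh))]


rng = random.Random(0)
mh = maternity_herd(
    bh=[1.0, 2.0], male_g=[0.5, 0.5], x_rnd=[0.1, 0.2],
    c2r=[0.1, 0.1], c3r=[0.0, 0.0], rng=rng,
)
print(mh)
```

Setting C3r to zero, as in this toy call, removes the second term, so the result reduces to BH + C2r regardless of the random integers drawn.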

Bachelor male herds (BMH)

As male gazelles mature, they seek to establish their territories and search for female gazelles, which leads to conflicts with other male gazelles. This behavior can be expressed mathematically as follows20:

$$BMH=\left(X\left(t\right)-D\right)+\left({Rndi}_{5}\times {male}_{g}-{Rndi}_{6}\times BH\right)\times {C}_{r}$$
(7)

In Eq. 7, X(t) represents the position of the gazelle at the current iteration. Rndi5 and Rndi6 are random integers with values of 1 or 2. maleg is defined as a vector that contains the position of the best global candidate solution. BH and Cr are calculated by Eqs. 2 and 4, respectively. The following equation calculates D:

$$D=(\left|X\left(t\right)\right|+\left|{male}_{g}\right|)\times (2\times rnd5-1)$$
(8)

In Eq. 8, X(t) represents the position of the gazelle at the current iteration. maleg is defined as a vector that contains the position of the best global candidate solution. rnd5 is a random number between 0 and 1.
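Equations 7 and 8 can be sketched as below. Rndi5 is taken here to scale maleg, in the same way that Rndi1 and Rndi3 scale their terms in Eqs. 1 and 6; that reading, the scalar rnd5, and the function name are assumptions.

```python
import random


def bachelor_male_herd(x, male_g, bh, cr, rng):
    """Eqs. 7 and 8: candidate position for a bachelor male."""
    rnd5 = rng.random()
    # Eq. 8: D = (|X(t)| + |male_g|) * (2 * rnd5 - 1)
    d = [(abs(x[j]) + abs(male_g[j])) * (2 * rnd5 - 1) for j in range(len(x))]
    rndi5, rndi6 = rng.randint(1, 2), rng.randint(1, 2)  # random integers in {1, 2}
    # Eq. 7, with Rndi5 multiplying male_g (assumed reading).
    return [(x[j] - d[j]) + (rndi5 * male_g[j] - rndi6 * bh[j]) * cr[j]
            for j in range(len(x))]


rng = random.Random(0)
bmh = bachelor_male_herd(x=[1.0], male_g=[1.0], bh=[0.3], cr=[0.0], rng=rng)
print(bmh)
```

Because 2 × rnd5 − 1 lies in [−1, 1), D can push the gazelle in either direction by at most |X(t)| + |maleg| in each dimension.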

Migration to search for food (MSF)

Mountain gazelles migrate to find food sources. This is expressed by the following equation20:

$$MSF=\left(U-L\right)\times rnd6+L$$
(9)

In Eq. 9, U and L are the search space’s upper and lower bounds, respectively. rnd6 is a random number between 0 and 1.
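Equation 9 amounts to drawing a position between the bounds. The sketch below uses a single scalar rnd6 for all dimensions, following the description above; a per-dimension draw would be a different design choice. The function name and the example bounds are hypothetical.

```python
import random


def migrate(lower, upper, rng):
    """Eq. 9: MSF = (U - L) * rnd6 + L, with one scalar rnd6."""
    rnd6 = rng.random()
    return [(u - l) * rnd6 + l for l, u in zip(lower, upper)]


rng = random.Random(0)
lower = [0.0, -5.0, 10.0]
upper = [1.0, 5.0, 20.0]
msf = migrate(lower, upper, rng)
print(msf)
```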

The main stages of the algorithm, TSM, MH, BMH, and MSF, are applied to all gazelles to generate new generations. In each iteration, gazelles are sorted according to their fitness values, and gazelles with promising values are preserved in the population, while gazelles with poor values are eliminated. The algorithm’s flowchart is presented in Fig. 6.

Fig. 6

Flowchart of the MGO.
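The survivor-selection step described before the flowchart (sort by fitness, keep the promising gazelles, drop the rest) might look like the sketch below. Merging the old population with the new candidates before sorting, treating a higher fitness as better, and the function name are assumptions.

```python
def select_survivors(population, candidates, fitness, n):
    """Keep the n fittest gazelles among the old population and new candidates."""
    merged = population + candidates
    # Assumption: higher fitness is better (e.g. classification accuracy).
    merged.sort(key=fitness, reverse=True)
    return merged[:n]


population = [[0.0, 0.0], [1.0, 1.0]]
candidates = [[2.0, 2.0], [0.5, 0.5]]
survivors = select_survivors(population, candidates, fitness=sum, n=2)
print(survivors)  # [[2.0, 2.0], [1.0, 1.0]]
```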

Proposed model for automatic detection of jute pests

This study proposes a CNN and MGO-based model for detecting jute pests. Feature extraction was performed with six different models to determine the base of the proposed model. The features obtained from these six models were classified with five different classifiers. Since the highest accuracy values were obtained from the DarkNet53 and DenseNet201 models, these two models were used as the base in the study. Then, feature selection was performed with MGO, and the SVM classifier, which achieved the highest performance among the five classifiers, was used as the classifier in the proposed model. The flow chart of the developed model is presented in Fig. 7.

Fig. 7

Flowchart of proposed model.

First, the features of the images in the dataset are obtained from the “conv53” layer of DarkNet53 and the “fc1000” layer of DenseNet201. There are 6828 images in the dataset; for each image, each CNN model gives 1000 features. In other words, one 6828 × 1000 feature matrix is obtained from each of the two CNN models. These two matrices are then concatenated column-wise to obtain a single 6828 × 2000 feature matrix.
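The concatenation step can be illustrated with a small example. The matrices here are toy stand-ins with 3 images and 2 features per model; in the study each matrix has 6828 rows and 1000 columns. The function name is hypothetical.

```python
def concatenate_features(features_a, features_b):
    """Join two per-image feature matrices column-wise (row i stays image i)."""
    if len(features_a) != len(features_b):
        raise ValueError("both matrices must describe the same images")
    return [row_a + row_b for row_a, row_b in zip(features_a, features_b)]


darknet_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # toy stand-in
densenet_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy stand-in
combined = concatenate_features(darknet_features, densenet_features)
print(len(combined), len(combined[0]))  # 3 4
```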

Feature selection is then performed with MGO. The MGO population is initially created with N gazelles (candidate solutions). Each candidate solution is a randomly generated vector with values between 0 and 1 and with 2000 dimensions, one per feature obtained from the CNNs. A feature is chosen for classification if its value in the candidate solution is higher than 0.5; otherwise, it is disregarded. The SVM classifier is then used to determine each candidate solution’s fitness value in the MGO, using only the features whose corresponding dimensions in the candidate solution are larger than 0.5. The accuracy obtained at the end of the classification process is the fitness value of the candidate solution. Once the maximum number of iterations has been reached, the candidate solution with the highest fitness value is taken as the solution to the problem.
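The thresholding and fitness evaluation described above can be sketched as follows. The classifier is passed in as a callback; `toy_accuracy` is a deliberately trivial stand-in and not the SVM used in the study. Returning a fitness of 0 for an empty feature subset is also an assumption.

```python
def selected_indices(candidate, threshold=0.5):
    """Indices of the features whose value in the candidate exceeds the threshold."""
    return [j for j, value in enumerate(candidate) if value > threshold]


def fitness(candidate, features, labels, accuracy_fn):
    """Accuracy of a classifier that sees only the selected feature columns."""
    idx = selected_indices(candidate)
    if not idx:
        return 0.0  # assumption: an empty subset gets the worst fitness
    reduced = [[row[j] for j in idx] for row in features]
    return accuracy_fn(reduced, labels)


def toy_accuracy(reduced, labels):
    """Stand-in classifier: predict class 1 when the first kept feature > 0.5."""
    predictions = [1 if row[0] > 0.5 else 0 for row in reduced]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


features = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.5], [0.7, 0.3, 0.5], [0.1, 0.6, 0.5]]
labels = [1, 0, 1, 0]
print(fitness([0.8, 0.2, 0.4], features, labels, toy_accuracy))  # 1.0
print(fitness([0.1, 0.9, 0.3], features, labels, toy_accuracy))  # 0.0
```

In this toy data the first feature separates the classes and the second does not, so the candidate that keeps the first feature receives the higher fitness.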

Performance metrics

While analyzing the classification results of the developed model, performance metrics are calculated, and the obtained results are interpreted to decide whether the model is suitable. The calculated metrics reveal the model's correct prediction rates, its misclassification rate, and its deficiencies. The calculations are made for each class and the rates are evaluated separately, and a general evaluation is then made by calculating the overall rates. The performance values are calculated from the True-Positive (TP), True-Negative (TN), False-Positive (FP), and False-Negative (FN) values. These values are read from the confusion matrix, from the rows and columns corresponding to the true and predicted classes.

The formulas in Table 1 are used to calculate the basic criteria: Accuracy, Precision, Recall, and F1-Score. The calculations are first made separately for each class, and then the overall accuracy rate, Overall Precision (macro and micro mean), Overall Sensitivity (Recall, macro and micro mean), and Overall F1-Score values are calculated.

Table 1 Performance metrics.
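As an illustration of how these metrics follow from a confusion matrix, the sketch below uses the standard definitions (Precision = TP / (TP + FP), Recall = TP / (TP + FN), F1 = 2PR / (P + R)). These are assumed to match Table 1, whose contents are not reproduced in this text. The matrix is a toy 3-class example with rows as true classes and columns as predicted classes, and all names are hypothetical.

```python
def safe_div(numerator, denominator):
    """Division that returns 0.0 instead of failing on an empty class."""
    return numerator / denominator if denominator else 0.0


def class_counts(cm, k):
    """TP, TN, FP, FN for class k (rows = true class, columns = predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(row[k] for row in cm) - tp
    fn = sum(cm[k]) - tp
    return tp, total - tp - fp - fn, fp, fn


def evaluate(cm):
    counts = [class_counts(cm, k) for k in range(len(cm))]
    precision = [safe_div(tp, tp + fp) for tp, _, fp, _ in counts]
    recall = [safe_div(tp, tp + fn) for tp, _, _, fn in counts]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]
    sum_tp = sum(c[0] for c in counts)
    sum_fp = sum(c[2] for c in counts)
    sum_fn = sum(c[3] for c in counts)
    n = len(cm)
    return {
        "accuracy": safe_div(sum_tp, sum(sum(row) for row in cm)),
        "macro_precision": sum(precision) / n,   # mean of per-class values
        "micro_precision": safe_div(sum_tp, sum_tp + sum_fp),  # pooled counts
        "macro_recall": sum(recall) / n,
        "micro_recall": safe_div(sum_tp, sum_tp + sum_fn),
        "macro_f1": sum(f1) / n,
    }


toy_cm = [[5, 1, 0], [0, 4, 1], [1, 0, 3]]
results = evaluate(toy_cm)
print(results)
```

The macro mean weights every class equally, while the micro mean pools the counts over all classes, so the two diverge when class sizes are unbalanced.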

Application results

The experimental results of this study, which aimed to detect jute pests automatically, were obtained in the MATLAB environment. The same dataset and metrics were used to compare models and classifiers under the same conditions. Twenty percent of the images in the Jute Pest dataset were used for testing, and the remaining eighty percent were set aside for training. In the first stage, six distinct pre-trained models were used to create feature maps of the images in the dataset, providing baselines against which the proposed model's performance could be compared. Five distinct classifiers were used to classify the extracted features.

In the second step, the feature maps obtained from the DarkNet53 and DenseNet201 models were combined, and this combined feature map was classified with the same five classifiers. The features of these two models were combined because these two architectures gave the most successful results and were therefore used as the basis for the proposed model. The flow chart of the results obtained in this study is given in Fig. 8.

Fig. 8

Flow chart of the results obtained.

Results obtained with five different classifiers using feature extraction methods

In this section, feature extraction is performed using popular deep models frequently used in the literature. Features were extracted from the Logits layer in the MobileNetV2 architecture, node_202 in ShuffleNet, conv53 in DarkNet53, efficientnet-b0|model|head|dense|MatMul in EfficientNetb0, fc8 in VGG19, and fc1000 in DenseNet201. In each architecture, 1000 features were extracted, and these features were classified with five different classifiers. The relevant results are given in Table 2.

Table 2 Classification of features extracted with CNN architectures in classifiers (%).

When Table 2 is examined, the most successful results were obtained after combining the features obtained with DenseNet201 + DarkNet53. The most successful classifiers in this step were LD and SVM.

The comparative graph of the performance values obtained for all models is given in Fig. 9.

Fig. 9

Performance comparison graph for all models.

Figure 10 shows the confusion matrix obtained when the feature maps of DenseNet201 + DarkNet53 architectures are given to the LD classifier.

Fig. 10

Confusion matrix of DenseNet201+DarkNet53+LD.

Figure 11 shows the confusion matrix obtained when the feature maps of DenseNet201 + DarkNet53 architectures are given to the SVM classifier.

Fig. 11

Confusion matrix of DenseNet201+DarkNet53+SVM.

When Figs. 10 and 11 are examined, in Fig. 10, out of 1366 test images, 1286 were predicted correctly, while 80 were mispredicted. In Fig. 11, out of 1366 test images, 1283 were predicted correctly, while 83 were mispredicted.
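As a quick check, assuming accuracy is the fraction of test images predicted correctly, the counts above correspond to:

$$\text{Accuracy}_{LD}=\frac{1286}{1366}\approx 94.14\%,\qquad \text{Accuracy}_{SVM}=\frac{1283}{1366}\approx 93.92\%$$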

Results of the proposed model

In the method we proposed for automatically detecting jute pests, feature extraction was first performed using the DenseNet201 and DarkNet53 models. To improve the performance of the proposed model, we merged the feature maps from these two distinct architectures, which combined various aspects of the same image. Feature selection was then performed using MGO, and SVM was employed as the classifier. The initial population size of MGO was set to 10, the maximum number of iterations was set to 100, and 10 independent experiments were performed. Table 3 shows the mean accuracy, maximum accuracy, and standard deviation values obtained in the 10 independent experiments.

Table 3 Results of the proposed model.
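The summary statistics reported over independent runs can be computed as in the sketch below. The accuracy values are placeholders chosen for illustration and are not results from this study; using the sample standard deviation (rather than the population one) is also an assumption.

```python
import statistics


def summarize_runs(accuracies):
    """Mean, maximum, and sample standard deviation over independent runs."""
    return {
        "mean": statistics.mean(accuracies),
        "max": max(accuracies),
        "std": statistics.stdev(accuracies),  # assumption: sample (n - 1) form
    }


placeholder_runs = [0.5, 0.6, 0.7]  # illustrative only, not measured values
summary = summarize_runs(placeholder_runs)
print(summary)
```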

Figure 12 gives the confusion matrix of the best result obtained in 10 independent experiments.

Fig. 12

Confusion matrix of the proposed model.

When Fig. 12 is examined, 1322 of the 1366 test images were predicted correctly, while 44 were mispredicted. The performance values of the proposed model calculated separately for each class are given in Table 4. Separate calculations were made for 17 classes in the table.
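For reference, the number of correct predictions read from Fig. 12 is consistent with the reported overall accuracy, assuming accuracy is the fraction of test images predicted correctly:

$$\text{Accuracy}=\frac{1322}{1366}\approx 0.96779=96.779\%$$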

Table 4 Class-based performance metrics.

The overall performance values calculated for the proposed model are given in Table 5.

Table 5 General performance values of the proposed model.

The study’s limitation is that the proposed model was tested using a single public dataset. A related constraint is that no other dataset on the subject exists in the literature. Our future goals include creating a more detailed dataset with the help of experts in the field and developing a mobile application on the subject.

Conclusion

Artificial intelligence-based methods have recently gained prominence in many fields. Therefore, this study used them to detect jute pests automatically. Since jute is a significant income source, the product’s quality and quantity must be high. In addition, jute pests are difficult to detect visually, and by the time they are noticed it may be too late to take precautions. A hybrid structure is used in the proposed model to detect jute pests automatically. Pre-trained models were used instead of training a model from scratch so that the proposed model could work quickly. In addition, feature merging was performed to improve the proposed model’s results. The metaheuristic Mountain Gazelle Optimizer was used for feature selection so that the model could work faster. When the optimized feature map was classified with the SVM classifier, an accuracy value of 96.779% was achieved. This value shows that our model can be used to detect jute pests.