Introduction

The IoT is a disruptive technology that is rapidly changing industries by enabling automated operations and services through connected devices. The IoT is a networked system of servers, sensors, and software components1, and plays a significant role in numerous applications, from home automation to industry2. Although IoT devices bring numerous benefits, their development has proceeded at an extremely fast pace, which has increased security and privacy concerns: these devices have become major targets of cyber-attacks, leading to great losses and insecurity3.

Recent research identifies several key areas of vulnerability in IoT networks4. The variety of platforms, hardware, software, and communication protocols creates many vulnerabilities and makes it harder to design strong security measures5. Furthermore, IoT devices run with low processing power, which intensifies these issues. The available security solutions have not proved highly effective at identifying new-generation attacks or low-and-slow attacks. Attackers are innovative: rather than repeating the same attack pattern, they invent new ways of attacking and of evading the security measures in place6. These issues highlight the need for new security frameworks that can adequately address the threat environment that characterizes IoT ecosystems.

Current IoT security cannot detect new and persistent types of threats because it relies on intrusion detection systems (IDS) that cannot keep up with the sophistication of present-day attacks. Current IDS tools struggle with real-time identification, suffer from high false positive and false negative rates, and cannot adapt to new intrusion types, approaches, and tactics. The present literature lacks effective deep learning-based hybrid models that address these limitations by exploiting both the spatial and temporal aspects of the data.

To address this gap, we propose a hybrid classifier that combines two deep learning models, Convolutional Neural Networks (CNNs) and Gated Recurrent Units (GRUs), to classify threats in IoT environments. CNNs, mainly employed for computer vision and image analysis tasks, are suited to data with a spatial, array-like structure. A CNN has an input layer that takes in the data, followed by several hidden layers that include convolution and pooling layers7. These layers convolve and sub-sample the input to extract non-trivial characteristics of the input features. The final layers are fully connected and use the extracted information to make predictions such as the class label. Training finds the weights and biases of each layer that minimize the error between the predicted and actual labels, as measured by the loss function. GRUs, on the other hand, are a type of Recurrent Neural Network (RNN) used especially for sequential data processing. The main difference between GRUs and basic RNNs lies in how they gate the flow of information. GRUs use update and reset gates to decide how much past information to keep or forget, which helps mitigate the vanishing and exploding gradient problems that are more frequently observed in basic RNNs.
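For reference, the update and reset gates described above are commonly written as shown below. This is the standard GRU formulation from the literature and not necessarily the exact variant used in this study. The notation is introduced here for illustration only: \(x_{t}\) is the input at time step \(t\), \(h_{t-1}\) is the previous hidden state, \(W\), \(U\), and \(b\) are learned parameters, \(\sigma\) is the logistic sigmoid, and \(\odot\) is the element-wise product.

$$\begin{aligned} z_{t} &= \sigma (W_{z} x_{t} + U_{z} h_{t-1} + b_{z}) \\ r_{t} &= \sigma (W_{r} x_{t} + U_{r} h_{t-1} + b_{r}) \\ \tilde{h}_{t} &= \tanh (W_{h} x_{t} + U_{h} (r_{t} \odot h_{t-1}) + b_{h}) \\ h_{t} &= (1 - z_{t}) \odot h_{t-1} + z_{t} \odot \tilde{h}_{t} \end{aligned}$$

Here the update gate \(z_{t}\) controls how much of the previous state is carried forward, and the reset gate \(r_{t}\) controls how much of it contributes to the candidate state \(\tilde{h}_{t}\).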

These research objectives are pursued by using CNNs in parallel with GRUs to improve the detection of security attacks on IoT systems. CNNs are well suited to building spatial hierarchies from data and are therefore effective at identifying latent structure in raw input attributes such as network traffic patterns. GRUs, in turn, are well suited to tracking temporal dependencies, which are important for understanding the sequence and timing of events in an attack.

GRUs are designed to be simpler than LSTMs, requiring fewer parameters and less computational power. This efficiency is crucial for IoT applications where resources may be limited8. On the other hand, CNNs are highly parallelizable, which allows for faster training and inference times compared to RNNs9.

Therefore, the proposed model combines the advantages of both CNNs and GRUs to enable effective detection of security threats in IoT networks.

In addition, a new optimization algorithm, Self-Upgraded Cat and Mouse Optimization (SUCMO), is formulated to improve the efficiency of the hybrid classification model. SUCMO tunes the weights of the classifier to increase its precision in identifying specific types of attacks.

The study makes the following key contributions:

  • Hybrid Deep Learning-based IDS for IoT Security: We propose a novel hybrid deep learning-based IDS for classification of IoT security attacks. It combines a CNN and GRUs to capture both spatial and temporal features from the network data.

  • Use of Self-Upgraded Cat and Mouse Optimization Algorithm (SUCMO): To further optimize the performance of the proposed hybrid deep learning-based IDS, we use the SUCMO algorithm, a state-of-the-art optimization technique.

  • Comprehensive Experiments and Comparison: To evaluate the performance of the proposed model, we conducted extensive experiments using two different datasets. To assess the proposed model against alternatives, we compare it with traditional methods as well as state-of-the-art methods.

The remainder of the paper unfolds as follows: Section “Related works” delves into related works, presenting a comprehensive review of existing literature and studies that bear relevance to the research topic. Proposed methodology is discussed in Section “Proposed methodology”, outlining the research design, data collection methods, and any relevant frameworks or models used. Section “Result and discussion” unveils the findings and discussion, displaying the results derived from the data analysis and interpretation. The concluding remarks of the research paper, summarizing the key findings, their implications, and their contributions to the field of study, are encapsulated in Section “Conclusion”.

Related works

A review of the literature on IoT security shows that several techniques have been proposed to address cyber threats10,11. Historically, rule-based and signature-based systems were used, but these are inefficient for dealing with complex and dynamic threats12,13. Newer machine learning technologies have enabled more flexible and sophisticated security measures, which are discussed in this paper. In particular, CNNs and RNNs have advanced the analysis of network traffic patterns and the detection of anomalous behavior. Combining different deep learning architectures into a fused system has also been considered in order to benefit from their respective advantages. However, the combined use of CNNs and GRUs together with more advanced algorithms such as SUCMO has not yet been well investigated. This section reviews past work and outlines the major concerns that our research seeks to address.

Literature review

To address the aforementioned challenges, Otoum et al.14 introduced a combined IDS for IoT. The IDS uses both anomaly-based and signature-based detection in order to detect both known and unknown attacks. IoT gateway traffic is filtered, the data are preprocessed with a Target Encoder, Z-score normalization, and Discrete Hessian Eigenmap (DHE), and the result is passed to the hybrid IDS. The signature-based IDS uses LightNet with the HMS and Boyer Moore algorithms, and the anomaly-based IDS uses KDD-Tre with Deep Q-learning. The proposed system was tested on the NSL-KDD dataset and showed improvements in detection rate, false alarm rate, and specificity, along with reduced computation time, compared with traditional approaches.

Thakkar et al.15 provided a systematic review of machine learning and deep learning approaches applied to IDS. The authors focus on the development of IDS, from simple pattern matching to machine learning-based anomaly detection schemes. They attribute much of the improvement in detection accuracy to feature selection and data preprocessing. They also examine other machine learning approaches, including decision trees, support vector machines, and neural networks, and their use in IDS. The review closes with a reflection on future developments in the literature, especially the possibilities and weaknesses of applying deep learning methods to enhance detection efficiency and reduce false positives.

Otoum et al.16 proposed a new-generation IDS for IoT networks that fuses anomaly-based and signature-based approaches. The proposed system, AS-IDS, uses a three-phase approach: traffic filtering, preprocessing, and hybrid intrusion detection. The traffic filtering phase performs match-filtering of packet features at the IoT gateway; the preprocessing phase uses the Target Encoder and Z-score normalization techniques; and the hybrid IDS uses LightNet for signature-based detection and Deep Q-learning for anomaly-based detection. The system is capable of handling real-time traffic, and evaluation on the NSL-KDD dataset shows a very high detection rate and a very low false alarm rate.

Nadia et al.17 carried out a systematic study of network intrusion detection systems (NIDS) for IoT security, with a focus on the incorporation of machine learning algorithms. The survey highlights the major security issues arising from the massive expansion of IoT and the multifaceted nature of IoT security threats. The authors introduce IoT security threats and challenges, review available NIDS tools and datasets, and propose advanced NIDS solutions. They assess the architecture, detection techniques, validation mechanisms, treated threats, and algorithms used in these NIDS. The survey also points out that conventional NIDS techniques will not be sufficient for IoT systems because these systems are resource-constrained, diverse, and poorly connected. The authors contribute to academia and industry by providing a literature review of IoT threats and by defining new smart approaches to improving NIDS detection accuracy and reducing false positives.

In 2023, Mahmoud Ragab et al.18 presented the Piecewise Harris Hawks Optimizer with an Optimal Deep Learning Classifier (PHHO-ODLC) as a technique for identifying Distributed Denial of Service (DDoS) attacks within the IoT environment. It incorporates a three-stage process: feature selection using PHHO, DDoS attack detection via an attention-based bidirectional LSTM network, and hyperparameter optimization with a grey wolf optimizer19. The PHHO-ODLC achieves up to 99.20% accuracy in detecting DDoS attacks, thereby enhancing the security and reliability of IoT devices across various sectors. Future research is suggested in areas such as scalability, handling diverse attack methods, adapting to evolving DDoS tactics, and integrating anomaly detection for more robust IoT security.

In 2023, Merve Ozkan et al.20 highlighted increasing network security challenges and presented a novel hybrid intrusion detection method called Feature Selection and Attack Classification Method (FSACM). FSACM employs feature selection and a combined signature-anomaly detection technique for improved accuracy and reduced false positives. Tested on diverse datasets, FSACM outperforms existing models, indicating its potential for securing complex networks and adapting to evolving cyber threats. While emphasizing the inadequacy of traditional methods, the authors show the effectiveness of careful feature creation and learning phases in FSACM’s design.

In 2023, Zihao Zhu et al.21 developed a multi-feature fusion model that simulates attacker behavior and analyzes data using RNN, LSTM, and GRU neural networks. The model selects the best-performing network as a classifier, achieving enhanced vulnerability detection and outperforming existing methods in terms of accuracy, efficiency, and practical applicability. This research highlights the model’s potential for real-world deployment amidst growing concerns about IoT security, particularly in the context of malicious Bitcoin mining activities.

In 2023, Muawia et al.22 identified DoS attacks as a major threat to WSNs and examined existing detection methods, exposing their weaknesses. To address this gap, they propose an agile, resource-efficient machine learning model using DT and Gini feature selection. Extensive testing on augmented WSN-DS data reveals their model’s superior accuracy (99.5%) and lower processing overhead compared to popular classifiers like RF, KNN, and XGBoost. Recognizing WSNs’ resource constraints, this approach shines in efficiency, highlighting its potential for real-world deployment. The authors stress the need for further research on different datasets and balanced data to refine accuracy even further.

In 2023, to combat the rising threat of diverse IoT malware, Hyunjong Lee et al.23 introduced a detection and classification system. They overcome resource constraints by constructing low-dimensional features from opcode categories, which capture malware behaviors effectively. Across several machine learning models, their method achieves over 98% accuracy, making it a reliable contribution to malware analysis. This advancement promises better detection efficiency and a stronger defense against evolving cyber threats in the vast and vulnerable IoT ecosystem.

In 2022, Mohammed Saleh Ali Muthanna et al.24 proposed a smart SDN-enabled hybrid system that uses a Cuda Long Short-Term Memory Gated Recurrent Unit (CuLSTMGRU) to recognize attacks in IoT contexts. The study shows that the proposed architecture is scalable and cost-effective. In the future, the use of blockchain technology and the hybrid approach in NIDS could lead to even more effective and real-time ways to protect the IoT from threats. The authors recommend that DL-driven hybrid models be investigated to protect the IoT ecosystem and keep up with changing computing paradigms.

In 2023, Ali et al.25 investigated hyperparameter tuning of machine learning algorithms, specifically the SVM. The authors applied four optimization algorithms: the Artificial Bee Colony (ABC) algorithm, the Genetic Algorithm (GA), Whale Optimization, and Particle Swarm Optimization (PSO). The goal was to evaluate the computational cost of hyperparameter tuning for the machine learning model and to compare the computational complexity of these optimization algorithms. The study found that traditional approaches such as random search and grid search suffer from slow convergence and high computation time. Based on their experiments, the authors suggest that similar optimization algorithms can be applied to machine learning or deep learning models to fine-tune their hyperparameters.

In recent years, IoT has become a multibillion-dollar industry. Despite its clear benefits, IoT is insecure and, because of its massive adoption, a likely target for cyber-attacks. Additionally, the extensive connectivity and dynamic nature of these devices may provide a broad attack surface for sophisticated malware. Protecting the IoT environment against such threats and malware is imperative.

The characteristics and issues with the present IDS in IoT are listed in Table 1.

Table 1 Analysis of traditional IDS in IoT.

Proposed methodology

The proposed IDS for IoT security consists of three primary stages: preprocessing, feature extraction, and classification. In the preprocessing stage, all features with incomplete or irrelevant information are removed to ensure uniformity. The second stage, feature extraction, extracts statistical and higher-order statistical features from the data in order to describe it simply and efficiently. In the classification stage, a hybrid deep learning model combining a CNN and GRUs is used to classify security attacks. Furthermore, to improve the performance of the hybrid deep learning model, the SUCMO algorithm is used to optimize the model by fine-tuning its hyperparameters, as shown in Fig. 1.

Fig. 1 Proposed IDS framework.

Feature extraction

Feature extraction and feature selection are crucial data preparation steps in machine learning. Feature selection involves choosing the most relevant, unaltered features from the dataset to enhance predictive accuracy. The two datasets used in this study have 18 and 23 features, respectively. We discarded the features that had incomplete or missing values. Because both datasets already have a manageable number of features, feature selection was not strictly necessary, and we focused on feature extraction; feature selection could be incorporated in future work. Feature extraction transforms raw data into a manageable set of new, representative features by combining or modifying the original data. It aims to boost model accuracy and efficiency, particularly in high-dimensional datasets, using statistical measures such as the mean, median, and mode for summarization.
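The step of discarding features with incomplete or missing values can be sketched as follows. This is a minimal illustration, not the preprocessing code used in the study: the function name `drop_incomplete_features`, the record layout (a list of dictionaries), and the feature names in the usage note are all hypothetical.

```python
import math


def is_missing(value):
    """Treat None, empty strings, and float NaN as missing (an assumption)."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def drop_incomplete_features(rows):
    """Remove every feature that is missing in at least one record.

    `rows` is a non-empty list of dicts sharing the same keys.
    Returns (kept_feature_names, cleaned_rows).
    """
    names = list(rows[0].keys())
    kept = [n for n in names if not any(is_missing(r.get(n)) for r in rows)]
    cleaned = [{n: r[n] for n in kept} for r in rows]
    return kept, cleaned
```

For example, with two toy records in which `duration` is missing once, only `pkt_size` and `proto` would be retained.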

Mean: The average value of the data in a given dataset. To find the mean, add up all the values and divide the sum by the number of values. It can be used to describe the center of a dataset and to make comparisons between different datasets.

Median: An indicator of central tendency that gives the middle value in a set of data after it has been sorted. It is a valuable tool for describing the center of the data and summarizing it, particularly in situations where the mean may not accurately represent the dataset due to outliers or a skewed distribution.

Standard Deviation (SD): A measure of the variability or dispersion in the dataset. It represents the typical deviation of the values from the mean of the dataset. A smaller SD implies that the values in the dataset are closer to the mean, whereas a larger SD indicates a wider spread of values.

Mode: The value that occurs most frequently in a dataset. It is a measure of central tendency that is commonly used to describe datasets with nominal or ordinal scale variables.

Min: The minimum value in the dataset. It is an important statistic that provides information about the lower bound of the dataset values.

Max: The largest value in the dataset. It is a measure of the upper end of the range of data and can be used to describe the spread of the data and to identify outliers or values that are significantly higher than the rest of the data.

Skewness: An indication of asymmetry in the distribution of a set of data, which measures the degree to which the data deviate from a symmetrical distribution.

Kurtosis: An indication of the peakedness or flatness of the distribution of the data, which describes the degree to which the data deviate from a normal distribution.
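The eight statistics above can be computed for a single feature column as in the following sketch, which uses only the Python standard library. It rests on several assumptions that the text does not settle: the population (not sample) standard deviation is used, skewness and kurtosis are taken as the standardized third and fourth central moments (so a normal distribution has kurtosis 3), and the function name `statistical_features` is hypothetical.

```python
import statistics


def statistical_features(values):
    """Return the eight summary statistics for one numeric feature column."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (assumption)

    # Third and fourth central moments, used for skewness and kurtosis.
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "mode": statistics.mode(values),  # first mode if several exist
        "min": min(values),
        "max": max(values),
        # Guard against constant columns, where SD is zero.
        "skewness": m3 / sd ** 3 if sd > 0 else 0.0,
        "kurtosis": m4 / sd ** 4 if sd > 0 else 0.0,
    }
```

For the toy column `[2, 4, 4, 4, 5, 5, 7, 9]`, this gives a mean of 5, a median of 4.5, an SD of 2, and a mode of 4.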

Hybrid classifier for attack classification

A hybrid deep learning classifier is a model that blends two or more distinct deep learning algorithms with the aim of enhancing both accuracy and overall performance26. Its objective is to take advantage of the strengths of the different algorithms while overcoming their individual limitations. The choice of algorithms to include in a hybrid deep learning classifier depends on the problem and the characteristics of the dataset. The best hybrid classifier for a given problem is determined through experimentation and testing. The proposed hybrid classifier uses the CNN and GRU deep learning models and combines their predictions by taking the average of their outputs.
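The averaging step can be illustrated with a short sketch. It assumes that each model outputs one attack probability per instance and that a threshold of 0.5 turns the averaged score into a binary label; the threshold, the function name `average_ensemble`, and the sample probabilities in the usage note are illustrative and not taken from the study.

```python
def average_ensemble(cnn_probs, gru_probs, threshold=0.5):
    """Average two per-instance probability lists and threshold the result.

    Returns (averaged_probabilities, binary_labels), where label 1 means
    "attack" and label 0 means "normal".
    """
    if len(cnn_probs) != len(gru_probs):
        raise ValueError("both models must score the same instances")
    averaged = [(c + g) / 2 for c, g in zip(cnn_probs, gru_probs)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels
```

For instance, CNN scores of 0.9, 0.2, and 0.6 combined with GRU scores of 0.7, 0.4, and 0.3 average to roughly 0.8, 0.3, and 0.45, so only the first instance is labeled as an attack.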

Proposed optimization algorithm (SUCMO)

The existing CMBO27 is an optimization algorithm based on the behavior of cats and mice. The algorithm combines global and local search techniques to determine the optimal solution for a given problem. It derives its name from the way cats and mice interact, in which the cat moves slowly and methodically in an attempt to capture the swiftly moving mouse. In terms of optimization techniques, the cat symbolizes the global search method and the mouse represents the local search method.

The CMBO algorithm is used to optimize the weights of classifiers such as the CNN and GRUs by searching for the best possible weights. The algorithm starts by initializing random weights and evaluating the classifier’s performance in terms of accuracy. It then iteratively updates the weights, using the best-performing solution to guide the other solutions towards better results. Over multiple iterations, it finds weights that improve the classifier’s performance, helping the model make more accurate predictions.

The SUCMO model’s steps are provided below:

  1. The starting population of \(X\) search agents is initialized.

  2. Initialize \(X, X_{h}, X_{l}, E\), where \(X\) is the number of members in the population matrix \(Z\).

  3. The starting population is generated according to Eq. (1).

    $$Z = \begin{bmatrix} Z_{1} \\ \vdots \\ Z_{i} \\ \vdots \\ Z_{X} \end{bmatrix}_{X \times l} = \begin{bmatrix} s_{1,1} & \cdots & s_{1,u} & \cdots & s_{1,l} \\ \vdots & \ddots & \vdots & & \vdots \\ s_{i,1} & \cdots & s_{i,u} & \cdots & s_{i,l} \\ \vdots & & \vdots & \ddots & \vdots \\ s_{X,1} & \cdots & s_{X,u} & \cdots & s_{X,l} \end{bmatrix}_{X \times l}$$
    (1)

    where \(s_{i,u}\) is the value of the \(u\)th problem variable for the \(i\)th search agent.

  4. The fitness of the search agents is calculated according to Eq. (2).

    $$obF = \min (err)$$
    (2)

  5. Update the sorted population matrix \(Z^{r}\) by applying Eqs. (3) and (4), where \(s_{i,u}^{r}\) is the value of the \(u\)th variable of the \(i\)th member of the sorted population matrix and \(obF^{r}\) is the sorted objective function vector.

    $$Z^{r} = \begin{bmatrix} Z_{1}^{r} \\ \vdots \\ Z_{i}^{r} \\ \vdots \\ Z_{X}^{r} \end{bmatrix}_{X \times l} = \begin{bmatrix} s_{1,1}^{r} & \cdots & s_{1,u}^{r} & \cdots & s_{1,l}^{r} \\ \vdots & \ddots & \vdots & & \vdots \\ s_{i,1}^{r} & \cdots & s_{i,u}^{r} & \cdots & s_{i,l}^{r} \\ \vdots & & \vdots & \ddots & \vdots \\ s_{X,1}^{r} & \cdots & s_{X,u}^{r} & \cdots & s_{X,l}^{r} \end{bmatrix}_{X \times l}$$
    (3)
    $$obF^{r} = \left[ \begin{array}{cc} obF_{1}^{r} & \min (obF) \\ obF_{2}^{r} & \min (obF) \\ \vdots & \vdots \\ obF_{X}^{r} & \min (obF) \end{array} \right]_{X \times 1}$$
    (4)

  6. The mice population is selected using Eq. (5).

    $$L = \begin{bmatrix} L_{1} = Z_{1}^{r} \\ \vdots \\ L_{i} = Z_{i}^{r} \\ \vdots \\ L_{X_{l}} = Z_{X_{l}}^{r} \end{bmatrix}_{X_{l} \times l} = \begin{bmatrix} s_{1,1}^{r} & \cdots & s_{1,u}^{r} & \cdots & s_{1,l}^{r} \\ \vdots & \ddots & \vdots & & \vdots \\ s_{i,1}^{r} & \cdots & s_{i,u}^{r} & \cdots & s_{i,l}^{r} \\ \vdots & & \vdots & \ddots & \vdots \\ s_{X_{l},1}^{r} & \cdots & s_{X_{l},u}^{r} & \cdots & s_{X_{l},l}^{r} \end{bmatrix}_{X_{l} \times l}$$
    (5)

  7. The cat population is chosen via Eq. (6).

    $$C = \begin{bmatrix} C_{1} = Z_{X_{l}+1}^{r} \\ \vdots \\ C_{j} = Z_{X_{l}+j}^{r} \\ \vdots \\ C_{X_{c}} = Z_{X_{l}+X_{c}}^{r} \end{bmatrix}_{X_{c} \times l} = \begin{bmatrix} s_{X_{l}+1,1}^{r} & \cdots & s_{X_{l}+1,u}^{r} & \cdots & s_{X_{l}+1,l}^{r} \\ \vdots & \ddots & \vdots & & \vdots \\ s_{X_{l}+j,1}^{r} & \cdots & s_{X_{l}+j,u}^{r} & \cdots & s_{X_{l}+j,l}^{r} \\ \vdots & & \vdots & \ddots & \vdots \\ s_{X_{l}+X_{c},1}^{r} & \cdots & s_{X_{l}+X_{c},u}^{r} & \cdots & s_{X_{l}+X_{c},l}^{r} \end{bmatrix}_{X_{c} \times l}$$
    (6)

  8. In Eqs. (5) and (6), \(L\), \(X_{l}\), \(L_{i}\), \(C\), \(X_{c}\), and \(C_{j}\) denote the mice population, the total number of mice, the \(i\)th mouse, the cat population, the total number of cats, and the \(j\)th cat, respectively.

  9. The cat position update is given in Eq. (7), where \(C_{j}^{new}\) is the new location of the \(j\)th cat, \(C_{j,u}\) is the current value of its \(u\)th problem variable, and \(L_{k,u}\) is the \(u\)th variable of the \(k\)th mouse. Furthermore, \(z\) is a random number in the range [0, 1]. \(R\) is evaluated using Eq. (8), where \(rand\) is a random number.

    $$C_{j}^{new} = \left[ C_{j,u} + z \cdot (L_{k,u} - R \cdot C_{j,u}) \right]$$
    (7)
    $$R = round(1 + rand)$$
    (8)

  10. If \(j = X_{c}\):

    (a) If the criterion is met, then \(H_{i}\) is formed via Eq. (9).

    $$H_{i}: h_{i,u} = s_{m,u}, \quad i = 1:X_{l},\; u = 1:l,\; m \in 1:X$$
    (9)

  11. The mice position update is given in Eqs. (10) and (11). Conventionally, \(L_{i}\) is updated using Eq. (11); in the SUCMO method, however, \(L_{i}\) is updated using the scaling factors \(ra_{1}\) and \(ra_{2}\), as shown in Eqs. (12) and (13), where \(ra_{1}\) and \(ra_{2}\) were set to 1.25 and 1.75, respectively.

    $$L_{i}^{new}: l_{i,u}^{new} = l_{i,u} + z \cdot (h_{i,u} - R \cdot l_{i,u}) + Sign(J_{i}^{l} - J_{i}^{H}), \quad i = 1:X_{l},\; u = 1:l$$
    (10)
    $$L_{i} = \begin{cases} L_{i}^{new}, & J_{i}^{l,new} < J_{i}^{l} \\ L_{i}, & \text{else} \end{cases}$$
    (11)
    $$L_{i} = \begin{cases} L_{i}^{new}, & J_{i}^{l,new} \cdot ra_{1} < J_{i}^{l} \\ L_{i}, & \text{else} \end{cases}$$
    (12)
    $$L_{i} = \begin{cases} L_{i}^{new}, & J_{i}^{l,new} \cdot ra_{2} < J_{i}^{l} \\ L_{i}, & \text{else} \end{cases}$$
    (13)

    (b) If the aforementioned criterion is not met, then increase \(j\) by 1 and update \(C_{j}\) again.

    (c) End if.

  12. If \(i = X_{l}\), then:

    (a) If this criterion is met, check whether \(t = T\).

    (b) If this criterion is not met, increment \(i\) by 1.

    (c) End if.

  13. If \(t = T\), the best solution found so far is returned.

  14. If \(t \ne T\), increment \(t\) by 1 and start again from the evaluation of the objective function.

  15. End.
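The steps above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation. A toy sphere function stands in for the classifier error \(err\). The population is sorted best first and split evenly, so the better half are treated as mice. Cats accept a new position only if it improves their fitness. The haven \(H_{i}\) is a randomly chosen population member. \(rand\) is drawn from [0, 1), so \(R\) is 1 or 2. One of the two scaling factors 1.25 and 1.75 is picked at random for each mouse update, because the steps do not state how Eqs. (12) and (13) are alternated. No bound handling is applied. All function and parameter names are hypothetical.

```python
import random


def sphere(x):
    """Toy objective standing in for the classifier error (lower is better)."""
    return sum(v * v for v in x)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sucmo_sketch(objective, dim, n_agents=20, n_iter=50,
                 lower=-5.0, upper=5.0, ra=(1.25, 1.75), seed=0):
    """Minimize `objective`; returns (best position, best fitness, history)."""
    rng = random.Random(seed)
    # Steps 1-3: random starting population Z (n_agents x dim), Eq. (1).
    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_agents)]
    # Step 4: fitness of every search agent, Eq. (2).
    fit = [objective(p) for p in pop]
    n_mice = n_agents // 2  # assumption: mice are the better half
    history = [min(fit)]

    for _ in range(n_iter):
        # Step 5: sort the population by fitness, best first, Eqs. (3)-(4).
        order = sorted(range(n_agents), key=fit.__getitem__)
        pop = [pop[i] for i in order]
        fit = [fit[i] for i in order]

        # Steps 7 and 9: each cat moves toward a random mouse, Eqs. (7)-(8).
        for j in range(n_mice, n_agents):
            k = rng.randrange(n_mice)
            z = rng.random()
            r = round(1 + rng.random())  # R is 1 or 2
            cand = [c + z * (m - r * c) for c, m in zip(pop[j], pop[k])]
            cand_fit = objective(cand)
            if cand_fit < fit[j]:  # assumption: greedy acceptance for cats
                pop[j], fit[j] = cand, cand_fit

        # Steps 10-11: each mouse moves relative to a haven, Eqs. (9)-(10).
        for i in range(n_mice):
            m = rng.randrange(n_agents)
            haven, haven_fit = pop[m], fit[m]
            z = rng.random()
            r = round(1 + rng.random())
            s = sign(fit[i] - haven_fit)
            cand = [cur + z * (h - r * cur) + s
                    for cur, h in zip(pop[i], haven)]
            cand_fit = objective(cand)
            factor = rng.choice(ra)  # assumption: Eq. (12) or (13) at random
            if cand_fit * factor < fit[i]:  # scaled acceptance test
                pop[i], fit[i] = cand, cand_fit

        history.append(min(fit))

    best = min(range(n_agents), key=fit.__getitem__)
    return pop[best], fit[best], history
```

Calling `sucmo_sketch(sphere, dim=3)` returns the best position found, its fitness, and the best fitness after each iteration. Because a scaling factor greater than 1 multiplies the new fitness, a mouse moves only when the candidate is better by a clear margin, which makes the acceptance test stricter than Eq. (11) for non-negative error values.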

The SUCMO algorithm enhances the existing CMBO algorithm by integrating two empirically derived scaling factors. These scaling factors were identified using a trial-and-error approach and play a key role in adjusting the dynamics of the exploration and exploitation phases of the optimization process. A wide range of candidate values was tested in CMBO to find the combination that maximizes the algorithm’s performance. The chosen values, 1.25 and 1.75, are integrated into CMBO through Eq. (12) and Eq. (13). This tuning mechanism improves the algorithm’s efficiency and adaptability.

The empirically derived scaling factors enhance the optimization of hybrid deep learning models for the detection of security attacks in the IoT environment, as evidenced by superior performance metrics including accuracy and convergence speed. Figure 2 depicts the flowchart of SUCMO.

Fig. 2 SUCMO flowchart.

Result and discussion

The implementation of the proposed model is carried out using Python.

Dataset description

The UNSW-NB15 dataset used here has a total of 120 instances, of which 60 are normal and the rest are attacks (DoS attacks only). The dataset contains a total of 19 features including the target, such as packet size, flow duration, and protocol type, which are essential for classifying abnormal activity in the IoT environment. The second dataset, BoT-IoT, contains a total of 2000 instances, of which 1000 are normal and the rest are attack instances. In this dataset, the attack instances are further divided into 8 categories: Analysis, Backdoor, DoS, Exploits, Fuzzers, Reconnaissance, Shellcode, and Worms. However, all attack categories are combined into a single “attack” category for binary classification. The BoT-IoT dataset has a total of 24 features including the target feature, covering source and destination addresses, packet size, flow duration, and other network details. Both datasets offer rich and complex details for evaluating the models against various attacks in the IoT environment. Figure 3 shows the distribution of normal and attack instances for both datasets, and Table 2 summarizes both datasets.

Fig. 3 Distribution of normal and attack instances in datasets.

Table 2 Summary of datasets.

Evaluation metrics

The effectiveness of the proposed approach was compared with that of traditional methods on the basis of positive measures such as accuracy, sensitivity, specificity, and precision.

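Achieving high accuracy is a primary goal in machine learning because it indicates that the model is making correct predictions. Accuracy is defined by Eq. (14), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.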

$${\text{Accuracy}} = \frac{TP + TN}{{TP + TN + FN + FP}}$$
(14)

Nevertheless, it is crucial to acknowledge that accuracy alone may not provide an adequate measure for assessing the effectiveness of a model. Other metrics such as precision, specificity, and sensitivity should also be considered.

Precision is another significant metric that plays a key role in assessing a model’s performance. It quantifies the accuracy of positive predictions by determining the proportion of true positives among instances predicted as positive. Precision is computed by dividing the number of true positives by the sum of true positives and false positives, as given in Eq. (15).

$${\text{Precision}} = \frac{TP}{{TP + FP}}$$
(15)
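Eqs. (14) and (15), together with the other measures reported in the results (sensitivity, specificity, FPR, FNR, NPV, F-measure, and MCC), can be computed from the four confusion-matrix counts as in the sketch below. Only accuracy and precision are defined explicitly in the text; the remaining formulas are the standard definitions, and the function name `confusion_metrics` and the counts in the usage note are hypothetical.

```python
import math


def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def confusion_metrics(tp, tn, fp, fn):
    """Compute common binary-classification metrics from confusion counts."""
    precision = _ratio(tp, tp + fp)           # Eq. (15)
    sensitivity = _ratio(tp, tp + fn)         # true positive rate (recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),  # Eq. (14)
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "fpr": _ratio(fp, fp + tn),           # false positive rate
        "fnr": _ratio(fn, fn + tp),           # false negative rate
        "npv": _ratio(tn, tn + fn),           # negative predictive value
        "f_measure": _ratio(2 * precision * sensitivity,
                            precision + sensitivity),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }
```

With illustrative counts of TP = 45, TN = 50, FP = 2, and FN = 3, the accuracy is 95/100 = 0.95, the precision is 45/47, and the sensitivity is 45/48 = 0.9375.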

The proposed approach was also compared with well-known methods, namely Rock Hyraxes Swarm Optimization (RHSO)28, Butterfly Optimization Algorithm (BOA)29, Salp Swarm Optimization Algorithm (SSOA)30, and Blue Monkey Optimization (BMO)31, by varying the learning percentage over 60, 70, 80, and 90.
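Varying the learning percentage can be illustrated with the following sketch. It assumes that the learning percentage denotes the share of instances used for training, with the remainder held out for testing; the text does not define the term, so this reading, the shuffling step, and the function name `split_by_learning_percentage` are assumptions.

```python
import random


def split_by_learning_percentage(samples, learning_percentage, seed=0):
    """Shuffle `samples` and split them into (training, testing) lists.

    `learning_percentage` is the share of instances (0-100) assumed to be
    used for training.
    """
    if not 0 <= learning_percentage <= 100:
        raise ValueError("learning_percentage must be between 0 and 100")
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * learning_percentage / 100)
    return shuffled[:cut], shuffled[cut:]
```

Under this reading, a dataset of 120 instances would yield 72, 84, 96, and 108 training instances at learning percentages of 60, 70, 80, and 90.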

Furthermore, Tables 3 and 4 outline the architecture of CNN and GRUs, detailing each layer’s type and parameters.

Table 3 CNN Architecture.
Table 4 GRU Architecture.

Analysis of proposed model for dataset 1

The efficiency of the proposed method for IoT intrusion detection was compared with that of traditional methods such as RHSO, BOA, SSOA, and BMO, using measures such as accuracy, specificity, precision, sensitivity, FNR, FPR, F-measure, MCC, and NPV. The learning percentage was varied between 60% and 90%. The results in Fig. 4 show that the proposed method outperformed the others, achieving higher values for the positive measures and lower values for the negative measures, which is the desirable outcome. As Fig. 4a shows, the proposed strategy achieved accuracies of approximately 94.28% and 96.65% at learning percentages of 60% and 90%, respectively. Likewise, the proposed strategy attained the highest values of the positive measures at all learning percentages.

Fig. 4

Performance Assessment of the proposed Hybrid + SUCMO versus Traditional techniques (a) Accuracy, (b) Precision, (c) Sensitivity, (d) Specificity for UNSW-NB15 dataset.

Similarly, for the negative measures (FNR, FPR) reported in Table 5, the proposed technique produced the lowest FPR values of 0.009, 0.001, 0.003, and 0.004 at learning percentages of 60, 70, 80, and 90, respectively. Furthermore, the proposed strategy yielded more effective results than the traditional methodologies. The proposed approach has the highest F-measure, above approximately 90%, at all learning percentages. At the 90% learning percentage, the NPV of the adopted technique is 93.56%, which is superior to that of existing schemes such as RHSO (90.61%), BOA (88.67%), SSOA (84.13%), and BMO (86.48%). Also, the negative measures of the proposed method exhibited less error than those of the other methods, ranging from 0.01 to 0.03 for learning percentages of 70–90%. Thus, the proposed IDS achieved excellent outcomes and improved efficiency for IoT intrusion detection.
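The remaining measures named in this analysis (FNR, FPR, F-measure, MCC, NPV) follow from the same four confusion-matrix counts. The sketch below uses their standard textbook definitions, since the formulas are not printed in this section. The function name `extended_measures` and the toy counts are assumptions for illustration and do not reproduce the reported results.

```python
import math


def extended_measures(tp, tn, fp, fn):
    """Return FNR, FPR, NPV, F-measure and MCC from confusion-matrix counts.

    Undefined ratios (zero denominator) are reported as 0.0, a convention
    chosen for this sketch.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    prec = ratio(tp, tp + fp)
    rec = ratio(tp, tp + fn)  # recall, i.e. sensitivity
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "FNR": ratio(fn, fn + tp),   # missed attacks among actual attacks
        "FPR": ratio(fp, fp + tn),   # false alarms among actual normal traffic
        "NPV": ratio(tn, tn + fn),   # correct 'normal' verdicts among predicted normal
        "F-measure": ratio(2 * prec * rec, prec + rec),
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Illustrative counts only; not taken from Table 5.
print(extended_measures(tp=3, tn=3, fp=1, fn=1))
```

FNR and FPR are the complements of sensitivity and specificity, which is why lower values of these two measures accompany higher values of the positive ones.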

Table 5 Analysis on classifiers of the proposed hybrid + SUCMO model for datasets.

Analysis of proposed model for dataset 2

Figure 5 provides a detailed analysis of the performance of the proposed approach on metrics including accuracy, sensitivity, specificity, and precision. The proposed methodology outperformed the traditional schemes, providing the highest values of the positive measures (accuracy, F-measure, precision, and so on) and the lowest values of the negative measures (FNR, FPR). The detection accuracy of the proposed method is 98.71% at a learning percentage of 90%, which is significantly higher than that of traditional methods such as RHSO (89.62%), BOA (75.45%), SSOA (79.49%), and BMO (78.54%). Similarly, the other positive metrics also reached their highest values compared with the other approaches.

Fig. 5

Performance Assessment of the proposed Hybrid + SUCMO versus Traditional techniques (a) Accuracy (b) Precision (c) Sensitivity (d) Specificity for BoT-IoT dataset.

Moreover, at the 80% learning percentage, the FNR of the proposed method is 0.078, which is low in contrast to RHSO, BOA, SSOA, and BMO. The other measures, such as F-measure, MCC, and NPV, approach their maximum values, i.e., above approximately 90% at all learning percentages. In particular, the F-measure of the proposed method is approximately 98.58% at the 90% learning percentage. Hence, the evaluation makes it evident that the proposed Hybrid + SUCMO model performs significantly better for intrusion detection in IoT, with high accuracy and precision. Altogether, the hybrid classifier with the proposed optimization strategy, SUCMO, outperforms the other state-of-the-art models in intrusion detection.

Analysis on classifier of the proposed model for dataset 1 and dataset 2

This section describes how the proposed Hybrid + SUCMO model performs more efficiently than traditional classifiers. Table 5 summarizes the classifier analysis of the proposed technique for both datasets. On both datasets, the proposed technique achieved superior results to those of the traditional techniques, which indicates the efficacy of the proposed approach for IoT intrusion detection. For dataset 1, the existing methods, namely SVM, ANN, CNN, RF, QNN, and GRU-CNN, attained considerably lower precision of 71.20%, 73.35%, 58.8%, 73.23%, 76.57%, and 68.28%, respectively, whereas the proposed Hybrid + SUCMO method achieved a precision of 93%. Likewise, for the negative measures, the proposed methodology produced the lowest FNR (0.1008) and FPR (0.00191).

For the BoT-IoT dataset, the proposed technique has higher accuracy than the existing methodologies. The F-measure of the adopted work is 92.78%, much higher than those of SVM (71.20%), ANN (73.35%), CNN (58.8%), RF (73.23%), QNN (76.75%), and GRU-CNN (68.28%). Thus, the proposed Hybrid + SUCMO method demonstrates its ability to solve the detection problem with a lower error rate and a high accuracy rate.

Moreover, the accuracy of the Hybrid + SUCMO method is also compared with state-of-the-art work for both datasets in Table 6.

Table 6 Accuracy analysis of the proposed method with state-of-the-art methods.

Convergence analysis of the proposed hybrid + SUCMO model

Figure 6 shows the outcomes of the convergence analysis of the proposed Hybrid + SUCMO model in contrast to more conventional methods. The analysis indicates that the proposed method is highly effective at detecting intrusions in IoT while making few errors. Each classifier has its largest error values during the initial iterations, but as the iterations progress, the classifiers’ error rates gradually decrease. For dataset 1, the proposed scheme had a larger error rate of approximately 0.48 in iterations 0–5, whereas in iterations 6–10 it had a lower error rate (0.008) than the other well-established classifiers. Moreover, for dataset 2, the error rate of the proposed work at the 50th iteration is 1.01. These results indicate that the proposed Hybrid + SUCMO technique offers enhanced intrusion detection with lower error.

Fig. 6

Convergence analysis of the proposed Hybrid + SUCMO model versus Traditional methods (a) UNSW-NB15 Dataset (b) BoT-IoT Dataset.
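A convergence curve of the kind described above can be derived from per-iteration error values. The sketch below assumes that the plotted quantity is the lowest error found up to each iteration, which the text does not state explicitly. The function names and the sample values are hypothetical and are not read from Fig. 6.

```python
from itertools import accumulate


def convergence_curve(errors_per_iteration):
    """Best (lowest) error seen up to and including each iteration."""
    return list(accumulate(errors_per_iteration, min))


def first_iteration_below(curve, threshold):
    """Index of the first iteration whose best error is below threshold, else None."""
    for i, err in enumerate(curve):
        if err < threshold:
            return i
    return None


# Hypothetical raw errors from an optimizer run; not the reported results.
raw_errors = [0.5, 0.6, 0.3, 0.4, 0.1]
curve = convergence_curve(raw_errors)
print(curve)
print(first_iteration_below(curve, 0.2))
```

Because the curve keeps the running minimum, it never increases, which matches the gradual decrease in error described for each classifier.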

Ablation study of the proposed hybrid + SUCMO method

For both datasets, Table 7 compares the Hybrid + SUCMO scheme with the model without feature extraction. The analysis covers a variety of measures, including accuracy, F-measure, MCC, FNR, and FPR. The ablation results show that the proposed Hybrid + SUCMO model achieves higher values for all positive metrics than the ablated variant. For dataset 1, the detection accuracy of the proposed strategy is 93.36%, compared to 82% for the model without feature extraction. The F-measure, FNR, MCC, and precision of the proposed methodology are 93.68%, 0.1008, 86.82%, and 90.93%, respectively.

Table 7 Ablation study of the proposed hybrid + SUCMO method.

Results discussion and contributions to IoT security

The proposed model, which combines CNN and GRU and is optimized by the SUCMO algorithm, is a major advancement in IoT security because it addresses some of the main shortcomings of current models. In contrast to conventional IDS paradigms that address either the spatial or the temporal aspects of feature information, this work uses both, which enhances the detection rate with fewer false alarms. Compared to previous research, our method achieves higher classification accuracy as well as fewer false positives on benchmark IoT datasets. This helps strengthen the protection of the IoT space against new and dangerous threats.

Table 8 shows the comparison between the proposed model and recent studies.

Table 8 Comparison with Latest IDS.

Conclusion

This paper details a comprehensive study on a hybrid deep learning model for detecting security attacks in IoT environments, focusing on evaluation and comparative analysis. The study aims to strengthen IoT security by detecting and mitigating intrusions. The proposed IDS involves three stages: preprocessing for data normalization, feature extraction using statistical methods for accurate intrusion detection, and classification through a hybrid CNN and GRU model. The model’s performance is enhanced using the SUCMO algorithm for weight optimization. Extensive tests using two datasets show that the hybrid CNN-GRU classifier excels in intrusion detection in IoT contexts, with high accuracy and other positive metrics. This research contributes significantly to IoT security, offering an efficient IDS approach and opening pathways for future advancements in protecting IoT systems against attacks.