Introduction

The Internet of Things (IoT) has developed rapidly in recent years, allowing ubiquitous computing and sensing to link many things to the Internet1. An IoT system links with other crucial infrastructures in the smart city environment, such as telecommunication networks, smart homes, and smart airports, to give citizens diverse benefits that can improve their lives2. An IoT structure is a cyber-physical system (CPS) with physical and computational capabilities that permit interactions, data sharing, and connections between devices and machines to improve efficiency and functionality without requiring a person in the loop3. However, the limited memory and power of such devices make these systems susceptible to hacking and expose them to cyber threats. Thus, cybersecurity has become vital to keep these systems operational4. Cybersecurity means securing devices, data, and systems against illegal usage or unauthorized access while preserving the privacy, availability, and integrity of information. Cyber defence mechanisms, in turn, operate at the network, application, data, and host levels. The Internet has become a crucial tool in people’s everyday lives, and the number of systems connected to it keeps growing5.

The development of mobile devices, computer servers, and networks has considerably increased Internet utilization. Explainable AI (XAI) is attracting growing attention in several applications because it offers multiple benefits, such as trustworthy, highly transparent, and interpretable models6. AI methods are being developed every day with more advanced capabilities, and AI has become an interface through which humans can interact quickly with machines. Nevertheless, these methods are frequently susceptible to model bias, absence of code, and trust concerns7. To address such hazards and keep AI methods transparent, XAI provides a clear understanding of how a model reaches its decisions or solutions. Adopting XAI in existing businesses can replace traditional AI methods and bring significant improvements to manufacturing, production, wealth management, the financial sector, and the supply chain8. Recently, XAI technology has attracted extensive interest from both academia and industry. The technology has attained considerable success, and trustworthy decisions have been made using these methods9. Applying XAI in cybersecurity can be a double-edged sword: it can significantly enhance cybersecurity methods and help address novel threats to AI applications, but its explanations are also available to attackers, who can exploit them to mount severe attacks10. AI methods, including deep learning (DL) and machine learning (ML) methodologies, achieve impressive performance on benchmark datasets in various applications of the cybersecurity field.

This paper presents the Explainable Artificial Intelligence for Cyber Resilience Using a Hybrid Deep Learning and Optimization Algorithm (XAICR-HDLOA) approach to improve cyber threat detection and interpretation in IoT environments. Min-max normalization is first applied to standardize feature scales, followed by the Bald Eagle Search (BES) model for selecting key features. The hybrid Convolutional Neural Network-Bidirectional Gated Recurrent Unit (CNN-BiGRU) model is then employed for cyberattack classification, and the Improved Chimp Optimizer Algorithm (IChoA) is used for hyperparameter tuning. Finally, SHapley Additive exPlanations (SHAP) is applied to improve model interpretability, increasing trust and reliability in cybersecurity. The XAICR-HDLOA approach is evaluated on the Edge-IIoT and Bot-IoT datasets. The key contributions of the XAICR-HDLOA approach are listed below.

  • The XAICR-HDLOA approach applies min-max normalization to standardize feature scales, ensuring consistent data processing and improving the model’s capability for handling diverse input data. This technique enhances the accuracy of subsequent algorithms by removing biases caused by varying feature ranges. It contributes to the overall efficiency of the model, making it more reliable in IoT environments.

  • The XAICR-HDLOA approach effectively employs the BES model to choose the most relevant features, optimizing the classification process. Detecting key features mitigates dimensionality and improves the model’s performance. This approach confirms that only the most informative data is used, improving the accuracy and efficiency of threat detection in IoT systems.

  • The XAICR-HDLOA approach implements the hybrid CNN-BiGRU classification approach, utilizing the power of CNNs for feature extraction and BiGRU units for capturing temporal dependencies. This integration improves the model’s ability to detect complex patterns and identify threats. It significantly enhances the robustness and precision of the IDS in dynamic IoT environments.

  • The XAICR-HDLOA approach employs the IChoA technique to fine-tune model parameters, optimizing the search for optimal solutions. Adjusting hyperparameters more effectively improves the overall performance and accuracy of the model. This results in more precise predictions and enhanced efficiency in real-time threat detection within IoT systems.

  • The novelty of the XAICR-HDLOA approach is in its unique combination of advanced techniques, incorporating a hybrid CNN-BiGRU model with BES for efficient feature selection and IChoA for optimization. This approach is specifically designed to improve intrusion detection in resource-constrained IoT environments. By seamlessly combining these methods, the model balances performance, accuracy, and computational efficiency, making it appropriate for real-time IoT applications.

Comprehensive literature review on cybersecurity in IoT and IIoT networks

Birahim et al.11 developed an IDS that uses PSO and an ensemble ML method combining DT, KNN, and RF to improve the precision and dependability of intrusion detection in WSNs. The proposed method achieves a substantial improvement by integrating SMOTE-Tomek to balance the data and uses XAI models such as SHAP and LIME. Narkedimilli et al.12 proposed a lightweight and scalable curriculum learning framework built with XAI models, including LIME. The method employs a neural network (NN) structure at each phase of curriculum learning. Robustness is achieved through staged learning, in which the technique iteratively improves itself by removing lower-relevance features and enhancing performance. The workflow includes edge-optimized pruning and quantization to ensure portability to edge-IoT devices. An ensemble method integrating random forest (RF), XGBoost, and the staged learning base further improves generalization. Naif Alatawi13 proposed an IDS framework that incorporates ML techniques including transfer learning (TL), feature engineering, and ensemble learning to improve detection accuracy, interpretability, and adaptability. The ensemble learning module integrates different classifiers, for instance DT, KNN, and LR, exploiting their individual strengths to increase detection rates. For TL, models are pre-trained on cybersecurity datasets and fine-tuned on the combined dataset. Izuazu et al.14 proposed the eXplainable cyber-threat detection framework (XC-TDF). The model improves robustness against adversarial threats and noise by applying adversarial training and regularisation, respectively, and improves transparency through an XAI component.
Patel et al.15 developed X-NET, an XAI-based system data security method for medical care 4.0 applications. For comparison purposes, five diverse kinds of conventional feature extraction models are employed together with logistic regression (LR), naive bayes (NB), and Insight. Using X-AI models like SHAP and LIME has substantially boosted X-NET’s performance and dependability. Baral et al.16 developed a novel, wide-ranging structure for real-world IoT threat recognition and response that utilizes XAI, LLM, and ML. Combining XAI models like LIME and SHAP with a model-independent structure guarantees this structure’s flexibility through several ML models. Furthermore, integrating LLM improves the accessibility and interpretability of recognition decisions and human-understandable explanations of identified attacks. Tripathy et al.17 projected a structure that depends on XAI for safeguarding user IoT applications in smart cities. At the initial stage of protocol execution, the participants switch authenticated data over the blockchain (BC)-based AKA process. Simultaneously, this model implements the Python-based SHAP structure to interpret and explain the core aspects guiding decision-making.

In18, a BC-enabled XAI is projected to improve the decision-making ability of cyber-attack recognition in the context of Smart Medical care Methods. Initially, this model utilizes BC to store and validate data among different cloud vendors by applying a Clique Proof-of-Authority (C-PoA) consensus. Then, a new DL-based threat-hunting method is developed by relating Parallel Stacked LSTM (PSLSTM) techniques with a multi-head attention mechanism for enhanced threat recognition. Zeghida et al.19 developed ML and DL models such as RF, support vector machine (SVM), convolutional neural network (CNN), CNN with long short-term memory (CNN-LSTM) for detecting various cyberattacks. Also, an explainable AI framework using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) is utilized to provide transparent and interpretable intrusion detection. Nandanwar and Katarya20 presented a robust DL technique named AttackNet, which utilizes an adaptive CNN and gated recurrent unit (CNN-GRU) methodology for accurate detection and classification of botnet attacks. Reddy et al.21 developed an advanced intrusion prevention framework using graph-residual adversarial network (GRANet) approach incorporated with hawk-bee stride finder (HBSF) optimization. Nandanwar and Katarya22 proposed a robust CNN Neural Network–bidirectional long short-term memory (CNN-BiLSTM) with TL-BiLSTM. Nandanwar and Katarya23 presented a robust DL-based IDS named cyber-sentinet for CPS in industrial IoT environments. The model integrates SHAP to improve interpretability while accurately detecting diverse cyber-attacks. Kauhsik et al.24 proposed a methodology by using ML and DL techniques. 
Nandanwar and Katarya25 developed a secure and efficient healthcare data management system by integrating BC technology, smart contracts, non-interactive zero-knowledge proof (NIZKP), and inter-planetary file system (IPFS) with an IDS to ensure data confidentiality, integrity, and privacy in IoT-enabled healthcare environments. Attique et al.26 developed a transparent and data-efficient IDS for IIoT environments by using BiLSTM and a self-adaptive attention mechanism (S-AAM). Additionally, SHAP from explainable artificial intelligence (XAI) are integrated to improve model interpretability and trustworthiness. Bajpai and Patankar27 presented a self-configuring intrusion detection framework for BC networks using Adaptive Goal Target Optimization with Deep BiLSTM (AGLSTM). The model also incorporates rider-cheetah hybrid optimization (RCHO) and synthetic minority oversampling technique (SMOTE) to improve feature learning and data balance. Elsaid et al.28 proposed an optimized IDS by integrating grey wolf optimization (GWO) with DL models such as GRU and LSTM (GRU-GWO) and (LSTM-GWO).

Summary of related works with identification of research gaps and limitations

Table 1 summarises the existing studies on cyberthreat detection.

Table 1 Summary of the BC-enabled XAI framework for cyber threat detection.

The existing studies rely on specific, often limited, datasets, which may not fully capture the diverse and growing nature of IoT threats. Several models also face scalability challenges, specifically when applied to resource-constrained devices in real-time IoT environments. Although XAI models are utilized, transparency and interpretability remain constrained, particularly with complex ensemble or DL models, which limits trust and practical deployment. Furthermore, a research gap exists in addressing the robustness of these systems against adversarial attacks and noise. The integration of BC for data validation is promising but still lacks comprehensive testing in large-scale, heterogeneous IoT networks. Moreover, many approaches concentrate on detection accuracy and neglect the optimization of response times and real-time adaptability in dynamic IoT systems.

The proposed methodology

This paper proposes the XAICR-HDLOA approach. Its main objective is to enhance cyber threat detection and interpretation in IoT environments. To accomplish this, the XAICR-HDLOA approach comprises data normalization, BES-based feature selection, a hybrid CNN-BiGRU classifier, and parameter tuning using IChoA. Figure 1 represents the entire procedure of the XAICR-HDLOA approach.

Fig. 1 Overall process of the XAICR-HDLOA approach.

Data normalization: min-max normalization

Initially, the XAICR-HDLOA approach applies min-max normalization to standardize feature scales29. This technique is chosen for its simplicity and efficiency in scaling features to [0, 1]. It ensures that all features contribute comparably to the model, preventing dominance by those with larger scales. Compared to other techniques, such as Z-score normalization, it does not assume a normal distribution, making it more appropriate for diverse datasets, particularly those with skewed or non-normal distributions. Furthermore, it is computationally efficient and easy to implement, which suits real-time applications in IoT systems. Because the rescaling is linear, it preserves the relationships between the original data values. It thus balances simplicity and effectiveness, specifically in environments with resource constraints.

Min-max scaling was applied to ensure data consistency and uniformity across features. This method rescales the data so that the minimum and maximum values of each feature map to 0 and 1, respectively, with every intermediate value scaled linearly within this range. The primary purpose of normalization is to prevent features with large ranges from being disproportionately amplified, which might badly affect the learning procedure. Moreover, using normalized data is vital in NNs as it helps reduce error propagation. The mathematical formulation is given in Eq. (1).

$$\:{x}_{scaled}=\frac{{x}_{real}-\text{m}\text{i}\text{n}\left(x\right)}{\text{m}\text{a}\text{x}\left(x\right)-\text{m}\text{i}\text{n}\left(x\right)}$$
(1)
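As a minimal sketch of Eq. (1), the snippet below rescales each feature column of a small table to [0, 1]. The function names and the sample records are illustrative assumptions, not part of the original pipeline, and the handling of constant columns (where Eq. (1) would divide by zero) is a convention chosen here.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1] following Eq. (1)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        # Constant feature: Eq. (1) is undefined (0/0); map to 0.0 by convention.
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def min_max_scale_rows(rows):
    """Apply min_max_scale to every feature (column) of a row-major table."""
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Hypothetical traffic records: [packet_count, duration_seconds]
records = [[10, 0.5], [20, 1.5], [30, 2.5]]
scaled = min_max_scale_rows(records)
```

In a train/test setting, the minimum and maximum would normally be computed on the training split only and reused for unseen data, so that the scaling does not leak information from the test set.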

Dimensionality reduction: BES approach

For dimensionality reduction, the BES model is employed to select the most relevant features30. This model is chosen for its effectiveness in selecting the most relevant features while maintaining high classification performance. Unlike conventional methods such as PCA, which reduce dimensions based purely on variance, BES uses a nature-inspired optimization technique to identify the key features that directly affect the model’s predictive capability. The technique works well on complex datasets and avoids losing crucial information during reduction. It also adapts well to nonlinear relationships and reduces overfitting by concentrating on the features most relevant to the task. Its ability to handle large, high-dimensional datasets makes it suitable for resource-constrained IoT environments, where computational efficiency is crucial. Overall, BES balances accuracy and efficiency for the IDS. Figure 2 illustrates the BES technique.

Fig. 2 Workflow of the BES model.

BES is a nature-inspired, swarm-based stochastic optimizer. It is derived from the cooperative hunting strategy of bald eagles. The population of bald eagles coordinates its attack on prey in three distinct stages: selecting the search region, searching for the prey, and diving to catch the prey.

Choose search scope

In this phase, the eagles share information to keep the swarm cohesive. The behaviour of the \(i\)th individual in the search space is expressed below:

$$p_{i}^{(t+1)_{s}}=p_{*}^{t}+\alpha\cdot r_{1}\cdot\left(p_{mean}^{t}-p_{i}^{t}\right)$$
(2)

Here, \(\alpha\in\left(1.5,2\right)\), \(r_{1}\in\left[0,1\right]\), \(p_{mean}^{t}=\frac{1}{N}\sum_{i=1}^{N}p_{i}^{t}\), \(p_{*}^{t}\) signifies the best individual, \(N\) represents the size of the swarm, and \(p_{i}^{t}\) and \(p_{i}^{(t+1)_{s}}\) denote the \(i\)th individual before and after the search-region selection. Applying Eq. (2) to every individual yields the population \(p^{(t+1)_{s}}=[p_{1}^{(t+1)_{s}},p_{2}^{(t+1)_{s}},\dots,p_{N}^{(t+1)_{s}}]\).

Pick quarry

Here, the eagles search for prey within the selected region by moving along a spiral, according to the following rules:

$$\theta_{i}^{tz}=a\cdot\pi\cdot rand,\qquad r_{i}^{tz}=\theta_{i}^{tz}+R\cdot rand$$
$$xr_{i}^{tz}=r_{i}^{tz}\cdot\sin\left(\theta_{i}^{tz}\right),\qquad yr_{i}^{tz}=r_{i}^{tz}\cdot\cos\left(\theta_{i}^{tz}\right)$$
(3)
$$x_{i}^{tz}=\frac{xr_{i}^{tz}}{\underset{1\le i\le N}{\max}\left|xr_{i}^{tz}\right|},\qquad y_{i}^{tz}=\frac{yr_{i}^{tz}}{\underset{1\le i\le N}{\max}\left|yr_{i}^{tz}\right|}$$
$$p_{i}^{(t+1)_{z}}=p_{i}^{(t+1)_{s}}+x_{i}^{tz}\cdot\left(p_{i}^{(t+1)_{s}}-p_{mean}^{t}\right)+y_{i}^{tz}\cdot\left(p_{i}^{(t+1)_{s}}-p_{i+1}^{(t+1)_{s}}\right)$$
(4)

Here, \(a\in\left(0,5\right)\) and \(R\in\left(0.5,2\right)\) control the shape of the spiral, \(p_{i+1}^{(t+1)_{s}}\) denotes the bald eagle at the \((i+1)\)th position in the population, and \(p_{i}^{(t+1)_{z}}\) signifies the individual updated by Eq. (4).

Dive to catch prey

Once the bald eagles have located the best position of the prey, each eagle updates its position by following a hyperbolic path.

$$\theta_{i}^{t}=a\cdot\pi\cdot rand,\qquad r_{i}^{t}=\theta_{i}^{t}$$
$$xr_{i}^{t}=r_{i}^{t}\cdot\sinh\left(\theta_{i}^{t}\right),\qquad yr_{i}^{t}=r_{i}^{t}\cdot\cosh\left(\theta_{i}^{t}\right)$$
(5)
$$xl_{i}^{t}=\frac{xr_{i}^{t}}{\underset{1\le i\le N}{\max}\left|xr_{i}^{t}\right|},\qquad yl_{i}^{t}=\frac{yr_{i}^{t}}{\underset{1\le i\le N}{\max}\left|yr_{i}^{t}\right|}$$
$$p_{i}^{t+1}=rand\cdot p_{*}^{tm}+xl_{i}^{t}\cdot\left(p_{i}^{(t+1)_{z}}-c_{1}\cdot p_{mean}^{t}\right)+yl_{i}^{t}\cdot\left(p_{i}^{(t+1)_{z}}-c_{2}\cdot p_{*}^{tm}\right)$$
(6)

Here, \(\sinh(\cdot)\) and \(\cosh(\cdot)\) represent the hyperbolic functions, \(p_{*}^{tm}\) is the best solution, and \(c_{1},c_{2}\in\left(1,2\right)\). BES completes the dispersion and aggregation updates through Eqs. (2)-(6) and converges toward an optimum solution over successive iterations.
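The three BES stages in Eqs. (2)-(6) can be sketched as one iteration of a continuous optimizer. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the toy sphere objective, the parameter values (chosen inside the ranges given above), the per-dimension random draw in Eq. (2), the wrap-around choice of the "next" eagle, and the greedy acceptance at the end are all assumptions.

```python
import math
import random


def sphere(p):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in p)


def mean_position(pop):
    n = len(pop)
    return [sum(p[d] for p in pop) / n for d in range(len(pop[0]))]


def normalise(values):
    """Divide by the largest absolute value, as in Eqs. (4) and (6)."""
    top = max(abs(v) for v in values) or 1.0
    return [v / top for v in values]


def spiral_coefficients(n, a, R, rng, hyperbolic):
    """(sin, cos) coefficients for Eq. (3) or (sinh, cosh) ones for Eq. (5)."""
    theta = [a * math.pi * rng.random() for _ in range(n)]
    if hyperbolic:
        radius, fx, fy = theta, math.sinh, math.cosh      # Eq. (5): r = theta
    else:
        radius = [th + R * rng.random() for th in theta]  # Eq. (3)
        fx, fy = math.sin, math.cos
    xs = normalise([r * fx(th) for r, th in zip(radius, theta)])
    ys = normalise([r * fy(th) for r, th in zip(radius, theta)])
    return xs, ys


def bes_iteration(pop, f, rng, alpha=1.8, a=4.0, R=1.0, c1=1.5, c2=1.5):
    n, dims = len(pop), range(len(pop[0]))
    best, mean = min(pop, key=f), mean_position(pop)

    # Stage 1, Eq. (2): select the search region around the best eagle.
    sel = [[best[d] + alpha * rng.random() * (mean[d] - p[d]) for d in dims]
           for p in pop]

    # Stage 2, Eq. (4): spiral search inside the selected region.
    xs, ys = spiral_coefficients(n, a, R, rng, hyperbolic=False)
    srch = []
    for i, p in enumerate(sel):
        nxt = sel[(i + 1) % n]  # wrap-around neighbour (assumption)
        srch.append([p[d] + xs[i] * (p[d] - mean[d]) + ys[i] * (p[d] - nxt[d])
                     for d in dims])

    # Stage 3, Eq. (6): swoop toward the best solution on a hyperbolic path.
    xl, yl = spiral_coefficients(n, a, R, rng, hyperbolic=True)
    new = []
    for i, p in enumerate(srch):
        r = rng.random()
        new.append([r * best[d] + xl[i] * (p[d] - c1 * mean[d])
                    + yl[i] * (p[d] - c2 * best[d]) for d in dims])

    # Greedy acceptance (assumption): keep a move only if it lowers the objective.
    return [q if f(q) < f(p) else p for p, q in zip(pop, new)]


rng = random.Random(0)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
start_best = min(sphere(p) for p in population)
for _ in range(30):
    population = bes_iteration(population, sphere, rng)
end_best = min(sphere(p) for p in population)
```

For feature selection, each continuous position would additionally have to be mapped to a binary mask over the features (for example by thresholding); that mapping is not specified in the text above and is left out here.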

In the BES approach, the fitness function (FF) is designed to balance the number of selected features (to be minimized) against the classification accuracy (to be maximized) obtained using these features. Equation (7) defines the FF used to evaluate a solution.

$$\:Fitness=\alpha\:{\gamma\:}_{R}\left(D\right)+\beta\:\frac{\left|R\right|}{\left|C\right|}$$
(7)

Here, \(\gamma_{R}\left(D\right)\) signifies the classification error rate of a given classifier, \(\left|R\right|\) refers to the cardinality of the chosen subset, and \(\left|C\right|\) represents the total number of features in the dataset. \(\alpha\) and \(\beta\) are two weights that control the relative importance of classification quality and subset length, with \(\alpha\in\left[0,1\right]\) and \(\beta=1-\alpha\).
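The fitness in Eq. (7) can be written as a small function. The sketch below is illustrative only: the function name is hypothetical, and the default weight of 0.99 is an assumption (the text only states that \(\alpha\in[0,1]\) and \(\beta=1-\alpha\)).

```python
def feature_selection_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Eq. (7): alpha * error_rate + (1 - alpha) * |R| / |C| (lower is better).

    error_rate is expected as a fraction in [0, 1]; alpha=0.99 is an assumed default.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)


# Two hypothetical subsets with the same error rate: the smaller one scores lower.
small_subset = feature_selection_fitness(0.05, n_selected=10, n_total=60)
large_subset = feature_selection_fitness(0.05, n_selected=30, n_total=60)
```

With a weight close to 1, accuracy dominates and the subset-size term mainly breaks ties between subsets of similar accuracy.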

Hybrid classification process: CNN-BiGRU model

Moreover, the hybrid CNN-BiGRU model is employed for cyberattack classification31. This model was chosen for its ability to capture both spatial and temporal dependencies in data efficiently. BiGRU captures sequential patterns by processing data in both the forward and backward directions, while CNNs excel at extracting features from raw data and detecting intricate patterns. This integration allows the model to handle spatial patterns (from CNNs) and temporal relationships (from BiGRUs) effectively, making it highly appropriate for intrusion detection in dynamic IoT environments. Unlike conventional classification models, this hybrid approach can better adapt to real-time, evolving threats. It also offers higher accuracy in detecting complex attack patterns, improving robustness in anomaly detection. The model’s ability to process diverse data types makes it more effective than single-method models, supporting high performance even in resource-constrained settings. Figure 3 represents the architecture of the CNN-BiGRU technique.

Fig. 3 Architecture of CNN-BiGRU model.

A 1D-CNN is used to extract local features from the input, and the Bi-GRU is used to model the relationships between them. First, the feature subset obtained after feature selection is used as input. Then, the CNN extracts the relationships among features, combining multiple attributes to learn the coupling between them. Next, the extracted features are fed to the Bi-GRU to capture sequential information. Lastly, the forward and backward outputs are combined to produce the final output. CNN-BiGRU can thus efficiently extract both global and local feature information.

The main aim of the CNN is to extract salient features from input data. A classic CNN structure includes several layer types, such as convolutional, pooling, and fully connected (FC) layers. The convolutional layer is fundamental to feature extraction: its convolution kernels extract relevant features from the input data. The abstraction level of the extracted features increases with the number of convolution kernels employed. The FC layers flatten the pooled outputs into a 1D vector, enabling more convenient processing. However, CNNs are ineffective at capturing temporal dependencies. It is therefore important to incorporate recurrent NN (RNN) methods and combine the CNN with the Bi-GRU model to enhance performance.
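To make the convolution, pooling, and flattening steps concrete, here is a minimal sketch in plain Python. The kernel values, the ReLU activation, the pooling size, and the input vector are illustrative assumptions; the paper does not state the layer configuration in this section.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1D convolution as used in deep learning (technically cross-correlation)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing remainder shorter than `size` is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical normalized feature vector and two hand-picked kernels.
features = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
kernels = [[1.0, -1.0], [0.5, 0.5]]

feature_maps = [max_pool1d(relu(conv1d(features, k))) for k in kernels]
# Flatten the pooled maps into the 1D vector that a dense layer would consume.
flattened = [v for fmap in feature_maps for v in fmap]
```

Each kernel yields one feature map, so adding kernels widens the representation passed on to the recurrent part of the model.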

GRU is a kind of RNN commonly used for handling sequential data that tackles the long-term dependency problem. Compared with conventional RNNs, GRU introduces a gating mechanism that allows it to learn, forget, or retain information efficiently. GRU includes two gates, reset and update. The update gate controls how much historical information is retained. In contrast, the reset gate determines how much past information is disregarded, enabling the handling of short-term dependencies. The mathematical formulations for the GRU unit are given in Eqs. (8)-(11):

$$\:{r}_{t}=\sigma\:\left({W}_{r}{x}_{t}+{U}_{r}{h}_{t-1}\right)$$
(8)
$$\:{z}_{t}=\sigma\:\left({W}_{z}{x}_{t}+{U}_{z}{h}_{t-1}\right)$$
(9)
$$\:{\stackrel{\sim}{h}}_{t}=tanh\left({W}_{h}{x}_{t}+{U}_{h}\left({r}_{t}\odot\:{h}_{t-1}\right)\right)$$
(10)
$$\:{h}_{t}={z}_{t}\odot\:{\stackrel{\sim}{h}}_{t}+\left(1-{z}_{t}\right)\odot\:{h}_{t-1}$$
(11)

Here, \(x_{t}\) signifies the input to the hidden layer (HL) at time \(t\); \(h_{t}\) is the output at time \(t\), and \(h_{t-1}\) is the output at the previous time step; \(z_{t}\) and \(r_{t}\) signify the update gate and reset gate, respectively; \(W_{r}\) and \(U_{r}\) represent the weight coefficients of the reset gate; \(W_{z}\) and \(U_{z}\) represent the weight coefficients of the update gate; \(\tilde{h}_{t}\) indicates the candidate memory at time \(t\), with weight coefficients \(W_{h}\) and \(U_{h}\); \(\sigma\) denotes the sigmoid activation function; and \(\odot\) denotes element-wise multiplication. As a unidirectional RNN structure, GRU propagates states only in the forward direction. Bi-GRU, however, contains two GRUs running in opposite directions, allowing it to capture long-term dependencies and global context more fully.
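Equations (8)-(11) translate directly into a small forward pass. The sketch below is a plain-Python illustration, not the trained model: the weights are random, biases are omitted because the equations above omit them, and merging the two directions by concatenation is an assumption (the text only says the forward and backward outputs are combined).

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def gru_step(x, h_prev, P):
    """One GRU update following Eqs. (8)-(11); P maps weight names to matrices."""
    r = [sigmoid(a + b) for a, b in zip(matvec(P["Wr"], x), matvec(P["Ur"], h_prev))]  # Eq. (8)
    z = [sigmoid(a + b) for a, b in zip(matvec(P["Wz"], x), matvec(P["Uz"], h_prev))]  # Eq. (9)
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(P["Wh"], x), matvec(P["Uh"], gated))]               # Eq. (10)
    return [zi * ci + (1.0 - zi) * hi for zi, ci, hi in zip(z, cand, h_prev)]          # Eq. (11)


def run_gru(sequence, P, hidden):
    h, states = [0.0] * hidden, []
    for x in sequence:
        h = gru_step(x, h, P)
        states.append(h)
    return states


def bigru(sequence, P_fwd, P_bwd, hidden):
    """Run one GRU left-to-right and one right-to-left, then concatenate per step."""
    fwd = run_gru(sequence, P_fwd, hidden)
    bwd = run_gru(sequence[::-1], P_bwd, hidden)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def random_params(n_in, hidden, rng):
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"Wr": mat(hidden, n_in), "Ur": mat(hidden, hidden),
            "Wz": mat(hidden, n_in), "Uz": mat(hidden, hidden),
            "Wh": mat(hidden, n_in), "Uh": mat(hidden, hidden)}


rng = random.Random(0)
n_in, hidden = 3, 2
sequence = [[rng.random() for _ in range(n_in)] for _ in range(4)]  # 4 time steps
outputs = bigru(sequence, random_params(n_in, hidden, rng),
                random_params(n_in, hidden, rng), hidden)
```

Each output step has `2 * hidden` values, the first half summarising the past and the second half the future of that position in the sequence.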

Parameter tuning: IChoA technique

To fine-tune the hyperparameter values of CNN-BiGRU model, the IChoA is utilized32. This model is chosen for its efficient search for optimal hyperparameters, giving a robust solution for complex optimization problems. Unlike conventional gradient-based methods, IChoA is a nature-inspired, population-based approach that doesn’t depend on gradient information, making it appropriate for models with non-differentiable or highly non-convex objective functions. It effectually explores the solution space by replicating the social behaviour of chimpanzees, ensuring better global optimization. Compared to other optimization techniques, such as grid search or genetic algorithms, IChoA presents faster convergence and avoids local optima, resulting in more accurate and efficient model performance. Its adaptability to various types of data and model structures makes it an ideal choice for improving the performance of IDS in resource-constrained IoT environments. Overall, IChoA improves both the accuracy and computational efficiency of the model. Figure 4 demonstrates the working flow of the IChoA approach.

Fig. 4 Working flow of the IChoA technique.

The ChoA method is inspired by the hunting behaviour of chimps. The algorithm initializes the chimps’ positions and picks the four chimps with the lowest fitness values, namely the Attacker, Barrier, Chaser, and Driver, which represent the four best solutions: \(\left(X_{attacker},\,X_{barrier},\,X_{chaser},\,X_{driver}\right)\).

In the chasing and driving phase, each chimp updates its position according to the position of the prey. It is mathematically formulated below:

$$\left\{\begin{array}{l}d=\left|c\cdot X_{prey}(t)-m\cdot X_{chimp}(t)\right|\\ X_{chimp}(t+1)=X_{prey}(t)-a\cdot d\\ a=f(2r_{1}-1)\\ c=2r_{2}\\ f=2\left(1-\frac{t}{t_{\max}}\right)\end{array}\right.$$
(12)

Here, \(t\) and \(t_{\max}\) are the current and maximum iteration counts, respectively. \(d\) signifies the distance between the prey and the chimp. \(X_{prey}\) and \(X_{chimp}\) denote the positions of the prey and the chimp, respectively. \(r_{1}\) and \(r_{2}\) represent randomly generated numbers within the interval [0, 1]. \(f\) refers to a convergence factor that decreases linearly from 2 to 0 over the iterations. \(a\) and \(c\) denote the coefficient vectors, and \(m\) refers to the chaotic vector.

In the attack stage, the chimps locate the prey and encircle it under the guidance of the Attacker, Barrier, Chaser, and Driver. They then launch a coordinated attack on the prey. It is mathematically given below:

$$\left\{\begin{array}{l}d_{attacker}=\left|c_{1}\cdot X_{attacker}-m_{1}\cdot X\right|\\ d_{barrier}=\left|c_{2}\cdot X_{barrier}-m_{2}\cdot X\right|\\ d_{chaser}=\left|c_{3}\cdot X_{chaser}-m_{3}\cdot X\right|\\ d_{driver}=\left|c_{4}\cdot X_{driver}-m_{4}\cdot X\right|\end{array}\right.$$
(13)

Here, \(X\) denotes the current position vector of a chimp, and \(X_{1}\), \(X_{2}\), \(X_{3}\), and \(X_{4}\) represent its updated position vectors relative to the Attacker, Barrier, Chaser, and Driver; following the update rule in Eq. (12), \(X_{1}=X_{attacker}-a_{1}\cdot d_{attacker}\), and likewise for \(X_{2}\), \(X_{3}\), and \(X_{4}\). The positions of the remaining chimps are determined jointly by the four best chimps, each contributing equally. \(X(t+1)\) signifies the updated position vector, and the formulation is given below:

$$\:X\left(t+1\right)=\frac{{X}_{1}+{X}_{2}+{X}_{3}+{X}_{4}}{4}$$
(14)
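The position update in Eqs. (12)-(14) can be sketched as follows. This is a simplified illustration: the chaotic vector \(m\) is fixed to 1.0, the random coefficients are drawn per dimension, and the leader positions are hypothetical; none of these choices is taken from the paper.

```python
import random


def convergence_factor(t, t_max):
    """Linear factor f of Eq. (12), decreasing from 2 to 0."""
    return 2.0 * (1.0 - t / t_max)


def move_towards(leader, chimp, f, rng, m=1.0):
    """X_k = X_leader - a * |c * X_leader - m * X| per dimension (Eqs. (12)-(13)).

    m stands in for the chaotic vector and is fixed to 1.0 here (assumption).
    """
    position = []
    for lead_d, x_d in zip(leader, chimp):
        a = f * (2.0 * rng.random() - 1.0)
        c = 2.0 * rng.random()
        d = abs(c * lead_d - m * x_d)
        position.append(lead_d - a * d)
    return position


def choa_update(chimp, leaders, t, t_max, rng):
    """Eq. (14): average the moves suggested by the four leading chimps."""
    f = convergence_factor(t, t_max)
    candidates = [move_towards(leader, chimp, f, rng) for leader in leaders]
    return [sum(values) / len(values) for values in zip(*candidates)]


# Hypothetical attacker, barrier, chaser, and driver positions in 2D.
leaders = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
rng = random.Random(1)
early = choa_update([10.0, -10.0], leaders, t=0, t_max=50, rng=rng)
# At t = t_max the factor f is 0, so the update is the plain average of the leaders.
final = choa_update([10.0, -10.0], leaders, t=50, t_max=50, rng=rng)
```

Early in the run the factor is large, so a chimp can land far from the leaders; by the last iteration the step size has shrunk to zero and the chimp settles at their average.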

Generally, ChoA suffers from problems such as a tendency to become trapped in local optima and slow convergence. In IChoA, the population is initialized using Logistic mapping, and the position update model of the individual chimps is enhanced. Spiral functions are introduced into ChoA, improving the model’s performance by widening the search space and balancing global and local search. IChoA enhances ChoA in four different respects.

SPM chaotic map

ChoA initializes the population randomly, which might lead to an uneven population spread and affect the model’s performance. A chaotic map can efficiently produce a more widely spread initial population. Compared with a standard chaotic map, the SPM chaotic map delivers a population with stronger randomness and a more even distribution. It is mathematically computed below:

$$x\left(t+1\right)=\left\{\begin{array}{ll}\operatorname{mod}\left(\frac{x\left(t\right)}{\eta}+\mu\sin\left(\pi x\left(t\right)\right)+r,\,1\right), & 0\le x(t)<\eta\\ \operatorname{mod}\left(\frac{x\left(t\right)/\eta}{0.5-\eta}+\mu\sin\left(\pi x\left(t\right)\right)+r,\,1\right), & \eta\le x(t)<0.5\\ \operatorname{mod}\left(\frac{1-x\left(t\right)}{\eta}+\mu\sin\left(\pi\left(1-x\left(t\right)\right)\right)+r,\,1\right), & 0.5\le x(t)<1-\eta\\ \operatorname{mod}\left(\frac{1-x\left(t\right)}{\eta}+\mu\sin\left(\pi\left(1-x\left(t\right)\right)\right)+r,\,1\right), & 1-\eta\le x(t)<1\end{array}\right.$$
(15)

Here, \(r\) and \(x\left(t\right)\) take values between 0 and 1, with \(r\) generated randomly, and \(\eta\) and \(\mu\) are the control parameters of the map.

Nonlinear convergence factor

The linear convergence factor \(f\) does not efficiently balance local exploitation and global exploration during the position updates. A slowly declining convergence factor in the initial phases of the iteration lets the population explore for the global optimum more effectively, whereas a faster decline afterwards helps the technique search for local optimum solutions. The nonlinear convergence factor is defined as:

$$\:f=\text{c}\text{o}\text{s}\left(\pi\:\left(\frac{t}{{t}_{\text{m}\text{a}\text{x}}}\right)\right)+1$$
(16)
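A quick numerical comparison makes the difference between the linear factor of Eq. (12) and the cosine factor of Eq. (16) visible. The iteration budget of 100 and the sampled iterations are arbitrary choices for illustration.

```python
import math


def linear_factor(t, t_max):
    """Original ChoA factor from Eq. (12): falls linearly from 2 to 0."""
    return 2.0 * (1.0 - t / t_max)


def cosine_factor(t, t_max):
    """Nonlinear factor from Eq. (16): cos(pi * t / t_max) + 1."""
    return math.cos(math.pi * t / t_max) + 1.0


t_max = 100  # illustrative iteration budget
samples = [(t, round(linear_factor(t, t_max), 4), round(cosine_factor(t, t_max), 4))
           for t in (0, 25, 50, 75, 100)]
```

Both factors start at 2 and end at 0, but the cosine version stays above the linear one during the first half of the run and falls below it during the second half, which shifts effort from exploration early on to exploitation later.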

\(t\)-distribution and opposition-based learning perturbation strategy

The attacker’s position strongly affects the accuracy and efficiency of the optimizer. However, over-reliance on this position can cause the other chimps to cluster around it, hindering the search of other regions of the search space. The \(t\)-distribution and opposition-based learning are therefore used to perturb the attacker’s position and avoid convergence stagnation. The mathematical formulation is given below:

$$\:{X}_{op}\left(t\right)=ub+r\oplus\:\left(lb-{X}_{attacker}\left(t\right)\right)$$
(17)
$$X_{t}\left(t\right)=\left\{\begin{array}{ll}X_{attacker}\left(t\right)+X_{attacker}\left(t\right)\cdot t\_dis(t+1), & t\le\frac{t_{max}}{2}\\ X_{attacker}\left(t\right)+X_{attacker}\left(t\right)\cdot t\_dis\left(t^{2}\right), & t>\frac{t_{max}}{2}\end{array}\right.$$

Here, \(X_{op}\) and \(X_{t}\) signify the candidate positions obtained by perturbing the attacker, \(ub\) and \(lb\) are the upper and lower bounds of the search space, \(r\) refers to a randomly generated vector in [0, 1], and \(f\left(x\right)\) indicates the fitness value of position \(x\). The function \(t\_dis\) generates a random number that follows a \(t\)-distribution with the given degrees of freedom. In the early phases, when the degrees of freedom are few, the \(t\)-distribution resembles a Cauchy distribution, and its more widely spread random numbers benefit global exploration. In later phases, the degrees of freedom increase more quickly, so the \(t\)-distribution approaches the Gaussian distribution, with randomly generated values concentrated around the mean. If the fitness value of the selected position is lower than the attacker’s, it is a better solution and replaces the attacker. It is mathematically computed below:

$$\:{X}_{best}\left(t\right)=\left\{\begin{array}{c}{X}_{op}\left(t\right)\:\:\:,\:f\left({X}_{op}\left(t\right)\right)<f\left({X}_{t}\left(t\right)\right)\\ \: {X}_{t}\left(t\right)\:\:\:,\:f\left({X}_{op}\left(t\right)\right)\ge \:f\left({X}_{t}\left(t\right)\right)\end{array}\right.$$
(18)
$$\:{X}_{attacker}\left(t\right)=\left\{\begin{array}{c}{X}_{best}\left(t\right)\:\:\:\:,\:f\left({X}_{best}\left(t\right)\right)< f\left({X}_{attacker}\left(t\right)\right)\\\:{X}_{attacker}\left(t\right)\:\:\:,\:f\left({X}_{best}\left(t\right)\right)\ge\:f\left({X}_{attacker}\left(t\right)\right)\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\end{array}\right.$$
(19)
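The perturbation and greedy selection described by Eqs. (17)–(19) can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: \(\oplus\) is read as an element-wise product, one \(t\)-distributed number is drawn per dimension, the second branch of the piecewise update is applied when \(t>{t}_{max}/2\), no boundary clipping is performed, and the names `t_dis`, `perturb_attacker`, and the sphere test function are hypothetical.

```python
import math
import random


def t_dis(df: float, rng: random.Random) -> float:
    """Draw one Student-t variate with `df` degrees of freedom.

    Uses the standard construction Z / sqrt(V / df), with Z ~ N(0, 1)
    and V ~ chi-square(df) = Gamma(shape=df/2, scale=2).
    """
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


def perturb_attacker(x_att, fitness, lb, ub, t, t_max, rng):
    """Return the attacker position after the perturbation step (minimization)."""
    dim = len(x_att)
    # Eq. (17): opposition-based candidate, r drawn element-wise from [0, 1]
    x_op = [ub[d] + rng.random() * (lb[d] - x_att[d]) for d in range(dim)]
    # t-distribution candidate: df = t + 1 in the first half, t^2 afterwards
    # (assumption: the second branch applies when t > t_max / 2)
    df = t + 1 if t <= t_max / 2 else t ** 2
    x_t = [x + x * t_dis(df, rng) for x in x_att]
    # Eq. (18): keep the better of the two candidates
    x_best = x_op if fitness(x_op) < fitness(x_t) else x_t
    # Eq. (19): replace the attacker only if the candidate improves on it
    return x_best if fitness(x_best) < fitness(x_att) else x_att


if __name__ == "__main__":
    rng = random.Random(0)

    def sphere(x):  # hypothetical test objective
        return sum(v * v for v in x)

    attacker = [3.0, -2.0, 1.5]
    lb, ub = [-5.0] * 3, [5.0] * 3
    for t in range(1, 51):
        attacker = perturb_attacker(attacker, sphere, lb, ub, t, 50, rng)
    print(sphere(attacker))
```

Because of the greedy rule in Eq. (19), the attacker's fitness can never get worse after a perturbation step, so the step only helps the search escape stagnation and never discards the current best position.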

The IChoA derives a fitness function (FF) to achieve improved classification performance. The FF returns a positive number that indicates the quality of a candidate solution. The classifier error rate, which is to be minimized, is used as the FF, as given in Eq. (20).

$$fitness\left({x}_{i}\right)=ClassifierErrorRate\left({x}_{i}\right)=\frac{\text{no. of misclassified samples}}{\text{total no. of samples}}\times 100$$
(20)
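As a small worked example of Eq. (20), the sketch below computes the classifier error rate from true and predicted labels. The function name `classifier_error_rate` and the toy labels are assumptions for illustration; in the actual pipeline the predictions would come from the CNN-BiGRU model trained with the hyperparameters encoded by a candidate solution.

```python
def classifier_error_rate(y_true, y_pred) -> float:
    """Eq. (20): percentage of misclassified samples (lower is better)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    misclassified = sum(1 for a, b in zip(y_true, y_pred) if a != b)
    return 100.0 * misclassified / len(y_true)


if __name__ == "__main__":
    # Toy labels: one of four samples is misclassified
    print(classifier_error_rate([0, 1, 2, 1], [0, 1, 1, 1]))
```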

XAI process: SHAP

Finally, SHAP is integrated to enhance interpretability, offering insights into model decisions for improved trust and reliability in cybersecurity. SHAP is an effective method for explaining the predictions of ML models33. SHAP values are based on Shapley values from game theory, which fairly allocate each participant’s contribution to the joint outcome. SHAP values have several key properties, such as efficiency, additivity, and symmetry. These properties ensure that the feature contributions sum to the difference between the predicted value and the mean value, and that the attributions for a combination of models equal the combined attributions of the individual models. The mathematical formulation of the SHAP values is given as follows:

$$\:{y}_{i}={y}_{base}+f\left({x}_{i1}\right)+f\left({x}_{i2}\right)+\cdots\:+f\left({x}_{ij}\right)$$
(21)

Here, \(\:{y}_{base}\) denotes the average value of the target variable, and \(\:f\left({x}_{ij}\right)\) denotes the SHAP value of \(\:{x}_{ij}\). A SHAP value is computed by multiplying the marginal contribution of each feature by the corresponding weight and summing the results. The method considers the contribution of each feature in every sample and shows whether its effect is positive or negative. SHAP relies on game theory to interpret and analyze the model. It offers consistency and local accuracy, so it can effectively interpret the outcomes of ML prediction models.
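To make the additive decomposition in Eq. (21) concrete, the following sketch computes exact Shapley values for a single sample by enumerating all feature subsets. It is an illustration under stated assumptions rather than the SHAP library or the implementation used in this study: absent features are replaced by a single reference vector `baseline`, so \({y}_{base}\) is taken as the model output at that reference instead of an average over background data, and the toy model and the name `shapley_values` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of sample `x`.

    Features outside a coalition are set to their `baseline` value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # weighted marginal contribution of feature j to coalition s
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


if __name__ == "__main__":
    def toy_model(z):  # hypothetical model with one interaction term
        return 2.0 * z[0] + z[1] * z[2] - 1.0

    x, base = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
    phi = shapley_values(toy_model, x, base)
    # Local accuracy, Eq. (21): y = y_base + sum of the SHAP values
    print(phi, toy_model(base) + sum(phi), toy_model(x))
```

In this toy case the two interacting features receive equal attributions, which reflects the symmetry property, and the attributions plus the base value reproduce the model output, which reflects local accuracy.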

Result analysis and discussion

The experimental evaluation of the XAICR-HDLOA approach is conducted on the Edge-IIoT dataset34. The dataset comprises 56,000 samples across 12 classes, as defined in Table 2.

Dataset features overview with a comprehensive list of 62 features and a highlighted selection of 47 key attributes for analysis

The dataset comprises a total of 62 features, including frame.time, ip.src_host, ip.dst_host, arp.dst.proto_ipv4, arp.opcode, arp.hw.size, arp.src.proto_ipv4, icmp.checksum, icmp.seq_le, icmp.transmit_timestamp, icmp.unused, http.file_data, http.content_length, http.request.uri.query, http.request.method, http.referer, http.request.full_uri, http.request.version, http.response, http.tls_port, tcp.ack, tcp.ack_raw, tcp.checksum, tcp.connection.fin, tcp.connection.rst, tcp.connection.syn, tcp.connection.synack, tcp.dstport, tcp.flags, tcp.flags.ack, tcp.len, tcp.options, tcp.payload, tcp.seq, tcp.srcport, udp.port, udp.stream, udp.time_delta, dns.qry.name, dns.qry.name.len, dns.qry.qu, dns.qry.type, dns.retransmission, dns.retransmit_request, dns.retransmit_request_in, mqtt.conack.flags, mqtt.conflag.cleansess, mqtt.conflags, mqtt.hdrflags, mqtt.len, mqtt.msg_decoded_as, mqtt.msg, mqtt.msgtype, mqtt.proto_len, mqtt.protoname, mqtt.topic, mqtt.topic_len, mqtt.ver, mbtcp.len, mbtcp.trans_id, mbtcp.unit_id, Attack_label, and Attack_type.

From the overall features, 47 key features were selected for analysis, such as frame.time, arp.opcode, arp.hw.size, icmp.checksum, icmp.seq_le, icmp.transmit_timestamp, icmp.unused, http.content_length, http.request.method, http.referer, http.request.version, http.response, http.tls_port, tcp.ack, tcp.checksum, tcp.connection.fin, tcp.connection.rst, tcp.connection.syn, tcp.connection.synack, tcp.dstport, tcp.flags, tcp.len, tcp.seq, tcp.srcport, udp.port, udp.stream, udp.time_delta, dns.qry.name.len, dns.qry.qu, dns.qry.type, dns.retransmission, dns.retransmit_request, dns.retransmit_request_in, mqtt.conack.flags, mqtt.conflag.cleansess, mqtt.conflags, mqtt.hdrflags, mqtt.len, mqtt.msgtype, mqtt.proto_len, mqtt.protoname, mqtt.topic_len, mqtt.ver, mbtcp.len, mbtcp.trans_id, mbtcp.unit_id, and Attack_type. The chosen features represent critical protocol-specific parameters across layers, capturing diverse behavioral patterns crucial for distinguishing benign and malicious traffic. This multidimensional selection ensures comprehensive coverage of temporal, structural, and content-based indicators relevant to intrusion detection.

Table 2 Details on Edge-IIoT dataset.

Analysis and results highlighting key findings and performance evaluation

Figure 5 displays the classifier performance of the XAICR-HDLOA approach on the Edge-IIoT dataset. Figure 5a and b show the confusion matrices, which indicate accurate identification and classification of all 12 class labels on the 70% training set (TRASE) and the 30% testing set (TESSE). Figure 5c illustrates the precision-recall (PR) analysis, which shows strong performance across the 12 classes. Finally, Fig. 5d presents the ROC analysis, with high ROC values for the 12 distinct classes.

Fig. 5 Edge-IIoT dataset (a-b) confusion matrices and (c-d) curves of PR and ROC.

Table 3 and Fig. 6 present the overall attack detection results of the XAICR-HDLOA approach under the Edge-IIoT dataset with 70% TRASE and 30% TESSE. The results show that the XAICR-HDLOA approach correctly recognized the various class labels. On 70% TRASE, the XAICR-HDLOA approach attains an average \(\:acc{u}_{y}\), \(\:pre{c}_{n}\), \(\:rec{a}_{l}\), \(\:{F1}_{score}\), \(\:{AUC}_{score}\), and Kappa of 98.37%, 90.13%, 89.69%, 89.87%, 94.40%, and 94.47%, respectively. On 30% TESSE, the XAICR-HDLOA approach attains an average \(\:acc{u}_{y}\), \(\:pre{c}_{n}\), \(\:rec{a}_{l}\), \(\:{F1}_{score}\), \(\:{AUC}_{score}\), and Kappa of 98.41%, 90.42%, 90.01%, 90.19%, 94.57%, and 94.64%, respectively.
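For reference, averaged measures of this kind can be derived from a multi-class confusion matrix as in the sketch below. The averaging scheme is an assumption, since the text does not state it: each class is scored one-vs-rest and the per-class values are averaged with equal weight, and Cohen's kappa is computed from the full matrix. The function names and the small 2-class matrix are hypothetical and do not reproduce the reported results.

```python
def per_class_metrics(cm):
    """One-vs-rest (accuracy, precision, recall, F1) for every class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        acc = (tp + tn) / total
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        rows.append((acc, prec, rec, f1))
    return rows


def macro_average(rows):
    """Unweighted mean of each metric over the classes."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond chance between labels and predictions."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_obs = sum(cm[k][k] for k in range(n)) / total
    p_exp = sum(
        sum(cm[k]) * sum(cm[i][k] for i in range(n)) for k in range(n)
    ) / total ** 2
    return (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else 1.0


if __name__ == "__main__":
    toy_cm = [[8, 2], [1, 9]]  # hypothetical 2-class confusion matrix
    print(macro_average(per_class_metrics(toy_cm)), cohen_kappa(toy_cm))
```

With one-vs-rest scoring, the average accuracy counts the true negatives of every class, which is one reason it can sit well above the averaged precision and recall on a multi-class problem.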

Table 3 Attack detection outcome of the XAICR-HDLOA approach under the Edge-IIoT dataset.
Fig. 6 Average outcome of XAICR-HDLOA approach under the Edge-IIoT dataset.

Figure 7 depicts the training \(\:acc{u}_{y}\) (TRAAY) and validation \(\:acc{u}_{y}\) (VLAAY) performance of the XAICR-HDLOA approach on the Edge-IIoT dataset. The \(\:acc{u}_{y}\) values are computed over 0–25 epochs. The figure shows that the TRAAY and VLAAY values follow an increasing trend, indicating that the XAICR-HDLOA approach improves over successive epochs. In addition, the TRAAY and VLAAY values remain close throughout the epochs, which indicates limited overfitting and suggests stable predictions on unseen samples.

Fig. 7 \(\:Acc{u}_{y}\) curve of XAICR-HDLOA approach under the Edge-IIoT dataset.

Figure 8 shows the training loss (TRALO) and validation loss (VLALO) curves of the XAICR-HDLOA approach on the Edge-IIoT dataset. The loss values are computed over 0–25 epochs. The TRALO and VLALO values follow a decreasing trend, which indicates that the XAICR-HDLOA approach balances the tradeoff between data fitting and generalization. The steady decline in loss values further supports the performance of the XAICR-HDLOA approach and shows that its predictions are refined over time.

Fig. 8 Loss curve of XAICR-HDLOA approach under the Edge-IIoT dataset.

Table 4 and Fig. 9 report a detailed comparison of the XAICR-HDLOA approach with existing models. The results show that the SVM, RF, FFNN, MLP, and K-NN models yield the weakest detection results, with \(\:acc{u}_{y}\) of 92.31%, 92.91%, 93.60%, 94.73%, and 95.14%, respectively. The CNN model exhibits considerable performance, with \(\:acc{u}_{y}\) of 96.84%, \(\:pre{c}_{n}\) of 86.74%, \(\:rec{a}_{l}\) of 87.19%, and \(\:{F1}_{score}\) of 83.64%. Furthermore, the XGBoost model achieves reasonable outcomes, with \(\:acc{u}_{y}\) of 97.09%, \(\:pre{c}_{n}\) of 83.80%, \(\:rec{a}_{l}\) of 85.96%, and \(\:{F1}_{score}\) of 85.29%. Finally, the XAICR-HDLOA approach demonstrates superior performance, with an increased \(\:acc{u}_{y}\) of 98.41%, \(\:pre{c}_{n}\) of 90.42%, \(\:rec{a}_{l}\) of 90.01%, and \(\:{F1}_{score}\) of 90.19%.

Table 4 Comparative outcome of XAICR-HDLOA approach under the Edge-IIoT dataset with existing models.
Fig. 9 Comparative outcome of XAICR-HDLOA approach under the Edge-IIoT dataset with existing models.

Table 5 and Fig. 10 illustrate the computational time (CT) analysis of the XAICR-HDLOA approach against existing techniques under the Edge-IIoT dataset. The RF model takes 5.42 s, the K-NN technique requires 7.16 s, and the CNN classifier needs 6.62 s to complete. XGBoost has a CT of 7.98 s and the FFNN method takes 4.78 s. The MLP model has a CT of 6.39 s, while the SVM method finishes in 6.22 s. The XAICR-HDLOA approach achieves the lowest CT of 3.48 s, indicating its efficiency compared to the other methods. This reduced CT is crucial for real-time intrusion detection in time-sensitive IIoT applications, enhancing overall system responsiveness and scalability.

Table 5 CT analysis of XAICR-HDLOA approach under the Edge-IIoT dataset over existing techniques.
Fig. 10 CT analysis of XAICR-HDLOA approach under the Edge-IIoT dataset over existing techniques.

Table 6 and Fig. 11 present the ablation analysis of the XAICR-HDLOA methodology under the Edge-IIoT dataset. The BES model achieved an \(\:acc{u}_{y}\) of 96.86%, \(\:pre{c}_{n}\) of 88.50%, \(\:rec{a}_{l}\) of 88.31%, and \(\:{F1}_{score}\) of 88.44%, while IChoA slightly improved the results with an \(\:acc{u}_{y}\) of 97.37%, \(\:pre{c}_{n}\) of 89.05%, \(\:rec{a}_{l}\) of 88.82%, and \(\:{F1}_{score}\) of 89.04%. Further enhancement was seen in the CNN-BiGRU model, which reached an \(\:acc{u}_{y}\) of 97.88%, \(\:pre{c}_{n}\) of 89.82%, \(\:rec{a}_{l}\) of 89.44%, and \(\:{F1}_{score}\) of 89.54%. The XAICR-HDLOA approach outperformed all baselines, delivering an \(\:acc{u}_{y}\) of 98.41%, \(\:pre{c}_{n}\) of 90.42%, \(\:rec{a}_{l}\) of 90.01%, and \(\:{F1}_{score}\) of 90.19%, highlighting its superior capability in learning complex patterns and improving classification performance.

Table 6 Ablation study results comparing XAICR-HDLOA method under the Edge-IIoT dataset over existing techniques.
Fig. 11 Ablation study results comparing XAICR-HDLOA method under the Edge-IIoT dataset over existing techniques.

Table 7 reports the computational efficiency of the XAICR-HDLOA methodology under the Edge-IIoT dataset. The XAICR-HDLOA methodology showed the lowest computational cost at 10.77 GFLOPs and the lowest memory usage at 1061 MB, highlighting its suitability for resource-constrained environments. In contrast, EfficientNet required 24.8 GFLOPs and 1354 MB, MobileNetV2 consumed 23.42 GFLOPs and 2874 MB, and ShuffleNet consumed 18.53 GFLOPs and 3015 MB. GhostNet and MnasNet also showed higher demands, with 19.21 GFLOPs and 1876 MB, and 23.41 GFLOPs and 1881 MB, respectively. Although CGENet exhibited moderate efficiency at 22.97 GFLOPs and 1507 MB, the XAICR-HDLOA model outperforms all others in both computational load and memory footprint, making it well suited to real-time IoT-based IDS.

Table 7 Comparison of computational efficiency and memory usage of XAICR-HDLOA technique under the Edge-IIoT dataset.

Overview of dataset features including 35 attributes with a focused selection of 8 key features for analysis

The experimental validation of the XAICR-HDLOA approach is examined under the BoT-IoT dataset35. The dataset consists of 2056 samples with 5 class labels as defined in Table 8.

The dataset comprises a total of 35 features, including pkSeqID, stime, flgs, proto, saddr, sport, daddr, dport, pkts, bytes, state, ltime, seq, dur, mean, stddev, smac, dmac, sum, min, max, soui, doui, sco, dco, spkts, dpkts, sbytes, dbytes, rate, srate, drate, attack, category, and subcategory. Of these, the eight key features proto, saddr, sport, daddr, dport, pkts, bytes, and attack have been chosen to ensure effective evaluation and accurate detection. The features are chosen for their direct relevance to traffic flow dynamics and classification. These attributes capture source-destination behavior, protocol type, and volumetric patterns, which are significant in detecting and differentiating attack signatures.

Table 8 Details on BoT-IoT dataset.

Findings and performance evaluation emphasizing key analytical results

Figure 12 shows the classifier performance of the XAICR-HDLOA approach on the BoT-IoT dataset. Figure 12a and b display the confusion matrices, which indicate the identification and classification of all 5 class labels on the 70% TRASE and 30% TESSE. Figure 12c shows the PR analysis across the five classes. Finally, Fig. 12d presents the ROC analysis, with high ROC values for the five different classes.

Fig. 12 BoT-IoT dataset (a-b) confusion matrices and (c-d) curves of PR and ROC.

Table 9 and Fig. 13 present the attack detection results of the XAICR-HDLOA approach under the BoT-IoT dataset using 70% TRASE and 30% TESSE. The results suggest that the XAICR-HDLOA approach can accurately recognize the different classes. On 70% TRASE, the XAICR-HDLOA approach attains an average \(\:acc{u}_{y}\), \(\:pre{c}_{n}\), \(\:rec{a}_{l}\), \(\:{F1}_{score}\), \(\:{AUC}_{score}\), and Kappa of 97.72%, 92.84%, 90.17%, 91.33%, 94.35%, and 94.41%, respectively. On 30% TESSE, the XAICR-HDLOA approach attains an average \(\:acc{u}_{y}\), \(\:pre{c}_{n}\), \(\:rec{a}_{l}\), \(\:{F1}_{score}\), \(\:{AUC}_{score}\), and Kappa of 98.25%, 95.52%, 91.92%, 93.42%, 95.39%, and 95.46%, respectively.

Table 9 Attack detection outcome of XAICR-HDLOA approach under BoT-IoT dataset.
Fig. 13 Average outcome of XAICR-HDLOA approach under BoT-IoT dataset.

Figure 14 depicts the TRAAY and VLAAY performance of the XAICR-HDLOA approach on the BoT-IoT dataset. The \(\:acc{u}_{y}\) values are computed over 0–25 epochs. The figure shows that the TRAAY and VLAAY values follow an increasing trend, indicating that the XAICR-HDLOA approach improves over successive epochs. Moreover, the TRAAY and VLAAY values remain close across the epochs, which indicates limited overfitting and suggests reliable predictions on unseen samples.

Fig. 14 \(\:Acc{u}_{y}\) curve of XAICR-HDLOA approach under BoT-IoT dataset.

Figure 15 shows the TRALO and VLALO curves of the XAICR-HDLOA approach on the BoT-IoT dataset. The loss values are computed over 0–25 epochs. The TRALO and VLALO values follow a declining trend, indicating the capacity of the XAICR-HDLOA approach to balance the tradeoff between data fitting and generalization. The steady decline in loss values further supports the performance of the XAICR-HDLOA approach and shows that its predictions are refined gradually.

Fig. 15 Loss curve of XAICR-HDLOA approach under BoT-IoT dataset.

Table 10 and Fig. 16 show a detailed comparison of the XAICR-HDLOA approach with existing models36,37,38,39,40,41. The results show that the GANs + AE, Decision Tree (DT), NB, Bi-LSTM, and GBC models yield weaker detection results, with \(\:acc{u}_{y}\) of 87.61%, 89.14%, 89.60%, 90.80%, and 94.29%, respectively. Meanwhile, the GSOM technique exhibits a substantial outcome, with \(\:acc{u}_{y}\) of 92.21%, \(\:pre{c}_{n}\) of 94.15%, \(\:rec{a}_{l}\) of 86.71%, and \(\:{F1}_{score}\) of 90.44%. In addition, the CGANs + FNN technique obtains reasonable results, with \(\:acc{u}_{y}\) of 97.98%, \(\:pre{c}_{n}\) of 91.00%, \(\:rec{a}_{l}\) of 87.13%, and \(\:{F1}_{score}\) of 87.51%. Lastly, the XAICR-HDLOA approach achieves the best performance, with \(\:acc{u}_{y}\) of 98.25%, \(\:pre{c}_{n}\) of 95.52%, \(\:rec{a}_{l}\) of 91.92%, and \(\:{F1}_{score}\) of 93.42%. Hence, the XAICR-HDLOA approach can be applied for improved cyber resilience in the IoT environment.

Table 10 Comparative outcome of XAICR-HDLOA approach under BoT-IoT dataset with existing models.
Fig. 16 Comparative outcome of XAICR-HDLOA approach under BoT-IoT dataset with existing models.

Table 11 and Fig. 17 present the CT analysis of the XAICR-HDLOA approach against existing models. The GBC model requires a CT of 13.84 s, while the DT technique takes 9.97 s. The GSOM method requires 10.44 s, and the GANs + AE method needs a CT of 10.09 s. The CGANs + FNN methodology requires 13.72 s, while the NB method requires 10.51 s. The Bi-LSTM method has a CT of 9.55 s. Finally, the XAICR-HDLOA approach achieves the lowest CT of 5.00 s, highlighting its efficiency compared to the existing models.

Table 11 CT analysis of XAICR-HDLOA approach under BoT-IoT dataset with existing models.
Fig. 17 CT analysis of XAICR-HDLOA approach under BoT-IoT dataset with existing models.

Table 12 and Fig. 18 present the ablation study of the XAICR-HDLOA technique under the BoT-IoT dataset. The XAICR-HDLOA technique attains an \(\:acc{u}_{y}\) of 98.25%, \(\:pre{c}_{n}\) of 95.52%, \(\:rec{a}_{l}\) of 91.92%, and \(\:{F1}_{score}\) of 93.42%. In comparison, the BES method achieves an \(\:acc{u}_{y}\) of 96.41%, \(\:pre{c}_{n}\) of 93.66%, \(\:rec{a}_{l}\) of 90.02%, and \(\:{F1}_{score}\) of 91.44%, while IChoA shows an \(\:acc{u}_{y}\) of 97.15%, \(\:pre{c}_{n}\) of 94.35%, \(\:rec{a}_{l}\) of 90.63%, and \(\:{F1}_{score}\) of 92.21%. The CNN-BiGRU model achieves an \(\:acc{u}_{y}\) of 97.73%, \(\:pre{c}_{n}\) of 94.98%, \(\:rec{a}_{l}\) of 91.23%, and \(\:{F1}_{score}\) of 92.91%. The consistently higher values of the XAICR-HDLOA method highlight its effectiveness in improving classification accuracy and robustness for intrusion detection.

Table 12 Ablation study-based comparative analysis of the XAICR-HDLOA technique under the BoT-IoT dataset.
Fig. 18 Ablation study-based comparative analysis of the XAICR-HDLOA technique under the BoT-IoT dataset.

Table 13 demonstrates the superior computational efficiency of the XAICR-HDLOA approach under the BoT-IoT dataset. The EfficientNet model consumes 24.8 GFLOPs and 1354 MB of GPU memory, while GhostNet, MnasNet, and MobileNetV2 require 19.21, 23.41, and 23.42 GFLOPs with 1876 MB, 1881 MB, and 2874 MB of GPU memory, respectively. ShuffleNet, despite having the lowest cost among the compared models at 18.53 GFLOPs, uses the most GPU memory at 3015 MB. CGENet reports 22.97 GFLOPs and 1507 MB of GPU memory. In contrast, the XAICR-HDLOA model shows the lowest cost of 10.77 GFLOPs and the lowest GPU memory usage of 1061 MB. The significantly lower resource demands of the XAICR-HDLOA model highlight its suitability for deployment in resource-constrained edge and IoT environments without losing performance.

Table 13 Performance comparison of XAICR-HDLOA technique in terms of computational cost and memory utilization on the BoT-IoT dataset.

Conclusion

In this paper, the XAICR-HDLOA approach is proposed. Its main objective is to improve cyber threat detection and interpretation in IoT environments. To accomplish this, the XAICR-HDLOA approach applies min-max normalization to standardize feature scales during data normalization. The BES model selects the most relevant features for dimensionality reduction. Moreover, a hybrid CNN-BiGRU model is used for cyberattack classification, and the IChoA is utilized to fine-tune its hyperparameter values. Finally, SHAP is integrated to enhance interpretability, offering insights into model decisions for improved trust and reliability in cybersecurity. A wide range of simulations is performed to validate the XAICR-HDLOA approach under the Edge-IIoT and BoT-IoT datasets. The performance validation showed superior accuracy values of 98.41% and 98.25% over existing techniques under the Edge-IIoT and BoT-IoT datasets, respectively. The limitations of the XAICR-HDLOA approach include reliance on specific datasets, which may not fully capture the diversity of real-world IoT environments and cyberattacks. Furthermore, the model’s performance may degrade on extremely resource-constrained devices with restricted computational power and memory. The study also assumes that all IoT devices are equally secure, which may not reflect real-world vulnerabilities. Future work may explore the application of the model to larger, more diverse datasets and investigate its performance in more heterogeneous IoT environments. Further study may optimize the model for edge devices with fewer computational resources. Moreover, integrating more advanced anomaly detection models and real-time adaptation could improve the technique’s effectiveness against evolving cybersecurity threats. Finally, assessing the robustness of the model against adversarial attacks remains an area for future research.