Abstract
In the present digital era, malware attacks and the defences against them are both becoming more sophisticated, creating a continually evolving cyberthreat landscape. With the fast development of technology, cyberthreats have grown in intricacy and potency and frequently exceed the abilities of conventional defence systems. The Internet of Things (IoT) is a technical development that allows machine-to-machine and human-to-human interaction for essential data exchange. The IoT provides numerous advantages but also creates several problems. Vulnerabilities in IoT systems expose devices to various threats, including denial of service (DoS), and raise security challenges relating to privacy, confidentiality, and availability. This manuscript proposes a cyberthreat defence mechanism using a Binary Ebola Optimization Search Algorithm and Ensemble Models (CDM-BEOSAEM) method. The main aim of the CDM-BEOSAEM method is to improve cyberattack detection in an IoT environment. Initially, min-max normalization is applied in the data normalization stage to rescale the input data to a common range. Next, the binary Ebola optimization search algorithm (BEOSA) selects the most relevant features in the feature selection (FS) process. For cyberthreat classification, the proposed CDM-BEOSAEM model uses an ensemble of bidirectional gated recurrent unit (BiGRU), auto-encoder (AE), and graph convolutional network (GCN) techniques. Finally, the hyperparameters of the ensemble models are selected using the escape Coati Optimization Algorithm (eCOA). The CDM-BEOSAEM approach is simulated on the ToN-IoT dataset, and the results are assessed using several performance measures. The performance validation of the CDM-BEOSAEM approach showed a superior accuracy value of 99.29% over existing models.
Introduction
In recent years, there has been a dramatic shift to a new paradigm in which the IoT has become part of daily life and is available to virtually every age group through multiple applications, from smart homes and medical care solutions to smart cities1. Pervasive computing is interchangeable with ubiquitous computing and adheres to 3 major designs: smart devices such as wireless, service, and mobile devices, and smart communication for peer-to-peer device interaction2. This ubiquitous IoT smart setting can improve effectiveness in home control and automation, energy consumption, active response, and automated healthcare monitoring, and can be incorporated seamlessly into a smart city. With a broad range of cyber communication taking place between people and devices running IoT-embedded firmware, cybersecurity attacks are a real threat to society3. Figure 1 shows the typical structure of cyberthreats in the IoT environment.
The evolution of the cyber-attack environment has substantially changed the nature and range of cybersecurity challenges4. As digital settings become progressively more complex and interconnected, cyber-attacks grow in complexity, making conventional reactive measures insufficient. Historical cybersecurity methods frequently deal with threats only after they happen, which restricts their efficiency in preventing data breaches and reducing damage. These traditional, reactive models often fail to address the fast pace and evolving nature of present cyber-attacks5. A forward-thinking method is therefore vital in a landscape where attacks are becoming more varied and more advanced. Cyberthreat intelligence (CTI) plays a significant role in this proactive approach by supplying actionable insights and context, which allow organizations to predict and defend against possible attacks more efficiently6. By utilizing CTI, organizations can obtain deeper knowledge of threat vectors, actors, and emerging vulnerabilities, and consequently improve their capability to address security concerns pre-emptively. CTI is a vital element of present cybersecurity structures, acting as a proactive measure to anticipate, understand, and mitigate cyber-attacks. CTI is the collection, analysis, and distribution of information about possible or existing threats to the security of an organization’s digital resources7. The primary purpose of CTI is to give cybersecurity teams the knowledge required to make informed decisions, increasing their capability to predict, prevent, and respond to cyber-attacks effectively. As threats develop, adaptive recognition models become more significant in cybersecurity. These models are based on self-learning and adaptable defence mechanisms that can improve and adapt as they encounter novel attack vectors and threats8.
The most impactful approaches in this domain employ anomaly detection, deep learning (DL), machine learning (ML), threat intelligence, behavioural analysis, and artificial intelligence (AI) to develop adaptable and strong security methods9. The primary method comprises DL-based models for examining complex datasets and identifying subtle signs of malicious behaviour10. AI technology offers a way to reduce the complexity of risk identification and management, consequently decreasing the time taken to mitigate threats while reducing the degree of human involvement.
This manuscript proposes a cyberthreat defence mechanism using a Binary Ebola Optimization Search Algorithm and Ensemble Models (CDM-BEOSAEM) method. The main aim of the CDM-BEOSAEM method is to improve cyberattack detection in an IoT environment. Initially, min-max normalization is applied in the data normalization stage to rescale the input data to a common range. Next, the binary Ebola optimization search algorithm (BEOSA) selects the most relevant features in the feature selection (FS) process. For cyberthreat classification, the proposed CDM-BEOSAEM model uses an ensemble of bidirectional gated recurrent unit (BiGRU), auto-encoder (AE), and graph convolutional network (GCN) techniques. Finally, the hyperparameters of the ensemble models are selected using the escape Coati Optimization Algorithm (eCOA). The CDM-BEOSAEM approach is simulated on the ToN-IoT dataset, and the results are assessed using several performance measures. The key contributions of the CDM-BEOSAEM approach are listed below.
- The CDM-BEOSAEM model utilizes min-max normalization to rescale feature values within a uniform range, ensuring consistent input for all learning components. This step improves the training efficiency and stability of DL models. It also contributes to faster convergence and enhanced overall model performance.
- The CDM-BEOSAEM approach employs the BEOSA method to choose the dataset’s most relevant and informative features. This reduces dimensionality and eliminates redundant data, improving the model’s efficiency. Concentrating on critical attributes enhances detection accuracy.
- The CDM-BEOSAEM method integrates an ensemble of BiGRU, AE, and GCN techniques to capture the data’s temporal dependencies and structural associations. This deep feature learning approach yields a robust representation of complex IoT patterns and improves the model’s capability to detect diverse and advanced cyberthreats.
- The CDM-BEOSAEM methodology implements the eCOA model to fine-tune hyperparameters automatically. This adaptive tuning improves the technique’s learning capacity and generalization, resulting in higher detection accuracy and reduced computational overhead.
- The CDM-BEOSAEM model’s novelty is its integration of BEOSA-based feature selection and eCOA-driven hyperparameter optimization within a deep ensemble framework combining BiGRU, AE, and GCN. This combination improves learning from both sequential and structural data patterns and enables accurate, effective, and scalable IoT threat detection. Such a hybrid approach has not been explored in the existing literature.
The article’s structure is as follows: Sect. 2 reviews the literature, Sect. 3 describes the proposed method, Sect. 4 presents the evaluation of results, and Sect. 5 offers the study’s conclusions.
Related works
Wazid et al.11 developed a Secure DL-enabled malware attack detection for the IoT-enabled intelligent transportation system (SDLMA-IITS) model. The explainable AI (XAI) method is employed for effective malware detection. A detailed security analysis of the proposed SDLMA-IITS is presented to verify its security against possible threats. Finally, a practical implementation of SDLMA-IITS is offered to evaluate its influence on the security of IoT-enabled ITS devices and systems. Algethami and Alshamrani12 developed a hybrid DL-based intrusion detection system, which utilizes a gated recurrent unit (GRU) and ANN with bidirectional long short-term memory (Bi-LSTM) structures to handle crucial cybersecurity attacks in IoHT. Kumar et al.13 introduce an enhanced DL-based technique for recognizing cyber-attacks and authenticating IoMT data in smart medical care. Initially, it presents an embedded Ensemble Learning model to choose significant IoMT features and discard unnecessary ones. The scaled inputs are then sent to the proposed 1D-CLSTM neural network to categorize cyber-attacks. Duy et al.14 developed a novel threat structure, AWG - adversarial website generation, which utilizes generative adversarial networks (GAN) and transfer-based black box attacks to generate adversarial examples. This structure closely mirrors real-time threat situations, providing realism and higher effectiveness. The work also proposes defence approaches that are simple to implement and efficient at improving model resistance. Ragab et al.15 developed the HPO with DL-enabled biometric verification for cybersecurity (HPODL-BVCS) approach. The developed method employs the DL technique to attain biometric verification in higher educational institutions. Moreover, the proposed model utilizes the ShuffleNet-v2.3 as the feature extractor. The presented method employs the CAE method with the RMSProp optimizer for classification.
Akhunzada et al.16 developed an ensemble learning-based cyber-attack intelligence mechanism capable of effectively recognizing advanced multi-variant cyber-attacks and threats. The mechanism is compared extensively with recently proposed ensemble and hybrid DL structures, together with benchmark DL models. Imtiaz et al.17 developed XIoT, an XIoT threat recognition method that addresses the varying cyber risks facing IoT systems, specifically those that communicate over optical communication structures. XIoT employs cutting-edge DL approaches like CNN. By inspecting the sequential and spatial features of its image inputs, XIoT develops an extensive and nuanced understanding of the fundamental aspects of malicious activities. A key distinctive feature of XIoT is its emphasis on interpretability, allowing stakeholders to obtain insights into the rationale behind its forecasts. By incorporating XAI mechanisms, XIoT provides precise classifications of IoT threats and explains the main factors that drive its decision-making process. Almazroi and Ayub18 developed a specialized BERT-based Feed Forward NN (BEFNet) for IoT environments. For this assessment, a new structure with different segments is used for the complete examination of 8 datasets, each representing diverse kinds of malware.
Despite the improvements in DL-based IoT security models, various limitations still exist. Many models are constructed for specific applications, such as intelligent transportation systems or biometric verification, restricting their adaptability across diverse IoT environments. Moreover, there is a lack of integration between temporal, structural, and spatial learning, which is significant for capturing complex IoT threat patterns. Some models do not perform feature selection or optimization properly. Additionally, existing methods mostly fail to integrate various learning models, namely ensemble or hybrid models, to improve generalization and accuracy. A key research gap is the development of a scalable, adaptive model that integrates hyperparameter optimization, feature selection, and DL for more comprehensive, interpretable, and accurate IoT threat detection.
Materials and methods
This manuscript proposes a novel CDM-BEOSAEM method. The method’s main intention is to enhance cyberattack detection in the IoT environment. Figure 2 demonstrates distinct processes involved: data normalization, dimensionality reduction, ensemble cyberthreat classification, and parameter tuning using eCOA.
Stage I: Min–max normalization
At first, the min-max normalization method is applied in the data normalization stage to rescale the input data to a common range19. This method is chosen because it rescales features to a consistent range, usually between 0 and 1. This is beneficial when dealing with varying feature scales, as it ensures that no single feature dominates the learning process. By maintaining the relative distribution of the data, min-max normalization averts biases in algorithms sensitive to feature magnitude, like neural networks. Moreover, it helps to speed up convergence during training, as the model can learn more effectively from uniformly scaled data. The method is easy to use and, compared with other normalization techniques, is more appropriate for models needing bounded input values. It also enhances performance, particularly in DL techniques where scaling can affect the training dynamics.
Normalization is essential in DL methods. It transforms data of dissimilar scales into a common range, removing the influence of differing magnitudes on training. Here, the data are mapped linearly to the range [0, 1]. Normalization and denormalization are carried out using Eqs. (1) and (2):
\[{x}^{\prime}=\frac{x-{x}_{\min}}{{x}_{\max}-{x}_{\min}}\tag{1}\]
\[x={x}^{\prime}\left({x}_{\max}-{x}_{\min}\right)+{x}_{\min}\tag{2}\]
where \(\:x\) signifies the original data, \(\:{x}^{{\prime\:}}\) denotes the normalized value, and \(\:{x}_{\text{m}\text{i}\text{n}}\) and \(\:{x}_{\text{m}\text{a}\text{x}}\) represent the minimum and maximum values of the input data, respectively.
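The normalization and denormalization steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation; the function names and the handling of constant columns (which would otherwise divide by zero) are assumptions made for the example.

```python
import numpy as np

def min_max_normalize(x, x_min=None, x_max=None):
    """Map each feature (column) linearly to [0, 1]."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0) if x_min is None else np.asarray(x_min, dtype=float)
    x_max = x.max(axis=0) if x_max is None else np.asarray(x_max, dtype=float)
    # Assumption: a constant column gets span 1 so it maps to 0 instead of NaN.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (x - x_min) / span, x_min, x_max

def min_max_denormalize(x_norm, x_min, x_max):
    """Invert the mapping: recover the original scale."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min
```

In practice the minimum and maximum would be computed on the training split only and then reused for the test split, which is why the function accepts them as optional arguments.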
Stage II: dimensionality reduction process
The BEOSA is then deployed for the FS process to identify the most relevant features20. This model is well suited to dimensionality reduction because it can detect the most pertinent features while maintaining high accuracy. Unlike conventional techniques such as PCA or feature selection based on statistical thresholds, it uses an optimization method to explore the feature space, so that only the most informative features are retained. This improves model efficiency and reduces computational complexity. The model’s capability to handle large, intricate datasets with non-linear associations gives it an edge over simpler techniques. Furthermore, its optimization process is adaptable, allowing feature selection to be tuned dynamically to the specific characteristics of the dataset, which improves robustness and generalization and leads to faster convergence during training. Figure 3 shows the workflow of the BEOSA model.
The BEOSA was proposed as a binary variant of the original continuous EOSA. Using the conventional V- and S-shaped transfer functions, the continuous metaheuristic was binarized to address difficulties with related models. The search space of the binary version allows solutions to be initialized with binary values, as established in Eq. (3), such that the optimizer converts each of the \(\:\left(d\right)\) dimensions of every individual \(\:\left({x}_{i}\right)\) to a value of 0 or 1.
The binary optimizer is modelled on the spread of Ebola infection, so that items in the search space represent human individuals, animals, or other organisms affected by the disease. The search space is therefore taken to represent the population \(\:X\).
The optimizer has a propagation rate sufficient to yield distinct subpopulations, such as the infected, the recovered, and others. The infected population, in turn, yields a new group of individuals whose feature composition has been mutated. These mutations occur in a search region whose individuals have representations of no more than \(\:d\) dimensions.
The notation \(\:\varDelta\:\) denotes the mutation factor, which determines the rate and manner in which an individual is mutated. Meanwhile, \(\:rnd\) yields uniformly distributed random numbers in the interval [− 1, 1]. Notably, the transfer functions are used to transform \(\:{x}_{i}^{new}\).
These binarization and mutation functions are fundamental to the BEOSA and provide the basis for its use in hybrid models. In the BEOSA approach, the fitness function (FF) is designed to balance the number of selected features in each solution (to be minimized) against the classification accuracy (to be maximized) obtained with those features.
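The binarization step can be illustrated with a short sketch. The exact transfer functions used by BEOSA are not shown in this text, so the sigmoid (S-shaped) and absolute hyperbolic tangent (V-shaped) below are common textbook choices used here as assumptions, and the thresholding rule is a simplification.

```python
import numpy as np

def s_shaped(x):
    """S-shaped transfer function (sigmoid): maps a real value to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def v_shaped(x):
    """V-shaped transfer function |tanh(x)|: maps a real value to [0, 1)."""
    return np.abs(np.tanh(x))

def binarize(position, transfer, rng):
    """Turn a continuous position into a 0/1 feature mask.

    Each dimension becomes 1 with probability transfer(position).
    (Assumption: V-shaped variants are often used to flip the previous
    bit instead; that refinement is omitted here.)
    """
    prob = transfer(np.asarray(position, dtype=float))
    return (rng.random(prob.shape) < prob).astype(int)
```

A mask entry of 1 means the corresponding feature is kept, and 0 means it is dropped.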
\[Fitness=\alpha\:{\gamma}_{R}\left(D\right)+\beta\:\frac{\left|R\right|}{\left|C\right|}\]
Here, \(\:{\gamma\:}_{R}\left(D\right)\) denotes the classification error rate of a given classifier, \(\:\left|R\right|\:\)is the cardinality of the selected subset, and \(\:\left|C\right|\) denotes the total number of features in the dataset. \(\:\alpha\:\) and \(\:\beta\:\) are weighting parameters that set the relative importance of classification quality and subset length, with \(\:\alpha\:\in\:\left[0,1\right]\) and \(\:\beta\:=1-\alpha\:\).
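The trade-off in the FF can be made concrete with a small function. This sketch assumes the usual weighted-sum form, in which the fitness equals alpha times the error rate plus beta times the fraction of features kept, and lower is better; the default `alpha` value and the penalty for an empty subset are illustrative assumptions.

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted sum of classifier error and relative subset size (lower is better).

    error_rate : classification error of a classifier trained on the subset
    n_selected : |R|, number of selected features
    n_total    : |C|, total number of features
    alpha      : weight of the error term; beta = 1 - alpha (assumed default)
    """
    if n_selected == 0:
        # Assumption: an empty feature subset is treated as infeasible.
        return float("inf")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * n_selected / n_total
```

For instance, with the 29 of 42 features reported later in the paper and a hypothetical error rate of 0.1, `fs_fitness(0.1, 29, 42, alpha=0.9)` evaluates 0.9 × 0.1 + 0.1 × 29/42.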
Stage III: ensemble of cyberthreat classification
For cyberthreat classification, the proposed CDM-BEOSAEM model uses an ensemble of three models: BiGRU, AE, and GCN. BiGRU efficiently captures temporal dependencies, making it appropriate for sequential data analysis, while AE excels at feature extraction and dimensionality reduction, ensuring that relevant information is retained for classification. The GCN model captures structural relationships, which is significant for detecting intricate patterns in graph-based IoT data. By integrating these models, the ensemble benefits from robust feature learning across both sequential and structural dimensions, giving better generalization and accuracy than any single model alone. The ensemble improves performance on heterogeneous datasets and enhances interpretability by incorporating diverse learning perspectives. The motivation for combining the models lies in the nature of cyberthreat data, which is sequential (such as logs over time), high-dimensional and noisy (requiring compression), and often relational (such as shared attack paths), and therefore demands a hybrid model that captures all three facets together.
BiGRU model
The Bi-GRU, an extension of the GRU, incorporates information from both the forward and backward directions19. The computation of the Bi-GRU component comprises the following equations. Equations (11) and (12) are applied to calculate the reset gate \(\:{R}_{t}\:\)and update gate \(\:{Z}_{t}\), respectively. Equation (13) is applied to compute the candidate hidden layer (HL) \(\:{\stackrel{\sim}{H}}_{t}\). Lastly, Eq. (14) combines the candidate HL and the update gate to produce the final HL \(\:{H}_{t}\):
Here, \(\:{Z}_{t}\) and \(\:{R}_{t}\) signify the update and reset gates at time \(\:t,\) \(\:{\stackrel{\sim}{H}}_{t}\) represents the candidate HL at time \(\:t,\) \(\:{H}_{t}\) signifies the final HL at time \(\:t,\) \(\:{X}_{t}\) refers to the input at time \(\:t,\) \(\:W\) and \(\:b\) denote the weight and bias, respectively, and \(\:\text{*}\) denotes the convolution operation. The \(\:tanh\) and sigmoid functions are described by Eqs. (15) and (16):
Finally, the Bi-GRU joins the outputs of both the forward and backward GRUs, as represented in Eqs. (17) and (18):
Here,\(\:\:{\overleftarrow{H}}_{t}\) and \(\:{\overrightarrow{H}}_{t}\) represent the HLs of the backward and forward GRU outputs at time \(\:t\), respectively, which are combined to give the final HL \(\:{H}_{t},\) \(\:{v}_{t}\) and \(\:{\omega\:}_{t}\) represent the weights of the backward and forward HLs at time \(\:t,\:\)and GRU denotes the output function of the GRU network.
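The gate computations and the bidirectional combination can be sketched with NumPy. The paper's own equations are not reproduced in this text, so the sketch uses the standard GRU formulation with element-wise products and scalar direction weights; the parameter layout, initialization scale, and function names are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(n_in, n_hid, rng):
    """Random (input weight, recurrent weight, bias) per gate: r, z, h."""
    def w(rows, cols):
        return rng.normal(0.0, 0.1, (rows, cols))
    return {g: (w(n_in, n_hid), w(n_hid, n_hid), np.zeros(n_hid)) for g in "rzh"}

def gru_step(x_t, h_prev, p):
    """One standard GRU update for a single time step."""
    (Wr, Ur, br), (Wz, Uz, bz), (Wh, Uh, bh) = p["r"], p["z"], p["h"]
    r = sigmoid(x_t @ Wr + h_prev @ Ur + br)              # reset gate R_t
    z = sigmoid(x_t @ Wz + h_prev @ Uz + bz)              # update gate Z_t
    h_cand = np.tanh(x_t @ Wh + (r * h_prev) @ Uh + bh)   # candidate state
    return z * h_prev + (1.0 - z) * h_cand                # final state H_t

def run_gru(xs, p, n_hid):
    """Run a GRU over a sequence xs of shape (T, n_in); returns (T, n_hid)."""
    h, out = np.zeros(n_hid), []
    for x_t in xs:
        h = gru_step(x_t, h, p)
        out.append(h)
    return np.stack(out)

def bigru(xs, p_fwd, p_bwd, n_hid, w_fwd=0.5, w_bwd=0.5):
    """Weighted sum of forward and (time-reversed) backward hidden states."""
    fwd = run_gru(xs, p_fwd, n_hid)
    bwd = run_gru(xs[::-1], p_bwd, n_hid)[::-1]  # re-align to original time order
    return w_fwd * fwd + w_bwd * bwd
```

The backward pass reads the sequence in reverse and its outputs are flipped back, so that row `t` of both directions refers to the same time step before they are combined.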
AE method
AEs are unsupervised neural networks that aim to learn an effective, compressed representation of the input data by encoding it into a low-dimensional latent space and then rebuilding the original input from this latent space21. Used for feature extraction and dimensionality reduction, AEs contain two key modules: the encoder and the decoder. The encoder compresses the input data into a low-dimensional latent representation, whereas the decoder rebuilds the input from that representation. Unlike supervised methods, which depend on labelled data, AEs learn by reducing the reconstruction error between the input and its reconstruction.
Let \(\:x\in\:{\mathbb{R}}^{n}\) be the input vector, where \(\:n\) denotes the input dimensionality. The encoder maps \(\:x\) to a latent representation \(\:z\in\:{\mathbb{R}}^{m}\), where \(\:m<n\). The decoder subsequently rebuilds \(\:{x}^{{\prime\:}}\in\:{\mathbb{R}}^{n}\), an estimate of \(\:x\), from the latent representation \(\:z.\) The encoder is the function \(\:{f}_{{\theta\:}_{e}}\left(x\right)\), parameterized by \(\:{\theta\:}_{e}\), which maps the input \(\:x\) to the latent representation \(\:z\):
\[z={f}_{{\theta}_{e}}\left(x\right)={\sigma}_{e}\left({W}_{e}x+{b}_{e}\right)\]
Here, \(\:{W}_{e}\in\:{\mathbb{R}}^{m\times\:n}\) denotes a weight matrix, \(\:{b}_{e}\in\:{\mathbb{R}}^{m}\) signifies a bias vector, and \(\:{\sigma\:}_{e}\) represents the non-linear activation function, usually the ReLU or sigmoid function. The latent representation \(\:z\) is a compressed version of the input.
The decoder is a function \(\:{g}_{{\theta\:}_{d}}\left(z\right)\), parameterized by \(\:{\theta\:}_{d}\), which rebuilds the input from the latent space:
\[{x}^{\prime}={g}_{{\theta}_{d}}\left(z\right)={\sigma}_{d}\left({W}_{d}z+{b}_{d}\right)\]
In this equation, \(\:{W}_{d}\in\:{\mathbb{R}}^{n\times\:m}\) stands for the decoder’s weight matrix, \(\:{b}_{d}\in\:{\mathbb{R}}^{n}\) is the bias vector, and \(\:{\sigma\:}_{d}\) is the activation function applied in the decoder, which is frequently the sigmoid so as to limit the output to the interval \(\:\left[\text{0,1}\right]\).
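The encoder and decoder mappings translate directly into code. The sketch below uses ReLU in the encoder and sigmoid in the decoder, as the text suggests is common; the use of mean squared error as the reconstruction error is an assumption, since the loss is not specified here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(x, W_e, b_e):
    """z = sigma_e(W_e x + b_e), with ReLU as sigma_e; W_e has shape (m, n)."""
    return np.maximum(0.0, W_e @ x + b_e)

def decode(z, W_d, b_d):
    """x' = sigma_d(W_d z + b_d), with sigmoid as sigma_d; W_d has shape (n, m)."""
    return sigmoid(W_d @ z + b_d)

def reconstruction_error(x, x_rec):
    """Mean squared error between input and reconstruction (assumed loss)."""
    return float(np.mean((np.asarray(x) - np.asarray(x_rec)) ** 2))
```

Training would adjust `W_e`, `b_e`, `W_d`, and `b_d` to minimize this error; the sigmoid output is a natural fit because the inputs were already scaled to [0, 1] in Stage I.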
GCN classifier
In the GCN method, transmission lines are represented as edges and buses as nodes22. GCN contains three layers: input, hidden, and output. The graph is \(\:G=(N,\:E)\), where \(\:N\) denotes the nodes and \(\:E\) the edges among nodes.
In this equation, \(\:X\) is a feature matrix whose dimensions are the number of buses or nodes in the system, \(\:n\), by the number of input features. \(\:A\) is the adjacency matrix, of size \(\:n\times\:n\).
As specified in this equation, \(\:{A}_{ij}\) indicates whether the \(\:i\)th node is connected to the \(\:j\)th node. Figure 4 depicts the structure of the GCN.
The GCN’s HL can gather and transfer node information to the following layer by utilizing propagation rules.
Here, \(\:{w}^{l}\) denotes the trainable linear transformation weight, learned by minimizing the loss function over the labelled data, and \(\:{b}^{l}\) denotes the bias. \(\:\overline{A}\) refers to the normalized adjacency matrix, and \(\:Q\) signifies the degree matrix of the input graph. \(\:\sigma\:\) denotes a non-linear activation function. \(\:{h}_{i}^{l}\) stands for the feature of the \(\:i\)th node in the \(\:l\)th HL.
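A single propagation step can be sketched as follows. The paper's propagation rule is not reproduced in this text, so the sketch assumes the widely used symmetric normalization with self-loops (degree matrix raised to the power −1/2 on both sides of the adjacency matrix) and a ReLU activation; both are assumptions, as are the function names.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = np.asarray(A, dtype=float) + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # degrees are >= 1 due to self-loops
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W, b):
    """One hidden layer: aggregate neighbour features, transform, apply ReLU."""
    return np.maximum(0.0, A_norm @ H @ W + b)
```

For a three-node path graph, the first node has degree 2 after adding its self-loop, so its diagonal entry in the normalized matrix is 1/2, and its link to the middle node (degree 3) is weighted 1/√6.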
To combine the strengths of the BiGRU, AE, and GCN models, their predictions are aggregated using a soft-voting ensemble approach. Each model produces a probability distribution over the classes, and these probabilities are averaged to determine the final classification. This method integrates the temporal dependencies captured by BiGRU, the feature representations learned by the AE, and the structural relationships modelled by the GCN, resulting in enhanced robustness, generalization, and accuracy. By exploiting the complementary merits of these models, the ensemble outperforms each individual model.
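Soft voting as described above amounts to averaging the per-class probabilities and taking the arg-max. The sketch below is a minimal version; the optional `weights` argument is an addition for illustration (the text describes a plain average), and the example probabilities are made up.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Average class probabilities from several models and pick the top class.

    prob_list : list of arrays, each of shape (n_samples, n_classes)
    weights   : optional per-model weights (assumption; None = plain average)
    """
    probs = np.stack(prob_list)                      # (n_models, n_samples, n_classes)
    avg = np.average(probs, axis=0, weights=weights)
    return avg.argmax(axis=1), avg

# Hypothetical outputs of the three models for 2 samples and 3 classes.
p_bigru = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
p_ae    = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
p_gcn   = np.array([[0.2, 0.7, 0.1], [0.2, 0.2, 0.6]])
labels, avg = soft_vote([p_bigru, p_ae, p_gcn])
print(labels)  # [1 2]
```

For the first sample, two of the three models individually prefer class 0, yet the averaged probabilities favour class 1 because the third model is far more confident. This is the practical difference between soft voting and a simple majority vote.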
Stage IV: parameter tuning using eCOA
Finally, the hyperparameters of the ensemble models are selected using the eCOA23. This model is chosen for its effectiveness in exploring complex, high-dimensional search spaces for optimal hyperparameters. Unlike conventional optimization techniques, namely grid or random search, it uses a biologically inspired approach that adapts to dynamic landscapes, avoiding local optima and supporting a more global search for hyperparameter values. The model also offers faster convergence and more precise tuning. Its robustness in handling non-linear, multi-objective optimization problems makes it efficient for DL techniques with many hyperparameters. Its capability to balance exploration and exploitation improves generalization and helps to prevent overfitting. Overall, eCOA improves the model’s performance by fine-tuning parameters more effectively than conventional techniques, resulting in improved outcomes in IoT threat detection tasks. Figure 5 demonstrates the workflow of the eCOA technique.
The COA model is an optimization approach that mimics the natural behaviour of coatis. COA contains two stages, exploration and exploitation, which simulate the coatis’ behaviour when attacking iguanas and when escaping predators. COA can search the space well and discover improved solutions. Nevertheless, in some optimization settings, COA may converge prematurely. The eCOA model is introduced to alleviate this problem. By introducing different strategies derived from the various ways in which prey can escape, the eCOA model is more adaptable and better suited to exploring the search region. This helps the model escape local optima and find global optima more efficiently.
Initialization stage
Like other optimization models, the eCOA approach randomly initializes the coatis’ locations in this stage. The initialization procedure is presented as Eq. (27).
Here, \(\:{X}_{i}\) signifies the location of the \(\:i\)th coati, \(\:{x}_{i,j}\) represents the value of the \(\:j\)th dimension for the \(\:i\)th coati, \(\:N\) and \(\:m\) denote the number of coatis and the number of dimensions, \(\:r\) is a random number in the interval \(\:\left[\text{0,1}\right]\), and \(\:u{b}_{j}\) and \(\:l{b}_{j}\) represent the upper and lower limits of the \(\:j\)th dimension, respectively.
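Based on the symbol definitions above, the initialization presumably places each coordinate at the lower bound plus a random fraction of the range between the bounds. The sketch below implements that reading; the function name and the example hyperparameter bounds are hypothetical and not taken from the paper.

```python
import numpy as np

def init_population(n_coatis, lb, ub, rng):
    """Draw N positions uniformly inside per-dimension bounds [lb_j, ub_j]."""
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    r = rng.random((n_coatis, lb.size))   # r in [0, 1), one draw per coordinate
    return lb + r * (ub - lb)

# Hypothetical search space: learning rate, hidden units, dropout rate.
rng = np.random.default_rng(42)
population = init_population(10, lb=[1e-4, 16, 0.0], ub=[1e-1, 256, 0.5], rng=rng)
print(population.shape)  # (10, 3)
```

Each row is one candidate hyperparameter configuration; integer-valued hyperparameters such as the number of hidden units would need rounding before use.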
Exploration stage
During this stage, the coati population is separated into two groups. One group is positioned in the trees, while the iguana’s location is taken to be the best position found so far. The iguana’s location is given in Eq. (28).
Here, \(\:{x}_{j}^{best}\) denotes the value of the \(\:j\)th dimension at the best location, and \(\:Iguan{a}_{j}\) denotes the value of the \(\:j\)th dimension for the iguana.
At this stage, the coati’s location is modelled using Eq. (29). After determining the new movement location \(\:{Y}_{i}\), the coatis assess its fitness \(\:{F}_{i}^{Y}\) to measure how effective the move is. When the outcome is negative (for example, when the iguana shows more cunning escape tactics), the coatis quickly adjust their strategy and apply a sequence of new movements to seize the prey. Given the speed and irregularity of Levy flight (LF), this study assumes that the coatis move in an LF manner, as presented in Eq. (30). This helps the coatis adapt to the prey’s escape behaviour and so improves the chance of a successful catch.
If \(\:{r}_{escape}\ge\:0.5\), the iguana has a better likelihood of escaping, so the coatis use a hard encirclement tactic to catch it, which tightly controls the limits of the search area. In this case, the coati’s position is given by Eq. (31).
When the new location improves the value of the objective function, the update is accepted. Otherwise, the coati stays at its present location. This update procedure is given in Eq. (32).
Here, \(\:{X}_{i}\) signifies the updated location of the\(\:\:i\)th coati, \(\:{X}_{i}^{P1}\) signifies its new candidate location, \(\:{X}_{ij}^{P1}\) is the value of that location in the \(\:j\)th dimension, \(\:{F}_{i}^{P1}\) is the corresponding value of the objective function, and \(\:{F}_{i}\) is the current value of the objective function. \(\:LF\) denotes a step drawn from an LF distribution.
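Levy-flight steps are commonly generated with Mantegna's algorithm, which produces mostly small moves with occasional long jumps. The paper does not state which generator it uses, so the sketch below is an assumption, as is the stability index `beta = 1.5` (a frequent default in metaheuristics).

```python
import math
import numpy as np

def levy_step(size, rng, beta=1.5):
    """Sample Levy-flight steps with Mantegna's algorithm.

    size : output shape, e.g. (n_coatis, n_dimensions)
    beta : stability index in (1, 2]; 1.5 is an assumed default
    """
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)    # numerator: N(0, sigma_u^2)
    v = rng.normal(0.0, 1.0, size)        # denominator: N(0, 1)
    return u / np.abs(v) ** (1 / beta)
```

The heavy tail of the resulting distribution is what lets a search agent occasionally leap far from its current position, which is the property the exploration stage relies on.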
Exploitation stage
In the exploitation stage, the coatis encounter an attack and escape from their present location to a safer location nearby.
Here, \(\:t\) denotes the iteration counter, and \(\:T\) signifies the total number of iterations.
As in the exploration stage, when the update improves the value of the objective function, the update is accepted. Otherwise, the location remains the same. This update procedure is given by Eq. (35).
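The acceptance rule shared by the exploration and exploitation stages is a greedy selection: a candidate replaces the current position only if it has a better objective value. A vectorized sketch follows; the `maximize` flag is an addition for illustration, since the direction depends on whether the fitness is an error (minimize) or a score such as accuracy (maximize).

```python
import numpy as np

def greedy_update(X, F, X_new, F_new, maximize=False):
    """Keep each candidate only where it improves the objective.

    X, X_new : current and candidate positions, shape (N, m)
    F, F_new : current and candidate objective values, shape (N,)
    """
    improved = F_new > F if maximize else F_new < F
    X_out = np.where(improved[:, None], X_new, X)   # row-wise selection
    F_out = np.where(improved, F_new, F)
    return X_out, F_out
```

Because a position is never replaced by a worse one, the best objective value in the population cannot deteriorate from one iteration to the next.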
Termination stage
The iterative procedure is carried out using Eqs. (27) to (35). The eCOA continues until the maximum number of iterations is reached, and the best solution found is returned.
The choice of fitness function is a significant factor influencing the performance of eCOA. The parameter selection process includes an encoding scheme for evaluating the quality of candidate solutions. The eCOA takes accuracy as the main measure in designing the FF, which is expressed below.
Here, \(\:TP\) denotes the number of true positives, and \(\:FP\) denotes the number of false positives.
Experimental result and analysis
The CDM-BEOSAEM technique is evaluated experimentally on the ToN-IoT dataset24. The dataset contains 73,000 samples across ten classes. Table 1 provides complete details of this dataset. It has 42 features, of which only 29 are selected.
Figure 6 shows the classifier performance of the CDM-BEOSAEM model on the ToN-IoT dataset. Figure 6a-b presents the confusion matrices, in which all classes are identified and classified precisely under the 70% training phase (TRPH) and the 30% testing phase (TSPH). Figure 6c shows the PR outcome, which indicates high performance across all classes. Finally, Fig. 6d shows the ROC outcome, with high ROC values for the different classes.
Table 2 and Fig. 7 present the cyberthreat detection results of the CDM-BEOSAEM method on the ToN-IoT dataset. The results show that the CDM-BEOSAEM method classifies all classes well. Under 70% TRPH, the proposed CDM-BEOSAEM model obtains an average \(\:acc{u}_{y}\) of 98.93%, \(\:pre{c}_{n}\) of 93.61%, \(\:rec{a}_{l}\) of 91.82%, \(\:{F}_{score}\:\)of 92.56%, and \(\:MCC\:\)of 92.05%. Under 30% TSPH, the proposed CDM-BEOSAEM method attains an average \(\:acc{u}_{y}\) of 98.91%, \(\:pre{c}_{n}\) of 93.60%, \(\:rec{a}_{l}\) of 92.01%, \(\:{F}_{score}\:\)of 92.69%, and \(\:MCC\:\)of 92.14%.
Figure 8 shows the training (TRA) \(\:acc{u}_{y}\) and validation (VAL) \(\:acc{u}_{y}\) of the CDM-BEOSAEM model on the ToN-IoT dataset. The values of \(\:acc{u}_{y}\:\)are computed over 0–35 epochs. Both TRA and VAL \(\:acc{u}_{y}\) show an increasing trend, indicating that the CDM-BEOSAEM method improves over successive epochs. Moreover, the TRA and VAL \(\:acc{u}_{y}\) values remain close across the epochs, indicating limited overfitting and reliable prediction on unseen samples.
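The averaged measures reported above can be computed from a multi-class confusion matrix by treating each class one-vs-rest and averaging over classes. That the paper averages in exactly this way is an assumption, as is the convention that rows are true labels and columns are predictions; the sketch also assumes every class appears at least once so that no denominator is zero.

```python
import numpy as np

def averaged_metrics(cm):
    """Macro-averaged one-vs-rest metrics from a confusion matrix.

    cm[i, j] = number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp          # predicted as the class but wrong
    fn = cm.sum(axis=1) - tp          # belonging to the class but missed
    tn = total - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    per_class = {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn)
               / np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }
    return {name: float(np.mean(values)) for name, values in per_class.items()}
```

Under one-vs-rest scoring, the many true negatives in a ten-class problem push per-class accuracy upward, which is consistent with an average accuracy near 99% alongside precision and recall in the low 90s.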
Figure 9 shows the TRA loss (TRALOS) and VAL loss (VALLOS) of the CDM-BEOSAEM technique on the ToN-IoT dataset. The loss values are computed over 0–35 epochs. Both TRALOS and VALLOS show a declining tendency, which indicates that the CDM-BEOSAEM approach achieves a good trade-off between data fitting and generalization. The continual reduction in loss values further supports the strong performance of the CDM-BEOSAEM approach, with its predictions being refined over time.
Table 3 and Fig. 10 compare the CDM-BEOSAEM model with existing methodologies25,26,27,28 on the ToN-IoT dataset. The 1D CNN, support vector machine (SVM), Gradient Boost (GB), linear discriminant analysis (LDA), classification and regression trees (CART), and Bi-LSTM models achieve lower performance, while the FedMLDL-HPO approach comes closer. The CDM-BEOSAEM approach achieves the highest performance, with \(\:acc{u}_{y},\) \(\:pre{c}_{n}\), \(\:rec{a}_{l},\) and \(\:{F}_{score}\) of 98.93%, 93.61%, 91.82%, and 92.56%, respectively.
Table 4 and Fig. 11 present the computational time (CT) analysis of the CDM-BEOSAEM approach against existing methods. The CDM-BEOSAEM approach is the most efficient, completing the classification task in 5.20 s. This is considerably faster than conventional methods such as SVM at 13.42 s, LDA at 11.62 s, and GB at 10.82 s. The DL models Bi-LSTM and FedMLDL-HPO required 10.28 s and 10.37 s, respectively, while the 1D CNN and CART recorded 8.93 s and 9.74 s. These results highlight the computational efficiency of the CDM-BEOSAEM model, making it better suited to time-sensitive IoT applications.
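To put these times on a common scale, the reported values can be expressed as ratios relative to the proposed model. The sketch below only restates the numbers quoted in the text; the dictionary `ct_seconds` and the helper `relative_times` are illustrative names, not part of the original study.

```python
# Computational times in seconds, as quoted in the text for the ToN-IoT dataset.
ct_seconds = {
    "SVM": 13.42,
    "LDA": 11.62,
    "GB": 10.82,
    "FedMLDL-HPO": 10.37,
    "Bi-LSTM": 10.28,
    "CART": 9.74,
    "1D CNN": 8.93,
    "CDM-BEOSAEM": 5.20,
}

def relative_times(times, reference):
    """Return each method's time divided by the reference method's time."""
    base = times[reference]
    return {name: t / base for name, t in times.items() if name != reference}

ratios = relative_times(ct_seconds, "CDM-BEOSAEM")
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x the CT of CDM-BEOSAEM")
```

On these figures, the compared methods take roughly 1.7 to 2.6 times as long as CDM-BEOSAEM.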
The ablation study of the CDM-BEOSAEM technique on the ToN-IoT dataset is presented in Table 5 and Fig. 12. The BEOSA approach achieved \(\:acc{u}_{y}\) of 96.48%, \(\:pre{c}_{n}\) of 91.03%, \(\:rec{a}_{l}\) of 89.13%, and \(\:{F}_{score}\) of 90.01%. The BiGRU model improved on these results with \(\:acc{u}_{y}\) of 96.99%, \(\:pre{c}_{n}\) of 91.80%, \(\:rec{a}_{l}\) of 89.86%, and \(\:{F}_{score}\) of 90.55%. The AE model performed better still, attaining \(\:acc{u}_{y}\) of 97.72%, \(\:pre{c}_{n}\) of 92.35%, \(\:rec{a}_{l}\) of 90.43%, and \(\:{F}_{score}\) of 91.11%. The GCN model showed a further increase, with \(\:acc{u}_{y}\) of 98.25%, \(\:pre{c}_{n}\) of 92.91%, \(\:rec{a}_{l}\) of 91.23%, and \(\:{F}_{score}\) of 91.76%. The CDM-BEOSAEM approach outperformed all of them, with \(\:acc{u}_{y}\) of 98.93%, \(\:pre{c}_{n}\) of 93.61%, \(\:rec{a}_{l}\) of 91.82%, and \(\:{F}_{score}\) of 92.56%, which supports the benefit of combining the individual components in the hybrid framework.
In addition, the CDM-BEOSAEM technique is examined on the Edge-IIoT dataset29. It contains 66,000 samples across normal and attack classes, as listed in Table 6. Of its 63 features, 37 are selected.
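The reduction from 63 to 37 features, together with the min-max normalization and the 70%/30% split used throughout, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the data are synthetic, the row count of 1,000 is a placeholder, and the binary mask is drawn at random as a stand-in for the mask that BEOSA would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# 63 -> 37 features as stated in the text; 1,000 rows is a placeholder size.
n_samples, n_features, n_selected = 1000, 63, 37
X = rng.normal(size=(n_samples, n_features))   # synthetic stand-in data

# Hypothetical binary selection mask with exactly 37 ones.
# In the paper this mask would come from BEOSA; here it is random.
mask = np.zeros(n_features, dtype=bool)
mask[rng.choice(n_features, size=n_selected, replace=False)] = True
X_selected = X[:, mask]

# 70% / 30% split on shuffled row indices (integer arithmetic for the cut).
order = rng.permutation(n_samples)
cut = n_samples * 7 // 10
X_train, X_test = X_selected[order[:cut]], X_selected[order[cut:]]

# Min-max normalization: x' = (x - min) / (max - min), per feature.
# The minimum and maximum are taken from the training split only, so the
# test split does not influence the scaling (a design choice of this sketch).
lo = X_train.min(axis=0)
hi = X_train.max(axis=0)
span = np.where(hi > lo, hi - lo, 1.0)          # guard against constant columns
X_train_scaled = (X_train - lo) / span
X_test_scaled = (X_test - lo) / span

print(X_train_scaled.shape, X_test_scaled.shape)
```

Because the scaling is fitted on the training split, test values may fall slightly outside the range from 0 to 1; clipping them is an optional further step.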
Figure 13 presents the classifier performance of the CDM-BEOSAEM model on the Edge-IIoT dataset. Figure 13a-b shows the confusion matrices under 70% TRPH and 30% TSPH, in which the class labels are largely identified and classified correctly. Figure 13c presents the PR curves, which indicate high performance across all class labels. Finally, Fig. 13d shows the ROC curves, which yield high ROC values for the different classes.
Table 7 and Fig. 14 report the cyberthreat detection results of the CDM-BEOSAEM approach on the Edge-IIoT dataset. The results indicate that the CDM-BEOSAEM approach classifies all the different classes well. Under 70% TRPH, the proposed CDM-BEOSAEM approach reaches an average \(\:acc{u}_{y}\) of 99.29%, \(\:pre{c}_{n}\) of 96.10%, \(\:rec{a}_{l}\) of 96.10%, \(\:{F}_{score}\:\)of 96.10%, and \(\:MCC\:\)of 95.71%. Under 30% TSPH, it attains an average \(\:acc{u}_{y}\) of 99.27%, \(\:pre{c}_{n}\) of 96.00%, \(\:rec{a}_{l}\) of 96.02%, \(\:{F}_{score}\:\)of 96.01%, and \(\:MCC\:\)of 95.61%.
Figure 15 shows the TRA and VAL \(\:acc{u}_{y}\) of the CDM-BEOSAEM technique on the Edge-IIoT dataset, computed over 0–25 epochs. Both TRA and VAL \(\:acc{u}_{y}\) follow an increasing trend, indicating that the CDM-BEOSAEM method improves over successive epochs. In addition, the TRA and VAL \(\:acc{u}_{y}\) values remain close across the epochs, which suggests limited overfitting and supports stable prediction on unseen samples.
Figure 16 shows the TRA loss (TRALOS) and VAL loss (VALLOS) of the CDM-BEOSAEM approach on the Edge-IIoT dataset, computed over 0–25 epochs. Both TRALOS and VALLOS follow a declining trend, indicating that the CDM-BEOSAEM model balances data fitting against generalization. The loss values continue to decrease as training proceeds, and the predictions are gradually refined over the epochs.
Table 8 and Fig. 17 compare the CDM-BEOSAEM model with existing methodologies on the Edge-IIoT dataset. The 1D CNN, SVM, Gradient Boost, J48, RNN, and LSTM approaches achieve lower performance, while the FedMLDL-HPO technique comes closer. The CDM-BEOSAEM technique achieves the highest performance, with \(\:acc{u}_{y},\) \(\:pre{c}_{n}\), \(\:rec{a}_{l},\) and \(\:{F}_{score}\) of 99.29%, 96.10%, 96.10%, and 96.10%, respectively.
Table 9 and Fig. 18 present the CT evaluation of the CDM-BEOSAEM methodology against existing models on the Edge-IIoT dataset. The 1D CNN required 18.15 s, the SVM method took 18.65 s, and GB recorded 20.43 s. The J48 and RNN models performed slightly better, with 17.78 s and 17.60 s respectively, while the LSTM classifier was more efficient at 15.65 s. The FedMLDL-HPO method consumed 19.52 s. In contrast, the CDM-BEOSAEM model achieved a considerably lower CT of 7.99 s, indicating an advantage in execution speed and suitability for real-time IIoT environments.
The ablation study of the CDM-BEOSAEM method on the Edge-IIoT dataset is shown in Table 10 and Fig. 19. The BEOSA method achieved \(\:acc{u}_{y}\) of 96.62%, \(\:pre{c}_{n}\) of 93.4%, \(\:rec{a}_{l}\) of 93.18%, and \(\:{F}_{score}\) of 93.81%. The BiGRU model improved on these results with \(\:acc{u}_{y}\) of 97.32%, \(\:pre{c}_{n}\) of 94.03%, \(\:rec{a}_{l}\) of 93.93%, and \(\:{F}_{score}\) of 94.51%. Further gains were observed with the AE model, which reached \(\:acc{u}_{y}\) of 97.94%, \(\:pre{c}_{n}\) of 94.77%, \(\:rec{a}_{l}\) of 94.68%, and \(\:{F}_{score}\) of 95.02%. The GCN model improved further, with \(\:acc{u}_{y}\) of 98.73%, \(\:pre{c}_{n}\) of 95.57%, \(\:rec{a}_{l}\) of 95.44%, and \(\:{F}_{score}\) of 95.57%. The proposed CDM-BEOSAEM technique outperformed all of these configurations with \(\:acc{u}_{y}\) of 99.29%, \(\:pre{c}_{n}\) of 96.1%, \(\:rec{a}_{l}\) of 96.1%, and \(\:{F}_{score}\) of 96.1%, supporting its effectiveness in capturing complex patterns in IIoT threat detection.
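The size of each step in this ablation can be read off with simple differences. The sketch below uses only the accuracy values quoted above and treats the rows as an ordered sequence of configurations, which is an assumption based on the wording of the text; `accuracy_by_config` and `stepwise_gains` are illustrative names.

```python
# Accuracy (%) per configuration on Edge-IIoT, as quoted in the text.
accuracy_by_config = [
    ("BEOSA", 96.62),
    ("BiGRU", 97.32),
    ("AE", 97.94),
    ("GCN", 98.73),
    ("CDM-BEOSAEM", 99.29),
]

def stepwise_gains(rows):
    """Gain in percentage points of each row over the row before it."""
    return [
        (name, round(value - previous, 2))
        for (_, previous), (name, value) in zip(rows, rows[1:])
    ]

gains = stepwise_gains(accuracy_by_config)
total_gain = round(accuracy_by_config[-1][1] - accuracy_by_config[0][1], 2)

for name, gain in gains:
    print(f"{name}: +{gain:.2f} percentage points")
print(f"Total: +{total_gain:.2f} percentage points")
```

Read this way, each step adds between 0.56 and 0.79 percentage points of accuracy, for a total of 2.67 points from the first configuration to the full model.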
Conclusion
In this manuscript, a novel CDM-BEOSAEM method is proposed. Its main objective is to enhance cyberattack detection in the IoT environment. First, min-max normalization is applied in the data normalization stage to convert the input data into a suitable format. Next, BEOSA is employed in the FS process to identify the most relevant features. For the classification of cyberthreats, the proposed CDM-BEOSAEM model uses an ensemble of BiGRU, AE, and GCN techniques. Finally, the hyperparameters of the ensemble models are selected with the eCOA model. The CDM-BEOSAEM approach is evaluated on the ToN-IoT and Edge-IIoT datasets using several measures. It achieved a superior accuracy over existing models, reaching 98.93% on the ToN-IoT dataset and 99.29% on the Edge-IIoT dataset. The limitations of the CDM-BEOSAEM approach include the use of a limited dataset, which may affect the generalizability of the results across diverse real-world environments. The system has also not been evaluated under real-time conditions or on resource-constrained devices, which could affect practical deployment. Additionally, the absence of clinical or contextual metadata may limit the depth of analysis in certain applications. The performance of the model under dynamic or evolving data scenarios also remains to be assessed. Future work may concentrate on expanding dataset diversity, performing real-time and on-device evaluations, and integrating contextual data to improve the relevance and reliability of the system in practical settings.
Data availability
The data supporting this study's findings are openly available at https://research.unsw.edu.au/projects/toniot-datasets and https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot, reference numbers [24, 29].
References
Tsiknas, K., Taketzis, D., Demertzis, K. & Skianis, C. Cyberthreats to industrial iot: a survey on attacks and countermeasures. IoT 2 (1), 163–186 (2021).
Bou-Harb, E. & Neshenko, N. Cyberthreat Intelligence for the Internet of Things, pp. 1–89 (Springer, 2020).
Mishra, S., Albarakati, A. & Sharma, S. K. Cyberthreat intelligence for IoT using machine learning. Processes, 10(12), p.2673. (2022).
Shaji, R. S., Dev, V. S. & Brindha, T. A methodological review on attack and defense strategies in cyber warfare. Wireless Netw. 25, 3323–3334 (2019).
Hajizadeh, M., Afraz, N., Ruffini, M. & Bauschert, T. Collaborative cyber attack defense in SDN networks using blockchain technology. In 2020 6th IEEE Conference on Network Softwarization (NetSoft) (pp. 487–492). IEEE. (2020).
Ferdous, J., Islam, R., Mahboubi, A. & Islam, M. Z. A State-of-the-Art review of malware attack trends and defense mechanism. IEEE Access (2023).
Vegesna, V. V. Comprehensive analysis of AI-enhanced defense systems in cyberspace. International Numeric J. Mach. Learn. Robots, 7(7). (2023).
Rohit, M. H., Fahim, S. M. & Khan, A. H. A. Mitigating and detecting ddos attack on iot environment. In 2019 IEEE International Conference on Robotics, Automation, Artificial-intelligence and Internet-of-Things (RAAICON) (pp. 5–8). IEEE. (2019).
Rawat, R. et al. Modeling of cyberthreat analysis and vulnerability in IoT-based healthcare systems during COVID. In Lessons from COVID-19 (405–425). Academic. (2022).
Han, Y., EL-Hasnony, I. M. & Cai, W. Dragonfly algorithm with gated recurrent unit for cybersecurity in social networking. Full Length Article, (2), (2021). pp.75 – 5.
Wazid, M. et al. Explainable deep Learning-Enabled malware attack detection for IoT-Enabled intelligent transportation systems. IEEE Trans. Intell. Transp. Systems (2025).
Algethami, S. A. & Alshamrani, S. S. A Deep Learning-Based Framework for Strengthening Cybersecurity in Internet of Health Things (IoHT) Environments. Applied Sciences, 14(11), p.4729. (2024).
Kumar, M., Singh, S. K. & Kim, S. Hybrid deep learning-based cyberthreat detection and IoMT data authentication model in smart healthcare. Future Generation Comput. Systems, p.107711. (2025).
Duy, P. T. et al. A study on adversarial sample resistance and defense mechanism for multimodal Learning-based phishing website detection. IEEE Access (2024).
Ragab, M. et al. Enhancing cybersecurity in higher education institutions using optimal deep learning-based biometric verification. Alexandria Eng. J. 117, 340–351 (2025).
Akhunzada, A., Al-Shamayleh, A. S., Zeadally, S., Almogren, A. & Abu-Shareha, A. A. Design and performance of an AI-enabled threat intelligence framework for IoT-enabled autonomous vehicles. Computers and Electrical Engineering, 119, p.109609. (2024).
Imtiaz, N. et al. A deep learning-based approach for the detection of various Internet of Things intrusion attacks through optical networks. In Photonics (Vol. 12, No. 35, pp. 1–39). MDPI. (2025).
Almazroi, A. A. & Ayub, N. Deep learning hybridization for improved malware detection in smart Internet of Things. Scientific Reports, 14(1), p.7838. (2024).
Dong, J. et al. Short-term power load forecasting using bidirectional gated recurrent units-based adaptive stacked autoencoder. International Journal of Electrical Power & Energy Systems, 165, p.110459. (2025).
Oyelade, O. N., Aminu, E. F., Wang, H. & Rafferty, K. An adaptation of hybrid binary optimization algorithms for medical image feature selection in neural network for classification of breast cancer. Neurocomputing, 617, p.129018. (2025).
Al-Ahmadi, A. Drone attitude and position prediction via stacked hybrid deep learning model for massive MIMO applications. IEEE Access (2024).
Azad, S. & Ameli, M. T. An imbalanced deep learning framework for Pre-Fault flexible Multi-Zone dynamic security assessment via transfer learning based graph convolutional network. Results Engineering, p.104172. (2025).
Lu, H. et al. A novel feature extraction method based on dynamic handwriting for parkinson’s disease detection. PloS One. 20 (1), e0318021 (2025).
Olawale, O. P. & Ebadinezhad, S. Cybersecurity anomaly detection: Ai and Ethereum blockchain for a secure and tamperproof Ioht data management. IEEE Access (2024).
Alsaedi, A., Moustafa, N., Tari, Z., Mahmood, A. & Anwar, A. TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven intrusion detection systems. Ieee Access. 8, 165130–165150 (2020).
Alkhonaini, M. A. et al. Sandpiper optimization with hybrid deep learning model for blockchain-assisted intrusion detection in Iot environment. Alexandria Eng. J. 112, 49–62 (2025).
Al Nuaimi, T. et al. A comparative evaluation of intrusion detection systems on the edge-IIoT-2022 dataset. Intelligent Systems with Applications, 20, p.200298. (2023).
https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot
Acknowledgments
The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/315/46. This work was also supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R361), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors are grateful to the Scientific Council, Prince Mohammad Bin Fahd University, for supporting the present work. The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA, for funding this research work through project number NBU-FFR-2025-1180-06. The authors are thankful to the Deanship of Graduate Studies and Scientific Research at University of Bisha for supporting this work through the Fast-Track Research Support Program.
Contributions
The authors declare that they have no conflict of interest. The manuscript was written with the contributions of all authors, and all authors have approved the final version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Alanazi, M.H., Alkhateeb, J.H., Alamro, H. et al. Enhancing cyberthreat defense mechanisms using ensemble of representation learning with binary Ebola optimization search in internet of things environment. Sci Rep 15, 33193 (2025). https://doi.org/10.1038/s41598-025-17437-9
DOI: https://doi.org/10.1038/s41598-025-17437-9