Abstract
Smart devices are enabled via the Internet of Things (IoT) and are connected in an uninterrupted world. These connected devices pose a challenge to cybersecurity systems due attacks in network communications. Such attacks have continued to threaten the operation of systems and end-users. Therefore, Intrusion Detection Systems (IDS) remain one of the most used tools for maintaining such flaws against cyber-attacks. The dynamic and multi-dimensional threat landscape in IoT network increases the challenge of Traditional IDS. The focus of this paper aims to find the key features for developing an IDS that is reliable but also efficient in terms of computation. Therefore, Enhanced Grey Wolf Optimization (EGWO) for Feature Selection (FS) is implemented. The function of EGWO is to remove unnecessary features from datasets used for intrusion detection. To test the new FS technique and decide on an optimal set of features based on the accuracy achieved and the feature taking filters, the most recent FS approach relies on the NF-ToN-IoT dataset. The selected features are evaluated by using the Random Forest (RF) algorithm to combine multiple decision trees and create an accurate result. The experimental outcomes against the most recent procedures demonstrate the capacity of the recommended FS and classification methods to determine attacks in the IDS. Analysis of the results presents that the recommended approach performs more effectively than the other recent techniques with optimized features (i.e., 23 out of 43 features), high accuracy of 99.93% and improved convergence.
Similar content being viewed by others
Introduction
Background
Various fields require more frequent use of numerous applications and services on the Internet. Among them exist both e-commerce and e-learning. However, this tendency raises privacy and security issues. The recent rise in cyber-security breaches was caused by the convergence of the new scams and hacking tools’ targets, which seek to undermine the concepts of confidentiality, integrity, and availability. Any code that can be used to steal information, get around security measures, damage or compromise computer networks, software systems, or the Internet of Things (IoT) is referred to as malicious software1. Several security measures, like firewalls, encryption, and anti-malware software, are employed to counter these attacks. To further analyze attacks, digital forensics techniques are used2.
On the other hand, zero-day attacks and distinctive cyberattacks pose a serious threat to network security in widely used frameworks3. Nevertheless, cyber attackers appear to have no reservations or restrictions when it comes to carrying out their intrusions, regardless of the circumstances or nature of their targets. This includes compromising critical systems such as healthcare providers and the essential services they offer to patients. For example, on June 3, 2024, most hospitals in London were compelled to declare a severe incident after falling victim to a cyber-attack. This attack led to the cancellation of medical procedures and the diversion of emergency patients to other facilities. Initial investigations indicate that cyber criminals utilized ransomware to execute this assault.
Furthermore, cybersecurity expert Steve Sands from the Chartered Institute for IT warned that the ransomware threat has now become an ever-present danger for vital institutions, ranging from schools to hospitals (BBC, 2024). Malicious threats generate significant security challenges, necessitating the development of an innovative, adaptable, and more efficient intrusion detection system (IDS)4. An IDS is an alerting tool that identifies and categorizes compromises, attacks, or security policy breaches instantly at the network and host levels of infrastructure. Depending on intrusive behaviors, Intrusion detection is categorized into two types, i.e., IDS based on hosts (HIDS)5 and IDS based on networks (NIDS)6. Additionally, primary methods based on the detection mechanism are anomaly and signature-based methods. Using a signature-based technique, the IDS compares anomalous activity to stored and established trends. Therefore, the trend identification database needs to be updated on a regular basis to update IDS to identify newly generated attacks. In contrast, an anomaly-based IDS uses a pre-trained routine activity profile to identify various anomalous behaviors7,8.
Heuristic methods are used in anomaly detection to identify undetected harmful activity9. A high false positive rate is produced by anomaly detection in many cases. Most businesses combine anomaly detection with fraud detection in their commercial solution platforms to address this issue. Although stateful protocol analysis operates at the network, application, and transport layers, it is the most effective approach when compared to the other highlighted methods used for IDS. To identify variations in suitable protocols and applications, this makes use of the specified vendor standard settings10,11. Using machine learning (ML) approaches, an IDS that can effectively recognize and classify cyberattacks at the host and network layers is being created12. Still, there are a lot of obstacles because malicious attacks happen frequently and in huge quantities, necessitating an adaptable approach. An IDS’s performance is significantly affected by the feature selection (FS) procedure in addition to its core, apart from the machine learning classification algorithms13. The effectiveness of the IDS can be significantly increased with strong FS and effective integration with the classification procedure. Meta-heuristic algorithms are one way to implement FS14.
Meta-heuristic algorithms that derive inspiration from biological systems mimic regular behaviors of living things under specific circumstances, such as hunting and pursuing prey15,16. These approaches work well with data that has multiple aspects and in dynamic applications. Furthermore, the algorithms presented improved effectiveness in resolving optimization issues17. Additionally, according to researchers18,19,20, bio-inspired meta-heuristic algorithms are frequently employed to address complex, real-time optimization and engineering challenges. Because IDSs handle a lot of data and must identify online invasions in a dynamic, multi-dimensional domain. Therefore, an issue that is thought to be real-world, it can be utilized to develop an effective IDS21.
FS is a pre-processing phase that gathers the most useful characteristics to create an efficient model22. This important phase directly impacts the effectiveness of the IDS. There are two primary techniques of FS wrapper-based and filter-based approaches. When searching and optimizing, the wrapper-based strategy evaluates the outcome based on the learning technique, whereas the filter-based approach leverages the correlation between data and the associated class label without consulting the learning technique. The wrapper-based strategy is more widely employed due to its proven results compared to the less expensive filter-based solution23. Due to their exceptional accuracy, bio-inspired meta-heuristic algorithms are frequently employed in IDSs’ wrapper-based FS method24.
Related work
Anomalous intrusion detection systems have employed a variety of methodologies in the past, including information theory25, statistical analysis26, clustering27, and classification28,29,30. Nonetheless, there exist certain research obstacles, including but not limited to network data size, unavailability of lab-led data and attributes, and adjustment to dynamic network environments31. Few studies have used machine learning and produced some encouraging initial findings32,33. After a thorough literature analysis, it is discovered that machine learning could be effectively used for identifying network breaches34,35,36, malicious program detection37, and network traffic authentication38,39.
Zhang et al.40 presented an IDS model based on deep belief networks and improved genetic algorithm (IGA). Through multiple GA iterations, the optimum number of hidden layers and neurons in each layer are effectively created in response to different types of attacks, enabling the IDS based on the DBN (Deep Believe Networks) to facilitate an accurate detection rate with an efficient design. The framework and techniques are evaluated and simulated using the NSL-KDD dataset. The classification accuracy achieved is approximately 99%. Dwivedi et al.41 proposed an approach that combines the Ensemble of Feature Selection (EFS), and the metaheuristic-inspired Adaptive-Grasshopper Optimization Algorithm (AGOA) procedure. They demonstrate how the outcomes can be used to determine different types of attacks. The high-ranked selected group of characteristics is reduced using the EFS approach, and key attributes from the selected dataset are identified using the AGOA technique, which assists in predicting network traffic behavior.
According to Liu et al.’s42 discussion, the existing problems of low detection rate of the standard IDS framework and lack of scalability are two significant hurdles. The processes were established where standard IDS could not adapt to the dynamic and multidimensional IoT. Therefore, the authors solve the problems with a powered metaheuristic-based IDS that efficiently utilizes Particle Swarm Optimization using PSO amalgamated with these processes. To recognize and explain harmful data density using PSO, the function of the data is exemplified by the complete and conveyed data; in machine learning, it is labeled as OCSVM. The model proposed in this study is checked using the UNSW-NB15 dataset. The experiments showed the model’s accuracy in resolving various harmful or inconceivable data, particularly those minor examples, such as worms, backdoors, and shellcodes. Mehanović et al.43 proposed an IDS by utilizing a Genetic algorithm for FS. This work aims to minimize the feature selection computation time by adapting the technique to a MapReduce framework suitable for cloud computing. SVM and ANN machine learning techniques were also utilized for classification.
Li et al.44 presented to utilize of the concept of Artificial Neural Networks (ANN) for IoT security. The targeted area of the proposed framework is to detect possible anomalous behavior in a medical IoT system. Since the input characteristics of an ANN contribute to how much a technique can derive from it therefore the accuracy is improved. As a result, it could be difficult but crucial to identify the most significant and discriminative aspects in network traffic data. Butterfly Optimization Algorithm (BOA) is proposed for FS that can be useful for the learning process of ANN. The proposed system achieved 93.27% accuracy.
Neto et al.45 proposed an exhaustive IoT attack dataset to improve IoT operations security. A topology made up of 105 IoT devices was developed to accomplish this. The CICIoT2023 dataset is the result of research. Verma and Chandra46 proposed the RepuTE algorithm, which makes it possible to identify Sybil and DoS attacks. This research utilized the ToN-IoT dataset to evaluate the proposed work. Wang et al.47 proposed the Deep Learning-based Bi-LSTM technique, which is a lightweight model for detecting IoT attacks. The study made use of the N-BaIoT, CICIoT2023, and CICIDS2017 datasets. The design of IDSs is the focus of multiple recent research works48,49,50 with a focus to improve the accuracy of the model. Consequently, the ToN_IoT dataset was also utilized to evaluate the algorithm’s confidence. A hybrid deep learning technique utilizing LSTM and one-dimensional CNN was proposed by Yaras and Murat Dener51.
Various researchers52,53 proposed hybrid state of the art IDS techniques for efficient detection of threats with high accuracy. The model’s accuracy, precision, recall, and F1 characteristics were used to assess its performance. Table 1 presents the summary of the existing studies conducted on IDS in which experiments are carried out in various networks. This table summarizes the details with reference in terms of the published year, utilize dataset, proposed algorithm(s) for model training and feature selection, performance limitations of the proposed algorithm(s) and obtained accuracy. In addition to this, Table 2 presents specific gaps the proposed model aimed to fill. It presents the clarity regarding this study’s unique contributions compared to existing work in IDS for IoT.
This paper presents a metaheuristic-based strategy to improve the performance of the IDS, i.e., the EGWO algorithm. In this research, the suggested model for detecting IoT attacks is evaluated using the NF-ToN-IoT dataset as the desired dataset. This research uses an enhanced GWO-based approach to select the necessary attributes. The grey wolf algorithm searches for prey in three stages, which are modeled after hunting behavior that includes searching, encircling, and attacking. Exploration takes place in the first two stages, and exploitation occurs in the final one. One benefit of the GWO algorithms that is evident in a variety of applications is the decreased number of search parameters. This improves the convergence of the technique and improves the effectiveness of the GWO method. The core contribution of this research comprises the following objectives:
-
1.
To propose an Enhanced Grey Wolf Optimization (EGWO) technique by modifying the position update process in the standard GWO with respect to the omega wolf.
-
2.
To design an efficient and responsive intrusion detection system for detecting IoT network attacks by utilizing Random Forest.
-
3.
To test EGWO by incorporating the NF-ToN-IoT dataset based on true positive, true negative, and accuracy.
The remaining paper comprises of the following sections. Section 2 presents the overview framework architecture of IDS, features selection using Random Forest and Standard GWO Algorithm. Section 3 presents the methodology of the developed EGWO technique for IDS in IoT networks. Section 4 provides the results of the suggested approach. Section 5 presents a detailed conclusion and future recommendations.
Materials and methods
An overview of IDS designed specifically for IoT networks is provided at the beginning of this section. It also describes the theoretical foundation of IDS using the FS approach. In addition, it explains FS techniques for IoT network traffic-based intrusion detection. It also reviews the most popular IDS techniques. Furthermore, an explanation is provided for the chosen metaheuristic-based algorithm utilized in the suggested IDS.
Intrusion detection systems (IDS)
When an attacker exerts to access areas of a network that they are not authorized to access, it is known as interference into computer networks. During this procedure, intruders attempt to compromise network services and determine the best opportunity to breach the system. This criminal conduct is not always carried out by malware; network users may be the object of a hacker’s attention and accidentally give the information requested55. Online theft or phishing is a well-known term for this type of attack56. As seen in Fig. 1, implementing IDS is an essential element of defending against these types of attacks, which firewalls may reinforce to regulate traffic on IoT networks and identify network vulnerabilities.
The IDS in Fig. 1 analyses the network traffic pattern to identify a virus when the hacker creates abnormal activity in the system. IDS’s purpose is to produce a warning and notify the firewall or network supervisor. Typically, an IDS is positioned adjacent to a firewall, and in the event of an attack, an activation request is triggered. One of the leading security issues with information systems is intrusion and attack57. Unsurprisingly, there has been a noticeable increase in the frequency of attacks on online users and networking devices in recent years. As Fig. 2 shows58, the percentage of high-risk vulnerabilities found in 2022 that affect network and systems. Considering the prevalence of diverse attacks on networks and servers these days. It is required to put extra efforts into developing improved IDS for a range of network types to detect hackers and minimize potential harm59.
Top 10 serious vulnerabilities reported in 202258.
IDS based on FS adoption
Feature selection is a technique for extracting a subset of significant features (characteristics) from the entire data set by removing unnecessary and duplicated characteristics to construct a suitable learning approach. Furthermore, this procedure can reduce complexities and computational time60,61. Despite the evident shared characteristics, reducing dimensionality and FS are not identical. To create a classification model, it determines the best subset of relevant features that corresponds to the original set of features and has the lowest error rates43. Figure 3 shows the IDS mechanism design, which is based on feature selection employing a Random Forest classifier. IoT network traffic is considered as the input (NF-ToN-IoT data set) and attack identification. Additionally, deduction and alerts are the ultimate outputs. Furthermore, three important step’s structure the IDS model. These steps include training, FS, and categorization.
The input was used as the training data set in the first stage to create feature positions and evaluate each position. The positions with the fewest characteristics and the highest accuracy rate of classification were chosen. The ultimate result in this stage was the positions of each network crime and regular traffic using such optimized features. Training is considered the initial stage of intrusion detection when a training data set is used to train a random forest. To accomplish more accurate identification in the next stage, Random Forest builds decision trees based on the categorization of the training data set. There are two categories for the traffic data set: anomalous traffic and regular traffic62.
Grey Wolf optimization (GWO) algorithm
The hunting behavior of grey wolves is the source of inspiration for the GWO algorithm. It is a revolutionary metaheuristic-based method that mimics both the natural hunting strategy of grey wolves and the structure of dominance. The four types of grey wolves that GWO employed to determine and replicate the leadership structure are alpha, beta, delta, and omega. Grey wolves tend to prefer living in packs. The group size often varies from 5 to 12 members. They have a very rigid and dominating social ranking system, as shown in Fig. 4. This social hierarchy demonstrates feature selection through the GWO Algorithm. Firstly, alpha (α) is the most appropriate solution for a mathematical model of GWO. As a result, the outcomes in second and third place, respectively, are appropriately represented by beta (β) and delta (δ). The omega (Ω) is another projected outcome. Ω wolves track three wolves, while α, β, and δ direct the hunt efficiency based on the GWO algorithm. The primary stage of the hunt occurs when the grey wolf surrounds its target. To mathematically simulate the surrounding behavior, Eqs. (1)–(4) are utilized63. Table 3 presents the symbolic description used in the entire research work.
D = |C. Xp(t) - Xp(t) | (1)
X(t + 1) = Xp(t) - A.D (2)
Where, “Xp” is the prey’s positioning, “A” and “C” are constants, “t” is the current iteration, and “X” denotes the position of a particular grey wolf in the search space64.
A = 2.a.r1 - a (3)
C = 2. r2 (4)
Where r1 and r2 are the constants. Omega wolves adjust their positions based on positions α, β, and δ in every iteration since α, β, and δ are far more probable to know where prey is located. These adjustments are mathematically shown in Eqs. 5–11.
Dα = |C1.Xα(t)-X(t)| (5)
Dβ= |C2.Xβ(t)-X(t)| (6)
Dδ = |C3.Xδ(t)- X(t)| (7)
X1(t + 1) = Xα(t)- A1. Dα (8)
X2(t + 1) = Xβ(t) - A2. Dβ (9)
X3(t + 1) = Xδ(t)- A3. Dδ (10)
X(t + 1) = (X1 + X2 + X3)/3 (11)
The grey wolves figure out their hunt by attempting the target when it ends moving. To represent this mathematically, one can assume that “a*A” is a random variable in the interval [-2a, 2a], where “a” decreases from 2 to 0 over the span of several repetitions. The wolves are forced to assault the victim by |A|<1 (i.e. exploitation). While exploring for prey, the grey wolf pack is forced to stray from their target in the expectation of finding a more suitable meal (exploitation) because of |A|>1. Where, “C” is another element of GWO that encourages investigation. It has a random value in the interval [0, 2]. C > 1 highlights the attack, whereas C < 1 rejects it. In general, (EGWO) has accompanied promising performance, especially in terms of optimization problems. Furthermore, it is relatively easy to implement and apply in various sectors, including engineering, computer science, and mathematics.
Methodology of the proposed IDS Model
A description of the suggested approach’s methodology is given in this section. Figure 5 depicts the overview of the proposed framework. The proposed framework’s steps are covered in further detail in the subsequent sections. The proposed IDS for IoT networks aim to prevent crimes and increase the convenience of on-demand security measures. To identify malicious threats in the network traffic of the IoT network and in all other IoT system nodes, the suggested detection method developed a metaheuristic-inspired technique. The proposed technique improved the standard GWO algorithm. To analyze the suggested EGWO method, the NF-ToN-IoT dataset has been employed. During the data preprocessing stage of IDS, the suggested algorithm is used for testing and training with different values.
Step 1: Preparation of NF-ToN-IoT dataset
This step is crucial to the technique used in data analysis and artificial intelligence. Processing and presenting the data in the right way is the responsibility of data preparation. Particularly when the data includes a variety of information variables and types. The following steps constitute this step.
Preprocessing the NF-ToN-IoT dataset
The key tasks to do before providing data into EGWO are filtering and preprocessing to get optimal performance with improved accuracy. Missing data, categorical features, and class imbalance are among some of the many issues with the used dataset. Irrelevant characteristics could affect how well the proposed EGWO algorithm executes and performs. The present research employed various preprocessing techniques through permutations of multiple strategies for preprocessing and normalizing data.
-
Substitution of missing values:
Recommendations have been provided for oversampling, under-sampling, and hybrid approaches, along with alternatives to the imbalanced scenario. Replicating the minority class values is known as oversampling. It has been utilized by several scholars62,65. The disadvantage of this method is that it overfits these values. Some adopt under-sampling, which lowers the score of the leading class. The In the NF-ToN-IoT dataset, missing values frequently occur. It is necessary to handle these missing values appropriately. The value that occurs most often in each feature with missing data is used in the suggested model to replace the value that is missing. Utilizing mean value, a second method involved imputed statistical characteristics.
-
Numeric conversion of categorical variables: The NF-ToN-IoT dataset contains a variety of categorical attributes. The category attributes need to be given numerical values. To perform this, one-hot encoding has been utilized.
-
Class-imbalance: For balancing the classes in the applied dataset, the Synthetic Minority Oversampling Technique (SMOTE) has been utilized. The distributions with class imbalances afflict the ToN-IoT dataset. issue with this approach is that it might only be possible to adequately represent the class accurately by including some of the features that have been discarded. A hybrid approach is employed, which removes some majority class points and duplicates minority class points. By offering synthetic minority class instances, SMOTE improves basic oversampling at random by resolving the overfitting issue that could result from this method. Instead of copying already available data points, SMOTE produces new data instances. To create additional minority data points, two comparable minority instances are combined linearly66.
-
Remove irrelevant data: Since they could lead to overfitting, several characteristics, including time stamps, IP addresses, source-port, and destination-port, are also eliminated from the NF-ToN-IoT dataset.
Step 2: feature selection
The most effective subset of features is selected through the feature selection step once the dataset has been prepared. The GWO technique is modified for this research to choose the best possible set of features. Step 2 is explained in greater clarity in the following subsections:
Subset generation: preprocessing the NF-ToN-IoT dataset
Subset generation is a heuristic search strategy where each sample in the search space specifies a potential subset assessment alternative. A subset of characteristics is created for this investigation using the random subset creation technique67. The solution was generated using Eq. 12 during the initialization process.
P(i, j) = p(i, j)min+ ∆(i, j). (p(i, j)max- p(i, j)min) (12)
where P(i, j) represents the dimension of the matrix created during the initialization stage, p(i, j)max and p(i, j)min denote the matrix’s lower and upper limits, and the values of the parameters i and j range from 1 to N and 1 to D, respectively. where D denotes the solution’s dimensionality in the matrix and N denotes the number of alternative responses.
Enhanced Grey Wolf Optimization (EGWO) algorithm
To decrease the impact of the best strategies, the wolves in EGWO continually shift positions in accordance with the four best options i.e. Xα, Xβ, Xδ and XΩ. EGWO took advantage of the split method to move the grey wolves to a specific location in the search space based on Xα, Xβ, Xδ and XΩ in EGWO for carrying out FS. The flow chart for the suggested EGWO algorithm with FS is shown in Fig. 6.
According to Eq. (13), a straightforward technique of random probability distribution split has been utilized per dimension to split Xα, Xβ and Xδ outcomes in case of standard GWO.
Where, “d” represents dimension. For FS problem, the EGWO is unrestricted to the standard GWO. But in EGWO, the split strategy was used to determine the subsequent position by comparing i.e. Xα, Xβ, Xδ and XΩ. By including the omega wolf to assist in shifting the positions of the grey wolves, it is inspired from the unique version of standard GWO. The effect rate of the decision made by any wolf decreased from 0.25 to 0.17 because of an attempt to lower the impact rate of any of the most effective options by increasing the number of solutions involved in the decision-making process. The details of this are given in Eqs. (14)–(20).
Xit+1= split (x1, x2, x3, x4) (14)
Where, x1, x2, x3, and x4 correspond to the implications of the wolf move on the best four candidate solutions that are alpha, beta, delta, and omega, respectively. The mathematical representation of x1, x2, x3, and x4 is shown in Eqs. (15–18).
Where, “b” is constant; its value varies between 0 and 2, “Stepb(x), d” represents binary steps in dimension “d” and rand ∈ [0,1]. The mathematical representation of Stepb(x), d is shown in Eq. 20.
Where, s\(\:\left(\text{x}\right),\text{d}\) is mathematically represented as a sigmoidal function, as shown in Eq. 21.
To produce the crossover w1, w2, w3, and w4 improvements, a straightforward technique of random probability distribution crossover has been employed per dimension, as shown in Eq. (21).
Pseudocode of EGWO |
---|
Step 1: Parameter Initialization 1.1. Maximum iterations ‘N’ 1.2. Number of wolves in pack ‘wi’ 1.3. Dimensions in the search space ‘d’ Step 2: Calculate the fitness of each grey wolf |
Step 3: Identify the best four wolves (Xα, Xβ, Xδ and XΩ) based on fitness value |
Step 4: While (i < N) For each wolf “wi” in the pack do: For each dimension “d” do: Generate a random probability range 0–1 Utilize Split method for updating position (by using Eq. 13) If rand < 1/3: X(d) = X(α,d) Else If 1/3 ≤ rand < 2/3: X(d) = X(β,d) Else: X(d) = X(δ,d) # Crossover with 4 wolves (Eq. 21) If rand < 1/4: X(d) = X(α,d) Else If 1/4 ≤ rand < 2/4: X(d) = X(β,d) Else If 2/4 ≤ rand < 3/4: X(d) = X(δ,d) Else: X(d) = X(Ω,d) Update position using Eq. 14: Xit+1= split (x1, x2, x3, x4) Calculate impact using conditions for x1, x2, x3, x4 by using Eqs. 15–18 Apply crossover Adaptive parameter tuning Calculate new position and fitness values Update Xα, Xβ, Xδ and XΩ |
Step 5: Return the best solution “Xα” |
Instead of considering the three wolves’ positions, the suggested algorithm showed an evolving solution iteration (next position change) process based on the positions of the four best wolves. Additionally, the EGWO used a different decision impact rate computation, which had a significant effect on the search performance of the algorithm. Ultimately, even though the suggested technique’s new position update method increased its processing time in general, it did so at the expense of higher algorithmic performance. The impact of all leader wolves is reduced from 0.25 to 0.17 by incorporating the omega wolf in updating the position.
This results in a more balanced influence across the search agents, preventing early convergence to suboptimal solutions and allowing for better exploration of the solution space. Additionally, the probabilistic split decision for position updates reduces the risk of over-relying on the current best solutions, thus maintaining diversity in the search space. This theoretical adjustment enhances the exploration capabilities of the algorithm, making it less likely to get trapped in local optima early in the process.
Improved Exploration and Exploitation
The Enhanced Grey Wolf Optimization (EGWO) algorithm modified the omega wolf position update which affects the trade-off between exploration and exploitation behavior, so this cause enhancement of accuracy detection for Intrusion Detection System (IDS). In EGWO, the level of omega wolf was adjusted in respect to alpha, beta and delta wolves so that it reduces the decision-making impact of any single wolf since then number of multiple wolves is involved. In the calculation this is represented algorithmically with a split strategy and an additional omega wolf’s position. The traditional GWO employs a pack of three leader wolves (alpha, beta and delta) to direct the exploration-exploitation process. This is achieved by EGWO that introduces the omega wolf, which modifies Eq. 3 for position update equations and reduces the effect of top wolves from 0.25 to 0.17 as: This change is then reflected in the mathematics of equations that share results between those four wolves, allowing them to explore a wider space during search and reduce the chance of premature convergence on suboptimal solutions. The net effect of this change is a more able algorithm to explore various promising solutions. These changes improve the tradeoff between exploration and exploitation as follows:
Better exploration
Using four wolves to search instead of three increases the width of spacing, hence reducing chances that we get trapped in a local minimum and help reach more optimal solutions early.
Balanced convergence
Reduced impact of the individual wolf which allows solution improvements throughout exploitation. The algorithm can afford more balanced convergence (i.e. avoid premature convergence) towards global optimum.
That has a huge direct effect on the accuracy of the IDS. This can be attributed to the fact that, EGWO-based feature selection method selects better subset of features and accordingly Random Forest classifier performs best. The convergence performance of the EGWO algorithm demonstrates optimal convergence than that of standard GWO which implies improved classification accuracy as illustrated in this research.
Improved scalability
A new perspective introduced on the EGWO algorithm and coupled with verifying scalability properties in detecting intrusions over IoT networks by utilizing a Random Forest-based mechanism. Moreover, the proposed EGWO algorithm using Random Forest based approach for intrusion detection in IoT networks has demonstrated good scalability behaviour as well. Convergence graphs of the EGWO algorithm show that this method gives optimal solutions in a shorter time compared to traditional GWO, moreover when we consider it after 10th iteration. So, this faster convergence will help the algorithm to manage a greater number of input data points without significantly increasing processing time.
Figure 7 presents the exploration and exploitation graph over iterations of EGWO algorithm. This graph presents the effectiveness of the proposed algorithm by representing the balance between the exploration and exploitation normalized values. It shows how exploration values start decreasing after the exploitation value increased, making sure the balance in this search process while training the proposed model with iterations. This balance is fundamental to obtain improved performance in feature selection process of IDS and allows the EGWO algorithm to converge effectively on optimized solutions.
Evaluation of the generated subset
The objective function is the fundamental part of EGWO and is used for evaluating the fact that the generated subset satisfies the objectives. Researchers have utilized both accuracy and feature size as significant metrics to assess each feature subset and determine the fitness function. The efficiency of IDSs is determined by their ability to identify attacks and detect them accurately. A critical component in identifying breaches in IoT network security is the accuracy of classification, along with the quantity of features. The model makes use of the fitness function shown in Eq. 22:
Where “A” is Accuracy and “NF” is a number of features. There is a list of features in each feature subset. In a scenario when two subsets with differing numbers of characteristics meet the same Accuracy, the subset with fewer features will be chosen. Furthermore, the values of C1 and C2 in Eq. (22) can be independently determined. To do this, multiplying C1 by A and C2 by the inverse NF is the prerequisite for completion. In addition, accuracy is the primary consideration when it comes to the quantity of selected characteristics, and as a result, C1 is given more weight than C2. Thus, under all circumstances, C1 should be greater than C2, ensuring that the priority of accuracy. The performance of IDS is dependent on the proposed objective function. This balanced selection of the fitness function enables EGWO algorithm to dynamically choose those features which can help in accurate classification, yet diminishing risk on over fitting.
Step 3: IDS Stage
The selected features from Stage 2 are evaluated in this step using the Random Forest (RF) algorithm. RF is a machine-learning approach that combines random branch splitting and random node resampling to create decision trees. The various trees cast their votes to determine the ultimate categorization outcome68. The RF is an effective collective classifier that trains several decision trees by combining feature randomness with bagging. One standard classification method is the decision tree. To categorize the data, it attempts to learn a series of if-then conditions. RF algorithm in the proposed work handle the inherent randomness in decision tree construction, by using Random Initialization Control, Bootstrap Aggregating and Balanced Class Weights mechanisms to ensure consistent model performance across different iterations and data subsets. The constant parameter of “random_state” ensures the generation of consistent decision trees when the model reruns with same parameters and datasets. Additionally, bootstrapping mechanism keeps track of the diverse decision boundaries to deal with randomness. The parameters of “class_weight” are used to deal with the unbalanced IoT datasets.
It is evident from Fig. 5 that the classifier will receive the output of stage 2. Furthermore, the dataset was divided into two subsets: a testing set (Ptest, Qtest) and a training set (Ptrain, Qtrain). In the training set, Qtest denotes the class label, while Ptrain represents the features. Whereas the features in the testing set constitute Ptest, the class label for the features in the testing set is included in Qtest. Ptrain and Qtest will be used to train the machine learning RF classifier, and Ptest will subsequently be utilized as input into the model. P1, Q2, and Q stand for the feature set and the class label, respectively. Subsequently, the model’s output will be investigated against the Qtest; if the two match, the classifier accurately identified the dataset record’s behavior.
Overfitting mitigation strategies
To make the proposed model more robust and generalizable a number of strategies are applied to avoid overfitting. Complex models can easily go through overfitting issues; it becomes even more relevant when fitting high dimensional data such as NF-ToN-IoT. The following are the methods, used to avoid Overfitting and validate the proposed model:
Cross-validation
k-fold cross-validation has chosen to validate the model over different splits of data, hence making sure improved generalization and saves from training cycle bias (if any). This methodology is a better guarantee against the performance of models for unknown data because it calculates various training-validation partitions so that each model will be trained and validated several times.
Regularization
Random Forest algorithm resistant to overfitting due ensemble averaging process but regularization was used to keep a controlled model complexity. It puts a bound on the depth of trees and specifies the minimum samples splits as well as other conditions in leaf level. These parameters helped in limiting the size of individual trees, reduce overfitting (learns noise) fit from training data.
Random Forest’s Ensemble Approach
The ensemble method of the Random Forest algorithm makes itself a reliable to prevent from overfitting. Normal Random Forests reduce variance by building multiple decision trees on random samples and averaging their forecasts.
Comparative analysis based on performance
To determine the extent of the improvements that these overfitting mitigation measures have attained, a comparative analysis of the different model performance measures including accuracy, precision, recall and F1 scores has established with and without the mitigation strategies. The results in Table 4 presented show that through the application of the proposed strategies, the model robustness and stability were improved, and the model was more balanced in terms of complexity and generalizability.
Results
The evaluation characteristics and test findings are provided in this section. The IDS dataset of NF-ToN-IoT has been used for the performance evaluation of the proposed technique. The main types of targeted attacks used for both 80% training and 20% testing datasets are shown in Fig. 8. The proposed parameters for simulations by using the NF-ToN-IoT dataset are shown in Table 5.
Using the Jupyter Notebook platform, Pandas, which offers the ability to write in Python on Numfocus, was utilized in the research. The model was trained and tested employing the following configuration on a computer:
-
11th Gen Intel(R) Core (TM) i3-1115G4 @ 3.00 GHz–2.90 GHz.
-
8 GB RAM.
-
64-bit operating system, x64-based processor.
-
Windows 11 Home Single Language.
-
256 GB SSD.
EGWO selected 23 features from the NF-ToN-IoT dataset randomly. The selected features are shown in Table 6. In Table 6, ‘ID,’ ‘SF,’ and ‘DT’ represent the feature’s id, selected feature, and the data type of the selected feature, respectively. Figure 9 demonstrates the improved convergence of the proposed EGWO algorithm while selecting features over 20 iterations. It shows the superiority of EGWO over the standard GWO algorithm by plotting the fitness values against 20 iterations. The lower values of the objective function’s fitness represent better results, i.e., optimized fitness. EGWO starts converging more quickly after the 10th iteration. On the contrary, the standard GWO algorithm’s convergence is slower.
The feature engineering process to concentrate on identification of attributes that directly affect intrusion detection classification performance in IoT networks. The algorithm selected some features (F0-F22) as shown in Table 6 because they strongly described singular data which indicated possible intrusions. The selected features by the improved model increase the predictability of IDS by helping differentiate between legitimate and intrusive network activity, reducing false positives with higher accuracy.
The relationship between the accuracy score achieved during the FS process, benefiting from the EGWO algorithm and fitness values, is shown in Fig. 10. The graphical illustration shows how the IDS accuracy is affected by the optimization method. The curve in Fig. 10 shows that while the optimization proceeds and the value of the fitness converges, i.e., reduces, the accuracy score becomes much better. As the objective function’s value gets closer to its lowest, the accuracy score eventually stabilizes at 99.93%, proving that the EGWO algorithm has successfully found the optimal feature selection to maximize the IDS accuracy.
After optimized feature selection by utilizing EGWO, the RF algorithm has been developed using the scikit-learn package. This classification algorithm is used for evaluating the performance of the proposed IDS with optimized feature selection. Table 4 represents the parameter setting for the Random Forest Algorithm. Accuracy, precision, recall and F1 score are the evaluation measures used to evaluate the effectiveness of the classification algorithm. These measures are computed by using the four parameters named: true-positive (TP), false-negative (FN), false-positive (FP), and true-negative (TN). The number of events that have been accurately classified as normal is known as TP. The FN is the number of events that incorrectly identify normal data as an intrusion, whereas the FP is the number of malicious occurrences that are incorrectly identified as normal. The number of events that are accurately identified as malicious is represented by TN. Equations (23)–(26) have been employed to generate each of these performance evaluation measures. Table 7 presents performance metrics with and without Mitigation strategies. The results of the performance metrics of the classification RF algorithm are presented in Table 8.
A high accuracy in IDS means the model is successfully identifying all attacks (intrusions) and normal behaviour. But in case if the data is imbalanced, while using accuracy as the proposed IDS metric would be surprisingly misleading, because it will bias heavily to the majority class. Another metric needs to be consider with IDS is Precision, which determines how reliable the model can distinguish false alarms. The higher the precision value, fewer flagged events are legitimate intrusions but also there is less likelihood of false positives. A trade-off also exists between precision and recall. To minimise false positives, a model that prioritizes accuracy can ignore some threats. In addition, Recall is important in an IDS to make sure that all or most known intrusions are detected. This approach maximizes recall i.e. misses few malicious activities. By achieving a higher recall, the model will be more aggressive in detection what leads to accumulating false positives. IDS needs a balanced F1-score as the true detection of intrusions at minimal false alarm indicates good trade-off. It is extremely helpful with imbalanced dataset. Rather it prevents the bias and count every precision with an equal weight as recall, which prevents IDS system to be improved in detection but on false alarm rate.
A comparison of various existing studies that are conducted by utilizing the ToN-IoT dataset to the proposed IDS is presented in Table 9. This comparison is presented based on the algorithm used for Feature selection, a technique used for intrusion detection, the number of selected features, and the obtained accuracy.
Formal and Informal security analyses are also performed. For formal security analysis ProVerif tool is used. This tool facilitates verifying the confidentiality and authenticity properties. Table 10 contains the generated output from formal analyses. Furthermore, Table 11 presents the results of informal analysis. These results are summarized in the table illustrating the detection of the IDS model against security threats.
The results in Table 10 show that the confidentiality of key and data in Query 1 has been successfully achieved. In addition, Query 2 found potential lack in data confidentiality and Query 3 ensure high authenticity of the protocol.
Interpretability of recommendations
For an IDS to be practical and useful in exercising practical real-life applications, there is need for understanding the model used and reasons for arriving at given decisions. In this section, different methods regarding model interpretability are described to enhance the understanding of practitioners studying the model’s decision paths and making its suggestions more comprehensible.
Feature selection analysis
Random Forest used in the proposed model provides the results of selected feature importance. The degree of the feature’s contribution to IDS model decisions. In Table 4, F0-F22 are the selected features that have relatively high importance because they directly impact the possibility of distinguishing intrusive network activity from the non-intrusive one. This way, based on this ranking of features, the network administrators get to know this list of attributes together with the threats that they identify for the purpose of monitoring and defense enhancements.
Decision path analysis
Random Forest state includes few decision trees where every tree is offering the path of decision depending on the features of the threshold values. From such paths, one can distinguish how decisions lead to certain classifications and, hence making comparisons.
Recommendation interpretations for actionable insights
The proposed IDS is not only able to detect threats but also offers recommendations based on the features analysis of situations. For instance, when the IDS alerts high tcp_win_max_out, and sign of anomalous activity, the IDS suggests capping TCP window sizes for malicious IPs. The recommendations provided in this paper are based on the IDS analysis of its most valuable functions, so they pertain to the identified threats directly.
Implementing all these techniques makes the IDS model an effective tool in that it provides understandable and constructive instructions in help with the proactive securing of the network. These interpretability elements enable users to believe, and act based on the IDS’s results that further increases operational value and applicability in diverse IoT environments.
Conclusion and future work
The purpose of this research is to evaluate how well an enhanced GWO algorithm can boost the IDS’s performance. This investigation employs the suggested EGWO technique to select an ideal subset of features with high accuracy in classification using the FS procedure. Furthermore, 20% of the NF-ToN-IoT dataset is utilized to evaluate the effectiveness of the suggested methodology. This study’s analysis method is predicated on data separation. This method is essential for assessing the effectiveness of the IDS system and demonstrating its performance by applying it to the test against different types of IoT network challenges. Additionally, research on the suggested method produced high classification accuracy using an ideal subset of data under various attack scenarios. Furthermore, the proposed approach’s accuracy has been verified by comparison with more contemporary methodologies, demonstrating superior accuracy. Despite achieving high accuracy the proposed approach may degrade in case of detections while the environment requires high scalability. Furthermore, the proposed EGWO algorithm introduces some computational complexities and overhead issues; mainly because of the alterations made over the standard GWO. The randomness in EGWO is designed via probabilistic split distribution in all dimensions, with an intention of enhancing exploration capabilities and avoiding premature convergence with inferior solutions. Nevertheless, the random characteristic generated by this approach implies extra position reconsiderations within the search space, which can increase computational complexity in case of highly scalable environment. Even though EGWO optimizes its scalability and convergence rate, the dependence on more features or higher data dimensionality in each iteration will lead to higher memory use and longer processing time.
In future work, the proposed metaheuristic-based technique of EGWO can be further improved to a more scalable solution by integrating it with the Particle Swarm Optimization Algorithm. This integration will facilitate a more optimal selection of features in terms of achieving improved convergence with scalability. The integration will support the feature selection granularity by utilizing PSO’s velocity and position selection mechanism. Additionally, at the same time EGWO’s adaptive learning mechanism improves precision.
Data availability
All the data is available within the document. No external data is used for findings.
References
Mohanty, J. et al. IoT security, challenges, and solutions: a review. Progress in Advanced Computing and Intelligent Engineering: Proceedings of ICACIE 2019, Volume 2, : pp. 493–504. (2021).
Yaacoub, J. P. A. et al. Advanced digital forensics and anti-digital forensics for IoT systems: techniques, limitations and recommendations. Internet Things. 19, 100544 (2022).
Ahmad, R. et al. Zero-day attack detection: a systematic literature review. Artif. Intell. Rev. 56 (10), 10733–10811 (2023).
Elrawy, M. F., Awad, A. I. & Hamed, H. F. Intrusion detection systems for IoT-based smart environments: a survey. J. Cloud Comput. 7 (1), 1–20 (2018).
Martins, I. et al. Host-based IDS: a review and open issues of an anomaly detection system in IoT. Future Generation Comput. Syst. 133, 95–113 (2022).
Bivens, A. et al. Network-based intrusion detection using neural networks. Intell. Eng. Syst. Through Artif. Neural Networks. 12 (1), 579–584 (2002).
Samrin, R. & Vasumathi, D. Review on anomaly based network intrusion detection system. in 2017 international conference on electrical, electronics, communication, computer, and optimization techniques (ICEECCOT). IEEE. (2017).
Almaraz-Rivera, J. G., Perez-Diaz, J. A. & Cantoral-Ceballos, J. A. Transport and application layer DDoS attacks detection to IoT devices by using machine learning and deep learning models. Sensors 22 (9), 3367 (2022).
Guigou, F., Collet, P. & Parrend, P. SCHEDA: lightweight euclidean-like heuristics for anomaly detection in periodic time series. Appl. Soft Comput. 82, 105594 (2019).
Salman, T. & Jain, R. Networking protocols and standards for internet of things. Internet of things and data analytics handbook, : pp. 215–238. (2017).
Selvarajan, S. et al. Integrated probability relevancy classification (IPRC) for IDS in SCADA’, design framework for wireless network. Lect Notes Netw. Syst. 82 (1), 41–64 (2019).
Halimaa, A. & Sundarakantham, K. Machine learning based intrusion detection system. in 2019 3rd International conference on trends in electronics and informatics (ICOEI). IEEE. (2019).
Nimbalkar, P. & Kshirsagar, D. Feature selection for intrusion detection system in internet-of-things (IoT). ICT Express. 7 (2), 177–181 (2021).
Parimala, G. & Kayalvizhi, R. An effective intrusion detection system for securing IoT using feature selection and deep learning. in 2021 international conference on computer communication and informatics (ICCCI). IEEE. (2021).
Syed, D. et al. A comparative analysis of metaheuristic techniques for high availability systems (September 2023). IEEE Access 12, 7382–7398, (2024). 10.1109/ACCESS.2024.3352078
Syed, D., Shaikh, G. M. & Rizvi, S. Systematic review: particle swarm optimization (PSO) based load balancing for Cloud Computing. Sir Syed Univ. Res. J. Eng. Technol. 14 (1), 86–94 (2024).
Rabie, O. B. J. et al. A novel IoT intrusion detection framework using decisive red Fox optimization and descriptive back propagated radial basis function models. Sci. Rep. 14 (1), 386 (2024).
Elmagzoub, M. et al. A survey of swarm intelligence based load balancing techniques in cloud computing environment. Electronics 10 (21), 2718 (2021).
Al Reshan, M. S. et al. A fast converging and globally optimized approach for load balancing in cloud computing. IEEE Access. 11, 11390–11404 (2023).
Syed, D., Muhammad, G. & Rizvi, S. Systematic Review: Load Balancing in Cloud Computing by Using Metaheuristic Based Dynamic Algorithms39 (Intelligent Automation & Soft Computing, 2024). 3.
Halim, Z. et al. An effective genetic algorithm-based feature selection method for intrusion detection systems. Computers Secur. 110, 102448 (2021).
Prashanth, S. et al. Optimal feature selection based on evolutionary algorithm for intrusion detection. SN Comput. Sci. 3 (6), 439 (2022).
Almomani, A. et al. Metaheuristic algorithms-based feature selection approach for intrusion detection, in Machine Learning for Computer and Cyber Security. CRC. 184–208. (2019).
Acharya, N. & Singh, S. An IWD-based feature selection method for intrusion detection system. Soft. Comput. 22, 4407–4416 (2018).
Jain, A., Verma, B. & Rana, J. Anomaly intrusion detection techniques: a brief review. Int. J. Sci. Eng. Res. 5 (7), 1372–1383 (2014).
Jose, S. et al. A survey on anomaly based host intrusion detection system. in Journal of Physics: Conference Series. IOP Publishing. (2018).
Pattawaro, A. & Polprasert, C. Anomaly-based network intrusion detection system through feature selection and hybrid machine learning technique. in 2018 16th International Conference on ICT and Knowledge Engineering (ICT&KE). IEEE. (2018).
Pandey, S. K. Design and performance analysis of various feature selection methods for anomaly-based techniques in intrusion detection system. Secur. Priv. 2 (1), e56 (2019).
Ferrag, M. A. et al. Deep learning-based intrusion detection for distributed denial of service attack in agriculture 4.0. Electronics 10 (11), 1257 (2021).
Kumar, P. et al. Sad-iot: security analysis of ddos attacks in iot networks. Wireless Pers. Commun. 122 (1), 87–108 (2022).
Shitharth, S. et al. Intelligent Intrusion Detection Algorithm Based on multi-attack for edge-assisted Internet of Things, in Security and Risk Analysis for Intelligent Edge Computingp. 119–135 (Springer, 2023).
Hajisalem, V. & Babaie, S. A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection. Comput. Netw. 136, 37–50 (2018).
Kumar, P., Gupta, G. P. & Tripathi, R. Toward design of an intelligent cyber attack detection system using hybrid feature reduced approach for iot networks. Arab. J. Sci. Eng. 46 (4), 3749–3778 (2021).
Fiore, U. et al. Network anomaly detection with the restricted Boltzmann machine. Neurocomputing 122, 13–23 (2013).
Alom, M. Z. & Bontupalli, V. and T.M. Taha. Intrusion detection using deep belief networks. in 2015 National Aerospace and Electronics Conference (NAECON). IEEE. (2015).
Verma, A. & Ranga, V. Machine learning based intrusion detection systems for IoT applications. Wireless Pers. Commun. 111 (4), 2287–2310 (2020).
Liu, Z. et al. Anomaly detection on iot network intrusion using machine learning. in 2020 International conference on artificial intelligence, big data, computing and data communication systems (icABCD). IEEE. (2020).
Moustafa, N., Turnbull, B. & Choo, K. K. R. An ensemble intrusion detection technique based on proposed statistical flow features for protecting network traffic of internet of things. IEEE Internet Things J. 6 (3), 4815–4830 (2018).
Shitharth, S. et al. IDS Detection Based on Optimization Based on WI-CS and GNN Algorithm in SCADA Network. In: Das, S.K., Samanta, S., Dey, N., Patel, B.S., Hassanien, A.E. (eds) Architectural Wireless Networks Solutions and Security Issues. Lecture Notes in Networks and Systems, vol 196, (2021). Springer, Singapore. https://doi.org/10.1007/978-981-16-0386-0_14.
Zhang, Y., Li, P. & Wang, X. Intrusion detection for IoT based on improved genetic algorithm and deep belief network. IEEE Access. 7, 31711–31722 (2019).
Dwivedi, S. et al. Implementation of adaptive scheme in evolutionary technique for anomaly-based intrusion detection. Evol. Intel. 13 (1), 103–117 (2020).
Liu, J. et al. Research on intrusion detection based on particle swarm optimization in IoT. IEEE Access. 9, 38254–38268 (2021).
Mehanović, D. et al. Feature selection using cloud-based parallel genetic algorithm for intrusion detection data classification. Neural Comput. Appl. 33, 11861–11873 (2021).
Li, Y., Ghoreishi, S. & Issakhov, A. Improving the accuracy of network intrusion detection system in medical IoT systems through butterfly optimization algorithm. Wireless Pers. Commun. 126 (3), 1999–2017 (2022).
Neto, E. C. P. et al. CICIoT2023: a real-time dataset and benchmark for large-scale attacks in IoT environment. Sensors 23 (13), 5941 (2023).
Verma, R. & Chandra, S. RepuTE: a soft voting ensemble learning framework for reputation-based attack detection in fog-IoT milieu. Eng. Appl. Artif. Intell. 118, 105670 (2023).
Wang, Z. et al. A lightweight intrusion detection method for IoT based on deep learning and dynamic quantization. PeerJ Comput. Sci. 9, e1569 (2023).
Qu, Z. et al. Localization of dummy data injection attacks in power systems considering incomplete topological information: a spatio-temporal graph wavelet convolutional neural network approach. Appl. Energy. 360, 122736 (2024).
Li, Y. et al. Detection of false data injection attacks in smart grid: a secure federated deep learning approach. IEEE Trans. Smart Grid. 13 (6), 4862–4872 (2022).
Qu, Z. et al. Active and passive hybrid detection method for power CPS false data injection attacks with improved AKF and GRU-CNN. IET Renew. Power Gener. 16 (7), 1490–1508 (2022).
Yaras, S. & Dener, M. IoT-Based intrusion detection system using New Hybrid Deep Learning Algorithm. Electronics 13 (6), 1053 (2024).
Sangeetha, K., Shitharth, S. & Mohammed, G. B. Enhanced SCADA IDS security by using MSOM hybrid unsupervised algorithm. Int. J. Web-Based Learn. Teach. Technol. (IJWLTT). 17 (2), 1–9 (2022).
Almomani, O. A Hybrid Model Using Bio-Inspired Metaheuristic Algorithms for Network Intrusion Detection System68 (Computers, Materials & Continua, 2021). 1.
Nazir, A. & Khan, R. A. A novel combinatorial optimization based feature selection method for network intrusion detection. Computers Secur. 102, 102164 (2021).
Khraisat, A. et al. Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity 2 (1), 1–22 (2019).
Massa, D. & Valverde, R. A fraud detection system based on anomaly intrusion detection systems for e-commerce applications. Comput. Inform. Sci. 7 (2), 117–140 (2014).
Kizza, J. M. System Intrusion Detection and Prevention, in Guide to Computer Network Securityp. 295–323 (Springer, 2024).
Synopsys. Software Vulnerability Snapshot., (2023). https://www.synopsys.com/software-integrity/resources/analyst-reports.html., Editor.
Kamronbek, Y. et al. Time Series Mean Normalization for Enhanced Feature Extraction in In-Vehicle Network Intrusion Detection System. in International Conference on Broadband and Wireless Computing, Communication and Applications. Springer. (2023).
Mohy-Eddine, M. et al. An efficient network intrusion detection model for IoT security using K-NN classifier and feature selection. Multimedia Tools Appl. 82 (15), 23615–23633 (2023).
Gharaee, H. & Hosseinvand, H. A new feature selection IDS based on genetic algorithm and SVM. in 2016 8th International Symposium on Telecommunications (IST). IEEE. (2016).
Wali, S. & Khan, I. Explainable AI and Random Forest Based Reliable Intrusion Detection System (Authorea Preprints, 2023).
Rezaei, H., Bozorg-Haddad, O. & Chu, X. Grey wolf optimization (GWO) algorithm. Advanced optimization by nature-inspired algorithms, : pp. 81–91. (2018).
Selvarajan, S. A comprehensive study on modern optimization techniques for engineering applications. Artif. Intell. Rev. 57 (8), 194 (2024).
Talukder, M. A. et al. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. J. big data. 11 (1), 33 (2024).
Bagui, S. & Li, K. Resampling imbalanced data for network intrusion detection datasets. J. Big Data. 8 (1), 6 (2021).
Kim, D. S. et al. Fusions of GA and SVM for anomaly detection in intrusion detection system. in Advances in Neural Networks–ISNN 2005: Second International Symposium on Neural Networks, Chongqing, China, May 30-June 1, 2005, Proceedings, Part III 2. Springer. (2005).
Dey, P. & Bhakta, D. A new random forest and support vector machine-based intrusion detection model in networks. Natl. Acad. Sci. Lett. 46 (5), 471–477 (2023).
Kumar, P., Gupta, G. P. & Tripathi, R. TP2SF: a trustworthy privacy-preserving secured Framework for sustainable smart cities by leveraging blockchain and machine learning. J. Syst. Architect. 115, 101954 (2021).
Gad, A. R., Nashat, A. A. & Barkat, T. M. Intrusion detection system using machine learning for vehicular ad hoc networks based on ToN-IoT dataset. IEEE Access. 9, 142206–142217 (2021).
Mohamed, R. H., Mosa, F. A. & Sadek, R. A. Efficient intrusion detection system for IoT environment. Int. J. Adv. Comput. Sci. Appl., 13(4). (2022).
Ibrahim Hairab, B. et al. Anomaly detection of zero-day attacks based on CNN and regularization techniques. Electronics 12 (3), 573 (2023).
Dobrojevic, M. et al. Addressing internet of things security by enhanced sine cosine metaheuristics tuned hybrid machine learning model and results interpretation based on shap approach. PeerJ Comput. Sci. 9, e1405 (2023).
Maseno, E. M. & Wang, Z. Intrusion Detection in IoT Using Ensemble Approach. in ATAIT. (2023).
Dey, A. K., Gupta, G. P. & Sahu, S. P. Hybrid meta-heuristic based feature selection mechanism for cyber-attack detection in iot-enabled networks. Procedia Comput. Sci. 218, 318–327 (2023).
Funding
The authors are thankful to the Deanship of Graduate Studies and Scientific Research at University of Bisha for supporting this work through the Fast-Track Research Support Program.
Author information
Authors and Affiliations
Contributions
Saad Said Alqahtany.: Conceptualization, Software, Investigation, Data curation, Writing – original draft, Writing – review & editing. Asadullah Shaikh.: Conceptualization, Investigation, Project administration, Data curation, Writing – original draft, Writing – review & editing. Ali Alqazzaz.: Validation, Investigation, Data curation, Writing – review & editing.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Alqahtany, S.S., Shaikh, A. & Alqazzaz, A. Enhanced Grey Wolf Optimization (EGWO) and random forest based mechanism for intrusion detection in IoT networks. Sci Rep 15, 1916 (2025). https://doi.org/10.1038/s41598-024-81147-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-024-81147-x
Keywords
This article is cited by
-
Enhancing intrusion detection in wireless sensor networks using a Tabu search based optimized random forest
Scientific Reports (2025)
-
An intelligent deep representation learning with enhanced feature selection approach for cyberattack detection in internet of things enabled cloud environment
Scientific Reports (2025)
-
Artificial intelligence-driven cybersecurity system for internet of things using self-attention deep learning and metaheuristic algorithms
Scientific Reports (2025)