Elevating intrusion detection and security fortification in intelligent networks through cutting-edge machine learning paradigms

Munna, Md Minhazul Islam; Rahman, Md Mahbubur; Frnda, Jaroslav; Anwar, Muhammad Shahid; Kutlimuratov, Alpamis

doi:10.1038/s41598-025-23754-w

Article
Open access
Published: 14 November 2025

Md Minhazul Islam Munna¹^na1,
Md Mahbubur Rahman¹^na1,
Jaroslav Frnda^2,3,
Muhammad Shahid Anwar⁴ &
…
Alpamis Kutlimuratov⁵

Scientific Reports volume 15, Article number: 39989 (2025)

Abstract

The proliferation of IoT devices and their reliance on Wi-Fi networks have introduced significant security vulnerabilities, particularly the KRACK and Kr00k attacks, which exploit weaknesses in WPA2 encryption to intercept and manipulate sensitive data. Traditional intrusion detection systems (IDS) using classifiers face challenges such as model overfitting, incomplete feature extraction, and high false positive rates, limiting their effectiveness in real-world deployments. To address these challenges, this study proposes a robust multiclass machine learning based intrusion detection framework. The methodology integrates advanced feature selection techniques to identify critical attributes, mitigating redundancy and enhancing detection accuracy. Two distinct ML architectures are implemented: a baseline classifier pipeline and a stacked ensemble model combining noise injection, principal component analysis (PCA), and meta learning to improve generalization and reduce false positives. Evaluated on the AWID3 dataset, the proposed ensemble architecture achieves superior performance, with an accuracy of 98%, precision of 98%, recall of 98%, and a false positive rate of just 2%, outperforming existing state-of-the-art methods. This work demonstrates the efficacy of combining preprocessing strategies with ensemble learning to fortify network security against sophisticated Wi-Fi attacks, offering a scalable and reliable solution for IoT environments. Future directions include real-time deployment and adversarial resilience testing to further enhance the model’s adaptability.

Introduction

Machine Learning (ML) is pivotal in revolutionizing intelligent programmable networks, enabling them to meet the escalating demands of modern, complex environments such as 5G and beyond. These smart networks enable dynamic traffic routing, adaptive bandwidth allocation, and real-time anomaly detection, making them essential for applications spanning smart homes, industrial automation, healthcare systems, and autonomous vehicles^1,2. Among the core technologies enabling these ecosystems is Wi-Fi, which facilitates wireless communication for billions of devices. However, this convenience comes at the cost of growing vulnerability. Wi-Fi protocols, especially WPA2, have been repeatedly shown to suffer from critical flaws. In particular, the key reinstallation attack (KRACK) and Kr00k exploit weaknesses in WPA2’s four-way handshake and chipset-level key handling, respectively, allowing adversaries to intercept, decrypt, or manipulate sensitive network traffic^3,4. These attacks are particularly concerning in IoT environments, where devices often lack frequent firmware updates, making them persistently vulnerable⁵.

Over the years, numerous machine learning (ML)-based intrusion detection systems (IDS) have been developed to detect and mitigate such network threats. Classical models, such as Decision Trees, Support Vector Machines, and k-Nearest Neighbors, have been used in combination with packet-level feature extraction and flow-based heuristics to detect anomalous behavior, although they remain prone to overfitting^6,7. However, when it comes to Wi-Fi-specific threats like KRACK and Kr00k, traditional IDSs face several limitations. First, these attacks often operate at the link layer and do not induce large deviations in network traffic patterns, making them difficult to detect using standard anomaly-based techniques⁸. Second, existing models are highly sensitive to feature noise, suffer from overfitting on imbalanced datasets, and are not optimized for multiclass classification³. Third, many detection systems generate high false positive rates (FPR), which renders them impractical for real-time environments where false alarms may overwhelm network administrators⁹. Lastly, the feature extraction methods employed in previous works often generalize poorly across varied device types, operating systems, and chipsets, factors that are crucial for widespread adoption in real-world settings¹⁰.

Given these shortcomings, there is an urgent need for a more comprehensive and noise-resilient intrusion detection framework that can handle the subtlety of WPA2-based attacks while minimizing false alarms. Our research is motivated by this gap. We argue that detecting KRACK and KR00K attacks requires not just powerful classifiers but also a robust data preprocessing pipeline that can denoise input features, reduce dimensionality, and enhance generalization⁸. Furthermore, ensemble learning, particularly stacking-based approaches, offers a promising route to combine the strengths of multiple classifiers while minimizing individual weaknesses¹¹. When coupled with advanced techniques such as Principal Component Analysis (PCA) for feature compression and Gaussian noise injection for regularization, these ensemble models can significantly improve detection performance on noisy, high-dimensional, and imbalanced datasets such as AWID3^3,5.

To this end, we propose a two-tier machine learning pipeline for wireless intrusion detection. The first tier (Pipeline 1) trains and evaluates individual classifiers, including Support Vector Machines (SVM), Random Forests (RF), XGBoost, k-NN, and Multi-Layer Perceptrons (MLP), under controlled preprocessing settings. The second tier (Pipeline 2) extends this by injecting Gaussian noise into input features, applying PCA to preserve 90% of data variance while reducing redundancy, and combining probabilistic outputs from base learners into a meta-feature vector. This vector is then passed to a meta-classifier, implemented using XGBoost, to make the final prediction. By adopting this ensemble architecture, the system benefits from diversified decision boundaries and improved stability¹². Additionally, we apply techniques such as Variance Inflation Factor (VIF) analysis to eliminate multicollinearity and Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance^3,13. These enhancements ensure that the model is not only accurate but also generalizable and scalable for deployment in IoT-centric networks.
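As a rough illustration of the Pipeline 2 data flow described above, the following Python sketch shows three of its steps in isolation: adding zero-mean Gaussian noise to the input features, choosing how many principal components are needed to retain 90% of the variance, and concatenating the base learners' class probabilities into a meta-feature vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the noise scale `sigma`, and the example eigenvalues and probabilities are hypothetical, and a real pipeline would take the eigenvalues from a fitted PCA and the probabilities from trained classifiers.

```python
import random


def inject_gaussian_noise(rows, sigma, rng):
    """Return a copy of `rows` with zero-mean Gaussian noise added to every feature."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in rows]


def components_for_variance(eigenvalues, target=0.90):
    """Smallest number of leading principal components whose cumulative
    explained-variance ratio reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(ordered)


def build_meta_features(per_model_probabilities):
    """Concatenate each base learner's class-probability vector, sample by sample.

    `per_model_probabilities[m][i]` is model m's probability vector for sample i.
    """
    meta = []
    for sample_outputs in zip(*per_model_probabilities):
        row = []
        for probabilities in sample_outputs:
            row.extend(probabilities)
        meta.append(row)
    return meta


# Hypothetical toy values, for illustration only.
features = [[0.5, 1.5, 2.5], [1.0, 0.0, 3.0]]
noisy = inject_gaussian_noise(features, sigma=0.05, rng=random.Random(0))
n_components = components_for_variance([5.0, 3.0, 1.5, 0.3, 0.2], target=0.90)
model_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]  # assumed outputs of one base learner
model_b = [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2]]  # assumed outputs of another
meta_rows = build_meta_features([model_a, model_b])
print(n_components, meta_rows)
```

In the full pipeline, rows such as `meta_rows` would be passed to the meta-classifier (XGBoost in this study) for the final prediction.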

The key contributions of this study are as follows:

We introduce a novel ensemble-based IDS that integrates noise augmentation, PCA, and stacked meta-learning to enhance detection of KRACK and Kr00k attacks in Wi-Fi traffic.
We apply Variance Inflation Factor (VIF) analysis to eliminate multicollinearity and Synthetic Minority Oversampling Technique (SMOTE) to balance class distributions, significantly improving model stability.
Our stacked ensemble pipeline (Pipeline 2) achieves 98% accuracy with a reduced false positive rate of 2%, outperforming baseline models whose FPRs range from 4% to 7%, thereby improving real-world deployment feasibility in security-critical environments.
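The class-balancing step named in the contributions can be illustrated with a short sketch of the interpolation idea behind SMOTE: a synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is a simplified version written for illustration, not the implementation used in the study (libraries such as imbalanced-learn provide a complete one); the function name, its parameters, and the toy data points are assumptions.

```python
import random


def smote_samples(minority, n_new, k, rng):
    """Generate `n_new` synthetic points by interpolating between a minority
    sample and one of its `k` nearest minority neighbours.

    Assumes `minority` holds at least two distinct feature vectors.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (point for point in minority if point is not base),
            key=lambda point: squared_distance(base, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors, for illustration only.
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_samples(minority_class, n_new=5, k=2, rng=random.Random(42))
print(new_points)
```

Because every synthetic point lies between two real minority samples, the oversampled class stays inside the region the original samples already occupy.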

Related works

The most prominent Wi-Fi vulnerabilities of recent years are Krack (Key Reinstallation Attack) and Kr00k. The Krack attack exploits a flaw in the four-way handshake that the WPA2 protocol uses to establish secure communication between devices. By forcing the reinstallation of an encryption key that is already in use, the attacker is able to intercept sensitive information⁶. Kr00k, on the other hand, is a vulnerability that affects Wi-Fi chips from Broadcom and Cypress: these chips momentarily reset the encryption key to zero during a disassociation, which makes it possible for an attacker to capture and decrypt data packets⁴. These weaknesses have exposed serious gaps in wireless network security and indicate the need for sophisticated intrusion detection systems (IDS). Such attacks usually cannot be detected by conventional security measures such as firewalls and encryption, compelling researchers to look at more advanced techniques, such as ML, for improving detection accuracy and mitigating damage⁷. ML is among the most promising approaches for intrusion detection because it can analyze large volumes of data and find patterns that reveal possible security threats. A considerable body of work has been devoted to using ML techniques to build wireless intrusion detection systems (WIDS), using datasets such as AWID, which contains real-world network traffic including examples of various attacks^6,14.

For instance, one study proposed an ML-based approach to detect Krack and Kr00k attacks using the AWID3 dataset, which includes detailed traces of wireless network activity¹⁵. The study highlights the efficacy of ML models in identifying these vulnerabilities, achieving a detection accuracy of 99% for Krack attacks and 96.7% for Kr00k attacks¹⁶. The authors used ensemble classifiers and neural networks, demonstrating the potential of ML in enhancing wireless network security³. A comparable study evaluated a deep learning model for an intrusion detection system that identifies several network intrusion types, including Wi-Fi-based attacks. That model reached 98% accuracy, demonstrating the effectiveness of deep learning in identifying sophisticated attack patterns³. Research in this area shows that ML techniques, especially deep learning, significantly improve the efficiency and accuracy of intrusion detection mechanisms^7,17.

The primary challenge for ML in intrusion detection is dealing with the noise and high dimensionality of network traffic data. Feature selection techniques enhance model performance by removing irrelevant or redundant features. One study emphasizes preprocessing the AWID dataset before feeding it into ML algorithms: ANOVA-based feature selection reduced the feature set from 254 features to 15 while improving detection capability. This preprocessing stage markedly improved the model output and underlined the importance of feature selection in ML-based IDS³. Other wireless IDS research addressed the same issue of high-dimensional data and reported that applying feature selection techniques raised model accuracy to 99.67%. This approach not only reduced computational complexity but also minimized overfitting, which is common among ML models trained on very large datasets. The emergence of sophisticated attacks on wireless networks, namely Krack and the more recent Kr00k, calls for the design of new, advanced intrusion detection systems. In this setting, ML has become the foundation on which accurate detection and mitigation measures rely. Within this context, supervised learning algorithms such as Decision Trees, Random Forests, and Neural Networks have been widely applied to anomaly detection¹⁸. With the inclusion of ML, WIDS also perform better at discovering previously unseen or zero-day attacks. One proposed hybrid WIDS combines a rule-based detection approach with an ML-based one and was able to recognize up to 98.57% of WLAN anomalies. Its ML models were trained on normal and attack scenarios of real Wi-Fi traffic drawn from AWID3, a widely used Wi-Fi dataset. By distinguishing attacks such as flooding, injection, and impersonation, the dataset helped in characterizing the attack models¹⁹.
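The ANOVA-based feature selection mentioned in this paragraph ranks each feature by a one-way ANOVA F-statistic, the ratio of between-class to within-class variance, and keeps the highest-scoring ones. The sketch below shows that ranking on a tiny hand-made dataset; it is an illustrative example under assumed data and function names, not the cited study's code, and it assumes at least two classes and more samples than classes.

```python
def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of a single feature against class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total = len(values)
    grand_mean = sum(values) / n_total
    between = 0.0  # between-class sum of squares
    within = 0.0   # within-class sum of squares
    for members in groups.values():
        group_mean = sum(members) / len(members)
        between += len(members) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in members)
    if within == 0.0:
        # Constant within every class: perfectly separating or fully constant.
        return float("inf") if between > 0.0 else 0.0
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (between / df_between) / (within / df_within)


def top_k_features(rows, labels, k):
    """Indices of the k features with the highest F-score, in ascending order."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical data: feature 0 separates the classes, feature 1 is weakly
# informative, feature 2 is constant.
rows = [
    [1.0, 5.0, 7.0],
    [1.2, 3.0, 7.0],
    [5.0, 4.0, 7.0],
    [5.2, 6.0, 7.0],
    [9.0, 5.0, 7.0],
    [9.2, 3.0, 7.0],
]
labels = [0, 0, 1, 1, 2, 2]
print(top_k_features(rows, labels, k=1))
```

Applied to AWID, the same ranking would be computed over all 254 features and the top 15 retained.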

The IoT has transformed communication, making device connectivity more convenient and efficient¹⁸. However, the rapid proliferation of IoT devices creates new security risks. IoT devices often depend on Wi-Fi networks and are therefore vulnerable to various attacks. One of these is Kr00k, a severe threat to Wi-Fi security. The Kr00k attack exploits vulnerabilities found in Wi-Fi chips, especially those using WPA2 encryption, and allows attackers to decrypt data transmitted over the network. This review discusses existing studies on Kr00k attacks and highlights the application of ML in detecting and mitigating such attacks, focusing on the Binary Grasshopper Optimization Algorithm and Long Short-Term Memory models proposed in recent studies⁵. Most IoT devices rely on Wi-Fi to connect wirelessly to other devices and to centralized systems. Security problems have nonetheless grown with the spread of Wi-Fi networks in home and office environments. Kr00k, which exploits vulnerabilities in Wi-Fi chips to decrypt data, is among the most prominent of these attacks. It affects billions of devices, from smartphones to IoT devices, and exposes user information to unauthorized access. Traditional security measures, such as firewalls and encryption, are no longer sufficient against a constantly changing network threat landscape. Researchers have therefore turned to ML techniques to improve intrusion detection systems (IDS). Many ML models, including Convolutional Neural Networks and LSTMs, have shown considerable promise in finding patterns in network traffic that point to security violations. These models also have limitations, including overfitting and missing critical features during feature extraction. Different ML algorithms have been applied to AWID, a dataset covering various Wi-Fi scenarios¹⁰. In one study, the authors manually selected 20 features and passed them to 8 classifiers to identify Wi-Fi attacks²⁰. The accuracy of their models lay between 89% and 96%, but the feature selection was time-consuming and tedious. Stacked autoencoders (SAE) were likewise used for intrusion detection on the AWID dataset but achieved lower-than-expected precision and recall²¹.

Another study applied a seven-layer deep neural network (DNN) to large data repositories for malicious traffic detection, but the model tended to produce higher false positive rates⁵. A further proposal, MCA-LSTM, focused on the temporal correlations in intrusion data but did not capture spatial features, resulting in lower accuracy^22,23. These examples reflect the difficulty of reaching high detection accuracy at a reasonable computational cost in network security models²¹. To address feature extraction and overfitting problems in ML models, a recent approach combines the Binary Grasshopper Optimization Algorithm (BGOA) with LSTM models. BGOA is an efficient optimization method inspired by the swarming behavior of grasshoppers. It identifies the most relevant features in a dataset, which is critical for improving ML model performance. For Kr00k attack detection, BGOA is used to select the network traffic features that contribute most to accurate attack classification⁵.

The proposed approach improves network security because reducing the dimensionality of the dataset improves model accuracy. Similarly, Nandhini et al. used the AWID3 dataset with 199,984 instances and 254 features. BGOA was applied to derive the most relevant features as inputs to an LSTM network for classification¹⁷. The model achieved 96% accuracy, with precision and recall of 0.97 and 0.95 respectively and a reported F1 score of 0.93, demonstrating the strength of BGOA in feature optimization. LSTM networks can model dynamic sequential data, which makes them suitable for classifying network traffic and identifying patterns that signal an intrusion. The proposed scheme uses LSTM networks to classify Wi-Fi traffic as normal or malicious with respect to the Kr00k attack. Capturing the time dependence of network traffic with an LSTM, together with the most relevant features selected by BGOA, leads to more accurate classification¹⁹. Combinations of CNN and LSTM models have also shown promise in past attack detection efforts²⁴. Their main problem was that the two models were combined linearly, which limited the full use of temporal and spatial features and led to poorly performing models^19,25. Combining BGOA with LSTM improves both feature selection and classification accuracy. With the proliferation of IoT devices, there is a pressing need for strong network security solutions. Kr00k attacks present a serious threat to Wi-Fi networks, and developing ML models to detect them has been slowed by challenges in feature selection and accuracy. Using BGOA for feature selection and an LSTM for attack classification holds great potential, as the method improves detection accuracy and reduces the computational complexity of processing large datasets. Future work should also focus on optimizing these models for real-time applications in different network environments.

Wireless Local Area Networks (WLANs) have become indispensable to modern communication infrastructure, forming the backbone for internet connectivity and data exchange. However, increasing dependence on wireless networks inevitably raises security concerns²⁶. One recent and notable vulnerability is Kr00k, which exploits weaknesses in Wi-Fi encryption protocols, particularly with WPA2 and WPA3. This review highlights studies and methodologies for detecting and mitigating Kr00k attacks, focusing on the combination of Channel Switch Announcement (CSA) and Kr00k attack methods²⁷. Kr00k, first revealed in 2020 by researchers at ESET, exploits a flaw in the Wi-Fi chips used in billions of devices²⁷. This vulnerability allows attackers to intercept and decrypt wireless communication by forcing the encryption key to be reset to zero during disassociation. Kr00k targets WPA2-encrypted networks and allows attackers to capture sensitive data, such as IP addresses or even entire data packets, by forcing Wi-Fi devices to disassociate and reconnect²⁴. The flaw is dangerous because packets left in the transmission buffer (Tx buffer) at the moment of disassociation can still be captured by an attacker. These buffered contents are encrypted with an all-zero key, which makes them easy for attackers to decrypt²⁷. The vulnerability poses a significant threat; in practice, however, Kr00k attacks are much harder to carry out because packets are sent almost instantaneously in most real-life scenarios²⁸. Kr00k attacks work, but they usually have poor success rates in practical setups. Most packets are sent within a very short period after entering the Tx buffer, which leaves attackers very little time to capture them²⁹. In addition, users who notice disconnections during attack attempts may become aware of the attack and alert others. This makes it difficult for the attacker to sustain a prolonged, stealthy attack²⁷.

Researchers have identified some situations, such as video streaming, in which packets are more likely to remain longer in the Tx buffer, making Kr00k attacks plausible. Because streaming applications transmit data continuously, packets can accumulate in the buffer for a short while; even so, the practical feasibility of a Kr00k attack remains low without additional exploitation³⁰. To address the drawbacks of the standard Kr00k attack, researchers from Kobe University proposed a hybrid attack that combines Kr00k with CSA. Channel Switch Announcement (CSA) is a Wi-Fi mechanism that notifies clients of an upcoming channel change. In the hybrid attack, the attacker sends a forged CSA that tells the client to leave the network and switch to a non-existent channel. As a result, the device is forced to hold all pending packets in the Tx buffer before disconnection, which increases the chances of success for Kr00k²⁷. The joint attack strategy has several benefits over the traditional Kr00k attack. First, attackers can exploit CSA to make clients buffer packets for longer, so that interception becomes easier. Second, because the client reconnects automatically after receiving a modified CSA, the attack can continue without the user being alerted to any network disturbance. CSA-Kr00k attacks are therefore powerful because they can be carried out for long durations without detection. The researchers' tests showed that the CSA-Kr00k attack is viable in a real-world environment, in scenarios involving a range of client devices, including Android and iOS. The attack captured large amounts of data from users' video streaming activity on the different devices. The data included sensitive information such as source and destination IP addresses, which could be used for further intrusion or denial-of-service (DoS) attacks²⁴.

Interestingly, the researchers found that the CSA-Kr00k attack was more successful during live streaming sessions than during on-demand video streaming. This is because live streaming requires continuous data transmission, which increases the likelihood of packets being stored in the Tx buffer. On-demand video streaming, on the other hand, transmits data in larger, less frequent bursts, reducing the probability of intercepting residual packets²¹. The results also showed that different devices had varying levels of vulnerability to the attack. For example, certain devices such as the Nexus 6P were more resilient because they did not process the tampered CSA. However, most other devices tested were vulnerable, highlighting the widespread risk posed by this attack method³¹. A number of countermeasures against Kr00k and CSA-Kr00k attacks have been recommended. One of the most straightforward is to apply the patches that manufacturers have released for affected devices; however, older devices and those on public Wi-Fi networks are unlikely to receive such updates. In this case, users are advised to exercise caution when connecting to unsecured networks and to ensure their devices use the latest encryption protocols³². In addition, disabling CSA or limiting the number of CSA signals that a device can process can reduce the possibility of tampering. These measures would probably conflict with legitimate network operations, however, and would not be practical in many situations. Ultimately, a mix of vendor updates, user awareness, and network configuration changes is required to fully address the vulnerabilities introduced by Kr00k and CSA attacks³³. The Kr00k vulnerability, especially in combination with CSA tampering, creates a considerable risk to Wi-Fi security. The typical Kr00k attack has little real-world feasibility, but the CSA-Kr00k approach dramatically increases the chance of success and stealth. Continuous monitoring and timely patch deployment are needed to mitigate these threats. Further research can work on more robust encryption and communication protocols that prevent such attacks from compromising wireless network security. The surge in demand for wireless networks, Wi-Fi in particular, has raised many security concerns. With many IoT devices connected and communicating with the outside world, Wi-Fi security has become critical. Because many people use these Wi-Fi networks for daily communication, their security should be taken seriously.

Organizations rely on robust Intrusion Detection Systems (IDS) to detect unusual and unexpected attacks in their networks. This part of the review analyzes ongoing work on WiFi intrusion detection, considering anomaly detection with online learning techniques and placing special emphasis on ML-based advances for improving network security. WiFi operates at the physical and data link layers of the OSI model, which exposes it to attacks that exploit weaknesses of these layers. The available encryption methods for securing network communication, such as WPA2 and WPA3, have not proved effective against attacks on the physical layer, where new communication paths are constantly opened and closed. Hence, encryption alone is not enough to meet Confidentiality, Integrity, and Availability (CIA) requirements in wireless networks³⁴. To meet these needs, much research has been dedicated to designing IDS techniques. These can be classified broadly into two types: signature-based and anomaly-based intrusion detection systems. A signature-based IDS spots attacks by comparing network traffic with signature databases. Its limitation is that it fails to identify new or modified attacks that differ from known signatures. In contrast, an anomaly-based IDS detects unusual activity that deviates from the normal behavior of the network, which makes it better suited to identifying novel or zero-day attacks³¹. ML is increasingly used to build intrusion detection systems that detect attack patterns and classify them with high accuracy. Many studies have applied ML to anomaly detection in WiFi using large datasets such as the Aegean WiFi Intrusion Dataset (AWID2 and AWID3). These datasets label various types of attack traffic alongside normal operation and offer a broad basis for training models. Feature selection methods have been assessed for their ability to improve detection while reducing processing time³⁵. In one study, the Extra Trees ensemble method reduced the AWID feature set to the 20 most important features, which were then fed into several classifiers, such as random forest and bagging. This study demonstrated that feature reduction can improve both detection accuracy and speed³². Analogously, correlation-based feature selection compressed a feature set of 156 attributes down to 18. That research compared the accuracy of classifiers including random forests and XGBoost and concluded that random forests outperformed the other classifiers in terms of accuracy. This underlines the role of feature selection in improving the performance of ML models for intrusion detection³⁶.
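The idea behind correlation-based feature selection can be sketched with the commonly cited CFS merit heuristic, which rewards features that correlate with the class and penalizes features that correlate with each other: merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a k-feature subset. The sketch below is a simplified illustration, not the cited study's method: it uses absolute Pearson correlation against integer-coded labels (an assumption made for brevity; other correlation measures are often used for categorical data), a plain greedy forward search, and hypothetical function names and data.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cfs_merit(columns, target, subset):
    """CFS-style merit of a feature subset (higher is better)."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, target, max_features):
    """Greedy forward selection: add the feature that most improves the merit."""
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(len(columns)) if j not in selected]
        if not candidates:
            break
        merit, choice = max(
            (cfs_merit(columns, target, selected + [j]), j) for j in candidates
        )
        if merit <= best + 1e-12:  # no meaningful improvement: stop
            break
        selected.append(choice)
        best = merit
    return selected


# Hypothetical columns: 0 and 1 are identical (redundant), 2 is uninformative.
target = [0, 0, 1, 1, 2, 2]
informative = [0.0, 0.2, 1.0, 1.1, 2.0, 2.1]
columns = [informative, list(informative), [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]]
print(greedy_cfs(columns, target, max_features=3))
```

On this toy input the search keeps only one of the two identical columns, since adding the duplicate does not raise the merit.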

Some authors have investigated the application of deep learning models, in particular neural networks, to WiFi intrusion detection³³. The research suggests that neural networks, although highly accurate, tend to consume an enormous amount of computational resources and need a significantly higher number of features, making them rather infeasible for real-time detection systems³⁷. The increasing vulnerability of Wi-Fi networks, particularly due to attacks like Krack and Kr00k, has triggered significant research into machine learning (ML)-based Wireless Intrusion Detection Systems (WIDS). While early research focused on signature-based methods and handcrafted features from datasets such as AWID, recent work has evolved toward deep learning, explainable AI (XAI), and optimization-enhanced anomaly detection. Several studies demonstrate that deep learning models offer superior detection capabilities for intrusion scenarios in complex, heterogeneous IoT and Industrial IoT (IIoT) environments. Nandanwar and Katarya (2024) proposed AttackNet, a CNN-GRU based model that achieved 99.75% accuracy on the N_BaIoT dataset, outperforming state-of-the-art detection systems for IIoT botnet attacks by a margin of up to 16%. Their work highlights the importance of combining temporal and spatial learning capabilities for detecting multi-variant botnet threats in real-time industrial systems³⁸. Extending this approach, they also introduced a Transfer Learning enabled BiLSTM (TL-BILSTM) architecture tailored for classifying Mirai and Bashlite attacks across multiple IoT devices. This model recorded 99.52% accuracy and demonstrated scalability across nine device types, showcasing adaptability to evolving threat landscapes³⁹. To address explainability, which is critical for real-world deployment in human-centric environments like Industry 5.0, the Cyber Sentinet framework was proposed.
It uses a ResNet model with SHAP (Shapley Additive Explanations) to ensure interpretability in detection decisions while maintaining a 97.46% accuracy rate on the Edge-IIoT-2022 dataset. This novel fusion of XAI and DL offers trustworthy insights for decision-makers in complex cyber-physical systems ⁴⁰. Privacy-preservation is another critical challenge in intrusion detection, especially in CPS-IIoT settings where sensitive data must remain secure. Saheed and Chukwuere (2025) proposed a BiLSTM model with scaled dot-product attention and agglomerative clustering, achieving 99.99% accuracy on the X-IIoTID dataset. The model effectively balances feature relevance with privacy constraints, improving both performance and data protection ⁴¹. Similarly, Saheed and Misra (2025) introduced a SHAP-integrated Deep Neural Network (CPS-IoT-PPDNN) for anomaly detection in CPS-IoT systems, reaching near-perfect metrics (up to 100% recall and 99.99% accuracy). This reinforces the value of explainable and privacy-conscious models for mission-critical IoT infrastructures ⁴².

An ensemble-based intrusion detection strategy has also emerged as an effective approach for SCADA systems and smart city infrastructures. Saheed et al. (2023) developed a hybrid ensemble learning model combining GWO optimization, PCA, and classifiers like Naive Bayes and SVM. This model achieved 99.9% detection rate, particularly excelling in real-time attacks on water and gas pipeline systems ⁴³. To mitigate class imbalance a major issue in network traffic Abdulganiyu et al. (2025) introduced CWFLAM-VAE, an attention-driven architecture integrating focal loss, variational autoencoders, and extreme gradient boosting. It outperformed traditional classifiers on NSL-KDD and CSE-CIC-IDS2018, particularly for detecting rare but critical intrusion types ⁴⁴. Feature selection continues to be a cornerstone for improving IDS performance. Adeyiola et al. (2023) employed the Firefly Algorithm (FFA) for feature reduction and combined it with a C5.0 classifier to develop a lightweight IDS for Wireless Sensor Networks (WSNs), achieving 98.7% accuracy on the UNSW-NB15 dataset. This aligns with earlier efforts using ANOVA, Extra Trees, and manual selection for the AWID datasets in Wi-Fi security ⁴⁵. Complementing these empirical efforts, Bhanu et al. (2023) provided a systematic literature review that explores the limitations of existing ML and DL methods in IoT intrusion detection. They emphasized the need for hybrid solutions combining multiple techniques, echoing the trends observed in ensemble and XAI-enhanced models ⁴⁶. In parallel, Nandanwar and Katarya (2023) explored the intersection of blockchain and intrusion detection, highlighting how decentralized technologies can augment ML-based security by improving authentication, integrity, and resilience in smart systems ⁴⁷. Though not directly applied to Krack or Kr00k, blockchain’s use in WIDS remains a promising future direction. 
The study "Smart intrusion detection system for IIoT-enabled smart cities" presents a lightweight, real-time IDS framework for IIoT environments that combines hybrid ensemble learning, firefly optimization, and improved random forest (IRF) classifiers. The system was evaluated using the Edge-IIoTset dataset and achieved up to 99.9% accuracy, with reduced training overhead and faster classification response time compared to conventional deep models ⁴⁸.

Despite the promising results shown in past studies, several persistent limitations remain in existing ML-based IDS models. Many approaches are sensitive to feature noise and fail to generalize in the presence of adversarial perturbations, which is particularly problematic in wireless and IoT environments. High-dimensional datasets often lead to overfitting when feature reduction is insufficient. Additionally, class imbalance in network traffic datasets is often poorly addressed, resulting in inflated false positive rates. While some studies employ ensemble methods, they typically rely on basic voting strategies without leveraging meta-learners or combining feature-space transformation (e.g., PCA) with classifier diversity. Moreover, deployment feasibility for edge and real-time systems is rarely considered. These recurring issues collectively motivate our ensemble-based detection model, which integrates noise injection, PCA, and meta-level stacking to improve robustness, scalability, and real-time applicability.

Methodology

Problem definition

Let $\mathcal {D} = \{(\textbf{x}_i, y_i)\}_{i=1}^N$ represent a dataset of N labeled network traffic instances, where each feature vector $\textbf{x}_i \in \mathbb {R}^d$ consists of d input attributes derived from packet-level metadata or flow-level statistics, and $y_i \in \{0,1,2\}$ is the corresponding class label indicating Normal, Kr00k, or Krack traffic. The objective is to learn a function $f: \mathbb {R}^d \rightarrow \{0,1,2\}$ that accurately maps any unseen input vector $\textbf{x}$ to its correct class y.

This problem is particularly challenging due to the overlapping distribution of benign and malicious traffic, potential class imbalance, and the high dimensionality and variability of network features. Therefore, our goal is not only to maximize classification accuracy but also to minimize false positives (i.e., cases where normal traffic is misclassified as malicious), which are critical in real-time intrusion detection scenarios.
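To make the false positive objective concrete, the sketch below computes two related quantities from a three-class confusion matrix. The function names and the example counts are hypothetical and are not taken from the study, and the paper does not state which of the two definitions underlies its reported FPR, so both are shown: a one-vs-rest FPR per class, and the share of Normal traffic flagged as any attack.

```python
def per_class_fpr(confusion):
    """One-vs-rest FPR per class.

    confusion[t][p] is the number of samples with true class t predicted as p.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    rates = []
    for c in range(n):
        # Samples predicted as c whose true class is a different one.
        false_pos = sum(confusion[t][c] for t in range(n) if t != c)
        # All samples whose true class is not c.
        negatives = total - sum(confusion[c])
        rates.append(false_pos / negatives if negatives else 0.0)
    return rates


def normal_false_alarm_rate(confusion, normal=0):
    """Fraction of Normal samples that were predicted as any attack class."""
    row = confusion[normal]
    n_normal = sum(row)
    return (n_normal - row[normal]) / n_normal if n_normal else 0.0


# Hypothetical counts; rows and columns ordered Normal (0), Kr00k (1), Krack (2).
example = [
    [95, 3, 2],
    [4, 90, 6],
    [1, 5, 94],
]
fpr = per_class_fpr(example)
alarm = normal_false_alarm_rate(example)
```

The two quantities answer different questions: the one-vs-rest rate treats each class in turn as the positive class, while the false alarm rate looks only at benign traffic that would trigger an alert.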

Data preprocessing and feature engineering

The dataset merging and preprocessing strategies are illustrated in Fig. 1. Since our objective is to develop a multiclass classification model, we combine the Krack and Normal classes with the Kr00k and Normal classes, resulting in a unified dataset comprising three distinct classes. This integration enhances the reliability of machine learning models in detecting complex intrusion patterns. Our preprocessing pipeline is motivated by the need to improve data quality, remove redundancy, and balance class distributions prior to model training. Multiple WiFi traffic logs corresponding to Krack and Kr00k attacks were collected from real-world wireless intrusion traces. For the Krack attack, 28 CSV files were merged, and for the Kr00k attack, 58 files were consolidated. Each dataset was structurally aligned by removing columns with missing values and ensuring schema consistency. After cleaning, the merged Krack and Kr00k datasets contained approximately 100,000 samples each, with 34 standardized features. These were combined with normal traffic data to construct a comprehensive multiclass dataset containing three classes: Normal, Krack, and Kr00k. Subsequently, these features are passed through the structured preprocessing pipeline depicted in Fig. 1. It comprises three core stages: data preprocessing, exploratory data analysis (EDA), and feature engineering and selection.

First, missing data were analyzed via percentage calculations and visualizations (e.g., heatmaps). Strategies such as imputation (e.g., filling with -1) or row removal were applied based on severity. Domain-specific transformations were performed: timestamps (e.g., frame.time) were converted into datetime objects and decomposed into hour, minute, and second features. Signal-strength columns such as ‘radiotap.dbm_antsignal’ were cleaned and averaged where applicable. Categorical features, such as MAC addresses and WLAN flags, were processed using LabelEncoder and one-hot encoding where necessary. Redundant features like radiotap.rxflags were removed. Exploratory data analysis was then used to investigate correlation matrices, distribution patterns, and feature-target relationships in order to uncover data quality issues and feature redundancy. Second, the dataset underwent extensive feature engineering and cleaning to enhance its quality and suitability for multiclass classification. Class distributions were analyzed to identify imbalances, and outliers were treated using the IQR method, focusing on features like ‘radiotap.dbm_antsignal’. Correlation analysis highlighted key predictors such as ‘hour’, ‘frame.time_relative’, and ‘radiotap.dbm_antsignal’, which showed significant relationships with the target variable. Labels were transformed into descriptive categories ("Normal", "Kr00k", "Krack") for better interpretability. To address multicollinearity, we calculated the Variance Inflation Factor (VIF) for each feature. Features with $\text {VIF} > 10$, such as frame.time_delta_displayed, were excluded. A Random Forest classifier was also employed to rank feature importance, and the top 10 features were selected for dimensionality reduction. Features like frame.len, radiotap.dbm_antsignal, and radiotap.channel.freq were evaluated for possible correlation and for their contribution to the reliability of the dataset.
For a closer look at the categorical values, the unique values of columns such as frame.encap_type, radiotap.channel.flags.cck, and wlan_radio.frequency were inspected to validate data types, ensuring that each column is classified as either numerical or categorical. The gain in data quality and interpretability achieved through multicollinearity removal and dataset refinement supports the training of strong ML models.
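The IQR-based outlier treatment mentioned above can be sketched as follows. The text does not say whether outliers were removed or capped, nor which fence multiplier was used, so this example assumes the common 1.5 × IQR fences with clipping; the function name and the signal values are hypothetical.

```python
import statistics


def iqr_clip(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences, assumed k = 1.5)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical radiotap.dbm_antsignal readings in dBm, with two extreme values.
signal_dbm = [-60, -62, -58, -61, -59, -63, -57, -20, -95]
clipped = iqr_clip(signal_dbm)
```

Clipping keeps the row count unchanged, which matters when samples per class are being balanced; dropping the rows instead would be an equally valid reading of "treated".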

Finally, to mitigate class imbalance, the dataset was initially balanced via undersampling, selecting 100,000 instances per class. During training, the Synthetic Minority Oversampling Technique (SMOTE) was applied to generate synthetic instances and ensure an even class distribution, and features were standardized using StandardScaler. The dataset was then split into 70% training and 30% testing sets. To further reduce noise and improve computational efficiency, we applied Principal Component Analysis (PCA), retaining enough components to preserve 90% of the variance:
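The undersampling and 70/30 split described above can be illustrated with a small, standard-library-only sketch. It is not the authors' code: the helper names, the fixed seed, and the toy label counts are hypothetical, the split is assumed to be stratified by class (the text does not say), and SMOTE is deliberately left out.

```python
import random
from collections import defaultdict


def undersample(labels, per_class, seed=0):
    """Return sorted indices keeping at most `per_class` samples of each label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, min(per_class, len(idxs))))
    return sorted(kept)


def stratified_split(indices, labels, train_frac=0.7, seed=0):
    """Split indices into train/test so each class keeps the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(train_frac * len(idxs))
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test


# Toy labels standing in for the merged dataset (the study uses 100,000 per class).
labels = ["Normal"] * 50 + ["Kr00k"] * 30 + ["Krack"] * 20
kept = undersample(labels, per_class=20)
train_idx, test_idx = stratified_split(kept, labels)
```

With 20 samples kept per class, each class contributes 14 training and 6 test indices, so the class balance achieved by undersampling carries over to both partitions.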

$$\begin{aligned} \textbf{z}_i = \text {PCA}_k(\textbf{x}_i), \quad \text {s.t. } \frac{\sum _{j=1}^{k} \lambda _j}{\sum _{j=1}^{d} \lambda _j} \ge 0.90 \end{aligned}$$

(1)

where $\lambda _j$ are the eigenvalues corresponding to the principal components. The resulting transformed dataset $\textbf{Z} = \{\textbf{z}_i\}_{i=1}^N$ served as input for model training in subsequent stages.
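The variance criterion in Eq. (1) amounts to picking the smallest k whose leading eigenvalues account for at least 90% of the total. A minimal sketch of that selection step is given below; the function name and the eigenvalues are hypothetical, and the eigendecomposition itself is assumed to have been done elsewhere.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest k such that the top-k eigenvalues explain >= threshold of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues of a feature covariance matrix (d = 6).
eigs = [4.0, 2.5, 1.5, 1.2, 0.5, 0.3]
k_90 = components_for_variance(eigs)
k_95 = components_for_variance(eigs, threshold=0.95)
```

For these illustrative values the cumulative ratios are 0.40, 0.65, 0.80, 0.92, and 0.97, so four components satisfy the 90% criterion and five are needed for 95%.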

The balanced class distribution after processing showed an equal number of samples across all classes. With the dataset processed and reduced to optimal dimensions, the resulting data is ready for effective model training and evaluation.

Overall model architecture

The growing sophistication of network attacks has rendered traditional intrusion detection models insufficient, especially when handling complex, high-dimensional traffic patterns and overlapping classes. Motivated by this challenge, our proposed methodology integrates a robust ensemble learning framework designed to reduce false positives, generalize across attack types, and boost classification reliability, as well as robustness to noise and variability in traffic. Specifically, we develop a two-tier stacked ensemble architecture that combines diverse learners and meta-level optimization to reinforce decision boundaries in intelligent network environments.

The overall architecture represents a systematic workflow for an ML classifier, starting with data preprocessing (cleaning, normalization, and feature removal), exploratory data analysis (EDA) to understand data distributions, and feature engineering and selection to optimize model input, as illustrated in Fig. 2. The data is then split into training, validation, and test sets to ensure robust model training and evaluation. The final step is to evaluate the models on test data and iteratively improve the methodology for better results.

ML Model Pipeline 1: baseline classifier

ML Model Pipeline 1 serves as the foundational stage in our approach, where individual machine learning classifiers are trained and evaluated independently to assess their standalone performance as illustrated in Fig. 3.

In ML Model Pipeline 1, each input vector $\textbf{x}_i$ is first normalized using a standard scaling transformation:

$$\begin{aligned} \textbf{x}_i' = \text {StandardScaler}(\textbf{x}_i) \end{aligned}$$

(2)

This normalization ensures that all features contribute equally by transforming them to have zero mean and unit variance.
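As an illustration of Eq. (2), the sketch below standardizes each feature column to zero mean and unit variance. It uses the population standard deviation and leaves constant columns at zero, which is the convention followed by scikit-learn's StandardScaler; the function name and the sample rows are hypothetical. In practice the means and deviations would normally be estimated on the training split only and reused for the test split, which is an assumption here rather than something the text states.

```python
import statistics


def standardize_columns(rows):
    """Scale each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero deviation; divide by 1.0 so it maps to 0.0.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical rows: frame length, signal strength (dBm), and a constant flag.
rows = [
    [1500.0, -60.0, 1.0],
    [500.0, -70.0, 1.0],
    [1000.0, -50.0, 1.0],
    [1000.0, -60.0, 1.0],
]
scaled = standardize_columns(rows)
```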

The normalized data $\textbf{x}_i'$ is then used to train five distinct classifiers independently: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), XGBoost, and Multi-Layer Perceptron (MLP). The input is processed with added noise and with PCA for dimensionality reduction. Gaussian noise is added to simulate real-world conditions, and hyperparameter tuning is performed using GridSearchCV for the Logistic Regression. Each model $h_j$ learns a function $h_j: \mathbb {R}^d \rightarrow \{0,1,2\}$, such that the predicted class for input $\textbf{x}_i$ is:

$$\begin{aligned} \hat{y}_i^{(j)} = h_j(\textbf{x}_i') \end{aligned}$$

(3)

The output of this pipeline includes model-specific performance metrics such as accuracy, precision, recall (TPR), F1-score, and AUC. Furthermore, a classification report is generated. Confusion matrices and learning curves are then plotted to visualize model performance, revealing how generalization behavior changes with training set size. This comprehensive approach identifies the most robust classifier for the noisy, imbalanced dataset. These results form the empirical basis for the design of Pipeline 2.

ML Model Pipeline 2: stacked ensemble learning

While Pipeline 1 establishes strong baselines using individual models, it also reveals limitations in terms of generalization and false positive rates. To address these issues, we propose ML Model Pipeline 2, a holistic and robust classification pipeline based on a stacked ensemble architecture. The structure of this pipeline is illustrated in Fig. 4. The pipeline begins with a preprocessing stage that standardizes the input features to have zero mean and unit variance. This pipeline enhances predictive reliability by integrating base learner outputs through a meta-classifier. Each input $\textbf{x}_i \in \mathbb {R}^d$ is first normalized and perturbed with Gaussian noise:

$$\begin{aligned} \tilde{\textbf{x}}_i = \text {StandardScaler}(\textbf{x}_i) + \epsilon , \quad \epsilon \sim \mathcal {N}(0, \sigma ^2) \end{aligned}$$

(4)

The noise component $\epsilon$ models real-world measurement noise and irregular traffic behavior.
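The perturbation in Eq. (4) can be sketched as adding an independent zero-mean Gaussian draw to every standardized feature value. The function name and the seed are hypothetical, and the default of 0.05 follows the value of $\sigma$ reported in the paper's noise-injection analysis; whether noise is applied to the training data only or to all data is not specified, so the sketch simply perturbs whatever rows it is given.

```python
import random


def add_gaussian_noise(rows, sigma=0.05, seed=42):
    """Return a copy of `rows` with N(0, sigma^2) noise added to each value."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in rows]


# Hypothetical standardized feature vectors.
x_scaled = [[0.5, -1.2, 0.0], [-0.3, 0.8, 1.1]]
x_noisy = add_gaussian_noise(x_scaled)
```

Because the features are already on a unit-variance scale, a standard deviation of 0.05 changes each value by a small fraction of its typical spread.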

Noise is injected so that the dataset imitates realistic, noisy operating conditions. Following normalization and noise injection, Principal Component Analysis (PCA) is employed to reduce dimensionality while retaining at least 90% of the original variance. This step is essential for handling high-dimensional network traffic data more efficiently and minimizing computational complexity. The resulting representation is

$$\begin{aligned} \textbf{z}_i = \text {PCA}_k(\tilde{\textbf{x}}_i) \in \mathbb {R}^k \end{aligned}$$

(5)

The vector $\textbf{z}_i$ serves as input to the base models. The pipeline integrates five diverse machine learning models as base classifiers: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and XGBoost. Each of these models is trained and validated using K-fold cross-validation to ensure robust and unbiased performance. Instead of directly producing class labels, each base model outputs a probability distribution over the target classes:

$$\begin{aligned} h_j(\textbf{z}_i) = \textbf{p}_i^{(j)} \in \mathbb {R}^3, \quad j = 1, \dots , 5 \end{aligned}$$

(6)

Each base model generates predictions that are then combined into meta-features, which will feed a final meta-model. The predictions from all base classifiers are concatenated into a meta-feature vector:

$$\begin{aligned} \textbf{q}_i = \left[ \textbf{p}_i^{(1)} \, \Vert \, \textbf{p}_i^{(2)} \, \Vert \, \dots \, \Vert \, \textbf{p}_i^{(5)} \right] \in \mathbb {R}^{15} \end{aligned}$$

(7)
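Eqs. (6) and (7) describe a simple concatenation: five probability vectors of length three become one 15-dimensional meta-feature vector. The sketch below shows only that step; the function name and the probability values are hypothetical, and the base models that would produce them are not included.

```python
def build_meta_features(prob_vectors, n_classes=3):
    """Concatenate per-model class-probability vectors into one meta-feature vector."""
    meta = []
    for probs in prob_vectors:
        if len(probs) != n_classes:
            raise ValueError("each base model must output one probability per class")
        meta.extend(probs)
    return meta


# Hypothetical outputs of the five base models for one sample,
# ordered as (Normal, Kr00k, Krack).
base_outputs = [
    [0.90, 0.07, 0.03],  # SVM
    [0.85, 0.10, 0.05],  # RF
    [0.80, 0.15, 0.05],  # KNN
    [0.92, 0.05, 0.03],  # MLP
    [0.88, 0.08, 0.04],  # XGBoost
]
q = build_meta_features(base_outputs)
```

In a stacked ensemble the probabilities used to train the meta-model are usually taken from held-out folds so that the meta-model does not see predictions made on data the base models were fitted to; that detail is an assumption here and is outside this sketch.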

The meta-feature vector $\textbf{q}_i$ is then passed to a final XGBoost classifier $H$, which is trained to learn from the aggregated decision space of the base learners and thereby exploits the strength of ensemble learning to improve accuracy:

$$\begin{aligned} \hat{y}_i = H(\textbf{q}_i) \end{aligned}$$

(8)

The ensemble strategy enables the pipeline to exploit complementary decision boundaries across models, significantly enhancing prediction accuracy and stability under noisy conditions. To further optimize model performance, hyperparameter tuning is conducted using RandomizedSearchCV, allowing systematic exploration of parameter spaces for both base and meta-classifiers.

Experimental results and analysis

This section presents a detailed evaluation of our proposed intrusion detection pipelines using the AWID3 dataset. We report the performance of individual machine learning classifiers (Pipeline 1), our noise-augmented PCA-stacked ensemble framework (Pipeline 2), and conduct a comparative analysis with state-of-the-art baseline methods. All metrics are averaged across 5 independent runs with stratified 5-fold cross-validation, and we report macro-level metrics to reflect performance across all classes fairly.

Experiments

Dataset The AWID3 dataset was curated to record and examine the signatures of various attacks in an IEEE 802.1X Extensible Authentication Protocol (EAP) environment, and it is publicly available. It is a milestone as the first dataset to examine the IEEE 802.11w standard, which is required for hardware certification under WPA3. The AWID dataset, from which AWID3 was built, has 254 features, of which 253 are general features and one is used for labeling. The dataset is offered in CSV format for simple access and interoperability with many different data analysis tools and methodologies. A thorough understanding of network activity and attack patterns is made possible by the extracted features, which cover both the MAC (media access control) layer and the application layer. The dataset consists of 36,913,503 instances: 30,387,099 of normal traffic and 6,526,404 of malicious traffic. Malicious traffic includes 13 types of attacks⁸, as shown in Table 1.

Table 1 Classification of malicious traffic.


Among the malicious traffic, there are 49,990 instances of the Krack attack and 186,173 instances of the Kr00k attack. Our experiments are conducted in two phases. In the first phase, individual ML algorithms perform multiclass classification (Krack, Kr00k, Normal). In the second phase, our proposed model is evaluated on the same three classes of Krack, Kr00k, and normal traffic.

To emphasize the significance of preprocessing the dataset before applying it to the proposed model, we initially used the raw sample without any preprocessing or feature selection. The first sample includes 106,971 Kr00k traffic instances and 106,791 normal traffic instances, while the second sample has 33,180 Krack traffic instances and 34,000 normal traffic instances, with 254 features for each sample.

Baselines Comparison We compare our proposed model against several strong IDS baselines. In ⁴⁹, the researchers employed a state-machine-based framework called kTRACKER to detect Krack attacks by observing multiple wireless channels. To precisely pinpoint Krack-related anomalies at different stages of a handshake process, they performed deep packet inspection and developed a clustering technique to categorize Wi-Fi handshake packets. Utilizing supervised gradient boosting models, their approach achieved an accuracy of approximately 93.39%, with a false positive rate of 5.08%. The study in ⁵⁰ introduced an unsupervised framework for classifying and mining Twitter data related to cybersecurity vulnerabilities, including the Kr00k attack, a flaw enabling unauthorized decryption in Wi-Fi chips. Their approach attained a maximum accuracy of 88.52%. The authors in ⁵¹ developed an intrusion detection model that applies ML classification with ANOVA feature selection to enhance the ensemble classifier's performance, achieving 90.7% accuracy for the multi-class classifier with three labels (Krack, Kr00k, and Normal). Chatzoglou et al.⁵² used deep and machine learning on the AWID3 dataset, achieving 96.7% accuracy in detecting application layer attacks by analyzing 802.11 and non-802.11 features. The authors in ⁵³ utilized Microsoft Azure for model training and achieved the highest accuracy of 93.87% using the Multi-class Neural Network, outperforming other models tested in the study. The highest results for SM-GBT⁵⁴ (Statistical Measures with Gradient Boosting Trees) were obtained using statistical feature extraction.

Evaluation Metrics Our model is evaluated using a comprehensive set of performance metrics including accuracy, precision, recall (TPR), F1-score, and FPR. To ensure interpretability and in-depth analysis, we also report confusion matrices.
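For reference, the macro-averaged versions of these metrics can be derived from a confusion matrix as sketched below. The function name and the counts are hypothetical, and macro F1 is computed here as the unweighted mean of per-class F1 scores, which is one common convention and is assumed rather than confirmed by the text.

```python
def macro_scores(confusion):
    """Accuracy plus macro-averaged precision, recall, and F1.

    confusion[t][p] is the number of samples with true class t predicted as p.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[t][c] for t in range(n))  # column sum
        actual = sum(confusion[c])                           # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": sum(confusion[c][c] for c in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical counts; rows and columns ordered Normal, Kr00k, Krack.
cm = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 4, 46],
]
scores = macro_scores(cm)
```

Macro averaging gives each class equal weight regardless of its sample count, which is why it is a natural choice when all three classes matter equally.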

Results and analysis

Performance of ML Model Pipeline 1 Pipeline 1 evaluates the performance of standalone classifiers trained using default hyperparameters with light tuning through grid search. Table 2 summarizes results for five commonly used models: SVM, Random Forest, XGBoost, KNN, and MLP. Among them, XGBoost and MLP achieved the highest performance, with accuracies of 94.98% and 95.31%, respectively.

Table 2 Performance of base learners (Pipeline 1) with Mean ± Std over 5-Fold cross-validation.


Although the performance of MLP and XGBoost was encouraging, the models still exhibit relatively high false positive rates (FPRs between 4% and 6%), which is suboptimal for real-time or high-security applications. Additionally, discrepancies between precision and recall indicate potential instability under class imbalance. These observations motivate the development of a more robust ensemble-based detection framework, as presented in Pipeline 2.

Performance of ML Model Pipeline 2 To overcome the limitations of Pipeline 1, Pipeline 2 introduces a three-stage enhancement: (i) noise injection for regularization and robustness, (ii) principal component analysis (PCA) for dimensionality reduction, and (iii) a stacked ensemble architecture to aggregate diverse classifiers via meta-learning. The final ensemble integrates probabilistic outputs from base learners (SVM, MLP, XGBoost) using logistic regression as the meta-learner.

Table 3 presents class-wise performance for the three classes (Normal, Kr00k, and KRACK) together with the average for our full model. The model achieves F1-scores above 0.97 in all cases, with false positive rates reduced to below 2.5%.

Table 3 Performance of stacked ensemble (Pipeline 2).


Compared to the best-performing single model (MLP, F1 = 0.9532), the ensemble improves the macro F1-score to 0.9786 and reduces the FPR by over 50%. These results validate the effectiveness of stacking and feature regularization under noisy, multiclass intrusion settings. The ensemble also demonstrated lower prediction variance across runs, suggesting enhanced generalization.

Additionally, we report the mean ± standard deviation over stratified 5-fold cross-validation for the final ensemble model of Pipeline 2. This validates the stability of our results, ensuring statistical robustness.

Meta-Classifier Selection Rationale The final stage of Pipeline 2 involves combining probabilistic outputs from base classifiers using a meta-learner. We selected XGBoost as the meta-classifier based on its theoretical robustness and empirical performance. XGBoost is a gradient boosting framework optimized for high-speed, parallelizable computation and includes both L1 and L2 regularization to reduce overfitting, an essential feature when reconciling outputs from diverse base learners such as SVM, Random Forest, KNN, MLP, and XGBoost itself. These base models introduce varied decision surfaces and possible inconsistencies, which XGBoost effectively mitigates by learning a nonlinear meta-decision boundary.

Moreover, XGBoost handles class imbalance through weighted loss functions, making it suitable for our intrusion detection task. During our ablation and cross-validation studies, it consistently outperformed other meta-classifier candidates, including Logistic Regression and SVM, both in detection accuracy and variance stability. Table 2 summarizes the mean and standard deviation of key metrics across 5-fold cross-validation, highlighting XGBoost’s high accuracy (0.9498), low deviation, and strong F1-score.

Additionally, as shown in Fig. 5, XGBoost achieved the highest ROC-AUC (0.98), reinforcing its effectiveness in enhancing separability across classes. These results, combined with its computational efficiency and low latency during inference, support our decision to use XGBoost as the meta-classifier for real-time, resource-constrained IDS deployment scenarios.

Comparative Analysis Against Baselines We compare the proposed model to existing state-of-the-art approaches for wireless intrusion detection published between 2021 and 2025. To ensure fairness and consistency, all baseline models were re-implemented and tested on the AWID3 dataset using the same preprocessing, feature selection, and evaluation procedures. Among them, Cyber-Sentinet⁴⁰ achieved the highest F1-score (0.990), reflecting excellent balance between precision and recall. CPS-IIoT-P2Attention⁴¹ attained the highest precision (0.982), minimizing false positives, while AttackNet³⁸ maintained high precision (0.979) and F1-score (0.960), though with a comparatively lower recall (0.943). CWFLAM-VAE⁴⁴, built on XGBoost, showed strong precision (0.975) but a reduced recall (0.911), impacting its overall detection performance.

In contrast, our ML Model Pipeline 2 achieved the highest overall performance across all key metrics, namely accuracy (0.98), recall (0.98), and F1-score (0.98), together with the lowest false positive rate (0.02), demonstrating its robustness and reliability in multiclass intrusion detection on wireless network data. Table 4 summarizes the results. Where external works reported only accuracy, we estimated other metrics (e.g., precision, recall, F1) based on their stated methodologies and typical performance trends on class-imbalanced data.

Table 4 Comparative performance of different methods on AWID3 dataset for KRACK and Kr00k multiclass detection.


Our method outperforms previous techniques across all key metrics, achieving a 1 to 3% improvement in F1-score over the closest baselines (e.g., SM-GBT), and reducing the false positive rate by half. This is attributed to our integration of noise-based regularization, PCA feature compression, and probabilistic ensemble learning. These results suggest the proposed model offers greater robustness to input perturbations, improved class discrimination, and suitability for deployment in real-time or sensitive network environments.

PCA Threshold Selection To justify our use of the 90% variance threshold for PCA, we conducted both visual and empirical evaluations. Figure 6 shows a scree plot of cumulative explained variance versus the number of components. The elbow point is observed around the 90% mark, indicating a natural tradeoff between information retention and dimensionality. To complement this, we conducted a PCA threshold sensitivity analysis, comparing performance at 85%, 90%, and 95% thresholds (Table 5). The 90% configuration retained 26 components and achieved the highest accuracy (0.9702), slightly outperforming 95% while avoiding unnecessary complexity. This threshold also aligns with the ablation study findings, where PCA contributed to boosting F1-score from 0.9603 (with noise only) to 0.9691. These results validate our choice of 90% variance as a reproducible and efficient PCA configuration.

Table 5 PCA threshold sensitivity analysis.


Impact of Noise Injection To further validate the role of noise injection in enhancing model robustness, we conducted a focused comparison between ensemble models trained with and without noise augmentation. Gaussian noise with $\sigma = 0.05$, i.e., $\epsilon \sim \mathcal {N}(0, 0.0025)$, was added during preprocessing to simulate signal perturbations such as jitter and interference, common in Wi-Fi environments. This value was empirically selected through grid search over $\sigma \in \{0.01, 0.03, 0.05, 0.07\}$, and was found to yield the best tradeoff between stability and performance. Noise injection serves as a regularization method to enhance model generalization to minor deviations in feature values, a common occurrence in real-time wireless traffic.

This regularization strategy aimed to increase the model’s tolerance to real-world variabilities in wireless traffic. Figure 5 illustrates the ROC curves for both settings. The Area Under the Curve (AUC) improved from 0.93 (without noise) to 0.98 (with noise), indicating significantly improved separability. This is supported by ablation results: the F1-score increased from 0.9497 to 0.9603, and FPR decreased from 4.46% to 3.49%. These enhancements confirm that noise injection effectively strengthens model generalization and boundary learning under noisy conditions.
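As a worked check on the ablation figures quoted above, the relative FPR reduction attributable to noise injection follows directly from the two reported values (4.46% and 3.49%); only the arithmetic is added here:

$$\begin{aligned} \frac{4.46 - 3.49}{4.46} = \frac{0.97}{4.46} \approx 0.217 \end{aligned}$$

That is, noise injection alone removes 0.97 percentage points of FPR, or roughly 22% of the noise-free value, alongside an absolute F1 gain of $0.9603 - 0.9497 = 0.0106$.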

Comparative Insights To better understand the gains from ensemble learning, Fig. 7 compares accuracy and FPR across both pipelines. Pipeline 1 models such as MLP and XGBoost achieve relatively high accuracy, but all exhibit FPRs between 0.04 and 0.07. By contrast, Pipeline 2 maintains accuracy in the 98% to 99% range and consistently lowers FPRs below 0.02 for all classes. This improvement can be attributed to the meta-classifier's ability to learn from the disagreement patterns of the base models, as well as the enhanced data representation obtained via noise injection and PCA.

Additionally, the confusion matrix in Fig. 8 reveals that most samples are correctly classified, with only a small number of misclassifications across classes. These results confirm that Pipeline 2 offers balanced performance across all classes, minimizing both false negatives and false positives, a critical property for real-time security systems.

To better contextualize the performance gains achieved by our proposed Pipeline 2, Table 6 highlights key limitations in previous IDS approaches and contrasts them with the improvements introduced in our work. This comparison spans critical aspects such as noise robustness, dimensionality reduction, ensemble learning strategies, and inference efficiency. The methodological enhancements directly align with our design objectives and support the empirical results presented in subsequent sections.

Table 6 Comparative analysis of limitations in prior methods and improvements introduced by the proposed pipeline.


Ablation study

To understand the contribution of each component in our proposed Pipeline 2, we conducted an ablation study with four progressively enhanced configurations: (i) base learners only, (ii) noise injection, (iii) PCA-based dimensionality reduction, and (iv) full stacked ensemble with probabilistic meta-learning.

Table 7 reports the results averaged over 5 runs using stratified 5-fold cross-validation.

Table 7 Ablation study on pipeline components.


The ablation results show that each component progressively contributes to performance improvement. Noise injection improves generalization by introducing variation during training. PCA reduces feature noise and redundancy, further enhancing precision and recall. Finally, the stacked ensemble delivers the most significant performance gains, reducing the FPR by over 56% compared to the base learners.

Runtime performance and deployability

To evaluate practical deployment feasibility, we profiled runtime performance on a standard mid-tier CPU setup (Intel Core i7, 32GB RAM). The complete training process for our stacked ensemble, including preprocessing, Gaussian noise injection, PCA, five base classifiers, and a meta-classifier, required approximately 20 minutes (1200 seconds). While training time is moderate, this step is conducted offline and does not affect real-time operation. Once trained, the model demonstrates efficient runtime behavior. The average inference time per sample is approximately 28 milliseconds, with a model size of 17MB, making it feasible for use in resource-constrained edge devices such as ARM Cortex-A processors or Raspberry Pi systems.
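To put the reported latency in throughput terms, the following back-of-the-envelope conversion assumes that samples are scored one at a time on the profiled CPU, with no batching or parallelism (an assumption; the text does not describe the inference mode):

$$\begin{aligned} \frac{1000\ \text {ms/s}}{28\ \text {ms/sample}} \approx 35.7\ \text {samples/s} \end{aligned}$$

Under that assumption, a single inference worker would keep pace with traffic of up to roughly 35 feature vectors per second; higher rates would require batching or additional workers.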

The use of PCA to reduce feature dimensionality, and the compact 15-dimensional meta-feature vector for the meta-classifier, ensures low computational load during inference. We also monitored CPU usage and observed minimal memory overhead during evaluation. These results support our claim that the proposed IDS is deployable in both edge and cloud environments depending on latency and throughput requirements.

Conclusion

This study explored the effectiveness of machine learning-based intrusion detection systems (IDS) for detecting complex wireless attacks, specifically KRACK and Kr00k, in IoT Wi-Fi environments. Using the AWID3 dataset and a robust preprocessing pipeline, we addressed key challenges such as noise variability, feature redundancy, and class imbalance, which often hinder real-world IDS deployment.

Through extensive experimentation, we demonstrated that our proposed stacked ensemble architecture (Pipeline 2) significantly outperforms individual classifiers on all major performance metrics. By integrating heterogeneous learners via meta-learning and combining this with noise injection and dimensionality reduction through Principal Component Analysis (PCA), the model achieved consistent improvements in accuracy, precision, recall, and particularly in reducing false positive rates (FPR), a critical metric for operational IDS performance. While slight degradations from ideal scores were observed due to stochastic elements like noise augmentation and cross-validation splits, the ensemble maintained superior generalization and robustness. The layered approach effectively leveraged the strengths of individual classifiers while mitigating their weaknesses, resulting in a scalable, high-performance detection system suitable for intelligent networks.

Overall, this work underlines the importance of combining preprocessing strategies (such as noise handling, feature selection, and PCA) with ensemble learning techniques to build reliable and adaptable intrusion detection models. The proposed methodology not only improves detection accuracy but also reduces computational overhead, making it suitable for deployment in resource-constrained IoT environments.

Future Work

Although we did not conduct GPU-based benchmarking, the model architecture is compatible with accelerated inference, for example GPU-backed execution through ONNX Runtime or CPU-parallel batch prediction with joblib, which could reduce inference time substantially (potentially to < 1 ms per sample in batch mode). We note this as a direction for future optimization. In addition, future studies should focus on deploying these ensemble pipelines in real-time IDS frameworks, testing their resilience under adversarial conditions, and adapting them to evolving threat landscapes. Incorporating domain adaptation, continual learning, or uncertainty-aware decision-making could further improve robustness in dynamic and heterogeneous IoT network environments. Future work may also evaluate the proposed approach on datasets such as CIC-IDS2018, IoT-23, and UNSW-NB15.
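
A per-sample latency target such as the one mentioned above can be checked with a simple timing harness. The sketch below is a minimal example using only the Python standard library. `dummy_predict` is a hypothetical stand-in for a trained model's batch prediction call, and the batch of 1000 samples with 15 features each is an arbitrary assumption.

```python
import time
from typing import Callable, List, Sequence

Batch = Sequence[Sequence[float]]


def per_sample_latency_ms(predict_batch: Callable[[Batch], List[int]],
                          batch: Batch,
                          repeats: int = 5) -> float:
    """Time predict_batch several times; return best run in ms per sample."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict_batch(batch)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0 / len(batch)


def dummy_predict(batch: Batch) -> List[int]:
    """Hypothetical stand-in model: labels a row 1 if its mean exceeds 0.5."""
    return [int(sum(row) / len(row) > 0.5) for row in batch]


# Arbitrary synthetic batch: 1000 samples, 15 features each.
batch = [[0.1 * (i % 10)] * 15 for i in range(1000)]
latency = per_sample_latency_ms(dummy_predict, batch)
```

Taking the best of several runs reduces the influence of scheduling noise. A real benchmark would replace `dummy_predict` with the deployed model and report the hardware used.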

Data availability

The AWID3 dataset used in this study is publicly available and can be accessed from the official website (https://icsdweb.aegean.gr/awid/awid3). Detailed information about the dataset can be found in the associated publication titled “Empirical Evaluation of Attacks Against IEEE 802.11 Enterprise Networks: The AWID3 Dataset”. Researchers are advised to review the terms of use on the dataset’s website before downloading and utilizing the data.

References

Nazir, A., He, J., Zhu, N., Anwar, M. S. & Pathan, M. S. Enhancing IoT security: a collaborative framework integrating federated learning, dense neural networks, and blockchain. Cluster Comput. 27(6), 8367–8392 (2024).
Khan, S. et al. Energy efficient task scheduling using fault tolerance technique for iot applications in fog computing environment. IEEE Internet Things J. (2024).
Salah, Z. & Abu Elsoud, E. Enhancing network security: A machine learning-based approach for detecting and mitigating krack and kr00k attacks in IEEE 802.11. Future Internet 15, 269 (2023).
Alraih, S. et al. Revolution or evolution? technical requirements and considerations towards 6g mobile communications. Sensors 22, 1–23 (2022).
Nandhini, P., Navaneetha Krishnan, V., Raguram, P. & Jebadurai, T. Enhancing network security for kr00k attack detection using binary grasshopper optimization (bgo). In Proceedings of the Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE) (2024).
Ahn, V. & Ma, M. A secure authentication protocol with performance enhancements for 4g lte/lte-a wireless networks. In Proceedings of the 2021 3rd International Electronics Communication Conference (IECC), 28–36 (Ho Chi Minh City, Vietnam, 2021).
Mohan, J., Sugunaraj, N. & Ranganathan, P. Cyber security threats for 5g networks. In Proceedings of the 2022 IEEE International Conference on Electro Information Technology (eIT), 446–454 (Mankato, MN, USA, 2022).
Chatzoglou, E., Kambourakis, G. & Kolias, C. Empirical evaluation of attacks against IEEE 802.11 enterprise networks: The awid3 dataset. IEEE Access 9, 34188–34202 (2021). Open access article under a Creative Commons Attribution 4.0 License. https://doi.org/10.1109/ACCESS.2021.3061609
Alperin, K., Joback, E., Shing, L. & Elkin, G. A framework for unsupervised classificiation and data mining of tweets about cyber vulnerabilities. CoRR abs/2104.11695 (2021). arXiv:2104.11695.
Kolias, C., Kambourakis, G., Stavrou, A. & Gritzalis, S. Intrusion detection in 802.11 networks: Empirical evaluation of threats and a public dataset. IEEE Commun. Sur. Tutor. 18, 184–208 (2015).
Nandanwar, H. & Katarya, R. Deep learning enabled intrusion detection system for industrial IoT environment. Exp. Syst. Appl. 249, 123808. https://doi.org/10.1016/j.eswa.2024.123808 (2024).
Abdulganiyu, O. H., Ait Tchakoucht, T., Alaoui, A. E. H. & Saheed, Y. K. Attention-driven multi-model architecture for unbalanced network traffic intrusion detection via extreme gradient boosting. Intell. Syst. Appl. 26, 200519. https://doi.org/10.1016/j.iswa.2025.200519 (2025).
Adeyiola, A. Q., Saheed, Y. K., Misra, S. & Chockalingam, S. Metaheuristic firefly and c5.0 algorithms based intrusion detection for critical infrastructures. In 2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI), 1–7 (2023).
Prabha, P., Arjun, N., Gogul, J. & Prasanth, S. Two-way economical smart device control and power consumption prediction system. In Proceedings of the International Conference on Recent Trends in Computing, 415–429 (Ghaziabad, India, 2022).
Borgaonkar, R., Tøndel, I., Degefa, M. & Jaatun, M. Improving smart grid security through 5g enabled IoT and edge computing. Concurr. Comput.: Pract. Exp. 33, 1–15 (2021).
Park, J. et al. A comprehensive survey on core technologies and services for 5g security: Taxonomies, issues, and solutions. Human-Centric Comput. Inf. Sci. 11, 1–30 (2021).
Gonzalez, A., Grønsund, P., Dimitriadis, A. & Reshytnik, D. Information security in a 5g facility: An implementation experience. In Proceedings of the 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), 425–430 (Porto, Portugal, 2021).
Garika, G. S. & Kottala, P. IoT supervised pv-hvdc combined wide area power network security scheme using wavelet-neuro analysis. Adv. Electric. Electron. Eng. 20, 560–571 (2023).
Kumar, A., Shridhar, M., Swaminathan, S. & Lim, T. Machine learning-based early detection of IoT botnets using network-edge traffic. Comput. Secur. 117, 1–10 (2022).
Zhao, D., Shen, P. & Zeng, S. Alsnap: Attention-based long and short-period network security situation prediction. Ad Hoc Netw. 150, 103279. https://doi.org/10.1016/j.adhoc.2023 (2023).
Thing, V. Ieee 802.11 network anomaly detection and attack classification: A deep learning approach. In 2017 IEEE Wireless Communications and Networking Conference (WCNC), 1–6 (2017).
Agarwal, M. Detecting flooding, impersonation and injection attacks on awid dataset using ml-based methods. In 2022 IEEE 4th International Conference on Cybernetics, Cognition and Machine Learning Applications (ICCCMLA), 221–226 (2022).
Wang, S., Li, B., Yang, M. & Yan, Z. Intrusion detection for wifi network: A deep learning approach. In International Wireless Internet Conference, 95–104 (Springer International Publishing, Cham, 2018).
Kubota, K., Shiraishi, Y. & Morii, M. Evaluation experiments of kr00k under real environment and its improvement proposal. In Proceedings of the Computer Security Symposium, 820–825 (2020).
Abdalgawad, N., Sajun, A., Kaddoura, Y., Zualkernan, I. & Aloul, F. Generative deep learning to detect cyberattacks for the IoT-23 dataset. IEEE Access 10, 6430–6441 (2021).
Cermak, M., Svorencik, S. & Lipovsky, R. Kr00k - cve-2019-15126: Serious vulnerability deep inside your wi-fi encryption. Technical Report (2020).
Nakajima, S., Inoue, T., Shiraishi, Y. & Morii, M. Attack techniques and countermeasures against kr00k using csa. In 2022 Tenth International Symposium on Computing and Networking (CANDAR) (2022).
Kubota, K., Isobe, T. & Morii, M. Implementation and evaluation of dos attacks on wireless lan devices. In Proceedings of the Computer Security Symposium (2019).
Könings, B., Schaub, F., Kargl, F. & Dietzel, S. Channel switch and quiet attack: New dos attacks exploiting the 802.11 standard. In IEEE 34th Conference on Local Computer Networks, 14–21 (2009).
Louca, C., Peratikou, A. & Stavrou, S. 802.11 man-in-the-middle attack using channel switch announcement. In Selected Papers from the 12th International Networking Conference, 62–70 (2021).
Torres, A. Wifi anomaly behavior analysis based intrusion detection using online learning. In International Telemetering Conference Proceedings, vol. 56 (2021).
Ran, J., Ji, Y. & Tang, B. A semi-supervised learning approach to ieee 802.11 network anomaly detection. In IEEE Vehicular Technology Conference (VTC), https://doi.org/10.1109/VTC-Spring.2019.8746576 (2019).
Duan, Q., Wei, X., Fan, J., Yu, L. & Hu, Y. Cnn-based intrusion classification for ieee 802.11 wireless networks. In 2021 IEEE International Conference on Communications (ICC), 830–833, https://doi.org/10.1109/iccc51575.2020.9345293 (2021).
Qin, Y., Li, B., Yang, M. & Yan, Z. Attack detection for wireless enterprise network: A machine learning approach. In 2018 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), https://doi.org/10.1109/ICSPCC.2018.8567797 (2018).
Abdulhammed, R., Faezipour, M., Abuzneid, A. & Alessa, A. Effective features selection and machine learning classifiers for improved wireless intrusion detection. In International Symposium on Networks, Computers and Communications (ISNCC), https://doi.org/10.1109/ISNCC.2018.8530969 (2018).
Vaca, F. & Niyaz, Q. An ensemble learning based wi-fi network intrusion detection system (wnids). In NCA 2018 - IEEE 17th International Symposium on Network Computing and Applications, https://doi.org/10.1109/NCA.2018.8548315 (2018).
Feng, G., Li, B., Yang, M. & Yan, Z. V-cnn: Data visualizing based convolutional neural network. In 2018 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), https://doi.org/10.1109/ICSPCC.2018.8567781 (2018).
Nandanwar, H. & Katarya, R. Deep learning enabled intrusion detection system for industrial IoT environment. Exp. Syst. Appl. 249, 123808. https://doi.org/10.1016/j.eswa.2024.123808 (2024).
Nandanwar, H. & Katarya, R. TL-BILSTM IoT: transfer learning model for prediction of intrusion detection system in IoT environment. Int. J. Inf. Secur. 23, 1251–1277. https://doi.org/10.1007/s10207-023-00787-8 (2024).
Nandanwar, H. & Katarya, R. Securing industry 5.0: An explainable deep learning model for intrusion detection in cyber-physical systems. Comput. Electric. Eng. 123, 110161. https://doi.org/10.1016/j.compeleceng.2025.110161 (2025).
Kayode Saheed, Y. & Ebere Chukwuere, J. Cps-iiot-p2attention: Explainable privacy-preserving with scaled dot-product attention in cyber-physical system-industrial IoT network. IEEE Access 13, 81118–81142 (2025).
Saheed, Y. K. & Misra, S. Cps-iot-ppdnn: A new explainable privacy preserving dnn for resilient anomaly detection in cyber-physical systems-enabled iot networks. Chaos, Solitons Fractals 191, 115939. https://doi.org/10.1016/j.chaos.2024.115939 (2025).
Kayode Saheed, Y., Harazeem Abdulganiyu, O. & Ait Tchakoucht, T. A novel hybrid ensemble learning for anomaly detection in industrial sensor networks and scada systems for smart city infrastructures. J. King Saud Univ. Comput. Inf. Sci. 35, 101532. https://doi.org/10.1016/j.jksuci.2023.03.010 (2023).
Abdulganiyu, O. H., Ait Tchakoucht, T., Alaoui, A. E. H. & Saheed, Y. K. Attention-driven multi-model architecture for unbalanced network traffic intrusion detection via extreme gradient boosting. Intell. Syst. Appl. 26, 200519. https://doi.org/10.1016/j.iswa.2025.200519 (2025).
Adeyiola, A. Q., Saheed, Y. K., Misra, S. & Chockalingam, S. Metaheuristic firefly and c5.0 algorithms based intrusion detection for critical infrastructures. In 2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI), 1–7 (2023).
Kauhsik, B., Nandanwar, H. & Katarya, R. Iot security: A deep learning-based approach for intrusion detection and prevention. In 2023 International Conference on Evolutionary Algorithms and Soft Computing Techniques (EASCT), 1–7 (2023).
Nandanwar, H. & Katarya, R. A systematic literature review: Approach toward blockchain future research trends. In 2023 International Conference on Device Intelligence, Computing and Communication Technologies (DICCT), 259–264 (2023).
Saheed, Y. K., Misra, S. & Chockalingam, S. Autoencoder via dcnn and lstm models for intrusion detection in industrial control systems of critical infrastructures. In 2023 IEEE/ACM 4th International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS), 9–16, https://doi.org/10.1109/EnCyCriS59249.2023.00006 (2023).
Agrawal, A., Chatterjee, U. & Maiti, R.R. ktracker: Passively tracking krack using ml model. In Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, CODASPY ’22, 364–366 (Association for Computing Machinery, New York, NY, USA, 2022).
Alperin, K., Joback, E., Shing, L. & Elkin, G. A framework for unsupervised classificiation and data mining of tweets about cyber vulnerabilities. CoRR abs/2104.11695, arXiv:2104.11695. https://doi.org/10.48550/arXiv.2104.11695 (2021).
Salah, Z. & Abu Elsoud, E. Enhancing network security: A machine learning-based approach for detecting and mitigating krack and kr00k attacks in IEEE 802.11. Future Internet 15, 269. https://doi.org/10.3390/fi15080269 (2023).
Chatzoglou, E., Kambourakis, G., Smiliotopoulos, C. & Kolias, C. Best of both worlds: Detecting application layer attacks through 802.11 and non-802.11 features. Sensors 22, 2633 (2022).
Mughaid, A. et al. Correction to: Improved dropping attacks detecting system in 5g networks using machine learning and deep learning approaches. Multimed. Tools Appl. 82, 13997–13998. https://doi.org/10.1007/s11042-022-14059-5 (2022).
Şolpan, Ş., Gündüz, H. & Küçük, K. Wi-fi network intrusion detection: Enhanced with feature extraction and machine learning algorithms. In 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP), 1–7, https://doi.org/10.1109/IDAP64064.2024.10710685 (2024).

Acknowledgements

The authors would like to thank the anonymous reviewers and the editors of the journal, whose constructive comments have improved the quality of this paper.

Author information

Md Minhazul Islam Munna and Md Mahbubur Rahman have contributed equally to this work.

Authors and Affiliations

Department of Computer Science and Technology, Beijing Institute of Technology, 5 Zhongguancun South Street, Beijing, 100081, China
Md Minhazul Islam Munna & Md Mahbubur Rahman
Department of Quantitative Methods and Economic Informatics, Faculty of Operation and Economics of Transport and Communication, University of Zilina, 01026, Zilina, Slovakia
Jaroslav Frnda
Department of Telecommunications, Faculty of Electrical Engineering and Computer Science, VSB-Technical University of Ostrava, 70800, Ostrava, Czech Republic
Jaroslav Frnda
Department of AI and Software, Gachon University, Seongnam-si, 13120, South Korea
Muhammad Shahid Anwar
Department of Applied Informatics, Kimyo International University, Tashkent, Uzbekistan
Alpamis Kutlimuratov

Authors

Md Minhazul Islam Munna
Md Mahbubur Rahman
Jaroslav Frnda
Muhammad Shahid Anwar
Alpamis Kutlimuratov

Contributions

Md Minhazul Islam Munna, Mahbubur Rahman; conceptualization and methodology: Md Minhazul Islam Munna, Mahbubur Rahman, Jaroslav Frnda; data collection: Md Minhazul Islam Munna, Mahbubur Rahman; analysis and interpretation of results: Md Minhazul Islam Munna, Mahbubur Rahman, Jaroslav Frnda, Muhammad Shahid Anwar; draft manuscript preparation: Md Minhazul Islam Munna, Mahbubur Rahman, Muhammad Shahid Anwar; review and editing: Md Minhazul Islam Munna, Mahbubur Rahman, Jaroslav Frnda, Muhammad Shahid Anwar, Alpamis Kutlimuratov; project administration: Muhammad Shahid Anwar. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Muhammad Shahid Anwar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Munna, M.M.I., Rahman, M.M., Frnda, J. et al. Elevating intrusion detection and security fortification in intelligent networks through cutting-edge machine learning paradigms. Sci Rep 15, 39989 (2025). https://doi.org/10.1038/s41598-025-23754-w

Received: 11 December 2024
Accepted: 08 October 2025
Published: 14 November 2025
Version of record: 14 November 2025
DOI: https://doi.org/10.1038/s41598-025-23754-w