Introduction

The exponential growth of Internet of Things (IoT) devices across various sectors, including smart cities, industrial automation, and healthcare, has resulted in the generation of massive volumes of real-time, heterogeneous data. While this evolution enhances connectivity and automation, it also expands the attack surface for cyber threats targeting resource-constrained IoT endpoints and communication protocols. Traditional intrusion detection systems (IDS), primarily based on signature or rule-based mechanisms, have become insufficient in detecting novel or evolving threats within such dynamic environments. Consequently, data-driven and learning-based security analytics have gained traction for their ability to detect complex and previously unseen patterns.

Recent literature has explored the application of machine learning (ML) and deep learning (DL) models to intrusion detection1,2, with notable success in static and structured environments. However, many existing models exhibit limitations in scalability, interpretability, and adaptability to the large-scale IoT deployments that generate big data3,4. Moreover, classical ML models struggle with high-dimensional data, while deep architectures, though powerful, often operate as black boxes. There remains a lack of integrated frameworks that combine big data infrastructure, optimized deep learning, and explainable AI to deliver a robust and interpretable solution for IoT cyberattack detection.

Although individual techniques have proven effective at identifying malicious IoT traffic in specific domains, the limitations outlined above underscore the need for a unified IDS that can analyze high-volume, heterogeneous IoT traffic while providing acceptable scalability and interpretability. Flexibility to adapt to dynamic threats, real-time big data analytics, and transparency through reproducible decision-making for system administrators are the key requirements of real-world IoT ecosystems.

To address these challenges, this research proposes CyberDetect-MLP, a Big Data-enabled Optimized Deep Learning Framework designed for scalable cyberattack detection in IoT environments. The primary objective is to develop an accurate, scalable, and explainable intrusion detection system (IDS) utilizing distributed data processing (Apache Spark), a custom-tuned multi-layer perceptron (MLP) architecture, and Grad-CAM-based interpretability. Key novelties include the integration of distributed preprocessing via Spark MLlib, optimized MLP configuration tailored for high-dimensional IoT data, and the incorporation of heatmap-based explainability to enhance model transparency.

In contrast to conventional big data, IoT big data consists of multi-source, highly heterogeneous telemetry (sensors, devices, logs) and high-velocity streams subject to concept drift over time, along with strict edge/fog processing constraints and even stronger privacy and security demands. These characteristics necessitate an architecture that supports distributed real-time analytics, adapts as traffic patterns change over time, and explains its predictions to comply with regulations and establish trust.

Although many IDS frameworks for IoT security have been presented in previous works, most of them either use standard ML/DL models that lack the interpretability features necessary for real deployment, or disregard the big data scalability required for large-scale IoT applications. In contrast to existing work, we present an integrated big data pipeline (Kafka/Flume–HDFS–Spark) along with an optimized multi-layer perceptron (MLP) architecture tailored to high-dimensional IoT telemetry. Moreover, Mutual Information–based feature selection is applied to minimize computational costs, while explainable AI (Grad-CAM and SHAP) is used to make predictions transparent. The simultaneous consideration of scalability, interpretability, and systematic assessment on the complete TON_IoT dataset distinguishes our work. The main contributions of this work can be summarized as follows:

  • CyberDetect-MLP Framework Development: A big data–enabled intrusion detection framework comprising a seamless integration of Apache Kafka/Flume for scalable data ingestion, Hadoop HDFS for distributed storage, and Apache Spark for parallel preprocessing and feature engineering.

  • Fine-grained Deep Learning Model: A custom eight-layer multilayer perceptron (MLP) with batch normalization, dropout regularization, and a cosine-annealing learning rate scheduler, tailored to high-dimensional IoT telemetry data.

  • Feature Selection for Efficiency: Mutual Information-based feature selection reduces data dimensionality while preserving the most informative attributes, improving both detection performance and training efficiency.

  • Explainable AI Integration: Grad-CAM and SHAP-based interpretability modules are layered on top of the classifier so that administrators obtain a transparent view of why a given prediction was made, instilling trust in model outputs.

  • Extensive Empirical Validation: We perform extensive experiments on the TON_IoT dataset, obtaining an accuracy of 98.87% and outperforming multiple baseline models (Random Forest, XGBoost, and a vanilla MLP), supported by extensive ablation studies and explainability analyses.

The remainder of this paper is structured as follows: Sect. 2 reviews the relevant literature across machine learning (ML), deep learning (DL), and big data frameworks for intrusion detection systems (IDS). Section 3 outlines the proposed methodology, which includes system design, dataset preparation, model implementation, and deployment strategy. Section 4 presents experimental results, comparisons, and visualization-based analyses. Section 5 discusses key insights and limitations of the study. Finally, Sect. 6 concludes the paper and outlines directions for future work.

Related work

This section reviews existing approaches in cyberattack detection for IoT environments, focusing on traditional machine learning models, deep learning architectures, and big data-enabled intrusion detection frameworks. The review identifies performance limitations, lack of scalability, and poor interpretability in current methods, thereby establishing the research gap that motivates the development of the proposed CyberDetect-MLP framework.

Machine learning and deep learning approaches for cyberattack detection

Extensive research has explored classical machine learning (ML) and advanced deep learning (DL) techniques for detecting cyberattacks across diverse networks. These methods enhance intrusion detection by learning patterns from labeled traffic data using algorithms such as SVM, Random Forest, k-NN, CNNs, RNNs, LSTM, and hybrid models. Krishnashree Achuthan et al.1 examined the integration of cybersecurity and environmental sustainability, highlighting important topics such as the application of blockchain in smart grids and the use of artificial intelligence in threat detection. A limitation of the reviewed literature is its narrow scope; broader SDG comparisons are suggested for future research.

Abid Mohamed Nadhir et al.2 present a Reinforcement Learning-based cybersecurity strategy for healthcare IoT that utilizes PPO and A2C models; PPO obtained lower rewards than A2C. One of CyberBattleSim’s limitations is its inability to replicate real-world intricacies, and improvements to real-world modeling were proposed for future development. Tin Lai et al.3 present an ensemble machine learning method for IoT cybersecurity, which utilizes Bayesian optimization to detect anomalies. It demonstrates the effectiveness of tree-based models, such as XGB and GBM. In the future, various protocols and privacy-preserving strategies will be investigated.

Sankaramoorthy Muthubalaji et al.4 introduce the AEFS-KENN big data AI system for intelligent grid intrusion detection, which achieves 99.5% accuracy, in this paper. While efficient, its security reach is constrained; cryptographic integration is a future development. Noha Hussen et al.5 introduced FSBDL, a real-time intrusion detection system that achieves 99.93% accuracy by utilizing an optimized CNN with the Adam and RMSprop optimization algorithms. It is effective but has limited explainability; interpretability will be investigated in a subsequent study.

Faheem Ahmad et al.6 examined the dual function of big data in cybersecurity, as both a target and a tool, emphasizing both machine learning and encryption. The study notes present issues and trends and projects future R&D expansion. Vinden Wylde et al.7 focus on data privacy and legal concerns, highlighting the challenges in implementing blockchain technology for cybersecurity. The framework they suggest combines visualization, machine learning, and big data to improve security, with the primary goal of enhancing transparency and confidence in digital systems. SMEs face difficulties in adopting blockchain in real time due to its network latency. To encourage wider usage, further work will address these latency difficulties and improve legal frameworks.

George Bravos et al.8 presented a CNN-LSTM hybrid model for IoT intrusion detection, which achieved an accuracy of 99.52% on the CICIDS2017 dataset. Although it excels at real-time detection, it needs to be enhanced for more comprehensive protection against diverse and sophisticated attacks. Iqbal H. Sarker9 offered a CNN-LSTM hybrid model that achieves 99.52% accuracy on CICIDS2017 for real-time IoT intrusion detection. Although it has high accuracy, it requires more testing and a substantial amount of training data. Mohamed Amine Ferrag et al.10 compared RNN, CNN, and DNN on actual datasets to examine federated deep learning for IoT cybersecurity. The work highlights vulnerabilities, demonstrates increased accuracy and privacy, and makes recommendations for future security improvements.

Mahmoud Elsisi et al.11 propose a unique CPS-based IoT architecture for precise defect and cyberattack detection, featuring clear visualization and utilizing XGBoost for real-time GIS monitoring. Future work will encompass larger power system applications. Mrutyunjaya Panda et al.12 proposed an accurate, low-complexity machine learning and deep learning method for identifying IoT botnets. Future research will combine blockchain technology with edge and cloud computing to enable real-time detection and monitoring. Nighat Usman et al.13 presented a hybrid strategy for predicting IP reputation that combines data forensics, machine learning, and dynamic malware analysis. It enhances detection, reduces false alarms, and recommends future integration of firewall and antivirus systems. Aymen Yahyaoui et al.14 described a real-time intrusion detection system for IoT networks that uses Spark Streaming and Apache Flink to achieve excellent accuracy and speed. Other engines and anomaly detection types will be investigated in future development. Shahid Latif et al.15 presented DnRaNN, which achieves 99.14% accuracy in binary and 99.05% accuracy in multiclass situations for IoT intrusion detection. In the future, FPGA integration will be used to improve hardware performance.

Jingyu Liu et al.16 suggested PSO-LightGBM for IoT intrusion detection, which increases detection rates for attacks with small sample sizes (such as shellcode and backdoor). Future research will incorporate multi-task learning to improve performance. Nivedita Mishra and Sharnil Pandya17 examined IoT security with an emphasis on DDoS attacks, intrusion detection systems, and machine learning and deep learning-based mitigation strategies. The goal of future research is to create a reliable, all-purpose intrusion detection system. João Vitorino et al.18 examine supervised, unsupervised, and reinforcement learning approaches for IoT intrusion detection using the IoT-23 dataset. The best-performing model was LightGBM, and future research will combine models to improve detection.

Mohanad Sarhan et al.19 utilize three datasets in their evaluation of the NetFlow and CICFlowMeter feature sets for machine learning-based intrusion detection. NetFlow demonstrated better detection precision. The model’s explainability was assessed using the SHAP method. More extensive evaluations will be part of future work. Joseph Bamidele Awotunde and Sanjay Misra20 proposed a hybrid AI model that achieves a 99.75% detection rate and 99.45% accuracy for IoT intrusion detection. It outperforms current models and provides recommendations for future advancements that utilize genetic algorithms for feature selection.

Souradip Roy et al.21 presented a lightweight machine learning (ML)-based intrusion detection model (B-Stacking) that is tailored for the Internet of Things. It has been evaluated on CICIDS2017 and NSL-KDD, demonstrating high accuracy, minimal false alarms, and outperforming other models. Adeel Abbas et al.22 presented an ensemble intrusion detection system (IDS) that uses logistic regression, naive Bayes, and decision trees with hard voting. It has been evaluated on CICIDS2017 and has demonstrated good accuracy, a low false alarm rate, and resource efficiency. Although these ML/DL methods achieve strong detection rates, most are evaluated on small or dated datasets and omit the components needed to classify high-velocity IoT traffic or to provide interpretability. In this work, we address these challenges by integrating a big data pipeline with an explainable, optimized MLP architecture.

IoT-specific intrusion detection techniques

With the proliferation of IoT devices, specialized IDS solutions have emerged to handle resource constraints, protocol diversity, and limited training data. These include lightweight models, flow-based detection, federated learning, and anomaly detection optimized for edge computing. Vinayakumar Ravi et al.23 introduced a GRU-based deep learning intrusion detection system for SDN-IoT that uses kernel-PCA and feature fusion to achieve improved accuracy and generalizability. However, it requires adversarial robustness analysis and optimization. Ajay Kumar et al.24 introduced a network-based intrusion prevention system (NBIPS) for Internet of Things (IoT) security, which has been validated using signature-based detection. It provides robust protection but requires further development because it is not scalable or AI-based. Yakub Kayode Saheed et al.25 proposed an ML-IDS for IoT that achieves 99.9% accuracy by utilizing feature scaling, principal component analysis (PCA), and supervised learning. Although it has scalability issues, it produces good results; later research looks at ensemble approaches.

Pampapathi B M et al.26 recommended a Filtered Deep Learning Model for IDS in IoT, which outperforms DLNN/ANN on the TON_IoT dataset with 96.12% accuracy. It utilizes MQTT/AMQP and LFEHO for data brokering; protocol and algorithm improvements are suggested for future work. P. L. S. Jayalaxmi et al.27 examined IDS/IPS models with ML/DL, presenting a new risk factor mapping framework that contrasts with current approaches, identifies advantages and disadvantages, and suggests future lines of inquiry for improved IoT security.

Muhammad Asif et al.28 introduced MR-IMID, a MapReduce-ML model that uses artificial neural networks (ANN) for real-time IoT intrusion detection. It has a 97.6% accuracy rate, is scalable, has fewer false alarms, and has room for improvement. Mouaad Mohy-Eddine et al.29 proposed a hybrid intrusion detection system (IDS) for the Industrial Internet of Things (IIoT) that utilizes RF with PCC and IF for feature selection and outlier elimination. It achieves up to 99.99% accuracy and plans to validate its performance in a broader dataset. Shiyu Wang et al.30 introduced Res-TranBiLSTM, a spatiotemporal deep learning intrusion detection system that utilizes ResNet, Transformer, and BiLSTM to achieve an accuracy of up to 99.56%. The goals of future work include unsupervised learning and real-world deployment.

Fuhua Huo31 presented an IoT-based cloud intrusion detection system that uses Mings distance and data clustering to achieve excellent accuracy with less than 0.3% of detections missed; further research could enhance real-time adaptability. Alireza Zohourian et al.32 introduce IoT-PRIDS, a lightweight, non-ML host-based intrusion detection system that utilizes packet representations. It has achieved good CICIoT2023 results with few false positives; further research aims to enable dynamic device adaptation. Mohanad Sarhan et al.33 assessed PCA, AE, and LDA using six machine learning models on three NIDS datasets, demonstrating the necessity for a broad baseline feature set and finding that no single optimal combination exists.

Ajay Kaushik and Hamed Al-Raweshidy34 presented TLBO-IDS, an IoT-optimized low-overhead intrusion detection system that outperforms GA and bat methods by as much as 40% on UNSW-NB15; future research attempts to incorporate encryption standards. Amritpal Singh et al.35 introduced SecureFlow, a dual-layer IDS and dynamic rule configuration system for IoT based on Software-Defined Networking (SDN), which exhibits good emulation results. Future research will focus on the real-time deployment and broader validation of attacks.

Ibrahim A. Fares et al.36 presented the TFKAN Transformer for IoT IDS, utilizing Kolmogorov-Arnold Networks, which achieves up to 99.96% accuracy with 78% fewer parameters. Real-time application and training improvement are the goals of future work. Mahmoud Ragab et al.37 introduced NGCAD-EDLM, an ensemble CNN-DBN model for IIoT cybersecurity that utilizes HBA and LEOA, achieving 99.21% accuracy. Future work aims to enhance adaptability and reduce complexity.

While recent IoT-oriented IDS solutions have explored fog architectures, threat prediction using LLMs, and early botnet detection, they operate in isolated, non-scalable settings and do not combine big data analytics with transparent model explanations. CyberDetect-MLP bridges this gap with a single Spark-enabled backend architecture coupled with XAI modules.

Big data frameworks for security analytics

Big data technologies such as Hadoop and Spark have been adopted to process high-velocity security data. These frameworks support scalable data ingestion, transformation, and distributed model training, enabling real-time and batch analytics over massive IoT traffic. Rania Aboalela et al.38 present HFPODL-DDoSCD, a deep learning-based IoT DDoS detection model with 99.52% accuracy, utilizing STO and APO optimizations. The goals of future research include real-time edge deployment and broader generalization. Olanrewaju-George et al.39 introduce FL-trained supervised and unsupervised DL models for IoT IDS, demonstrating that AE-FL performs best on the N-BaIoT dataset, offering privacy advantages but being constrained by device-specific tuning requirements. Hui Chen et al.40 propose the SICNN model for IoT intrusion detection to improve performance in dynamic contexts, with class imbalance handling and quantization; it performs better than existing models when tested on the CIC-IDS 2017 and CICIoT 2023 datasets.

Vanlalruata Hnamte and Jamal Hussain41 proposed a hybrid DCNNBiLSTM-based intrusion detection model that achieves high accuracy by utilizing the Edge_IIoT and CICIDS2018 datasets. Although it takes longer to train than single models, it performs better; real-time deployment will be a part of future work. Mimouna Abdullah Alkhonain et al.42 presented the SPOHDL-ID model, which combines Blockchain, Sandpiper Optimizer, CNN-SAE, and BSOA for IoT intrusion detection. It has been tested on ToN-IoT and CICIDS-2017, achieving an accuracy of approximately 99.5%; however, scalability and resource requirements remain issues. C. Rajathi and P. Rukmani43 introduced a Hybrid Learning Model that combines parametric and non-parametric classifiers through stacking. It has been tested on NSL-KDD, UNSW-NB15, and CICIDS2017, achieving an accuracy of up to 99.98%. Its benefits include accuracy and robustness, but its drawbacks include complexity and sensitivity to data quality. Future research aims to reduce complexity and improve threat response.

Stefanos Tsimenidis et al.44 examined deep learning models for IoT intrusion detection, emphasizing their effectiveness compared to more conventional techniques. Benefits include accuracy and adaptability, but drawbacks include computational demands and data scarcity. Future research proposes effective, decentralized, unsupervised IDS solutions. Minh-Quang Tran et al.45 introduced an IoT-DNN system for real-time CNC machine monitoring that achieves precise cutting condition categorization and cyberattack detection. Although it has a high degree of reliability, it also presents complexity and environmental issues, and may eventually be utilized in larger innovative systems.

Mohamed S. Abdalzaher et al.46 examined the roles of machine learning and the Internet of Things in intelligent systems, presenting a taxonomy of machine learning models, assessing security methods, and examining two case studies (Smart Campus and EWS) before suggesting future research avenues. However, it has drawbacks, such as being too general. Martin Manuel Lopez et al.47 described an EVL-based IoT intrusion detection system that uses SCARGC to manage concept drift. It has been tested on BotIoT and ToNIoT datasets and has demonstrated good accuracy; nevertheless, it has limitations, such as scalability, and further research is needed to integrate EVL fully.

Vanlalruata Hnamte et al.48 present a two-stage LSTM-AE hybrid IDS that achieves an accuracy of up to 99.99% when tested on the CICIDS2017 and CSE-CICIDS2018 datasets. Its benefits include precision and robustness, but its drawbacks include a lengthy training period. Future research will examine attention mechanisms, transfer learning, and ensemble approaches. Jiawei Du et al.49 present NIDS-CNNLSTM, a deep learning model that combines CNN and LSTM for IIoT intrusion detection. It has been tested on KDDCUP99, NSL-KDD, and UNSW-NB15 datasets and has demonstrated good accuracy and minimal false alarms. Future work aims to address data imbalance and improve small-sample performance.

Federated and collaborative IDS approaches offer privacy and communication efficiency, but at the cost of distributed big data processing and gradient-based explanation mechanisms. Our framework builds on these efforts by integrating distributed processing with interpretable deep learning for IoT.

Model interpretability and explainability in IDS

Explainable AI techniques have been introduced to interpret the decision-making of black-box IDS models. Approaches such as Grad-CAM, SHAP, LIME, and saliency maps are increasingly used to highlight the features contributing to predictions, thereby fostering trust in AI-based security systems. Tao Yi et al.50 examined deep learning-based network attack detection, considering sample imbalance, traffic complexity, and evolving threats. It also identifies current issues and makes recommendations for future research into reliable and comprehensible models.

Sydney Mambwe Kasongo51 introduced an IDS framework utilizing XGBoost-based feature selection, which incorporates RNNs (LSTM, GRU, and Simple RNN). Tested on the UNSW-NB15 and NSL-KDD datasets, the results demonstrate faster training times and increased accuracy. Future research will use hybrid approaches.

Ahmed Abdelkhalek and Maggie Mashaly52 offered a method for resampling data that addresses class imbalance in NIDS by utilizing Tomek Links and ADASYN. Results from tests on the NSL-KDD dataset demonstrate increased detection rates and accuracy (99.8%). Other datasets will be tested in a future study. Soumyadeep Hore et al.53 propose the DeepResNIDS framework, which utilizes AI and transfer learning to identify both known and unknown threats. When tested on benchmark datasets, its accuracy was 98.5%. Future research will focus on integrating human analysts and effective retraining.

Mamatha Maddu and Yamarthi Narasimha Rao54 proposed a deep learning-based intrusion detection system (IDS) for Software-Defined Networking (SDN) that utilizes DCGAN, CenterNet, ResNet152V2, and SMA to identify and prevent network intrusions. It demonstrated good accuracy (99.65% and 99.31%) when tested on the InSDN and Edge IIoT datasets. Future research will focus on the actual installation of IoT networks and the detection of zero-day vulnerabilities. Nojood O. Aljehane et al.55 presented GJOADL-IDSNS, a network intrusion detection system that uses SSA for hyperparameter tuning, A-BiLSTM for classification, and GJOA for feature selection. When tested on benchmark datasets, it outperforms current models. Adapting to real-time situations and changing cyber threats is part of the future effort.

Khushnaseeb Roshan et al.56 examined adversarial assaults on NIDS using methods such as PGD and FGSM. It exhibits enhanced robustness after testing three defense tactics. Static datasets and streamlined attack models are among the drawbacks. Exploring new attacks and black-box situations is part of future work. Hichem Sedjelmaci57 offered a detection method based on hierarchical reinforcement learning for protecting 5G networks from both external and internal threats. Effective detection of unknown assaults and negligible computing overhead are demonstrated. In the future, performance in actual 5G networks will be assessed.

Ahmad Ali AlZubi et al.58 presented a cognitive machine learning (ML)-assisted threat detection framework that achieves excellent prediction accuracy (96.5%) and efficiency (97.8%) for safeguarding healthcare cyber-physical systems. In the future, intelligent CPS security procedures will be developed. Shakila Zaman et al.59 examined IoT security risks and approaches for countermeasures based on AI/ML. It highlights the need for enhanced AI models and improved node security. Future research will concentrate on expanding datasets and increasing efficiency. Celestine Iwendi et al.60 introduced an Intrusion Detection System (IDS) for IoT cyberattack detection that uses LSTM classifiers and deep learning. At 99.09% accuracy, it outperformed the most advanced models. Future projects will involve integrating blockchain technology and conducting real-time testing.

Benchmarking and evaluation using public datasets

Numerous studies benchmark IDS models on datasets such as NSL-KDD, CIC-IDS2017, UNSW-NB15, and TON_IoT. These comparisons help assess generalizability, robustness, and detection accuracy under diverse attack conditions. G.C. Amaizu et al.61 proposed a 99.66% accurate composite DDoS detection framework for 5G/B5G networks, which utilizes an effective feature extraction process and a multilayer perceptron. Improving latency and lowering computing cost are the goals of future research. Murat Kuzlu et al.62 examined attack methods, defense AI algorithms, and the difficulties in cybersecurity as they relate to IoT, AI, and adversarial AI. Future dangers are emphasized, as is the necessity of improved IoT security against changing attacks. Kavitha Dhanushkodi and S. Thejas63 examined the application of AI to cybersecurity, emphasizing developments in threat detection, including federated learning and deep learning. Privacy and real-time processing are challenges. Future research will integrate AI with cutting-edge technologies to enhance its capabilities.

Sowmya T. and Mary Anita E. A.64 examined AI-based intrusion detection techniques, emphasizing ensemble learning, machine learning, and deep learning. The results demonstrate over 99% detection accuracy; future research will focus on identifying unidentified threats and developing hybrid models. Salwa Alem65 presented BIANO-IDS, which combines neural networks with anomaly-based and specification-based intrusion detection systems (IDS). It exhibits few false positives and great accuracy when tested in an actual industrial setting. Improving performance metrics is the primary focus of future work.

Heng Zeng et al.66 offered a conceptual paradigm for smart city IoT networks that emphasizes user interaction and ongoing cybersecurity education through AI-based anomaly detection. Empirical research and policymakers’ recommendations are part of the future work. Matthew Baker et al.67 present an anomaly detection and correction method for power electronic-dominated grids (PEDGs) based on LSTM-MPC in this research. With real-time detection, categorization, and remedial measures, it has been tested on a 14-bus system. Monika Vishwakarma and Nishtha Kesswani68 introduced a deep neural network-based intrusion detection system (IDS) for Internet of Things (IoT) networks that can identify 20 different types of attacks using a benchmark dataset. The results demonstrate high efficiency, and real-time training will be the primary focus of future research.

Md. Asaduzzaman and Md. Mahbubur Rahman69 introduced a hybrid LSTM-CNN model that improves accuracy to 93.53% for zero-day attack detection utilizing GAN-generated data. To detect more complex attacks, future studies will expand the scope of attack data. Zhibo Zhang et al.70 evaluate the application of Explainable AI (XAI) in cybersecurity, discussing the necessity for transparency in AI-driven defense models. For XAI applications, they outline a research path, obstacles, and future objectives.

Ankit Attkan and Virender Ranga71 examined IoT security with an emphasis on session keys, blockchain-AI integration, and authentication. It draws attention to issues like battery life and offers AI-powered solutions for IoT key management. Future studies are required. S. Markkandeyan et al.72 propose a hybrid deep learning model (ATFDNN + IPSO) to identify malware and software piracy in the Internet of Things. Experiments on the Maling and Google Code Jam datasets demonstrate that it outperforms traditional techniques. Reliance on massive datasets and significant computational expense are among its drawbacks. Qawsar Gulzar and Khuram Mustafa73 present DeepCLG, a hybrid learning model for IIoT network intrusion detection that achieves high accuracy (99.82%) on the UNSW_NB15 and CICIoT 2023 datasets. Scalability and real-time flexibility are limitations that warrant further study.

Optimization techniques and hybrid model designs

Recent trends in IDS research include ensemble learning, neural architecture optimization, hyperparameter tuning, and hybrid deep learning architectures that combine CNNs, RNNs, or attention mechanisms for improved performance. Enerst Edozie et al.74 examined AI-powered anomaly detection for telecom networks, emphasizing the efficiency of deep learning. Latency, scalability, and data volume are obstacles. Future research should focus on ongoing model maintenance and the development of hybrid models. Giovanni Battista Gaggero et al.75 examined anomaly identification in smart grids with an emphasis on physics- and AI-based techniques. It identifies gaps in false positive rates and real-world testing, suggesting further study to increase grid dependability.

Vivek Menon et al.76 examined the importance of AIoT in new technologies, architecture, and security. To direct future research in AIoT applications and advances, this addresses ML/DL algorithms, IoT security strategies, and associated problems. Malka N. Halgamuge and Dusit Niyato77 introduced an AI/ML-integrated adaptive edge security architecture for the Internet of Things that generates dynamic policies. Addressing issues in IoT ecosystems improves compliance and resiliency. Validation and optimization are part of future development. Ilhan Firat Kilincer78 demonstrates a hybrid deep learning model (CNN + Bi-LSTM) for identifying Layer 2 network attacks on the CL2-IDS dataset, achieving 95.28% accuracy. Future research will focus on enhancing anomaly detection and growing the dataset.

Huiyao Dong and Igor Kotenko79 examined machine learning methods in IDS, contrasting deep learning strategies with conventional machine learning models. It draws attention to developments, difficulties, and upcoming studies, especially in hybrid models and Internet of Things settings. Moneerah Alotaibi et al.80 introduced a hybrid intrusion detection system (IDS) model that uses bio-inspired metaheuristic algorithms (GWQBBA) for attack categorization and feature selection. Random Forest reduces the number of features to 12 and achieves an accuracy of 98.5%. Enhancing scalability and optimizing parameters are the primary goals of future studies.

Existing research has investigated enhanced strategies for IoT intrusion detection and threat prediction, with a focus on explainability and scalability as well as emerging AI approaches. Korba et al.81 proposed an explainable anomaly detection model for early-stage IoT botnet attack mitigation, and later developed a zero-day botnet detection approach using Isolation Forest and Particle Swarm Optimization82. Korba et al. also introduced a complete AI model-based framework for rapid Internet of Things botnet detection through network traffic analysis83. Diaf et al. and Labiod et al. utilize large language models for predictive analysis of cyberattacks in IoT networks84,85. An IDS architecture has also been designed to address the IoT security challenge in a fog-based infrastructure86. Khan et al.87 proposed a privacy-preserving, federated boosting technique to amplify attack detection capabilities through collaboration among distributed IoT devices. Lastly, a collaborative SRU-based network with both behavior aggregation and interpretability was presented in88. Although these studies highlight advancements, they lack the unification of a big data processing pipeline with integrated explainable deep learning, which is addressed in our work.

In the past few years, deep learning-based techniques have shown great promise in enabling IoT and IIoT intrusion detection. Research on deep learning-based IDS89 in industrial IoT environments showed promising accuracy results using optimized neural architectures. Likewise, TL-BILSTM IoT proposed a transfer learning model that improves IDS performance by utilizing bidirectional LSTM layers for IoT networks90. For example, an explainable deep learning framework for Industry 5.0 cyber-physical systems was proposed in91, underlining the importance of interpretability for critical infrastructure security. Additionally, a deep learning-based IoT security framework for intrusion detection and prevention92 demonstrated that adapting and scaling IDS solutions are necessary.

A variety of machine learning and deep learning models have been investigated, including decision trees, support vector machines (SVMs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hybrid ensembles6. These methods are undoubtedly promising, but when applied to large-scale, heterogeneous IoT data, they often suffer from the challenges of real-time performance, scalability, and interpretability. Only a handful of works combine big data platforms such as Spark and explainable AI, indicating the necessity for the unified, scalable, and interpretable framework proposed herein.

Proposed framework

This section presents the proposed CyberDetect-MLP framework, designed to enable scalable and explainable detection of cyberattacks in IoT environments. It details the system architecture, dataset preparation, big data processing pipeline, optimized deep learning model, algorithmic implementation, and deployment strategy, supported by corresponding figures, algorithms, and tables to illustrate the operational workflow and novel methodological contributions.

System overview

The overall system architecture for IoT cyberattack detection is designed as a comprehensive big data analytics framework that integrates scalable data ingestion, distributed storage, preprocessing, and an optimized deep learning model for accurate attack classification. IoT telemetry data, network traffic, and operating system logs are continuously collected from diverse sources, including smart homes, cities, and industrial environments. These high-volume and high-velocity data streams are ingested using Apache Kafka and Apache Flume into the Hadoop Distributed File System (HDFS) for fault-tolerant, scalable storage. Apache Spark facilitates distributed preprocessing and feature engineering, enabling efficient handling of data cleansing, normalization, and selection across large datasets. The processed data is then fed into the CyberDetect-MLP model, an optimized multilayer perceptron designed for scalable and accurate cyberattack classification. Detection results trigger real-time alerts and are logged for further analysis. The entire workflow emphasizes scalability, robustness, and real-time responsiveness, making it suitable for dynamic and heterogeneous IoT environments. The overall system architecture is illustrated in Fig. 1.

Fig. 1
figure 1

System architecture of the proposed big data-enabled cyberattack detection framework using CyberDetect-MLP for IoT environments.

Figure 1 depicts the high-level architecture of the proposed big data-enabled IoT cyberattack detection framework. It illustrates the seamless integration of IoT data sources with distributed data ingestion mechanisms, scalable storage, and parallel preprocessing components. The diagram illustrates how processed feature sets are integrated into the CyberDetect-MLP model for attack classification, followed by alert generation and storage of detection outcomes. Key elements, including Apache Kafka, Hadoop HDFS, Apache Spark, and an optimized deep learning model, are represented, highlighting the modular and scalable nature of the system. Table 1 summarizes the key symbols and their descriptions used in the methodology to facilitate a clear understanding of the proposed framework.

Table 1 Notations and definitions used in the proposed CyberDetect-MLP framework.

Dataset description and preparation

In this study, experiments were conducted on the TON_IoT dataset, a comprehensive and large-scale dataset specifically designed for cyberattack detection in the IoT environment. This dataset comprises telemetry data from IoT devices, PCAP/CSV format network traffic captures, and network logs from operating systems such as Windows, Ubuntu, and Android. Such a variety of sources emulates the actual smart home, city, and industrial IoT ecosystems. The TON_IoT dataset consists of subsets, each with a mix of benign activities and cyberattack activities, specifically Denial-of-Service (DoS), Distributed Denial-of-Service (DDoS), ransomware, backdoor access, injection attacks, cross-site scripting, and reconnaissance attack types.

The original TON_IoT data are placed in a Hadoop Distributed File System (HDFS) for scalable storage and concurrent access. Apache Flume is used for batch ingestion, while Apache Kafka simulates real-time streams. The first preprocessing step is data cleaning. Missing values are handled using mean imputation for continuous features and mode imputation for categorical attributes. To maintain the integrity of the data and avoid model bias, duplicate entries are dropped. A one-hot encoding technique is used for categorical variables such as device type or service name. We perform Min–Max normalization on all continuous numerical features to scale the data between 0 and 1, as shown in Eq. 1.

$$\:{X}^{{\prime\:}}=\frac{X-{X}_{min}}{{X}_{max}-{X}_{min}}$$
(1)

Here, \(\:X\) is the original value of the feature, \(\:{X}_{min}\:\)and \(\:{X}_{max}\) denote its minimum and maximum values, and \(\:{X}^{{\prime\:}}\) is the normalized value of the feature. A feature selection step is included to enhance the quality of the input data and to reduce dimensionality. We use Mutual Information (MI) to pick the top-k features most dependent on the attack labels. The mutual information of a feature X with the target class Y is calculated as in Eq. 2.

$$\:I\left(X;Y\right)=\sum\:_{x\in\:X}\sum\:_{y\in\:Y}p\left(x,y\right)\text{log}\frac{p\left(x,y\right)}{p\left(x\right)p\left(y\right)}$$
(2)

where \(\:p\left(x,y\right)\:\:\)is the joint probability distribution of X and Y, and \(\:p\left(x\right)\) and \(\:p\left(y\right)\:\)are the marginal probability distributions of X and Y, respectively. Features with the highest mutual information scores are kept for model training.
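For illustration, the following minimal sketch shows how an MI ranking in the spirit of Eq. (2) can be computed with scikit-learn; the function name, column set, and the value of k are placeholders rather than the framework’s exact implementation.

```python
# Illustrative sketch of Eq. (2): Mutual Information-based top-k feature selection.
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_top_k_features(X: pd.DataFrame, y: pd.Series, k: int = 20) -> pd.DataFrame:
    """Rank features by their MI score with the attack label and keep the top k."""
    selector = SelectKBest(score_func=mutual_info_classif, k=k)
    selector.fit(X, y)                        # MI is estimated per feature against y
    kept = X.columns[selector.get_support()]  # boolean mask of retained columns
    return X[kept]
```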

After preprocessing and feature filtering, the dataset is split into training (70%), validation (15%), and test (15%) sets. Stratified sampling maintains the class distribution in each subset, which helps eliminate imbalance-induced bias in the evaluation of the model. The resulting processed dataset serves as input for training and testing the proposed CyberDetect-MLP model within the distributed big data analytics framework.
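A minimal sketch of the 70:15:15 stratified split described above is given below, assuming X and y denote the preprocessed feature matrix and attack labels; the random seed is illustrative.

```python
# Two-step stratified split: 70% train, then 15% validation / 15% test.
from sklearn.model_selection import train_test_split

X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)            # 70% train
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)  # 15% / 15%
```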

Big data infrastructure and processing pipeline

To handle the large volume, velocity, and variety of the TON_IoT dataset, the proposed framework adopts a distributed big data infrastructure integrating Hadoop and Apache Spark. The Hadoop Distributed File System (HDFS) is utilized for scalable, fault-tolerant storage of ingested IoT telemetry, network traffic, and operating system log data. This enables high-throughput access across distributed nodes, ensuring that large-scale datasets can be efficiently read and written without centralized bottlenecks.

Data ingestion into the big data environment is carried out through two primary mechanisms. Apache Kafka serves as the real-time ingestion layer, simulating continuous IoT telemetry and network traffic streams. Apache Flume is used for batch ingestion of historical datasets, ensuring compatibility with structured CSV and semi-structured log data formats. The ingested data is serialized into efficient storage formats such as Avro or Parquet before being placed in HDFS to enhance compression and parallel processing capabilities.

Once stored, the data undergoes distributed preprocessing and feature engineering using Apache Spark. Spark’s Resilient Distributed Dataset (RDD) abstraction and DataFrame API enable parallel execution of data cleaning, missing value imputation, duplicate removal, categorical encoding, and feature normalization. Continuous features are normalized between 0 and 1 using the Min-Max normalization technique described previously in Eq. (1), ensuring uniform feature scaling for stable deep learning model convergence.
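As an indicative example, the distributed cleaning and Min–Max scaling steps can be expressed with Spark MLlib roughly as follows; the HDFS path and column names are placeholders, and mean imputation is shown only for the listed numeric columns.

```python
# Sketch of distributed preprocessing with Spark MLlib (placeholder path/columns).
from pyspark.sql import SparkSession
from pyspark.ml.feature import Imputer, VectorAssembler, MinMaxScaler

spark = SparkSession.builder.appName("CyberDetectPreprocess").getOrCreate()

# Read the serialized dataset from HDFS and drop duplicate records.
df = spark.read.parquet("hdfs:///ton_iot/processed/telemetry.parquet").dropDuplicates()

numeric_cols = ["src_bytes", "dst_bytes", "duration"]        # placeholder feature names
imputed_cols = [c + "_imp" for c in numeric_cols]

# Mean imputation of missing continuous values.
imputer = Imputer(inputCols=numeric_cols, outputCols=imputed_cols, strategy="mean")
df = imputer.fit(df).transform(df)

# Assemble numeric columns into a vector and apply Eq. (1) scaling to [0, 1].
assembler = VectorAssembler(inputCols=imputed_cols, outputCol="features_raw")
scaler = MinMaxScaler(inputCol="features_raw", outputCol="features")
assembled = assembler.transform(df)
scaled = scaler.fit(assembled).transform(assembled)
```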

Fig. 2
figure 2

Big data infrastructure and processing pipeline for IoT cyberattack detection.

Figure 2 illustrates the distributed pipeline for data ingestion, storage, preprocessing, and feature selection using Hadoop and Apache Spark. For feature selection, Spark’s scalable computation is leveraged to calculate Mutual Information (MI) scores between input features and the target labels. Using the scoring method defined in Eq. (2), features are rank-ordered and the top-k most informative features are selected. Spark carries out these computations in a distributed manner, making feature engineering that would otherwise be too computationally intensive feasible on larger-scale IoT datasets.

After feature engineering, the data is partitioned into training, validation, and testing subsets using stratified sampling to maintain attack type class distributions across splits. This preprocessed and feature-selected data is subsequently used as input for the training and evaluation of the CyberDetect-MLP model. The integrated big data infrastructure ensures scalability, real-time data adaptability, and high computational efficiency, critical for IoT cyberattack detection applications.

Proposed CyberDetect-MLP model architecture

At the heart of the proposed framework is CyberDetect-MLP, shown in Fig. 3, a highly optimized deep learning architecture for scalable cyberattack detection in IoT environments. The input to the model is the feature-selected and normalized dataset derived from the TON_IoT sources. The input vectors correspond to instances of telemetry data, network traffic, or system logs, and are pre-processed so that only the features relevant to the attack classification task are retained.

Fig. 3
figure 3

Architecture of the proposed CyberDetect-MLP model for scalable IoT cyberattack detection.

The preprocessed feature set is passed to CyberDetect-MLP through its input layer, followed by a stack of fully connected dense layers. The initial dense layer contains 512 neurons with ReLU activation for non-linearity, followed by batch normalization to stabilize and accelerate learning, and a dropout layer with a rate of 0.3 that prevents overfitting by randomly deactivating neurons during training. The output of the first hidden layer is passed to a second dense layer with 256 neurons, again with ReLU activation, batch normalization, and dropout, followed by a third dense layer with 128 neurons. The transformation at each dense layer \(\:l\) can be expressed mathematically as in Eq. 3.

$$\:{h}^{\left(l\right)}=Dropout\left(BatchNorm\left(\sigma\:\left({W}^{\left(l\right)}{h}^{\left(l-1\right)}+{b}^{\left(l\right)}\right)\right)\right)$$
(3)

where \(\:{h}^{\left(l-1\right)}\) is the output of the previous layer,\(\:\:{W}^{\left(l\right)}\) and \(\:{b}^{\left(l\right)}\) are the weights and biases of the current layer, \(\:\sigma\:\left(\bullet\:\right)\) is the ReLU activation function, and batch normalization and dropout are applied sequentially.
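A minimal Keras sketch of the layer pattern in Eq. (3) is shown below; the input dimension, class count, and compile settings are assumptions for illustration rather than the exact training configuration.

```python
# Dense -> ReLU -> BatchNorm -> Dropout blocks with 512/256/128 units and a softmax head.
from tensorflow.keras import layers, models

def build_cyberdetect_mlp(input_dim: int, num_classes: int) -> models.Model:
    model = models.Sequential([layers.Input(shape=(input_dim,))])
    for units in (512, 256, 128):
        model.add(layers.Dense(units, activation="relu"))   # sigma(W h + b) in Eq. (3)
        model.add(layers.BatchNormalization())
        model.add(layers.Dropout(0.3))
    model.add(layers.Dense(num_classes, activation="softmax"))  # Eq. (4)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```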

In the final stage, the output layer is a dense layer with as many neurons as there are attack classes. Multi-class attack classification is performed using a softmax activation function to produce a normalized probability distribution over the classes. The softmax operation for an output neuron \(\:j\:\) is defined as in Eq. 4.

$$\:P\left(y=j\:|\:x\right)=\frac{{e}^{{z}_{j}}}{{\sum\:}_{k=1}^{K}{e}^{{z}_{k}}}$$
(4)

where K is the number of classes and \(\:{z}_{j}\) is the logit for class j. Because the task is multiclass classification, the model is trained with a categorical cross-entropy loss, which penalizes incorrect class predictions while maximizing the probability assigned to the correct class. The loss L over N training instances is defined as

$$\:L=-\frac{1}{N}\sum\:_{i=1}^{N}\sum\:_{j=1}^{K}{y}_{ij}\text{log}\left({\widehat{y}}_{ij}\right)$$

where \(\:{y}_{ij}\) is the indicator (1 or 0) of whether class j is the correct classification for observation i, and \(\:{\widehat{y}}_{ij}\) is the predicted probability of class j for that observation. Finally, to improve training efficiency and generalization, a learning rate scheduler (cosine annealing or step decay) is added. This progressively reduces the learning rate over the training epochs, enabling the model to converge faster and escape local minima. Figure 3 illustrates the CyberDetect-MLP architectural design, including the input, hidden, and output layers, and the optimization mechanisms.

Model training and hyperparameter optimization

The CyberDetect-MLP model is trained in a supervised manner on the features selected from the preprocessed TON_IoT dataset. The categorical cross-entropy loss described above is minimized during training so that the model can successfully classify the various types of cyberattacks encountered in IoT environments. Using stratified sampling to preserve the distribution of attack classes within each subset, the dataset is split into training, validation, and test samples in a 70:15:15 ratio.

Training is performed with mini-batch gradient descent over multiple epochs. A batch size is defined so that every batch contains a fixed number of samples, and the model weights are updated after each batch using the Adam optimizer. Adam combines the advantages of AdaGrad and RMSProp, supplying adaptive learning rates for each parameter to speed up convergence. A parameter θ is updated at iteration t using Adam as in Eq. 5.

$$\theta _{t} = \theta _{t-1} - \alpha \frac{\hat{m}_{t}}{\sqrt{\hat{v}_{t}} + \epsilon }$$
(5)

where α is the learning rate, \(\:{\widehat{m}}_{t}\) and \(\:{\widehat{v}}_{t}\) are the bias-corrected moving averages of the first and second moments of the gradients, and ϵ is a small constant (typically 10−8) that prevents division by zero in the denominator.

Hyperparameter tuning is used to improve CyberDetect-MLP performance. The hyperparameters tuned include the number of hidden layers, the number of neurons per layer, dropout rates, batch size, initial learning rate, and the learning rate decay schedule. A randomized search within predefined parameter ranges is used, which is faster and less computationally expensive than an exhaustive search. In particular, the number of neurons per hidden layer was sampled from [128, 256, 512] and the dropout rate from [0.2, 0.3, 0.5].
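The randomized search can be sketched as follows; the batch-size and learning-rate candidates and the trial budget are assumptions added for illustration, whereas the neuron and dropout ranges follow the values above.

```python
# Randomized sampling of hyperparameter configurations from predefined ranges.
import random

search_space = {
    "units":         [128, 256, 512],
    "dropout":       [0.2, 0.3, 0.5],
    "batch_size":    [64, 128, 256],       # assumed candidates
    "learning_rate": [1e-2, 1e-3, 1e-4],   # assumed candidates
}

def sample_config(space: dict) -> dict:
    """Draw one random configuration from the predefined ranges."""
    return {name: random.choice(values) for name, values in space.items()}

trials = [sample_config(search_space) for _ in range(20)]  # e.g., 20 random trials
# Each trial trains the MLP and records the validation F1-score;
# the best-scoring configuration is retained.
```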

Moreover, a learning rate scheduler is applied during training. The cosine annealing schedule (Loshchilov & Hutter, 2016) gradually decreases the learning rate according to a cosine function, which enables the model to learn quickly during the initial stages of training while fine-tuning during the later stages. The learning rate \(\:{\alpha\:}_{t}\) at epoch t under cosine annealing is expressed as in Eq. 6.

$$\:{\alpha\:}_{t}={\alpha\:}_{min}+\frac{1}{2}\left({\alpha\:}_{max}-{\alpha\:}_{min}\right)\left(1+\text{cos}\left(\frac{t\pi\:}{T}\right)\right)$$
(6)

where \(\:{\alpha\:}_{min}\) and \(\:{\alpha\:}_{max}\) are the minimum and maximum learning rates, respectively, and T is the total number of training epochs. We use standard classification metrics to evaluate model performance, including accuracy, precision, recall, F1-score, and area under the ROC curve (ROC-AUC). These metrics offer a holistic evaluation of the detection ability of CyberDetect-MLP under both balanced and imbalanced attack distributions. The hyperparameter configuration that yields the highest validation F1-score is used for the final model, which is then evaluated on the test data and used for deployment.
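Equation (6) translates directly into a Keras learning rate callback, as in the following sketch; the bounds \(\:{\alpha\:}_{min}\), \(\:{\alpha\:}_{max}\), and the epoch budget T are placeholder values.

```python
# Cosine annealing schedule (Eq. 6) as a Keras LearningRateScheduler callback.
import math
from tensorflow.keras.callbacks import LearningRateScheduler

alpha_min, alpha_max, T = 1e-5, 1e-3, 100   # assumed bounds and epoch budget

def cosine_annealing(epoch: int, lr: float) -> float:
    """Learning rate at epoch t under cosine annealing."""
    return alpha_min + 0.5 * (alpha_max - alpha_min) * (1 + math.cos(math.pi * epoch / T))

lr_callback = LearningRateScheduler(cosine_annealing)
# Passed to model.fit(..., callbacks=[lr_callback]) during training.
```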

The TON_IoT dataset is heavily imbalanced, with some attack types represented by very few samples. To mitigate this, we used two approaches: (i) a class-weighted categorical cross-entropy loss, with weights inversely proportional to class frequencies, and (ii) mild oversampling of the minority classes during training to improve their representation. These techniques enable the model to learn discriminative features for minority classes while avoiding overfitting.
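A minimal sketch of the class-weighting step is shown below, assuming y_train holds integer-encoded labels; the oversampling step is omitted for brevity.

```python
# Class weights inversely proportional to class frequencies for the weighted loss.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = dict(zip(classes, weights))   # rare attack types receive larger weights
# model.fit(X_train, y_train_onehot, class_weight=class_weight, ...)
```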

Algorithmic implementation

This section presents the algorithmic implementation of the proposed framework, detailing the core computational procedures that enable scalable and efficient cyberattack detection in IoT environments. This section outlines the step-by-step algorithms for data preprocessing, feature selection, model training, and real-time detection. By formalizing these processes, the framework ensures reproducibility, clarity, and systematic execution, facilitating seamless integration of big data analytics and optimized deep learning techniques for robust IoT security.

Algorithm 1
figure a

Big Data Preprocessing and Ingestion Pipeline.

Algorithm 1 outlines the end-to-end big data preprocessing and ingestion pipeline essential for preparing IoT data for effective cyberattack detection. It details the processes of scalable data storage initialization using Hadoop HDFS, real-time and batch data ingestion through Apache Kafka and Flume, and the conversion of raw data into efficient storage formats. The algorithm further describes distributed data cleaning operations, including missing value imputation and duplicate removal, categorical feature encoding, and normalization of continuous features. Finally, it partitions the processed dataset into training, validation, and testing subsets, ensuring balanced class distributions for reliable model training and evaluation. This pipeline ensures that large-scale, heterogeneous IoT data are transformed into a consistent and optimized format suitable for subsequent analytics.

Algorithm 2
figure b

Feature Selection Using Mutual Information.

Algorithm 2 describes the feature selection process using Mutual Information to identify the most informative features for cyberattack detection. It takes the preprocessed dataset and corresponding class labels as input and computes the Mutual Information score for each feature, quantifying its dependency with the target variable. Features are then ranked based on their scores, and the top-k features are selected to reduce dimensionality while preserving predictive power. This step enhances model efficiency and accuracy by focusing training on the most relevant attributes, thereby improving the performance of the CyberDetect-MLP model.

To reduce the high dimensionality of the IoT traffic features, we measure their relevance for intrusion detection using a mutual information (MI)-based feature selection procedure. Within every cross-validation fold or train–test split, MI scores are calculated on the training subset only, to avoid any risk of information leakage. The subset of features selected from the training data is then applied to the corresponding test samples. This guarantees that no test set information influences the feature selection process, keeping the evaluation clean.
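The leakage-safe procedure can be sketched with a scikit-learn pipeline, which refits the MI selector on the training folds only during cross-validation; the MLPClassifier here is a simplified stand-in for CyberDetect-MLP, and k and the fold count are illustrative.

```python
# MI selection kept inside each cross-validation split to prevent leakage.
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

pipeline = Pipeline([
    ("select", SelectKBest(mutual_info_classif, k=20)),                 # fit on training folds only
    ("clf", MLPClassifier(hidden_layer_sizes=(512, 256, 128), max_iter=200)),
])
scores = cross_val_score(pipeline, X, y, cv=5, scoring="f1_macro")      # per-fold F1
```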

Algorithm 3
figure c

CyberDetect-MLP model training and optimization.

Algorithm 3 details the training and optimization process of the CyberDetect-MLP model. Starting with the initialization of a multilayer perceptron architecture incorporating dense layers, ReLU activations, batch normalization, and dropout regularization, the algorithm employs the categorical cross-entropy loss function optimized via the Adam optimizer. A learning rate scheduler based on cosine annealing dynamically adjusts the learning rate during training to facilitate efficient convergence. Training proceeds over multiple epochs with mini-batch gradient descent, where forward propagation computes layer outputs and backpropagation updates model weights. Validation after each epoch guides early stopping to prevent overfitting, and the best-performing model parameters are saved for deployment. This algorithm ensures scalable and accurate cyberattack classification within the big data framework.

Algorithm 4
figure d

Real-time cyberattack detection and alert generation.

Algorithm 4 describes the real-time cyberattack detection and alert generation process within the deployed framework. It begins by receiving preprocessed IoT data streams, which are normalized and feature-selected based on prior computations. The data is then passed through the trained CyberDetect-MLP model for inference, where forward propagation produces class probabilities for attack types. The algorithm classifies inputs by selecting the highest probability class and logs detection results, including timestamps and confidence scores, into a scalable NoSQL database. Critical detections trigger immediate alerts to system administrators, enabling timely responses. This algorithm ensures continuous, low-latency threat monitoring and effective incident management in dynamic IoT environments.

The proposed CyberDetect-MLP framework enables end-to-end intrusion detection via a structured pipeline: Spark-based distributed data ingestion, preprocessing, and feature engineering; mutual information–based feature selection; optimized MLP training with batch normalization, dropout, and cosine annealing scheduling; and real-time inference with XAI-based interpretation (Grad-CAM and SHAP). To facilitate understanding of the overall workflow, each algorithmic stage is explained along with its associated steps.

The computational complexity of the framework is linear with respect to the dataset size n and feature dimension m. Spark preprocessing and feature selection require O(nm) time, distributed across worker nodes. For the MLP, forward and backward propagation have a cost of \(\:O\left({\sum\:}_{i=1}^{L}{h}_{i-1}\:.\:{h}_{i}\right)\), where L is the number of layers and \(\:{h}_{i}\) is the number of neurons in layer i; space complexity is of the same order. Inference per sample is lightweight, enabling near real-time detection. This analysis confirms that CyberDetect-MLP scales efficiently for large-scale IoT deployments.
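As a worked instance of the \(\:O\left({\sum\:}_{i=1}^{L}{h}_{i-1}\:.\:{h}_{i}\right)\) term, the snippet below counts the multiply-accumulate operations of one forward pass for the 512-256-128 architecture, assuming an illustrative input dimension of 40 selected features and 10 attack classes.

```python
# Multiply-accumulate count per forward pass for the assumed layer dimensions.
layer_sizes = [40, 512, 256, 128, 10]   # input, hidden layers, output (assumed dims)
macs = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
print(macs)  # 40*512 + 512*256 + 256*128 + 128*10 = 185,600 multiply-accumulates
```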

Evaluation methodology

This section outlines the systematic evaluation of cyberattack detection performance on the TON_IoT dataset with the proposed CyberDetect-MLP model. For a robust and fair assessment, we perform stratified sampling so that the distribution of attack classes within the train, validation, and test sets is proportional to that of the full dataset. To avoid data leakage and to simulate deployment conditions, the test set is never seen during the training and hyperparameter tuning phases of any model.

The performance metrics chosen for evaluation are accuracy, precision, recall, F1-score, and ROC-AUC (the area under the Receiver Operating Characteristic curve). Accuracy measures the overall proportion of correctly classified instances, as given in Eq. 7.

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
(7)

where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives. Precision measures the proportion of true positive predictions among all positive predictions made, as defined in Eq. 8.

$$Precision=\frac{TP}{TP+FP}$$
(8)

Recall, also known as sensitivity or the true positive rate, is the ratio of correctly predicted positive observations to all actual positives, as defined in Eq. 9.

$$Recall=\frac{TP}{TP+FN}$$
(9)

The F1-score, which strikes a balance between precision and recall, considers false positives and false negatives, and is defined as in Eq. 10.

$$F1\text{-}score=2\times\frac{Precision\times Recall}{Precision+Recall}$$
(10)

The final metric, ROC-AUC, measures the model’s capacity to separate classes across different decision thresholds; the greater the AUC, the better the model distinguishes attack instances from normal instances.
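These metrics can be computed directly from the test predictions; the sketch below assumes integer-encoded labels y_true, predicted labels y_pred, and a class-probability matrix y_score, with macro averaging (an assumption, chosen because it weights all attack classes equally).

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def evaluate(y_true, y_pred, y_score):
    """Multi-class metrics; macro averaging treats all attack classes equally."""
    return {
        "accuracy":  accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall":    recall_score(y_true, y_pred, average="macro"),
        "f1":        f1_score(y_true, y_pred, average="macro"),
        # one-vs-rest AUC over the class-probability matrix
        "roc_auc":   roc_auc_score(y_true, y_score, multi_class="ovr"),
    }
```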

All metrics are evaluated on the held-out test set to verify the generalization ability of CyberDetect-MLP. To contextualize the performance of the proposed model, we also compare it with baseline models, namely Random Forest and XGBoost, trained on the same dataset and preprocessing pipeline. We report comparative performance across all chosen metrics and, on this basis, justify the advantages of our approach in terms of detection accuracy, robust handling of class imbalance, and scalability for large-scale IoT-based cybersecurity applications.

Deployment strategy and alert generation

After training and validating the CyberDetect-MLP model, the framework proceeds to deployment for real-time cyberattack detection and alert generation within IoT environments. The deployment pipeline integrates the trained model into the distributed big data infrastructure to ensure scalability, low-latency inference, and seamless integration with existing IoT systems.

The incoming IoT data streams, either simulated through Apache Kafka or sourced from real-time telemetry devices, are first preprocessed in a manner consistent with the training pipeline. This ensures that feature distributions remain consistent with those seen during training, avoiding inference drift. The preprocessed feature vectors are passed to the deployed CyberDetect-MLP model for real-time classification. Given the lightweight architecture of the model and the optimization afforded by feature selection, inference is achieved within millisecond latency, making it suitable for rapid attack identification in dynamic IoT environments.

Detection results, consisting of predicted attack classes and corresponding confidence scores, are stored in a distributed NoSQL database such as MongoDB or Apache Cassandra. This enables scalable logging of cybersecurity events, allowing retrospective analysis and auditing. Each detected attack instance triggers an automated alert mechanism that updates an administrator dashboard with incident details, including device ID, attack type, timestamp, and severity score. For critical attacks such as ransomware or DDoS, the system is configured to generate high-priority alerts, optionally integrating with existing network management systems for automated containment actions.
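An abbreviated sketch of this deployment loop is shown below, assuming the kafka-python and pymongo clients, hypothetical topic and collection names, and the helpers preprocess, detect, and notify_admins (the detect function corresponds to the inference sketch above; the other two are assumed hooks into the preprocessing pipeline and alerting dashboard).

```python
import json
from kafka import KafkaConsumer          # kafka-python client (assumed)
from pymongo import MongoClient

consumer = KafkaConsumer(
    "iot-telemetry",                                        # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")))
events = MongoClient("mongodb://localhost:27017")["ids"]["detections"]

for msg in consumer:
    features = preprocess(msg.value)        # same scaler/encoders as training (assumed helper)
    result = detect(model, features, CLASS_NAMES)   # see the inference sketch above
    events.insert_one(result)               # scalable event log for auditing
    if result["alert"]:
        notify_admins(result)               # dashboard/webhook push (assumed helper)
```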

Furthermore, an optional explainable AI (XAI) module can be integrated to provide human-interpretable reasons behind each classification decision. Using SHAP (Shapley Additive exPlanations) values, the system can highlight the most influential features contributing to a specific attack prediction. This enhances administrator trust and supports informed incident response planning. The combined deployment of CyberDetect-MLP within a distributed big data framework ensures that the system can maintain high detection accuracy, rapid response times, and scalable performance even as the volume and velocity of IoT data continue to increase.
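A minimal SHAP sketch for per-prediction attribution is given below, assuming the trained Keras model, a background sample drawn from the training data, and a hypothetical FEATURE_NAMES list; KernelExplainer is model-agnostic, so it applies to the MLP without architectural changes.

```python
import numpy as np
import shap

# A small background sample keeps the kernel estimation tractable.
background = X_train[np.random.choice(len(X_train), 100, replace=False)]
explainer = shap.KernelExplainer(model.predict, background)

# Per-feature, per-class attributions for a batch of flagged test records.
shap_values = explainer.shap_values(X_test[:50], nsamples=200)
shap.summary_plot(shap_values, X_test[:50], feature_names=FEATURE_NAMES)
```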

Experimental results

This section reports the experimental evaluation of the proposed CyberDetect-MLP framework using the full TON_IoT dataset93. It includes the setup configuration, performance comparison with baseline models, visualization-based analysis, ablation study, and explainability validation. The results demonstrate the framework’s high accuracy, scalability, and interpretability, highlighting its effectiveness for real-world IoT cyberattack detection scenarios.

Experimental setup

The experimental setup is designed to evaluate the effectiveness, scalability, and reproducibility of the proposed CyberDetect-MLP framework for detecting IoT cyberattacks. All experiments are conducted on a distributed computing environment comprising a five-node Hadoop-Spark cluster. Each node is configured with Intel Xeon processors, 64 GB of RAM, and Ubuntu 20.04 LTS, and is connected over a high-speed Gigabit Ethernet network. Apache Hadoop 3.3.2 and Apache Spark 3.4.0 are deployed for storage and distributed processing. The CyberDetect-MLP model is implemented using TensorFlow 2.13 integrated with PySpark for seamless data loading and training across the Spark environment. NoSQL storage and alert modules are deployed using MongoDB 6.0, while Apache Kafka 3.5 is used to simulate real-time data streams.

The prototype application workflow starts by ingesting data from the TON_IoT dataset, including telemetry, network, and OS log data, into HDFS. Ingestion is handled via Kafka topics (for simulated real-time input) and Flume agents (for batch logs). All preprocessing steps, including null value imputation, categorical encoding, and Min-Max normalization, are executed using Spark DataFrames and written back to HDFS for training. Feature selection is performed using Spark’s Mutual Information feature selector, and the top 30 features are retained based on empirical validation. Table 2 summarizes the experimental setup, including the dataset, tools, cluster configuration, and hyperparameter settings to ensure reproducibility and scalability.

Table 2 Experimental setup summary for CyberDetect-MLP framework.
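An abridged PySpark sketch of the preprocessing stages described above (null-value imputation, categorical encoding, Min-Max normalization) follows, with illustrative HDFS paths and column names; the MI-based ranking that retains the top 30 features is applied afterwards, as described in the feature-selection step.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Imputer, StringIndexer, VectorAssembler, MinMaxScaler

spark = SparkSession.builder.appName("cyberdetect-preprocess").getOrCreate()
df = spark.read.csv("hdfs:///ton_iot/raw/*.csv", header=True, inferSchema=True)

numeric_cols = ["src_bytes", "dst_bytes", "duration"]        # illustrative columns
categorical_cols = ["proto", "service"]

stages = [
    Imputer(inputCols=numeric_cols,                          # null-value imputation
            outputCols=[c + "_imp" for c in numeric_cols]),
    *[StringIndexer(inputCol=c, outputCol=c + "_idx", handleInvalid="keep")
      for c in categorical_cols],                            # categorical encoding
    VectorAssembler(inputCols=[c + "_imp" for c in numeric_cols] +
                              [c + "_idx" for c in categorical_cols],
                    outputCol="features_raw"),
    MinMaxScaler(inputCol="features_raw", outputCol="features"),  # Min-Max scaling
]
prepared = Pipeline(stages=stages).fit(df).transform(df)
prepared.write.mode("overwrite").parquet("hdfs:///ton_iot/preprocessed/")
```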

To ensure replicability, hyperparameter tuning is carried out using randomized search over the following ranges: number of neurons per dense layer in {128, 256, 512}, dropout rates in {0.2, 0.3, 0.5}, batch sizes in {64, 128}, and learning rates in {1e-3, 5e-4, 1e-4}. The learning rate follows a cosine annealing schedule for adaptive decay. Training is performed for a maximum of 100 epochs, with early stopping enabled (patience = 10 epochs), based on the validation F1-score. Each experiment is repeated three times to account for stochastic variations, and the final model parameters are selected based on the best validation performance. The complete codebase, cluster configuration, and tuning scripts are documented for reproducibility and can be made available upon request to support open research.
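A condensed sketch of this randomized search is shown below, assuming a train_fn callable that trains one configuration and returns its validation F1-score; the three repetitions per configuration follow the protocol above.

```python
import random

SEARCH_SPACE = {
    "units":      [128, 256, 512],
    "dropout":    [0.2, 0.3, 0.5],
    "batch_size": [64, 128],
    "lr":         [1e-3, 5e-4, 1e-4],
}

def random_search(train_fn, n_trials=20, seed=42):
    """train_fn(config) -> validation F1-score; returns the best configuration found."""
    rng = random.Random(seed)
    best_cfg, best_f1 = None, -1.0
    for _ in range(n_trials):
        cfg = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        f1 = sum(train_fn(cfg) for _ in range(3)) / 3.0   # three repetitions per configuration
        if f1 > best_f1:
            best_cfg, best_f1 = cfg, f1
    return best_cfg, best_f1
```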

For training the model, a learning rate of 0.001 was used with the Adam optimizer, a batch size of 64, and ReLU activations; regularization was provided by dropout (0.3) and batch normalization. Random seeds were fixed for all experiments to enable reproducibility. An 80/20 stratified train–test split was used to preserve the class distribution in both sets.
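The seed fixing and stratified split can be expressed as follows (a sketch; the seed value is illustrative, and X and y denote the preprocessed features and labels).

```python
import random
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

SEED = 42                                   # illustrative seed value
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

# 80/20 split that preserves the attack-class proportions in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED)
```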

Exploratory data analysis

This section presents the exploratory data analysis (EDA) conducted on the TON_IoT dataset to understand its structure, feature distribution, and class imbalance. Various visualizations were generated to reveal patterns, correlations, and anomalies in the data, helping inform feature engineering and model design decisions. The EDA ensures data quality and highlights critical characteristics relevant to cyberattack detection and prevention.

Fig. 4

Exploratory data analysis using realistic TON_IoT features for class distribution, feature importance, PCA projection, and feature-wise variability.

Figure 4 presents a comprehensive exploratory data analysis of the TON_IoT dataset using realistic feature names to evaluate its structure and suitability for cyberattack detection. Subfigure (a) shows the class distribution across five attack types, highlighting class imbalance commonly found in real-world IoT security data. Subfigure (b) ranks the top 10 features by Mutual Information scores, with logged_in, dst_host_srv_count, and srv_count emerging as the most relevant for classification. Subfigure (c) provides a PCA-based 2D projection of the feature space, illustrating how different attack classes form distinct clusters with some overlap, suggesting potential for separation through nonlinear learning. Subfigure (d) visualizes the distribution of the top-ranked feature logged_in across attack classes using a boxplot, indicating feature-wise variability and class-dependent behavior. Collectively, these visualizations support the model’s design choices and justify the preprocessing and feature selection strategies adopted in the proposed framework.

Performance analysis

This section provides a comprehensive performance analysis of the proposed CyberDetect-MLP framework. It includes quantitative comparisons with baseline models using key metrics such as accuracy, precision, recall, and F1-score. Visualizations, including metric-wise plots and confusion matrices, illustrate the model’s robustness and detection capabilities. The analysis validates the framework’s effectiveness in identifying diverse cyberattacks in IoT environments.

Fig. 5

Training and validation accuracy dynamics of CyberDetect-MLP over 30 epochs demonstrating stable convergence and high generalization performance.

Figure 5 illustrates the training and validation accuracy dynamics of CyberDetect-MLP over 30 epochs. Both curves show a consistent upward trend, with minimal divergence, converging near 98.87% accuracy. This indicates effective learning, minimal overfitting, and strong generalization. The smooth convergence validates the model’s robustness and the suitability of its hyperparameters and regularization strategies.

Fig. 6

Training and validation loss dynamics of CyberDetect-MLP indicating effective convergence and minimal overfitting throughout the training process.

Figure 6 illustrates the training and validation loss progression for CyberDetect-MLP across 30 epochs. Both curves demonstrate a steady decline, with training loss reducing from 1.5 to below 0.2 and validation loss closely following. The absence of divergence between the two curves confirms effective optimization, controlled generalization error, and stability of the learning process through regularization and early stopping.

Fig. 7

Confusion matrix comparison across models.

Figure 7 presents confusion matrices for four models on the TON_IoT dataset. CyberDetect-MLP (d) demonstrates superior classification performance with minimal misclassifications across all attack classes. In contrast, Random Forest (a) and XGBoost (b) show higher confusion between certain classes. Vanilla MLP (c) performs well but is outperformed by the proposed model in overall accuracy and consistency.

Table 3 Performance comparison of proposed model with baseline classifiers on the TON_IoT dataset.

Table 3 presents a comparative analysis of the proposed CyberDetect-MLP model against baseline classifiers on the TON_IoT dataset. The proposed model outperforms Random Forest, XGBoost, and a vanilla MLP across all evaluation metrics, achieving 98.87% accuracy and 99.10% ROC-AUC, which demonstrates its effectiveness, robustness, and suitability for scalable IoT cyberattack detection in large-scale data environments.

Fig. 8

Comparative performance of CyberDetect-MLP and baseline models across accuracy, precision, recall, F1-score, and ROC-AUC metrics.

Figure 8 provides a comparative performance analysis of the proposed CyberDetect-MLP model against baseline classifiers across five key evaluation metrics: accuracy, precision, recall, F1-score, and ROC-AUC. In subfigure (a), CyberDetect-MLP achieves the highest accuracy of 98.87%, outperforming XGBoost (96.35%), vanilla MLP (97.12%), and Random Forest (94.21%), indicating its strong overall predictive capability. Subfigure (b) shows the precision values, where CyberDetect-MLP reaches 98.74%, reducing false positives more effectively than other models. Subfigure (c) displays recall, with CyberDetect-MLP achieving 98.62%, demonstrating its ability to detect the majority of actual attack instances and minimizing false negatives, critical in cybersecurity scenarios. In subfigure (d), the F1-score, which balances precision and recall, is highest for CyberDetect-MLP at 98.68%, indicating superior classification consistency. Subfigure (e) presents the ROC-AUC values, where the proposed model again leads with 99.10%, confirming its excellent discriminatory power between benign and malicious activity. These results collectively highlight the effectiveness and robustness of the CyberDetect-MLP framework, particularly its ability to generalize well across imbalanced and complex IoT attack data. The consistent margin of improvement across all metrics validates the benefit of incorporating feature selection, architectural optimization, and learning rate scheduling in the proposed model.

Statistical significance analysis

To ensure the robustness and reliability of the proposed CyberDetect-MLP model, we performed statistical significance testing to compare CyberDetect-MLP with state-of-the-art baseline models over multiple experimental runs. For each key evaluation metric (accuracy, precision, recall, F1-score, ROC-AUC), a paired t-test (or a Wilcoxon signed-rank test when normality could not be assumed) was performed over 10 independent runs. This analysis ensures that the observed improvements are not due to random variation.
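A short sketch of this test over per-run scores is given below, assuming two paired arrays of metric values from the 10 runs; the normality check that decides between the paired t-test and the Wilcoxon signed-rank test is an implementation assumption.

```python
import numpy as np
from scipy.stats import ttest_rel, wilcoxon, shapiro

def compare_runs(scores_ours, scores_baseline, alpha=0.05):
    """Paired test over per-run metric values; falls back to the Wilcoxon
    signed-rank test when the paired differences do not look normally distributed."""
    diff = np.asarray(scores_ours) - np.asarray(scores_baseline)
    if shapiro(diff).pvalue >= alpha:          # normality plausible -> paired t-test
        stat, p = ttest_rel(scores_ours, scores_baseline)
        test = "paired t-test"
    else:                                      # otherwise the non-parametric test
        stat, p = wilcoxon(scores_ours, scores_baseline)
        test = "Wilcoxon signed-rank"
    return test, stat, p
```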

Table 4 Statistical significance testing of CyberDetect-MLP vs. baselines.

Statistical significance analysis is shown in Table 4, comparing CyberDetect-MLP with the best-performing baseline across the multiple runs. Summary statistics are reported as mean ± standard deviation (M ± SD) together with the associated p-values. All p-values are below 0.05 (highlighted entries), indicating that the improvements of CyberDetect-MLP over the baselines are statistically significant and reproducible across experiments.

Fig. 9

Boxplot of F1-scores across 10 runs comparing CyberDetect-MLP and the best baseline.

The F1-score distributions of CyberDetect-MLP and the best baseline over 10 independent runs are shown in Fig. 9. As the boxplot shows, CyberDetect-MLP attains a higher median F1-score with lower variance than the baseline, indicating that the proposed method is both more accurate and more stable. This visual analysis corroborates the statistical results reported in Table 4, validating the robustness and reliability of the proposed model.

Ablation study

To assess the individual contribution of each architectural and optimization component in CyberDetect-MLP, we conducted an ablation study by incrementally disabling or replacing critical modules. The experiments were performed on the TON_IoT dataset using identical training setups to ensure fairness. Table 5 summarizes the impact of each modification on classification performance.

Table 5 Ablation study on CyberDetect-MLP components.

The complete CyberDetect-MLP model achieves the highest performance, validating the collective benefit of its optimized components. Notably, removing feature selection results in the largest performance drop (−2.65%), underscoring its importance in eliminating noisy or redundant input variables. Similarly, the absence of dropout and batch normalization results in noticeable degradation, affirming their roles in regularization and training stability. Without the learning rate scheduler, performance dips by roughly 2.4%, indicating the importance of adaptive learning in accelerating convergence and avoiding suboptimal minima. The raw-feature variant performs the worst, reaffirming the necessity of a comprehensive preprocessing pipeline.

Fig. 10

Performance analysis of CyberDetect-MLP ablated variants across accuracy, precision, recall, and F1-score.

Figure 10 presents a comprehensive evaluation of the ablation study for CyberDetect-MLP by comparing performance across four metrics: accuracy (a), precision (b), recall (c), and F1-score (d). The complete model consistently achieves the highest values, with 98.87% accuracy, 98.74% precision, 98.62% recall, and 98.68% F1-score, confirming the synergistic effect of its optimized components.

The variant without feature selection exhibits the sharpest performance drop across all metrics, particularly in recall (95.34%) and F1-score (95.61%), highlighting the importance of eliminating irrelevant or noisy attributes. Removing batch normalization and dropout also leads to noticeable declines, suggesting their critical role in ensuring regularization and stable gradient flow. The learning rate scheduler proves equally vital; its absence leads to suboptimal convergence, resulting in a reduction in accuracy to 96.43%.

The variant trained on raw, unnormalized features performs the worst, reaffirming the necessity of preprocessing in big data environments. The consistent gap between the complete model and ablated versions across all metrics validates the importance of integrating feature engineering, normalization, dropout, batch normalization, and learning rate scheduling for scalable and accurate cyberattack detection in IoT environments. This analysis confirms that each enhancement is non-trivial and makes a meaningful contribution to the final system’s performance.

Explainability analysis

To ensure that the CyberDetect-MLP model offers not only high predictive performance but also interpretability, we incorporated explainability techniques to enhance our understanding of its decision-making process. In particular, we employed Gradient-weighted Class Activation Mapping (Grad-CAM) adapted for multi-layer perceptrons using feature-space gradients. This enables us to visualize which input features most significantly influenced the model’s decisions for specific cyberattack classifications. Figure 11 illustrates model interpretability using Grad-CAM and t-SNE. It shows focused attention for correct predictions, dispersed attention in errors, and well-separated class clusters in the projected feature space.

We generated Grad-CAM heatmaps for both correctly and incorrectly classified samples across multiple attack categories. In the correctly classified instances, the model’s attention was tightly focused on semantically relevant and high-impact features, such as dst_host_srv_count, src_bytes, and is_login_successful. These regions of high activation strongly aligned with known indicators of malicious behavior, validating the model’s capability to learn meaningful patterns.
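A sketch of the feature-space gradient relevance used to adapt Grad-CAM-style heatmaps to the dense architecture is shown below, computed as gradient × input for the predicted class; this is a simplified stand-in for the exact attribution procedure, assuming the trained Keras model.

```python
import numpy as np
import tensorflow as tf

def feature_relevance(model, x):
    """Gradient-of-predicted-class-score times input: one relevance value per feature."""
    x = tf.convert_to_tensor(x[np.newaxis, :], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        probs = model(x, training=False)
        target = tf.reduce_max(probs, axis=1)     # score of the predicted class
    grads = tape.gradient(target, x)              # sensitivity of that score to each feature
    relevance = (grads * x).numpy()[0]            # gradient x input heatmap over features
    return relevance / (np.abs(relevance).max() + 1e-9)
```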

Fig. 11

Explainability analysis using grad-CAM and feature embedding visualization.

In contrast, for the misclassified cases shown in Fig. 11, the heatmaps showed dispersed and diluted attention across lower-weighted or less informative features, such as num_compromised and urgent. This suggests that such errors could be attributed to data imbalance or inherent feature ambiguity.

We also visualized decision boundaries by projecting high-dimensional feature embeddings into a 2D space using t-SNE. The clusters of benign and various attack types (e.g., DoS, Reconnaissance, Backdoor) revealed that CyberDetect-MLP maintains clear separability between classes in most cases. At the same time, overlaps in a few borderline regions explain occasional misclassifications.

These visual interpretations support that the model does not behave as a black box. Instead, its attention maps qualitatively align with the region of interest (ROI) areas that are consistent with expert cybersecurity understanding. This contributes to the trustworthiness and potential real-world deployability of CyberDetect-MLP in critical IoT infrastructures.

To obtain faithful interpretability, we mainly rely on Integrated Gradients94 and SHAP95, which are well suited to tabular data and fully connected neural models. Although Grad-CAM was originally proposed for convolutional architectures96, several attribution studies97,98 adapt a similar gradient-based relevance propagation to dense layers.

Performance comparison with existing methods

This section presents a comparative evaluation of CyberDetect-MLP against existing machine learning and deep learning-based intrusion detection methods. Key performance metrics are analyzed to highlight improvements in accuracy, scalability, and interpretability. A tabular and graphical comparison demonstrates that the proposed framework outperforms state-of-the-art models, validating its effectiveness and novelty for cyberattack detection in large-scale IoT environments.

Table 6 Performance comparison of CyberDetect-MLP with baseline models in IoT cyberattack detection.

Table 6 presents a comparative evaluation of CyberDetect-MLP against five benchmark models. The proposed model achieves the highest accuracy (98.87%) on the full TON_IoT dataset, with notable advantages in scalability and explainability. Unlike prior models with limited interpretability or dataset coverage, CyberDetect-MLP integrates Grad-CAM and an optimized MLP architecture, making it more suitable for real-world IoT deployments.

Fig. 12

Accuracy comparison of CyberDetect-MLP with baseline models.

Figure 12 illustrates the comparative accuracy performance of CyberDetect-MLP alongside five benchmark models widely cited in IoT cybersecurity research. Among the existing approaches, Vinayakumar et al.’s deep CNN-RNN hybrid performs notably well with 97.01% accuracy, while Shone et al.’s autoencoder-DNN model and Lopez-Martin et al.’s CVAE approach achieve 96.21% and 95.03%, respectively. Alrashdi et al., using classical machine learning models such as SVM and RF, achieve 94.67% accuracy on a subset of the TON_IoT dataset. Ferrag et al.’s review encompasses multiple models, with a reported accuracy range of 94–97%, which is approximated here at 95.5%. In contrast, the proposed CyberDetect-MLP, trained on the full TON_IoT dataset and optimized for both depth and regularization, achieves the highest recorded accuracy of 98.87%. This superior result is attributed to the model’s adaptive learning dynamics, advanced preprocessing pipeline, and explainable architecture. The figure illustrates the empirical benefits of integrating big data frameworks with deep learning, highlighting CyberDetect-MLP’s robustness for real-time, scalable intrusion detection in IoT environments.

Table 7 Qualitative comparison of proposed CyberDetect-MLP with existing IoT IDS works (including TON_IoT).

Table 7 qualitatively compares CyberDetect-MLP with recent IoT IDS works that use TON_IoT or similar datasets, in terms of the dataset used, reported improvements, and the machine learning detection approach. The comparison emphasizes each paper’s approach, contributions, and areas of focus rather than numeric metrics. It highlights the contribution of CyberDetect-MLP in achieving big data scalability with explainable AI through the integrated optimized MLP architecture, addressing the identified gaps in interpretability, distributed processing, and real-world IoT deployment.

Real-time inference latency analysis

To verify the real-time detection capability of CyberDetect-MLP, we assessed its inference latency and system response under different data loads using the TON_IoT streaming setup. Experiments were run on the Spark-enabled cluster described in the experimental setup, measuring per-sample inference time, average end-to-end system response delay, and achievable throughput. Streaming rates from 10,000 to 100,000 events per second were considered to reflect realistic IoT deployments.

Table 8 summarizes the latency and throughput metrics observed during these experiments. The results show that CyberDetect-MLP provides consistently low-latency predictions with high throughput, even under heavy streaming load.
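Per-sample inference latency can be estimated with a simple timing loop; the sketch below assumes the loaded model and a batch of preprocessed samples, and excludes the stream and storage hops that contribute to the end-to-end response times in Table 8.

```python
import time
import numpy as np

def measure_latency(model, X, warmup=100):
    """Median per-sample inference time in milliseconds over single-record calls."""
    for x in X[:warmup]:                              # warm-up to exclude graph build time
        model.predict(x[np.newaxis, :], verbose=0)
    times = []
    for x in X[warmup:warmup + 1000]:
        t0 = time.perf_counter()
        model.predict(x[np.newaxis, :], verbose=0)
        times.append((time.perf_counter() - t0) * 1000.0)
    return float(np.median(times))
```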

Table 8 Real-time latency and throughput metrics of CyberDetect-MLP.

Results show that CyberDetect-MLP incurs sub-millisecond inference delay per sample and less than 12 ms of end-to-end system response time, even at streaming rates of up to 100,000 events/s. The Spark-based distributed architecture supports incremental data loads and horizontal scalability, making the system suitable for real-time IoT deployments that must handle highly variable traffic in smart city and smart factory environments.

Fig. 13

Real-time latency metrics of CyberDetect-MLP across different streaming rates.

The inference time per sample and the total response delay experienced by the system, as a function of the streaming rate, are shown in Fig. 13. These trends indicate that both metrics increase steadily with increasing traffic load and remain within ranges acceptable for real-time IoT workloads. This visualisation demonstrates that, even under a high-volume data stream, CyberDetect-MLP is both scalable and responsive.

Scalability and resource utilisation benchmarks

Finally, to further verify the scalability of CyberDetect-MLP for large-scale IoT deployments, we conducted empirical benchmarking on the Spark-based distributed setup, gradually increasing the workload using the TON_IoT data stream. The purpose was to measure processing throughput and system resource usage across the cluster nodes while maintaining real-time detection capabilities. Throughput, CPU, GPU, and memory utilization for three workload scenarios are summarized in Table 9; each row reports the number of events processed per second and the associated average hardware usage across the cluster.

Table 9 Scalability and resource utilization metrics for CyberDetect-MLP.

Consistent with the latency results in the previous subsection, the benchmarks show that CyberDetect-MLP remains low-latency and scales nearly linearly as the workload increases. CPU and GPU usage grow in proportion to each other but remain well below saturation even at the highest streaming rates, and memory stays comfortably below 80%. These benchmarks validate that the proposed architecture processes high-rate IoT traffic with efficient resource utilization. The Spark-based pipeline can be scaled out horizontally by adding nodes, yielding additional throughput with little marginal performance loss.

Fig. 14

Scalability and resource utilization metrics of CyberDetect-MLP across varying streaming rates.

Figure 14 shows throughput and hardware utilization trends with increasing data load, illustrating the scalability of CyberDetect-MLP. Throughput grows almost linearly, and the concurrent utilization measurements indicate near-linear CPU and GPU scaling that remains below saturation. These benchmarks show that the architecture can be provisioned to the resources required by real-time, large-scale IoT platforms while keeping resource usage balanced.

Cross-dataset validation and robustness evaluation

To assess CyberDetect-MLP’s ability to generalize beyond TON_IoT, we conducted cross-dataset validation on two additional IoT IDS benchmarks, UNSW-NB15 and BoT-IoT. Both datasets were unseen by the model during training and were collected in different domains, with traffic distributions that differ from TON_IoT. Evaluating directly on these datasets, without any retraining or feature realignment, simulates a realistic domain-shift condition for a model trained only on TON_IoT.

Table 10 Cross-dataset validation results for CyberDetect-MLP (trained on TON_IoT).

The results in Table 10 show that CyberDetect-MLP suffers a performance drop when tested out of domain, owing to distribution and protocol differences; nonetheless, it still offers competitive cyberattack detection performance in heterogeneous IoT environments, demonstrating the strength and flexibility of the proposed framework. Future work will explore more sophisticated domain adaptation and transfer learning approaches to reduce the impact of domain shift and further improve generalization across environments.

Fig. 15

Cross-dataset validation results of CyberDetect-MLP across heterogeneous IoT IDS datasets.

Figure 15 presents the cross-dataset evaluation of CyberDetect-MLP, trained on TON_IoT and tested on UNSW-NB15 and BoT-IoT. Compared with in-domain performance, we observe a slight decline in accuracy and F1-score, reflecting the effects of domain shift. Nevertheless, the model continues to deliver a high detection rate, confirming its robustness and versatility across varying IoT network conditions.

Discussion

The field of cyberattack detection in IoT environments has gained substantial attention due to the increasing volume, velocity, and variety of data generated by connected devices. Traditional machine learning approaches, while helpful, often fail to scale with real-time traffic, exhibit reduced accuracy in high-dimensional datasets, and lack interpretability, mainly when applied to heterogeneous IoT settings. Deep learning-based models have shown promise in overcoming some of these challenges; however, existing frameworks often fall short in addressing the significant issues of data context, explainability, and deployment scalability.

This study bridges these gaps by proposing CyberDetect-MLP, a big-data-enabled, optimised deep learning framework tailored for scalable, explainable cyberattack detection in IoT networks. Unlike prior works that either rely on shallow models or narrowly scoped datasets, the proposed system leverages the full TON_IoT dataset. It integrates Apache Spark for distributed preprocessing and model training, ensuring responsiveness in large-scale settings. The model incorporates architectural optimizations in the MLP structure to improve generalization, and its integration with Grad-CAM enables interpretability—a key novelty compared to black-box deep models.

We report per-class performance metrics to verify that class-imbalance handling is effective; the per-class results show consistent improvement for minority classes. Although rare attack types remain challenging, class weighting combined with controlled resampling yielded more balanced predictions across classes and reduced bias toward the majority classes.

Experimental results demonstrate that CyberDetect-MLP outperforms existing baseline models in terms of accuracy, scalability, and explainability. The model achieved an accuracy of 98.87%, indicating robust detection capability even in noisy and imbalanced conditions. Performance comparisons, ablation studies, and visualization-supported analysis further confirm the effectiveness of the proposed methodology. By addressing limitations of scalability, explainability, and real-world applicability found in state-of-the-art solutions, this research contributes a practically viable and theoretically sound framework for IoT-centric security analytics. The implications of this work extend to secure smart city deployments, industrial IoT monitoring, and adaptive threat response systems, where both high performance and interpretability are crucial. The discussion of study-specific limitations is presented separately in Sect. 5.1 to maintain focus on the strengths and outcomes of the current research.

Limitations of the study

Despite the strong performance of CyberDetect-MLP, the study has certain limitations. First, although the model is trained on the full TON_IoT dataset and preliminary cross-dataset results on UNSW-NB15 and BoT-IoT are reported, it has not been extensively validated on further cross-domain datasets such as CIC-IDS2017, which may limit its generalizability. Second, real-time behavior was assessed with simulated Kafka streams; performance under live production traffic over extended periods is yet to be evaluated. Third, although Grad-CAM provides interpretability, it is primarily designed for convolutional architectures and may offer limited insights for MLP-based decisions. These limitations open opportunities for further refinement and extension in future work.

Conclusion and future work

This paper proposed CyberDetect-MLP, a fast, scalable, and interpretable IDS framework suited to the resource-constrained nature of IoT. By combining big data pipelines (Kafka/Flume–HDFS–Spark) with an optimized MLP design, it addresses the main shortcomings of current IDS approaches, including poor scalability, limited transparency, and ineffective handling of high-dimensional IoT traffic. Results on the TON_IoT dataset show that our approach outperforms several standard baselines in accuracy, recall, and F1-score. Model interpretability, demonstrated through Grad-CAM and SHAP, improved administrator trust in the model and supported incident response. Although this research focused primarily on one dataset, the work can be extended with a broader cross-dataset validation suite, along with mitigation strategies for class imbalance and benchmarking under concept drift. Quantitative evaluations of real-time inference latency and scalability over edge–fog deployments are also planned. The framework, shown to be practical and reproducible, provides an IDS solution for securing IoT ecosystems and paves the way for future work that integrates explainable and scalable AI into next-generation cyber defense systems.

Looking ahead, several extensions can further improve the proposed system. First, validating the model across diverse datasets such as CIC-IDS2017 and UNSW-NB15 will test its cross-domain generalizability. Second, deploying the system in a real-time streaming environment using tools like Apache Kafka will evaluate its operational feasibility under live traffic. Third, integrating additional XAI methods suited for non-convolutional models may improve interpretability and decision transparency. Furthermore, adapting the framework to edge-computing scenarios or federated learning environments could ensure privacy-preserving and resource-efficient deployments. These directions offer rich potential for making CyberDetect-MLP a practical tool in next-generation IoT cybersecurity solutions.