Introduction

In information technology, cybersecurity is currently among the most demanding research topics. It is especially challenging to attain where evolving technologies such as the IoT are concerned1. This development poses a vast threat to data confidentiality, availability, and integrity that malicious actors might exploit. IoT security has drawn increasing attention as numerous novel applications that depend on connected devices have emerged2. With the high popularity of the IoT, attacks against connected devices have become a vital issue. IoT devices are exposed to attacks through numerous methods, such as privilege escalation, denial of service, and eavesdropping. As a result, the need to defend IoT devices from these attacks is becoming increasingly significant3. Moreover, many devices in an integrated network rely on a wireless network for practical communication, which is vulnerable to eavesdropping, exposing the network to cyberattacks. Therefore, enhanced and highly robust intrusion detection systems (IDS) are required for IoT devices. An IDS helps observe and examine services, data, and the network, supporting effective network management and rapid classification of exposures4. Detecting intruders is a significant stage in certifying the safety of IoT networks. Intrusion detection is the protection mechanism for handling security intrusions5. The IDS is the primary tool employed to secure conventional information systems and networks. It observes the processes of a network or a host and informs the system administrator when it identifies a security intrusion. An IDS recognizes unauthorized intrusions, attacks, and malicious actions in the network and constitutes one of the foremost security measures in modern networks6.

The development of advanced technologies such as the IoT, together with abundant storage resources, has given rise to big data. Human beings now generate huge volumes of data over IoT-based sensors and devices, altering how various data features are produced7. Providing privacy and protection for big data is the biggest challenge facing developers of security management systems, particularly given the prevalent usage of internet networks and the fast evolution of data produced from multiple resources; this creates more room for intruders to launch malicious attacks. Figure 1 represents the structure of big data. New attacks are developed frequently, while insiders take advantage of their authorization to access the system and attack it, leaving few suspicious tracks behind8. Conventional knowledge-based IDSs should give way to intelligent, data-driven networks. Unlike other models, intrusion detection based on deep learning (DL) and ML achieves superior results. DL has robust capabilities, such as good generalization, self-adaptation, self-learning, and recognition of unknown attack behaviour9. Meanwhile, ML models are flexible and scalable and, in numerous ways, can meet the exclusive demands of IoT security better than any other technique presently utilized10. Furthermore, DL techniques extract features independently, can handle vast amounts of data, and beat classical ML in accuracy and performance.

Fig. 1
figure 1

Big data architecture.

This manuscript presents a Privacy-Preserving Statistical Learning with an Optimization Algorithm for a High-Dimensional Big Data Environment (PPSLOA-HDBDE) approach. The primary purpose of the PPSLOA-HDBDE approach is to utilize advanced optimization and ensemble techniques to ensure data confidentiality while maintaining analytical efficacy. In the primary stage, the linear scaling normalization (LSN) method scales the input data. Besides, the sand cat swarm optimizer (SCSO)-based feature selection (FS) process is employed to decrease the high dimensionality problem. Moreover, intrusion detection is performed using an ensemble of temporal convolutional network (TCN), multi-layer auto-encoder (MAE), and extreme gradient boosting (XGBoost) models. Lastly, the hyperparameter tuning of the three models is accomplished by utilizing an improved marine predator algorithm (IMPA) method. An extensive range of experimentations is performed to validate the performance of the PPSLOA-HDBDE technique, and the outcomes are examined using distinct measures. The major contributions of the PPSLOA-HDBDE technique are listed below.

  • The LSN is utilized to scale input data effectually, confirming consistent feature ranges. This preprocessing step substantially improves the performance of subsequent models by enhancing their convergence and accuracy. Furthermore, it facilitates enhanced feature interpretability and mitigates the impact of noise in the data.

  • A SCSO-based FS process is implemented to handle the dataset’s high-dimensionality challenges. This methodology enhances the model’s efficiency by mitigating computational complexity and improves accuracy by retaining the most relevant features. Ultimately, it results in more robust and reliable model performance in intrusion detection tasks.

  • An ensemble approach is utilized, integrating a TCN, MAE, and XGBoost for robust intrusion detection. This incorporation employs the merits of every model, improving overall detection accuracy and resilience against various attack patterns. The ensemble approach portrays superior performance compared to individual models, crucially enhancing the intrusion detection capabilities.

  • Optimized performance of the models is attained via hyperparameter tuning using an IMPA technique. This methodology systematically explores the parameter space, ensuring that the optimal settings for every method are identified. As a result, the overall efficiency and reliability of the IDS are substantially improved.

  • Integrating several advanced methods, including LSN, SCSO, ensemble models, and IMPA, creates a unique framework for improving IDSs. This methodology emphasizes enhanced accuracy and effectiveness and cohesively addresses the challenges of high dimensionality and model optimization. The novelty lies in the synergistic combination of these techniques, which collectively outperform conventional models in real-world applications.

Review of literature

Haseeb et al.11 present an AI-aided routing method for mobile wireless sensor networks (MWSN) to enhance energy efficiency and identify transmission link errors. Furthermore, the presented smart security technique increases the reliability of constrained devices on random paths. Initially, it employs a metaheuristic optimizer, a genetic algorithm (GA), to identify possible solutions and, depending upon independent metrics, creates an optimum set of routings. Next, novel routes are recognized utilizing dynamic decisions to fulfil energy concerns. In12, a new Artificial Intelligence (AI)-based energy-aware IDS and safe routing method is presented to improve the security of IWSNs. Primarily, the presented method performs intrusion detection to classify several attacks. Later, a game strategy-based decision device is incorporated with the presented ID method to decide whether security is required. In the latter stage, an energy-aware ad-hoc on-demand distance vector method is established to deliver safe routing between the several nodes. Ntizikira et al.13 present the honeypot and blockchain (BC)-based intrusion detection and prevention (HB-IDP) method, in which edge computing is proposed to decrease the latency in transmission. Primarily, three-fold verification is executed utilizing the camellia encryption algorithm (CEA), which offers confidential keys. The method performs preprocessing by utilizing min-max normalization. Signature-based ID is executed on the preprocessed data, with well-known attacks categorized into three modules utilizing the improved isolation forest (IIF) method. Kipongo et al.14 propose an improved honeycomb structure-based IDS for SDWSN that contains safe verification utilizing the 3D cube method, improved honeycomb-based network reinforcement learning (RL), clustering-based smart routing with transfer learning (TL)-based deep Q networks (TLDQNs), and a hybrid IDS. In15, an advanced network IDS is proposed for an IoT-based smart home environment. Distinct from present methods, the overall approach integrates IoT devices as possible vectors in cyber attack environments, a concern frequently ignored in preceding studies. Using the harmony search algorithm (HSA), the method developed the extra trees classifier (ETC) by tuning a widespread array of hyperparameters. Shitharth et al.16 proposed a multi-attack IDS for edge-aided IoT, which unites a backpropagation NN (BPNN) with a radial basis function (RBF) NN. The backpropagation NN is mainly used to identify outliers and zero in on the most significant features for every attacking approach. The RBF-based NN is utilized to identify multi-attack intrusions. Sajid et al.17 presented a hybrid method for ID with DL and ML methods to address these restrictions. The presented technique uses CNN and XGBoost methods for feature extraction and then unites them with the LSTM technique for identification.

Salama and Ragab18 present a new BC and Explainable AI-driven Intrusion Detection for the IoT-driven Ubiquitous Computing System (BXAI-IDCUCS) method. The BXAI-IDCUCS method primarily groups the IoT nodes using an energy-aware duck swarm optimizer (EADSO). A DNN is also used to classify and detect data. Finally, the BC technique is used for safe inter-cluster data communication procedures. Vakili et al.19 propose a service composition methodology by utilizing Grey Wolf Optimization (GWO) and the MapReduce framework to compose services with optimized QoS. Ntizikira et al.20 present the Secure and Privacy-Preserving Intrusion Detection and Prevention for UAVS (SP-IoUAV) model by using federated learning (FL), differential privacy, and CNN-LSTM for real-time anomaly detection. Heidari et al.21 present an algorithm for constructing an optimal spanning tree by incorporating an artificial bee colony (ABC), genetic operators, and density correlation. Heidari, Navimipour, and Unal22 propose a BC-based RBF neural networks (RBFNNs) model. Heidari et al.23 present a model using fault trees and Markov chain analysis. Wang et al.24 introduce the DL-BiLSTM lightweight IoT intrusion detection model, which integrates deep neural networks (DNNs) and bidirectional LSTMs (BiLSTMs) for effectual feature extraction. The method employs an incremental principal component analysis (PCA) for dimensionality reduction. Zanbouri et al.25 propose a Glowworm Swarm Optimization (GSO) model to optimize performance in BC-based Industrial IoT (IIoT) systems. Zhang et al.26 introduce a dispersed privacy-preserving energy scheduling methodology using multi-agent deep RL (DRL) for energy harvesting clusters. Amiri, Heidari, and Navimipour27 introduce a novel taxonomy of DL methods. Devi and Arunachalam28 present a method that utilizes DL to detect attack nodes with a deep LSTM classifier. The Improved Elliptic Curve Cryptography (IECC) approach is employed for prevention, with hybrid MA-BW optimization for optimal key selection. Wang et al.29 propose a high-dimensional temporal data publishing methodology by utilizing dynamic Bayesian networks and differential privacy. The approach constructs a network based on mutual information, assesses edge sensitivity with Coherent Neighborhood Propinquity, and adds noise to attributes to meet ε-differential privacy standards. Younis et al.30 introduce FLAMES2Graph, a horizontal FL framework. Zhao et al.31 present the Variational AutoEncoder Gaussian Mixture Model Clustering Vertical FL Model (VAEGMMC-VFL) model. Chougule et al.32 introduce a Privacy-Preserving Asynchronous FL model. El-Adawi et al.33 propose a model using Gramian angular field (GAF) and DenseNet. The approach comprises preprocessing signals through artefact removal and median filtering (MF), then converting time series data into 2D images utilizing the GAF approach. Zainudin et al.34 present an FL framework employing Chi-square and Pearson correlation coefficient methods. Bushra et al.35 introduce the attention-based random forest (ABRF) with stacked bidirectional gated recurrent unit (stacked Bi-GRU) methodology.

Hou et al.36 propose the adaptive training and aggregation FL (ATAFL) framework through a joint optimization problem. The model also incorporates the digital twin and DRL model for optimal node selection and resource allocation. Jiang et al.37 propose an FL framework utilizing a conditional generative adversarial network (CGAN) method. Additionally, the model suggests an FL scheme, namely FedDWM, which effectually combines local model parameters from terrestrial clients to satellite servers. Feng et al.38 present a model sparsification strategy by employing contrastive distillation; the framework enhances local-global model alignment and maintains performance. Abdallah et al.39 explore existing ML and DL methods for detecting anomalies. Shan et al.40 introduce the CFL-IDS framework by employing local models’ evaluation metrics. An intelligent, cooperative model aggregation mechanism (ICMAM) method optimizes local model weights and reduces interference from subpar models. Babu, Barthwal, and Kaluri41 propose a trusted BC system for edge-based 5G networks. Begum et al.42 improve data safety by utilizing a secret key to scramble input, applying the Burrows-Wheeler Transform (BWT), and compressing the result with Move-To-Front and Run-Length Encoding, integrating cryptographic principles to improve performance. Babu et al.43 introduce a cooperative flow in fog-enabled IoT networks utilizing a permissioned BC system. Devarajan et al.44 present the adapted particle swarm optimization integrated FL-based sentiment analysis integrated deep learning (aPSO-FLSADL) for sentiment analysis. The method also utilizes SentiWordNet for sentiment scoring and BERT for word embedding, with a CNN-BiLSTM model for training. Yenduri et al.45 explore how BC technology enhances protection and transparency. Hao et al.46 propose a protocol for wireless applications in the multi-server environment. Saheed et al.47 present a hybrid model by integrating Autoencoder and Modified Particle Swarm Optimization (HAEMPSO) for feature selection, with a DNN for classification, optimizing the DNN parameters via a modified inertia weight in PSO. Saheed, Abdulganiyu, and Tchakoucht48 introduce the IoT-Defender framework integrating a Modified Genetic Algorithm (MGA) with a deep Long Short-Term Memory (LSTM) network for cyberattack detection in IoT networks. Saheed, Omole, and Sabit49 present the GA-mADAM-IIoT model by integrating a GA, attention mechanism, and modified Adam-optimized LSTM across six modules.

Heidari et al.50 provide insights into the generation and detection of deepfakes, explore recent improvements, detect weaknesses in current security methods, and emphasize areas needing additional research. Heidari et al.51 introduce a BC-based FL solution incorporating SegCaps, CNN, and capsule networks for enhanced image feature extraction, data normalization, and confidentiality in global model training. Boopathi et al.52 propose a strategy to improve edge computing data privacy by utilizing secure transfer, DL optimization, and trust-based encryption with hybrid federated networks. Heidari, Navimipour, and Otsuki53 review the challenges and merits of Cloud Non-destructive Characterization Testing (CNDCT), comparing cloud-based testing environments with conventional system testing techniques. Asadi et al.54 provide a comprehensive overview of botnets, their evolution, detection methods, and evasion techniques while underscoring future research directions for combating these security threats. Ramkumar et al.55 propose the GACO-MLF framework, using ML and an enhanced ant colony optimization (ACO) model for balancing loads efficiently across IoT-PCN data centres. Heidari, Jamali, and Navimipour56 propose a fuzzy multicriteria decision-making (MCDM)-based re-broadcasting scheme (FMRBS) for VANETs to reduce broadcast storms and improve data distribution. Saini et al.57 propose a trust-based, hybrid privacy-preserving strategy for cloud computing. Dansana, Kabat, and Pattnaik58 introduce an ML-based perturbation approach utilizing clustering, IGDP-3DR, and SVD-PCA for dimensionality reduction, followed by classification with KSVM-HHO for improved accuracy. Zhang and Tang59 propose the VPPLR framework for secure logistic regression training, utilizing secret sharing and a vectorization approach for effectual global parameter updates, bypassing homomorphic encryption. Jadhav and Borkar60 developed a data sanitization technique using the Marine Predator Whale Optimization (MPWO) approach to generate optimal keys, preserving the privacy of sensitive data. Vasa and Thakkar61 explore methods for protecting privacy in DL models for big data, discussing potential attacks and privacy protection approaches and proposing an effectual solution for enhancing privacy in DL models. Jahin et al.62 highlight the requirement for varied forecasting based on SC objectives, optimizing models using KPIs and error measurements. Bajpai, Verma, and Yadav63 propose a novel methodology integrating extended PCA and reinforcement learning to enhance clustering, data reduction, network service life, energy efficiency, and data aggregation. Song et al.64 present a GCNN-IDS approach that uses gene expression programming (GEP) to optimize CNN parameters, preventing local optima through global search capabilities.

Chen and Huang65 developed a privacy-preserving FL methodology to predict airline passengers’ willingness to pay for upgrades by securely incorporating multi-source data without compromising customer privacy. Dodda et al.66 examine three privacy-preserving algorithms—regularized logistic regression with DP, SGD with private updates, and distributed Lasso—underscoring the impact of training data volume on error rates and privacy. Kamatchi and Uma67 present an FL approach for detecting and reducing insider threats in IoT devices. It utilizes hybrid RSA and elliptic curve digital signatures for user registration, node clustering for privacy, and federated optimization with secure hashing for improved protection. Dhavamani et al.68 propose an enhanced Particle Swarm Optimization (PSO) approach for achieving differential privacy in IoT data sharing, optimizing privacy-preserving mechanisms while maintaining data utility through improved fitness functions, dynamic inertia weight, and adaptive coefficients. Xia69 proposes Split Federated Mutual Learning (SFML), an FL approach for traffic classification that uses split and mutual learning, where clients maintain privacy and public models, sharing knowledge through distillation while offloading computation to the server. Liu70 presents Hilbert-ConvLSTM, a novel DL method for location data prediction that enhances data availability and user privacy. It also utilizes Hilbert curve partitioning, spatio-temporal feature extraction, and differential privacy with Laplace noise to protect user information. Chowa et al.71 present a privacy-preserving self-supervised framework for COVID-19 classification from lung CT scans utilizing FL and Paillier encryption, ensuring secure decentralized training with unlabeled datasets from multiple hospitals. Huang et al.72 present GeniBatch, a batch compiler that optimizes PPML programs with PHE for efficiency, ensuring result consistency and preventing bit-overflow. Integrated into FATE, it utilizes SIMD APIs for hardware acceleration. Hossain et al.73 propose a privacy-preserving SSL-based Intrusion Detection System for 5G-V2X networks, using unlabeled data for pre-training and minimal labelled data for post-training, improving cyber-attack protection without compromising privacy.

Bezanjani et al.74 propose a three-phase methodology: blockchain-based encryption for secure transactions, request pattern recognition to detect unauthorized access, and BiLSTM-based feature selection for enhanced intrusion detection accuracy. Deebak and Hwang75 introduce a privacy-preserving learning mechanism for failure detection, using lightweight model aggregation at the edge and a 2D-CNN for improved privacy protection and accuracy without extra verifiability. Zhou et al.76 propose PPML-Omics, a secure ML method using decentralized differential private FL, ensuring privacy protection while analyzing omic data across diverse sequencing technologies and DL techniques. Babu et al.77 explore the challenges and solutions in managing IoT data, focusing on scalable modelling, real-time processing, and security. Their work highlights the significance of feature engineering and model selection for effectual IoT data analysis, contributing to improved decision-making and operational efficiency in IoT applications. Li et al.78 present a lightweight privacy-preserving predictive maintenance technique using binary neural networks (BNNs) and homomorphic encryption for privacy protection in 6G-IIoT scenarios. Li et al.79 propose a new training metric, Intra-modal Consistent Contrast Loss, to improve image-text retrieval accuracy. A quadtree index structure with hybrid representation vectors mitigates retrieval overhead, while encrypted feature vectors enable secure image-text matching in a large-scale ciphertext environment. Yang et al.80 propose a privacy-preserving ML model using a cloud-edge-end architecture, optimizing IoT systems by offloading tasks to edge servers and using homomorphic encryption, secret sharing, and differential privacy for improved privacy and reduced computational burden. Mumtaz et al.81 explore privacy-preserving data analysis methods utilizing AI-based approaches like differential privacy, FL, GANs, and VAEs, emphasizing their efficiency in protecting privacy while analyzing data. Keerthana82 examines the tactics and methods utilized in FL to preserve privacy during cooperative model training across dispersed devices, ensuring data security in ML workflows. Zhang et al.83 introduce SensFL, a privacy-enhancing methodology for protecting against privacy inference attacks in VFL by regularizing embedding sensitivity, preventing data reconstruction. Iam-On et al.84 propose a bi-level ensemble clustering framework that ensures data privacy and mitigates complexity by choosing multiple clusterings from each segment.

The existing studies present diverse enhancements in intrusion detection and prevention systems across various contexts. One approach improves mobile WSNs by employing a GA to optimize routing and enhance energy efficiency. Another technique concentrates on AI-based IDSs that classify attacks and integrate game theory for secure routing. A honeypot and BC-based strategy is presented to mitigate latency in data transmission, while an enhanced honeycomb structure integrates RL and clustering for robust security. Methods utilizing DL classifiers and hybrid methods aim to address threats to IoT devices, including using optimized algorithms for feature extraction and improving model performance. Additionally, FL frameworks are presented to ease secure data sharing and decision-making, incorporating real-time anomaly detection mechanisms. Other proposals accentuate energy-aware scheduling and the usage of dynamic models to manage data privacy efficiently. At the same time, some frameworks explore the application of advanced DL methodologies to detect and classify threats in diverse environments. These innovations enhance security, efficiency, and reliability in increasingly intricate networked systems. Despite crucial enhancements in intrusion detection and prevention systems, a notable research gap remains in incorporating diverse models to improve adaptability and scalability in real-time environments. Many existing models struggle with resource constraints and fail to maintain performance across varying data distributions. This underscores the requirement for more holistic solutions that effectively address the complexities of modern cyber threats in dynamic settings.

Materials and methods

This manuscript presents a novel PPSLOA-HDBDE methodology. The primary purpose of the PPSLOA-HDBDE model is to leverage advanced optimization and ensemble techniques to ensure data confidentiality while maintaining analytical efficacy. It encompasses four processes involving data preprocessing, SCSO-based FS, ensemble classification models, and an IMPA-based parameter optimizer. Figure 2 portrays the entire flow of the PPSLOA-HDBDE methodology.

Fig. 2
figure 2

Overall flow of the PPSLOA-HDBDE model.

Data preprocessing: LSN

In the first stage, the PPSLOA-HDBDE model utilizes LSN to scale the input data. This normalization technique is specifically beneficial for algorithms sensitive to feature ranges, namely gradient-based approaches. By employing LSN, the model enhances convergence rates during training and improves overall performance, as it mitigates the dominance of large-magnitude features and scales the data to a uniform range. LSN is also simpler to implement and interpret than other normalization methodologies, making it an accessible option for practitioners. Its capacity to retain the associations between features while standardizing their scales additionally contributes to the accuracy and reliability of the model. Overall, LSN is a valuable preprocessing step that improves the effectiveness of the PPSLOA-HDBDE model in handling diverse datasets. LSN, also called min-max normalization, rescales features to a definite range, usually [0, 1]. This ensures that every feature contributes equally to the analysis by removing the effects of differing units and magnitudes. Transforming the data linearly improves the performance of ML techniques, chiefly those sensitive to feature scales, like gradient descent-based approaches. Linear scaling enhances the rate of convergence and the overall accuracy of the model. This method is mainly valuable in high-dimensional databases, ensuring that no feature excessively affects the outcomes.
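As a concrete illustration, the following is a minimal NumPy sketch of LSN over a samples-by-features matrix; the sample matrix and the `eps` guard are illustrative additions, not values from the paper.

```python
import numpy as np

def linear_scaling_normalize(X, eps=1e-12):
    """Rescale each feature (column) of X linearly into [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # eps guards against division by zero for constant features
    return (X - x_min) / (x_max - x_min + eps)

X = np.array([[2.0, 300.0], [4.0, 100.0], [6.0, 200.0]])
print(linear_scaling_normalize(X))  # each column now spans [0, 1]
```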

FS using SCSO model

Besides, the SCSO-based FS process is applied to decrease the high dimensionality problem85. This model is chosen for its efficiency in handling high dimensionality, a common challenge in ML techniques. SCSO replicates the hunting behaviour of sand cats, allowing it to effectually explore the feature space and detect the most relevant features while averting overfitting. Unlike conventional models, SCSO balances exploration and exploitation, producing a more robust FS. This adaptability makes it appropriate for intricate datasets where irrelevant or redundant features can affect the model’s performance. Furthermore, the SCSO approach exhibits superior convergence speed compared to other optimization methods, confirming quicker processing times (PTs) without sacrificing accuracy. Its innovative mechanism and proven efficacy make SCSO ideal for FS in IDSs. Figure 3 portrays the working flow of the SCSO model.

Fig. 3
figure 3

Workflow of SCSO model.

In a \(d\)-dimensional optimization problem, a \(SandCat\) is a \(1\times d\) array representing a candidate solution, described as \(SandCat_{i}=\left({x}_{1},{x}_{2},\dots,{x}_{d}\right); i\in population\left(1,n\right)\). Every \(x\) should lie between the lower and upper limits: \(\forall {x}_{i}\in\) [lower, upper]. When the SCSO model is applied, it first creates an initialization matrix \(\left({N}_{pop}\times {N}_{d}\right)\) based on the problem size. The fitness cost of every \(SandCat\) is obtained by evaluating the defined fitness function (FF). Once an iteration is completed, the \(SandCat\) with the best cost so far in that iteration is selected. If no improved solution is found in the following iteration, the stored best solution is simply retained rather than duplicated in memory.

Search for prey

The SCSO model exploits sand cats’ auditory ability to detect low frequencies: \(SandCats\) can perceive frequencies below 2 kHz. In this model, the search range is denoted \(r_{G}\). Based on the algorithmic working principle, this value decreases linearly from 2 to \(0\) over the iterations, so the search slowly closes in on the prey without skipping or missing it. The parameter \(S_{m}\), which simulates the sand cat’s hearing characteristic, is accordingly set to 2. The mathematical formulation is described in the following:

$$\:{r}_{G}={S}_{m}-\left(\frac{{S}_{m}\times\:ite{r}_{c}}{ite{r}_{Max}}\right)$$
(1)

Here, \(iter_{c}\) represents the present iteration count, and \(iter_{Max}\) denotes the maximal iteration count.

The leading parameter that controls the transition between the exploitation and exploration stages is \(R\). Owing to this adaptive strategy, the switch between the dual stages remains balanced. \(R\) is defined as follows:

$$\:R=2\times\:{r}_{G}\times\:rand\left(\text{0,1}\right)-{r}_{G}$$
(2)

To prevent falling into a local optimum, every sand cat’s sensitivity range is varied, defined as:

$$\:r=R\times\:rand\left(\text{0,1}\right)$$
(3)

During the SCSO model, the \(SandCat\) updates its position according to the best solution, its present position, and its sensitivity range, looking for another possible optimal position of prey. This searching behaviour is delineated as follows:

$$Pos\left(t+1\right)=r\times \left(Po{s}_{bc}\left(t\right)-rand\left(\text{0,1}\right)\times Po{s}_{c}\left(t\right)\right)$$
(4)

\(\:Po{s}_{bc}\) represents the best candidate position, and \(\:Po{s}_{c}\) denotes the present position.

Attacking prey

When a \(SandCat\) attacks its prey, it first utilizes the best position \(Po{s}_{b}\) and the present position \(Po{s}_{c}\) to create a randomly formed position \(Po{s}_{rnd}\). The sensitivity range is assumed to be circular; to prevent falling into a local optimum, a roulette-wheel mechanism picks a random angle \(\theta\) for every \(SandCat\), so that the \(SandCat\) can approach the searching position:

$$\:Po{s}_{rnd}=\left|rand\left(\text{0,1}\right)\times\:Po{s}_{b}\left(t\right)-Po{s}_{c}\left(t\right)\right|$$
(5)
$$\:Pos\left(t+1\right)=Po{s}_{b}\left(t\right)-r\times\:Po{s}_{rnd}\times\:cos\left(\theta\:\right)$$
(6)

The attack and search stages of the SCSO model are balanced by the adaptive parameters \(r_{G}\) and \(R\). These parameters permit SCSO to shift effortlessly between the dual stages. If \(\left|R\right|>1\), the \(SandCats\) hunt for prey; if \(\left|R\right|\le 1\), the \(SandCats\) attack prey. The pseudocode of the SCSO model is demonstrated in Algorithm 1.

Algorithm 1
figure a

Pseudocode of SCSO.
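To complement Algorithm 1, the following is a compact NumPy sketch of one SCSO iteration following Eqs. (1)-(6); the population layout, the random-number source, and the uniform draw for the roulette-selected angle are simplifying assumptions.

```python
import numpy as np

def scso_step(pop, best, iter_c, iter_max, s_m=2.0, rng=np.random):
    """One SCSO iteration over an (n, d) population toward the best position."""
    n, d = pop.shape
    r_g = s_m - (s_m * iter_c / iter_max)            # Eq. (1): shrinks 2 -> 0
    new_pop = np.empty_like(pop)
    for i in range(n):
        R = 2.0 * r_g * rng.rand() - r_g             # Eq. (2)
        r = R * rng.rand()                           # Eq. (3)
        if abs(R) > 1:                               # search for prey
            new_pop[i] = r * (best - rng.rand(d) * pop[i])        # Eq. (4)
        else:                                        # attack prey
            pos_rnd = np.abs(rng.rand(d) * best - pop[i])         # Eq. (5)
            theta = rng.uniform(0, 2 * np.pi)        # roulette-selected angle
            new_pop[i] = best - r * pos_rnd * np.cos(theta)       # Eq. (6)
    return new_pop
```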

The FF used in the SCSO technique is designed to balance the number of chosen features in every solution against the classification accuracy obtained by utilizing these selected features. Equation (7) denotes the FF for evaluating solutions.

$$\:Fitness=\alpha\:{\gamma\:}_{R}\left(D\right)+\beta\:\frac{\left|R\right|}{\left|C\right|}$$
(7)

Here, \({\gamma }_{R}\left(D\right)\) signifies the classification error rate, \(\left|R\right|\) denotes the cardinality of the selected subset, and \(\left|C\right|\) represents the total number of features in the database. \(\alpha\) and \(\beta\) are dual parameters weighting the significance of classification quality and subset length, with \(\alpha \in \left[0,1\right]\) and \(\beta =1-\alpha\). A hedged sketch of this fitness is given below.
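In the sketch, the error rate \({\gamma }_{R}(D)\) is estimated via a wrapped classifier trained on the selected features; the KNN classifier, 3-fold cross-validation, and \(\alpha = 0.99\) are illustrative choices rather than settings fixed by the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fs_fitness(mask, X, y, alpha=0.99):
    """Eq. (7): alpha * error_rate + beta * |R| / |C| for a binary feature mask."""
    beta = 1.0 - alpha
    if mask.sum() == 0:                  # empty subsets are invalid
        return 1.0
    clf = KNeighborsClassifier(n_neighbors=5)
    acc = cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()
    error = 1.0 - acc                    # gamma_R(D) in Eq. (7)
    return alpha * error + beta * mask.sum() / X.shape[1]
```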

Ensemble classification models

Moreover, intrusion detection is accomplished by using an ensemble of TCN, MAE, and XGBoost models. These methods are chosen for their complementary merits. TCN excels at capturing temporal patterns in data, making it ideal for detecting time-dependent attacks. MAE efficiently learns intricate feature representations, improving the technique’s capability to detect anomalies. XGBoost, known for its high performance on structured data, provides robust predictions with strong generalization capabilities. Integrating these models employs their advantages, enhancing accuracy and resilience against various attack types. Moreover, the ensemble approach mitigates the risk of overfitting and improves overall reliability, making it more effective than using any single model alone. This integrated strategy confirms a more comprehensive and effectual IDS.
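One simple way to realize such an ensemble is probability-level fusion; the soft-voting rule and the equal weights below are assumptions, since the text does not fix how the three model outputs are combined.

```python
import numpy as np

def ensemble_predict(prob_tcn, prob_mae, prob_xgb, weights=(1/3, 1/3, 1/3)):
    """Each prob_* is an (n_samples, n_classes) class-probability matrix."""
    fused = (weights[0] * prob_tcn
             + weights[1] * prob_mae
             + weights[2] * prob_xgb)
    return fused.argmax(axis=1)          # predicted class per sample
```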

TCN classifier

TCN is a recent neural network that builds upon the structure of the CNN86. TCN utilizes structures such as dilated causal convolution (DCC) and residual blocks. Compared to a conventional CNN, DCC attends only to historical and present data without seeing future data. This indicates that the output value \({y}_{t}\) at the \(t\)-th time step is formed only by the inputs at the \(t\)-th time step and before, i.e. \(\left\{{x}_{0},{x}_{1},\ldots,{x}_{t-1},{x}_{t}\right\}\). A conventional CNN enlarges the receptive field by including pooling layers, leading to information loss. In contrast, TCN employs DCC to enlarge the receptive field, permitting it to capture dependencies over a longer range without losing information. The mathematical formulation is computed below:

$$F\left(t\right)=\sum\limits_{i=0}^{k-1}f\left(i\right)\times {x}_{t-p\times i}$$
(8)

Here, \(t\) signifies the sequence element index, \(p\) represents the dilation factor, \({x}_{i}\in {R}^{n}\) denotes the sequence element, \(F\left(t\right)\) refers to the dilated convolution of element \({x}_{t}\), and \(k\) denotes the size of the convolutional kernel. The residual connection adds the input \(x\) to the output \(h\left(x\right)\) of the learned mapping \(f\left(x\right)\), which is stated below:

$$\:f\left(x\right)=h\left(x\right)-x$$
(9)

The causal convolution in TCN ensures that information flows strictly from the past to the future, thereby averting data leakage from the future to the past. Even with fewer layers, the DCC permits TCN to hold a larger receptive field, enabling it to process longer time-series data. Besides, DCC blocks include methods such as dropout regularization, weight normalization, and the \(ReLU\) activation function. These elements improve the model’s non-linear representation abilities and improve its stability and generalization performance, eventually increasing its efficiency.
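The following is a minimal NumPy sketch of the dilated causal convolution in Eq. (8); the toy input and kernel are illustrative.

```python
import numpy as np

def dilated_causal_conv(x, f, p=1):
    """x: input sequence, f: kernel of size k, p: dilation factor (Eq. (8))."""
    k = len(f)
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(k):
            if t - p * i >= 0:           # causal: never index the future
                y[t] += f[i] * x[t - p * i]
    return y

x = np.arange(8, dtype=float)
print(dilated_causal_conv(x, f=np.array([0.5, 0.5]), p=2))
```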

MAE classifier

MAE is a neural network that extracts hidden features from input data87. The MAE architecture comprises two sections: an encoder and a decoder. The encoding and decoding parts contain input and output layers with several fully connected (FC) layers. Each FC layer is accompanied by a ReLU or sigmoid activation. The FC layer mapping from \(x\in {R}^{\varOmega }\) to \(y\in {R}^{\varPsi }\) is calculated as:

$$\:y=\delta\:\left(Wx+b\right),$$
(10)

whereas \(\:\delta\:\left(\bullet\:\right)\) refers to the activation function, \(\:W\in\:{R}^{\varPsi\:\times\:\varOmega\:}\) represents the weighting coefficients matrix, and \(\:b\in\:{R}^{\varPsi\:}\) denotes the biased term.

In training, the auto-encoder reduces the loss function to discover the biases and optimum weighting coefficients. The function of loss is considered as follows:

$$Loss=\frac{1}{H}\sum\limits_{i=0}^{H-1}{\left({x}_{i}-{z}_{i}\right)}^{2}$$
(11)

Here, \({x}_{i}\) represents the \(i\)-th input neuron, \({z}_{i}\) signifies the corresponding output neuron of the auto-encoder, and \(H\) stands for the neuron count in the input layer.

The encoding part of the network converts the input data into a low-dimensional space, and these lower-dimensional representations are then used to reconstruct the input data. Hence, the auto-encoder can be viewed as a non-linear form of the Karhunen-Loeve transform (KLT).
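A minimal sketch of one encoder-decoder layer pair and the reconstruction loss of Eqs. (10)-(11) follows; the random weights, the sigmoid activation choice, and the layer sizes are placeholders rather than trained values.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
H, d_hid = 16, 4                        # input neurons and bottleneck size
W_enc, b_enc = rng.normal(size=(d_hid, H)) * 0.1, np.zeros(d_hid)
W_dec, b_dec = rng.normal(size=(H, d_hid)) * 0.1, np.zeros(H)

x = rng.normal(size=H)
h = sigmoid(W_enc @ x + b_enc)          # Eq. (10): y = delta(Wx + b)
z = sigmoid(W_dec @ h + b_dec)          # decoder mirrors the encoder
loss = np.mean((x - z) ** 2)            # Eq. (11): (1/H) * sum (x_i - z_i)^2
print(loss)
```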

XGBoost classifier

XGBoost combines a linear model solver with a regression tree learning method88. The technique integrates several low-accuracy learners to build a combined architecture that is highly accurate. In the model training procedure, XGBoost improves on the boosting method: each iteration produces an updated DT that fits the residuals created in the prior rounds. XGBoost can thus constantly increase its accuracy and generalization capacity through iterative optimization. Whereas conventional gradient-boosted DT (GBDT) techniques employ only the first-order derivative, XGBoost applies a second-order Taylor expansion of the loss function, tackles model complexity by introducing regularization terms to avoid overfitting, and implements a highly refined estimation method when splitting nodes to capture the non-linear correlations among features. Figure 4 depicts the infrastructure of the XGBoost technique. This method is based on the following mathematical formulations:

Fig. 4
figure 4

Structure of XGBoost model.

The additive ensemble model over DTs is represented as given below:

$$\grave{y}_{i}=\sum\limits_{m=1}^{M}{f}_{m}\left({x}_{i}\right),{f}_{m}\in\:F$$
(12)

Here, \({x}_{i}\) denotes the \(i\)-th input feature vector; \(M\) represents the DT count; \(F\) describes the tree collection space; \(\grave{y}_{i}\) denotes the predicted value.

XGBoost’s loss function is given as:

$$\:Q={\sum\:}_{i=1}^{n}l\left({y}_{i},\grave{y}_{i}\right)+\sum\limits_{m=1}^{M}\theta\:\left({f}_{m}\right)$$
(13)

The first portion of the expression is the prediction error between the estimated values and the actual training values of the XGBoost framework, and the second part signifies the complexity of the trees, which is mainly employed for regularizing the model complexity:

$$\:\theta\:\left({f}_{m}\right)=\gamma\:T+\frac{1}{2}\tau\:\Vert\:\omega\:{\Vert\:}^{2}$$
(14)

where \(\tau\) and \(\gamma\) describe penalty factors.

By adding an increment function \({f}_{t}\left(x\right)\) to the model, the loss function value in Eq. (13) is decreased. The objective at the \(t\)-th iteration is given below:

$$Q_{\left( t \right)} = \sum\limits_{i = 1}^{n} l\left( {y_{i} ,\hat{y}_{i}^{(t)} } \right) + \sum\limits_{m = 1}^{t} \theta \left( {f_{m} } \right) = \sum\limits_{i = 1}^{n} l\left( {y_{i},\hat{y}_{i}^{(t-1)} + f_{t} \left( {x_{i} } \right)} \right) + \theta \left( {f_{t} } \right) + \text{const}$$
(15)

The second-order Taylor expansion of Eq. (15) is utilized to approximate the objective function, and the set of instances in branch \(j\) of the tree is described as \({I}_{j}=\left\{i \mid q\left({x}_{i}\right)=j\right\}\). Then \({Q}_{\left(t\right)}\) can be represented as:

$${Q}_{\left(t\right)}\cong \sum\limits_{j=1}^{T}\left[\left(\sum\limits_{i\in {I}_{j}}{g}_{i}\right){\omega }_{j}+\frac{1}{2}\left(\sum\limits_{i\in {I}_{j}}{h}_{i}+\tau \right){\omega }_{j}^{2}\right]+\gamma T$$
(16)

Now, \({g}_{i}={\partial }_{\hat{y}_{i}^{(t-1)}}l\left({y}_{i},\hat{y}_{i}^{(t-1)}\right)\) and \({h}_{i}={\partial }_{\hat{y}_{i}^{(t-1)}}^{2}l\left({y}_{i},\hat{y}_{i}^{(t-1)}\right)\) denote the loss function’s first- and second-order derivatives, respectively. Define \({G}_{j}=\sum_{i\in {I}_{j}}{g}_{i}\) and \({H}_{j}=\sum_{i\in {I}_{j}}{h}_{i}\):

$$\:{Q}_{\left(t\right)}\cong\:\sum\limits_{j=1}^{T}\left[{G}_{j}{\omega\:}_{j}+\left(\frac{1}{2}\right)\left({H}_{j}+\tau\:\right){\omega\:}_{j}^{2}\right]+\gamma\:T$$
(17)

Setting the partial derivative with respect to \({\omega }_{j}\) to zero yields

$$\:{\omega\:}_{j}=\frac{-{G}_{j}}{{H}_{j}+\tau\:}$$
(18)

Substituting these weights back into the objective function gives:

$${Q}_{\left(t\right)}\cong -\frac{1}{2}\sum\limits_{j=1}^{T}\frac{{G}_{j}^{2}}{{H}_{j}+\tau }+\gamma T$$
(19)
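As a worked instance of Eqs. (17)-(19), the sketch below computes the optimal leaf weight and the resulting objective for a single leaf, assuming a squared-error loss for the gradients; the loss choice and the sample values are illustrative.

```python
import numpy as np

def leaf_stats(y, y_pred):
    g = y_pred - y            # first derivative of 0.5*(y - y_pred)^2 w.r.t. y_pred
    h = np.ones_like(y)       # second derivative of squared error
    return g.sum(), h.sum()   # G_j and H_j over the instances in the leaf

def optimal_leaf(y, y_pred, tau=1.0, gamma=0.0):
    G, H = leaf_stats(y, y_pred)
    w = -G / (H + tau)                          # Eq. (18)
    obj = -0.5 * G**2 / (H + tau) + gamma       # Eq. (19) for one leaf (T = 1)
    return w, obj

y = np.array([1.0, 0.0, 1.0]); y_pred = np.array([0.6, 0.4, 0.7])
print(optimal_leaf(y, y_pred))
```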

Parameter optimizer: IMPA

Finally, the hyperparameter tuning of the three models is performed using an IMPA technique89. This method is chosen for its efficiency in navigating intricate optimization landscapes. IMPA replicates the foraging behaviour of marine predators, allowing it to efficiently balance exploration and exploitation during the tuning process. This results in a more thorough search for optimum hyperparameter settings than conventional techniques, which may converge prematurely. The model’s adaptability makes it appropriate for the high-dimensional parameter spaces commonly found in ML approaches. Furthermore, IMPA achieves excellent convergence speed and solution quality, outperforming other optimization models such as grid or random search. Overall, the usage of IMPA improves the performance of the model by confirming that hyperparameters are finely tuned, resulting in improved predictive accuracy and robustness in intrusion detection tasks. Figure 5 depicts the structure of the IMPA model.

Fig. 5
figure 5

Structure of the IMPA approach.

The main feature of MPA is using the social behaviour of marine animals to efficiently balance exploitation and exploration. Both predators and prey are considered search agents, since predators hunt for prey just as prey searches for food. The MPA first generates a random population within defined lower and upper bounds; the initialization process is shown in Eq. (20).

$$\:{X}_{ij}={X}_{min}+rand\left(\text{0,1}\right)\left({X}_{max}-{X}_{min}\right),$$
(20)

The \(Prey\) matrix stores the location vectors:

$$Prey = \left[ {\begin{array}{*{20}l} {X_{{1,1}} } & {X_{{1,2}} } & \cdots & {X_{{1,n}} } \\ {X_{{2,1}} } & {X_{{2,2}} } & \cdots & {X_{{2,n}} } \\ \vdots & \vdots & \ddots & \vdots \\ {X_{{m,1}} } & {X_{{m,2}} } & \cdots & {X_{{m,n}} } \\ \end{array} } \right]_{{m \times n}},$$
(21)

whereas \(m\) denotes the population size and \(n\) refers to the dimensionality of every location. The \(Elite\) matrix depends upon the prey’s fitness assessment: the fittest individual is chosen as the top-predator vector and repeated \(m\) times to generate the \(Elite\) matrix.

$$Elite = \left[ {\begin{array}{*{20}l} {X_{{1,1}}^{l} } & {X_{{1,2}}^{l} } & \cdots & {X_{{1,n}}^{l} } \\ {X_{{2,1}}^{l} } & {X_{{2,2}}^{l} } & \cdots & {X_{{2,n}}^{l} } \\ \vdots & \vdots & \ddots & \vdots \\ {X_{{m,1}}^{l} } & {X_{{m,2}}^{l} } & \cdots & {X_{{m,n}}^{l} } \\ \end{array} } \right]_{{m \times n}} ,$$
(22)

The MPA is separated into three stages depending upon the prey-predator velocity ratio, each corresponding to a different portion of the iterative procedure.

Stage 1: This stage is called the exploration phase. It occurs in the first third of the iteration procedure. The prey is uniformly spread over the exploration area in the early iterations, and the distance between the prey and predator is relatively large; Brownian motion facilitates a fast survey of the target’s location. The mathematical formulation is mentioned below:

$$\:{\overrightarrow{S}}_{i}={\overrightarrow{R}}_{B}\otimes\:\left(\overrightarrow{Elit{e}_{i}}-{\overrightarrow{R}}_{B}\otimes\:\overrightarrow{Pre{y}_{i}}\right),i=1,\dots\:,n$$
(23)
$$\:\overrightarrow{Pre{y}_{i}}=\overrightarrow{Pre{y}_{i}}+P.\overrightarrow{R}\otimes\:{\overrightarrow{S}}_{i},$$
(24)

Here, \({\overrightarrow{R}}_{B}\) denotes a vector of random values produced by Brownian motion following a Gaussian distribution, and \({\overrightarrow{S}}_{i}\) denotes a step size. The constant \(P\) is set to \(0.5\), and \(\overrightarrow{R}\) refers to a random vector uniformly distributed between \(0\) and 1.

Stage 2: This stage aims to attain a gradual changeover from the exploration to the exploitation stage, and it arises between \(1/3\) and \(2/3\) of the iterative procedure. The individuals are divided into two equal parts; one part is updated according to Eqs. (25) and (26).

$$\:{\overrightarrow{S}}_{i}={\overrightarrow{R}}_{L}\otimes\:\left(\overrightarrow{E1it{e}_{i}}-{\overrightarrow{R}}_{L}\otimes\:\overrightarrow{Pre{y}_{i}}\right),i=1,\frac{n}{2}$$
(25)
$$\:\overrightarrow{Pre{y}_{i}}=\overrightarrow{Pre{y}_{i}}+P.\overrightarrow{R}\otimes\:{\overrightarrow{S}}_{i},$$
(26)

Here, \({\overrightarrow{R}}_{L}\) denotes a vector of random values produced by Levy motion. The remaining part of the population is updated according to Eqs. (27) and (28).

$$\:{\overrightarrow{S}}_{i}={\overrightarrow{R}}_{B}\otimes\:\left({\overrightarrow{R}}_{B}\otimes\:\overrightarrow{Elit{e}_{i}}-\overrightarrow{Pre{y}_{i}}\right),i=\frac{n}{2},\dots\:,n$$
(27)
$$\:\overrightarrow{Pre{y}_{i}}=\overrightarrow{Elit{e}_{i}}+P.CF\otimes\:{\overrightarrow{S}}_{i},$$
(28)

Here, \(CF\) denotes an adaptive control parameter defined as:

$$CF={\left(1-\frac{Iter}{{Max}_{Iter}}\right)}^{\left(2\frac{Iter}{{Max}_{Iter}}\right)}$$
(29)

The variable \(Iter\) specifies the present iteration count, whereas \({Max}_{Iter}\) signifies the maximum iteration count. In the second stage, exploitation and exploration operations happen simultaneously as the predator and prey approach each other, with a decreased step size compared to the preceding stage.

Stage 3 occurs in the final third of the iterative procedure. The hunters change from Brownian to Levy motion using the update formulation mentioned below.

$$\:{\overrightarrow{S}}_{i}={\overrightarrow{R}}_{L}\otimes\:\left({\overrightarrow{R}}_{L}\otimes\:\overrightarrow{Elit{e}_{i}}-\overrightarrow{Pre{y}_{i}}\right),i=1,\dots\:,n.$$
(30)
$$\:\overrightarrow{Pre{y}_{i}}=\overrightarrow{Elit{e}_{i}}+P.CF\otimes\:{\overrightarrow{S}}_{i}.$$
(31)

In real predator behaviour, eddy formation and fish aggregating device (FAD) effects also occur, and a FAD is considered a local optimum. These cases are therefore simulated in the iterative procedure to avoid falling into local optima. The depiction of FADs is given below:

$$\:\overrightarrow{Pre{y}_{i}}=\left\{\begin{array}{c}\overrightarrow{Pre{y}_{i}}+CF\left[{\overrightarrow{X}}_{min}+\overrightarrow{R}\otimes\:\left({\overrightarrow{X}}_{max}-{\overrightarrow{X}}_{min}\right)\right]\otimes\:\overrightarrow{U},r\le\:FADs\\\:\overrightarrow{Pre{y}_{i}}+\left[FADs\left(1-r\right)+r\right]\left(\overrightarrow{Pre{y}_{r1}}-\overrightarrow{Pre{y}_{r2}}\right),r>FADs\end{array}\right.$$
(32)
$$\:\overrightarrow{{U}_{i}}=\left\{\begin{array}{c}0,rand\le\:FADs\\\:1,rand>FADs\end{array}\right.$$
(33)

The FADs constant is fixed at 0.2, \(r\in \left(0,1\right)\), and \(\overrightarrow{U}\) denotes a binary vector, with every element described in Eq. (33). The notion of marine memory storage parallels the greedy tactic: it compares the outcomes before and after the iteration and only retains the solution with superior fitness. Its calculation is mentioned below:

$$\:{X}_{i}^{\left(t+1\right)}=\left\{\begin{array}{c}{X}_{i}^{\left(t+1\right)},f\left({X}_{i}^{\left(t+1\right)}\right)\le\:f\left({X}_{i}^{t}\right)\\\:{X}_{i}^{t},f\left({X}_{i}^{\left(t+1\right)}\right)>f\left({X}_{i}^{t}\right)\end{array}\right.$$
(34)

Here, \({X}_{i}^{\left(t+1\right)}\) signifies the location of the optimum candidate solution taken by the \(i\)-th individual after the \(\left(t+1\right)\)-th iteration, and \(f\) indicates the FF.

The MPA thus separates optimizer iterations into three distinct stages. The first stage concentrates on exploration, the second changes over from exploration to exploitation, and the third is devoted solely to exploitation. This tactic permits every stage to focus on different tasks, improving the exploration ability. However, if the initial matrix quality is low, convergence is slow. Also, transferring among the different stages might incur transition costs; for instance, ensuring the effectual transfer of outcomes without information loss is vital when transitioning from the global to the local search phase.
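A condensed NumPy sketch of the three-stage update (Eqs. (23)-(31)) is given below; the Gaussian draws standing in for Brownian motion, the Mantegna-style Levy surrogate, and the settings \(P = 0.5\) and the \(CF\) schedule of Eq. (29) follow conventional MPA practice and are assumptions here.

```python
import math
import numpy as np

def levy(size, beta=1.5, rng=np.random):
    # Mantegna-style heavy-tailed step, a common stand-in for Levy flight
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

def mpa_step(prey, elite, it, max_it, P=0.5, rng=np.random):
    """One MPA iteration: elite is the top predator repeated to prey's shape."""
    n, d = prey.shape
    CF = (1 - it / max_it) ** (2 * it / max_it)           # Eq. (29)
    new = prey.copy()
    for i in range(n):
        RB, RL, R = rng.normal(size=d), levy(d, rng=rng), rng.rand(d)
        if it < max_it / 3:                               # stage 1: exploration
            step = RB * (elite[i] - RB * prey[i])         # Eq. (23)
            new[i] = prey[i] + P * R * step               # Eq. (24)
        elif it < 2 * max_it / 3:                         # stage 2: transition
            if i < n // 2:
                step = RL * (elite[i] - RL * prey[i])     # Eq. (25)
                new[i] = prey[i] + P * R * step           # Eq. (26)
            else:
                step = RB * (RB * elite[i] - prey[i])     # Eq. (27)
                new[i] = elite[i] + P * CF * step         # Eq. (28)
        else:                                             # stage 3: exploitation
            step = RL * (RL * elite[i] - prey[i])         # Eq. (30)
            new[i] = elite[i] + P * CF * step             # Eq. (31)
    return new
```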

Opposition-based learning (OBL) is an enhancement strategy for search. Its main idea is to produce an opposite solution and use it in the optimization procedure. As demonstrated in Eq. (35), the IMPA incorporates the OBL technique into the initialization and FF computation phases.

$$\:{X}_{OBL}={r}_{1}\times\:\left(ub+lb\right)-{r}_{2}\times\:X$$
(35)
  • Initialization stage: Employ Eq. (35) to produce the opposite of the initial Prey matrix, widening the distribution range of individuals within the matrix. This enhanced tactic raises the likelihood that an initial solution covers the optimum solution, increases the range and quality of the initial matrix, and accelerates the convergence speed.

  • FF computation stage: Throughout the optimization procedure, the FF is computed at the end of every iteration to discover and maintain the optimum solution. However, owing to the stochastic nature of the method, the calculated fitness value might deviate considerably from the real optimum. This enhanced plan therefore uses Eq. (35) to produce an opposite solution of equivalent extent for the present individual solution after every iteration, as sketched below.
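A small sketch of this OBL refinement per Eq. (35) follows; clipping to the bounds and the minimization convention are assumptions.

```python
import numpy as np

def obl_refine(pop, fitness, lb, ub, rng=np.random):
    """Generate opposite solutions (Eq. (35)) and keep the fitter of each pair."""
    r1, r2 = rng.rand(*pop.shape), rng.rand(*pop.shape)
    opposite = r1 * (ub + lb) - r2 * pop                  # Eq. (35)
    opposite = np.clip(opposite, lb, ub)                  # keep within bounds
    f_pop = np.apply_along_axis(fitness, 1, pop)
    f_opp = np.apply_along_axis(fitness, 1, opposite)
    keep = f_opp < f_pop                                  # minimization assumed
    pop[keep] = opposite[keep]
    return pop
```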

Fitness choices significantly influence the efficiency of the IMPA. The hyperparameter tuning method encompasses the encoded representation used for assessing the effectiveness of candidate solutions. In this work, the IMPA adopts the precision \(P\) as the primary criterion for the FF.

$$\:Fitness=max\left(P\right)$$
(36)
$$\:P=\frac{TP}{TP+FP}$$
(37)

Here, \(\:TP\) and \(\:FP\) denote true and false positive values, respectively.
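For completeness, Eqs. (36)-(37) reduce to a one-line computation from confusion counts; the zero-denominator guard is an added safety check.

```python
def precision_fitness(tp, fp):
    """Eq. (37): P = TP / (TP + FP), the quantity maximized in Eq. (36)."""
    return tp / (tp + fp) if (tp + fp) else 0.0
```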

Experimental validation

The performance evaluation of the PPSLOA-HDBDE methodology is conducted on the BoT-IoT dataset90. The database comprises 2056 samples under five classes, as represented in Table 1. The BoT-IoT dataset was developed in the Cyber Range Lab of UNSW Canberra, simulating a realistic network environment that integrates normal and botnet traffic. It encompasses diverse source files, comprising original pcap files, generated argus files, and CSV files, all organized by attack category and subcategory to ease labelling. The captured pcap files total 69.3 GB, containing over 72 million records, while the extracted flow traffic in CSV format is 16.7 GB. The dataset features a range of attacks, including DDoS, DoS, OS and Service Scan, Keylogging, and Data Exfiltration, with DDoS and DoS attacks also classified by protocol. To streamline dataset handling, a 5% sample was extracted using MySQL queries, resulting in four files totalling approximately 1.07 GB and containing about 3 million records. The suggested technique is simulated using Python 3.6.5 on a PC with an i5-8600K CPU, 250 GB SSD, GeForce GTX 1050 Ti 4 GB GPU, 16 GB RAM, and 1 TB HDD. The parameter settings are: learning rate 0.01, ReLU activation, 50 epochs, dropout 0.5, and batch size 5.

Table 1 Details of dataset.

Figure 6 presents the confusion matrices produced by the PPSLOA-HDBDE model over different epochs. The results show that the PPSLOA-HDBDE technique precisely identifies and recognizes all 5 class labels.

Fig. 6
figure 6

Confusion matrices of PPSLOA-HDBDE technique (a-f) Epochs 500–3000.

The intrusion detection results of the PPSLOA-HDBDE method under dissimilar epochs are reported in Table 2 and Fig. 7. The table values indicate that the PPSLOA-HDBDE method appropriately identified all the samples. At 500 epochs, the PPSLOA-HDBDE method offers an average \(acc{u}_{y}\) of 98.89%, \(pre{c}_{n}\) of 96.43%, \(rec{a}_{l}\) of 96.64%, \({F1}_{score}\) of 96.53%, and MCC of 95.82%. Also, at 1000 epochs, the PPSLOA-HDBDE methodology offers an average \(acc{u}_{y}\) of 99.11%, \(pre{c}_{n}\) of 96.76%, \(rec{a}_{l}\) of 97.72%, \({F1}_{score}\) of 97.22%, and MCC of 96.66%. Furthermore, at 2000 epochs, the PPSLOA-HDBDE methodology provides an average \(acc{u}_{y}\) of 97.47%, \(pre{c}_{n}\) of 93.37%, \(rec{a}_{l}\) of 89.85%, \({F1}_{score}\) of 91.33%, and MCC of 89.85%. Eventually, at 3000 epochs, the PPSLOA-HDBDE methodology delivers an average \(acc{u}_{y}\) of 96.96%, \(pre{c}_{n}\) of 92.90%, \(rec{a}_{l}\) of 84.95%, \({F1}_{score}\) of 87.38%, and MCC of 86.25%.

Table 2 Detection outcome of the PPSLOA-HDBDE technique under different epochs.
Fig. 7
figure 7

Average outcome of PPSLOA-HDBDE technique (a-f) Epochs 500–3000.

Figure 8 shows the training \(acc{u}_{y}\) (TRAAC) and validation \(acc{u}_{y}\) (VLAAC) outcomes of the PPSLOA-HDBDE method under different epochs. The \(acc{u}_{y}\) values are estimated over 0-3000 epochs. The figure underlines that the TRAAC and VLAAC values display an increasing trend, reporting the capability of the PPSLOA-HDBDE method to improve performance over various iterations. Furthermore, the TRAAC and VLAAC curves remain adjacent over the epochs, which indicates minimal overfitting and shows the superior performance of the PPSLOA-HDBDE technique, promising consistent prediction on unseen samples.

Fig. 8
figure 8

\(\:Acc{u}_{y}\) curve of PPSLOA-HDBDE method (a-f) Epochs 500–3000.

Figure 9 shows the training loss (TRALS) and validation loss (VLALS) graph of the PPSLOA-HDBDE technique under different epochs. The loss values are estimated over 0-3000 epochs. The TRALS and VLALS curves indicate a decreasing trend, reporting the proficiency of the PPSLOA-HDBDE methodology in balancing the trade-off between generalization and data fitting. The continual reduction in loss values further indicates the outstanding performance of the PPSLOA-HDBDE methodology and the refinement of its predictive outcomes over time.

Fig. 9
figure 9

Loss curve of PPSLOA-HDBDE method (a-f) Epochs 500–3000.

In Fig. 10, the precision-recall (PR) curve analysis of the PPSLOA-HDBDE technique under different epochs interprets its performance by plotting precision against recall for all 5 class labels. The figure demonstrates that the PPSLOA-HDBDE technique consistently achieves improved PR values across the various classes, indicating its ability to maintain high precision and recall by balancing true positive predictions with actual positives. The consistently high PR results across all 5 class labels represent the effectiveness of the PPSLOA-HDBDE methodology in the classification procedure.

In Fig. 11, the ROC curve of the PPSLOA-HDBDE technique under different epochs is examined. The outcomes denote that the PPSLOA-HDBDE technique attains superior ROC results over every class, representing its strong ability to discriminate among the class labels. This consistent tendency toward better ROC values over several class labels indicates the efficient performance of the PPSLOA-HDBDE methodology in predicting class labels, emphasizing the robust nature of the classification process.

Fig. 10
figure 10

PR curve of PPSLOA-HDBDE methodology (a-f) Epochs 500–3000.

Fig. 11
figure 11

ROC curve of PPSLOA-HDBDE method (a-f) Epochs 500–3000.

The comparison of the PPSLOA-HDBDE approach with current methods91,92,93 is demonstrated in Table 3 and Fig. 12. Regarding \(acc{u}_{y}\), the PPSLOA-HDBDE methodology attains a greater \(acc{u}_{y}\) of 99.49%. In contrast, the AROMA, Dynamic Bandwidth Allocation (DBA), SOM-SVM, CNN-BiLSTM, CANET, RFS-1, CNN-Focal, Dense Convolutional Network with Discrete Wavelet Transform (DenseNet-DWT), Linear Discriminant Analysis with Discrete Cosine Transform (LDA-DCT), and Robust Linked List with Steganography Without Embedding (RLL-SWE) models obtain lower \(acc{u}_{y}\) values of 95.02%, 96.99%, 96.12%, 99.15%, 98.85%, 98.51%, 98.32%, 96.11%, 97.68%, and 95.51%, correspondingly. Also, for \(pre{c}_{n}\), the PPSLOA-HDBDE technique has a greater \(pre{c}_{n}\) of 98.74%, whereas the AROMA, DBA, SOM-SVM, CNN-BiLSTM, CANET, RFS-1, CNN-Focal, DenseNet-DWT, LDA-DCT, and RLL-SWE techniques have lower \(pre{c}_{n}\) values of 95.69%, 95.00%, 91.18%, 96.89%, 97.12%, 97.65%, 96.81%, 96.04%, 96.89%, and 95.86%, correspondingly. Finally, for the \({F1}_{score}\), the PPSLOA-HDBDE technique has a better \({F1}_{score}\) of 98.52%, while the AROMA, DBA, SOM-SVM, CNN-BiLSTM, CANET, RFS-1, CNN-Focal, DenseNet-DWT, LDA-DCT, and RLL-SWE models have lower \({F1}_{score}\) values of 93.97%, 95.12%, 94.27%, 96.68%, 98.00%, 97.89%, 96.99%, 96.39%, 94.13%, and 95.56%, respectively. These values indicate that the PPSLOA-HDBDE technique outperformed the existing models.

Table 3 Comparative outcome of the PPSLOA-HDBDE model with existing methods91,93.
Fig. 12
figure 12

Comparative analysis of the PPSLOA-HDBDE model with existing methods.

Table 4 and Fig. 13 present the comparative outcomes of the PPSLOA-HDBDE methodology based on PT. The results indicate that the PPSLOA-HDBDE model achieved superior performance. In terms of PT, the PPSLOA-HDBDE model offers a lower PT of 8.48s, while the AROMA, DBA, SOM-SVM, CNN-BiLSTM, CANET, RFS-1, CNN-Focal, DenseNet-DWT, LDA-DCT, and RLL-SWE models incur higher PT values of 17.84s, 16.09s, 13.41s, 15.10s, 14.20s, 15.27s, 16.63s, 14.99s, 16.23s, and 15.02s, correspondingly.

Table 4 PT outcome of the PPSLOA-HDBDE technique with recent models.
Fig. 13
figure 13

PT outcome of the PPSLOA-HDBDE technique with recent models.

Conclusion

In this manuscript, a novel PPSLOA-HDBDE methodology is presented. The primary purpose of the PPSLOA-HDBDE methodology is to utilize advanced optimization and ensemble techniques to ensure data confidentiality while maintaining analytical efficacy. It encompasses four processes involving data preprocessing, SCSO-based FS, ensemble classification models, and an IMPA-based parameter optimizer. At the primary stage, LSN is utilized to scale the input data. Besides, the SCSO-based FS process is employed to diminish the high dimensionality problem. Moreover, intrusion detection is performed using an ensemble of TCN, MAE, and XGBoost classifiers. Lastly, the three models’ hyperparameter tuning is accomplished using an IMPA model. An extensive range of experimentations is performed to validate the performance of the PPSLOA-HDBDE technique, and the outcomes are examined using distinct measures. The performance validation of the PPSLOA-HDBDE technique illustrated a superior accuracy value of 99.49% over existing models. The limitations of the PPSLOA-HDBDE technique comprise potential challenges from data heterogeneity across diverse devices, which may affect the accuracy of the detection results. Furthermore, the computational overhead associated with real-time detection can strain resource-constrained edge devices. Privacy concerns regarding data sharing, even in a distributed setting, remain a critical issue. Future studies may explore advanced privacy-preserving models to improve data safety further. Moreover, investigating adaptive mechanisms to optimize model training depending on varying device capabilities and network conditions would be beneficial. Expanding the model’s applicability to diverse domains beyond IoT networks could also give broader insights into its efficiency. Lastly, incorporating operator feedback into the detection process could enhance adaptability and usability.