Introduction

The Internet of Things (IoT) enables the interconnection of everyday physical objects, or ‘smart devices,’ through embedded sensors, microchips, and innovative technologies1. This interconnectedness is facilitated by Radio Frequency Identification (RFID) tags, which assign a unique global identity to each device, allowing for remote monitoring and control, as well as communication between connected nodes2. With widespread connectivity across various sectors, including smart cities, education, commerce, industry, and healthcare3,4,5,6, IoT is projected to reach 50 billion online devices, significantly increasing big data shared via cloud infrastructure7,8,9. However, this constant connectivity exposes IoT devices to cyber threats, including malware and software piracy, which can compromise security and lead to financial losses6,10,11. Advanced plagiarism detection methods, such as clone detection and source code similarity analysis, are essential to combat these threats and protect intellectual property.

Early identification of cyber threats is crucial for effective security prevention and attack detection12. Despite the various deep learning techniques proposed for this purpose, current methods often lack accuracy and require significant computation time, resulting in slow and imprecise detection13. These limitations highlight the need for enhanced strategies that deliver faster and more accurate threat detection14,15. Recognizing these inefficiencies has inspired the development of new solutions that better address the challenges of cyber threat detection, improving both performance and reliability in securing systems against potential attacks in increasingly complex digital environments.

The major contributions of this research work are summarized below:

  • Novel IoT Threat Detection Framework: We propose a new threat detection approach using Alternating Graph-Regularized Neural Networks (AGRNN), specifically designed to classify IoT-related threats into benign and malicious categories. This is a novel application of AGRNN in the field of IoT cyber security.

  • Enhanced Data Preprocessing via EASSF: We introduce Edge-Aware Smoothing Sharpening Filtering (EASSF) to pre-process the dataset. This method effectively reduces noise and enhances data quality for improved feature extraction—an original contribution in the context of source code threat detection.

  • Advanced Feature Extraction Using GSCT: We apply the General Synchro extracting Chirplet Transform (GSCT) to extract key Haralick texture features such as entropy and correlation. This combination of GSCT and source-code-based threat detection is both innovative and effective.

  • Optimization with Sea-lion Optimization Algorithm: The use of the Sea-lion Optimization Algorithm (SLOA) to tune the AGRNN improves model performance and convergence, which has not been previously explored in this context.

  • Real-World Dataset Application: We utilize the Google Code Jam Dataset, a real-world and practical dataset that enhances the relevance and applicability of our research to real-time cyber threat detection scenarios.

Literature review

Recent studies on cyber security threat detection in IoT using deep learning (DL) have garnered significant attention. Notably, Farhan Ullah et al.15 proposed a combined DL approach in 2019 to classify files infected with malware and pirated software in IoT. Their method leveraged a deep neural network (DNN) built with TensorFlow to detect software piracy through source code plagiarism, achieving higher precision but a lower f-measure. This study highlights the potential of DL in enhancing threat detection in IoT environments.

In 2021, Keping Yu et al.16 presented a study on securing critical infrastructures through a deep learning (DL) approach for threat detection in Industrial Internet of Things (IIoT). Their method utilized Bidirectional Encoder Representations from Transformers (BERT) to identify Advanced Persistent Threat (APT) attack sequences, taking into account the unique characteristics of prolonged and continuous APT attacks. By optimizing APT attack sequences, the proposed approach achieved improved F-measure performance, although it maintained relatively low accuracy.

In 2022, Mohamed Amine Ferrag et al.17 introduced Edge-IIoTset, a comprehensive and realistic cybersecurity dataset for IoT and IIoT applications, designed to facilitate effective machine learning-based intrusion detection in both centralized and federated learning paradigms. The dataset was created on a specifically designed IoT/IIoT testbed equipped with a range of devices, sensors, protocols, and cloud or edge setups. When utilized in two distinct scenarios, the dataset demonstrated its versatility and effectiveness in evaluating machine learning-based intrusion detection systems, showcasing higher precision albeit with relatively low f-measure.

In 2021, Yakub Kayode Saheed and Micheal Olaolu Arowolo18 proposed an efficient cyber attack detection framework for the Internet of Medical Things (IoMT) in smart environments, leveraging Deep Recurrent Neural Networks (DRNN) and machine learning processes. The study highlighted various security concerns arising from potential attacks, including password guessing, impersonation, Denial of Service (DoS) attacks, remote hijacking, and man-in-the-middle attacks. Their approach demonstrated improved performance with a higher Receiver Operating Characteristic (ROC) curve, although it yielded relatively low Recall.

In 2021, Eirini Anthi et al.19 presented a study on adversarial attacks on machine learning (ML) cybersecurity defenses in Industrial Control Systems (ICS). The researchers utilized a Jacobian-based Saliency Map attack to generate adversarial samples, enabling adversarial learning to be applied to supervised methods. Their investigation examined the impact of adversarial training with these samples on the robustness of supervised methods. The results showed improved sensitivity, although the approach yielded relatively low accuracy.

Recent studies have explored innovative approaches to cybersecurity in industrial and smart grid contexts. In 2021, Mahmoud Elsisi et al.20 proposed a novel Internet of Things (IoT) architecture for online monitoring of gas-insulated switchgear state, leveraging the cyber-physical system concept from Industry 4.0. This approach demonstrated higher accuracy, although it yielded relatively low Receiver Operating Characteristic (ROC) performance. In a separate study, Priti Prabhakar et al.21 presented a cybersecurity framework for smart metering infrastructure in 2022, utilizing the Median Absolute Deviation method. Their intrusion detection system, designed to counteract cyber-attacks, showed improved f-measure performance but relatively low recall. By incorporating anomaly-based intrusion detection, the system can identify even slight alterations in parameters, enhancing its ability to detect unknown threats.

Recent studies have proposed innovative approaches to anomaly detection and threat identification in IoT networks. For instance, Yakub Kayode Saheed et al.22 developed a novel, privacy-preserving Deep Neural Network (DNN) framework that leverages deep SHapley Additive exPlanations (SHAP) technique for anomaly detection in CPS-enabled IoT networks. This approach enhances system resilience by providing insights into the DNN’s decision-making process.

Furthermore, Musa Odunayo Sabit et al.23 proposed a new threat intrusion detection model in the IIoT, comprising six modules, including activity receiver, communication module, attention module, intrusion detection module, mitigation module, and alert module. This model utilizes a genetic algorithm with an attention mechanism and modified Adam optimized LSTM (GA-mADAM-IIoT) to optimize LSTM networks.

Additionally, Joshua Ebere Chukwuere et al.24 introduced an explainable Artificial Intelligence (XAI) Ensemble Transfer Learning (TL) model for detecting zero-day attacks in the Internet of Vehicles (IoV). This model incorporates deep SHAP, offering transparency and making decisions intelligible to cybersecurity experts. Table 1 gives the features and challenges of the various recent literature on the research topic.

Table 1 Features and challenges.

Proposed methodology

This cyber security threat detection model identifies two types of threats: benign and malicious. The process begins with an initial phase of data collection, where cyber security threats are detected and gathered data is transmitted for further processing. The processing pipeline then consists of three primary stages: pre-processing, feature extraction, and classification, which are executed sequentially to facilitate accurate threat detection and analysis, as detailed in Fig. 1.

Fig. 1. Proposed sea lion AGRNN-CS-TD-IoT for cyber security threats detection.

Data procurement

This section leverages the GCJ dataset25, comprising 400 distinct source code documents from 100 programmers, to investigate software piracy. Sourced from Google Code Jam, the dataset facilitates analysis of coding patterns, piracy detection, and classification of benign and malicious threats, enabling robust evaluation of cyber security threat detection models.
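As a concrete illustration, loading such a corpus might look like the sketch below. The directory layout (one folder per programmer, so 100 folders with 4 files each yield the 400 documents) is an assumption for illustration; the dataset ships in various layouts.

```python
from pathlib import Path

def load_gcj(root: str) -> list[dict]:
    """Load (author, code) pairs from a hypothetical layout of
    root/<programmer_id>/<solution_file>."""
    samples = []
    for src in sorted(Path(root).rglob("*.*")):
        # Each file is attributed to the programmer folder containing it.
        samples.append({"author": src.parent.name,
                        "code": src.read_text(errors="ignore")})
    return samples
```

Pairing each source file with its author in this way supports both piracy detection (same solution, different author) and per-programmer coding-pattern analysis.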

Pre-processing using Edge-Aware Smoothing Sharpening Filtering (EASSF)

In this section, the Edge-Aware Smoothing Sharpening Filtering (EASSF)26 method is employed for pre-processing source codes. The codes are segmented into manageable fragments and refined through techniques like stemming, root word extraction, frequency analysis, and word elimination. This process eliminates noisy data, transforming codes into actionable insights, as formulated in Eq. (1)

$$H\left( {{\alpha _l}} \right)=\frac{{\sigma _{l}^{2}}}{2}{\left( {{\alpha _l} - 1} \right)^2} - \log \,S\left( {{\alpha _l}} \right)$$
(1)

where \(\sigma _{l}^{2}\) represents the usable tokens from the cleansed data, \(\alpha _l\) denotes the various filters, and \(H\left( {{\alpha _l}} \right)\) signifies the log-posterior for source code identification. The EASSF filter formulation, which differs from the guided filter function, is given by Eq. (2).

$$H\left( {{\alpha _l}} \right)=\frac{{\sigma _{l}^{2}}}{2}{\left( {{\alpha _l} - 1} \right)^2}+\frac{1}{{2{\theta ^2}}}\,\alpha _{l}^{2} - \eta \log {\alpha _l}$$
(2)

where \(- \eta \log {\alpha _l}\) is the additional term that distinguishes this formulation from the guided filter; \(\theta\) denotes a scale parameter; \(\sigma _{l}^{2}\) indicates the usable tokens from the cleansed data; and \(H\left( {{\alpha _l}} \right)\) represents the log-posterior for source code identification. Through this pre-processing, stemming, root word extraction, frequency analysis, and word elimination are applied to the collected input data, removing noisy tokens. The pre-processed data is then passed to the feature extraction phase.
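EASSF itself is defined by Eq. (2); the surrounding token-level cleaning steps (fragmentation, root-word extraction, frequency analysis, and word elimination) can be sketched as follows. This is a minimal illustration: the crude suffix-stripping stemmer and the frequency threshold are stand-in assumptions, not the paper's exact procedure.

```python
import re
from collections import Counter

def preprocess_source(code: str, min_freq: int = 2) -> list[str]:
    """Token-level cleaning sketch: fragment source code into tokens,
    reduce tokens to crude root forms, and drop rare (noisy) tokens."""
    # Fragment the code into identifier/keyword tokens.
    tokens = re.findall(r"[A-Za-z_]\w*", code.lower())
    # Crude root-word extraction (stand-in for a real stemmer).
    roots = [t.rstrip("s") if len(t) > 3 else t for t in tokens]
    # Frequency analysis: eliminate tokens seen fewer than min_freq times.
    freq = Counter(roots)
    return [t for t in roots if freq[t] >= min_freq]

sample = "int counts = 0; for (int i = 0; i < counts; i++) counts += i;"
cleaned = preprocess_source(sample)
```

Here the singleton token `for` is eliminated as noise, while the repeated tokens survive as usable features.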

Feature extraction using the General Synchro extracting Chirplet Transform (GSCT)

This section discusses the General Synchro extracting Chirplet Transform (GSCT)27, a feature extraction method applied to the pre-processed data. GSCT extracts Haralick texture features, including correlation and entropy. The primary goal of GSCT is to develop a new extraction operator based on ridge curve identification, as expressed in Eq. (3).

$${P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)=\int\limits_{S} {g\left( T \right)\,\psi _{{\alpha ,\sigma }}^{ * }} \left( {T - \tau } \right)\exp \,\left( { - J\omega \left( {T - \tau } \right)} \right)\,h\left( T \right)$$
(3)

where \(\alpha\) denotes the chirplet rate; \(g\left( T \right)\) denotes the analytical signal; \(\psi _{{\alpha ,\sigma }}^{ * }\) is the complex conjugate of the window function; \({P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)\) denotes the extracted data, retained by the synchroextracting transform, which efficiently extracts the features; \(h\left( T \right)\) is the expansion around the time point \(T\), ignoring higher-order terms; and \(- J\omega \left( {T - \tau } \right)\) denotes the second-order GSCT synchroextracting term. The Haralick texture features, correlation and entropy, are extracted using the Synchro extracting Chirplet Transform as follows.

Correlation

This measure calculates the likelihood of the designated data pairs. The correlation is expressed as Eq. (4).

$${P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)=\int\limits_{S} {\bar {g}} \left( T \right)\,{V_\sigma }\left( {T - \tau } \right)\,\exp \,\left( { - J\omega \left( {T - \tau } \right)} \right)\,h\,\left( T \right)$$
(4)

where \({P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)\) denotes the extracted data, retained by the synchroextracting transform, which efficiently extracts the features; \(h\left( T \right)\) is the expansion around the time point \(T\), ignoring higher-order terms; \(\alpha\) denotes the chirplet rate; \(g\left( T \right)\) denotes the analytical signal; \(- J\omega \left( {T - \tau } \right)\) denotes the second-order GSCT synchroextracting term; and \({V_\sigma }\) denotes the length of the GSCT window.

Entropy

The texture of input data is identified by applying a statistical measure of randomness, enabling analysis and characterization of its underlying patterns and structures. Then the Entropy is expressed as Eq. (5).

$$\left| {{P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)} \right| \leqslant \left| {g\left( \tau \right)\int\limits_{S} {{V_\sigma }\left( {T - \tau } \right)\,h\left( T \right)} } \right|$$
(5)

where \({P_{\alpha ,\sigma }}\left( {\tau ,\omega } \right)\) denotes the extracted data, retained by the synchroextracting transform, which efficiently extracts features for effective data analysis; \({V_\sigma }\) denotes the length of the GSCT window; \(h\left( T \right)\) is the expansion around the time point \(T\), ignoring higher-order terms; \(\alpha\) denotes the chirplet rate; \(g\left( \tau \right)\) indicates the maximum amplitude; and \({V_\sigma }\left( {T - \tau } \right)\) is the demodulation operator that matches the extracted modulation element, for which the optimal GSCT window enables effective feature extraction. Subsequently, the Haralick texture features are extracted and fed into the AGRNN for classification, distinguishing between benign and malicious threats.
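The two Haralick features named above can be computed from a normalized co-occurrence matrix. The following sketch uses the standard Haralick definitions of entropy and correlation (not the GSCT-specific forms of Eqs. (4)-(5)) to illustrate what each feature measures.

```python
import numpy as np

def haralick_entropy_correlation(glcm: np.ndarray):
    """Compute the two Haralick texture features used here, entropy and
    correlation, from a co-occurrence matrix (standard definitions)."""
    p = glcm / glcm.sum()                      # normalize to probabilities
    eps = 1e-12                                # avoid log(0)
    entropy = -np.sum(p * np.log2(p + eps))    # randomness of the texture
    i = np.arange(p.shape[0])[:, None]
    j = np.arange(p.shape[1])[None, :]
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    corr = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)
    return entropy, corr
```

Entropy rises as the co-occurrence distribution spreads out (more random texture), while correlation captures the linear dependency between paired values.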

Classification using Alternating Graph-Regularized Neural Network (AGRNN)

In this section, the AGRNN28 method is used to classify cyber security threats. By incorporating conditional information into the generator’s label, AGRNN effectively categorizes the output types.

The overall conditional output of the AGRNN is divided into two parts, as given in Eq. (6).

$$z\left( {{G^{\left( k \right)}}} \right)=Soft\hbox{max} \left( {\sigma \left( {{G^{\left( k \right)}}{V_z}+y} \right)} \right)$$
(6)

where \(z\left( {{G^{\left( k \right)}}} \right)\) denotes the previous layer’s prediction; \({V_z}\) projects \({G^{\left( k \right)}}\) onto the number of classes; \(\sigma\) boosts the weights for identifying threats in a cyber-attack; and \(y\) indicates the ground truth. The soft-max function is the proposed AGRNN’s objective for classifying the cyber threats, and it is given as Eq. (7).

$$P=\sum\limits_{{k=1}}^{T} {\left( {{\alpha ^k}z\left( {{G^{\left( k \right)}}} \right)+{\beta ^{\left( k \right)}}z\left( {{C^{\left( k \right)}}} \right)} \right)}$$
(7)

where \({\alpha ^{\left( k \right)}}\) denotes the weight of the classifier; \(z\left( {{C^{\left( k \right)}}} \right)\) denotes each weak classifier’s performance on the labelled layers; \(z\left( {{G^{\left( k \right)}}} \right)\) denotes the previous layer’s prediction; \(\rho\) is linked to the rationale behind the layer classification; and \({\beta ^{\left( k \right)}}\) is a very small number that prevents a divide-by-zero error. An AGRNN variation calculates the model’s final predictions in the normalized layer, as given in Eq. (8).

$${\alpha ^{\left( k \right)}}=\frac{1}{2}\log \frac{{1 - f_{G}^{{\left( k \right)}}}}{{f_{H}^{{\left( k \right)}}}}+\log \,\left( {S - 1} \right)$$
(8)

where \({\alpha ^{\left( k \right)}}\) denotes the weight of the classifier; \(\log \frac{{1 - f_{G}^{{\left( k \right)}}}}{{f_{H}^{{\left( k \right)}}}}\) is an updating rate that automatically modifies the sample weight based on the weak classifier’s predictions; and \(\log \,\left( {S - 1} \right)\) reflects the number of classes \(S\) in the normalized layer.
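A minimal numerical sketch of Eqs. (6) and (8) follows. Two assumptions are made where the text is ambiguous: ReLU is used for the inner activation \(\sigma\), and \(f_{G}^{(k)}\) and \(f_{H}^{(k)}\) in Eq. (8) are both read as the layer's error rate, giving a boosting-style weight.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_prediction(G: np.ndarray, V_z: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Eq. (6): z(G^(k)) = softmax(sigma(G^(k) V_z + y)),
    with ReLU assumed for the inner activation sigma."""
    return softmax(np.maximum(G @ V_z + y, 0.0))

def layer_weight(err: float, num_classes: int = 2) -> float:
    """Eq. (8)-style weight alpha^(k): a smaller error rate yields a
    larger layer weight; err = 0.5 gives zero weight for two classes."""
    return 0.5 * np.log((1.0 - err) / err) + np.log(num_classes - 1)
```

With the two threat classes (benign/malicious), \(\log(S-1)=0\), so a layer performing at chance (error 0.5) contributes nothing to the ensemble sum of Eq. (7).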

The dataset employed in our research is the Google Code Jam (GCJ) dataset, which comprises source code documents from 100 programmers, totaling 400 distinct files. This dataset is utilized to investigate software piracy and detect cyber security threats in IoT systems. The hyperparameters for the Alternating Graph-Regularized Neural Network are: number of hidden layers, 3; neurons per layer, 128, 64, and 32; learning rate, 0.001; L2 regularization strength, 0.01; and graph regularization coefficient, 0.1.
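The stated hyperparameters can be sketched as a forward pass plus the two regularization terms. The input dimension of 64, the ReLU activations, and the Laplacian quadratic form of the graph penalty are assumptions made for illustration; only the layer sizes and coefficients come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [64, 128, 64, 32, 2]   # assumed input dim 64 -> hidden 128/64/32 -> 2 classes
Ws = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(X: np.ndarray) -> np.ndarray:
    """Forward pass through the stated 128/64/32 architecture with a
    softmax output over the benign/malicious classes."""
    H = X
    for W in Ws[:-1]:
        H = np.maximum(H @ W, 0.0)            # ReLU hidden layers (assumed)
    logits = H @ Ws[-1]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def regularization(H: np.ndarray, A: np.ndarray,
                   lam_l2: float = 0.01, lam_graph: float = 0.1) -> float:
    """L2 penalty (strength 0.01) plus a graph-regularization term
    (coefficient 0.1) penalizing embedding differences between nodes
    that are adjacent in A (Laplacian quadratic form, assumed)."""
    l2 = lam_l2 * sum(float((W ** 2).sum()) for W in Ws)
    L = np.diag(A.sum(axis=1)) - A            # graph Laplacian
    return l2 + lam_graph * float(np.trace(H.T @ L @ H))
```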

Dataset characteristics

The GCJ dataset contains source code files with varying characteristics, including different programming styles, coding patterns, and potential vulnerabilities. This diversity enables robust evaluation of our proposed approach for detecting cyber security threats.

Pre-processing steps

To prepare the dataset for model training, we applied the following preprocessing steps:

  1. Edge-Aware Smoothing Sharpening Filtering (EASSF): we used EASSF to remove noise and enhance the quality of the source code data.

  2. Tokenization: the source code files were tokenized to extract meaningful features and patterns.

  3. Feature extraction: we employed the General Synchro extracting Chirplet Transform (GSCT) to extract Haralick texture features, including correlation and entropy.

Dataset source

The GCJ dataset is sourced from the Google Code Jam repository, a publicly available collection of source code files. This dataset provides a suitable foundation for evaluating our proposed approach for detecting cyber security threats in IoT systems. By utilizing the GCJ dataset and applying these pre-processing steps, the proposed model can be trained and evaluated effectively, achieving promising results in detecting cyber security threats.

The AGRNN ultimately classifies cybersecurity threats into two categories: benign and malicious. To optimize its performance, the Sea-Lion algorithm is utilized for fine-tuning the AGRNN’s weight and bias parameters.

Optimization of AGRNN using the Sea-Lion Optimization Algorithm (SLnO)

The Sea-Lion Optimization Algorithm29 is used to tune the weights of the AGRNN. Figure 2 depicts the flow chart of the Sea-Lion algorithm for optimizing the AGRNN. The weight parameter \({\alpha ^{\left( k \right)}}\) of the AGRNN is optimized using the sea-lion algorithm. The formulation of SLnO depends critically on the movement, prey-detecting, and attacking behaviours that characterise the sea lion, as described below.

Step 1: Identifying and tracking phase

Sea lions use their whiskers to sense the position, shape, and size of prey by detecting alterations in water waves. If the whiskers are oriented opposite to the water flow, they vibrate less, allowing the sea lion to pinpoint the prey’s location. Upon detecting prey, the sea lion, acting as a leader, alerts the other members of its subgroup, which then update their positions to chase and hunt the prey. SLnO models this behaviour: the target prey is presumed to be the most promising candidate, i.e., the current best or near-optimal solution, as mathematically represented by Eq. (9).

$$D=\left| {2B.P(t) - SL(t)} \right|$$
(9)
$$SL(t+1)=P(t) - D.C$$
(10)

The hunting process is mathematically represented as follows: The distance between the target prey and the sea lion is calculated as D, where P(t) and SL(t) denote the position vectors of the prey and the sea lion at iteration t, respectively. A random vector B, ranging from 0 to 1, is scaled by a factor of 2 to enhance exploration and facilitate the discovery of optimal or near-optimal solutions. As the sea lion moves closer to the target prey at the next iteration, this behaviour is captured by the mathematical model presented in Eq. (10).

The next iteration is represented by (t + 1). The parameter C decreases linearly from 2 to 0 over the course of iterations, which enables the sea lion’s leader to move towards the current prey and surround it effectively. This linear decrease in C facilitates the convergence of the search process.
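The tracking update of Eqs. (9)-(10) can be sketched as follows, assuming the linear decay schedule for C described above.

```python
import numpy as np

def track_update(SL: np.ndarray, P: np.ndarray, t: int, max_iter: int,
                 rng: np.random.Generator) -> np.ndarray:
    """Eqs. (9)-(10): move a sea lion SL toward the prey P (current best).
    C decays linearly from 2 to 0 across iterations (assumed schedule)."""
    B = rng.random(SL.shape)                # random vector in [0, 1)
    C = 2.0 * (1.0 - t / max_iter)          # linearly decays 2 -> 0
    D = np.abs(2.0 * B * P - SL)            # Eq. (9): distance to the prey
    return P - D * C                        # Eq. (10): updated position
```

At the final iteration C reaches 0, so the update collapses onto the prey position, matching the encircling behaviour described above.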

Step 2: Vocalization phase

In this phase, sea lions employ a distinctive hunting strategy, leveraging their advanced sensory capabilities and coordinated behaviour. By chasing and herding prey towards the ocean’s surface, they create an advantageous hunting environment. Their ability to detect sounds in both aquatic and aerial environments allows them to effectively communicate and coordinate with other sea lions. Upon detecting prey, a sea lion alerts its group members, triggering a collaborative effort to encircle and capture the prey. This synchronized behavior is a hallmark of sea lion hunting tactics.

$$SLHs=\left| {{V_1}(1+{V_2})/{V_2}} \right|$$
(11)
$${V_1}=\sin \theta$$
(12)
$${V_2}=\sin \varphi$$
(13)

The Sea Lion Optimization Algorithm incorporates the leader’s speed of sound, denoted as SLHs, which accounts for the distinct speeds of sound in water and air. Upon vocalizing, the sound wave undergoes reflection in the air to notify shore-based members and refraction in water to alert submerged members. The angles of reflection (θ) and refraction (φ) are governed by separate equations, reflecting the unique acoustic characteristics of each medium.
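Eqs. (11)-(13) combine into a single scalar term, which can be sketched directly:

```python
import math

def leader_sound_speed(theta: float, phi: float) -> float:
    """Eqs. (11)-(13): SLHs = |V1 (1 + V2) / V2| with V1 = sin(theta)
    (reflection in air) and V2 = sin(phi) (refraction in water)."""
    v1, v2 = math.sin(theta), math.sin(phi)
    return abs(v1 * (1.0 + v2) / v2)
```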

Step 3: Exploitation phase

Sea lions employ a sophisticated hunting approach, characterized by a leader-guided encirclement of prey. The leader, representing the most effective search agent, identifies the prey’s location and alerts the group, facilitating a coordinated convergence on the target. This target typically represents the current optimal solution; however, the discovery of a superior solution by a new search agent can trigger a shift in focus, enabling the group to adapt and potentially refine their solution.

$$SL(t+1)=\left| {P(t) - SL(t)} \right|\cos (2\pi m)+P(t)$$
(14)

Step 4: Exploration phase

In nature, sea lions use their whiskers and zigzag swimming patterns to search randomly for prey. Similarly, when \(C\) exceeds 1 or falls below −1, sea lions are forced to diverge from the target prey and the leader, prompting a search for alternative prey. During exploitation, sea lions update their positions based on the best search agent, while in exploration they update their positions based on a randomly selected sea lion.

Notably, when \(C\) > 1, the SLnO algorithm performs a global search, seeking the optimal solution, as formulated in Eqs. (15) and (16).

$$D=\left| {2B.S{L_{rnd}}(t) - SL(t)} \right|$$
(15)
$$SL(t+1)=S{L_{rnd}}(t) - D.C$$
(16)
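The phases above can be combined into a minimal SLnO loop, shown here on a toy sphere objective as a stand-in for tuning the AGRNN weights. The phase-selection probability, the decay schedule for C, and the toy objective are illustrative assumptions.

```python
import numpy as np

def slno_minimize(f, dim=2, n=20, iters=100, seed=1):
    """Minimal SLnO sketch: tracking (Eqs. 9-10), exploitation (Eq. 14)
    and exploration (Eqs. 15-16) over a population of sea lions."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5.0, 5.0, (n, dim))     # sea-lion positions
    best = min(X, key=f).copy()              # prey = current best solution
    for t in range(iters):
        C = 2.0 * (1.0 - t / iters)          # linearly decays 2 -> 0
        for i in range(n):
            if rng.random() < 0.5:           # exploitation: circle the prey
                m = rng.uniform(-1.0, 1.0)
                X[i] = np.abs(best - X[i]) * np.cos(2 * np.pi * m) + best
            elif abs(C) < 1.0:               # tracking: move toward the leader
                B = rng.random(dim)
                X[i] = best - np.abs(2.0 * B * best - X[i]) * C
            else:                            # exploration: follow a random sea lion
                r = X[rng.integers(n)].copy()
                B = rng.random(dim)
                X[i] = r - np.abs(2.0 * B * r - X[i]) * C
        cand = min(X, key=f)
        if f(cand) < f(best):                # prey only moves to a better solution
            best = cand.copy()
    return best
```

Because the prey position is replaced only when a strictly better candidate appears, the best objective value is non-increasing across iterations.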

Step 5: Termination

The weight parameter \({\alpha ^{\left( k \right)}}\) of the Alternating Graph-Regularized Neural Network generator is optimized using the sea-lion algorithm, and the process repeats until the halting criterion \(A={A_1}+1\) is reached. The AGRNN-CS-TD-IoT method then effectively classifies the cyber security threats as benign or malicious.

Fig. 2. Flow chart of Sea-Lion for optimizing AGRNN.

Result and discussion

The experimental outcomes of AGRNN-CS-TD-IoT are discussed in this section. The simulation is executed in Python on a PC with an Intel Core i5 2.50 GHz CPU, 8 GB RAM, and Windows 7, using the Google Code Jam (GCJ) dataset. The outcomes of the proposed AGRNN-CS-TD-IoT approach are analysed against the existing CSD-IoT-DLA15, SCI-DLTD-IoT16, and CSD-IoT-FL17 systems.

Performance measures

The performance of the proposed approach is examined using the accuracy, F-measure, and ROC performance metrics.

Accuracy

Accuracy states the rate at which detections are correctly categorized. The formula is derived in Eq. (17).

$$Accuracy=\frac{{\left( {TP+TN} \right)}}{{\left( {TP+FP+TN+FN} \right)}}$$
(17)

F-Measure

The F-Measure evaluation parameter combines precision and recall into a single performance measure. The formula is derived in Eq. (18).

$$F - Measure=2 \times \frac{{recall \times precision}}{{recall+precision}}$$
(18)

ROC

The ROC is formulated in Eq. (19),

$$ROC=0.5 \times \left( {\frac{{TP}}{{TP+FN}}+\frac{{TN}}{{TN+FP}}} \right)$$
(19)
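The three measures can be computed directly from confusion-matrix counts; the specificity term in the ROC expression is taken as TN/(TN + FP), the standard balanced-accuracy form.

```python
def detection_metrics(tp: int, tn: int, fp: int, fn: int):
    """Accuracy (Eq. 17), F-measure (Eq. 18) and ROC (Eq. 19) from the
    confusion-matrix counts of the benign/malicious classification."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    roc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))   # balanced accuracy
    return accuracy, f_measure, roc
```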

Performance analysis

Figures 3, 4 and 5 illustrate the simulation results of the AGRNN-CS-TD-IoT method. The performance metrics are compared with those of the existing methods: CSD-IoT-DLA, SCI-DLTD-IoT, and CSD-IoT-FL.

Fig. 3. Accuracy analysis.

Figure 3 presents the accuracy analysis. The proposed AGRNN-CS-TD-IoT achieves 29.60%, 18%, and 14.7% higher accuracy for detecting Benign threats, and 20.1%, 27.6%, and 13.2% higher accuracy for Malicious threats compared to existing methods, including CSD-IoT-DLA, SCI-DLTD-IoT, and CSD-IoT-FL, respectively.

Fig. 4. F-Measure analysis.

Figure 4 illustrates the F-measure analysis. The proposed AGRNN-CS-TD-IoT achieves 17.9%, 26.11%, and 13% higher F-measure for detecting Benign threats, and 16.7%, 35.6%, and 17% higher F-measure for Malicious threats compared to existing methods, including CSD-IoT-DLA, SCI-DLTD-IoT, and CSD-IoT-FL, respectively.

Fig. 5. ROC analysis.

Figure 5 presents the ROC analysis. The proposed AGRNN-CS-TD-IoT achieves 29.02%, 18.0%, and 14.7% higher ROC values in detecting cyber security threats compared to existing methods, including CSD-IoT-DLA, SCI-DLTD-IoT, and CSD-IoT-FL, respectively. The convergence plot of the proposed algorithm’s MSE is illustrated in Fig. 6, which shows that the minimum MSE is obtained within 18 iterations.

Fig. 6. Convergence characteristics of sea lion.

Conclusion and future work

In conclusion, this paper presents a novel IoT threat detection framework utilizing Alternating Graph-Regularized Neural Networks (AGRNN), which effectively detects cyber security threats in IoT systems. By leveraging the Google Code Jam Dataset, pre-processed with Edge-Aware Smoothing Sharpening Filtering (EASSF) and feature extraction via the General Synchro extracting Chirplet Transform (GSCT), our approach achieves performance improvements of up to 29.60% in accuracy compared to existing methods. The AGRNN classifier, optimized using the Sea-Lion Optimization Algorithm, demonstrates robustness and adaptability in distinguishing between benign and malicious threats, providing a promising solution for organizations to safeguard sensitive data and protect their reputation. Future research will focus on deploying the proposed AGRNN-based threat detection system in real-time IoT environments to evaluate its effectiveness under live, dynamic network conditions. To enhance classification capabilities, the model will be extended from binary to multi-class threat detection, enabling identification of various cyber-attacks such as ransomware, spyware, and denial-of-service (DoS). Additionally, the framework will be adapted for edge and fog computing architectures to enable low-latency, distributed detection directly on IoT devices, reducing reliance on centralized servers. To ensure adaptability across diverse IoT domains such as healthcare, industrial systems, and smart grids, transfer learning and domain adaptation techniques will be explored for improved cross-domain generalization. Further optimization gains may be achieved by hybridizing the Sea-Lion Optimization Algorithm with other metaheuristic techniques to enhance convergence speed and solution quality.
Finally, future efforts will focus on incorporating Explainable AI (XAI) techniques to improve the transparency and interpretability of AGRNN predictions, making the system more usable and trustworthy for cybersecurity professionals in real-world applications.