A deep learning based intrusion detection system for CAN vehicle based on combination of triple attention mechanism and GGO algorithm

Yang, Hongwei; Effatparvar, Mehdi

doi:10.1038/s41598-025-04720-y


Article
Open access
Published: 03 June 2025


Scientific Reports volume 15, Article number: 19462 (2025)


Abstract

With the growth of electric cars and the advancement of modern vehicles that rely on portable equipment and embedded systems, in-vehicle networks such as the CAN (Controller Area Network) face new security risks. Because the CAN bus has no security mechanisms such as encryption and authentication to counter cyber-attacks, an intrusion detection system that identifies attacks on the CAN bus is essential. In this study, a Triple-attention Mechanism (TAN) is used to recognize different kinds of security intrusions on the CAN bus. TAN identifies intrusions in three steps. First, the major features are extracted, with TAN acting as a feature descriptor. Then, a discriminating classifier classifies the extracted features. Finally, intrusions are recognized with the help of adversarial learning. The work uses a novel Greylag Goose Optimization algorithm to select the network hyperparameters. To check the effectiveness of the suggested method, an open-source dataset was used that recorded the CAN traffic of a real vehicle during message injection attacks. The results show that this method outperforms certain machine learning algorithms in error rate and false negatives for DoS, drive gear spoofing, and RPM spoofing attacks, with accuracy of 96.3%, recall of 96.1%, F1-Score of 96.2%, specificity of 97.2%, AUC-ROC of 0.97, and MCC of 0.92 for DoS attacks. Therefore, the phase attack is minimized.


Introduction

Background

Cars have become more complex in recent years, and in-vehicle network technology has been the main pillar of these rapidly progressing electronic changes¹. Nowadays, vehicles contain many ECUs (Electronic Control Units)² that connect to one another in an in-vehicle network³. The CAN (Controller Area Network) was created to provide a reliable communication channel over which these control units broadcast messages⁴. Serial protocols used within in-vehicle networks, such as CAN⁵, FlexRay, and LIN (Local Interconnect Network), are efficient for real-time message exchange, but they are quite weak against remote attacks⁶. Since the design of the CAN protocol lacks security mechanisms such as authentication and encryption to protect its communications against cyber-attacks, it is vulnerable to various security attacks⁷. The lack of security solutions in the CAN protocol also allows attackers to control car systems by injecting fake messages^8,9.

Fake messages can be sent directly through the OBD-II port or through wireless communication systems such as Bluetooth and Wi-Fi¹⁰. There is a noticeable connection between road safety and cyber security violations, because malicious attacks can lead to unexpected behaviors that cause accidents and injuries¹¹. As a result, research on improving car security has become significant¹². There are 4 different ways to improve the security of in-vehicle systems; however, the solution that can identify an intrusion in the shortest possible time is the IDS (intrusion detection system)¹³.

Previously, anomaly-based IDS was considered an effective approach for detecting malicious attacks in CAN¹⁴. In recent years, artificial intelligence and deep learning methods have increasingly been used to build better IDSs^15,16,17,18. Examples include deep convolutional neural networks (CNN), Generative Adversarial Networks (GAN), and self-organizing maps (SOM)¹⁹. However, these methods either use unrealistic data, which makes their results unreliable, or need a large amount of data that may not be readily available²⁰. Consequently, the results of previous research in this field are expected to be unreliable or to depend on limited data, and the error rate is too high.

In this study, an anomaly-based IDS is presented to enhance CAN bus security in in-vehicle networks²¹. To do this, adversarial learning is incorporated into a CNN, and a DACNN is proposed to recognize security intrusions on the CAN bus and in vehicles in a way that works well with a very small amount of data. By taking advantage of adversarial learning, the method can generate more useful data for IDS training from a limited set of real attacks. In this way, the recommended IDS can recognize different cyber-attacks, such as DoS (Denial of Service), fuzzy, and spoofing attacks, with fewer false negatives and a lower error rate. The results indicate that the technique achieves an acceptable level of accuracy with fewer false negatives and lower error rates while using a limited amount of data.

Related works

Gao et al.²² analyzed the security measures implemented within vehicles, with a particular focus on the state of in-vehicle intrusion detection systems. These systems were primarily designed for specific vehicle models and were insufficient to address the overall requirements for vehicle security. A novel in-vehicle intrusion detection mechanism was introduced based on deep learning together with a structured representation of experiential knowledge known as the set of experience knowledge structure (SOEKS). The approach employed SOEKS and information entropy to enhance the adaptability of intrusion detection across various vehicles. Deep learning could then be trained on a large quantity of vehicle-specific data to create a more accurate model for that particular vehicle. Experimental outcomes showed that the suggested approach reached 98% accuracy and detected a wide range of in-vehicle attacks.

Ullah et al.²³ proposed a hybrid deep learning (DL) model for detecting cyber-attacks in the IoV. The suggested model was based on gated recurrent units (GRU) and long short-term memory (LSTM). The model was evaluated on two datasets: an integrated DDoS dataset that includes CIC-IDS 2017, CSE-CIC-IDS 2018, and CIC DoS, and a car-hacking dataset. The experimental findings indicated that the suggested algorithm attains attack detection accuracies of 99.5% and 99.9% for DDoS and car hacks, respectively. The additional performance scores, including F1-score, recall, and precision, also confirm the superior performance of the suggested framework.

Lin et al.²⁴ suggested an intrusion detection model for IVNs that used the VGG16 deep learning classifier to analyze attack behavior patterns and categorize potential risks. The dataset was supplied by the Hacking and Countermeasure Research Lab (HCRL) and was used to assess classification efficacy for DoS (Denial of Service), gear spoofing, RPM spoofing, and fuzzy attacks in vehicle communication. The performance of the suggested classifier was evaluated against the XGBoost ensemble learning method for identifying threats within in-vehicle networks. Specifically, the test cases were designed to recognize irregularities based on precision, F1-score, recall, and accuracy, in order to ensure accuracy and to identify false alarm risks. The outcomes demonstrated that the classification accuracies on the HCRL Car-Hacking dataset with the XGBoost and VGG16 classifiers were 99.9995% and 97.8241%, respectively, for the 5-subcategory classification.

Lo et al.²⁵ introduced an intrusion detection system based on hybrid deep learning (HyDL-IDS) that used a spatial–temporal representation to characterize in-vehicle network traffic accurately. The system employed a CNN (Convolutional Neural Network) followed by a Long Short-term Memory (LSTM) network to automatically extract spatial and temporal attributes from in-vehicle network traffic. The suggested HyDL-IDS was evaluated on a benchmark car-hacking dataset. The findings indicated almost 100% detection accuracy with a low false alarm rate for diverse cyber-attacks on that dataset, comprising DoS (Denial-of-Service) attacks, spoofing (gear and revolutions per minute (RPM)) attacks, and fuzzy attacks. Based on the spatial–temporal representation of in-vehicle network traffic, the suggested model considerably improved detection accuracy and false alarm rate for recognizing intrusions in in-vehicle networks in comparison with other techniques, specifically multi-layer perceptron, CNN, LSTM, Naive Bayes, and decision tree.

Wang et al.²⁶ analyzed ten representative advanced deep-learning-based intrusion detection techniques and gave examples of their features and benefits. Fair and quantitative experiments were also designed to compare the techniques side by side. In addition, the work offered recommendations for choosing baseline methods and useful direction for future research on lightweight models and on the capacity to identify unknown threats. Table 1 summarizes the limitations of the related and existing work discussed in this manuscript.

Table 1 Related and existing work limitations.


The tabular overview sets out the limitations associated with the available strategies and highlights the need for more efficient, adaptive, and robust intrusion detection systems such as the one introduced in this manuscript.

Motivation

Most available intrusion detection systems (IDS) for CAN networks apply traditional machine learning-based methods that may have difficulty in detecting advanced and changing attack patterns. The current lack of optimized hyperparameter tuning processes in numerous IDS frameworks also restricts their performance. These gaps accentuate the pressing necessity for adaptive and optimal intrusion detection schemes suited to the specific requirements of automotive networks. The growing interconnectivity of our environment makes this research even more relevant as it seeks to improve CAN network security to safeguard the integrity and functionality of our future-connected vehicles.

Contribution

In this paper, we contribute to several areas within the automotive cybersecurity landscape, most notably effective protection against cyber-attacks targeting CAN networks. A novel intrusion detection architecture is proposed based on the Triple-Attention Mechanism (TAN), which operates in three stages: feature extraction, feature classification, and adversarial learning. The new system can accurately identify spoofing attacks (e.g., drive gear and RPM spoofing) and other cyber-attacks (e.g., Denial-of-Service (DoS)). By adopting adversarial learning, the resulting framework proves resilient to increasingly sophisticated attack modes, with a demonstrated ability to adapt and sustain detection performance in the presence of evolving strategies. Moreover, the paper introduces an implementation of a novel design of the Greylag Goose Optimization algorithm for optimal hyperparameter selection, which improves the balance between accuracy and computational efficiency for the model.

An open-source dataset that reflects real-world CAN network traffic during message injection attacks is used to validate the framework and to demonstrate the practicality of the proposed method under realistic conditions. The proposed TAN-based system offers a false negative rate four times lower and an error rate two times lower than common machine-learning-based approaches, which substantially improves its ability to mitigate DoS and spoofing attacks and its overall effectiveness against phase attacks.

Therefore, the proposed mechanism not only addresses critical gaps in existing frameworks but also provides a robust, adaptive, and efficient IDS for the specific challenges that CAN networks pose in automotive cybersecurity. The paper concludes by emphasizing the scalability of the TAN framework to additional in-vehicle networks, as well as the applicability of other security mechanisms, including encryption and authentication, which can provide better protection.

Materials and methods

This section presents the approaches used in the suggested intrusion detection method for the Controller Area Network. The convolutional neural network used in this work is introduced first, followed by the adversarial generator network and the controller area network.

Triple-attention mechanism

The attention mechanism is known for distributing probability weights. It evaluates the attributes at different time steps, so features that carry more information learn larger weighting coefficients, which improves the quality of the high-dimensional hidden-layer attributes. The triple-attention mechanism dynamically computes a local attribute $d_{j}$ as a weighted sum of the top-$p$ local attributes (Eq. 1), where $H_{j}$ denotes $H_{j1} ,H_{j2} ,H_{j3} , \ldots ,H_{jp}$.

$$d_{j} = \mathop \sum \limits_{k = 1}^{p} b_{jk} H_{jk}$$

(1)

where ${b}_{jk}$ is the weight coefficient of each feature within the input sequence. The relevance score $f_{jk}$ produced by the attention function is given in Eq. (2):

$$f_{jk} = x_{m}^{U} {\text{tanh}}\left( {X_{f} H_{u - 1} + V_{f} H_{jk} + a_{f} } \right)$$

(2)

where $m$ denotes the attribute points of $k$, and $H_{k}$ is the hidden state of pixel $k$, expressed as a feature vector. The context vector $d_{j}$ is formed by combining the attention weights $b_{jk}$ with the hidden states $H_{k}$. $X_{f}$, $x_{m}^{U}$, $a_{f}$, and $V_{f}$ are shared parameters. The attention weight $b_{jk}$ is then calculated in Eq. (3):

$$b_{jk} = \frac{{{\text{exp}}\left( {f_{jk} } \right)}}{{\mathop \sum \nolimits_{l = 1}^{p} {\text{exp}}\left( {f_{jl} } \right)}}$$

(3)

In this study, the triple-attention mechanism comprises 3 different attention modules: traditional attention, max-pooling attention, and average-pooling attention. The traditional attention module preserves the half of the features that carry more weight. The max-pooling attention module then emphasizes the local attributes with more weight. Finally, the average-pooling module is used to maintain the local attributes. The three modules are computed as follows:

$$O = \mathop \sum \limits_{k = 1}^{p} \frac{{{\text{exp}}\left( {f_{jk} } \right)}}{{\mathop \sum \nolimits_{l = 1}^{p} {\text{exp}}\left( {f_{jl} } \right)}}H_{jk}$$

(4)

$$h = AVG.POOLING\left[ {\mathop \sum \limits_{k = 1}^{p} \frac{{{\text{exp}}\left( {f_{jk} } \right)}}{{\mathop \sum \nolimits_{l = 1}^{p} {\text{exp}}\left( {f_{jl} } \right)}}H_{jk} } \right]$$

(5)

$$n = MAX.POOLING\left[ {\mathop \sum \limits_{k = 1}^{p} \frac{{{\text{exp}}\left( {f_{jk} } \right)}}{{\mathop \sum \nolimits_{l = 1}^{p} {\text{exp}}\left( {f_{jl} } \right)}}H_{jk} } \right]$$

(6)

where $O$ is the output feature of the traditional attention mechanism, $h$ is the output feature of the average-pooled attention mechanism, and $n$ is the output feature of the max-pooled attention mechanism.

Additionally, the triple-attention mechanism assigns several weights to the actual intrusion detection features so that all features can be distinguished. The integrated output of the mechanism is computed in Eq. (7):

$$G_{{\left( {O,h,n} \right)}} = Concatenate\left( {O \oplus h \oplus n} \right)$$

(7)

Once the attention mechanism has carried out the distribution and computation steps, the weight optimizer generates three different features with new weights. $G_{{\left( {O,h,n} \right)}}$ denotes the integrated result of the triple-attention attributes, and $\oplus$ denotes the feature-fusion operation. The fused output gives the dense residual network richer, deeper attributes by integrating the multiple attention structures.
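The computation in Eqs. (1), (3), and (5) to (7) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the relevance scores of Eq. (2) are passed in directly rather than learned, the traditional attention output $O$ is taken to be the weighted sum of Eq. (1), and the pooling width `window` is a hypothetical parameter that the text does not specify.

```python
import math

def softmax(scores):
    """Eq. (3): turn relevance scores f_jk into attention weights b_jk."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def triple_attention(scores, features, window=2):
    """Concatenate O, h and n for one position j.

    scores   -- relevance scores f_j1..f_jp, given directly (assumption)
    features -- local feature vectors H_j1..H_jp, each a list of floats
    window   -- pooling width (assumption; not stated in the text)
    """
    weights = softmax(scores)
    dim = len(features[0])
    # Weighted sum of the local features, as in Eq. (1).
    context = [sum(w * h[i] for w, h in zip(weights, features))
               for i in range(dim)]
    # Non-overlapping pooling windows over the weighted sum.
    chunks = [context[i:i + window] for i in range(0, dim, window)]
    avg_pooled = [sum(c) / len(c) for c in chunks]  # h, Eq. (5)
    max_pooled = [max(c) for c in chunks]           # n, Eq. (6)
    return context + avg_pooled + max_pooled        # Eq. (7)
```

With equal scores the weights are uniform, so two feature vectors are simply averaged before pooling, which makes the behaviour easy to check by hand.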

Dense residual network

The dense residual network is designed to combine deep attributes. Unlike a conventional convolutional network, a residual network provides a skip (residual) connection. This connection reduces the number of characteristic variables and avoids the vanishing-gradient and degradation problems caused by deep features. The residual network is calculated in Eq. (8):

$$Z = X_{j} \gamma \left( {X_{j - 1} Y_{j - 1} } \right) + Y$$

(8)

where $Z$ is the output of the layer and $\gamma$ is the ReLU activation function. $X$ and $Y$ denote the weight matrix and the input of the present layer. For feature fusion, the dense network supplies a large amount of earlier feature information from which deep attributes can be mined, so a large number of deep attributes is retained. A double dense fusion policy is adopted in this study because it fuses features using both the initial features and the deep residual features. To obtain additional deep features, the policy makes full use of both. The initial and deep residual features are computed in Eqs. (9) and (10):

$$Y_{0} = I_{\partial } \left( {Y_{\partial - 1} } \right) + Y_{\partial - 1}$$

(9)

$$Y_{Z} = I_{\partial } \left( {Y_{\partial - 1} } \right) + Z_{\partial - 1}$$

(10)

where $Y_{0}$ is the dense operation on the initial attributes, $I_{\partial }$ is the feature-fusion policy, and $Y_{Z}$ is the dense operation on the output features of the residual network. Rapid Tri-Net can mine features more deeply and fuse high-dimensional data. It also alleviates the gradient vanishing and explosion problems of deep networks.
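The skip connection of Eq. (8) can be illustrated with a small sketch. It reads Eq. (8) as $Z = X_{j}\,\gamma(X_{j-1} Y) + Y$, treats $Y$ and $Y_{j-1}$ as the same layer input, and uses plain lists instead of a tensor library; all three choices are assumptions made for the example, and the function names are hypothetical.

```python
def relu(v):
    """Element-wise ReLU, the activation written as gamma in Eq. (8)."""
    return [max(0.0, x) for x in v]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def residual_layer(x_prev, x_cur, y):
    """Z = X_j * relu(X_{j-1} * Y) + Y, with Y added back via the skip path."""
    transformed = matvec(x_cur, relu(matvec(x_prev, y)))
    return [t + s for t, s in zip(transformed, y)]

identity = [[1.0, 0.0], [0.0, 1.0]]
print(residual_layer(identity, identity, [1.0, -2.0]))  # [2.0, -2.0]
```

Because the input is added back unchanged, the negative component survives even though ReLU zeroes it on the transformed path, which is the property that helps gradients flow through deep stacks.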

Capsule network

The capsule network was introduced to preserve the pose of objects and their characteristics in the data. It generates output vectors of the same size but with different routings, and these vectors represent the parameters of the data. Convolutional neural networks (CNNs) employ scalar activation functions such as tanh, sigmoid, and ReLU. In contrast, the capsule network uses a vector-based activation function known as squashing, given in the following equation.

$$w_{k} = \frac{{\left\| {T_{k} } \right\|^{2} }}{{1 + \left\| {T_{k} } \right\|^{2} }}\frac{{T_{k} }}{{\left\| {T_{k} } \right\|}}$$

(11)

where $T_{k}$ and $w_{k}$ are the input and output vectors of capsule $k$, respectively. When an entity is present in the data, squashing shrinks the long vector to a length near 1, whereas when no entity is present, it shrinks the short vector to a length near 0. For every layer except the first layer of the capsule network, the total input $T_{k}$ of a capsule is the weighted sum of the vectors $v_{k|j}$ from the capsules in the layer below (Eqs. 12 and 13). Each vector $v_{k|j}$ is obtained by multiplying the output $P_{j}$ of a capsule in the lower layer by the weight matrix $X_{jk}$.

$$T_{k} = \mathop \sum \limits_{j} c_{jk} v_{k|j}$$

(12)

$$v_{k|j} = X_{jk} P_{j}$$

(13)

where the coefficient calculated by the dynamic routing procedure has been represented through $c_{jk}$, which has been calculated in the following way:

$$c_{jk} = \frac{{{\text{exp}}\left( {b_{jk} } \right)}}{{\mathop \sum \nolimits_{l} {\text{exp}}\left( {b_{jl} } \right)}}$$

(14)

where $b_{jk}$ is the log probability. The softmax ensures that the correlation coefficients between the $j{\text{th}}$ capsule and the capsules in the layer above sum to 1. A margin loss is used in the capsule network to determine whether entities of a specific category are present. It is computed as follows:

$$M_{l} = U_{l} Max\left( {0, n^{ + } - \left\| {w_{l} } \right\|} \right)^{2} + \kappa \left( {1 - U_{l} } \right)Max\left( {0, \left\| {w_{l} } \right\| - n^{ - } } \right)^{2}$$

(15)

The value of ${U}_{l}$ is 1 when category $l$ is present. ${n}^{+} = 0.9$ and ${n}^{-} = 0.1$ are margin hyper-parameters, and $\kappa$ is the down-weighting of the loss. The length of a vector calculated within the capsule network gives the probability that the entity is present in the data, while the direction of the vector encodes properties such as texture, color, size, and position.
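As a quick check of Eqs. (11) and (15), the squashing function and the per-class margin loss can be written as below. The sketch is illustrative only: the function names are hypothetical, and the default `kappa=0.5` is an assumption, since the text does not give a value for $\kappa$.

```python
import math

def squash(t):
    """Eq. (11): rescale vector t so that its length lies in [0, 1)."""
    norm_sq = sum(x * x for x in t)
    if norm_sq == 0.0:
        return [0.0] * len(t)  # avoid division by zero for the zero vector
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in t]

def margin_loss(length, present, n_plus=0.9, n_minus=0.1, kappa=0.5):
    """Eq. (15) for one class l.

    length  -- ||w_l||, the length of the class capsule output
    present -- True if class l is present (U_l = 1)
    kappa   -- down-weighting factor (0.5 is an assumed value)
    """
    u = 1.0 if present else 0.0
    positive_term = max(0.0, n_plus - length) ** 2
    negative_term = max(0.0, length - n_minus) ** 2
    return u * positive_term + kappa * (1.0 - u) * negative_term
```

For example, `squash([3.0, 4.0])` keeps the direction of the input but has length 25/26, and a present class with capsule length 0.95 contributes zero loss because it already exceeds the upper margin.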

In addition, the capsule network includes 3 fully connected layers, 1 digit layer, 1 primary layer, and 1 convolutional layer. The convolutional layer consists of 256 kernels of size $9 \times 9$ and transforms pixel intensities into local attributes of size $20 \times 20$, which serve as inputs to the primary capsules. The subsequent primary capsule layer consists of 36 capsules that use convolutional kernels of size $9 \times 9 \times 9 \times 256$. These two layers use the ReLU activation function. The final digit-capsule layer holds 16D vectors containing the instantiation parameters required for reconstruction.

In this study, a metaheuristic-based methodology is used to optimize this network so that it delivers optimal results in the system.

Novel Greylag Goose Optimization algorithm

The most significant feature of the gooseneck barnacle is its breeding behaviour during sperm casting and the way wind and wave action influence breeding. An algorithm was therefore designed to imitate these features mathematically. The general form of the model is:

$$\left( {X + l} \right)_{i + 1} = \left( {X + l} \right)_{i} + WD_{i} + T_{dim} + S\left( {\left( {X + l} \right)_{i} ,(Sp_{water} )_{j} } \right) + Hs \cdot \left( {X + l} \right)_{i}$$

(16)

where $\left( {X + l} \right)_{i}$ is the position of the goose barnacle in the $i_{th}$ iteration, $WD_{i}$ is the wind direction in the $i_{th}$ iteration, $T_{dim}$ is the target dimension used to move toward the best solution, $S\left( {\left( {X + l} \right)_{i} ,(Sp_{water} )_{j} } \right)$ is the logarithmic spiral between the $i_{th}$ barnacle $\left( {X + l} \right)_{i}$ and the $j_{th}$ sperm region in the water $(Sp_{water} )_{j}$, and $Hs$ is the significant wave height. The mathematical stages of the model are explained in detail below.

Initialization

In the GBO algorithm, each gooseneck barnacle is an individual solution, and the locations of the goosenecks are the variables of the problem space. A gooseneck can reach across the dimensional space by extending its penis, which may be seven or eight times longer than its body²⁷. GBO is a population-based algorithm. For initialization, the matrix $X$ holds the population of gooseneck barnacles, a kind of stalked barnacle, and is defined as follows:

$$X = \left[ {\begin{array}{*{20}c} {\left( {X + l} \right)_{1,1} } & {\left( {X + l} \right)_{1,2} } & \cdots & {\left( {X + l} \right)_{1,d} } \\ \vdots & \vdots & \ddots & \vdots \\ {\left( {X + l} \right)_{n,1} } & {\left( {X + l} \right)_{n,2} } & \cdots & {\left( {X + l} \right)_{n,d} } \\ \end{array} } \right]$$

(17)

Here, $n$ is the population size and $d$ is the dimension of $X$, that is, the number of variables. As the goosenecks have an eatable construction, they all have different sizes, denoted by $l$, which is chosen at random. For every animal there is a consistent area in the water that holds sperm for breeding. This area is an important element of the suggested algorithm, called the 'sperm_region', and its matrix has the same shape as $X$:

$${\text{Sperm}}\_{\text{Region}} = \left[ {\begin{array}{*{20}c} {\left( {sr} \right)_{1,1} } & {\left( {sr} \right)_{1,2} } & \cdots & {\left( {sr} \right)_{1,d} } \\ \vdots & \vdots & \ddots & \vdots \\ {\left( {sr} \right)_{n,1} } & {\left( {sr} \right)_{n,2} } & \cdots & {\left( {sr} \right)_{n,d} } \\ \end{array} } \right]$$

(18)

Here, $n$ is the number of goosenecks and $d$ is the dimension of $X$, that is, the number of variables. Although both the positions of the gooseneck barnacles and the sperm regions are candidate solutions, the way the problem domain handles them in each iteration differs. The actual search agents, represented by the penises of the gooseneck barnacles, explore the solution space, while the sperm region in the water represents the best breeding place.
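The two matrices of Eqs. (17) and (18) can be initialised as in the following sketch. It assumes uniform random sampling inside box bounds `lower` and `upper`, which the text does not specify, and it folds the random length term $l$ into the sampled position value; the function name and parameters are hypothetical.

```python
import random

def init_population(n, d, lower, upper, seed=None):
    """Return (X, sperm_region), two n-by-d matrices as in Eqs. (17)-(18).

    Uniform sampling within [lower, upper] is an assumption made for
    illustration; the source only states that the values are random.
    """
    rng = random.Random(seed)

    def random_matrix():
        return [[rng.uniform(lower, upper) for _ in range(d)]
                for _ in range(n)]

    barnacles = random_matrix()     # rows are the (X + l) positions
    sperm_region = random_matrix()  # rows are the sr entries
    return barnacles, sperm_region
```

Passing a fixed `seed` makes a run repeatable, which is convenient when comparing hyperparameter searches.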

As a result of breeding, the sperm region can be regarded as a new gooseneck within the solution space. The goosenecks therefore explore the new location and move to it if it turns out to be a better alternative. This guarantees that the best choice found for a gooseneck is always retained. Research shows that gooseneck barnacles are found in the upper and middle intertidal zones.

Barnacles are dominant where the significant wave height ranges from under 0.8 m to 1.5–3 m above mean low water. Commonly, the significant wave height is computed through the equation $Hs = 4\sqrt {H \wedge 2T}$, but in the current form of GBO, $Hs$ is computed via the following formula, which takes into account the tolerance of barnacles over their life cycle.

$$Hs = 1.5 - \left( {\frac{{Iteration \left( {1.5 - 0.2} \right)}}{maximum\;Iteration}} \right)$$

(19)

Here, it is assumed that a barnacle can tolerate wave intensities ranging from 1.5 or 3.0 m down to 0.2 m. $Hs$ decreases in each iteration so that the range of wave intensities is explored in search of the best breeding area. Next, a logarithmic spiral is defined around the sperm region for sperm casting.
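Eq. (19) is a linear schedule, and a short sketch makes its end points explicit. The function and parameter names are hypothetical; the defaults 1.5 and 0.2 are the values that appear in Eq. (19).

```python
def significant_wave_height(iteration, max_iteration, hs_max=1.5, hs_min=0.2):
    """Eq. (19): Hs decreases linearly from hs_max to hs_min."""
    return hs_max - iteration * (hs_max - hs_min) / max_iteration

# Schedule over a short run of 4 iterations.
for it in range(5):
    print(it, round(significant_wave_height(it, 4), 3))
```

The value starts at 1.5 for iteration 0 and reaches 0.2 at the final iteration, so the wave term shrinks steadily as the search proceeds.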

$$S\left( {\left( {X + l} \right)_{i} ,(Sp_{water} )_{j} } \right) = D_{i} .e^{bt} .\cos \left( {2\pi t} \right) + (Sp_{water} )_{j}$$

(20)

Here, $D_{i}$ is the distance of the $i_{th}$ barnacle from the $j_{th}$ sperm region used for breeding, $b$ is the constant that defines the shape of the logarithmic spiral, and $t$ is a random number in [-1, 1]. The distance is computed as $D_{i} = \left( {X + l} \right)_{i} - (Sp_{water} )_{j}$, where $\left( {X + l} \right)_{i}$ is the $i_{th}$ barnacle and $(Sp_{water} )_{j}$ is the $j_{th}$ sperm casting region.
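The spiral of Eq. (20) can be sketched as follows. The distance is applied element-wise per dimension and the spiral constant defaults to `b=1.0`; both are assumptions, as the text does not fix them, and the function name is hypothetical.

```python
import math

def spiral_position(barnacle, sperm_region, t, b=1.0):
    """Eq. (20): D * exp(b*t) * cos(2*pi*t) + Sp, applied per dimension.

    barnacle     -- position (X + l)_i as a list of floats
    sperm_region -- position (Sp_water)_j as a list of floats
    t            -- random number in [-1, 1]
    b            -- spiral shape constant (1.0 is an assumed default)
    """
    factor = math.exp(b * t) * math.cos(2.0 * math.pi * t)
    return [(x - s) * factor + s for x, s in zip(barnacle, sperm_region)]
```

At `t = 0` the factor equals 1 and the function returns the barnacle's own position, while at `t = 0.25` the cosine term vanishes and the result collapses onto the sperm region.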

Off-spring generation

Sperm-cast breeding is modelled as a combination of the components discussed in this study. To update the position of new gooseneck barnacles within the solution space and to mimic movement through the generation of new offspring, a movement vector $\Delta \left( {X + l} \right)_{i}$ is defined as follows:

$${\Delta }\left( {X + l} \right)_{i + 1} = WD_{i} + T_{dim} + Hs. {\Delta }\left( {X + l} \right)_{i}$$

(21)

$$\left( {X + l} \right)_{i + 1} = \left( {X + l} \right)_{i} + {\Delta }\left( {X + l} \right)_{i + 1}$$

(22)

$$\left( {X + l} \right)_{i + 1} = \left( {X + l} \right)_{i} + Levy \times \left( {X + l} \right)_{i}$$

(23)

The wind direction $WD_{i}$ is drawn from the degree range [0, 359] on the radius of the solution space and is then combined with the target dimension $T_{dim}$ to move toward the best solution. $T_{dim}$ is the value of the dimension at the target, with the wind always assumed to be directed toward the target. The Levy flight is defined as:

$$Levy \left( N \right) = 0.01 \times \frac{{r_{1} \times \sigma }}{{|r_{2} |^{{\frac{1}{\beta }}} }}$$

(24)

$\beta$ is a constant set to 1.5, $r_{1}$ and $r_{2}$ are random numbers in [0, 1], and $\sigma$ is given by:

$$\sigma = \left( {\frac{{\tau \left( {1 + \beta } \right) \times \sin (\frac{\pi \beta }{2})}}{{\tau \left( {\frac{1 + \beta }{2}} \right) \times \beta \times 2^{{\left( {\frac{\beta - 1}{2}} \right)}} }}} \right)$$

(25)

where $\tau \left( y \right) = \left( {y - 1} \right)!$.
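A minimal sketch of Eqs. (24) and (25) is given below. It evaluates $\sigma$ exactly as printed in Eq. (25) and assumes that $\tau$ denotes the gamma function, so that `math.gamma` extends the factorial to the non-integer arguments that arise for $\beta = 1.5$. Note that Mantegna's formulation of Levy steps commonly raises this ratio to the power $1/\beta$; that exponent is not applied here because it does not appear in Eq. (25). Function names are hypothetical.

```python
import math
import random

def levy_sigma(beta=1.5):
    """Eq. (25) as printed, using the gamma function for tau (assumption)."""
    numerator = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    denominator = (math.gamma((1.0 + beta) / 2.0) * beta
                   * 2.0 ** ((beta - 1.0) / 2.0))
    return numerator / denominator

def levy_step(beta=1.5, rng=random):
    """Eq. (24): 0.01 * r1 * sigma / |r2|^(1/beta), with r1, r2 in [0, 1]."""
    r1 = rng.random()
    r2 = rng.random()
    while r2 == 0.0:  # guard against division by zero
        r2 = rng.random()
    return 0.01 * r1 * levy_sigma(beta) / abs(r2) ** (1.0 / beta)
```

For $\beta = 1.5$ the value returned by `levy_sigma` is roughly 0.58, and most steps are small with occasional large jumps when `r2` is close to zero.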

The preceding explanation shows that the model requires each gooseneck barnacle to move toward a target breeding area within the logarithmic spiral region of sperm over a number of iterations. However, no direction is known in the real solution space, because the position of the global optimum still has to be found. Therefore, a breeding target must be selected in every optimization iteration. In GBO, the gooseneck with the maximum objective value found during optimization is taken as the target, which draws the other goosenecks toward the breeding zone and lets GBO retain the most promising target in the solution space in every iteration.

GBO starts the optimization procedure by generating a random pool of individual solutions. Then, according to Eq. (22), the search agents update their positions relative to the target using the wind direction and the significant wave movement. To mimic the real situation, only a single sperm region may remain, depending on the wind and wave intensity. Positions are improved further by means of Eq. (23). The positions are refined iteratively until the termination criterion is met. Finally, the best global approximation is returned, together with the quality and position of the best target. The discussion above shows how the GBO algorithm finds the best result in a solution space.

Novel version

The Greylag Goose Optimization (GGO) algorithm is a recently introduced optimization algorithm, but it has limitations on some problems. Between two data samples, the algorithm struggles to iteratively update the variable damages to the factors until the modal data equals the final achieved outcomes in the field structure. This weakens the exploration term of the algorithm and leads to premature convergence. In the current work, the new exploitation term of the algorithm is framed based on the Cuckoo Search Algorithm (CS)²⁸ and Gray Wolf Optimization (GWO)²⁹. This improvement leads to the following general update equations:

$$\left( {X + l} \right)_{1} = \left( {X + l} \right)_{i} - A_{1} \times \left( {C_{1} \times \left( {X + l} \right)_{i}^{New} - \left( {X + l} \right)_{i} } \right)$$

(26)

$$\left( {X + l} \right)_{2} = \left( {X + l} \right)_{i} - A_{2} \times \left( {C_{2} \times \left( {X + l} \right)_{i}^{New} - \left( {X + l} \right)_{i} } \right)$$

(27)

$$\left( {X + l} \right)_{3} = \left( {X + l} \right)_{i} - A_{3} \times \left( {C_{3} \times \left( {X + l} \right)_{i}^{New} - \left( {X + l} \right)_{i} } \right)$$

(28)

$$\left( {X + l} \right)_{i + 1}^{New} = \frac{1}{3} \times \left[ {\left( {X + l} \right)_{1} + \left( {X + l} \right)_{2} + \left( {X + l} \right)_{3} } \right]$$

(29)

where $\left( {X + l} \right)_{i}^{New}$ represents the new solution, and $A_{1}$, $A_{2}$, $A_{3}$ and $C_{1}$, $C_{2}$, $C_{3}$ are coefficients generated from $A$ and $C$, where

$$A = 2 \times a \times rnd_{1} - a$$

(30)

$$C = 2 \times rnd_{2}$$

(31)

where $a$ is a linearly decreasing parameter, and $rnd_{1}$ and $rnd_{2}$ are two random values in the range [0, 1].

This update is based on ideas from GWO and accounts for 50% of the updates. The other 50% of the updates during the process are inspired by the CS algorithm, as shown below:

$$\left( {X + l} \right)_{i + 1} = \left( {X + l} \right)_{i} + \alpha \otimes L\left( \lambda \right) \times \left( {\left( {X + l} \right)_{Best} - \left( {X + l} \right)_{i}^{New} } \right)$$

(32)

where $\alpha$ and $L\left( \lambda \right)$ specify the Lévy distributed random values.
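The update rules in Eqs. (26) to (32) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: $rnd_{1}$ and $rnd_{2}$ are drawn as scalars, the 50/50 split is realized by a coin flip per update, the Lévy value is passed in by the caller, and the function names are hypothetical.

```python
import numpy as np

def gwo_style_update(x, x_new, a, rng):
    """Eqs. (26)-(29): average three positions built from fresh A and C draws."""
    positions = []
    for _ in range(3):
        A = 2.0 * a * rng.random() - a   # Eq. (30)
        C = 2.0 * rng.random()           # Eq. (31)
        positions.append(x - A * (C * x_new - x))  # Eqs. (26)-(28)
    return sum(positions) / 3.0          # Eq. (29)

def cs_style_update(x, x_new, x_best, alpha, levy):
    """Eq. (32): Levy-scaled step along (best - new)."""
    return x + alpha * levy * (x_best - x_new)

def nggo_update(x, x_new, x_best, a, alpha, levy, rng):
    """Apply the GWO-style or the CS-style rule with equal probability (assumed split)."""
    if rng.random() < 0.5:
        return gwo_style_update(x, x_new, a, rng)
    return cs_style_update(x, x_new, x_best, alpha, levy)

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])        # current position (hypothetical values)
x_new = np.array([0.5, -1.0])    # candidate solution (hypothetical values)
x_best = np.zeros(2)             # best position found so far (hypothetical)
print(nggo_update(x, x_new, x_best, a=1.0, alpha=0.01, levy=0.3, rng=rng))
```

With $a = 0$ the coefficient $A$ vanishes and the GWO-style rule returns the current position unchanged, which matches the role of a linearly decreasing $a$ in shrinking the step size late in the run.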

Optimizing TAN based on NGGO algorithm

Metaheuristic algorithms are widely used to optimize networks of various architectures, including those based on the triple-attention module, dense residual network, and capsule network. The following describes how the objective function can be set up for the embedding space, together with a loss function that the model minimizes. The structure of the objective function is detailed below:

(A) Loss Function of Attention Mechanism

The attention mechanism, such as the triple-attention mechanism, dynamically weights different features, which helps improve the quality of feature extraction. The objective function for optimizing the attention mechanism can be represented as:

$$Loss_{attention} = - \mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = 1}^{P} \left[ {y_{ij} \log \left( {d_{j} } \right) + \left( {1 - y_{ij} } \right)\log \left( {1 - d_{j} } \right)} \right]$$

(33)

where $y_{ij}$ is the ground-truth label of the $i$-th sample and $j$-th feature, $d_{j}$ is the output of the attention mechanism for the $j$-th feature, $N$ is the number of samples, and $P$ is the number of features.

(B) Loss Function for Dense Residual Network

The dense residual network is designed to reduce the vanishing gradient issue and enhance the ability to extract features. The goal of optimizing the dense residual network can be expressed as follows:

$$Loss_{denseResNet} = \mathop \sum \limits_{i = 1}^{N} \left| {Z_{i} - Y_{i} } \right|^{2}$$

(34)

where $Z_{i}$ is the dense residual network output for the $i$-th sample, $Y_{i}$ is the ground-truth output for the $i$-th sample, and $N$ is the number of samples.

(C) Loss Function for Capsule Network

The capsule network is designed to keep the spatial connections between objects and their features intact. The objective function used to improve the capsule network can be written as:

$$Loss_{CapsNet} = \mathop \sum \limits_{l = 1}^{L} \left[ {U_{l} \max \left( {0,n^{ + } - \left| {w_{l} } \right|} \right)^{2} + \kappa \left( {1 - U_{l} } \right)\max \left( {0,\left| {w_{l} } \right| - n^{ - } } \right)^{2} } \right]$$

(35)

where $U_{l}$ is 1 if category $l$ is present and 0 otherwise, $n^{ + }$ and $n^{ - }$ are hyper-parameters (e.g., $n^{ + }$ = 0.9 and $n^{ - }$ = 0.1), $\kappa$ is a down-weighting factor for the loss, $w_{l}$ is the output vector of the capsule for category $l$, and $L$ is the number of categories (Fig. 1).

Therefore, the entire network architecture, including the triple-attention mechanism, dense residual network, and capsule network, can be trained with an integrated loss function as follows:

$$Loss_{Total} = \frac{1}{3}\left[ {Loss_{attention} + Loss_{denseResNet} + Loss_{CapsNet} } \right]$$

(36)
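The combined objective of Eqs. (33) to (36) can be sketched numerically as follows. The sketch assumes that the squared-error and margin terms enter as non-negative quantities to be minimized (their standard forms), that $|w_{l}|$ is supplied as the capsule output length, and that $\kappa = 0.5$; this value and all function names are illustrative assumptions.

```python
import numpy as np

def attention_loss(y, d, eps=1e-12):
    """Eq. (33): cross-entropy between labels y (N x P) and attention outputs d (P,)."""
    d = np.clip(d, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.sum(y * np.log(d) + (1.0 - y) * np.log(1.0 - d)))

def dense_resnet_loss(z, y):
    """Eq. (34): sum of squared errors (positive sign assumed)."""
    return float(np.sum(np.abs(z - y) ** 2))

def capsule_loss(u, w_norm, n_plus=0.9, n_minus=0.1, kappa=0.5):
    """Eq. (35): margin loss over categories (positive sign and kappa=0.5 assumed)."""
    present = u * np.maximum(0.0, n_plus - w_norm) ** 2
    absent = kappa * (1.0 - u) * np.maximum(0.0, w_norm - n_minus) ** 2
    return float(np.sum(present + absent))

def total_loss(l_att, l_dense, l_caps):
    """Eq. (36): equal-weight average of the three component losses."""
    return (l_att + l_dense + l_caps) / 3.0

# Hypothetical toy values: one sample, two features, two categories.
l_att = attention_loss(np.array([[1.0, 0.0]]), np.array([0.9, 0.1]))
l_dense = dense_resnet_loss(np.array([0.8]), np.array([1.0]))
l_caps = capsule_loss(np.array([1.0, 0.0]), np.array([0.95, 0.05]))
print(total_loss(l_att, l_dense, l_caps))
```

In a hyperparameter search, a value of this kind computed on validation data would be the fitness that the optimizer tries to minimize for each candidate configuration.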

Controller area network

The Controller Area Network (CAN) is a fast serial bus developed to provide a reliable, efficient, and highly cost-effective link between actuators and sensors. CAN aims to connect the electronic equipment of vehicles³⁰. These links facilitate the sharing of information and resources among distributed applications³¹. Each node can send messages at any time³². When two nodes access the bus at the same time, arbitration decides which one continues. The extensive CAN and CAN FD range covers all CAN functions and power modes with excellent EMC performance, high quality, and a multi-source industrial foundation.

Disruptive innovation in this field is driving researchers to study larger and more flexible automotive networks for the future³³. The CAN protocol defines a set of rules for receiving and transmitting messages within a system of electronic devices, in which messages are transferred from one device to another. There are two kinds of protocols, namely address-based and message-based. In an address-based protocol, each data packet includes the address of the device for which the message is intended. The signaling logic of a CAN bus is illustrated in Fig. 2.

In a message-based protocol, each message is identified by a predetermined identifier instead of an address. CAN is a message-based protocol, in which a message is a packet that carries data. A CAN message consists of 0 to 8 bytes of data arranged in a specific structure called a frame. The data carried in each byte is specified within the CAN protocol. Every node on the CAN bus receives each frame and decides whether to accept it based on the ID. If multiple nodes send messages simultaneously, the node with the highest priority gets access to the bus, while lower-priority nodes must wait until the bus becomes available.
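The priority rule described above can be illustrated with a small simulation of bitwise arbitration. It relies on the standard CAN convention that a dominant bit (0) overrides a recessive bit (1), so the numerically lowest identifier wins; the 11-bit width and the example identifiers are assumptions for illustration.

```python
def arbitrate(pending_ids, width=11):
    """Return the identifier that wins arbitration among simultaneously sent frames."""
    contenders = list(pending_ids)
    for bit in range(width - 1, -1, -1):          # most significant bit first
        bits = [(can_id >> bit) & 1 for can_id in contenders]
        bus_level = min(bits)                     # dominant 0 overrides recessive 1
        # Nodes that sent recessive while the bus is dominant back off.
        contenders = [c for c, b in zip(contenders, bits) if b == bus_level]
        if len(contenders) == 1:
            break
    return contenders[0]

# Hypothetical identifiers competing for the bus at the same time.
print(hex(arbitrate([0x2C0, 0x043, 0x316])))
print(hex(arbitrate([0x043, 0x000])))
```

The second call shows why identifier 0x000 is attractive for a DoS attack: it wins arbitration against every other identifier, so a node that floods the bus with it keeps legitimate frames waiting.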

Problem statement: proposed intrusion detection system

Triple-Attention Mechanism (TAN) employs three attention modules of max-pooling, average pooling, and classic attention, dynamically fused to weight and extract features and enhance the quality of high-dimensional hidden layer attributes. It computes the relevance scores for input features, generating optimized feature vectors through the combination of multiple attention structures. These enhanced features are then passed to a Dense Residual Network, which solves the vanishing gradient issue using skip connections and a double dense fusion method to combine early and deep residual features in order to allow deeper feature extraction. The Capsule Network ultimately maintains spatial information between objects by employing vector-based activation functions (e.g., squashing) and dynamic routing to output vectors that capture object attributes like size, position, and texture. Together, these components form a unified architecture that enhances the accuracy and robustness of intrusion detection in CAN networks.
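For readers unfamiliar with the squashing activation mentioned above, the following sketch shows the form commonly used in capsule networks, which rescales a vector so that its length lies in [0, 1) while keeping its direction. Whether the proposed model uses exactly this form is an assumption, and the function name is illustrative.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Scale vector s to length |s|^2 / (1 + |s|^2) without changing its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)   # eps avoids division by zero

v = squash(np.array([3.0, 4.0]))   # input length 5
print(np.linalg.norm(v))           # close to 25 / 26
```

Short vectors are shrunk toward zero and long vectors approach unit length, so the output length can be read as the confidence that the entity represented by the capsule is present.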

The current section presents the proposed method, based on adversarial deep learning, for intrusion recognition on the CAN bus within in-vehicle networks. An overview of this technique, which consists of three main parts, is shown in Fig. 3. The work proceeds in three phases: first the feature descriptor, then the discriminating classifier, and finally adversarial learning. These three functions are realized in the form of a CNN integrated with adversarial learning.

In this intrusion recognition system, an optimized triple-attention mechanism integrated with adversarial training is employed. The intrusion identification method is a supervised learning procedure carried out in three stages. In the first stage, the optimized triple-attention mechanism acts as a feature descriptor; in the second stage, a discriminating classifier is applied; and in the third stage, adversarial learning is performed.

First stage: the feature descriptor using the CNN model

In the first step, feature description is performed with the optimized TAN model, in which the convolutional blocks of the CNN model are responsible for describing the features. This part maps the data into a D-dimensional space, as shown below:

$$G_{f} \left( {x; \theta_{f} } \right):x \to R^{D} = G_{f}$$

(37)

Here, $x$ denotes the input data, $\theta_{f}$ the parameters of the feature descriptor, and $G_{f}$ the feature-generating function. The feature descriptor takes the data as input and outputs feature vectors³⁴. The softmax function is used here, as shown below.

$$O_{j} = \left[ {\begin{array}{*{20}c} {P\left( {y = 1|x;\theta } \right)} \\ {P\left( {y = 2|x;\theta } \right)} \\ \vdots \\ {P\left( {y = k|x;\theta } \right)} \\ \end{array} } \right] = \frac{1}{{\mathop \sum \nolimits_{j = 1}^{k} {\text{exp}}\left( {\theta^{j} x} \right)}}\left[ {\begin{array}{*{20}c} {{\text{exp}}\left( {\theta^{1} x} \right)} \\ {{\text{exp}}\left( {\theta^{2} x} \right)} \\ \vdots \\ {{\text{exp}}\left( {\theta^{k} x} \right)} \\ \end{array} } \right]$$

(38)

where $k$ is the number of categories, and $\theta^{j}$ denotes the parameters of the classification layer for category $j$.
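A minimal numerical sketch of the softmax output in Eq. (38) is given below. It assumes that the class parameters $\theta^{j}$ are stacked as the rows of a matrix `theta` and that $x$ is a feature vector; subtracting the maximum logit is a common stabilization step and is not part of the equation itself.

```python
import numpy as np

def softmax_probs(theta, x):
    """Return P(y = j | x; theta) for j = 1..k, with theta of shape (k, D) and x of shape (D,)."""
    logits = theta @ x
    logits = logits - logits.max()   # stabilization; leaves the result unchanged
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum()

theta = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # hypothetical parameters, k = 3
x = np.array([2.0, 1.0])                                 # hypothetical feature vector
probs = softmax_probs(theta, x)
print(probs, probs.sum())
```

The predicted category is the index of the largest probability, which here is the first one because its logit $\theta^{1} x$ is the largest.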

Second stage: discriminating classification

In the second stage, the discriminant classifier is applied to the features obtained in the previous step. The discriminant classifier learns which features of the input are useful for distinguishing between the possible categories. Mathematically, it directly computes the posterior probability $P(y|x)$ or learns a direct mapping from the input $x$ to the label $y$.

In most classification tasks, a discriminative classifier is often more accurate. As in the GAN model, the discriminator distinguishes between real and fake data. In this step, a discriminating classifier with parameters ${\theta }_{d}$ is designed as an opponent, and both data subsets are fed to it from the feature descriptor. The output features are thus distinguished by the discriminating classifier.

$$G_{d} = G_{d} \left( {G_{f} \left( x \right):\theta_{d} } \right)$$

(39)

where $x$ denotes the input and $G$ represents the generating function.

Third stage: adversarial learning

In the final phase, adversarial learning is performed on the outputs of the discriminating classifier from the second stage, which serve as the input of this phase³⁵,³⁶,³⁷. To do this, the GAN model uses these data as its dataset, on which the generative and discriminator models are trained. Subsequently, the classifier formed by the following fully connected layers of the CNN is defined.

$$G_{y} = G_{y} \left( {G_{f} \left( x \right);\theta_{y} } \right):R^{D} \to R^{L}$$

(40)

where $L$ represents the number of categories. The softmax function is used in this step. The binary cross-entropy (BCE) of the obtained outputs is given below.

$${\mathcal{L}}\left( {G_{d} \left( {G_{f} \left( {x_{i} } \right)} \right),d_{i} } \right) = d_{i} \log \frac{1}{{G_{d} \left( {G_{f} \left( {x_{i} } \right)} \right)}} + \left( {1 - d_{i} } \right)\log \frac{1}{{1 - G_{d} \left( {G_{f} \left( {x_{i} } \right)} \right)}}$$

(41)

Here, $G$ represents the generating function, ${x}_{i}$ is the input value, and ${d}_{i}$ is the binary variable for ${x}_{i}$. Figure 4 shows the GAN method for classification.
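The binary cross-entropy of Eq. (41) can be checked numerically with the short sketch below. Here `p` stands for the discriminator output $G_{d}(G_{f}(x_{i}))$ and `d` for the binary variable $d_{i}$; the clipping constant is an added safeguard against taking the logarithm of zero and is not part of the equation.

```python
import numpy as np

def bce(p, d, eps=1e-12):
    """Eq. (41): d * log(1/p) + (1 - d) * log(1/(1 - p)), element-wise."""
    p = np.clip(p, eps, 1.0 - eps)
    return d * np.log(1.0 / p) + (1.0 - d) * np.log(1.0 / (1.0 - p))

# Hypothetical discriminator outputs for two samples with d = 1.
print(bce(np.array([0.9, 0.1]), np.array([1.0, 1.0])))
```

A confident correct output (0.9 for a sample with $d_{i} = 1$) gives a small loss, while a confident wrong output (0.1) gives a large one, which is the signal that drives the adversarial training.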

The third phase relies on adversarial learning, for which the GAN algorithm is applied. The task is supervised and uses labels. During this stage, random data are first given to the generative model to generate fake data. The generated data are then given to the discriminator model together with the real labeled data, and after the generative and discriminative steps, intrusion recognition is performed.

Results and discussion

This section presents the outcomes of the experimental tests of the proposed intrusion recognition system. First, the dataset used to evaluate the proposed method is introduced, followed by the evaluation criteria. Then the designed Novel Greylag Goose Optimization algorithm is analyzed to show its superiority over other algorithms that can be used for intrusion detection. Afterward, the performance of the proposed method is assessed in various experiments. Finally, the findings obtained with this approach are compared with previous methods.

Dataset

Although the dataset used in this research was originally developed by³⁸, it was adapted to our experiment by recording CAN traffic with two custom Raspberry Pi devices, one for capturing normal network traffic and the other for injecting malicious messages. This adaptation ensured that the dataset was representative of a real-world attack in our experimental setup.

The dataset used for this analysis contains 6,683,777 records in total, consisting of normal CAN traffic and maliciously injected messages. It covers four types of attacks, namely DoS (3,078,250 records), Fuzzy (3,347,013 records), Drive Gear Spoofing (2,766,522 records), and RPM Spoofing (2,290,185 records), as well as normal traffic (5,086,498 records). Each subset was constructed by recording CAN traffic during 300 message injection intrusions, each lasting 3 to 5 s, for a total of 30 to 40 min per attack type. Injected messages are flagged “T” and normal messages “R”, enabling complete testing of the proposed intrusion detection system.

Two custom Raspberry Pi devices were used to create this dataset: one to record network traffic and the other to inject fake messages. They were connected to the in-vehicle network through the OBD-II port located under the steering wheel of the vehicle, and the messages were transmitted to the actual ECU nodes on the CAN bus.

Through the OBD-II port, the custom nodes were able to send messages to and receive messages from real ECU nodes on the CAN bus. The dataset has four parts: DoS attack, Fuzzy attack, drive gear spoofing attack, and RPM spoofing attack. Table 2 compares the different types of attacks and gives some analysis based on various models.

Table 2 The comparison of the different types of attacks and some analysis based on various models.

Each dataset was constructed by recording CAN traffic during message injection. Each contains 300 message injection intrusions, each lasting 3 to 5 s, for a total of 30 to 40 min of CAN traffic. During construction of the dataset, the vehicle was parked with the engine running. In normal operation, there are 26 distinct CAN identifiers on the CAN bus.

The characteristics of the attacks are as follows:

1. DoS attack: injection of messages with CAN ID “0000” every 3.0 ms. “0000” is the most dominant (highest-priority) CAN ID.
2. Fuzzy attack: injection of messages with completely random CAN ID and DATA values every 5.0 ms.
3. Drive gear spoofing attack and RPM spoofing attack: injection of specific CAN messages related to gear/RPM information every 1 ms.

Each record has the following attributes: Timestamp, CAN ID, DLC, DATA[0], DATA[1], DATA[2], DATA[3], DATA[4], DATA[5], DATA[6], DATA[7], Flag

1. Timestamp: the recorded time
2. CAN ID: ID of the CAN message in HEX, for example 043
3. DLC: number of data bytes, from 0 to 8
4. DATA[0–7]: data values (bytes)
5. Flag: T or R, where T indicates an injected message and R a normal message
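The record layout listed above can be parsed as in the following sketch. It assumes comma-separated fields with hexadecimal CAN ID and data bytes, and that only DLC data bytes are present before the flag; the sample lines and the names `CanRecord` and `parse_record` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass

@dataclass
class CanRecord:
    timestamp: float
    can_id: int
    dlc: int
    data: bytes
    injected: bool   # True for flag "T", False for flag "R"

def parse_record(line: str) -> CanRecord:
    """Parse 'Timestamp, CAN ID, DLC, DATA[0..DLC-1], Flag' into a CanRecord."""
    fields = [field.strip() for field in line.split(",")]
    dlc = int(fields[2])
    if not 0 <= dlc <= 8:
        raise ValueError(f"DLC out of range: {dlc}")
    flag = fields[3 + dlc]
    if flag not in ("T", "R"):
        raise ValueError(f"unexpected flag: {flag!r}")
    return CanRecord(
        timestamp=float(fields[0]),
        can_id=int(fields[1], 16),
        dlc=dlc,
        data=bytes(int(byte, 16) for byte in fields[3:3 + dlc]),
        injected=(flag == "T"),
    )

# Hypothetical sample lines, not taken from the dataset.
print(parse_record("1478198376.389427,0316,8,05,21,68,09,21,21,00,6f,R"))
print(parse_record("1478198376.400000,0000,2,00,00,T"))
```

Reading the flag at position `3 + dlc` rather than at a fixed column keeps the parser correct for frames that carry fewer than eight data bytes.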

Figure 5 shows the number of normal and injected messages in the dataset.

Efficiency verification of the NGGO algorithm

A set of experiments was conducted on conventional benchmark functions to validate the proposed algorithm. The test functions were taken from the “CEC-BC-2017 test suite”. Because of their specific properties such as continuity, differentiability, and convexity, these functions are widely used to test optimization algorithms. They comprise unimodal, multimodal, and mixed functions, providing a full span of tests to judge the performance. The simulations were run on a desktop with an i7 processor and 16 GB of RAM under Windows 11.

The algorithm was implemented in MATLAB R2019b. It was compared with several state-of-the-art methods, consisting of renowned meta-heuristic algorithms such as the Biogeography-Based Optimizer (BBO)³⁹, Emperor Penguin Optimizer (EPO)⁴⁰, Spotted Hyena Optimizer (SHO)⁴¹, Wildebeest Herd Optimization (WHO)⁴², and World Cup Optimization (WCO)⁴³. Table 3 lists the parameter settings of the comparative algorithms.

Table 3 Setting parameter value for the comparative algorithms.

Each algorithm was run 15 times on all test functions to obtain reliable results. The reported statistics include the Mean, Best, and StD (standard deviation) of the solution quality, as well as the computation time. Statistical methods are then used to analyze the results and to determine whether the proposed algorithm significantly outperforms the other methods; if so, this would be strong evidence of both the efficiency of the modifications and the dominance of the proposed optimizer. Table 4 compares the results obtained with NGGO and the other investigated optimizers.
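The evaluation protocol (15 independent runs, then Mean, Best, and StD) can be sketched as follows. A plain random search on the sphere function stands in for the optimizers and the CEC test functions, which are not reproduced here; the search bounds, the iteration count, and all function names are assumptions, and for a minimization problem Best is taken as the smallest final value.

```python
import numpy as np

def sphere(x):
    """Simple unimodal test function with minimum 0 at the origin (stand-in benchmark)."""
    return float(np.sum(x ** 2))

def random_search(objective, dim, iters, rng, low=-100.0, high=100.0):
    """Placeholder optimizer: best objective value among uniformly sampled points."""
    best = float("inf")
    for _ in range(iters):
        best = min(best, objective(rng.uniform(low, high, size=dim)))
    return best

def summarize_runs(run_once, runs=15, seed=0):
    """Repeat an optimizer with different seeds and report Mean, Best, and StD."""
    results = np.array([run_once(np.random.default_rng(seed + r)) for r in range(runs)])
    return {
        "mean": float(results.mean()),
        "best": float(results.min()),
        "std": float(results.std(ddof=1)),   # sample standard deviation
    }

stats = summarize_runs(lambda rng: random_search(sphere, dim=10, iters=200, rng=rng))
print(stats)
```

Using a distinct seed per run keeps the runs independent while making the whole summary reproducible.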

Table 4 A comparative analysis of the findings accomplished via employing the NGGO and other investigated optimizers in the present work.

The experiments clearly showed the advantage of the NGGO algorithm over the other algorithms. NGGO produced the best solution, or a solution very close to the best, across all test functions, indicating its efficiency in converging to optimal or near-optimal solutions. The algorithm was also stable and robust, as shown by standard deviations of solution quality that were lower than those of the other algorithms, which indicates less dispersion and more reliable performance⁴⁴. Similar behavior was observed on the multimodal and mixed functions, which are the most complex problems because they are characterized by multiple local optima; here, NGGO also stood out, efficiently traversing the search space to find high-quality solutions. This versatility shows that NGGO is not limited to certain types of problems but can handle a range of optimization problems.

Overall, the NGGO algorithm outperforms the compared optimization techniques because of its ability to find the optimal solution repeatedly and to deliver stable results. The algorithm generates superior-quality results, as evidenced by its better performance compared with well-known meta-heuristic algorithms across unimodal, multimodal, and mixed functions. Such consistent and reliable performance, even in computationally expensive situations, lends compelling support to the efficacy of the changes made in the NGGO algorithm.

Attack detection performance

To assess the effectiveness of the proposed Triple-attention Mechanism-based Intrusion Detection System (TAN-IDS) optimized with the Novel Greylag Goose Optimization (NGGO) algorithm, experiments were carried out on detecting attacks of different kinds: the DoS attack, Fuzzy attack, drive gear spoofing attack, and RPM spoofing attack. The results were then compared with other cutting-edge methods, including support vector machine (SVM)⁴⁵, random forest (RF)⁴⁶, k-nearest neighbors (KNN)⁴⁷, long short-term memory (LSTM)⁴⁸, and convolutional neural network (CNN)⁴⁹. Table 5 presents the comparison of attack detection performance.

Table 5 Attack detection performance comparison.

As shown by the results in Table 5, the TAN-IDS proposed in this article outperformed the other classification methods in identifying different types of cyber-attacks on CAN communication. Compared to conventional machine learning approaches (Random Forest, KNN, and SVM) and deep learning models (LSTM, CNN), the proposed method achieved the best detection rates for Fuzzy, DoS, RPM spoofing, and drive gear spoofing attacks.

The TAN-IDS showed strong capabilities in detecting these types of attacks thanks to the Triple-attention Mechanism, which dynamically weights the effective features, and the NGGO algorithm, which determines the optimal hyperparameters of the network. Additionally, the dense residual network and capsule network elements play a crucial role in preventing vanishing gradients and maintaining spatial relationships between input variables. Thus, the detection rates demonstrate that the proposed TAN-IDS can accurately detect and successfully mitigate cyber-attacks on the in-vehicle CAN network, proving to be a significant step forward in automotive security.

Evaluation criteria

To check the effectiveness of the proposed system in intrusion identification, common evaluation criteria for deep learning models were used: area under the curve (AUC), accuracy (ACC), precision, specificity, Matthews correlation coefficient (MCC), F1-score, and sensitivity. The following equations define the measurement indicators used in this study:

$$Precision = \frac{TP}{{TP + FP}} \times 100$$

(42)

$$Sensitivity = \frac{TP}{{TP + FN}} \times 100$$

(43)

$$Specificity = \frac{TN}{{TN + FP}} \times 100$$

(44)

$$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}} \times 100$$

(45)

$$F1 - score = 2 \times \frac{Precision \times Sensitivity}{{Precision + Sensitivity}}$$

(46)

$$MCC = \frac{TP \times TN - FP \times FN}{{\sqrt {\left( {TP + FP} \right) \times \left( {TP + FN} \right) \times \left( {TN + FP} \right) \times \left( {TN + FN} \right)} }}$$

(47)

where TP, TN, FP, and FN describe the true positive, true negative, false positive, and false negative, respectively.
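As a worked illustration, the metrics above can be computed from confusion-matrix counts as in the sketch below. The values are returned as fractions (multiply by 100 for percentages), the MCC uses the standard numerator $TP \times TN - FP \times FN$, and the counts in the example are hypothetical rather than taken from the experiments.

```python
import math

def ids_metrics(tp, tn, fp, fn):
    """Return precision, sensitivity, specificity, accuracy, F1, and MCC as fractions."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # convention when undefined
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only.
for name, value in ids_metrics(tp=90, tn=95, fp=5, fn=10).items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it remains informative when attack and normal messages are strongly imbalanced.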

Performance analysis of different attack types

To analyze performance thoroughly, we also computed the performance metrics separately for each attack type (DoS attack, Fuzzy attack, drive gear spoofing attack, and RPM spoofing attack). The metrics are Precision (PR), Recall (RE), F1-Score (F1), Specificity (SP), Accuracy (AC), AUC-ROC, and Matthews Correlation Coefficient (MCC), compared between the proposed TAN-IDS method and other state-of-the-art methods. Table 6 presents the performance analysis for the different attack types.

Table 6 Performance analysis for different attack types.

As can be observed, the results give a detailed view of the performance of the proposed TAN-IDS for the different attacks over the selected metrics. The proposed approach consistently outperforms traditional machine learning models (SVM, Random Forest, KNN) and deep learning models (LSTM, CNN) in detecting all forms of attacks.

For DoS attacks, TAN-IDS achieves a Precision of 96.3%, Recall of 96.1%, F1-Score of 96.2%, Specificity of 97.2%, Accuracy of 96.3%, AUC-ROC of 0.97, and MCC of 0.92. These metrics indicate how well the model detects DoS attacks and acts on them with a minimum number of false positives and false negatives. The TAN-IDS outperforms the other methods by a considerable margin on all the metrics employed, emphasizing its robustness against this type of attack.

For Fuzzy attacks, the TAN-IDS achieves a Precision of 95.8%, Recall of 95.4%, F1-Score of 95.6%, Specificity of 96.5%, Accuracy of 95.8%, AUC-ROC of 0.96, and MCC of 0.91. This outcome demonstrates the efficiency of the model in recognizing Fuzzy attacks, a typical injection method that introduces random values for the CAN ID and data. It illustrates the effectiveness of TAN-IDS relative to other methodologies, making it a viable approach for economically detecting these kinds of attacks.

For the drive gear spoofing attack, TAN-IDS achieves a Precision of 96.5%, Recall of 96.2%, F1-Score of 96.3%, Specificity of 97.3%, Accuracy of 96.5%, AUC-ROC of 0.97, and MCC of 0.92. These high metrics demonstrate the ability of the model to detect and filter out fake messages related to the drive gear. The model outperforms several alternative methods and can be used to add security measures to devices based on vehicular networks.

For the RPM spoofing attack, the TAN-IDS achieves a Precision of 96.2%, Recall of 96.0%, F1-Score of 96.1%, Specificity of 97.1%, Accuracy of 96.2%, AUC-ROC of 0.97, and MCC of 0.92. The results reveal the model’s skill at identifying counterfeit messages regarding RPM details. Our experiments also show that the TAN-IDS outperforms alternative approaches and is an effective tool for preserving vehicular network integrity.

Overall, the use of multiple metrics, including Precision, Recall, F1-Score, Specificity, Accuracy, AUC-ROC, and MCC, across all experiments demonstrates the ability of the TAN-IDS to detect all the considered forms of attacks on the CAN network. All metrics attained high values, validating the model’s robustness, accuracy, and reliability, thus marking an advancement in automotive cybersecurity. The results are also shown as profiles for further clarification (Fig. 6).

The proposed TAN-IDS method consistently outperforms all other methods for all attack types, as shown in the tables above. TAN-IDS leads the other methods in precision, recall, F1-Score, and AUC-ROC. As a result, TAN-IDS not only detects attacks with precision but also sustains low false positive and false negative rates, as highlighted by the high values of these metrics. In comparison:

Precision: For all types of attack, the TAN-IDS has the highest precision, which shows its ability to correctly classify positive instances and to produce fewer false positives.
Recall: For all attack types, the recall values of TAN-IDS are higher than those of the other methods, confirming the effectiveness of TAN-IDS in identifying most of the actual positive instances.
F1-Score: The F1-Scores of TAN-IDS are relatively high, suggesting a balanced tradeoff between precision and recall.
AUC-ROC: The AUC-ROC score of TAN-IDS is significantly higher than that of the other methods, confirming that the proposed method separates attack and non-attack instances very well.

These comparative results summarize the superiority of the proposed TAN-IDS method. They show that TAN-IDS consistently achieves high values across multiple metrics, demonstrating its effectiveness and reliability against various forms of cyber-attacks on the CAN network. The findings indicate that the TAN-IDS represents considerable progress in the area of automotive cybersecurity.

Confusion matrix and ROC analysis

To better exhibit the performance of the proposed Triple-Attention Mechanism-based Intrusion Detection System (TAN-IDS), the confusion matrix and ROC curves are used. The confusion matrix summarizes the classification results of the model in terms of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). Figure 7 shows the confusion matrix of the model.

Across the different categories of attacks, such as Fuzzy, drive gear spoofing, and RPM spoofing, the confusion matrix points to the robustness of the TAN-IDS framework, particularly in comparison with conventional machine learning models such as SVM, Random Forest, and KNN.

The Receiver Operating Characteristic (ROC) plot illustrates the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) for various threshold levels. The Area Under the Curve (AUC) provides a measure to evaluate the model’s performance. Figure 8 shows the ROC curve.

The ROC curve complements the confusion matrix by indicating the trade-off between True Positive Rate (TPR) and False Positive Rate (FPR) at various threshold levels. In the case of the TAN-IDS, the ROC curve indicates an improved separation of attack and non-attack instances, as revealed by the high Area Under the Curve (AUC) values of 0.96 to 0.97 across the categories of attacks. This indicates that the model possesses very good separability even in complex cases, such as distinguishing genuine RPM signals from spoofed messages. The ROC curve also gives a graphical representation of the robustness of the model, showing consistent performance against diverse cyber-attacks. Compared to other advanced methods like LSTM and CNN, the TAN-IDS consistently has a greater AUC value, supporting its efficiency in preventing phase hacking attacks and protecting in-vehicle networks more securely. Together, the confusion matrix and ROC curve provide an overall analysis of the detection system.
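The way an ROC curve and its AUC are obtained from classifier scores can be sketched as follows. The sketch sweeps the decision threshold over the sorted scores and integrates with the trapezoidal rule; it assumes that the scores are distinct and that both classes are present, and the scores and labels in the example are hypothetical.

```python
import numpy as np

def roc_auc(scores, labels):
    """Return (fpr, tpr, auc) for binary labels (1 = attack) and real-valued scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    order = np.argsort(-scores)                    # highest score first
    sorted_labels = labels[order]
    tps = np.cumsum(sorted_labels)                 # true positives as threshold lowers
    fps = np.cumsum(1.0 - sorted_labels)           # false positives as threshold lowers
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))  # trapezoidal rule
    return fpr, tpr, auc

# Hypothetical scores: one normal message is ranked above one attack message.
fpr, tpr, auc = roc_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(auc)
```

The AUC equals the fraction of attack/normal pairs in which the attack message receives the higher score, which is why a value near 1 indicates good separability.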

Discussions

The proposed Triple-attention Mechanism-based Intrusion Detection System (TAN-IDS), optimized by the Novel Greylag Goose Optimization (NGGO) algorithm, detects cyber-attacks on CAN networks more effectively than conventional machine learning and deep learning-based approaches, as indicated by its better performance on several metrics.

The TAN-IDS achieved very high precision, recall, F1-score, specificity, accuracy, AUC-ROC, and MCC values for all types of attacks, i.e., DoS, Fuzzy, drive gear spoofing, and RPM spoofing attacks, which demonstrates its capability to detect malicious activity effectively with few false positives and false negatives. This performance can be attributed to the triple-attention dynamic weighting of effective features, the hyperparameter tuning by NGGO, and the dense residual and capsule networks, which handle problems like vanishing gradients and maintain spatial information between input variables. Notably, the precision and recall values of the system highlight its performance in correctly classifying positive samples and identifying most genuine threats, whereas the high AUC-ROC values confirm its separability between attack and non-attack samples.

These findings not only support the computational feasibility and reliability of the TAN-IDS solution but also indicate its capacity to enhance automotive cybersecurity by preventing phase hacking and other sophisticated attack modes. Further, the consistent outperformance of established algorithms such as SVM, Random Forest, KNN, LSTM, and CNN supports the effectiveness of the proposed solution, making it an important step towards protecting in-vehicle networks from the spread of cyber-attacks.

Conclusion

With the rapid development of automotive electronics and the growing demand for in-vehicle connectivity, in-vehicle networks and embedded systems such as Controller Area Networks (CANs) represent a paradigm shift in modern automobiles. However, this technological development also introduces key security weaknesses. CAN networks, the most important channel for in-vehicle communication, lack inherent security mechanisms such as encryption and authentication, which leaves them defenseless against cyber-attacks. These vulnerabilities could have critical consequences, such as unauthorized control of vehicle operations, data breaches, or even safety risks to passengers. Given the increasing connectivity and automation of vehicles, there is a growing motivation to develop effective intrusion detection systems (IDS) to secure CAN networks. In this work, CAN network security was investigated as a significant challenge, and a novel intrusion detection framework based on the Triple-Attention Mechanism (TAN) was proposed. The TAN framework consists of three stages: feature extraction, feature classification, and adversarial learning. By breaking the detection process into multiple levels, the system can robustly detect different types of cyber-attacks, such as DoS attacks and spoofing attacks on the drive gear and RPM signals. A novel version of the Greylag Goose Optimization algorithm was used to tune the hyperparameters during training, with the aim of reaching optimal accuracy in the least possible time and obtaining the best version of the proposed framework. The proposed method was validated on an open-source dataset in which real-world CAN network traffic was recorded during message injection attacks. The findings showed that the TAN-based intrusion detection system outperformed traditional machine learning techniques in terms of false negative rate and error rate.

In particular, the system demonstrated strong performance in detecting DoS and spoofing threats, which diminishes the risk of phase hacking and strengthens the security of the CAN infrastructure. This research adds to the literature on automotive cybersecurity by presenting a strong yet computationally feasible way to perform intrusion detection for CAN networks. Combining the Triple-Attention Mechanism with adversarial learning addresses limitations of existing IDS frameworks. In addition, the novel variant of the Greylag Goose Optimization algorithm underlines the importance of hyperparameter optimization and how it can boost the performance of machine learning models. Future work will examine the versatility of this framework and its applicability to broader in-vehicle networks, and may integrate emerging security measures such as encryption and authentication to further enhance the resilience of automotive systems against malicious attacks.

Data availability

The data are available from the following website: https://ocslab.hksecurity.net/Datasets/CAN-intrusion-dataset.

Author information

Authors and Affiliations

Xijing University, Xi’an, 710123, Shaanxi, China
Hongwei Yang
Department of Computer, Ardabil Branch, Islamic Azad University, Ardabil, Iran
Mehdi Effatparvar
College of Technical Engineering, The Islamic University, Najaf, Iran
Mehdi Effatparvar

Contributions

H.Y. and M.E. wrote the main manuscript text and M.E. prepared figures. All authors reviewed the manuscript.

Corresponding author

Correspondence to Mehdi Effatparvar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Yang, H., Effatparvar, M. A deep learning based intrusion detection system for CAN vehicle based on combination of triple attention mechanism and GGO algorithm. Sci Rep 15, 19462 (2025). https://doi.org/10.1038/s41598-025-04720-y

Received: 07 March 2025
Accepted: 28 May 2025
Published: 03 June 2025
DOI: https://doi.org/10.1038/s41598-025-04720-y
