A novel detection mechanism against malicious attacks by using spatio and temporal topology information

Wu, Qingfeng; Zhuang, Shufa; Wang, Xinyu

doi:10.1038/s41598-025-93957-8

Article
Open access
Published: 22 March 2025

Qingfeng Wu¹,
Shufa Zhuang¹ &
Xinyu Wang^2,3

Scientific Reports volume 15, Article number: 9978 (2025)

Abstract

This paper addresses the detection of false data attacks (FDAs) in power systems. While the integration of multi-layered cyber-physical networks improves power system operation, it also poses serious security risks. In particular, FDAs can fool the Chi-square detector-based detection mechanism by manipulating communication-layer data. For this reason, this paper proposes a novel spatial–temporal feature-based detection framework against FDAs. The proposed framework consists of two steps: Kepler Optimization Algorithm (KOA)-convolutional neural network (CNN)-based spatial feature extraction, and bidirectional gated recurrent unit (BiGRU)-based temporal feature extraction. To enhance spatial feature extraction in the CNN, KOA is introduced to optimize CNN parameters such as the learning rate and the convolution kernel size. Unlike the traditional GRU, the BiGRU model extracts both forward and backward temporal features. In addition, an attention mechanism is introduced to focus on the important information in the feature data. Through this bilevel extraction of spatio-temporal features, the proposed framework can distinguish normal from abnormal data in the power system. Finally, simulation cases on the IEEE 14-bus and 118-bus grid systems verify the effectiveness of the proposed KOA-CNN-BiGRU-Attention framework. Compared with existing detection models such as GCN and GGNN-GAT, the proposed model improves accuracy, precision, F1-score, and recall.

Introduction

As the key to the new energy transformation, the security of the power grid directly affects people's livelihood and social stability. Due to the deep integration of cyber and physical layers in power systems, the security of new power systems depends on both the physical and the cyber systems^1,2. However, traditional security controls designed for either the cyber or the physical side struggle to cope with advanced persistent threats from cyber-physical space³. Research has shown that the malicious attacks faced by power systems are increasing, especially the threat of false data attacks^4,5. At present, malicious attacks in power systems can be classified into two categories: denial-of-service (DoS) attacks and false data attacks (FDAs)⁶. As shown in Fig. 1, due to the openness of the power system, hackers can inject malicious attacks such as DoS attacks and FDAs. DoS attacks aim to disrupt the communication layer of the power system, causing it to go down⁷. It is worth noting that DoS attacks can be detected by using Kalman filter (KF) state estimation. In contrast, FDAs can bypass this detection mechanism by injecting a bank of false data^8,9. With incomplete network information, Jin et al. constructed an FDI attack targeting AC state estimation¹⁰. For instance, the power system in Taiwan suffered a hacking attack in 2023 that caused very large losses. Without a timely response to such cyber-physical attacks, an attacker can wreak havoc on the power system. In this regard, the rapid and accurate detection of false data attacks is a critical issue for cyber-physical power systems.

Many research works have been devoted to detecting injected FDAs. The existing detection methods can be classified into two categories: model-based techniques and learning-based techniques. Model-based detection techniques are based on static power data. By constructing accurate mathematical models, model-based techniques can detect and identify the injected attacks^{11,12,13,14,15,16}. In¹¹, a secure observer-based attack detection approach was developed. The method considered the influence of FDAs on the system and designed a security technique for detecting and preventing FDAs by using observer residuals. In¹², an attack detection and reconstruction approach was proposed. The state residual generated by a robust sliding mode observer was used to detect and reconstruct dynamic load altering attacks (DLAAs). In addition, to reduce the influence of the precomputed threshold, an adaptive observer-based detection approach using an adaptive threshold was developed¹³. Motivated by the super-twisting sliding mode algorithm, an effective attack detection and reconstruction framework was constructed¹⁴. In contrast to the above centralized detection methods, Chowdhury et al. developed a decentralized estimator to detect and identify distributed FDAs in power systems¹⁵. In¹⁶, a novel detection method using an unknown interval observer was proposed. This approach uses interval residuals to replace the traditional precomputed threshold. In sum, model-based detection techniques can enhance detection performance against FDAs to a certain extent. However, the accuracy of the power model and the design of the detection thresholds are key factors limiting model-based detection techniques.

With the rapid development of AI technology, learning-based detection techniques are gradually being applied to detect anomalous data in power grids. Unlike model-based detection techniques, learning-based detection techniques are not limited by the model accuracy or the design of detection thresholds. In¹⁷, an improved random forest-based detection algorithm against FDAs and random attacks in power grids was proposed. Based on an established linear power model, a data-driven low-rank detection approach was developed⁸. Taking unknown system parameters into account, a detection approach using the observable Markov decision process was proposed¹⁸. In¹⁹, a support vector machine-based detection model was developed to identify the injected FDAs. However, the detection accuracy of the above learning-based methods can be affected by the selection of the detection threshold. In²⁰, a machine learning-based detection technique using a semi-supervised generative adversarial network was proposed to detect and locate the injected FDAs. In addition, the deep convolutional neural network (CNN) has gradually become an effective method for feature extraction. In²¹, a group-fusion one-dimensional CNN was proposed to improve the model recognition accuracy, and a multi-scale group-fusion one-dimensional CNN for high-resolution range profile (HRRP) target recognition was presented in²². To reduce the computational complexity of vanilla CNNs, a lightweight depth-wise separable fusion CNN for ballistic target HRRP recognition was developed²³. A graph convolutional network framework was constructed to detect FDAs, which considered the impact of the attack on the spatial properties of the grid system structure²⁴. However, the above learning-based detection techniques do not consider attack detection from a spatio-temporal perspective. By ignoring the topological relationship among the various instruments within a grid deployment, these techniques frequently extract spatial information incompletely. Furthermore, a recent study²⁵ indicates that load variations in smart grids have distinct spatial-temporal characteristics.

Motivated by the above problems, this paper explores a spatial-temporal detection framework against FDAs in power grids. As shown in Fig. 2, the proposed detection framework consists of a Kepler Optimization Algorithm (KOA)-convolutional neural network (CNN), a bidirectional gated recurrent unit (BiGRU) and an attention mechanism. Specifically, the CNN is mainly used to extract local and spatial features from data. Through convolutional and pooling layers, the CNN can extract local features of power network data (such as voltage, current and power) from time series data. The BiGRU is an enhanced recurrent neural network (RNN) that can simultaneously capture the forward and backward dependencies of time series data. Through its gate control mechanisms (reset and update gates), the BiGRU can capture long-term dependencies in network data, such as the correlation between conditions before and after an abnormal event. The attention mechanism is used to dynamically assign weights, highlighting important features and suppressing irrelevant ones. Power system data can contain a large amount of noise or redundant information, and the attention mechanism helps the model focus on the key features of anomalous data, improving detection accuracy. In addition, KOA is an optimization algorithm used to adjust the hyperparameters of a model, such as the learning rate, the convolution kernel size and the number of GRU units. Grid data typically has high dimensionality and complexity, and KOA can help find the most appropriate model parameters for the data features, thereby improving detection accuracy. The main contributions can be summarized as follows.

1. A spatial–temporal detection framework against FDAs is proposed, which consists of a spatial feature extraction model using KOA-CNN and a temporal feature extraction model using BiGRU. KOA is introduced to enhance the feature extraction performance of the CNN, and an attention mechanism is further introduced to improve the detection performance of the constructed model.
2. Simulation tests on the IEEE 14-bus and 118-bus grid systems are provided. Compared with the existing detection methods, the proposed model improves the detection indicators (accuracy, missed alarm, precision, $F_{1}$-score, and recall); in particular, accuracy, precision, recall and $F_{1}$-score improve by at least 1.49%, 1.15%, 11.24%, and 4.15%, respectively. In addition, simulation results verify that the proposed detection model is robust against changes in attack intensity.

The remainder of this paper is organized as follows. Section "Background" presents the covert feature of FDAs. The proposed spatial-temporal detection framework against FDAs is provided in Section "The spatial-temporal detection framework against FDAs". Simulation cases are tested in Section "Case studies". Section "Conclusions and discussion" presents the conclusions and future work.

Background

In general, a phasor measurement unit (PMU) can acquire the operating status of the power system in real time, such as voltages and phase angles. Based on the collected data, supervisory control and data acquisition (SCADA) can perform data anomaly analysis and raise anomaly alerts by using a state estimator. Therefore, accurate state estimation is critical to power system security. According to the work in²⁶, a suitably designed FDA can fool the detection mechanism based on the chi-square detector. Based on this, the power grid and FDA models are presented in this section.

Power system model

In this section, an AC power model is presented. Considering voltages, phase angles, etc., the AC power model can be described as follows²⁷.

$$ z = H\left( x \right) + v, $$

(1)

where $z = \left[ {z_{1} , \ldots ,z_{m} } \right]$ is the measurement data, $x = \left[ {x_{1} , \ldots ,x_{m} } \right]$ is the system state, $v$ is the measurement noise, and $H = \left[ {\begin{array}{*{20}c} {H_{1} ,} & { \cdots ,} & {H_{m} } \\ \end{array} } \right]$ denotes the Jacobian matrix for power system estimation, which describes the relationship between the measurement data and the system state as follows.

$$ P_{n} = V_{n} \sum\limits_{k = 1}^{m} {V_{k} \left( {G_{nk} \cos \theta_{nk} + B_{nk} \sin \theta_{nk} } \right)} $$

(2)

$$ Q_{n} = V_{n} \sum\limits_{k = 1}^{m} {V_{k} \left( {G_{nk} \sin \theta_{nk} - B_{nk} \cos \theta_{nk} } \right)} $$

(3)

$$ P_{nk} = - V_{n}^{2} G_{nk} + V_{n} V_{k} \left( {G_{nk} \cos \theta_{nk} + B_{nk} \sin \theta_{nk} } \right) $$

(4)

$$ Q_{nk} = - V_{n}^{2} G_{nk} - V_{n} V_{k} \left( {G_{nk} \sin \theta_{nk} - B_{nk} \cos \theta_{nk} } \right) $$

(5)

where $\theta_{nk} = \theta_{n} - \theta_{k}$ denotes the phase difference between bus n and k, $V_{k}$ denotes voltage amplitude, $G_{nk}$ and $B_{nk}$ are conductance and susceptance, respectively, $P_{n}$ and $Q_{n}$ denote the active and reactive power, respectively, $P_{nk}$ and $Q_{nk}$ denote the real and reactive power flow between bus n and k, respectively.
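As an illustration of Eqs. (2) and (3), the following NumPy sketch evaluates the bus injections for a given voltage profile. The two-bus network, its line admittance and the function name `power_injections` are hypothetical choices made for this example only; they are not taken from the paper's test systems.

```python
import numpy as np

def power_injections(V, theta, G, B):
    """Bus injections P_n and Q_n following Eqs. (2)-(3)."""
    V = np.asarray(V, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # theta_nk = theta_n - theta_k for every bus pair (n, k)
    dtheta = theta[:, None] - theta[None, :]
    P = V * ((G * np.cos(dtheta) + B * np.sin(dtheta)) @ V)
    Q = V * ((G * np.sin(dtheta) - B * np.cos(dtheta)) @ V)
    return P, Q

# Hypothetical two-bus network: one line with series admittance 1 - 5j p.u.
y_line = 1.0 - 5.0j
Y_bus = np.array([[y_line, -y_line], [-y_line, y_line]])
G, B = Y_bus.real, Y_bus.imag

# Flat profile: equal voltages and angles, so no power is exchanged.
P_flat, Q_flat = power_injections([1.0, 1.0], [0.0, 0.0], G, B)

# Bus 2 lags bus 1 by 0.1 rad: bus 1 exports active power, bus 2 imports it.
P, Q = power_injections([1.0, 1.0], [0.0, -0.1], G, B)
print("flat profile:", P_flat, Q_flat)
print("with angle difference:", P, "line losses:", P.sum())
```

In this toy case the sum of the active injections equals the resistive loss on the single line, which gives a quick sanity check on the signs used in the formulas.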

Based on the collected measurement data from PMU, SCADA can determine the operational status of power system. Commonly, bad data detection mechanism (BDDM) is used as a criterion for judging anomalous data, which can be described as follows²⁶.

$$ \gamma = z - \tilde{z} $$

(6)

$$ \tilde{z} = H\left( {\tilde{x}} \right) $$

(7)

$$ \tilde{x} = \mathop {\arg \min }\limits_{x} [z - H(x)]^{T} R^{ - 1} [z - H(x)] $$

(8)

where $\gamma$, $\tilde{z}$ and $\tilde{x}$ denote the measurement residual, the estimated measurement and the estimated state, respectively, and $R$ is the covariance matrix of the measurement noise.

Based on precomputed threshold and measurement residuals in Eq. (6), the corresponding detection criterion can be expressed as

$$ \left\{ \begin{gathered} ||\gamma || \le \upsilon ,\;{\text{Normal}} \hfill \\ ||\gamma || > \upsilon ,\;{\text{Abnormal}} \hfill \\ \end{gathered} \right. $$

(9)

where $\upsilon$ is the precomputed threshold. It is noted that the selection of precomputed threshold is given in²⁶.
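The residual test of Eqs. (6)–(9) can be sketched in a few lines of NumPy. The sketch below assumes a linear measurement model $z = Hx + v$, so that the weighted least-squares estimate has a closed form; the matrix `H`, the covariance `R` and the threshold value are hypothetical and chosen only so that the result is easy to check by hand.

```python
import numpy as np

def wls_estimate(H, z, R):
    """Weighted least-squares state estimate for a linear model z = H x + v."""
    W = np.linalg.inv(R)
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

def bad_data_check(H, z, R, threshold):
    """Return (residual norm, alarm flag) in the spirit of Eqs. (6)-(9)."""
    x_hat = wls_estimate(H, z, R)
    residual = z - H @ x_hat
    norm = float(np.linalg.norm(residual))
    return norm, bool(norm > threshold)

# Hypothetical system: 4 measurements of 2 states.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
R = 0.01 * np.eye(4)          # assumed measurement noise covariance
threshold = 0.1               # assumed precomputed threshold

z_clean = H @ np.array([1.0, 0.5])     # noise-free measurements
z_bad = z_clean.copy()
z_bad[0] += 1.0                         # gross error on a single sensor

norm_clean, alarm_clean = bad_data_check(H, z_clean, R, threshold)
norm_bad, alarm_bad = bad_data_check(H, z_bad, R, threshold)
print(norm_clean, alarm_clean)
print(norm_bad, alarm_bad)
```

A gross error on one sensor is inconsistent with the other measurements, so it leaves a large residual and triggers the alarm, which is the behaviour the BDDM relies on.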

False data attack model

To fool the BDDM, hackers can design a bank of false data, which satisfies the following constraint as⁹

$$ \xi^{f} = H\left( \delta \right) $$

(10)

where $\xi^{f}$ is false attack sequence, $\delta$ is the attack-induced state change.

Substituting the false attack sequence $\xi^{f}$ into Eq. (6), and assuming that the measurement function is (at least locally) linear so that $H(\tilde{x} + \delta ) = H(\tilde{x}) + H(\delta )$, one can obtain

$$ \begin{gathered} \left\| {\gamma^{f} } \right\| = \left\| {z^{f} - H\left( {\tilde{x}^{f} } \right)} \right\| \\ = \left\| {(z + \xi^{f} ) - H(\tilde{x} + \delta )} \right\| \\ = \left\| {(z - H\left( {\tilde{x}} \right)) + (\xi^{f} - H\left( \delta \right))} \right\| \\ = \left\| {z - H\left( {\tilde{x}} \right)} \right\| \\ = \left\| \gamma \right\| \\ \end{gathered} $$

(11)

where the fourth line uses $\xi^{f} - H\left( \delta \right) = 0$ from Eq. (10). Equation (11) indicates that $\left\| {\gamma^{f} } \right\| = \left\| \gamma \right\|$ if the hacker can inject false data satisfying $\xi^{f} = H\left( \delta \right)$. Since the measurement residual under FDAs then does not exceed the precomputed threshold $\upsilon$ whenever the attack-free residual does not, the BDDM can be deceived without triggering an alarm. In the following, Example 1 is given to show the covert feature of FDAs.

Example 1

As shown in Fig. 3, it is assumed that hackers can inject false data into generator 1 at $t = 80s$ on the IEEE 6-bus grid system. Note that the related simulation parameters are given in²⁶. Figures 4 and 5 show the state and residual changes under FDAs.

Simulation results in Example 1 demonstrate the covert feature of FDAs. By exploiting the vulnerability of the BDDM, hackers can tamper with grid operation without triggering alarms. For this reason, the detection of FDAs is a pressing issue for current grid security mechanisms. Based on this, a spatial-temporal detection framework using KOA-CNN-BiGRU-Attention is developed.
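The covert property can also be reproduced numerically on a toy problem. The sketch below is not the IEEE 6-bus simulation of Example 1; it assumes a small linear model with equal measurement weights, and the matrix `H`, the noise values and the attack vector `delta` are hypothetical.

```python
import numpy as np

def residual_norm(H, z):
    """Least-squares state estimate and the norm of its measurement residual."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return float(np.linalg.norm(z - H @ x_hat)), x_hat

# Hypothetical linear model: 4 measurements of 2 states, small fixed noise.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
x_true = np.array([1.0, 0.5])
noise = np.array([0.01, -0.02, 0.015, -0.005])
z = H @ x_true + noise

# Stealthy attack: the injected data lies in the column space of H (Eq. (10)).
delta = np.array([0.2, -0.1])
z_stealth = z + H @ delta

# Naive attack: tamper with one measurement only.
z_naive = z.copy()
z_naive[0] += 0.3

r_clean, x_clean = residual_norm(H, z)
r_stealth, x_stealth = residual_norm(H, z_stealth)
r_naive, _ = residual_norm(H, z_naive)

print("clean residual   :", r_clean)
print("stealth residual :", r_stealth)   # matches the clean residual
print("naive residual   :", r_naive)     # clearly larger
print("state shift      :", x_stealth - x_clean)
```

The structured attack shifts the estimated state by exactly `delta` while leaving the residual norm unchanged, whereas the unstructured attack inflates the residual and would be caught by a threshold test.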

The spatial–temporal detection framework against FDAs

In this section, a spatial-temporal detection framework against FDAs is proposed. The proposed framework includes CNN-based spatial feature extraction and BiGRU-based temporal feature extraction. In addition, KOA is introduced to enhance the feature extraction performance of the CNN, and an attention mechanism is further introduced to improve the detection performance of the constructed detection model. The detailed framework is given as follows.

CNN-based spatial feature extraction

To extract spatial features from the grid dataset, a CNN model is constructed. As shown in Fig. 6, the constructed CNN model consists of an input layer, a convolutional layer, a pooling layer and a fully connected layer. The input layer receives the input data (both normal and abnormal) of the power system; the convolutional and pooling layers extract the spatial features of the input data; finally, the fully connected layer outputs the features. KOA is introduced to enhance the feature extraction performance of the CNN.

In the convolutional layer, the convolutional kernel is multiplied element-wise with the corresponding positions of the input data matrix $\gamma = \left[ {x_{1} , \cdots \cdots ,x_{m} } \right]$ and the products are summed. The spatial features of the data are then extracted by sliding this window over the input. The corresponding mathematical formula is given as follows:

$$ F_{i} = \rho \left( {W_{i} * \gamma + b_{i} } \right) $$

(12)

where $F_{i}$ is the ith convolutional layer output; $\rho$ is the nonlinear activation function ReLU; $W_{i}$ is the ith convolutional kernel; $b_{i}$ denotes the bias term in the convolutional layer.

The pooling layer mainly reduces the amount of data to be processed while preserving the characteristic information of the input signals. The corresponding mathematical formula is given as follows:

$$ S_{i/\max } = \max pool\left[ {F_{i} } \right] $$

(13)

where $\max pool[]$ indicates a maximum pooling operation, $S_{i/\max }$ is the ith pooling layer output.

The fully connected layer combines the high-level features obtained from the convolutional and pooling layers. The corresponding mathematical formula is given as follows:

$$ \zeta_{i} = {\text{ReLU}}\left( {\xi_{i} *S_{i/\max } + \eta_{i} } \right) $$

(14)

where $\eta_{i}$ denotes the bias term in fully-connected layer, $\xi_{i}$ is the weight value, $\zeta_{i}$ is the ith fully-connected layer output.
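To make Eqs. (12)–(14) concrete, the following NumPy sketch chains a one-dimensional convolution, max pooling and a fully connected layer on a short input vector. The kernel, weights and input values are hypothetical and hand-picked so the intermediate results can be verified manually; a trained model would learn them. As in most deep learning libraries, the "convolution" is implemented as a sliding dot product.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def conv1d_valid(x, w, b):
    """Slide kernel w over x (stride 1, no padding), then ReLU, as in Eq. (12)."""
    windows = np.lib.stride_tricks.sliding_window_view(x, len(w))
    return relu(windows @ w + b)

def max_pool1d(f, size):
    """Non-overlapping max pooling, as in Eq. (13); a trailing remainder is dropped."""
    n = (len(f) // size) * size
    return f[:n].reshape(-1, size).max(axis=1)

def dense(s, weights, bias):
    """Fully connected layer with ReLU, as in Eq. (14)."""
    return relu(weights @ s + bias)

# Hypothetical measurement window and layer parameters.
x = np.array([0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
kernel, kernel_bias = np.array([-1.0, 0.0, 1.0]), 0.5
fc_weights = np.array([[1.0, -1.0], [0.5, 0.5]])
fc_bias = np.array([0.0, -1.0])

F = conv1d_valid(x, kernel, kernel_bias)   # feature map
S = max_pool1d(F, size=2)                  # pooled features
zeta = dense(S, fc_weights, fc_bias)       # fully connected output
print(F, S, zeta)
```

The kernel size of three mirrors the 3×1 kernel listed in the simulation setup; the pooling size of two is an assumption of this sketch.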

BiGRU-based temporal feature extraction

As shown in Fig. 6, a BiGRU-based temporal feature extraction framework is constructed. The BiGRU model, which includes an update gate, a reset gate and a new candidate state, is a recurrent neural network consisting of two independent GRU units. One GRU processes the data in the forward direction of the time series, and the other processes the data in the reverse direction. With this bidirectional structure, the constructed BiGRU model is able to capture both forward and backward information of the sequence data and thus extract its temporal features. The calculation procedure of temporal feature extraction for the BiGRU model is given as follows.

$$ h_{{\text{t/1}}} = \chi \left( {\Psi_{o} \cdot \left[ {\begin{array}{*{20}c} {h_{t - 1/1} } & {x_{t} } \\ \end{array} } \right] + \Phi_{o} } \right) $$

(15)

$$ h_{t/2} = \chi \left( {\Psi_{\mu } \cdot \left[ {\begin{array}{*{20}c} {h_{t - 1/2} } & {x_{t} } \\ \end{array} } \right] + \Phi_{\mu } } \right) $$

(16)

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{h}_{t} = \left[ {\begin{array}{*{20}c} {h_{t/1} } & {h_{t/2} } \\ \end{array} } \right] $$

(17)

where $h_{t/1}$ and $h_{t - 1/1}$ are the forward hidden states at $t$ and $t - 1$, $h_{t/2}$ and $h_{t - 1/2}$ are the backward hidden states at $t$ and $t - 1$, $\chi$ is the activation function, $\Psi_{o}$ and $\Psi_{\mu }$ are the forward and backward weight matrices, $\Phi_{o}$ and $\Phi_{\mu }$ are the forward and backward bias parameters, $x_{t}$ is the data input, and $\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{h}_{t}$ is the output of the BiGRU model.
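Equations (15)–(17) summarize each direction in a single recurrence. The sketch below expands this into one common formulation of the GRU cell with explicit update and reset gates (an assumption of this example, since the paper does not spell out the gate equations) and shows how the two directions are run and concatenated. Weights are random and all dimensions are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_gru(input_dim, hidden_dim, rng):
    """Random parameters for the update (z), reset (r) and candidate (n) parts."""
    shape = (hidden_dim, hidden_dim + input_dim)
    return {g: (rng.normal(scale=0.1, size=shape), np.zeros(hidden_dim)) for g in "zrn"}

def gru_step(params, h_prev, x_t):
    (Wz, bz), (Wr, br), (Wn, bn) = params["z"], params["r"], params["n"]
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx + bz)                                   # update gate
    r = sigmoid(Wr @ hx + br)                                   # reset gate
    n = np.tanh(Wn @ np.concatenate([r * h_prev, x_t]) + bn)    # candidate state
    return (1.0 - z) * h_prev + z * n

def run_gru(params, xs):
    h = np.zeros(params["z"][1].size)
    states = []
    for x_t in xs:
        h = gru_step(params, h, x_t)
        states.append(h)
    return np.stack(states)

def bigru(fwd, bwd, xs):
    h_fwd = run_gru(fwd, xs)
    h_bwd = run_gru(bwd, xs[::-1])[::-1]   # run on the reversed sequence, then re-align
    return np.concatenate([h_fwd, h_bwd], axis=1)   # concatenation as in Eq. (17)

rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 4, 3, 4
fwd = init_gru(input_dim, hidden_dim, rng)
bwd = init_gru(input_dim, hidden_dim, rng)
xs = rng.normal(size=(T, input_dim))

H_seq = bigru(fwd, bwd, xs)
print(H_seq.shape)   # one row per time step, forward and backward halves side by side
```

Because the backward GRU starts from the end of the window, its state at the first time step already depends on the last sample, which is what allows the model to relate conditions before and after an abnormal event.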

To enhance the feature extraction performance of CNN-BiGRU, KOA is applied to optimize three important parameters of the CNN-BiGRU model: the learning rate, the number of iterations and the number of neurons. The settings of these parameters directly affect the performance and accuracy of the model. The detailed process of KOA is given as follows.

The algorithm is initialized as²⁸:

$$ \left\{ \begin{gathered} X_{i} = X_{low} + rand(X_{up} - X_{low} ) \hfill \\ \kappa_{i} = rand_{{\left[ {0,1} \right]}} \hfill \\ \lambda_{i} = \left| j \right|,i = 1, \cdots ,m \hfill \\ \end{gathered} \right. $$

(18)

where $X_{i}$ denotes the ith planet, $X_{low}$ and $X_{up}$ denote the lower and upper bounds, $rand_{{\left[ {0,1} \right]}}$ is a random number between 0 and 1, $\kappa_{i}$ is the eccentricity, $\lambda_{i}$ is the orbital period, and $j$ is a normally distributed random number.

The velocity $V_{i}$ calculation of KOA is given as:

$$ V_{{\text{i}}} = \left\{ \begin{gathered} \begin{array}{*{20}c} {\smallint_{1} \times \smallint_{4} + \smallint_{2} \times \smallint_{5} + \smallint_{3} \times \smallint_{9} \times \smallint_{10} \times {\vec{\text{j}}}_{5} \times \smallint_{7} } & {\begin{array}{*{20}c} {if} & {1 - \varphi_{in} } \\ \end{array} } \\ \end{array} \hfill \\ \begin{array}{*{20}c} {j_{4} \times \left[ {\eta \times \left( {\Re_{1} + \Re_{2} } \right)\left| {\frac{2}{{\Re_{3} + \varsigma }} - \frac{1}{{\Re_{4} + \varsigma }}} \right|} \right]^{\frac{1}{2}} \times \smallint_{6} + \smallint_{3} \times \smallint_{9} \times \smallint_{11} \times {\vec{\text{j}}}_{5} \times \smallint_{8} } & {\begin{array}{*{20}c} {else} & {} \\ \end{array} } \\ \end{array} \hfill \\ \end{gathered} \right. $$

(19)

with

$$ \smallint_{1} = \vec{L} \times \left( {{\text{j}}_{3} \times \left( {1 - {\text{j}}_{4} } \right) + {\text{j}}_{4} } \right) \times \left[ {\eta \times \left( {\Re_{1} + \Re_{2} } \right)\left| {\frac{2}{{\Re_{3} + \varsigma }} - \frac{1}{{\Re_{4} + \varsigma }}} \right|} \right]^{\frac{1}{2}} $$

$$ \smallint_{2} = \left( {1 - \vec{L}} \right) \times \left( {j_{3} \times \left( {1 - \vec{j}_{5} } \right) + \vec{j}_{5} } \right) \times \left[ {\eta \times \left( {\Re_{1} + \Re_{2} } \right)\left| {\frac{2}{{\Re_{3} + \varsigma }} - \frac{1}{{\Re_{4} + \varsigma }}} \right|} \right] $$

$$ \smallint_{3} = \left( {1 - \varphi_{{{\text{in}}}} } \right),\smallint_{4} = 2j_{4} \vec{X}_{i} - \vec{X}_{a} ,\smallint_{5} = \vec{X}_{a} - \vec{X}_{b} $$

$$ \smallint_{6} = \vec{X}_{{\text{a}}} - \vec{X}_{{\text{i}}} ,\smallint_{7} = \vec{X}_{{{\text{up}}}} - \vec{X}_{{{\text{low}}}} ,\smallint_{8} = j_{3} \vec{X}_{{{\text{up}}}} - \vec{X}_{{{\text{low}}}} $$

$$ \vec{L} = \left\{ \begin{gathered} \begin{array}{*{20}c} 0 & {\vec{j}_{5} \le \vec{j}_{6} } \\ \end{array} \hfill \\ \begin{array}{*{20}c} 1 & {Else} \\ \end{array} \hfill \\ \end{gathered} \right.,\smallint_{9} = \left\{ \begin{gathered} \begin{array}{*{20}c} 1 & {j_{4} \le 0.5} \\ \end{array} \hfill \\ - \begin{array}{*{20}c} 1 & {Else} \\ \end{array} \hfill \\ \end{gathered} \right.,\smallint_{10} = \left\{ \begin{gathered} \begin{array}{*{20}c} 0 & {\vec{j}_{5} \le j_{4} } \\ \end{array} \hfill \\ \begin{array}{*{20}c} 1 & {Else} \\ \end{array} \hfill \\ \end{gathered} \right.,\smallint_{11} = \left\{ \begin{gathered} \begin{array}{*{20}c} 0 & {j_{3} \le j_{4} } \\ \end{array} \hfill \\ \begin{array}{*{20}c} 1 & {Else} \\ \end{array} \hfill \\ \end{gathered} \right. $$

where $j_{3}$ and $j_{4}$ are random numbers between 0 and 1, $\vec{j}_{5}$ and $\vec{j}_{6}$ are random vectors between 0 and 1, $\vec{X}_{a}$ and $\vec{X}_{b}$ are solutions selected at random from the population, $\eta$ is the universal gravitational constant, $\Re_{1}$ and $\Re_{2}$ denote the masses, $\Re_{3}$ denotes the distance, $\Re_{4}$ is the semimajor axis, $\varsigma$ denotes a small value, and $\varphi_{in}$ is the normalized Euclidean distance.

Updating positions and distances is presented as

$$ \vec{X}_{{{\text{i}}/t + 1}} = \vec{X}_{{{\text{i}}/t}} + \smallint_{9} \times \vec{V}_{{{\text{i}}/t}} + \left( {\lambda + \left| j \right|} \right) \times \vec{L} \times \left( {\vec{X}_{s} - \vec{X}_{{{\text{i}}/t}} } \right) $$

(20)

(21)

where $\vec{X}_{{{\text{i/}}t + 1}}$ is the position of planet $i$ at the next time step, $\vec{X}_{{{\text{i/}}t}}$ is its current position, $\vec{X}_{s}$ is the position of the sun, and $\lambda$ is obtained from the universal law of gravitation.

The elitist strategy of KOA is given as

$$ \vec{X}_{{{\text{i,new}}/t + 1}} = \left\{ \begin{gathered} \begin{array}{*{20}c} {\vec{X}_{{{\text{i}}/t + 1}} } & {{\text{if }}f\left( {\vec{X}_{{{\text{i}}/t + 1}} } \right) \le f\left( {\vec{X}_{{{\text{i}}/t}} } \right)} \\ \end{array} \hfill \\ \begin{array}{*{20}c} {\vec{X}_{{{\text{i}}/t}} } & {{\text{else}}} \\ \end{array} \hfill \\ \end{gathered} \right. $$

(22)

By using the above KOA, the related parameters of the CNN-BiGRU model, such as the learning rate, the number of iterations and the number of neurons, can be optimized. Note that the detailed algorithmic calculations and benefits are given in²⁸.
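Two parts of the procedure, the initialization in Eq. (18) and the elitist selection in Eq. (22), are simple enough to sketch directly. The example below is deliberately incomplete: the velocity and position updates of Eqs. (19)–(20) are replaced by a plain random perturbation, which is a placeholder and not the KOA move. The `sphere` fitness function and the search bounds are also hypothetical stand-ins for the validation error over hyperparameters.

```python
import numpy as np

def init_population(n_planets, lower, upper, rng):
    """Initial positions, eccentricities and orbital periods as in Eq. (18)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    positions = lower + rng.random((n_planets, lower.size)) * (upper - lower)
    eccentricity = rng.random(n_planets)            # uniform in [0, 1)
    period = np.abs(rng.normal(size=n_planets))     # |j| with j normally distributed
    return positions, eccentricity, period

def elitist_update(old, new, fitness):
    """Keep a new position only if it is not worse than the old one (Eq. (22))."""
    f_old = np.array([fitness(x) for x in old])
    f_new = np.array([fitness(x) for x in new])
    improved = f_new <= f_old
    return np.where(improved[:, None], new, old)

def sphere(x):
    """Hypothetical fitness to minimize; stands in for the model's validation error."""
    return float(np.sum(x ** 2))

rng = np.random.default_rng(1)
lower, upper = [-5.0, -5.0, -5.0], [5.0, 5.0, 5.0]
X, ecc, period = init_population(20, lower, upper, rng)

best_history = []
for _ in range(30):
    # Placeholder move: random perturbation, NOT the velocity rule of Eq. (19).
    candidate = np.clip(X + rng.normal(scale=0.5, size=X.shape), lower, upper)
    X = elitist_update(X, candidate, sphere)
    best_history.append(min(sphere(x) for x in X))

print(best_history[0], best_history[-1])
```

The population size of 20 matches the simulation setup; the elitist step guarantees that the best fitness in the population never gets worse from one iteration to the next, whatever move generates the candidates.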

Attention-based detection module

Based on the output results in BiGRU model, the attention score is calculated as

$$ \Delta_{{e_{i} }} = a\left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{h}_{i} } \right) $$

(23)

where $a$ denotes the Query.

The attention weight is calculated as

$$ \Delta_{{\alpha_{i} }} = \frac{{\exp \left( {\Delta_{{e_{i} }} } \right)}}{{\sum\limits_{j = 1}^{n} {\exp \left( {\Delta_{{e_{j} }} } \right)} }} $$

(24)

Then, we can get the weighted sum as

$$ \Delta_{c} = \sum\limits_{i = 1}^{n} {\Delta_{{\alpha_{i} }} } \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{h}_{i} $$

(25)

In sum, the model prediction output can be obtained as

$$ y_{pre} = \sum\limits_{i = 1}^{T} {SoftMax\left( {\Delta_{c} } \right)} $$

(26)

Since the FDA detection task is a binary classification task, the cross-entropy loss function is given as:

$$ Loss = - \frac{1}{T}\sum\limits_{i = 1}^{T} {\left[ {y_{i} \log \left( {\hat{y}_{pre} } \right) + \left( {1 - y_{i} } \right)\log \left( {1 - \hat{y}_{pre} } \right)} \right]} $$

(27)
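The attention pooling and the loss can be illustrated with a short NumPy sketch. The scoring function $a(\cdot)$ is not specified in detail above, so the example assumes a simple dot product with a query vector; the hidden states, the query and the predicted probabilities are hypothetical values.

```python
import numpy as np

def attention_pool(H, query):
    """Softmax attention over time steps with a dot-product score (assumed form of a)."""
    scores = H @ query                          # one score per time step, cf. Eq. (23)
    shifted = scores - scores.max()             # stabilizes the exponentials
    weights = np.exp(shifted) / np.exp(shifted).sum()   # normalized weights, cf. Eq. (24)
    context = weights @ H                       # weighted sum of hidden states, cf. Eq. (25)
    return weights, context

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy, cf. Eq. (27); probabilities are clipped for safety."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))

# Hypothetical BiGRU outputs for three time steps (two features each).
H = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]])
query = np.array([1.0, 1.0])

weights, context = attention_pool(H, query)
loss = binary_cross_entropy([1, 0], [0.5, 0.5])
print(weights, context, loss)
```

The middle time step has the largest score and therefore receives the largest weight, so the context vector is pulled toward it; an uninformative prediction of 0.5 for every sample gives a loss of $\ln 2$.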

Remark 1

Attention mechanisms are used to dynamically assign weights, highlight important features and suppress irrelevant features. There can be a large amount of noise or redundant information in power system data, and attention mechanisms can help models focus on key features of anomalous data, improving detection accuracy.

KOA-CNN-BiGRU-attention-based detection framework

By constructing the KOA-CNN and BiGRU-Attention models, a detection framework using spatial–temporal features is developed, as shown in Fig. 7. The detailed process is as follows.

Step 1: Establish the CNN-based spatial feature extraction model.

Step 2: Optimize the related parameters of the CNN model by using KOA.

Step 3: Construct the BiGRU-based temporal feature extraction model.

Step 4: Input power measurements to train the KOA-CNN-BiGRU-Attention model offline.

Step 5: Apply the KOA-CNN-BiGRU-Attention-based detection model to identify normal and abnormal data online.

Based on the above detection steps, the detection algorithm against FDAs is summarized in Algorithm 1.

Case studies

In this section, simulation tests are provided to demonstrate the performance of the constructed spatial-temporal detection framework against FDAs. Simulation studies on the IEEE 14-bus and IEEE 118-bus power systems compare the proposed model with existing detection models, such as GCN and GGNN-GAT, and verify its superiority.

Simulation and data setup

The simulation environment is implemented in Matlab 2023a on a desktop computer with the following configuration: i9-13900HX, 2.20 GHz, 16 GB RAM, and an NVIDIA TITAN RTX 4060 graphics card. KOA: population size: 20; maximum number of iterations: 100; optimized parameters: 0.00408136, 4, 86. CNN: convolution layers: 2; convolutional kernel size: 3×1; activation function: ReLU. BiGRU: number of hidden units: 128; number of layers: 2; dropout rate: 0.2. Attention dimension: 64. Optimizer: Adam; learning rate: 0.001; gamma = 2.

Consistent with work in²⁹, we employed the publicly accessible data released by NYISO to create the dataset. Leveraging the 2022 user load fluctuation pattern of New York, we utilized linear interpolation to expand the dataset with a time interval of five minutes. Following this, we standardized the obtained dataset in relation to the capacities of the grid systems under consideration. Subsequently, we linked the expanded dataset to each of the aforementioned two grid systems and utilized the Matpower 7.0 toolbox to conduct power flow analyses. The specific division of the training and test datasets is shown in Table 1.
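The interpolation step can be sketched with `numpy.interp`. The sketch assumes, for illustration only, that the raw load series is sampled hourly; the load values are made up and are not NYISO data.

```python
import numpy as np

def upsample_load(load_hourly, step_minutes=5):
    """Linearly interpolate an hourly load series onto a finer time grid."""
    load_hourly = np.asarray(load_hourly, dtype=float)
    t_coarse = np.arange(load_hourly.size) * 60              # sample times in minutes
    t_fine = np.arange(0, t_coarse[-1] + 1, step_minutes)    # five-minute grid
    return t_fine, np.interp(t_fine, t_coarse, load_hourly)

# Hypothetical three-hour load profile in MW.
t_fine, load_fine = upsample_load([100.0, 160.0, 130.0])
print(t_fine.size)                    # number of five-minute samples
print(load_fine[6], load_fine[18])    # values at 30 and 90 minutes
```

Each interpolated profile would then be scaled to the capacity of the target grid and passed to a power flow solver, as described above.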

Table 1 Training and Test Datasets Division.

Evaluation indicators

To evaluate the performance of the detection model, five evaluation indicators (Accuracy, Missed Alarm, Precision, Recall, and $F_{1}$-Score) are selected³⁰. The corresponding mathematical expressions are given as follows.

$$ Acc = \frac{{\pi_{TN} + \pi_{TP} }}{{\pi_{TN} + \pi_{TP} + \pi_{FN} + \pi_{FP} }} $$

(28)

$$ Ma = \frac{{\pi_{FN} }}{{\pi_{TP} + \pi_{FN} }} $$

(29)

$$ \Pr = \frac{{\pi_{TP} }}{{\pi_{TP} + \pi_{FP} }} $$

(30)

$$ Ra = \frac{{\pi_{TP} }}{{\pi_{TP} + \pi_{FN} }} $$

(31)

$$ F_{1} /S = \frac{2\Pr \times Ra}{{\Pr + Ra}} $$

(32)

where $\pi_{FN}$ denotes the number of normal data judged to be abnormal, $\pi_{FP}$ denotes the number of abnormal data judged to be normal, $\pi_{TP}$ denotes the number of normal data judged to be normal, $\pi_{TN}$ denotes the number of abnormal data judged to be abnormal, and $Acc$, $Ma$, $\Pr$, $Ra$ and $F_{1} /S$ denote the accuracy, missed alarm, precision, recall and $F_{1}$-Score, respectively.
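Equations (28)–(32) translate directly into code. The confusion counts in the example call are hypothetical and serve only to show the arithmetic; they are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Evaluation indicators of Eqs. (28)-(32) from confusion-matrix counts."""
    accuracy = (tn + tp) / (tn + tp + fn + fp)
    missed_alarm = fn / (tp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "missed_alarm": missed_alarm,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical counts for 200 test samples.
metrics = detection_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that the missed alarm rate and the recall always sum to one, since both are normalized by $\pi_{TP} + \pi_{FN}$.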

Ablation analysis

To evaluate the contributions of the KOA, CNN, BiGRU, and Attention modules to the performance of the KOA-CNN-BiGRU-Attention model, ablation experiments are designed with the full KOA-CNN-BiGRU-Attention model as the benchmark. The following ablation models are considered. Model 1: removing KOA and using default hyperparameters. Model 2: removing CNN and using only BiGRU-Attention. Model 3: removing BiGRU and using only CNN-Attention. Model 4: removing Attention and using only CNN-BiGRU. Model 5: removing both CNN and BiGRU and using only Attention. The comparative results are shown in Fig. 8.

As shown in Fig. 8, the contributions of each module to the overall performance can be summarized as follows. After removing KOA, the model performance dropped slightly (F1 score decreased from 96.45% to 94.28%), suggesting that KOA contributes to hyperparameter optimization, albeit with a relatively minor impact. Upon removing CNN, the model performance declined significantly (F1 score dropped from 96.45% to 90.65%), revealing the vital role of CNN in extracting local features. The removal of BiGRU also degraded model performance (F1 score fell from 96.45% to 89.53%), indicating that BiGRU is important for capturing time series dependencies. With Attention removed, the F1 score decreased from 96.45% to 92.19%, indicating that the Attention mechanism aids feature weighting and noise suppression to a certain extent. When both CNN and BiGRU were removed, the model performance declined drastically (F1 score fell from 96.45% to 83.15%), underlining that the combination of CNN and BiGRU is crucial for model performance.

Detection performance under different detection models

In this section, the accuracy of the proposed detection model under KOA is analyzed, as shown in Fig. 9. In addition, the proposed model is compared with other detection models, namely particle swarm optimization (PSO)-CNN-BiGRU-Attention, group-fusion CNN (GFCNN) and CNN, using the accuracy and loss during training as evaluation indicators under the same conditions. The accuracy and loss of each model are compared in Table 2.

Table 2 Comparison of the accuracy and loss of each model.

Obviously, the selection of optimal hyperparameters in detection model can affect the detection performance against malicious attacks. As the error of the fitness function gets smaller, the KOA can obtain the optimal hyperparameters of the detection model, as shown in Fig. 9. Based on this, the corresponding accuracy of proposed detection model can be enhanced. In other words, the introduction of KOA for the proposed detection model can improve the detection performance against attacks.

The simulation results in Table 2 demonstrate that the proposed detection model has the highest accuracy and the lowest loss, indicating that it has the best overall performance. Compared with PSO-CNN-BiGRU-Attention, KOA further improves the generalization ability of the model by optimizing the hyperparameters. In addition, compared with GFCNN and CNN, the proposed detection model significantly improves the ability to capture time-series features. It can also be seen that GFCNN improves the detection of anomalous data relative to a standard CNN.

Detection performance with evaluation indicators under different detection models

To evaluate the detection performance against FDAs under different detection models on the IEEE 14-bus and 118-bus grid systems, the evaluation indicators (accuracy, missed alarm, precision, $F_{1}$-score, and recall) are computed, as shown in Tables 3 and 4. In addition, the corresponding confusion matrices are presented in Figs. 10 and 11.
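As an illustration of how these indicators relate to a confusion matrix, the sketch below computes them from the four cell counts using the standard textbook definitions. It assumes that attacked samples are the positive class and that the missed alarm rate is FN / (TP + FN); the definitions used in the Evaluation indicators section take precedence if they differ. The counts in the example are hypothetical and are not taken from Tables 3 and 4.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # attacked samples flagged as attacked
    fp: int  # normal samples flagged as attacked
    tn: int  # normal samples passed as normal
    fn: int  # attacked samples passed as normal (missed)


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_indicators(c):
    """Compute accuracy, precision, recall, missed alarm rate and F1-score."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        # Assumed definition: share of attacked samples that go undetected.
        "missed_alarm": _ratio(c.fn, c.tp + c.fn),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
indicators = evaluation_indicators(example)
for name, value in indicators.items():
    print(f"{name:<13s}{value:.4f}")
```

Under this assumed definition the missed alarm rate equals one minus the recall, so a model with few false negatives scores well on both.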

Table 3 Comparison results of evaluation indicators on IEEE 14-bus grid system.

Table 4 Comparison results of evaluation indicators on IEEE 118-bus grid system.

The simulation results in Table 3 show that the accuracy rates of the proposed detection model, GCN, and GGNN-GAT are 98.73%, 95.99%, and 96.35%, respectively, so the proposed detection model achieves the highest accuracy. The precision rates of the proposed detection model, GCN, and GGNN-GAT are 98.85%, 95.23%, and 97.03%, respectively. Furthermore, the recall rate and F1 score of the proposed detection model exhibit improvements of 11.34% and 4.15%, respectively. Consequently, compared with the works in ³⁰ and ²⁹, the proposed detection model achieves better detection performance on the IEEE 14-bus power system. Moreover, the comparison results on the IEEE 118-bus grid system in Table 4 further confirm the enhanced detection performance of the proposed model. Specifically, the accuracy rate, precision rate, recall rate, and F1 score improve by at least 1.49%, 1.15%, 11.24%, and 8.4%, respectively. In summary, the KOA-CNN-BiGRU-Attention framework for spatio-temporal detection enhances detection performance compared with the GCN and GGNN-GAT detection models.

Figures 10 and 11 compare the confusion matrices of the different detection models. As seen in Figs. 10 and 11, the proposed detection model identifies abnormal data better than the other detection models on the IEEE 14-bus and 118-bus grid systems. Compared with the GCN and GGNN-GAT detection models, the proposed detection model identifies positive and negative samples more accurately and produces fewer false positives and false negatives, which indicates that the proposed KOA-CNN-BiGRU-Attention framework improves the separability of the two classes in this binary classification task.

From the above experiments, we can conclude that the proposed KOA-CNN-BiGRU-Attention detection model outperforms the compared methods in test accuracy for detecting FDAs.

Detection performance with variable attack intensity and ROC under different detection models

To evaluate the robustness of the proposed detection model, we further select variable attack intensity and ROC as indicators on the IEEE 14-bus system. As shown in Fig. 12, the detection rates of the proposed detection model, GCN, and GGNN-GAT are tested under different attack strengths. The ROC results for all detection models are shown in Table 5.

Table 5 Comparison results of ROC under different detection models.

Figure 12 shows the detection rate of each detection model under different attack strengths. The detection rates increase as the attack intensity increases. Moreover, at the same attack strength, the detection rate of the proposed detection model is higher than that of the other detection models. The simulation results in Table 5 indicate that the proposed detection model achieves the best ROC for FDA detection among the compared models. It is concluded that the proposed detection framework achieves a high detection rate and robustness for FDAs in power grids.
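For readers unfamiliar with the ROC indicator, the sketch below shows how an ROC curve and the area under it can be computed from detector scores and ground-truth labels. It assumes that a label of 1 marks attacked data and that a higher score means "more likely attacked"; whether Table 5 reports the area under the curve in exactly this form is an assumption. The labels and scores are hypothetical.

```python
def roc_curve_points(labels, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve.

    labels: 1 for attacked samples, 0 for normal samples.
    scores: detector outputs, higher meaning "more likely attacked".
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    # Sweep the decision threshold from the highest score downwards.
    pairs = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples tied at the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical detector outputs, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc_area = area_under_curve(roc_curve_points(labels, scores))
print(f"area under ROC curve: {roc_area:.4f}")
```

An area of 1.0 corresponds to perfect separation of attacked and normal data, while 0.5 corresponds to a detector that cannot distinguish them.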

Conclusions and discussion

In this paper, we construct a spatio-temporal detection framework for FDAs in power networks. Firstly, a CNN-BiGRU-based network model is established, which can effectively extract the spatio-temporal features of power data. Secondly, the KOA and attention mechanisms incorporated in the constructed model effectively improve the detection performance against FDAs. Finally, compared with the GCN and GGNN-GAT detection models, the proposed detection model improves the accuracy, precision, F1-score, and recall on the IEEE 14-bus and 118-bus grid systems. Moreover, simulation experiments have verified that the proposed detection model is robust to attack strength.

However, some deficiencies remain that motivate further research.

1. The KOA-CNN-BiGRU-Attention model combines several algorithms and techniques, resulting in a highly complex model. Because the model is built by combining standard component models, the choice of components influences the detection performance against FDAs. For example, Table 2 indicates that introducing GFCNN²³ can further optimize the detection model.
2. Despite the excellent performance of the KOA-CNN-BiGRU-Attention model in several applications, its generalisation ability remains a concern. In some cases, the model may over-rely on specific patterns or features in the training data, resulting in poor performance on unseen data. To improve the generalisation ability of the model, it may be necessary to employ more regularisation techniques, data augmentation strategies, or ensemble learning methods.

Data availability

The data analysed during the current study are included in this article.

References

1. Oyewole, P. A. & Jayaweera, D. Power system security with cyber-physical power system operation. IEEE Access 8, 179970–179982 (2020).
2. Zhang, X., Ma, H. & Tse, C. K. Assessing the robustness of cyber-physical power systems by considering wide-area protection functions. IEEE J. Emerg. Select. Top. Circuits Syst. 12(1), 107–114 (2022).
3. Alvarez-Alvarado, M. S. et al. Cyber-physical power systems: A comprehensive review about technologies drivers, standards, and future perspectives. Comput. Electr. Eng. 116, 109149 (2024).
4. Lu, K.-D. & Wu, Z.-G. Multi-objective false data injection attacks of cyber-physical power systems. IEEE Trans. Circuits Syst. II Express Briefs 69(9), 3924–3928 (2022).
5. Chen, L. & Wang, B. Robustness assessment of weakly coupled cyber-physical power systems under multi-stage attacks. Electr. Power Syst. Res. 231, 110325 (2024).
6. Cheng, Z., Hu, S., Yue, D., Dou, C. & Shen, S. Resilient distributed coordination control of multiarea power systems under hybrid attacks. IEEE Trans. Syst. Man Cybern. Syst. 52(1), 7–18 (2022).
7. Chawla, A. et al. Denial-of-service attacks pre-emptive and detection framework for synchrophasor based wide area protection applications. IEEE Syst. J. 16(1), 1570–1581 (2022).
8. Mukherjee, D. Data-driven false data injection attack: A low-rank approach. IEEE Trans. Smart Grid 13(3), 2479–2482 (2022).
9. Liu, Y., Ning, P. & Reiter, M. K. False data injection attacks against state estimation in electric power grids. ACM Trans. Inf. Syst. Secur. 14(1), 1–33 (2011).
10. Jin, S. False data injection attack against smart power grid based on incomplete network information. Electr. Power Syst. Res. 230, 110294 (2024).
11. Sadeghikhorami, L., Varadharajan, V. & Safavi, A. A. A novel secure observer-based controller and attack detection scheme for Networked Control Systems. Inf. Sci. 575, 185–205 (2021).
12. Rinaldi, G., Cucuzzella, M., Menon, P. P., Ferrara, A. & Edwards, C. Load altering attacks detection, reconstruction and mitigation for cyber-security in smart grids with battery energy storage systems. In 2022 European Control Conference (ECC), London, United Kingdom, 1541–1547 (2022).
13. Wang, X., Luo, X., Pan, X. & Guan, X. Detection and location of bias load injection attack in smart grid via robust adaptive observer. IEEE Syst. J. 14(3), 4454–4465 (2020).
14. Li, J., Sun, Y. & Su, Q. Security control of integral sliding modes for multi-agent systems under false data injection attacks. J. Franklin Inst. 361(4), 106613 (2024).
15. Roy Chowdhury, N., Belikov, J., Baimel, D. & Levron, Y. Observer-based detection and identification of sensor attacks in networked CPSs. Automatica 121, 109166 (2020).
16. Wang, X., Luo, X., Zhang, M., Jiang, Z. & Guan, X. Detection and isolation of false data injection attacks in smart grid via unknown input interval observer. IEEE Internet Things J. 7(4), 3214–3229 (2020).
17. Li, M. & Qian, T. False data injection information detection method for power systems based on improved random forest. In 2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 1885–1888 (2024).
18. Liu, K., Zhang, H., Zhang, Y. & Sun, C. False data-injection attack detection in cyber-physical systems with unknown parameters: A deep reinforcement learning approach. IEEE Trans. Cybern. 53(11), 7115–7125 (2023).
19. Esmalifalak, M., Liu, L., Nguyen, N., Zheng, R. & Han, Z. Detecting stealthy false data injection using machine learning in smart grid. IEEE Syst. J. 11(3), 1644–1652 (2017).
20. Feng, H., Han, Y., Li, K., Si, F. & Zhao, Q. Locational detection of the false data injection attacks via semi-supervised multi-label adversarial network. Int. J. Electr. Power Energy Syst. 155, 109682 (2024).
21. Xiang, Q. et al. Group-fusion one-dimensional convolutional neural network for ballistic target high-resolution range profile recognition with layer-wise auxiliary classifiers. Int. J. Comput. Intell. Syst. 16, 190 (2023).
22. Xiang, Q. et al. Multi-scale group-fusion convolutional neural network for high-resolution range profile target recognition. IET Radar Sonar Navig. 16(12), 1997–2016 (2022).
23. Xiang, Q. et al. Quadruplet depth-wise separable fusion convolution neural network for ballistic target recognition with limited samples. Expert Syst. Appl. 235, 121182 (2024).
24. Vincent, E., Korki, M., Seyedmahmoudian, M., Stojcevski, A. & Mekhilef, S. Detection of false data injection attacks in cyber–physical systems using graph convolutional network. Electr. Power Syst. Res. 217, 109118 (2023).
25. Yin, L. & Xie, J. Multi-temporal-spatial-scale temporal convolution network for short-term load forecasting of power systems. Appl. Energy 283, 116328 (2021).
26. Lin, Y. & Abur, A. A highly efficient bad data identification approach for very large scale power systems. IEEE Trans. Power Syst. 33(6), 5979–5989 (2018).
27. Abur, A. & Exposito, A. G. Power System State Estimation: Theory and Implementation (CRC Press, 2004).
28. Abdel-Basset, M., Mohamed, R., Abdel Azeem, S. A., Jameel, M. & Abouhawwash, M. Kepler optimization algorithm: A new metaheuristic algorithm inspired by Kepler’s laws of planetary motion. Knowl.-Based Syst. 268, 110454 (2023).
29. Li, X., Wang, Y. & Lu, Z. Graph-based detection for false data injection attacks in power grid. Energy 263, 125865 (2023).
30. Boyaci, O., Narimani, M. R., Davis, K. & Serpedin, E. Cyberattack detection in large-scale smart grids using Chebyshev graph convolutional networks. In Proc. IEEE 9th Int. Conf. Elect. Electron. Eng. 217–221 (2022).

Funding

This work is supported by the Doctoral Research Initiation Fund of Taiyuan University of Science and Technology (20202005, 20202007), the Shanxi Province Basic Research Program General Project (202203021221153), and the Open Research Fund of Jiangsu Collaborative Innovation Center for Smart Distribution Network at Nanjing Institute of Technology (grant number XTCX202203).

Author information

Authors and Affiliations

College of Electronic Information Engineering, Taiyuan University of Science and Technology, Taiyuan, 030024, China
Qingfeng Wu & Shufa Zhuang
School of Electrical Engineering, Yanshan University, Qinhuangdao, 066004, China
Xinyu Wang
Jiangsu Collaborative Innovation Center for Smart Distribution Network, Nanjing, 210000, China
Xinyu Wang

Contributions

Conceptualization, Qingfeng Wu; methodology, B.G; writing—original draft preparation, Qingfeng Wu and Shufa Zhuang; writing—review and editing, Xinyu Wang; visualization, Xinyu Wang;

Corresponding author

Correspondence to Xinyu Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, Q., Zhuang, S. & Wang, X. A novel detection mechanism against malicious attacks by using spatio and temporal topology information. Sci Rep 15, 9978 (2025). https://doi.org/10.1038/s41598-025-93957-8

Received: 29 May 2024
Accepted: 11 March 2025
Published: 22 March 2025
Version of record: 22 March 2025
DOI: https://doi.org/10.1038/s41598-025-93957-8
