Abstract
Spectrum sensing (SS) technology is essential for cognitive radio (CR) networks to effectively identify and utilize idle spectrum resources. Due to the influence of noise characteristics in the channel, providing accurate sensing results is challenging. To improve the performance of SS under non-Gaussian noise and overcome the limitation of existing methods that mostly rely on a single feature, we propose a novel time-frequency cross fusion network (TFCFN). Specifically, we utilize gated recurrent units (GRU) to capture long-term dependencies in the time domain of the original signals; meanwhile, we perform a fast Fourier transform (FFT) on the original signals to obtain frequency-domain information and subsequently use convolutional neural networks (CNN) to extract local spatial features in the frequency domain. Finally, the time-domain and frequency-domain features are dynamically fused through a cross-attention mechanism to construct a more comprehensive and robust representation for signal classification. We use the generalized Gaussian distribution (GGD) as the noise model and reconstruct the RadioML2016.10a dataset to explore performance under various noise conditions. The experimental results show that, compared with the baseline methods, TFCFN exhibits better detection ability and maintains lower complexity in both Gaussian and non-Gaussian noise environments. Notably, when the shape parameter of the GGD is set to 0.5 and the signal-to-noise ratio (SNR) of the received signal is -16 dB, TFCFN maintains a probability of false alarm (\(P_f\)) of 10% while still ensuring a probability of detection (\(P_d\)) of over 90%.
Introduction
Background
With the explosive growth of wireless communication applications and the continuous innovation and development of wireless communication technology1, the scarcity of spectrum resources and the increasingly complex electromagnetic environment have become great challenges for communication systems. Traditional spectrum management adopts a static allocation strategy, which is the main reason for low spectrum utilization2. The emergence of CR provides a new technological solution to significantly improve the efficiency of spectrum resource utilization3. A CR system can monitor and understand spectrum usage in the current environment, allowing secondary users (SUs) to access idle spectrum without causing interference to primary users (PUs). SS, as the primary component of CR systems, endows the system with the ability to discover spectrum holes. The detection performance of SS directly determines whether idle spectrum can be discovered accurately and in real time, and thus whether suitable communication bands can be allocated to SUs. This key link profoundly affects the operational efficiency and quality of service of the entire wireless communication system4.
Channel fading, environmental noise, and dynamic changes in wireless channels can seriously affect the performance of SS5. Among these factors, noise is an important interference source whose complexity and diversity degrade the performance of traditional SS methods. Many theoretical analyses and algorithm designs ideally assume that noise follows a Gaussian distribution. However, in real-world environments, transient electromagnetic interference caused by thunderstorms and lightning, as well as short voltage spikes or current pulses caused by arcing and switching operations on power lines, do not follow a Gaussian distribution. These non-Gaussian noises are characterized by strong suddenness, short duration, and concentrated energy, which pose a serious challenge to SS. Therefore, to improve the accuracy and robustness of SS, it is necessary to develop sensing techniques and algorithms that can effectively deal with such noise. When modeling noise, several common empirical models are worth attention. The Gaussian mixture model (GMM) and the GGD are widely used to fit man-made impulsive noise and ultra-wideband interference6,7,8,9,10. The \(\alpha\)-stable distribution has good fitting properties for natural electromagnetic noise environments11.
Research status
Traditional SS algorithms include energy detection (ED)12, matched filter detection (MFD)13, cyclostationary feature detection14, and so on. Specifically, ED determines whether a particular frequency band is occupied by comparing the energy in that band against a threshold value. It is widely used due to the simplicity of its implementation, but it requires an accurate estimate of the noise power and performs poorly in low SNR conditions. In addition, the threshold must be adjusted for the specific noise distribution, as the threshold calculation formulas for different noise distributions are incompatible with each other. Of particular importance is that traditional ED loses its effectiveness under the \(\alpha\)-stable distribution, whose second-order moments may not exist15. MFD utilizes known features of the PU signal to design a specific filter to match the received signal. It achieves maximum detection performance under ideal conditions, but requires a priori information about the PU, which is not always available in real environments. Cyclostationary feature detection determines the presence of a signal by detecting periodic changes in the signal's statistical characteristics. This method improves the performance of SS in low SNR environments; however, it has high computational complexity.
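As a concrete illustration of the ED principle described above, a minimal numpy sketch is given below; the threshold value here is arbitrary and uncalibrated, whereas in practice it must be derived from the assumed noise distribution and the target false-alarm rate.

```python
import numpy as np

def energy_detect(r, threshold):
    """Declare the band occupied (H1) if the average sample energy exceeds the threshold.

    Sketch of classical energy detection; the threshold is NOT calibrated here.
    """
    test_statistic = np.mean(np.abs(r) ** 2)
    return bool(test_statistic > threshold)

rng = np.random.default_rng(0)
noise_only = rng.normal(size=1000)                     # H0: unit-variance noise
tone = 2.0 * np.cos(2 * np.pi * 0.1 * np.arange(1000))  # a strong deterministic signal
with_signal = noise_only + tone                         # H1: signal plus noise
```

With unit-variance noise the statistic concentrates near 1 under \(H_0\) and near 3 for this tone under \(H_1\), so a threshold of 1.5 separates the two toy cases.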
Machine learning and deep learning, as key technologies in the field of artificial intelligence, have been widely researched and applied in areas such as computer vision, natural language processing, and network security16,17,18,19,20. With the continuous maturity of deep learning technology, its application in the field of communication has become an increasingly hot research topic. Deep learning is completely data-driven and can automatically learn and extract the features of complex signals without any priori information, thereby improving the performance of SS. Gao et al.21 proposed DetectNet, a combined model of CNN, Long Short-Term Memory (LSTM), and Deep Neural Network (DNN) for SS. DetectNet capitalizes on the underlying structural information of modulated signals, showcasing state-of-the-art detection performance. Su et al.22 applied a stacked convolutional autoencoder to preprocess signals for noise reduction. They introduced a self-attention mechanism in a combined model named H-CSG, further enhancing detection performance. In references23,24, the authors used short-time Fourier transform and wavelet transform to process signal data into time-frequency matrix data as inputs to CNN, transforming the SS into an image classification problem. Wang et al.25 utilized ConvLSTM to simultaneously extract temporal and spatial features of IQ signals, and then implemented SS at extremely low SNR based on the extracted features. These deep learning based methods are effective in terms of results, which proves that deep learning has enormous potential for development in the field of SS. The above studies were conducted under the assumption of Gaussian white noise background and have not yet explored robustness in non-Gaussian noise environments. Mehrabian et al. proposed a CNN detector for symmetric \(\alpha\)-stable (\(S\alpha S\)) noise26. Compared with the baselines, this detector exhibits stronger robustness in dealing with impulse noise. 
Subsequently, they further proposed a CNN detector suitable for various noise models in multi-antenna systems, including Middleton Class A, the \(S\alpha S\) distribution, and the GGD27.
Motivation and contribution
The above research methods use information from only one signal domain as input to the neural network, ignoring the rich information contained in other domains. Therefore, these methods may not achieve optimal performance in non-Gaussian noise environments28. Given the regularity exhibited by PU signals in the time domain and the transient nature of non-Gaussian noise pulses, using GRU to extract global temporal dependencies is very helpful. Furthermore, frequency-domain information also plays an important role in signal processing. Since frequency-domain signals do not contain time-series information, CNN is used to focus on local features in the frequency domain. Taking inspiration from the cross-attention mechanism's ability to effectively fuse multimodal features29,30, we adopt this mechanism as the module for fusing time-domain and frequency-domain features. Through this integration, the TFCFN adapts to various noises and improves performance. The main contributions of this work are as follows:
1. In our experiments, we employ the open-source RadioML2016.10a dataset31 to represent PU signals. We further utilize a GGD noise model to simulate the non-Gaussian noise encountered in real communication environments, thereby training our model.
2. We propose a deep learning-based model that effectively fuses time-domain and frequency-domain features using a cross-attention mechanism to improve the accuracy of SS.
3. We conduct comparative experiments under varying degrees of noise tailing to evaluate the performance differences between our proposed model and other methods such as ED32, DetectNet21, WT-ResNet24, ConvLSTM25, 1D-CNN26, 2D-CNN27, and MASSnet-B33. The experimental results indicate that our model exhibits superior detection performance and robustness, regardless of whether the noise tailing is mild or severe.
Organization
The rest of this paper is organized as follows. The “Related work” section introduces related work. The “System model” section introduces the system model and problem statement of SS. The proposed TFCFN and its training process are presented in the “The proposed TFCFN” section. The “Performance evaluation” section provides simulation results and discussion. Finally, the “Conclusion” section provides a summary of this paper. Table 1 provides the abbreviations and their descriptions used in this paper.
Related work
Traditional spectrum sensing methods
The ED12 algorithm has received widespread attention due to its simple implementation. Chen et al. proposed replacing the amplitude-squared operation with an arbitrary positive power operation to improve the energy detector in Gaussian noise34. Fading and noise are key factors affecting the ED algorithm. Digham et al. proposed a closed-form expression for the \(P_d\) on multipath channels35. Chatziantiou et al. derived an analytical expression for the average \(P_d\) under two-wave with diffuse power fading, and extended it to collaborative SS and square-law selection diversity reception to mitigate fading effects36. Gao et al. fully utilized the stochastic characteristics of the GGD and the central limit theorem to derive the \(P_d\) and \(P_f\), and analyzed the impact of noise uncertainty on the system32. Eigenvalue-based methods are also popular in the field of SS. Chaurasiya et al. proposed a maximum-minimum-eigenvalue algorithm and a spectrum sensor architecture based on it37. To improve detection performance under noise uncertainty, Hashim et al. derived an adaptive threshold based on the absolute covariance value38. Methods based on higher-order moments have been used for SS in satellite communication39. The hybrid SS technique proposed by Ramya et al. automatically selects energy-based or eigenvalue-based sensing according to the SNR range40. MFD is a better detection algorithm when prior information about the PU signal is available. Brito et al. proposed a hybrid method based on existing MFD, which flexibly adjusts the number of MFDs used to optimize detection performance under different \(P_f\)41. However, the need for prior information limits its application in certain scenarios. To address this issue, Zhang et al. developed a new test statistic.
This statistic is composed of the correlation between the received signal and its delayed version, as well as the independence of noise at different times. Due to the accumulation of correlation, prior information from PU is no longer required42. Bala et al. proposed an iterative algorithm applied to CR Internet of Things (IoT) devices to optimize sensing threshold and time, greatly improving CR-IoT throughput in low SNR regions43.
Deep learning-based spectrum sensing methods
Data-driven deep learning methods have become a hot topic in the field of SS in recent years. An et al. proposed a CNN for digital television terrestrial multimedia broadcasting systems, which can achieve a satisfactory \(P_d\) at low SNRs44. Duan et al. used kernel principal component analysis to map the sampled signal to a high-dimensional space, created a covariance matrix, obtained the eigenvectors through matrix decomposition, and finally used a CNN for classification45. Uvaydov et al. implemented real-time wideband SS using CNN46. Building on this work, Mei et al. designed a parallel CNN that reduced latency47. Wang et al. used a residual dense network to solve the vanishing-gradient problem in deep network structures, while using a convolutional block attention module to improve network performance48. The LSTM model developed by Balwani et al. extracted temporal correlations between spectrum data and achieved high classification accuracy, but at the cost of longer training and execution time49. Subsequently, Soni et al. improved detection performance by using PU activity statistics as training data for an LSTM, again at a high time cost50. Combining several models makes it possible to extract multiple kinds of features. Xing et al. used CNN and BiLSTM in series, simultaneously extracting local features and global correlations of time-domain data, and then emphasized the important features using self-attention51. Denoising the signal before classifying it is another approach. Due to the two-stage nature of the H-CSG method22, where the denoising and detection stages are trained separately, the results are limited by the denoising performance; Su et al. therefore implemented joint learning of denoising and detection52. Ni et al. did not limit themselves to CNN and LSTM, but used a temporal convolutional network whose special structure allows it to extract temporally correlated features from sequence data53.
MASSnet adopts a residual network structure, which is specifically optimized for the flexible configuration problem of multi-antenna technology33. Transformer models, which dominate natural language processing, are also gradually being applied to SS tasks54,55.
System model
In this work, we consider a single-input single-output (SISO) system influenced by Rayleigh fading channels and non-Gaussian noise. SS is used to detect the presence or absence of PU signals, so it is usually formulated as a binary hypothesis problem:

$$\begin{aligned} H_0:&\quad r(n) = w(n), \\ H_1:&\quad r(n) = h(n)s(n) + w(n), \end{aligned} \qquad (1)$$
where r(n) denotes the n-th received signal sample in a detection period; s(n) is the signal from the PU; h(n) represents the channel gain in the current detection period, and w(n) represents a random noise sequence that follows the GGD. \({H_0}\) and \({H_1}\) signify the hypotheses that the PU is absent and present, respectively.
The GGD can represent noise distributions with different heavy-tailed characteristics by adjusting its shape parameter; it is an extension of the Gaussian distribution and can adapt to a wider range of noise environments. Its probability density function (PDF) can be expressed as56:

$$f(x) = \frac{\beta }{2\alpha \Gamma (1/\beta )} \exp \left( -\left( \frac{|x|}{\alpha }\right) ^{\beta }\right)$$
where \({\beta }\) is the shape parameter, \({\Gamma (\cdot )}\) denotes the gamma function, and \({\alpha }\) is the scale parameter, whose relationship to the variance of the random variable is \({\sigma ^2} = \alpha ^2\frac{\Gamma (3/\beta )}{\Gamma (1/\beta )}\). In particular, when \({\beta =2}\), the GGD degenerates to a Gaussian distribution; when \({0<\beta <2}\), the GGD exhibits heavy-tailed properties.
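As an illustration, unit-variance GGD samples can be drawn via the standard gamma construction, using the scale relation given above; this is an illustrative generator, not necessarily the one used in the paper.

```python
import math
import numpy as np

def ggd_noise(size, beta, sigma=1.0, rng=None):
    """Draw GGD samples with shape `beta` and standard deviation `sigma`.

    Standard construction: if G ~ Gamma(1/beta, 1), then sign * alpha * G**(1/beta)
    follows the GGD with scale alpha, where alpha = sigma * sqrt(Gamma(1/beta)/Gamma(3/beta))
    fixes the variance. beta = 2 recovers the Gaussian case.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = sigma * math.sqrt(math.gamma(1.0 / beta) / math.gamma(3.0 / beta))
    g = rng.gamma(shape=1.0 / beta, scale=1.0, size=size)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * alpha * g ** (1.0 / beta)
```

Smaller `beta` values produce heavier tails: occasional very large spikes amid mostly small samples, matching the impulsive-noise behavior described above.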
Typically, the performance metrics for evaluating SS are \(P_d\) and \(P_f\), defined as:

$$P_d = P(\hat{H} = H_1 \mid H_1), \qquad P_f = P(\hat{H} = H_1 \mid H_0)$$
where \({P_d}\) is the probability of correctly detecting PU signals in the presence of the PU, and \({P_f}\) is the probability of mistakenly identifying noise as a signal in the absence of the PU.
The proposed TFCFN
To effectively perform SS in a non-Gaussian noise environment, we have utilized a cross-attention mechanism to fuse time-domain and frequency-domain features, enabling the model to adaptively learn the correlations between time-frequency features and dynamically assign corresponding frequency-domain feature weights to each time-domain feature. In the following, we will introduce the generation and preprocessing method of the dataset, followed by the detailed parts of the TFCFN architecture, and finally, the training and testing process will be described.
Dataset generating and preprocessing
The RadioML2016.10a dataset is a publicly available dataset widely used for wireless signal modulation identification. We reconstructed this dataset to meet the needs of the SS task, which is represented as follows:

$$r_k = h_k s_k + w_k$$
where \({r_k}\) is the reconstructed signal, \({s_k}\) is the clean modulated signal, \({h_k}\) is the complex Rayleigh fading coefficient, and \({w_k}\) is the GGD noise. Specifically, we use clean QPSK signals with a signal length (L) of 128 and add GGD noise to simulate PU signals at specified SNRs. The SNR range spans from -20 dB to 0 dB in increments of 2 dB. For each SNR level, we generate 2000 PU signal samples and an equal number of GGD noise samples. For the classification task, PU signal samples are labeled 1, and GGD noise samples are labeled 0. The total number of samples in the entire dataset is therefore 44000, with 22000 samples each for \(H_0\) and \(H_1\). In addition, the dataset is divided into training, validation, and test sets in the ratio 3:1:1. In the GGD, the variance of the noise is fixed at 1, and the shape parameter \({\beta }\) is selected from \(\{0.5, 1, 1.5, 2\}\) to examine the effects of different noise tailing phenomena on SS. Note that each \(\beta\) value builds a separate dataset.
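A minimal sketch of assembling one PU sample at a target SNR might look as follows; the helper name `make_sample` is hypothetical, the Rayleigh fading coefficient is omitted for brevity, and the noise is assumed unit-variance as in the text.

```python
import numpy as np

def make_sample(s, w, snr_db):
    """Scale a clean modulated signal `s` so that `gain*s + w` has the target SNR.

    Illustrative helper, not the authors' exact pipeline; `w` is the
    (unit-variance) GGD noise vector, `snr_db` the target SNR in dB.
    """
    p_s = np.mean(np.abs(s) ** 2)           # empirical signal power
    p_w = np.mean(np.abs(w) ** 2)           # empirical noise power
    gain = np.sqrt(10 ** (snr_db / 10) * p_w / p_s)
    return gain * s + w

rng = np.random.default_rng(0)
s = np.exp(1j * 2 * np.pi * 0.05 * np.arange(128))           # stand-in for a QPSK burst
w = (rng.normal(size=128) + 1j * rng.normal(size=128)) / np.sqrt(2)
r = make_sample(s, w, snr_db=-10)                             # one H1 sample at -10 dB
```

Because empirical powers are used in the gain, the realized SNR of each generated sample matches the requested value exactly.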
The received signal is an IQ signal consisting of in-phase and quadrature components. As neural networks cannot process complex signals directly, the received IQ signal is decomposed into its I and Q components, which can be represented by the \({L \times 2}\)-dimensional matrix \({R_T}\) as follows:

$$R_T = \left[ r_I(n) \quad r_Q(n) \right]$$
where L refers to the number of sampling points per sample and \({r_I(n)}\), \({r_Q(n)}\) denote the I and Q components of the n-th received signal, respectively; the dimensions of \(r_I(n)\) and \(r_Q(n)\) are both \(L\times 1\). The TFCFN needs to integrate information from both the time and frequency domains to better understand and analyze the data. To obtain the frequency-domain features, the TFCFN utilizes the Discrete Fourier Transform (DFT). The DFT converts a signal from the time domain to the frequency domain by decomposing it into a composite of different frequency components. Its formula is expressed as:

$$X[f] = \sum _{n=0}^{L-1} r[n] \, e^{-j 2\pi f n / L}, \quad f = 0, 1, \ldots , L-1$$
where X[f] denotes the complex amplitude in the frequency domain, and f denotes the different frequency components. Note that here r is equivalent to r(n) in Eq. (1) and r[n] represents the n-th sampling point. L refers to the number of sampling points of sample r. In practice, the FFT is used to accomplish this transformation. The resulting X[f] is a complex vector, which needs to be processed into a matrix of dimension \({L \times 2}\):

$$R_F = \left[ real(X[f]) \quad imag(X[f]) \right]$$
where L represents the number of X[f] points, and its value is the same as the number of sampling points of r. The \(real(\cdot )\) and \(imag(\cdot )\) functions respectively refer to taking the real and imaginary parts.
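The preprocessing described above, producing the two \(L \times 2\) input matrices, can be sketched as follows (function names are illustrative):

```python
import numpy as np

def to_time_matrix(r):
    """Stack the I and Q components of a length-L complex signal into an L x 2 matrix R_T."""
    return np.stack([r.real, r.imag], axis=1)

def to_freq_matrix(r):
    """FFT the signal, then stack real and imaginary parts into an L x 2 matrix R_F."""
    X = np.fft.fft(r)
    return np.stack([X.real, X.imag], axis=1)

rng = np.random.default_rng(0)
r = rng.normal(size=128) + 1j * rng.normal(size=128)  # a length-128 IQ sample
R_T = to_time_matrix(r)
R_F = to_freq_matrix(r)
```

Both matrices are real-valued with identical shape, which lets the two network branches share the same input length.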
To improve the performance and stability of the model, we perform max-min normalization on matrices \(R_T\) and \(R_F\) along the column direction. By scaling the original data and mapping the features to the [0,1] interval, max-min normalization effectively unifies the magnitudes of different features, and it also helps to accelerate the learning process and improve the convergence speed of the model. The specific implementation is:

$${\overline{R}}_{n,j} = \frac{R_{n,j} - min_j}{max_j - min_j}$$
where \({\overline{R}}\) represents the normalized version of \(R_T\) or \(R_F\), \(R_{n, j}\) is the element in the n-th row and j-th column of the original matrix \(R_T\) or \(R_F\), and \(min_{j}\) and \(max_{j}\) are the minimum and maximum values in the j-th column, respectively. Finally, the normalized matrices of \(R_T\) and \(R_F\) are denoted \({\overline{R}}_T\) and \({\overline{R}}_F\), respectively.
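The column-wise max-min normalization can be sketched as:

```python
import numpy as np

def minmax_columns(R):
    """Map each column of an L x 2 feature matrix to [0, 1] (max-min normalization)."""
    mins = R.min(axis=0, keepdims=True)
    maxs = R.max(axis=0, keepdims=True)
    return (R - mins) / (maxs - mins)

rng = np.random.default_rng(0)
R = rng.normal(size=(128, 2))  # stand-in for R_T or R_F
R_bar = minmax_columns(R)
```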
TFCFN architecture
The TFCFN we designed is a multi-feature input model, and its overall architecture is shown in Fig. 1. As can be seen from the figure, the TFCFN is a deep neural network architecture that incorporates one-dimensional convolutional (Conv1D) layers, GRUs, a cross-attention block, and Dense layers. The input time-domain and frequency-domain data undergo deep feature extraction using network modules appropriate to their characteristics: two GRUs extract the time-domain features, and two Conv1D layers extract the frequency-domain features. Subsequently, the extracted features are fused by the cross-attention block to obtain a comprehensive feature representation used for signal and noise classification. We denote the input time-domain and frequency-domain data as \({{\overline{R}}_T \in R^{L \times 2}}\) and \({{\overline{R}}_F \in R^{L \times 2}}\), respectively, where L denotes the signal length. In addition, Fig. 1 includes the module parameter settings and the changes in feature dimensions.
Recurrent neural networks (RNN), particularly LSTM network, have been widely applied in SS tasks due to their exceptional ability to handle time-series data57,58,59. However, the structure of LSTM is relatively complex with a large number of parameters, leading to high computational costs. The GRU, as a simplified version of LSTM, merges the hidden state and cell state and contains only update and reset gates, thus simplifying the network structure. This simplification not only reduces the number of parameters in the model but also improves the efficiency of training, while maintaining performance comparable to LSTM60. Therefore, in our work, we have chosen to use two GRUs with 32 units to extract time features from the data, achieving a balance between performance and efficiency. The features extracted from the time-domain data are represented as \(X_t\). Frequency-domain data reveal information about the intensity and phase of the signal at different frequency components, which, although they do not contain underlying temporal information, are equally crucial for understanding the spectral properties of the signal. Conv1D excels at extracting local features of the data, which can help the model to identify the frequency-specific components as well as the interactions between these components. In order to deepen the model’s sensitivity to signal variations and to enhance the level of feature abstraction, we employ two Conv1D layers as the extraction module for frequency-domain features. The parameter configuration for these two Conv1D is “32@3”. This means that each layer contains 32 convolution kernels, with each kernel having a size of 3. After each convolutional layer, the data will undergo ReLU activation function processing to introduce nonlinear characteristics and enhance the feature extraction ability between layers. The features extracted from the frequency-domain data in this way are denoted as \(X_f\).
In recent years, the cross-attention mechanism has received extensive attention in the field of multimodal research. This mechanism is capable of computing the similarity between inputs from different modalities and adaptively associating and fusing the information based on the similarity scores. In this paper, we regard the frequency-domain and time-domain features as two modalities of the signal, each providing a different perspective on the signal. The application of the cross-attention mechanism in TFCFN is illustrated in Fig. 2. Specifically, the features \(X_t\) and \(X_f\) are mapped into Query (Q), Key (K), and Value (V) matrices through Dense layers, and the dot product of Q and K generates an attention weight matrix whose elements represent the similarity between time-domain and frequency-domain features. To stabilize the gradient and prevent excessively large values, the dot product is divided by \(\sqrt{d_k}\), where \(d_k\) is the dimension of K. Subsequently, the softmax function normalizes these weights, yielding an attention distribution over the frequency-domain features for each time-domain feature. The attention weight matrix is then multiplied by the V matrix to obtain a feature representation X that combines time-domain and frequency-domain information. The calculation process is shown below:

$$X = softmax\left( \frac{QK^T}{\sqrt{d_k}}\right) V$$
where T denotes transpose, and \(d_k\) is the dimension of K. In TFCFN, we combine early time-domain features with enhanced features to improve the model’s feature expression ability, and the final classification feature is represented as:
Subsequently, we further process \(X'\) through a series of Conv1D and Dense layers to obtain the classification result. In this processing, to reduce the number of model parameters, we use a Global Average Pooling layer instead of a Flatten layer.
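The cross-attention fusion at the heart of the architecture can be sketched in plain numpy; the projection matrices below stand in for the Dense layers, and all shapes are illustrative rather than the paper's exact configuration.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(X_t, X_f, W_q, W_k, W_v):
    """Scaled dot-product cross-attention: time-domain features query frequency-domain features."""
    Q = X_t @ W_q                          # queries from time-domain features
    K = X_f @ W_k                          # keys from frequency-domain features
    V = X_f @ W_v                          # values from frequency-domain features
    d_k = K.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k))    # attention weights; each row sums to 1
    return A @ V                           # fused representation X

rng = np.random.default_rng(0)
X_t = rng.normal(size=(128, 32))           # stand-in for the GRU output (time features)
X_f = rng.normal(size=(128, 32))           # stand-in for the Conv1D output (frequency features)
W_q, W_k, W_v = (rng.normal(size=(32, 32)) * 0.1 for _ in range(3))
X = cross_attention(X_t, X_f, W_q, W_k, W_v)
```

Each row of the attention matrix assigns a weight distribution over all frequency-domain positions for one time-domain position, which is exactly the dynamic time-to-frequency weighting described above.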
Training and testing phase
To train the TFCFN, we label the processed data according to the states \({H_0}\) and \({H_1}\) to construct the training dataset:

$${\mathcal {D}} = \{({\overline{R}}_1, y_1), ({\overline{R}}_2, y_2), \ldots , ({\overline{R}}_U, y_U)\}$$
where each tuple \(({\overline{R}}_u, y_u)\) represents the u-th sample of the labeled training dataset. Here, \({\overline{R}}_u \in \{({\overline{R}}_T, {\overline{R}}_F)_1, \ldots ({\overline{R}}_T, {\overline{R}}_F)_u\}\). The label \({y_u \in \{0,1\}}\) indicates the binary state: \(y_u = 0\) for the noise-only state \({H_0}\), and \(y_u = 1\) for the signal-present state \(H_1\). SS can thus be framed as a binary classification problem, and \(y_u\) is encoded as a one-hot vector to reflect this:

$$y_u = {\left\{ \begin{array}{ll} (1, 0), & H_0 \\ (0, 1), & H_1 \end{array}\right. }$$
The labeled data \({\mathcal {D}}\) are then fed into the TFCFN for training. The output of the network is a probability vector normalized by the softmax function:
with
here, \({\theta }\) represents the set of parameters of the TFCFN, and \({f_\theta ({\overline{R}}_u)}\) is the vector of probabilities, with \({f_{\theta |H_i}({\overline{R}}_u)}\) denoting the probability that the given sample \({\overline{R}}_u\) is classified as state \(H_i\).
To obtain the optimal parameters \({\theta }\), we minimize the cross-entropy loss function using the Adam optimizer with a learning rate of 0.0005. The cross-entropy function is defined as follows:

$$L(\theta ) = -\frac{1}{U} \sum _{u=1}^{U} \sum _{i \in \{0,1\}} y_{u,i} \log f_{\theta |H_i}({\overline{R}}_u)$$

where \(y_{u,i}\) denotes the i-th entry of the one-hot label \(y_u\).
After obtaining a model with optimal parameters by training on the dataset \({{\mathcal {D}}}\), the PU state can be determined on new test data. Typically, the result is the category corresponding to the maximum probability in \({f_{\theta }({\overline{R}}_u)}\), denoted as follows:

$$\hat{H} = \mathop {\arg \max }\limits _{H_i,\, i \in \{0,1\}} f_{\theta |H_i}({\overline{R}}_u) \qquad (16)$$
To compare the experimental results more accurately, we keep \(P_f\) constant for comparison. The method is to replace the judgment condition of Eq. (16) with a decision threshold \({\gamma }\)61. Concretely, a new dataset \({{\mathcal {D}}_\text {noise}}\) is created by randomly selecting purely noisy samples from dataset \({{\mathcal {D}}}\), where \({{\mathcal {D}}_\text {noise}} = \{({\overline{R}}'_1, 0), ({\overline{R}}'_2, 0),...,({\overline{R}}'_n, 0)\}\). Subsequently, the trained model is applied to this dataset, and the predicted probabilities of the \(H_1\) state are ranked in descending order:

$$\Lambda _1 \ge \Lambda _2 \ge \cdots \ge \Lambda _n, \quad \Lambda _i = f_{\theta |H_1}({\overline{R}}'_i)$$
Finally, the decision threshold \({\gamma }\) can be represented as:

$$\gamma = \Lambda _{round(n \cdot P_f)}$$
where \({round(\cdot )}\) is the rounding function. After obtaining \({\gamma }\), Eq. (16) transforms into the following judgment condition:

$$\hat{H} = {\left\{ \begin{array}{ll} H_1, & f_{\theta |H_1}({\overline{R}}_u) > \gamma \\ H_0, & \text {otherwise} \end{array}\right. }$$
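The constant-\(P_f\) threshold selection described above can be sketched as follows; the exact ranking and rounding conventions are our assumptions for illustration.

```python
import numpy as np

def cfar_threshold(noise_probs, p_f):
    """Pick a threshold gamma so that a fraction p_f of noise-only samples exceed it.

    noise_probs: model outputs f_theta|H1 on n pure-noise samples.
    Exceeding gamma leads to declaring H1, so the empirical false-alarm
    rate on these samples is approximately p_f.
    """
    ranked = np.sort(noise_probs)[::-1]        # descending: Lambda_1 >= ... >= Lambda_n
    k = int(round(len(ranked) * p_f))
    return ranked[k]

# Toy check with uniform pseudo-probabilities in place of model outputs.
rng = np.random.default_rng(0)
noise_probs = rng.uniform(size=10000)
gamma = cfar_threshold(noise_probs, p_f=0.1)
empirical_pf = np.mean(noise_probs > gamma)
```

Because the threshold is an order statistic of the noise scores rather than a fixed probability of 0.5, the false-alarm rate stays at the target level regardless of how the classifier's output probabilities are calibrated.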
Performance evaluation
In this section, we first demonstrate the effectiveness of the cross-attention mechanism in TFCFN through ablation experiments. Next, we evaluate the performance of TFCFN under a constant \(P_f\) and different noise conditions, and compare it with existing SS techniques. We also present the receiver operating characteristic (ROC) curves of TFCFN under different noise tail levels. Subsequently, the results of a series of generalization and robustness experiments are analyzed. Finally, we analyze and discuss the complexity of the model.
Ablation experiment
In the ablation experiment, we replace the cross-attention module, denoted as “CA”, with several common feature fusion methods to validate the effectiveness of the module. These methods include “Concat”, “Add”, and “Multiply”. The results are presented in Table 2, where accuracy, a commonly used metric in classification tasks, is employed as the performance indicator to intuitively compare the classification performance under different shape parameters of the GGD. In the experiments, the aforementioned modules and the “CA” were validated using only the X feature, while “CA + skip connection” incorporated the \(X_d\) feature for validation. All results are averaged over multiple runs.
Through the “Concat” operation, features \(X_t\) and \(X_f\) are concatenated along the channel dimension, which increases the number of features. However, experimental results indicate that this fusion strategy is not ideal in overall performance, despite maintaining a moderate parameter count and number of floating-point operations (FLOPs). With the “Add” operation, the corresponding elements of features \(X_t\) and \(X_f\) are added directly to enhance the information content of the features; when \(\beta =0.5\), this yields a significant improvement in model performance. The “Multiply” operation emphasizes the interaction between \(X_t\) and \(X_f\) through element-wise multiplication, and shows good performance at \(\beta =1.5\). Like the “Add” operation, the “Multiply” operation preserves the original feature dimensions, so their parameter counts and FLOPs are relatively low. The “CA” dynamically adjusts the importance of features by calculating the attention score between \(X_t\) and \(X_f\). This mechanism not only considers the interactions between features but also establishes more complex dependency relationships between them. The experimental results show that “CA” performs best across the different noise shape parameters, although its parameter count and FLOPs increase due to the additional computation required by the attention mechanism. In addition, when we introduce the skip connection and add the most primitive feature information into the model, performance is further improved.
Comparative experiment
In our experiments, we compare the proposed TFCFN model with existing deep learning-based methods and a traditional method, including ED32, DetectNet21, WT-ResNet24, ConvLSTM25, 1D-CNN26, 2D-CNN27, and MASSnet-B33. For a fair comparison, all schemes use the same original dataset and are fine-tuned to obtain optimal hyperparameters. The properties of these methods are presented in Table 3.
Figure 3 illustrates the \(P_d\) of all schemes at different SNRs when \(P_f\) is held constant at 0.1. Although the threshold calculation of the ED is derived from the GGD, its performance is lower than that of the deep learning-based methods. As shown in Fig. 3a, the characteristics of GGD noise approximate Gaussian noise when \(\beta =2\). ConvLSTM, WT-ResNet, and DetectNet are schemes designed as SS algorithms for a Gaussian white noise environment. The performance of TFCFN is similar to that of ConvLSTM when the SNR is below -14 dB. However, when the SNR exceeds -14 dB, TFCFN significantly outperforms the other schemes, and its \(P_d\) improves by about 7.5% compared to ConvLSTM. As \(\beta\) decreases, the tail of the noise distribution thickens, indicating an increased probability of extreme values in the noise. This change has the most significant impact on the traditional ED algorithm, whose performance depends heavily on accurate estimates of the noise energy level. The CNN and ConvLSTM schemes show similar detection performance when \(\beta\) is reduced to 1.5. At SNRs below -16 dB, the schemes do not differ much. However, when the SNR is higher than -16 dB, TFCFN improves significantly over CNN and ConvLSTM, with an increase in \(P_d\) of about 5% to 10%. In Fig. 3c, TFCFN outperforms the other schemes in terms of \(P_d\) at nearly all SNRs, falling only slightly behind under the -20 dB SNR condition. Compared with ConvLSTM, TFCFN improves the \(P_d\) by about 4% to 9%. In particular, TFCFN shows optimal performance at \(\beta =0.5\): under the -20 dB SNR condition, the \(P_d\) of TFCFN reaches 57%, exceeding the other schemes by more than 40%, and when the SNR rises above -16 dB, its \(P_d\) exceeds 90%.
From the analysis of the results, it can be seen that the methods with GRU or LSTM structures perform better, indicating that long-term dependencies in the time domain play a significant role. On this basis, introducing frequency-domain features further improves the \(P_d\). The performance differences between the various methods can be seen more clearly through the accuracy figures reported in Table 4.
Figure 4 illustrates the curves of \(P_d\) as a function of \(P_f\), where different curves correspond to different GGD shape parameters \(\beta\). The curves shift to the left as \(\beta\) decreases. This trend indicates that when the tail of the noise distribution becomes thicker, i.e., when \(\beta\) decreases, TFCFN is able to achieve a higher \(P_d\) while maintaining a low \(P_f\). In TFCFN, the cross-attention mechanism allows the model to dynamically learn the importance of the time-frequency features, which means that the model can automatically assign higher weights to the features that are more informative in heavy-tailed noise environments. At the same time, the stacking of multiple convolutional and recurrent layers increases the expressive power of the model, allowing it to capture more complex signal features. Taken together, the results show that TFCFN has a higher \(P_d\) and better adaptability when dealing with non-Gaussian noise, especially noise distributions with thicker tails.
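The evaluation protocol behind these curves can be sketched in a few lines: the detection threshold is calibrated on noise-only (H0) scores so that the empirical \(P_f\) equals the target value, and \(P_d\) is then measured on signal-present (H1) scores. This is an illustrative sketch with synthetic Gaussian scores standing in for a detector's outputs; `pd_at_fixed_pf` is a hypothetical helper, not part of the paper's code.

```python
import numpy as np

def pd_at_fixed_pf(scores_h0, scores_h1, pf=0.1):
    """Calibrate a threshold on H0 (noise-only) scores so that the empirical
    false-alarm rate equals pf, then measure P_d on H1 (signal-present) scores."""
    threshold = np.quantile(scores_h0, 1.0 - pf)   # (1 - pf) quantile of H0 scores
    p_f = float(np.mean(scores_h0 > threshold))    # ~pf by construction
    p_d = float(np.mean(scores_h1 > threshold))
    return threshold, p_f, p_d

rng = np.random.default_rng(0)
scores_h0 = rng.normal(0.0, 1.0, 100_000)  # toy stand-in for H0 test statistics
scores_h1 = rng.normal(2.0, 1.0, 100_000)  # toy stand-in for H1 test statistics
thr, pf_emp, pd_emp = pd_at_fixed_pf(scores_h0, scores_h1, pf=0.1)
```

Sweeping `pf` over (0, 1) and recording the resulting `pd_emp` traces out exactly the kind of \(P_d\)-versus-\(P_f\) curve shown in Fig. 4.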
Generalization experiments
The impact of modulation methods
In this experiment, we study how the various methods classify untrained signals with new modulation types. All models are trained using QPSK signals under the conditions of \(\beta =2\) and \(\beta =1\), respectively, and are then tested on BPSK and 16QAM signals. The experimental results are shown in Fig. 5, and all results maintain \(P_f=0.1\). When faced with new BPSK signals, the performance of 1D-CNN, 2D-CNN, WT-ResNet, and DetectNet all degrades significantly. Compared with the results shown in Fig. 3, the \(P_d\) of TFCFN changes little and remains the best among all compared methods. This finding confirms that a fully trained TFCFN model can effectively detect new signals that it has not encountered before.
The impact of noise uncertainty
In real-world SS scenarios, noise uncertainty manifests as fluctuations in noise power over time, which may lead to significant performance degradation of certain detectors. In this experiment, the background noise is GGD noise. According to Eq. (2), when the shape parameter \(\beta\) remains fixed, \(\frac{\Gamma (3/\beta )}{\Gamma (1/\beta )}\) is a constant. Therefore, in the GGN model, only the parameter \(\alpha\) affects the variance of the noise, and its uncertainty can be represented by \(\alpha ^2\in [\frac{1}{\rho }\alpha _0^2, \rho \alpha _0^2]\), where \(\alpha _0^2\) is the nominal \(\alpha ^2\) and \(\rho\) is the uncertainty factor32. In Fig. 6, we use NU to denote the uncertainty, related to \(\rho\) by \(\rho = 10^{\frac{NU}{10}}\), with NU expressed in dB52. In the experiment, all methods use the models trained for Fig. 3, which corresponds to \(NU=0\) dB. We conduct tests for \(\beta =2\) and \(\beta =1\) respectively to evaluate the changes in model performance at \(NU=0.2 \, {\rm dB}\) and \(NU=0.5\, {\rm dB}\). The experimental results indicate that ED is affected the most, as it relies heavily on an accurate estimate of the noise power. In contrast, the results of TFCFN remain stable and are not significantly affected, indicating that TFCFN is reliable in environments with noise uncertainty.
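As a sketch of how this uncertainty model can be simulated, the snippet below perturbs the GGD scale within \([\alpha _0^2/\rho ,\, \rho \alpha _0^2]\) and draws noise samples. It assumes SciPy's `gennorm`, whose shape argument is the GGD shape parameter \(\beta\) and whose scale argument plays the role of \(\alpha\); the helper name is our own, not from the paper.

```python
import numpy as np
from scipy.stats import gennorm

def ggd_noise_with_uncertainty(n, beta, alpha0_sq=1.0, nu_db=0.5, rng=None):
    """Draw n GGD noise samples whose scale alpha^2 is perturbed uniformly
    within [alpha0^2 / rho, rho * alpha0^2], where rho = 10**(NU / 10)."""
    rng = np.random.default_rng() if rng is None else rng
    rho = 10.0 ** (nu_db / 10.0)                         # uncertainty factor
    alpha_sq = rng.uniform(alpha0_sq / rho, rho * alpha0_sq)
    # gennorm: shape = GGD beta, scale = alpha
    return gennorm.rvs(beta, scale=np.sqrt(alpha_sq), size=n, random_state=rng)

noise = ggd_noise_with_uncertainty(50_000, beta=1.0, nu_db=0.5,
                                   rng=np.random.default_rng(1))
```

Note that even the modest setting \(NU=0.5\) dB gives \(\rho \approx 1.12\), i.e. the noise power can drift by about 12% in either direction, which is enough to destabilize a fixed ED threshold.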
The impact of shape parameter on GGD
In this section, we evaluate the generalization ability of the trained model under different GGD shape parameters. Specifically, we use the model trained in a GGD noise environment with \(\beta =1.5\) and test its performance on unseen datasets generated with \(\beta =2\) and \(\beta =1\). As shown in Fig. 7a, after training under the condition of \(\beta =1.5\), the performance of the TFCFN model on the \(\beta =2\) test data is quite similar to that of the model trained and tested directly under \(\beta =2\) (see Fig. 3a). This indicates that TFCFN is robust in handling the transition from a slightly sharper shape to the smoother Gaussian distribution, and can effectively generalize to distributions similar to the training data. However, as shown in Fig. 7b, when TFCFN is evaluated on the \(\beta =1\) test dataset, its performance decreases compared to the results shown in Fig. 3c. This is because the noise at \(\beta =1\) has sharper peaks and heavier tails, whose specific characteristics the model did not fully capture during training. Nevertheless, TFCFN remains functional, and its \(P_d\) is still the highest among the compared models.
Comparison experiment of other noise distributions
To verify the robustness of TFCFN to other noise distributions, we conduct experiments on GMM noise and \(S\alpha S\) noise. The performance comparison and analysis under each noise distribution are presented below.
The PDF of a GMM is a weighted sum of the PDFs of a set of Gaussian distributions. The PDF of a zero-mean binary GMM is as follows: \(f(x) = \frac{1-\epsilon }{\sqrt{2\pi }\,\sigma _1}\exp \left( -\frac{x^2}{2\sigma _1^2}\right) + \frac{\epsilon }{\sqrt{2\pi }\,\sigma _2}\exp \left( -\frac{x^2}{2\sigma _2^2}\right)\)
where \(\epsilon\) is a mixing parameter with \(0< \epsilon < 1\). In general, when the variances satisfy \(\sigma _2^2 \gg \sigma _1^2\) and \(\epsilon \ll 1\), the Gaussian component with variance \(\sigma _2^2\) describes sudden pulses or interference of short duration and large amplitude, while the Gaussian component with variance \(\sigma _1^2\) dominates the background noise. The total noise variance is \(\sigma ^2 = (1 - \epsilon )\sigma _1^2 + \epsilon \sigma _2^2\). In the experiment, we set \(\sigma _1^2 = 1\), \(\sigma _2^2 = 4\), \(\epsilon = 0.5\), and \(P_f=0.1\) to observe the sensing performance of all models under GMM noise5. As shown in Fig. 8, under multiple SNR conditions, the TFCFN model exhibits significantly better detection performance than the other models. Specifically, at SNR = -8 dB, the \(P_d\) of TFCFN exceeds 90%. In the SNR range of -16 dB to -8 dB, TFCFN achieves a \(P_d\) about 5% higher than the other models. The experimental results show that TFCFN still achieves good detection performance under GMM noise.
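Under the settings above, the mixture noise is straightforward to generate: each sample is drawn from the impulsive component with probability \(\epsilon\) and from the background component otherwise. This is an illustrative sketch, not the authors' data pipeline.

```python
import numpy as np

def gmm_noise(n, eps=0.5, var1=1.0, var2=4.0, rng=None):
    """Binary GMM noise: each sample comes from N(0, var2) with probability
    eps (the impulsive component) and from N(0, var1) otherwise."""
    rng = np.random.default_rng() if rng is None else rng
    impulsive = rng.random(n) < eps
    std = np.where(impulsive, np.sqrt(var2), np.sqrt(var1))
    return std * rng.standard_normal(n)

noise = gmm_noise(200_000, eps=0.5, var1=1.0, var2=4.0,
                  rng=np.random.default_rng(0))
# The empirical variance approaches (1 - eps)*var1 + eps*var2 = 2.5.
```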
The PDF of the \(\alpha\)-stable distribution does not have a closed-form expression, so the distribution is defined through its characteristic function: \(\varphi (z) = \exp \left\{ j\mu z - \gamma |z|^{\alpha }\left[ 1 + j\beta \,\text {sign}(z)\,\omega (z, \alpha )\right] \right\}\)
where \(\alpha \,(0 < \alpha \le 2)\) denotes the characteristic exponent, \(\beta \,(-1 \le \beta \le 1)\) is the symmetry parameter, \(\gamma\) is the scale parameter, \(\mu\) is the location parameter, sign(z) is the sign function, and \(\omega (z, \alpha )\) is defined as \(\omega (z, \alpha ) = \tan \frac{\alpha \pi }{2}\) for \(\alpha \ne 1\) and \(\omega (z, \alpha ) = \frac{2}{\pi }\log |z|\) for \(\alpha = 1\).
When \(\beta =0\), the \(\alpha\)-stable distribution is symmetric about \(\mu\) and reduces to an \(S\alpha S\) distribution. Specifically, when \(\alpha =2\), the \(S\alpha S\) distribution becomes a Gaussian distribution; when \(\alpha =1\), it reduces to a Cauchy distribution62. In the experiment, we mainly focus on the performance of TFCFN under the conditions \(\alpha =1.5\) and \(\alpha =1.2\)27. The other related parameters are set as follows: \(\beta =0\), \(\mu =0\), \(\gamma =1\), \(P_f=0.1\). Since the \(S\alpha S\) distribution does not possess a finite second-order moment, its variance is undefined. Therefore, in scenarios with additive \(S\alpha S\) noise, the generalized signal-to-noise ratio (GSNR) is commonly used as the measurement metric. The GSNR is defined as \(\text {GSNR} = 10\log _{10}\left( \sigma _s^2 / \gamma \right)\),
where \(\sigma _s^2\) represents the variance of the signal. Figure 9a shows the detection performance of each model under different GSNR conditions when \(\alpha =1.5\). Overall, TFCFN performs better than the other models. In the GSNR range of -20 dB to -14 dB, the \(P_d\) of TFCFN is quite similar to that of the other models. However, within the GSNR range of -12 dB to -6 dB, the \(P_d\) of TFCFN is about 10% higher than that of the other models. As shown in Fig. 9b, when \(\alpha\) is reduced to 1.2, the impulsive characteristics of the noise become more prominent. Within the GSNR range of -16 dB to -8 dB, the \(P_d\) of TFCFN is higher than that obtained when \(\alpha =1.5\), while still remaining above that of the other models.
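For reference, \(S\alpha S\) noise at a given dispersion can be simulated with SciPy's `levy_stable`. This is a sketch under the assumption that SciPy's `scale` parameter \(\sigma\) relates to the dispersion by \(\gamma = \sigma ^{\alpha }\) (so \(\gamma =1\) corresponds to `scale=1`); the helper names are our own.

```python
import numpy as np
from scipy.stats import levy_stable

def sas_noise(n, alpha, gamma=1.0, rng=None):
    """Symmetric alpha-stable (beta = 0, mu = 0) noise samples."""
    rng = np.random.default_rng() if rng is None else rng
    scale = gamma ** (1.0 / alpha)  # scipy scale sigma, assuming gamma = sigma**alpha
    return levy_stable.rvs(alpha, 0.0, loc=0.0, scale=scale,
                           size=n, random_state=rng)

def gsnr_db(signal_var, gamma):
    """GSNR = 10*log10(sigma_s^2 / gamma), replacing the undefined SNR."""
    return 10.0 * np.log10(signal_var / gamma)

noise = sas_noise(1_000, alpha=1.5, rng=np.random.default_rng(0))
```

Because the variance of the noise is infinite for \(\alpha < 2\), sweeping the GSNR means scaling the signal power \(\sigma _s^2\) against a fixed dispersion \(\gamma\) rather than against a noise variance.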
Complexity analysis
In deep learning, the number of parameters and the FLOPs are common metrics used to measure the complexity of a model; they can be regarded as the space complexity and time complexity, respectively. In Table 5, we present a detailed computational complexity analysis of the deep learning-based methods, where l and \(D_l\) represent the layer index and the number of network layers, respectively. In a Conv1D layer, \(u_l\), \(k_l\), and \(m_l\) represent the number of channels in the l-th layer, the size of the convolution kernel, and the length of the output sequence, respectively. In LSTM and GRU, \(e_l\) and \(h_l\) represent the size of the embedding layer and the number of hidden units in the l-th layer, respectively. In addition, \(d_l\) represents the number of neurons in the l-th dense layer52. Compared to Conv1D, the convolutional kernels and feature maps in Conv2D are two-dimensional; we therefore use \(m_l^h \times m_l^w\) to represent the size of the output feature map and \(k_l^h \times k_l^w\) to denote the size of the convolutional kernel. ConvLSTM replaces its internal fully connected layers with convolutional layers, thereby changing the computational cost to that of convolution. Compared to LSTM, GRU reduces the number of parameters and FLOPs by computing one fewer gating structure. In addition, during the classification stage, we adopt a global pooling layer to reduce the data dimensionality, thereby reducing the number of input neurons in the Dense layer and further reducing the overall parameter count and computational complexity. Please refer to Table 5 for the specific comparative data. It can be seen that TFCFN has far fewer parameters than the other models, while its FLOPs remain at a low to moderate level. The actual FLOPs are influenced by the size of the features and the number of channels.
In both 2D-CNN and MASSnet, pooling operations are used to reduce the feature size, which decreases the computational complexity; however, some feature information may be lost as a result.
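The per-layer counting rules described above can be sketched as follows. This is a back-of-the-envelope illustration; conventions differ (e.g. some works count two FLOPs per multiply-accumulate), so the constants here are indicative only.

```python
def conv1d_cost(c_in, c_out, k, m_out):
    """One Conv1D layer: c_in*k*c_out weights plus c_out biases;
    one multiply-accumulate per weight per output position."""
    params = c_in * k * c_out + c_out
    flops = m_out * k * c_in * c_out
    return params, flops

def lstm_cost(e, h, t):
    """LSTM layer (input size e, hidden size h, t time steps):
    four gating structures, each with input, recurrent, and bias weights."""
    params = 4 * (e * h + h * h + h)
    return params, t * params

def gru_cost(e, h, t):
    """GRU layer: one fewer gating structure than LSTM (3 instead of 4)."""
    params = 3 * (e * h + h * h + h)
    return params, t * params

# A GRU saves roughly a quarter of the LSTM's parameters and FLOPs:
p_lstm, f_lstm = lstm_cost(e=64, h=64, t=128)
p_gru, f_gru = gru_cost(e=64, h=64, t=128)
```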
Conclusion
In this paper, we proposed a multi-feature fusion network called TFCFN. This network effectively integrates time-domain and frequency-domain features through a cross-attention mechanism, and can thus better adapt to the Gaussian and strongly impulsive noise environments simulated by the GGD. The experimental results show that TFCFN outperforms the compared methods under various GGD shape parameters while maintaining lower complexity. In addition, the experiments verified that the network is robust to new modulation types, other noise models (GMM and \(S \alpha S\)), and noise uncertainty. However, the results at extremely low SNR still leave room for improvement, and TFCFN has certain limitations in multi-antenna scenarios. Future work will further optimize TFCFN to improve its performance and practicality in real communication systems.
Data availability
All data generated or analyzed during this study are included in this article.
References
Abdulsalam, A., Al-shami, S., Al-aghbary, A. & Hamam, H. Performance study of an improved version of li-fi and wi-fi networks. CRJ (2023).
Haykin, S. Cognitive radio: Brain-empowered wireless communications. IEEE J. Sel. Areas Commun. 23, 201–220. https://doi.org/10.1109/JSAC.2004.839380 (2005).
Mitola, J. & Maguire, G. Cognitive radio: Making software radios more personal. IEEE Pers. Commun. 6, 13–18. https://doi.org/10.1109/98.788210 (1999).
Mazhar, T. et al. Quality of service (qos) performance analysis in a traffic engineering model for next-generation wireless sensor networks. Symmetry 15. https://doi.org/10.3390/sym15020513 (2023).
Li, J. et al. Spectrum sensing with non-Gaussian noise over multi-path fading channels towards smart cities with iot. IEEE Access 9, 11194–11202. https://doi.org/10.1109/ACCESS.2021.3051719 (2021).
Middleton, D. Statistical-physical models of man-made radio noise, part I. First-order probability models of the instantaneous amplitude (1974).
Zhao, Y., Zhuang, X. & Ting, S.-J. Gaussian mixture density modeling of non-gaussian source for autoregressive process. IEEE Trans. Signal Process. 43, 894–903. https://doi.org/10.1109/78.376842 (1995).
Corral, C., Emami, S. & Rasor, G. Model of multi-band ofdm interference on broadband qpsk receivers. In Proceedings. (ICASSP ’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.. Vol. 3. iii/629–iii/632. https://doi.org/10.1109/ICASSP.2005.1415788 (2005).
Moghimi, F., Nasri, A. & Schober, R. Adaptive lp norm spectrum sensing for cognitive radio networks. IEEE Trans. Commun. 59, 1934–1945. https://doi.org/10.1109/TCOMM.2011.051311.090588 (2011).
Zhou, Q. & Ma, X. Receiver designs for differential uwb systems with multiple access interference. IEEE Trans. Commun. 62, 126–134. https://doi.org/10.1109/TCOMM.2013.120413.130005 (2014).
Bibalan, M. H. & Amindavar, H. On parameter estimation of symmetric alpha-stable distribution. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 4328–4332. https://doi.org/10.1109/ICASSP.2016.7472494 (2016).
Urkowitz, H. Energy detection of unknown deterministic signals. Proc. IEEE 55, 523–531. https://doi.org/10.1109/PROC.1967.5573 (1967).
Salahdine, F., Ghazi, H. E., Kaabouch, N. & Fihri, W. F. Matched filter detection with dynamic threshold for cognitive radio networks. In 2015 International Conference on Wireless Networks and Mobile Communications (WINCOM). 1–6. https://doi.org/10.1109/WINCOM.2015.7381345 (2015).
Sherbin M., K. & Sindhu, V. Cyclostationary feature detection for spectrum sensing in cognitive radio network. In 2019 International Conference on Intelligent Computing and Control Systems (ICCS). 1250–1254. https://doi.org/10.1109/ICCS45141.2019.9065769 (2019).
Liu, M., Zhao, N., Li, J. & Leung, V. C. M. Spectrum sensing based on maximum generalized correntropy under symmetric alpha stable noise. IEEE Trans. Vehic. Technol. 68, 10262–10266. https://doi.org/10.1109/TVT.2019.2931949 (2019).
Torun, O., Yuksel, S. E., Erdem, E., Imamoglu, N. & Erdem, A. Hyperspectral image denoising via self-modulating convolutional neural networks. Signal Process. 214, 109248. https://doi.org/10.1016/j.sigpro.2023.109248 (2024).
Himeur, Y. et al. Video surveillance using deep transfer learning and deep domain adaptation: Towards better generalization. Eng. Appl. Artif. Intell. 119, 105698. https://doi.org/10.1016/j.engappai.2022.105698 (2023).
Kheddar, H., Himeur, Y., Al-Maadeed, S., Amira, A. & Bensaali, F. Deep transfer learning for automatic speech recognition: Towards better generalization. Knowl.-Based Syst. 277, 110851. https://doi.org/10.1016/j.knosys.2023.110851 (2023).
Kheddar, H., Himeur, Y. & Awad, A. I. Deep transfer learning for intrusion detection in industrial control networks: A comprehensive review. J. Netw. Comput. Appl. 220, 103760. https://doi.org/10.1016/j.jnca.2023.103760 (2023).
Mazhar, T. et al. Electric vehicle charging system in the smart grid using different machine learning methods. Sustainability 15, 2603 (2023).
Gao, J., Yi, X., Zhong, C., Chen, X. & Zhang, Z. Deep learning for spectrum sensing. IEEE Wirel. Commun. Lett. 8, 1727–1730. https://doi.org/10.1109/LWC.2019.2939314 (2019).
Su, Z., Teh, K. C., Razul, S. G. & Kot, A. C. Deep non-cooperative spectrum sensing over rayleigh fading channel. IEEE Trans. Vehic. Technol. 71, 4460–4464. https://doi.org/10.1109/TVT.2021.3138593 (2022).
Chen, Z., Xu, Y.-Q., Wang, H. & Guo, D. Deep stft-cnn for spectrum sensing in cognitive radio. IEEE Commun. Lett. 25, 864–868. https://doi.org/10.1109/LCOMM.2020.3037273 (2021).
Zhen, P., Zhang, B., Chen, Z., Guo, D. & Ma, W. Spectrum sensing method based on wavelet transform and residual network. IEEE Wirel. Commun. Lett. 11, 2517–2521. https://doi.org/10.1109/LWC.2022.3207296 (2022).
Wang, Q. et al. Convlstm-based spectrum sensing at very low snr. IEEE Wirel. Commun. Lett. 12, 967–971. https://doi.org/10.1109/LWC.2023.3254048 (2023).
Mehrabian, A., Sabbaghian, M. & Yanikomeroglu, H. Spectrum sensing for symmetric \(\alpha\)-stable noise model with convolutional neural networks. IEEE Trans. Commun. 69, 5121–5135. https://doi.org/10.1109/TCOMM.2021.3070892 (2021).
Mehrabian, A., Sabbaghian, M. & Yanikomeroglu, H. Cnn-based detector for spectrum sensing with general noise models. IEEE Trans. Wirel. Commun. 22, 1235–1249. https://doi.org/10.1109/TWC.2022.3203732 (2023).
Liu, M., Zhang, X., Chen, Y. & Tan, H. Multi-antenna intelligent spectrum sensing in the presence of non-gaussian interference. Digit. Signal Process. 140, 104135. https://doi.org/10.1016/j.dsp.2023.104135 (2023).
Yuan, N., Li, J. & Sun, B. Global cross-attention network for single-sensor multispectral imaging. In IEEE Transactions on Emerging Topics in Computational Intelligence. 1–13. https://doi.org/10.1109/TETCI.2024.3414950 (2024).
Liu, Y. et al. Sca: Streaming cross-attention alignment for echo cancellation. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 1–5. https://doi.org/10.1109/ICASSP49357.2023.10096417 (2022).
O’shea, T. J. & West, N. Radio machine learning dataset generation with gnu radio. In Proceedings of the GNU Radio Conference. Vol. 1 (2016).
Gao, R., Qi, P. & Zhang, Z. Performance analysis of spectrum sensing schemes based on energy detector in generalized gaussian noise. Signal Process. 181, 107893. https://doi.org/10.1016/j.sigpro.2020.107893 (2021).
Zhang, L., Zheng, S., Qiu, K., Lou, C. & Yang, X. Massnet: Deep-learning-based multiple-antenna spectrum sensing for cognitive-radio-enabled internet of things. IEEE Internet Things J. 11, 14435–14448. https://doi.org/10.1109/JIOT.2023.3343699 (2024).
Chen, Y. Improved energy detector for random signals in gaussian noise. IEEE Trans. Wirel. Commun. 9, 558–563. https://doi.org/10.1109/TWC.2010.5403535 (2010).
Digham, F. F., Alouini, M.-S. & Simon, M. K. On the energy detection of unknown signals over fading channels. IEEE Trans. Commun. 55, 21–24. https://doi.org/10.1109/TCOMM.2006.887483 (2007).
Chatziantoniou, E., Allen, B., Velisavljevic, V., Karadimas, P. & Coon, J. Energy detection based spectrum sensing over two-wave with diffuse power fading channels. IEEE Trans. Vehic. Technol. 66, 868–874. https://doi.org/10.1109/TVT.2016.2556084 (2017).
Chaurasiya, R. B. & Shrestha, R. Hardware-efficient and fast sensing-time maximum-minimum-eigenvalue-based spectrum sensor for cognitive radio network. IEEE Trans. Circuits Syst. I Regul. Pap 66, 4448–4461. https://doi.org/10.1109/TCSI.2019.2921831 (2019).
Hashim, B. T., Ziboon, H. T. & Abdulsatar, S. M. Covariance absolute values spectrum sensing method based on two adaptive thresholds. Indonesian J. Electric. Eng. Comput. Sci. (IJEECS) 30, 1029–1037 (2023).
Benedetto, F., Giunta, G. & Pallotta, L. Cognitive satellite communications spectrum sensing based on higher order moments. IEEE Commun. Lett. 25, 574–578. https://doi.org/10.1109/LCOMM.2020.3029091 (2021).
Ramya, M. & Rajeswari, A. Improved hybrid spectrum sensing technique in cognitive radio communication system. Signal Image Video Process. 18, 4233–4242 (2024).
Brito, A., Sebastião, P. & Velez, F. J. Hybrid matched filter detection spectrum sensing. IEEE Access 9, 165504–165516. https://doi.org/10.1109/ACCESS.2021.3134796 (2021).
Zhang, C., Li, J., Li, B. & Ma, W. Blind matching filtering algorithm for spectrum sensing under multi-path channel environment. Electronics 12. https://doi.org/10.3390/electronics12112499 (2023).
Bala, I., Sharma, A., Tselykh, A. & Kim, B.-G. Throughput optimization of interference limited cognitive radio-based internet of things (cr-iot) network. J. King Saud Univ.-Comput. Inf. Sci. 34, 4233–4243. https://doi.org/10.1016/j.jksuci.2022.05.019 (2022).
An, N. et al. Spectrum sensing for dtmb system: A cnn approach. IEEE Trans. Broadcast. 68, 271–278. https://doi.org/10.1109/TBC.2021.3108055 (2022).
Duan, Y., Huang, F., Xu, L. & Gulliver, T. A. Intelligent spectrum sensing algorithm for cognitive internet of vehicles based on kpca and improved cnn. Peer-to-Peer Netw. Appl. 16, 2202–2217 (2023).
Uvaydov, D., D’Oro, S., Restuccia, F. & Melodia, T. Deepsense: Fast wideband spectrum sensing through real-time in-the-loop deep learning. In IEEE INFOCOM 2021 - IEEE Conference on Computer Communications. 1–10. https://doi.org/10.1109/INFOCOM42981.2021.9488764 (2021).
Mei, R. & Wang, Z. Deep learning-based wideband spectrum sensing: A low computational complexity approach. IEEE Commun. Lett. 27, 2633–2637. https://doi.org/10.1109/LCOMM.2023.3310715 (2023).
Wang, A., Meng, Q. & Wang, M. Spectrum sensing method based on residual dense network and attention. Sensors 23, 7791 (2023).
Balwani, N., Patel, D. K., Soni, B. & López-Benítez, M. Long short-term memory based spectrum sensing scheme for cognitive radio. In IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). 1–6. https://doi.org/10.1109/PIMRC.2019.8904422 (2019).
Soni, B., Patel, D. K. & López-Benítez, M. Long short-term memory based spectrum sensing scheme for cognitive radio using primary activity statistics. IEEE Access 8, 97437–97451. https://doi.org/10.1109/ACCESS.2020.2995633 (2020).
Xing, H. et al. Spectrum sensing in cognitive radio: A deep learning based model. Trans. Emerg. Telecommun. Technol. 33, e4388 (2022).
Su, Z., Teh, K. C., Xie, Y., Razul, S. G. & Kot, A. C. Signal enhancement aided end-to-end deep learning approach for joint denoising and spectrum sensing. IEEE Trans. Vehic. Technol. 73, 4424–4428. https://doi.org/10.1109/TVT.2023.3324826 (2024).
Ni, T. et al. Spectrum sensing via temporal convolutional network. China Communications 18, 37–47, https://doi.org/10.23919/JCC.2021.09.004 (2021).
Zhang, W., Wang, Y., Chen, X., Cai, Z. & Tian, Z. Spectrum transformer: An attention-based wideband spectrum detector. In IEEE Transactions on Wireless Communications. 1–1. https://doi.org/10.1109/TWC.2024.3391515 (2024).
Zhang, W., Wang, Y., Chen, X. & Tian, Z. Spectrum transformer: Wideband spectrum sensing using multi-head self-attention. In 2023 IEEE 24th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC). 101–105. https://doi.org/10.1109/SPAWC53906.2023.10304551 (2023).
Chandra, S. S., Upadhye, A., Saravanan, P., Gurugopinath, S. & Muralishankar, R. Deep neural network architectures for spectrum sensing using signal processing features. In 2021 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER). 129–134. https://doi.org/10.1109/DISCOVER52564.2021.9663583 (2021).
Balwani, N., Patel, D. K., Soni, B. & López-Benítez, M. Long short-term memory based spectrum sensing scheme for cognitive radio. In IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). 1–6. https://doi.org/10.1109/PIMRC.2019.8904422 (2019).
Balwani, N., Patel, D. K., Soni, B. & López-Benítez, M. Long short-term memory based spectrum sensing scheme for cognitive radio. In IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). 1–6. https://doi.org/10.1109/PIMRC.2019.8904422 (2019).
Bkassiny, M. A deep learning-based signal classification approach for spectrum sensing using long short-term memory (lstm) networks. In 2022 6th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE). 667–672. https://doi.org/10.1109/ICITISEE57756.2022.10057728 (2022).
Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014).
Liu, C., Wang, J., Liu, X. & Liang, Y.-C. Deep cm-cnn for spectrum sensing in cognitive radio. IEEE J. Sel. Areas Commun. 37, 2306–2321. https://doi.org/10.1109/JSAC.2019.2933892 (2019).
Liu, M., Zhao, N., Li, J. & Leung, V. C. M. Spectrum sensing based on maximum generalized correntropy under symmetric alpha stable noise. IEEE Trans. Vehic. Technol. 68, 10262–10266. https://doi.org/10.1109/TVT.2019.2931949 (2019).
Acknowledgements
This research was funded by National Key R&D Program of China, grant number 2021YFC3002103, National Key R&D Program of China, grant number 2023YFC3011505, the Natural Science Project of Xinjiang University Scientific Research Program, grant number XJEDU2021Y003 and major special projects in Xinjiang Uygur Autonomous Region (2022A01007-4).
Author information
Contributions
W.G., R.Y., and H.M. provided the research background and questions and designed the experiments. H.X. and Y.Y. conducted the experiments and analyzed the results, and H.X. prepared the original manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xi, H., Guo, W., Yang, Y. et al. Cross-attention mechanism-based spectrum sensing in generalized Gaussian noise. Sci Rep 14, 23261 (2024). https://doi.org/10.1038/s41598-024-74341-4