Improving wind power prediction with advanced temporal and frequency domain processing combined with error correction

Gao, Jinming; Sun, Yixin; Kim, Hankil; Kim, Changsu; Jung, Hoekyung

doi:10.1038/s41598-025-27896-9

Article
Open access
Published: 08 December 2025

Scientific Reports volume 15, Article number: 44300 (2025)

Abstract

Accurate prediction of wind power is crucial for grid scheduling and the integration of renewable energy, given its significant temporal variability and nonlinear characteristics. This study proposed a multi-module integrated model for wind power forecasting based on time–frequency domain analysis, aiming to enhance prediction accuracy and reliability. The model combined several advanced techniques, including Wavelet Convolutions (WTC), Long Short-Term Memory Networks (LSTM), Time Series Lightweight Adaptive Network (TSLANet), Frequency Enhanced Channel Attention Mechanism (FECAM), and Fast Kolmogorov-Arnold Networks (FastKAN). Each module was designed to capture distinct characteristics in wind power data, such as local frequency features, temporal dependencies, global contextual information, frequency-domain features, and complex nonlinear relationships. Through the integration of these modules, the model achieved high-precision predictions in multi-scale and dynamic environments. Additionally, the Least Squares Support Vector Machine (LSSVM) was employed for error correction, further reducing prediction errors. Experimental results showed that the model delivered exceptional performance across various test scenarios, significantly improving the handling of multi-scale, complex nonlinear, and global dependency issues in wind power forecasting, demonstrating considerable application potential.

Introduction

Background and motivation

With the rising global energy demand and increasingly stringent environmental regulations, wind energy, as a clean and renewable resource, has taken a prominent role in the energy strategies of many countries¹. In recent years, global initiatives, supported by policies and technological innovations, have continuously increased wind power installations to reduce dependence on fossil fuels, lower carbon emissions, and mitigate environmental pollution². However, the intermittency and volatility of wind power significantly impact the stable operation of power grids, making accurate and reliable wind power forecasting a critical factor for large-scale grid integration³.

Literature review

Wind power forecasting is inherently challenging due to its high uncertainty and nonlinear characteristics. Fluctuations in wind speed, regional environmental diversity, and climate change result in complex, multi-scale dynamics in wind power data⁴. To address these challenges, various forecasting methods have been proposed, yet improving the accuracy and reliability of wind power prediction remains a key research focus⁵. Existing forecasting approaches can be broadly classified into physical models, statistical models, and machine learning-based methods. However, each type of model has inherent limitations, making it difficult to comprehensively address the complexities of wind power forecasting.

Physical models

Physical models simulate wind turbine operations based on meteorological and geographical inputs⁶. Though grounded in well-established aerodynamic principles and suitable for short-term forecasting, they heavily rely on high-resolution input data. As a result, their accuracy deteriorates when real-time or fine-grained data is unavailable. Moreover, physical models are inherently deterministic and struggle to capture the stochastic and nonlinear nature of wind power time series⁷,⁸.

Statistical models

Statistical models such as ARIMA and SVR⁹ are widely used due to their interpretability and simplicity. However, these models typically assume linearity and stationarity, which are seldom satisfied in wind power data¹⁰. They lack the flexibility to capture abrupt fluctuations, multi-scale variations, and temporal dependencies that extend beyond their short memory horizons¹¹.

Machine learning-based models

Early machine learning models like ANN and SVM introduced nonlinearity into forecasting frameworks, yet they still struggled to model complex temporal structures¹². More recently, deep learning models such as CNNs and LSTMs have gained popularity¹³,¹⁴. CNNs effectively extract spatial or local features but lack the temporal modeling capacity required for time series. LSTMs are more adept at capturing long-term dependencies; however, they typically operate in the time domain and thus cannot effectively model frequency-domain patterns¹⁵. This leads to suboptimal forecasting in cases where spectral characteristics are significant.

Transformer-based architectures such as Informer and TFT have introduced global attention mechanisms to model long-range temporal dependencies more effectively¹⁶,¹⁷,¹⁸. FEDformer further expands upon this by incorporating seasonal-trend decomposition and frequency-domain attention to reduce time-domain noise and highlight periodic patterns¹⁹. SEAformer similarly adopts frequency-domain decomposition to strengthen long-horizon predictions²⁰. While these approaches offer significant advancements, they tend to focus on high-level temporal patterns and often ignore low-frequency drifts or fine-grained local structures. Additionally, attention mechanisms in these models increase computational cost, limiting real-time deployment potential.

Moreover, most existing deep models are single-path and monolithic, meaning they fail to differentiate between local detail learning and global trend extraction. This often leads to a loss of interpretability and reduced adaptability in diverse operational contexts. To address these issues, ensemble and hybrid architectures have gained attention.

Recent literature shows increasing interest in hybrid and multi-module models. For instance, hybrid CNN-LSTM models attempt to combine local spatial and long-term temporal learning²¹, and wavelet decomposition-LSTM frameworks integrate wavelet decomposition to capture multi-resolution frequency features²². Attention modules such as CBAM or squeeze-excitation networks have also been applied to enhance the importance of salient temporal features²³. However, many of these hybrid models remain heuristic in their design, and few systematically explore the frequency characteristics of wind power time series or incorporate residual modeling strategies.

Additionally, studies rarely consider post-processing of prediction errors, such as modeling residuals for correction²⁴. In fact, error sequences often contain structured information (e.g., cyclic deviation, bias), which can be leveraged to improve accuracy. Yet this aspect remains under-explored in most mainstream models.

Research motivation and proposed framework

Although numerous models have achieved notable progress by improving either temporal or spectral representations, few have jointly addressed multi-scale frequency characteristics, long-term temporal dependencies, and nonlinear dynamics within a unified architecture. Moreover, residual learning and post-correction mechanisms remain largely unexplored in the field of wind power forecasting. These limitations restrict existing models from effectively capturing the complex temporal–spectral patterns and nonlinear behaviors inherent in wind power time series.

To overcome the above challenges, this study proposes a hybrid deep learning architecture that systematically models the multi-scale, nonstationary, and nonlinear characteristics of wind power data. Specifically, Wavelet Transform Convolution (WTC) is utilized to extract localized spectral components across multiple frequency bands; Long Short-Term Memory (LSTM) networks capture long-range temporal dependencies; the Time Series Lightweight Adaptive Network (TSLANet) enhances attention efficiency while maintaining global contextual awareness; the Frequency-Enhanced Channel Attention Mechanism (FECAM) emphasizes key frequency-domain features; and an attention mechanism based on FastKAN provides expressive yet compact nonlinear transformations.

Furthermore, a Least Squares Support Vector Machine (LSSVM) is incorporated for residual error correction in the post-prediction stage, which significantly improves stability and accuracy in multi-step forecasting tasks.

The proposed integrated framework addresses several key research gaps by:

(1) capturing multi-scale temporal–frequency patterns that are often overlooked by time-domain models;
(2) modeling both global and local dependencies in a computationally efficient manner;
(3) introducing an interpretable frequency-domain attention mechanism;
(4) achieving enhanced nonlinear representation with reduced model complexity; and
(5) integrating an LSSVM-based residual correction mechanism that improves robustness in multi-step forecasting.

By jointly leveraging time and frequency domain representations and combining deep and shallow learning paradigms, the proposed model achieves superior forecasting accuracy, robustness, and interpretability, providing a comprehensive solution for real-world wind power prediction tasks.

Wind power prediction model structure

Data characterization and exogenous feature analysis

Effective wind power forecasting requires not only advanced modeling techniques but also a deep understanding of the input data’s characteristics. In this study, we utilize a dataset collected from a wind farm that includes six key features measured by sensors mounted on a meteorological mast: wind speed, wind direction, air density, turbulence intensity, wind shear below hub height, and the corresponding power output. These features collectively reflect both the energy potential and operational conditions affecting wind turbines.

To quantitatively evaluate the relative importance and influence patterns of these exogenous variables, we conducted a model-agnostic interpretability analysis using SHAP (SHapley Additive exPlanations)²⁵, with the results shown in Fig. 1. The results reveal that wind speed is the most critical predictor, displaying a strong positive correlation with power output. This aligns with the aerodynamic principle that wind power increases cubically with wind speed. In contrast, turbulence intensity shows a negative contribution to power prediction, consistent with its physical role in inducing unsteady flow conditions that impair turbine efficiency. Moderate yet meaningful contributions were also observed from air density and below-hub-height wind shear, which indirectly reflect changes in atmospheric pressure and vertical wind gradient. Wind direction, however, had a negligible impact on the prediction, likely due to modern wind turbines’ ability to yaw and maintain optimal alignment with prevailing winds.

This data-driven insight serves as a foundation for model component selection in our proposed hybrid architecture. The observed multi-scale variability in wind speed and turbulence intensity justifies the incorporation of WTC for localized frequency-domain feature extraction. The long-term temporal dependencies intrinsic to meteorological sequences support the use of LSTM networks, which excel at capturing sequential memory over extended horizons. To further enhance the model’s efficiency and capture global contextual relationships without incurring prohibitive computational costs, we employ the TSLANet. TSLANet effectively balances attention expressiveness and complexity, allowing the model to focus adaptively on salient temporal segments while maintaining scalability for long sequences.

Given that the relative contribution of each input feature is non-uniform and evolves over time, we integrate the FECAM to emphasize important variables dynamically, especially under shifting atmospheric regimes. Furthermore, the nonlinear and nonstationary relationships observed between features and outputs motivate the use of an attention mechanism based on FastKAN, which provides a compact and flexible approximation of complex functional mappings. Lastly, to refine the final output and address systematic prediction errors, LSSVM is used in a post-processing stage to correct residuals, leveraging potential hidden patterns not captured by the primary forecasting modules.

In summary, this analysis demonstrates that each modeling technique in our architecture was chosen based on empirical evidence derived from data characteristics. By aligning feature behavior with appropriate algorithmic capabilities—spanning time–frequency transformation, temporal memory, attention adaptation, nonlinear mapping, and error correction—we construct a forecasting framework that is both theoretically grounded and practically effective for complex wind power prediction tasks.

Module design rationale

The wind power forecasting model presented in this paper comprises five core modules: WTC, LSTM, TSLANet, FECAM, and an attention mechanism based on FastKAN. The overall process design is illustrated in Fig. 2.

The proposed model is constructed to leverage the complementary strengths of time-domain, frequency-domain, and nonlinear modeling techniques for wind power forecasting. The architecture integrates a sequence of functionally distinct yet synergistic modules to address the multi-scale, nonstationary, and nonlinear characteristics of wind data.

First, the WTC module serves as a learnable wavelet-based encoder, performing multi-resolution decomposition to extract temporal features at different frequency scales. Compared to conventional CNNs, WTC preserves fine-grained time–frequency information while reducing noise through hierarchical filtering and reconstruction.

To capture temporal dependencies, the output of WTC is processed by an LSTM layer followed by TSLANet, a temporal-spectral learning module composed of the Adaptive Spectral Block (ASB) and the Interactive Convolution Block (ICB). ASB performs adaptive frequency-domain filtering based on FFT, enhancing dominant spectral components via learnable complex weighting and data-driven masking. In parallel, ICB enhances time-domain interactions through depth-wise convolutions and feature mixing, complementing ASB’s spectral emphasis. Together, the two blocks allow TSLANet to capture both spectral and temporal dynamics.

To further refine high-frequency representations, we incorporate FECAM, a channel attention module inspired by the Discrete Cosine Transform (DCT). FECAM selectively amplifies informative frequency-aware features across channels, helping the model focus on fluctuation-prone patterns.

Finally, the enhanced representations are passed to FastKAN, a kernel-based nonlinear mapping layer capable of approximating complex functional relationships more efficiently than traditional MLPs. FastKAN enables the model to project rich feature embeddings into the output space with high flexibility and expressiveness.

Each module contributes a distinct capability: WTC for multiscale temporal decomposition, LSTM for long-range temporal dependencies, TSLANet for dynamic temporal-spectral learning, FECAM for high-frequency enhancement, and FastKAN for nonlinear regression. The sequential coupling ensures information from different domains is progressively integrated, enabling accurate and robust forecasting.

Wavelet convolutions

To effectively capture the complex multi-frequency characteristics of wind power data, the model first applies a Wavelet Transform Convolution (WTC) for preprocessing. Leveraging the multi-resolution capabilities of wavelet transform, WTC enables joint analysis in the time and frequency domains, allowing for the extraction of features across different temporal scales²⁶. This improves the quality of representations fed into subsequent modules. Moreover, by extending the receptive field without increasing parameter complexity, WTC enhances the model’s ability to capture both low- and high-frequency components, which is critical for modeling the non-stationary and long-range dependencies inherent in wind power series. The WTC architecture is illustrated in Fig. 3.

The WTC decomposes the input data into distinct frequency components using convolution operations. For a given wind power series $X = [x_{1} ,x_{2} , \ldots ,x_{n} ]$, the wavelet transform convolution can be expressed as follows:

$$Y_{{{\text{WT}}}} = {\text{WTConv}}(X) = [{\text{WT}}(x_{1} ),{\text{WT}}(x_{2} ), \ldots ,{\text{WT}}(x_{n} )]$$

(1)

In this context, ${\text{WT}}(x)$ represents the one-dimensional wavelet transform of the input sequence $x$. Specifically, high-frequency (detail) and low-frequency (trend) information are extracted through high-pass and low-pass filters, respectively, using the following formulas:

$$Y_{{\text{WT, hi}}} = \sum\limits_{i} {x_{i} } \cdot h(i)$$

(2)

$$Y_{{\text{WT, lo}}} = \sum\limits_{i} {x_{i} } \cdot l(i)$$

(3)

Here, $h(i)$ and $l(i)$ are the coefficients for the high-pass and low-pass filters. After obtaining and concatenating the high- and low-frequency components, this combined information is used as input for the next layer.
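As a rough illustration of Eqs. (2) and (3), the sketch below applies one fixed pair of analysis filters and concatenates the resulting bands. It assumes the Haar wavelet purely for brevity and omits the learnable convolutions that WTC applies on top of the sub-bands; the names `haar_step` and `wavelet_features` are hypothetical and not taken from the original implementation.

```python
import numpy as np

# Haar analysis filters (assumption: chosen for brevity; other wavelets use longer filters).
L_FILTER = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass l(i), extracts the trend
H_FILTER = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass h(i), extracts the detail


def haar_step(x):
    """One decomposition level: Eqs. (2)-(3) applied to non-overlapping pairs."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # repeat the last sample if the length is odd
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)             # stride-2 windows [x_{2j}, x_{2j+1}]
    y_lo = pairs @ L_FILTER              # sum_i x_i * l(i)
    y_hi = pairs @ H_FILTER              # sum_i x_i * h(i)
    return y_lo, y_hi


def wavelet_features(x, levels=2):
    """Concatenate the final trend band with the detail bands of every level."""
    details = []
    lo = np.asarray(x, dtype=float)
    for _ in range(levels):
        lo, hi = haar_step(lo)           # only the trend is decomposed further
        details.append(hi)
    return np.concatenate([lo] + details[::-1])


x = np.arange(8, dtype=float)            # toy ramp standing in for a power series
features = wavelet_features(x, levels=2)
```

Because the Haar filters are orthonormal, the concatenated coefficients carry the same energy as the input for even-length signals, which is a convenient sanity check on the filter bank.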

After processing by the WTC, the wind data is decomposed into distinct frequency components, enriching the input data for the subsequent LSTM. With temporal patterns separated into different scales, the LSTM can effectively learn temporal dependencies without needing to isolate high- and low-frequency components. Additionally, WTC compresses the data into fewer, more relevant features, enabling the LSTM to focus on capturing temporal relationships with fewer parameters. This approach helps prevent overfitting, particularly in the presence of noisy or non-stationary wind power data.

Long short-term memory networks

The LSTM manages long-term dependencies using memory cells and gating mechanisms²⁷, as shown in its structure in Fig. 4.

For the input sequence $Y_{{{\text{WT}}}}$, the LSTM update rules are as follows:

$$\begin{gathered} f_{t} = \sigma (W_{f} \cdot [h_{t - 1} ,Y_{{{\text{WT}},t}} ] + b_{f} ) \hfill \\ i_{t} = \sigma (W_{i} \cdot [h_{t - 1} ,Y_{{{\text{WT}},t}} ] + b_{i} ) \hfill \\ o_{t} = \sigma (W_{o} \cdot [h_{t - 1} ,Y_{{{\text{WT}},t}} ] + b_{o} ) \hfill \\ \tilde{C}_{t} = \tanh (W_{C} \cdot [h_{t - 1} ,Y_{{{\text{WT}},t}} ] + b_{C} ) \hfill \\ C_{t} = f_{t} \cdot C_{t - 1} + i_{t} \cdot \tilde{C}_{t} \hfill \\ h_{t} = o_{t} \cdot \tanh (C_{t} ) \hfill \\ \end{gathered}$$

(4)

Here, $f_{t}$, $i_{t}$, and $o_{t}$ represent the forget, input, and output gates, respectively; $\tilde{C}_{t}$ is the candidate cell state, $C_{t}$ is the cell state, and $h_{t}$ is the hidden state, which serves as the output.
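The update rules in Eq. (4) can be written out for a single time step as follows. This is a minimal NumPy sketch with randomly initialised weights and toy dimensions, intended only to make the gate arithmetic concrete; the dictionary layout of `W` and `b` is an illustrative assumption.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(y_t, h_prev, c_prev, W, b):
    """One LSTM step following Eq. (4).

    W[g] has shape (hidden, hidden + input) and b[g] has shape (hidden,)
    for each gate g in {"f", "i", "o", "C"}.
    """
    z = np.concatenate([h_prev, y_t])        # [h_{t-1}, Y_{WT,t}]
    f = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i = sigmoid(W["i"] @ z + b["i"])         # input gate
    o = sigmoid(W["o"] @ z + b["o"])         # output gate
    c_tilde = np.tanh(W["C"] @ z + b["C"])   # candidate cell state
    c = f * c_prev + i * c_tilde             # new cell state
    h = o * np.tanh(c)                       # new hidden state
    return h, c


rng = np.random.default_rng(0)
n_in, n_hid = 4, 3                           # toy sizes, not the paper's settings
W = {g: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for g in "fioC"}
b = {g: np.zeros(n_hid) for g in "fioC"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for y_t in rng.normal(size=(5, n_in)):       # five time steps of stand-in WTC features
    h, c = lstm_step(y_t, h, c, W, b)
```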

By feeding the frequency information $Y_{{{\text{WT}}}}$, derived from wavelet decomposition, into the LSTM, the model can capture both short- and long-term dependencies across different frequencies. Combined with the LSTM’s ability to handle extended temporal dependencies, this approach allows the model to better manage irregular patterns and abrupt changes in wind power generation. This improves prediction accuracy and enhances model robustness, which is essential for practical wind power generation applications.

Time series lightweight adaptive network

To enhance the extraction of complex information from frequency domain signals, this paper introduces the TSLANet module²⁸. The core components of TSLANet are the Adaptive Spectral Block (ASB) and the Interactive Convolution Block (ICB). The frequency-adaptive filtering in ASB, along with the feature interaction mechanism in ICB, significantly enhances the accuracy of wind power predictions. The TSLANet structure is shown in Fig. 5.

The ASB module processes the input sequence in the frequency domain using the Fourier transform and applies an adaptive frequency mask to filter out noise, retaining only the frequency components valuable for prediction. Its primary function is to adaptively reduce high-frequency noise while preserving essential frequency information, thereby improving the model’s capacity to capture complex wind speed variations. ASB effectively addresses frequency fluctuations, periodic patterns, and sudden changes in wind speed, enhancing the model’s adaptability across different time scales.

For the input time series $X \in \mathbb{R}^{B \times N \times C}$, we first apply the Fast Fourier Transform (FFT) to obtain its frequency domain representation:

$$X_{{{\text{FFT}}}} = {\text{FFT}}(X)$$

(5)

Next, an adaptive filter is built to isolate the high-frequency components. To construct it, the frequency domain energy is first calculated as follows:

$$E = \sum\limits_{i} | X_{{{\text{FFT}},i}} |^{2}$$

(6)

The energy is normalized and compared with a threshold θ to generate an adaptive mask:

$${\text{Mask}} = E> \theta$$

(7)

This mask is then applied to the frequency domain representation:

$$X_{{{\text{Filtered}}}} = X_{{{\text{FFT}}}} \times {\text{Mask}}$$

(8)

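Next, the masked frequency domain data is weighted, where $W_{{{\text{high}}}}$ and $W$ are learnable complex weights applied to the filtered spectrum and the full spectrum, respectively: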

$$X_{{{\text{Weighted}}}} = X_{{{\text{Filtered}}}} \times W_{{{\text{high}}}} + X_{{{\text{FFT}}}} \times W$$

(9)

Finally, the data is transformed back to the time domain using the Inverse Fast Fourier Transform (IFFT):

$$X_{{{\text{ASB}}}} = {\text{IFFT}}(X_{{{\text{Weighted}}}} )$$

(10)
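Equations (5) to (10) can be sketched in a few lines of NumPy. The sketch makes several assumptions that are not stated in the text: the energy in Eq. (6) is summed over channels for each frequency bin, the normalisation divides by the per-sample median energy, and the weights are fixed arrays rather than trained parameters. The function name `adaptive_spectral_block` is hypothetical.

```python
import numpy as np


def adaptive_spectral_block(x, w_global, w_high, theta):
    """Frequency-domain filtering sketch.

    x        : real array of shape (B, N, C)
    w_global : complex weights of shape (C,), plays the role of W in Eq. (9)
    w_high   : complex weights of shape (C,), plays the role of W_high in Eq. (9)
    theta    : scalar threshold for the adaptive mask
    """
    n = x.shape[1]
    x_fft = np.fft.rfft(x, axis=1, norm="ortho")             # Eq. (5): (B, F, C)
    energy = (np.abs(x_fft) ** 2).sum(axis=-1)               # Eq. (6): (B, F)
    # Assumed normalisation: divide by the median energy of each sample.
    median = np.median(energy, axis=1, keepdims=True)
    energy_norm = energy / (median + 1e-6)
    mask = (energy_norm > theta).astype(float)[..., None]    # Eq. (7): (B, F, 1)
    x_filtered = x_fft * mask                                # Eq. (8)
    x_weighted = x_filtered * w_high + x_fft * w_global      # Eq. (9)
    return np.fft.irfft(x_weighted, n=n, axis=1, norm="ortho")  # Eq. (10)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 16, 3))                              # toy batch
ones = np.ones(3, dtype=complex)
zeros = np.zeros(3, dtype=complex)

# With W = 1 and W_high = 0 the block reduces to FFT followed by IFFT.
x_identity = adaptive_spectral_block(x, ones, zeros, theta=0.3)
# With W = 0 and an unreachable threshold every bin is masked out.
x_empty = adaptive_spectral_block(x, zeros, ones, theta=np.inf)
```

The two limiting cases at the end are useful checks: the first recovers the input, and the second returns zeros because no frequency bin passes the mask.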

The ICB module leverages multi-scale convolution operations to extract features from time series data, effectively capturing both local details and global trends in wind speed variations. By integrating these multi-scale features through interaction mechanisms, the ICB enables the model to learn feature interactions across different time scales, enhancing its ability to predict wind power generation.

Within the ICB module, the input time series undergoes convolutions of varying sizes, capturing both short-term dependencies and longer-range patterns. This multi-scale approach allows the model to extract a comprehensive range of features for improved accuracy in wind power forecasting.

$$\begin{gathered} X_{1} = {\text{Conv}}1(X) \hfill \\ X_{2} = {\text{Conv}}2(X) \hfill \\ \end{gathered}$$

(11)

After applying the activation function and dropout, the feature interaction is as follows:

$$\begin{gathered} {\text{Out}}_{1} = X_{1} \times {\text{Drop}}({\text{Act}}(X_{2} )) \hfill \\ {\text{Out}}_{2} = X_{2} \times {\text{Drop}}({\text{Act}}(X_{1} )) \hfill \\ \end{gathered}$$

(12)

The final output is then:

$$X_{{{\text{ICB}}}} = {\text{Conv3}}({\text{Out}}_{1} + {\text{Out}}_{2} )$$

(13)
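The cross-gated interaction in Eqs. (11) to (13) is sketched below for a single sequence. The kernel sizes (1 and 3) and the GELU activation follow the configuration reported later in the paper; the zero "same" padding, the tanh approximation of GELU, and the helper names are assumptions made for illustration.

```python
import numpy as np


def gelu(z):
    # tanh approximation of GELU
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z ** 3)))


def conv1d_same(x, w):
    """1-D convolution along time with zero padding.

    x: (N, C_in); w: (K, C_in, C_out) with odd K. Returns (N, C_out).
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    n = x.shape[0]
    return sum(xp[j:j + n] @ w[j] for j in range(k))


def icb(x, w1, w2, w3, drop_rate=0.0, rng=None):
    """Interactive convolution sketch: two branches that gate each other."""
    def drop(a):
        if drop_rate == 0.0 or rng is None:      # inference mode: no dropout
            return a
        keep = rng.random(a.shape) >= drop_rate
        return a * keep / (1.0 - drop_rate)

    x1 = conv1d_same(x, w1)                      # Eq. (11), kernel size 1
    x2 = conv1d_same(x, w2)                      # Eq. (11), kernel size 3
    out1 = x1 * drop(gelu(x2))                   # Eq. (12)
    out2 = x2 * drop(gelu(x1))                   # Eq. (12)
    return conv1d_same(out1 + out2, w3)          # Eq. (13)


rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))                     # 10 time steps, 4 channels (toy sizes)
w1 = rng.normal(scale=0.3, size=(1, 4, 8))
w2 = rng.normal(scale=0.3, size=(3, 4, 8))
w3 = rng.normal(scale=0.3, size=(1, 8, 4))
x_icb = icb(x, w1, w2, w3)
```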

The TSLANet model, which combines the ICB and ASB modules, integrates both temporal and frequency domain features. This design enables it to capture wind speed variations across different scales and their effects on power generation, significantly enhancing the model’s generalization capability.

Frequency enhanced channel attention mechanism

The complex frequency information resulting from multiple transformations can significantly impact predictions. To address this, we introduce the FECAM module²⁹. FECAM improves time series predictions by combining frequency domain features with channel attention mechanisms, effectively managing time series data with intricate frequency components. The FECAM structure is shown in Fig. 6.

FECAM uses the Discrete Cosine Transform (DCT) to extract frequency information from the input data, applying channel-level weighting based on this information to enhance the model’s sensitivity. This approach enables the model to accurately capture key fluctuations, as well as high- and low-frequency components that most impact power output, ultimately improving prediction precision.

DCT effectively decomposes frequency information in time series data, allowing the model to directly utilize these features. Unlike some other transforms, DCT does not introduce the Gibbs phenomenon (high-frequency noise) when processing non-periodic signals, making it well-suited for wind power data, which often lacks strict periodicity.

For each channel in the input multivariate wind power series, FECAM first applies DCT, as represented by the following formula:

$$X_{{{\text{DCT}}}} (k) = {\text{DCT}}(X)_{k} = \sum\limits_{i = 0}^{L - 1} {x_{i} } \cos \left( {\frac{\pi }{L}(i + 0.5)k} \right),\quad k = 0,1, \ldots ,L - 1$$

(14)

Here, X represents the input sequence with samples $x_{i}$, L is the sequence length, and k indexes the frequency components; the DCT converts the time-domain signal into the frequency-domain signal. The frequency information extracted through DCT effectively represents the periodic characteristics and short-term fluctuations in wind speed variations, helping the model identify the features that most significantly contribute to wind power generation.

After the DCT transformation, the resulting frequency domain feature $X_{{{\text{Freq}}}}$ is used to construct channel-level attention weights, as shown in the formula below:

$${\text{Attn}} = \sigma (W_{2} \delta (W_{1} X_{{{\text{Freq}}}} ))$$

(15)

In this context, W₁ and W₂ represent the weights of the fully connected layers, σ is the sigmoid activation function, and δ is the ReLU activation function. The generated attention weights Attn are used to scale the original input, enhancing the emphasis on key frequency components.

This mechanism learns the weight distribution across each channel, highlighting the frequencies and channels that contribute most to the prediction. By combining these attention weights with the original features in a weighted manner, it allows the model to better capture the most relevant information for accurate forecasting.

The final output is then:

$$X_{{{\text{FECAM}}}} = X \times {\text{Attn}}$$

(16)
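A compact sketch of Eqs. (14) to (16) follows. It assumes that the DCT is taken along the time axis of each channel and that the two fully connected layers act on the frequency axis, so the attention has one weight per frequency and channel; this layout, the omission of normalisation and dropout, and the names `dct_ii` and `fecam` are illustrative choices rather than details confirmed by the text.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dct_ii(x):
    """Unnormalised DCT-II along axis 0, as in Eq. (14). x: (L, C)."""
    L = x.shape[0]
    i = np.arange(L)                                # time index
    k = np.arange(L)[:, None]                       # frequency index
    basis = np.cos(np.pi / L * (i + 0.5) * k)       # (L, L): rows are frequencies
    return basis @ x


def fecam(x, w1, w2):
    """Frequency-enhanced channel attention sketch.

    x: (L, C); w1: (H, L); w2: (L, H). Returns the rescaled input, shape (L, C).
    """
    x_freq = dct_ii(x)                                     # Eq. (14)
    attn = sigmoid(w2 @ np.maximum(w1 @ x_freq, 0.0))      # Eq. (15), ReLU then sigmoid
    return x * attn                                        # Eq. (16)


rng = np.random.default_rng(0)
L, C, H = 8, 3, 16                                  # toy sizes
x = rng.normal(size=(L, C))
w1 = rng.normal(scale=0.1, size=(H, L))
w2 = rng.normal(scale=0.1, size=(L, H))
x_fecam = fecam(x, w1, w2)
```

Since the sigmoid output lies in (0, 1), the block can only attenuate or preserve each entry of the input, never amplify it beyond its original magnitude.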

This mechanism scales features across different channels, increasing the weight of important channels while effectively reducing high-frequency noise, such as instantaneous wind speed fluctuations. It amplifies the most valuable frequency domain information, thereby enhancing the model’s predictive capability.

Fast Kolmogorov-Arnold network attention

To model nonlinear dependencies while balancing global and local representations, the model incorporates a multi-head attention mechanism built upon the FastKAN layer³⁰, as illustrated in Fig. 7. This design integrates FastKAN’s nonlinear transformation capability into the attention framework, enabling simultaneous learning of temporal and spectral features. FastKAN, an efficient variant of the Kolmogorov-Arnold Network (KAN)³¹, replaces traditional B-splines with radial basis functions (RBFs) to approximate complex mappings, significantly improving computational efficiency. Each layer applies learnable RBFs to perform nonlinear transformations, enhancing the model’s capacity to represent intricate dynamics in wind power data.

In this framework, FastKAN is applied to the queries, keys, and values, extracting local features through radial basis functions. A radial basis function measures the distance between the input x and a reference grid point, smoothing this distance using a Gaussian function. The specific formula is:

$${\text{RBF}}_{{{\text{query}}}} = \exp \left( { - \left( {\frac{{q - {\text{grid}}}}{{{\text{denominator}}}}} \right)^{2} } \right)$$

(17)

In this context, q represents the query feature, and similar operations are applied to the keys and values. The grid is created by discretizing a specified range, while the denominator controls the smoothness of the radial basis function.
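The basis expansion in Eq. (17) can be illustrated as below. The grid range, the number of grid points, and the default denominator (set to the grid spacing) are assumptions chosen for the example; the text does not report the values used.

```python
import numpy as np


def rbf_features(q, grid_min=-2.0, grid_max=2.0, num_grids=8, denominator=None):
    """Gaussian RBF expansion of Eq. (17).

    Each scalar in q is expanded into num_grids activations, one per grid point.
    """
    grid = np.linspace(grid_min, grid_max, num_grids)       # discretised range
    if denominator is None:
        # Assumed default: the spacing between neighbouring grid points.
        denominator = (grid_max - grid_min) / (num_grids - 1)
    q = np.asarray(q, dtype=float)
    return np.exp(-(((q[..., None] - grid) / denominator) ** 2))


q = np.array([-2.0, 0.1, 1.5])                              # toy query features
phi = rbf_features(q)                                       # shape (3, 8)
```

A larger denominator widens each Gaussian, so neighbouring grid points respond to the same input and the resulting mapping becomes smoother.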

The similarity weights between the queries and the keys are then computed and applied to the values:

$${\text{Att}}_{{{\text{output}}}} = {\text{softmax}}\left( {\frac{{W_{q} \cdot W_{k}^{T} }}{{\sqrt {d_{k} } }}} \right) \cdot W_{v}$$

(18)

Here, W_q, W_k, and W_v are the query, key, and value representations transformed by the FastKAN layer, and $d_{k}$ is the dimensionality of these vectors.

The final output is generated by combining the attention-weighted results with gated modulation:

$$O_{{{\text{gated}}}} = \sigma (W_{g} \cdot q) \cdot {\text{Att}}_{{{\text{output}}}}$$

(19)

In this context, $W_{g}$ controls the gating, and $O_{{{\text{gated}}}}$ represents the final gated output.

The final prediction result is:

$$\hat{P}(t) = W_{{{\text{out}}}} O_{{{\text{gated}}}} + b$$

(20)

In wind power forecasting, integrating the attention mechanism with FastKAN allows the model to focus more effectively on key features, such as abrupt changes in wind speed. The FastKAN layer enhances the accuracy of feature transformations, enabling the model to capture the relationships between input features more precisely. Additionally, the gating mechanism further improves the model’s adaptability by dynamically adjusting outputs based on input patterns, enabling it to optimize prediction performance adaptively.
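Equations (18) to (20) can be combined into the following single-head sketch. It treats the FastKAN-transformed queries, keys, and values as given arrays (`Q`, `K`, `V` stand in for $W_{q}$, $W_{k}$, and $W_{v}$), uses one head instead of several, and assumes the gate is a linear map of the raw query followed by a sigmoid; the shapes and names are illustrative.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)      # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_attention(Q, K, V, q, W_g, W_out, b):
    """Scaled dot-product attention with sigmoid gating and a linear head.

    Q, K, V : (T, d) transformed queries, keys, values
    q       : (T, d_in) raw query features used for the gate
    W_g     : (d_in, d) gate weights; W_out: (d, 1) output weights; b: scalar bias
    """
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))    # (T, T) similarity weights
    att = weights @ V                            # Eq. (18)
    gated = sigmoid(q @ W_g) * att               # Eq. (19)
    return gated @ W_out + b                     # Eq. (20): (T, 1)


rng = np.random.default_rng(0)
T, d, d_in = 5, 4, 6                             # toy sizes
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))
q = rng.normal(size=(T, d_in))
W_g = rng.normal(scale=0.1, size=(d_in, d))
W_out = rng.normal(scale=0.1, size=(d, 1))
p_hat = gated_attention(Q, K, V, q, W_g, W_out, b=0.0)
```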

LSSVM error correction

To further enhance prediction accuracy, this paper employs LSSVM for error correction. LSSVM demonstrates strong generalization ability and effectively captures the nonlinear error characteristics in wind power forecasts³², enabling accurate correction of initial prediction results. The LSSVM structure is shown in Fig. 8.

By using the least squares method instead of the traditional quadratic programming approach in SVM, LSSVM significantly reduces computational complexity, making it well-suited for error correction tasks involving large-scale data. In this study, we utilize the LSSVM model by taking the residuals—differences between preliminary predicted values and actual wind power values—as inputs. By learning the relationship between these residuals and the input features, LSSVM corrects errors, thereby improving overall prediction accuracy.

Given the preliminary forecast series for wind power $\hat{P}(t)$, the error between this forecast and the actual wind power $P(t)$ is defined as:

$$e(t) = P(t) - \hat{P}(t)$$

(21)

The LSSVM model constructs an error correction model by learning the error $e(t)$ in the time series. The final corrected wind power forecast $P_{corr} (t)$ is expressed as:

$$P_{corr} (t) = \hat{P}(t) + e_{pred} (t)$$

(22)

Here, $e_{pred} (t)$ represents the LSSVM model’s prediction of the error $e(t)$.
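The correction step in Eqs. (21) and (22) is sketched below with a small LSSVM regressor, which replaces the quadratic program of a standard SVM with one linear system. The RBF kernel, the values of `gamma` and `sigma`, and the synthetic data (a forecast with a deliberately structured bias) are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix between rows of a (n_a, d) and b (n_b, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def lssvm_fit(x, y, gamma=100.0, sigma=1.0):
    """Solve the LSSVM linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                        # bias b, dual weights alpha


def lssvm_predict(x_new, x_train, b, alpha, sigma=1.0):
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


# Synthetic stand-ins: x plays the role of the input features.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(200, 1))
p_true = np.sin(x[:, 0])                                     # "actual" power
p_hat = p_true - 0.3 * np.cos(2.0 * x[:, 0]) + rng.normal(scale=0.02, size=200)
e = p_true - p_hat                                           # Eq. (21)

b, alpha = lssvm_fit(x[:150], e[:150])                       # learn e(t) from features
e_pred = lssvm_predict(x[150:], x[:150], b, alpha)
p_corr = p_hat[150:] + e_pred                                # Eq. (22)

rmse_raw = np.sqrt(np.mean((p_true[150:] - p_hat[150:]) ** 2))
rmse_corr = np.sqrt(np.mean((p_true[150:] - p_corr) ** 2))
```

In this toy setting the residual is a smooth function of the feature, so the held-out error drops after correction; with real forecasts the gain depends on how much structure the residuals actually contain.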

Case analysis and verification

Data sources

The effectiveness of the proposed model in wind power output forecasting was evaluated on a real-world dataset collected from a wind turbine unit in the United States. The dataset spans from 00:00 on January 1, 2023, to 23:50 on August 1, 2023, with 10-min intervals, totaling 30,672 data points. The data were split into 80% for training and 20% for testing. The input sequence length was set to 144, and prediction horizons of 1, 4, and 8 steps were used under a rolling prediction strategy to forecast future wind power outputs.
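The sample count and the window construction described above can be checked with a short sketch. It assumes a chronological 80/20 split, a window stride of one step, and direct multi-step targets taken from the last column of the array; these are one plausible reading of the rolling strategy, and `make_windows` is a hypothetical helper.

```python
from datetime import datetime, timedelta

import numpy as np

# Number of 10-minute samples between the first and last timestamps (inclusive).
start = datetime(2023, 1, 1, 0, 0)
end = datetime(2023, 8, 1, 23, 50)
n_points = (end - start) // timedelta(minutes=10) + 1
n_train = int(0.8 * n_points)                   # assumed chronological split


def make_windows(series, seq_len=144, horizon=1):
    """Slice a (T, F) array into inputs (n, seq_len, F) and targets (n, horizon).

    The target is assumed to be the last column of `series`.
    """
    n = series.shape[0] - seq_len - horizon + 1
    starts = np.arange(n)[:, None]
    x_idx = starts + np.arange(seq_len)[None, :]
    y_idx = starts + seq_len + np.arange(horizon)[None, :]
    return series[x_idx], series[y_idx, -1]


# Small synthetic array so the example stays light on memory.
series = np.arange(300 * 6, dtype=float).reshape(300, 6)
X, y = make_windows(series, seq_len=144, horizon=4)
```

With an input length of 144, each window covers exactly one day of 10-minute readings.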

Data preprocessing

Due to the differences in magnitudes between wind power output and other influencing factors, directly inputting the raw data could degrade the model’s performance and generalization capability. To address this, a standardization method was applied to scale all features of the raw sample data to the same range. The calculation is defined as follows:

$$x_{std} = \frac{x - \mu }{\sigma }$$

(23)

where $x_{std}$ is the standardized value of a feature, x is its raw value, μ is the mean of the feature, and σ is the standard deviation of the feature.

For the wind direction feature, whose original range is [0,360] degrees, a sine function was employed for normalization based on its physical properties. This approach mapped the wind direction data to the range [–1,1], preserving its periodicity and directionality while eliminating boundary discontinuities. Such preprocessing makes the data more suitable for neural network models. The calculation is given as follows:

$$x^{*} = \sin \left( {\frac{x}{180}\pi } \right)$$

(24)

where $x^{*}$ is the normalized wind direction value.
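Equations (23) and (24) translate directly into code. The sketch below assumes that μ and σ are estimated on the training portion only and then reused for the test portion, which the text does not specify; the function names are hypothetical.

```python
import numpy as np


def standardize(train, test):
    """Z-score scaling of Eq. (23), with statistics taken from the training data."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma = np.where(sigma == 0.0, 1.0, sigma)   # guard against constant features
    return (train - mu) / sigma, (test - mu) / sigma


def encode_wind_direction(degrees):
    """Sine mapping of Eq. (24): degrees in [0, 360] to values in [-1, 1]."""
    return np.sin(np.asarray(degrees, dtype=float) / 180.0 * np.pi)


rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))   # stand-in features
test = rng.normal(loc=5.0, scale=2.0, size=(25, 3))
train_std, test_std = standardize(train, test)

direction = encode_wind_direction([0.0, 90.0, 270.0, 360.0])
```

The mapping sends 0° and 360° to the same value, which removes the jump at the boundary. Note that the sine alone also maps 0° and 180° to the same value, so the encoding is not one-to-one.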

Evaluation metrics

To comprehensively assess the predictive performance of the proposed model, four evaluation metrics were employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Symmetric Mean Absolute Percentage Error (sMAPE), and the Coefficient of Determination (R²). To enhance interpretability and maintain consistency with the other evaluation metrics, R² is reported in percentage form³³.
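For reference, the four metrics can be computed as in the sketch below. The sMAPE variant shown here (a factor of 2 in the numerator and the sum of absolute values in the denominator, scaled to percent) is one common definition and is an assumption, since the text does not give the formula; R² is scaled to percent as described above.

```python
import numpy as np


def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))


def smape(y, y_hat, eps=1e-8):
    """Symmetric MAPE in percent; eps avoids division by zero when both values are 0."""
    return float(100.0 * np.mean(2.0 * np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat) + eps)))


def r2_percent(y, y_hat):
    """Coefficient of determination, reported in percent."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(100.0 * (1.0 - ss_res / ss_tot))


y = np.array([1.0, 2.0, 3.0, 4.0])        # toy observations
y_hat = np.array([1.0, 2.0, 3.0, 5.0])    # toy forecasts with one error of size 1
scores = {
    "RMSE": rmse(y, y_hat),
    "MAE": mae(y, y_hat),
    "sMAPE": smape(y, y_hat),
    "R2": r2_percent(y, y_hat),
}
```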

Additionally, multiple baseline models were selected for comparison to validate the model’s performance. Forecasting experiments were conducted for 1-step (10 min), 4-step (40 min), and 8-step (1 h 20 min) prediction horizons. The comparison models are summarized in Table 1.

Table 1 Comparison model abbreviation table.


Model configuration and hyperparameter settings

The proposed hybrid model integrates wavelet-based convolutional feature extraction, sequential modeling via LSTM, spectral attention, and kernel-based nonlinear mapping. The model input dimension is set to 19, corresponding to the number of observed meteorological and power-related features. The LSTM module is configured with 2 layers and a hidden size of 128, which provides a balance between model expressiveness and training efficiency. To enhance temporal and spectral representation, a two-level wavelet decomposition is applied using Daubechies-4 basis functions within the WTC module. The convolutional kernel size is fixed at 5 to ensure adequate temporal locality while maintaining computational feasibility.

For frequency domain modeling, the ASB includes a learnable threshold parameter initialized to 0.3, which allows the model to adaptively emphasize high-energy spectral components during training. The ICB contains two convolution branches (1 × 1 and 3 × 1) and incorporates a dropout layer with a rate of 0.2 to mitigate overfitting risks. The final prediction head is constructed using a two-layer FastKAN module with structure [128, 1], which enhances nonlinear approximation ability while preserving training stability.

The model is trained using the Adam optimizer with an initial learning rate of 1 × 10⁻³ and a batch size of 32. All weight parameters are initialized using a truncated normal distribution where applicable, and activation functions throughout the model include GELU (in ICB) and tanh (implicitly in FastKAN basis functions). Hyperparameters are selected based on empirical validation performance and domain-specific modeling principles, rather than grid or random search, in consideration of computational constraints.
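For reference, the settings reported in this subsection can be collected in one configuration object. This is only an organizational sketch: the class name, the field names, and the string labels `"db4"` and `"adam"` are hypothetical, and only the values come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    # Input and forecasting setup
    input_dim: int = 19            # meteorological and power-related features
    input_len: int = 144           # input sequence length (time steps)
    horizons: tuple = (1, 4, 8)    # prediction steps
    # LSTM module
    lstm_layers: int = 2
    lstm_hidden_size: int = 128
    # WTC module
    wavelet_name: str = "db4"      # label assumed for the Daubechies-4 basis
    wavelet_levels: int = 2
    conv_kernel_size: int = 5
    # TSLANet blocks
    asb_threshold_init: float = 0.3
    icb_kernel_sizes: tuple = (1, 3)
    icb_dropout: float = 0.2
    # FastKAN prediction head
    fastkan_structure: tuple = (128, 1)
    # Training
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    batch_size: int = 32

config = ModelConfig()
```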

Experimental analysis

As illustrated in Fig. 9, we compared the prediction results of different models. Figure 9 (a), (b) and (c) represent the 1-step, 4-step, and 8-step prediction curves, respectively. In the single-step prediction, all models perform well. However, as the prediction horizon increases, the baseline models gradually deviate from the actual value curve, while the proposed model continues to deliver satisfactory results.

Table 2 presents the prediction accuracy of different models for wind turbine power output, highlighting the best results for each prediction horizon. The proposed model consistently achieves the highest prediction accuracy. The RMSE and MAE values for all models increase with longer prediction steps, indicating higher errors as the forecast horizon extends. This phenomenon arises because capturing dependencies between distant time points becomes increasingly challenging. However, the proposed model consistently outperforms the baseline models, demonstrating its superior feature extraction and prediction capabilities across different time steps. For the 1-step prediction task, the proposed model reduces RMSE and MAE by up to 47.28% and 41.67%, respectively, compared to the baseline models. In the 4-step prediction task, the relative reductions in RMSE and MAE are 38.38% and 37.47%. In the 8-step prediction task, the relative reductions in RMSE and MAE are 45.17% and 42.95%.
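The relative reductions quoted above are presumably computed as the error decrease divided by the baseline error. A minimal sketch under that assumption, with illustrative numbers that are not taken from Table 2:

```python
def relative_reduction(baseline_error, proposed_error):
    """Percentage reduction of an error metric relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Illustrative values only: a baseline RMSE of 10.0 against a proposed RMSE of 6.0.
example = relative_reduction(10.0, 6.0)
```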

Table 2 Comparison of model evaluation results.


The exceptional performance of the proposed model in multi-step forecasting is primarily attributed to its ability to accurately capture the complex dynamic characteristics of wind power. Specifically, the WTC extracts both high-frequency and low-frequency components to capture short-term fluctuations and long-term trends. The frequency-domain decomposition method in the ASB adaptively enhances critical frequency features. Simultaneously, the ICB extracts multi-scale convolutional features, achieving deep integration of local and global patterns. Additionally, the FECAM improves the focus on key channel features, while the FastKAN-based nonlinear mapping effectively captures the complex nonlinear relationship between wind speed and power output.

Error correction

Although the proposed model demonstrates strong prediction performance, it may still exhibit systematic bias and fail to capture certain complex nonlinear relationships or multi-scale features, which can adversely affect forecasting accuracy. To address these issues, the Least Squares Support Vector Machine (LSSVM) model was employed to correct prediction errors. This approach aims to reduce systematic bias, mitigate the impact of random errors, and supplement the features missed by the original model, ultimately improving prediction accuracy.
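The residual-correction idea can be sketched with a small LSSVM regressor written directly in NumPy. This is a hedged illustration rather than the authors' implementation: the RBF kernel, the values of `gamma` and `width`, the use of the base prediction as the only LSSVM input, and the synthetic data are all assumptions, and the class name `LSSVMRegressor` is hypothetical. The fit solves the standard LSSVM linear system for the bias and the dual coefficients.

```python
import numpy as np

def rbf_kernel(A, B, width):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

class LSSVMRegressor:
    def __init__(self, gamma=100.0, width=1.0):
        self.gamma = gamma   # regularization strength (assumed value)
        self.width = width   # RBF kernel width (assumed value)

    def fit(self, X, y):
        n = X.shape[0]
        # Linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
        system = np.zeros((n + 1, n + 1))
        system[0, 1:] = 1.0
        system[1:, 0] = 1.0
        system[1:, 1:] = rbf_kernel(X, X, self.width) + np.eye(n) / self.gamma
        solution = np.linalg.solve(system, np.concatenate(([0.0], y)))
        self.bias_, self.alpha_ = solution[0], solution[1:]
        self.X_train_ = X
        return self

    def predict(self, X):
        return rbf_kernel(X, self.X_train_, self.width) @ self.alpha_ + self.bias_

# Synthetic stand-in for a base model with a systematic bias.
y_true = np.linspace(0.0, 1.0, 60)
base_pred = 0.8 * y_true + 0.1
features = base_pred[:, None]      # assumed LSSVM input: the base prediction
residual = y_true - base_pred      # error the corrector should learn

fit_idx = np.arange(0, 60, 2)      # fit on even positions
eval_idx = np.arange(1, 60, 2)     # evaluate on odd positions
corrector = LSSVMRegressor().fit(features[fit_idx], residual[fit_idx])
corrected = base_pred[eval_idx] + corrector.predict(features[eval_idx])

base_mae = np.abs(y_true[eval_idx] - base_pred[eval_idx]).mean()
corrected_mae = np.abs(y_true[eval_idx] - corrected).mean()
```

The final forecast is the base prediction plus the predicted residual, so the corrector only has to model what the base model systematically misses.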

A comparative analysis of multi-step prediction performance between the LSSVM-corrected ensemble model and the uncorrected ensemble model is illustrated in Fig. 10. Figure 10 (a), (b) and (c) represent the 1-step, 4-step, and 8-step prediction curves, respectively. The evaluation metrics for the LSSVM-corrected model are presented in Table 3.

Table 3 Evaluation results of error correction.


The results indicate that error correction using LSSVM consistently improves forecasting performance across different prediction horizons. Specifically, in the 1-step, 4-step, and 8-step predictions, the RMSE and MAE values showed significant improvement compared to the uncorrected model. In the 1-step prediction task, the relative reductions in RMSE and MAE are 18.15% and 16.95%. In the 4-step prediction task, the relative reductions in RMSE and MAE are 24.31% and 16.56%. In the 8-step prediction task, the relative reductions in RMSE and MAE are 31.83% and 33.73%. The data further demonstrate that error correction effectively captures periodic and trend-related variations, providing substantial reductions in prediction error. It also alleviates error accumulation, with the accuracy improvement becoming more pronounced for longer prediction horizons.

Impact of wind speed variations

To assess the model’s capability in handling dynamic variations of key features, the relationship between absolute prediction error and wind speed fluctuations across different models was analyzed, as illustrated in Fig. 11. The horizontal axis represents the rate of wind speed change at the current time step relative to its value two time steps earlier (20 min), termed the "near-2 wind speed change rate." This metric quantifies the intensity of wind speed fluctuations.
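One plausible reading of the near-2 wind speed change rate is the absolute change over two time steps divided by the earlier value. The sketch below uses that reading; the exact formula, the use of an absolute value, and the function name are assumptions, since the text does not give an explicit equation.

```python
def near_k_change_rate(speeds, k=2, eps=1e-8):
    """Rate of change of wind speed relative to its value k steps earlier.
    Assumed form: |v[t] - v[t-k]| / |v[t-k]|, with eps guarding against zero speed."""
    rates = []
    for t in range(k, len(speeds)):
        earlier = speeds[t - k]
        rates.append(abs(speeds[t] - earlier) / max(abs(earlier), eps))
    return rates

# Illustrative wind speeds at 10-min intervals (not from the dataset).
speeds = [4.0, 5.0, 6.0, 10.0]
rates = near_k_change_rate(speeds)
```

With k = 2 and 10-min sampling, each rate compares the current speed with the speed 20 min earlier.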

As shown in Fig. 11, the wind speed change rate at the wind farm was predominantly concentrated between 0.5 and 1.0, indicating significant and unstable wind fluctuations. Under these conditions, the proposed model effectively captured wind speed variation patterns, exhibiting high predictive stability. Specifically, the absolute error remained mostly below 5, and even in extreme cases where the wind speed change rate exceeded 1.0, the error was contained within the range of 5 to 10. The slow and concentrated growth of errors further demonstrated the model’s reliability under intense wind fluctuations.

By contrast, the traditional LSTM model exhibited greater instability and higher error fluctuations when handling rapid wind speed variations, with significantly larger errors at certain time steps compared to the proposed model. This suggests that the LSTM model lacked robustness in managing short-term abrupt wind speed changes, resulting in reduced prediction accuracy, particularly in cases of pronounced wind speed variability.

The analysis of error distribution confirms that the proposed model maintains strong stability and adaptability under intense wind fluctuations. While larger wind speed variations inevitably lead to some increase in error, the magnitude remained relatively small, with a concentrated distribution and no significant anomalies. These findings underscore the model’s superior predictive stability and robustness in complex, highly dynamic environments.

Conclusion

The proposed model effectively captures multi-scale features in both the frequency and time domains, significantly improving its ability to represent trend dynamics and nonlinear patterns in wind power data. By integrating the WTC and the ASB, the model extracts multi-scale features that traditional methods often fail to capture.

Furthermore, the ICB and the FECAM enhance the robustness of feature representation by suppressing noise and amplifying informative temporal–spectral features. The FastKAN-based nonlinear mapping further strengthens the model’s ability to approximate complex dynamic relationships, ensuring both robustness and high predictive accuracy in short- and long-term forecasting tasks.

To further improve forecasting precision, an LSSVM is incorporated as a post-prediction residual correction layer. This hybrid structure effectively compensates for residual errors accumulated during multi-step prediction, thereby enhancing stability and maintaining high accuracy across extended forecasting horizons.

Experimental results confirm that the proposed model consistently outperforms baseline models in key metrics such as RMSE, MAE, sMAPE and R², achieving superior performance in both short-term fluctuation tracking and long-term trend prediction. In particular, the Impact of Wind Speed Variations experiment demonstrates the model’s robustness under highly dynamic wind conditions. By analyzing the relationship between absolute prediction error and the near-2 wind speed change rate, it is shown that the proposed model maintains significantly lower error sensitivity compared to other models, confirming its adaptability to sharp wind speed fluctuations.

Overall, the integration of time–frequency decomposition, nonlinear mapping, and residual correction enables the model to achieve high accuracy, strong generalization, and robust performance under complex and rapidly changing wind conditions. Future research will further explore sensitivity analysis of key parameters and the incorporation of uncertainty quantification techniques to enhance interpretability and predictive reliability in large-scale wind power forecasting applications.

Data availability

The data used in this article can be obtained by contacting the corresponding author.

References

1. Hassan, Q. et al. The renewable energy role in the global energy transformations. Renew. Energy Focus 48, 100545 (2024).
2. Mostafaeipour, A., Bidokhti, A., Fakhrzad, M. B., Sadegheih, A. & Mehrjerdi, Y. Z. A new model for the use of renewable electricity to reduce carbon dioxide emissions. Energy 238, 121602 (2022).
3. Lu, P., Zhang, N., Ye, L., Du, E. & Kang, C. Advances in model predictive control for large-scale wind power integration in power systems: A comprehensive review. Adv. Appl. Energy 14, 100177 (2024).
4. Tian, Z. Analysis and research on chaotic dynamics behaviour of wind power time series at different time scales. J. Ambient. Intell. Humaniz. Comput. 14(2), 897–921 (2023).
5. Tsai, W. C., Hong, C. M., Tu, C. S., Lin, W. M. & Chen, C. H. A review of modern wind power generation forecasting technologies. Sustainability 15(14), 10757 (2023).
6. Simankov, V. et al. Review of estimating and predicting models of the wind energy amount. Energies 16(16), 5926 (2023).
7. Wang, Z. & Liu, W. Wind energy potential assessment based on wind speed, its direction and power data. Sci. Rep. 11(1), 16879 (2021).
8. Xiaoxun, Z. et al. Research on wind speed behavior prediction method based on multi-feature and multi-scale integrated learning. Energy 263, 125593 (2023).
9. Szostek, K., Mazur, D., Drałus, G. & Kusznier, J. Analysis of the effectiveness of ARIMA, SARIMA, and SVR models in time series forecasting: A case study of wind farm energy production. Energies 17(19) (2024).
10. Wang, Y., Zou, R., Liu, F., Zhang, L. & Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 304, 117766 (2021).
11. Elsaraiti, M. & Merabet, A. A comparative analysis of the arima and lstm predictive models and their effectiveness for predicting wind speed. Energies 14(20), 6782 (2021).
12. Kurani, A., Doshi, P., Vakharia, A. & Shah, M. A comprehensive comparative study of artificial neural network (ANN) and support vector machines (SVM) on stock forecasting. Ann. Data Sci. 10(1), 183–208 (2023).
13. Wan, A., Chang, Q., Khalil, A. B. & He, J. Short-term power load forecasting for combined heat and power using CNN-LSTM enhanced by attention mechanism. Energy 282, 128274 (2023).
14. Wu, Q., Guan, F., Lv, C. & Huang, Y. Ultra-short-term multi-step wind power forecasting based on CNN-LSTM. IET Renew. Power Gener. 15(5), 1019–1029 (2021).
15. Shiri, F. M., Perumal, T., Mustapha, N. & Mohamed, R. A comprehensive overview and comparative analysis on deep learning models: CNN, RNN, LSTM, GRU. arXiv preprint arXiv:2305.17473 (2023).
16. Wu, H., Meng, K., Fan, D., Zhang, Z. & Liu, Q. Multistep short-term wind speed forecasting using transformer. Energy 261, 125231 (2022).
17. Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H. & Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proc. AAAI Conference on Artificial Intelligence 35(12), 11106–11115 (2021).
18. Lim, B., Arık, S. Ö., Loeff, N. & Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 37(4), 1748–1764 (2021).
19. Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L. & Jin, R. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In International Conference on Machine Learning 27268–27286 (PMLR, 2022).
20. Yan, L., Wu, S., Li, S. & Chen, X. SEAformer: frequency domain decomposition transformer with signal enhanced for long-term wind power forecasting. Neural Comput. Appl. 36(33), 20883–20906 (2024).
21. Agga, A., Abbou, A., Labbadi, M., El Houm, Y. & Ali, I. H. O. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr. Power Syst. Res. 208, 107908 (2022).
22. Wang, F. et al. Wavelet decomposition and convolutional LSTM networks based improved deep learning model for solar irradiance forecasting. Appl. Sci. 8(8), 1286 (2018).
23. Liang, Y., Lin, Y. & Lu, Q. Forecasting gold price using a novel hybrid model with ICEEMDAN and LSTM-CNN-CBAM. Expert. Syst. Appl. 206, 117847 (2022).
24. Hou, G., Wang, J. & Fan, Y. Multistep short-term wind power forecasting model based on secondary decomposition, the kernel principal component analysis, an enhanced arithmetic optimization algorithm, and error correction. Energy 286, 129640 (2024).
25. Cakiroglu, C. et al. Data-driven interpretable ensemble learning methods for the prediction of wind turbine power incorporating SHAP analysis. Expert. Syst. Appl. 237, 121464 (2024).
26. Finder, S. E., Amoyal, R., Treister, E. & Freifeld, O. Wavelet convolutions for large receptive fields. In European Conference on Computer Vision 363–380 (Cham, Springer Nature Switzerland, 2024).
27. Yu, Y., Si, X., Hu, C. & Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 31(7), 1235–1270 (2019).
28. Eldele, E., Ragab, M., Chen, Z., Wu, M. & Li, X. Tslanet: Rethinking transformers for time series representation learning. arXiv preprint arXiv:2404.08472 (2024).
29. Jiang, M. et al. FECAM: Frequency enhanced channel attention mechanism for time series forecasting. Adv. Eng. Inform. 58, 102158 (2023).
30. Li, Z. Kolmogorov-arnold networks are radial basis function networks. arXiv preprint arXiv:2405.06721 (2024).
31. Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Soljačić, M., ... & Tegmark, M. Kan: Kolmogorov-arnold networks. arXiv preprint arXiv:2404.19756 (2024).
32. Zhang, Y. & Li, R. Short term wind energy prediction model based on data decomposition and optimized LSSVM. Sustain. Energy Technol. Assess. 52, 102025 (2022).
33. Chicco, D., Warrens, M. J. & Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 7, e623 (2021).


Funding

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP)-Innovative Human Resource Development for Local Intellectualization program grant funded by the Korea government (MSIT) (IITP-2026-RS-2022–00156334).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Pai Chai University, 155-40 Baejae-Ro, Daejeon, 35345, Republic of Korea
Jinming Gao, Yixin Sun, Changsu Kim & Hoekyung Jung
Department of Music & Sound Technology, Korea University of Media Arts, 300 Daehak-Gil, Janggun-Myeon, Sejong-Si, 30056, Republic of Korea
Hankil Kim

Authors

Jinming Gao
Yixin Sun
Hankil Kim
Changsu Kim
Hoekyung Jung

Contributions

Conceptualization, J.G.; methodology, J.G. and Y.S.; software, Y.S. and H.K.; validation, J.G. and C.K.; analysis, C.K. and H.J.; writing – original draft, J.G.; supervision, H.J. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Hoekyung Jung.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Gao, J., Sun, Y., Kim, H. et al. Improving wind power prediction with advanced temporal and frequency domain processing combined with error correction. Sci Rep 15, 44300 (2025). https://doi.org/10.1038/s41598-025-27896-9


Received: 18 March 2025
Accepted: 06 November 2025
Published: 08 December 2025
Version of record: 22 December 2025
DOI: https://doi.org/10.1038/s41598-025-27896-9
