Abstract
Amid escalating climate change and energy crises, wind energy, as a pivotal renewable resource, poses significant challenges to grid stability and energy management due to its inherent stochastic intermittency and nonlinear dynamics. Consequently, this research presents a hybrid prediction system, ICEEMDAN-NCRBMO-AELM, integrating data decomposition with intelligent computing to reveal spatiotemporal coupling patterns in climatic variables for reliable wind power forecasting. This system utilizes Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN) to dissect sequences into several modes, addressing time–frequency features and alleviating mode mixing through a dynamic noise-weighting scheme. To further optimize Adaptive Extreme Learning Machine (AELM) performance, this research proposes a novel Normal Cloud Red-billed Blue Magpie Intelligent Optimization (NCRBMO) algorithm, inspired by cloud model theory and the swarm behavior of the Red-billed Blue Magpie. NCRBMO employs a multiphase mapping inverse generation strategy for initializing individuals and designs five heuristic search strategies for global optimization. Regarding hyperparameter tuning, NCRBMO optimizes the weight matrix and bias vector in the output layer of a single-hidden-layer feedforward network, enhancing prediction accuracy and stability. The interseasonal wind power prediction results from the Jiangsu region, China, indicate that this system surpasses competing representative techniques in addressing complex seasonal trends and abrupt meteorological changes.
Introduction
Motivation
Faced with escalating climate change and an energy crisis, the global energy system is undergoing an urgent structural reshaping. In this context, the shift towards a sustainable, low-carbon energy system, primarily driven by renewable energy, has emerged as an irreversible trend essential for achieving global sustainable development1. Wind energy, characterized by its superior environmental benefits and perennial availability, is rapidly emerging as one of the most promising types of renewable energy2. Researchers have compiled a statistical overview of the spatial distribution of dependable wind power density across global geographical areas, which is poised to profoundly shape the future energy supply structure of these regions3. Presently, China has progressively developed the world’s largest renewable energy generation system. Drawing from the strategic deployment of large-scale integrated energy bases in China’s 14th Five-Year Plan for Renewable Energy Development, wind energy is set to be the dominant energy type. According to the Global Wind Energy Council (GWEC) 2024 Report4, by the end of 2023, China’s newly installed capacity of renewable energy sources, including wind power, hydropower, and photovoltaics, accounted for over 80% of the national total newly added power generation capacity, and contributed over half of the global new renewable energy installed capacity. This signifies that sustainable clean energy is emerging as a primary driver for both China’s economic growth and energy transition.
Despite wind energy emerging as a pivotal cornerstone of the global energy transition, its intrinsic intermittency and strong turbulence produce wind speed sequences with pronounced nonlinearity and non-stationarity. These dynamics pose significant hurdles to grid operational stability, particularly impacting dispatch, energy storage system design, and the efficacy of power system planning5. Therefore, accurate wind power forecasting has emerged as a crucial breakthrough for enhancing the efficiency of wind energy utilization and advancing the grid integration capacity of wind power systems. This research focuses on the development of a reliable, multi-seasonal sustainable energy generation forecasting system capable of withstanding meteorological and environmental fluctuations. The system aims to assist stakeholders in accurately forecasting risks induced by stochastic wind speed fluctuations, mitigating reliability concerns stemming from wind power uncertainty. By providing highly reliable forecasting data, it offers crucial support for refined grid scheduling, optimal energy storage system sizing, and enhanced wind farm operational efficiency. Collectively, this research aims to curb wind power curtailment, facilitate the grid's effective mitigation of and precise response to wind power fluctuations, and actively support the global low-carbon energy system transition.
Existing mainstream methodologies for forecasting renewable energy are categorized into three distinct classes: physical, statistical, and hybrid approaches6. Given the limitations of single models, hybrid approaches have emerged as the preferred choice due to their integration of artificial intelligence with other advanced technologies, demonstrating robustness in dynamic settings and enhanced predictive performance for climatic variations7. However, certain methods are limited by inadequate data decomposition, which gives rise to cross-frequency interference and the blending of distinct modes8. This hinders models’ ability to capture seasonal patterns or abrupt fluctuations caused by extreme weather events. Without effective optimization mechanisms, models struggle to adjust adaptively based on actual data. Treating all input indiscriminately or utilizing predetermined weighting schemes can introduce significant bias, reducing generalizability9. Notably, although deep learning methods offer high predictive accuracy10, they are constrained by backpropagation and rigid activation functions, increasing computational complexity and limiting flexibility, especially in forecasting tasks demanding rapid responses or handling unusual weather incidents11.
Contributions of this research
This research proposes ICEEMDAN-NCRBMO-AELM, a hybrid forecasting framework integrating data decomposition with intelligent computing to capture intricate fluctuations and seasonal patterns in wind power. The system utilizes Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN) to resolve wind power signals into distinct modal components, effectively isolating long-term trends from stochastic short-term volatility. The system incorporates an Adaptive Extreme Learning Machine (AELM), whose single-layer feedforward network (SLFN) architecture, augmented by an adaptive activation function, facilitates expedited training through direct weight solution rather than iterative backpropagation. Furthermore, a novel Normal Cloud Red-billed Blue Magpie Optimization (NCRBMO) algorithm, inspired by cloud model theory and swarm intelligence behaviors, is developed to autonomously optimize the AELM architecture.
This research makes the following principal contributions:
A hybrid prediction system, ICEEMDAN-NCRBMO-AELM, is proposed for accurate renewable energy generation forecasting under multi-seasonal dynamic meteorological conditions.
A novel metaheuristic intelligent optimization algorithm, NCRBMO, is proposed, demonstrating improved performance in fine-tuning hyperparameters and enhancing the predictive stability of the AELM model.
A comprehensive multi-dataset validation framework has been developed for the rigorous assessment of the NCRBMO algorithm’s optimization potential and the predictive efficacy of the ICEEMDAN-NCRBMO-AELM system under multi-seasonal climatic conditions.
Novelty and relation to previous works
The previously established NCPO-ELM architecture12 pioneered the integration of cloud model theory into the stochastic search framework of the Parrot Optimizer, achieving high-precision photovoltaic power forecasting across diverse regions. However, when extending this paradigm to wind power forecasting—characterized by heightened non-stationarity and cross-scale spatiotemporal coupling—the architecture encounters three core limitations. First, the absence of effective signal decomposition results in cross-frequency interference and mode mixing within the raw sequences, while the existing ELM is limited to extracting shallow features and struggles to capture nonlinear, deep-seated characteristics. Second, the highly volatile nature of wind data necessitates rapid directional learning capabilities, yet the stochastic search strategy inherent in the PO algorithm fails to precisely capture the underlying transient physical patterns. Finally, pseudo-random initialization leads to non-uniform population distribution, and the persistent computational bottlenecks within its stochastic framework further constrain its efficacy in real-time applications.
The core innovation of this study lies in the proposal of a novel optimization algorithm, NCRBMO, and the subsequent construction of a hybrid framework integrating data decomposition and predictive modeling to explore its practical potential in complex engineering scenarios. NCRBMO utilizes the RBMO algorithm as its foundational architecture, retaining the core search mechanism of adaptive population resizing to accelerate convergence. On this basis, an innovative initialization strategy is introduced, incorporating multiphase mapping across various dimensions and inverse-based population generation, which significantly enhances the distribution uniformity of the initial population within the solution space. Furthermore, the algorithm integrates three novel search strategies—Expansion search, Foraging, and Benefit-seeking and risk-avoidance—alongside an elite preservation mechanism to ensure the stability of the global optimum. Addressing the limitations of the stochastic architecture in the PO algorithm, a new switching coefficient is designed to facilitate a seamless transition between global exploration and local exploitation, complemented by a comprehensive parameter sensitivity analysis. Compared to the NCPO-ELM architecture, the proposed system employs the ICEEMDAN algorithm with noise weighting coefficients for the multi-scale decomposition of wind power sequences, precisely revealing long-term seasonal trends and short-term extreme disturbances, thereby significantly enhancing the physical interpretability of the forecasting results. Additionally, the system abandons the fixed activation function of the standard ELM in favor of a novel adaptive activation function tailored to the distribution characteristics of diverse datasets, achieving synchronized optimization of the activation slope and response range, which strengthens the representation capability for deep features within complex data structures.
Related works
Renewable energy forecasting approaches
This section provides a critical overview of the established physical, statistical, and hybrid approaches in the field of renewable energy prediction, offering a comparative analysis of their respective advantages and inherent limitations.
Physical approaches
Wang et al. integrated multilayer-perceptron-based feature selection with a Random Forest (RF), curtailing the mean error of the Weather Research Forecasting (WRF) model’s 10-m wind speed forecasts by over 45%13. Zheng et al. utilized a WRF-RF prediction model, incorporating Numerical Weather Prediction (NWP) and meteorological tower data, to achieve high-precision wind power forecasting for a Chinese wind farm, especially under high wind speed conditions14. However, physical model predictions are sensitive to data quality, with minor inaccuracies potentially propagating over time. The requirement for high-resolution spatial and temporal grids increases computational complexity, limiting their effectiveness for short-term forecasting that necessitates rapid updates.
Statistical approaches
Statistical approaches analyze historical time-series data to identify intrinsic patterns in renewable energy generation and their statistical relationships with exogenous variables. The Autoregressive Integrated Moving Average (ARIMA) model, based on linear regression of observed values, is widely used for wind output forecasting to capture trends. However, it is susceptible to outliers, and its prediction accuracy can be compromised by abrupt events or non-periodic patterns. Therefore, Cao et al. combined the ARMA model with a pattern-matching approach, finding that it excelled in 1-hour short-term predictions15. Ahn et al. successfully integrated wavelet transform with ARIMAX methodologies to develop an ensemble model that outperformed single ARIMAX models and other benchmarks in reliability16. Despite their proficiency in processing historical data, statistical models are constrained by their reliance on linear correlations, rendering them less capable of characterizing nonlinear patterns and stochastic fluctuations.
Hybrid approaches
In recent years, hybrid architectures integrating machine learning and deep learning have demonstrated significant technical advantages in addressing the non-stationary nature of wind power series, with decision-tree-based algorithms and their variants being most extensively implemented. Existing research indicates that in the application of wind farms in Texas, USA, the predictive accuracy of Random Forest (RF) surpasses that of KNN and AdaBoost algorithms17. Concurrently, the LightGBM algorithm exhibited exceptional goodness-of-fit in processing Turkish wind energy data, with the coefficient of determination (R²) for power forecasting approaching 1.0 and computational latency maintained at a low level of less than 100 s, thereby validating the feasibility of gradient boosting frameworks in balancing predictive precision with computational efficiency18. Another research trajectory focuses on the construction of ensemble learning and hybrid frameworks. By integrating LightGBM with AdaBoost and employing 10-fold cross-validation, the Root Mean Square Error (RMSE) of SCADA system power forecasting was effectively reduced to 11.7819. To address the requirements for sophisticated spatiotemporal feature extraction from wind power data, the CNN-LSTM hybrid architecture has demonstrated superior predictive accuracy across various empirical studies20. Although the ensemble model GB+XGBoost remains highly competitive in terms of computational efficiency—achieving a Mean Square Error (MSE) of 7.2 within 45 seconds21—the CNN-LSTM framework typically minimizes predictive deviations through its capacity for deep nonlinear feature mining. This underscores the irreplaceable performance advantage of deep learning architectures in handling complex temporal dependencies for precision-sensitive wind power forecasting tasks. Moreover, Peng et al. combined CEEMDAN for signal decomposition with Bidirectional Long Short-Term Memory (BiLSTM) for submodal prediction22.
However, CEEMDAN suffers from cross-frequency interference, and BiLSTM is susceptible to outliers. Wang et al. proposed a Quantile Regression (QR)-integrated BiLSTM framework for wind uncertainty, achieving over 97% interval coverage, but it requires individual quantile training, and maintaining optimal coverage across diverse scales remains challenging23. Neethu et al. used modified CNN layers for feature extraction and stacked LSTM networks, yet the lack of an optimization mechanism necessitates extensive manual hyperparameter tuning24. Additionally, potential feature-scale mismatch further compromises modeling fidelity. While Qiao et al. accelerated BP neural network convergence fivefold via Particle Swarm Optimization (PSO), its proneness to becoming trapped in local optima and the curse of dimensionality constrain multidimensional prediction accuracy25.
Methods
This section provides a detailed explanation of the key components of the proposed system, specifically including the data decomposition method with the introduction of a dynamic noise-weighting scheme, the construction of an intelligent algorithm framework for global optimization, and the incorporation of adaptive activation functions into the extreme learning machine. Figure 1 presents a graphical abstract of the ICEEMDAN-NCRBMO-AELM hybrid prediction system and summarizes the critical implementation steps. Table 1 delineates the definitions and descriptions of the key symbols utilized throughout the methodology to enhance the readability of the manuscript.
Step 1: Acquisition of climatic and environmental data. Data are sourced from anemometry towers and meteorological stations, encompassing wind speed, wind direction, temperature, pressure, and other meteorological variables. The Supervisory Control and Data Acquisition (SCADA) system in wind farms monitors and collects real-time operational data, such as yaw angle and turbine speed. Integrating external meteorological data with internal operational data enables the development of a more comprehensive and precise forecasting system. The system utilizes a high-resolution wind power dataset with 15-minute intervals, defining the forecasting scope as short-term wind power prediction for the subsequent 5 to 6 days through the implementation of a single-step forecasting strategy.
Step 2: Historical wind power data preprocessing and decomposition. The sampling frequency must match the meteorological and environmental inputs from Step 1. Preprocessing involves feature selection and normalization. ICEEMDAN decomposes raw signals into intrinsic mode functions, each representing distinct frequency components. Feature engineering employs the sliding window technique to convert continuous time series into meaningful input-output pairs. Normalization scales all datasets to [0, 1], addressing dimensional discrepancies. Following forecasting, denormalization restores results to original physical units for performance evaluation and error calculation.
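The sliding-window pairing, [0, 1] normalization, and denormalization described in this step can be sketched as follows; the window length and toy series are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sliding_window(series, window):
    """Convert a 1-D series into (input window, next value) training pairs."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

def minmax_normalize(x):
    """Scale values to [0, 1]; keep min and range for later denormalization."""
    lo, rng = x.min(), x.max() - x.min()
    return (x - lo) / rng, lo, rng

series = np.sin(np.linspace(0, 10, 96))   # stand-in for one day of 15-minute power data
norm, lo, rng = minmax_normalize(series)
X, y = sliding_window(norm, window=8)
pred_denorm = y * rng + lo                # denormalization back to physical units
```

Keeping `lo` and `rng` from the training data is what makes the final denormalization step of the pipeline possible.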
Procedural diagram of the developed ICEEMDAN-NCRBMO-AELM hybrid prediction system.
Step 3: Preliminary work for predictive model development. The preprocessed dataset is subsequently divided into dedicated training and validation partitions to facilitate the development and assessment of the model. AELM parameters are initialized, followed by generalization tests using K-fold cross-validation. The dataset is randomly partitioned into K non-overlapping subsets, with one serving as the validation set and the remaining K-1 as the training set in each iteration. The RMSE averaged over all K subsets is computed to provide an unbiased performance estimate.
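The K-fold procedure above can be sketched as follows, with a least-squares linear solve standing in for the AELM output-layer fit; the stand-in model and data are illustrative assumptions.

```python
import numpy as np

def kfold_rmse(X, y, fit, predict, K=5, seed=0):
    """Average RMSE over K non-overlapping validation folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, K)
    scores = []
    for k in range(K):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        model = fit(X[train], y[train])
        err = predict(model, X[val]) - y[val]
        scores.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(scores))

# Toy check: a noise-free linear target, so the averaged RMSE should be near zero
X = np.random.default_rng(1).normal(size=(120, 4))
y = X @ np.array([0.5, -1.0, 2.0, 0.3])
fit = lambda A, b: np.linalg.lstsq(A, b, rcond=None)[0]
predict = lambda w, A: A @ w
score = kfold_rmse(X, y, fit, predict, K=5)
```

Averaging over all K validation folds, as in the text, reduces the variance of the performance estimate relative to a single train/validation split.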
Step 4: Optimization and iterative refinement of the predictive model parameters. The model is trained with preinitialized parameters, followed by performance enhancement using the NCRBMO to fine-tune critical parameters, iteratively optimizing AELM weights and biases. After training, the optimized model predicts unseen data. The model is retrained to integrate fresh data or address shortcomings identified during application. Re-optimization continues until the prediction error converges or the iteration cap is attained, resulting in the ultimate predictions.
Step 5: Comparison and error assessment of the multi-seasonal wind power prediction outcomes. The seasonal prediction curves are evaluated against the measured power data to showcase the system’s predictive accuracy and generalizability. Subsequently, a detailed analysis of the prediction error distribution reveals inherent error patterns and evaluates the contribution of each input feature to the prediction outcomes.
Improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN)
Due to the complex, non-stationary, and nonlinear nature of signals in multi-seasonal wind power forecasting, traditional methods like the fast Fourier transform (FFT) and wavelet transform (WT) are limited by fixed basis functions, hindering precise dynamic analysis. In contrast, Ensemble Empirical Mode Decomposition (EEMD) employs a data-driven adaptive sifting process to decompose wind speed series into Intrinsic Mode Functions (IMFs) and a residual component, improving decomposition accuracy and interpretability. However, EEMD faces “mode mixing,” where components of different scales merge into a single IMF, leading to an inaccurate representation of the signal’s frequency characteristics. This issue arises from two factors:
1. Cross-interactions among different frequency components, especially between low- and high-frequency ones, can lead to misclassification during decomposition, resulting in mode mixing.
2. A fixed noise amplitude fails to accommodate the signal’s energy variations across scales. Inappropriate noise intensity may distort the inherent features of low-frequency modes, hindering the resolution of low-frequency components and exacerbating mode mixing.
Therefore, this research introduces a dynamic noise-weighting scheme that modulates the noise intensity in each decomposition step in accordance with the signal’s frequency characteristics.
The implementation procedure of ICEEMDAN can be summarized as follows:
Step 1: Dynamic noise infusion. The noise ω(i)(t) is scaled by the noise weighting coefficient ηk and then superimposed onto the initial signal series y(t). The noise amplitude is adjusted in accordance with the signal’s spectral characteristics, with distinct weighting coefficients assigned to low-frequency and high-frequency modes.
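In symbols, this step can be written as follows; this is a reconstruction consistent with the surrounding symbol definitions (the published form may additionally apply an EMD operator to the noise term):

```latex
y^{(i)}(t) = y(t) + \eta_k\,\omega^{(i)}(t)
```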
where ω(i)(t) is the noise component infused at the i-th round. ηk indicates the dynamic noise weighting coefficient. A diminished ηk is utilized for low-frequency constituents to mitigate noise interference, whereas an amplified ηk is applied to high-frequency constituents to facilitate their clean resolution.
where λk serves as the scaling coefficient that governs the weight of noise within distinct frequency bands, and fk represents the frequency of the input signal. For high-frequency bands, a larger fk results in a smaller ηk, facilitating the precise separation of high-frequency modes triggered by noise without masking essential signal features. Conversely, for low-frequency bands, a smaller fk leads to a larger ηk which effectively enhances the resolution of low-frequency signals and overcomes mode mixing caused by non-uniform energy distribution.
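The text specifies only that ηk decreases as fk grows; an inverse-proportional form ηk = λk / fk is one sketch consistent with that description. Both the functional form and the band frequencies below are assumptions, since the exact relation is not reproduced here.

```python
import numpy as np

def noise_weight(f_k, lambda_k=1.0):
    """Hypothetical inverse-frequency weighting: eta_k shrinks as f_k grows,
    matching the qualitative behaviour described in the text (assumption)."""
    return lambda_k / f_k

freqs = np.array([0.1, 1.0, 10.0])   # low, mid, high frequency bands (arbitrary units)
etas = noise_weight(freqs)
# etas is monotonically decreasing: low-frequency bands receive the largest weight
```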
Step 2: Applying empirical mode decomposition. Empirical mode decomposition is performed on the noise-weighted signal y(i) to derive the first IMF component, IMF1(t), while the other components constitute the updated residual signal r1(t).
where I denotes the aggregate number of noise additions. The first intrinsic mode function, IMF1(i)(t), is extracted from the noise-weighted signal y(i)(t) during the i-th iteration. ICEEMDAN reduces noise interference by averaging over I iterations, enhancing the representativeness and reliability of the IMF components. The residual signal r1(t) after updating can be given by:
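Reconstructed in consistent notation (an assumption based on the definitions in this step), the ensemble-averaged first mode and the updated residual are:

```latex
\mathrm{IMF}_1(t) = \frac{1}{I}\sum_{i=1}^{I} \mathrm{IMF}_1^{(i)}(t), \qquad
r_1(t) = y(t) - \mathrm{IMF}_1(t)
```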
This residual signal serves as the input for the extraction of subsequent IMF components.
Step 3: Residual update and process termination. During each iterative cycle, the rn(t) is updated by extracting IMF components. The process persists until the residual signal meets a predefined stop criterion, at which point the procedure concludes, resulting in the final set of IMF components IMF1(t), IMF2(t),…, IMFn(t), along with the associated residual sequence rn(t). Regarding the oscillation stopping criterion, when the number of extrema in the current residual signal falls below three, it signifies that the signal no longer possesses oscillatory characteristics. At this juncture, the sifting process terminates automatically to prevent the generation of spurious modes via over-decomposition.
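The oscillation stopping criterion above can be sketched by counting residual extrema; this is a minimal illustration of the termination test, not the full sifting loop.

```python
import numpy as np

def count_extrema(r):
    """Count strict local maxima and minima of a 1-D residual signal."""
    d = np.diff(r)
    return int(np.sum(np.sign(d[:-1]) * np.sign(d[1:]) < 0))

def keep_sifting(residual, min_extrema=3):
    """Stop once fewer than three extrema remain, as in the text."""
    return count_extrema(residual) >= min_extrema

oscillatory = np.sin(np.linspace(0, 6 * np.pi, 200))   # many extrema: keep decomposing
trend = np.linspace(0, 1, 200) ** 2                    # monotone residual: terminate
```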
ICEEMDAN decomposes wind data into several IMFs, each with distinct physical significance. Low-frequency components capture long-term trends, reflecting macro phenomena such as seasonal variations and climate patterns. Mid-frequency components reveal cyclical changes, like daily wind speed fluctuations, while high-frequency components highlight rapid fluctuations or noise, indicating short-term system instability. Section 4.2.5 evaluates the differences in decomposition performance between ICEEMDAN and traditional CEEMDAN using actual wind power data.
The proposed normal cloud Red-billed blue magpie optimization algorithm (NCRBMO)
The inspiration of the proposed NCRBMO swarm intelligence algorithm
Metaheuristic intelligent optimization algorithms, motivated by natural processes such as natural selection and swarm intelligence, provide a nongradient-based framework. This framework enables them to conduct global searches and mitigate the potential for converging to suboptimal solutions, an advantage that is especially suited for handling the inherent nonlinearity, randomness, and multiscale characteristics of wind power forecasting. Among these, swarm intelligence algorithms, with their superior information sharing and adaptability, are the primary selection for this research. The proposed NCRBMO algorithm is inspired by the collective intelligence behaviors of the Red-billed Blue Magpie in nature, which devises efficient foraging strategies through information sharing and group collaboration. Native to China and Southeast Asia, the Red-billed Blue Magpie is renowned for its striking plumage and bright red beak. Researchers have identified five key behavioral characteristics26. These features provide significant inspiration for designing optimization strategies in the NCRBMO. Figure 2 illustrates the visualization of the five distinct heuristic optimization strategies mentioned above, along with their corresponding position update formulas.
Five key collective behaviors in Red-billed Blue magpie. (a) Group search behavior: Red-billed Blue Magpies cooperate using visual and auditory signals to improve foraging efficiency. (b) Foraging behavior: Individuals adaptively assess food locations and signal companions. (c) Attacking prey behavior: Small groups capture minor prey, while larger groups exploit collective advantages for larger targets. (d) Expansion search behavior: When food is scarce, individuals use spatial memory and path planning to explore new areas. (e) Benefit-seeking and risk-avoidance behavior: Individuals emit warning calls for group relocation, maximizing foraging while minimizing the risk of predation.
Population initialization with multiphase mapping and inverse generation strategy
The initialization uniformity and diversity of the population are paramount to optimization efficacy. Most current algorithms, which employ pseudorandom number generators, suffer from nonuniform distributions and clustering, increasing the risk of suboptimal convergence. Therefore, NCRBMO introduces a novel population initialization strategy that integrates multiphase mapping with inverse generation. Specifically, within lower-dimensional contexts, the Tent chaotic map is utilized to generate a highly diverse initial population efficiently. For high-dimensional domains, Chebyshev mapping is applied to augment the uniform distribution of the population further. Subsequently, inverse solutions are generated to rapidly expand the population distribution, followed by a forward-inverse population merging process to select the final high-quality population.
The implementation procedure can be summarized as follows:
Step 1: The sequence Pi, dt is constructed within the variable bounds [ad, bd] through the application of the multiphase mapping model.
where d specifies the individual dimension, D corresponds to the overall dimensionality, and t represents the current iteration count. The Tent coefficient α is 0.499. The polynomial order n is set to 4 for the Chebyshev mapping. xi-1,d denotes the preceding iteration’s value, normalized to the interval [0,1]. Table 2 assesses the convergence error of different mapping models on the Sphere benchmark function in low-dimensional contexts and evaluates the population coverage on the Rastrigin benchmark function in high-dimensional domains.
Step 2: New positive populations M(N) are generated by mapping the sequences Pi, dt into the spatial domain [ub,lb]:
Step 3: A derived inverse population FM(N) is acquired from the positive population M(N).
where N corresponds to the total number of individuals in the population. rand is the perturbation factor. By randomly scaling rand, the inverse-generated solutions do not exhibit strict symmetry with the forward solutions, circumventing the ‘symmetry trap’ in multimodal problems.
Step 4: A pool of the positive population X(N) and the inverse population FM(N) is created. The leading N individuals are culled based on their fitness to establish the initial population Xit.
The integration of forward and inverse populations leverages information from both search directions, while the elite selection mechanism ensures the high quality of the final population.
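Steps 1–4 can be sketched as follows. The dimensionality threshold for switching between the Tent and Chebyshev maps, the folding of the Chebyshev map into [0, 1], the perturbed inverse-generation formula, and the Sphere objective used for elite ranking are all illustrative assumptions.

```python
import numpy as np

def tent(x, alpha=0.499):
    """Tent chaotic map (alpha = 0.499 as in the text)."""
    return x / alpha if x < alpha else (1 - x) / (1 - alpha)

def chebyshev(x, n=4):
    """Chebyshev chaotic map of order n = 4, folded back into [0, 1]."""
    return (np.cos(n * np.arccos(2 * x - 1)) + 1) / 2

def init_population(N, D, lb, ub, dim_threshold=30, seed=0):
    rng = np.random.default_rng(seed)
    chaos = tent if D <= dim_threshold else chebyshev   # threshold is an assumption
    P = np.empty((N, D))
    x = rng.uniform(0.01, 0.99, size=D)                 # chaotic seeds per dimension
    for i in range(N):
        x = np.array([chaos(v) for v in x])             # Step 1: chaotic sequence
        P[i] = x
    M = lb + P * (ub - lb)                              # Step 2: map into [lb, ub]
    FM = lb + ub - rng.uniform(0, 1, size=(N, D)) * M   # Step 3: perturbed inverse population
    pool = np.vstack([M, FM])                           # Step 4: merge both populations
    fitness = np.sum(pool ** 2, axis=1)                 # Sphere function as a toy objective
    return pool[np.argsort(fitness)[:N]]                # elite selection of the N best

pop = init_population(N=20, D=5, lb=-10.0, ub=10.0)
```

Scaling the inverse solutions by a random factor, as described in the text, keeps them from being strict mirror images of the forward solutions.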
NCRBMO algorithm’s global exploration stage
(1) Group search behavior of Red-billed Blue Magpie.
The Red-billed Blue Magpie is a social bird species that typically operates in groups or small groups. It collaborates through visual and auditory signals to optimize search paths and enhance foraging efficiency. This group search behavior is defined by the following Eq. (10).
where Xit denotes the position of the i-th individual, and Xit+1 represents the updated position. Xmt represents a randomly selected m-th individual. Xrt consists of randomly chosen search agents. Parameter p indicates the number of individuals involved in small group searches, with a range of [2, 5], and q represents those searching in a flock, constrained within [10, N]. In wind power forecasting tasks, a significant spatial correlation exists between wind speed and power output. This strategy captures spatial dependency features through group positional interactions, thereby enhancing the predictive model’s holistic perception of regional wind energy distribution and mitigating forecasting biases induced by localized abrupt wind speed fluctuations.
(2) Foraging behavior integrated with the normal cloud model.
During foraging, the Red-billed Blue Magpie carefully observes their surroundings to estimate food locations, then flies or jumps to the target site. They may also emit calls or flight signals to guide other group members to the food source. The foraging behavior is articulated by Eq. (11).
where Xbestt denotes the optimal position. Xmeant is defined as the mean position of the current population. t and T represent the current and maximum number of iterations. rand corresponds to a random number uniformly distributed in [0,1]. Levy(dim) denotes the Levy flight strategy, which models the individual’s movement. Parameter µ serves as the scale factor, and γ is the jump exponent.
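The Levy(dim) term is commonly implemented with Mantegna's algorithm; the sketch below follows that convention (an assumption, since the paper's exact generator and its β value are not reproduced here).

```python
import numpy as np
from math import gamma, sin, pi

def levy(dim, beta=1.5, rng=None):
    """Levy flight step via Mantegna's algorithm (beta is the jump exponent)."""
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)    # numerator sample with Levy-stable scale
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

steps = levy(dim=10, rng=np.random.default_rng(42))
```

The heavy-tailed step distribution mixes many small local moves with occasional long jumps, which is what lets the foraging strategy escape local basins.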
Further, NCRBMO introduces the “normal cloud model” as a position update mechanism in the “foraging behavior” search strategy for the following reasons:
1. The normal cloud model incorporates fuzzy boundaries and randomness, generating random samples that follow a specific distribution. This enables it to effectively address uncertainties in practical engineering problems.
2. In the normal cloud model, the generation of each cloud drop exhibits stochastic fluctuations, deviating from a fixed trajectory and “wandering” within the solution space according to a probabilistic distribution. This feature overcomes the constraints of deterministic search paths in conventional optimization methods by introducing a perturbation mechanism into the search process.
As illustrated in Fig. 3, the normal cloud model is characterized by a set of independent parameters, including Ex (expectation), En (entropy), and He (super entropy), to quantify the numerical attributes of qualitative concepts12. Ex defines the central position of the numerical range, similar to the statistical center of the distribution of cloud drops in a cloud model. En quantifies the concept’s fuzziness, which is its degree of uncertainty within the numerical space. A greater En value results in a more dispersed cloud drop distribution. He is a measure of the uncertainty in entropy, reflecting the dispersion degree of the cloud drops’ random distribution. A larger He value indicates a higher degree of randomness.
Visualization of the integration of the foraging behavior strategy into the normal cloud model. In the NCRBMO algorithm, the expectation Ex is defined as the current optimal position Xbest. Entropy En serves as a spatial distribution parameter, which is used to regulate the distance range between the swarm and the optimal individual. Superentropy He characterizes the dispersion of the group’s location distribution.
The search strategy steps incorporating the normal cloud model are as follows:
Step 1: A normally distributed random number, En′i, is generated with an expectation of En and a standard deviation of He.
Step 2: A normal random number, xi, is generated with an expectation of Ex and a standard deviation of En′i.
Step 3: The membership function, µi, is utilized to update the subsequent positions Xit+1.
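The three steps above can be sketched as follows. The Gaussian membership form µi = exp(−(xi − Ex)² / (2 En′i²)) is the standard normal-cloud definition and is assumed here, since the paper's membership equation is not reproduced in the text.

```python
import numpy as np

def normal_cloud_drops(Ex, En, He, n, rng=None):
    """Generate n cloud drops (x_i, mu_i) from a normal cloud (Ex, En, He).

    Implements Steps 1-3 above; the Gaussian membership function is the
    conventional normal-cloud form (an assumption, not quoted).
    """
    rng = np.random.default_rng() if rng is None else rng
    En_i = rng.normal(En, He, n)                    # Step 1: En'_i ~ N(En, He)
    En_i = np.where(En_i == 0, 1e-12, En_i)         # guard against zero spread
    x = rng.normal(Ex, np.abs(En_i))                # Step 2: x_i ~ N(Ex, En'_i)
    mu = np.exp(-(x - Ex) ** 2 / (2 * En_i ** 2))   # Step 3: membership mu_i
    return x, mu
```

In NCRBMO terms, Ex would be the current best position Xbest, so drops cluster near the optimum (high µ) while retaining a stochastic fringe for perturbation.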
NCRBMO adjusts the population’s distribution by updating the superentropy value of He, resulting in a denser population distribution around Xbest and a more dispersed distribution elsewhere. One of the core challenges in multi-seasonal wind energy forecasting pertains to the uncertainties inherent in meteorological conditions and power conversion, such as wind speed volatility, wind direction variations, and turbine downtime. The fuzzy randomness of the normal cloud model is precisely suited to characterize such uncertainties. The integration of the cloud model enables the algorithm not only to converge toward optimal predictive model parameters during the search process but also to simulate various potential meteorological scenarios through the stochastic generation of cloud drops.
NCRBMO algorithm’s local exploitation stage
(1) Attacking prey behavior of the Red-billed Blue Magpie.
In scenarios involving small-group hunting, individuals quickly respond to capture small insects, a behavior represented by the rand < 0.5 part of Eq. (17). In larger group hunts, they typically utilize their sharp beaks to capture smaller vertebrates, a behavior described by the rand ≥ 0.5 part of Eq. (17). This reflects the flexibility of hunting strategies.
where randNormal denotes a random variate drawn from a normal distribution with zero mean and unit variance. The parameter p ranges from 2 to 5, while q spans from 10 to the maximum population size N. Building upon an existing optimized predictive model, this strategy employs small-step stochastic perturbations to circumvent local optima. Simultaneously, it integrates historical error distributions via a normal random term to perform bias correction on the forecasting results, thereby enhancing the smoothness of the predictive curve and its empirical fitting accuracy.
(2) Expansion search behavior of the Red-billed Blue Magpie.
When food resources become scarce or fail to meet foraging demands, the Red-billed Blue Magpie will leave its original foraging area to explore more distant regions. This expansion search behavior can be viewed as a dynamic environmental adaptation strategy, as described by Eq. (18).
where ε is leveraged to dynamically adapt the search range, with its value determined by the present iteration t relative to the total number of iterations T. Specifically, a larger ε value promotes global search capability, while a smaller ε value facilitates the refinement of local area exploration. When wind speed patterns undergo abrupt transitions due to seasonal successions, established predictive models may become invalidated. This strategy enables the algorithm to transcend the current parameter space to explore novel potential patterns, thereby augmenting the adaptability of the forecasting model during long-term operation.
(3) Benefit-seeking and risk-avoidance behavior of the Red-billed Blue Magpie.
During foraging, the Red-billed Blue Magpie remains highly vigilant. Upon the approach of a predator, it swiftly flees. This behavior reflects its dual objectives of maximizing foraging efficiency and minimizing predation risk. This behavior pattern is articulated by Eq. (20).
where rand1 and rand2 denote random variables uniformly distributed within [0, 1]. This search strategy guides the population toward potential regions (Xbestt) while eliminating ineffective search paths (Xworstt), achieving rapid convergence. This strategy guides the predictive model to learn from high-precision regions, such as historical periods with minimal error, while simultaneously avoiding the acquisition of low-quality data resulting from sensor malfunctions or anomalous weather conditions, which correspond to the worst solutions. This mechanism is particularly critical in the construction of data-driven forecasting models, as it enhances the screening efficiency of training data and reinforces model reliability.
Switching coefficient and elite preservation mechanism of the NCRBMO algorithm
The switching coefficient φ orchestrates the dynamic balance between global exploration and local exploitation. This enables the algorithm to identify promising solutions early on while also meticulously converging to the global optimum, preventing unproductive global exploration.
When φ > ψ, NCRBMO enters the exploration phase, conducting an extensive search of the solution space; when φ ≤ ψ, NCRBMO switches to the exploitation stage, performing a refined search around the known potential regions. A parameter sensitivity test conducted across the benchmark functions reveals that the algorithm performs best with a ψ value of 0.5.
After each iteration, if the candidate solution's fitness improves upon that of its predecessor, the predecessor is replaced. Otherwise, the incumbent solution is retained until the iteration concludes. This elite preservation mechanism is formalized in Eq. (22).
where fitnessnewi and fitnessoldi represent the fitness of the i-th individual before and after the position update. NCRBMO guarantees that the optimal solution consistently guides subsequent search paths through elite preservation. In real-time forecasting, newly acquired data continuously updates the model; however, the retention of historical optimal parameters is imperative to prevent overall performance degradation resulting from isolated predictive failures. This mechanism ensures the stability and cumulative optimization capacity of the forecasting system within dynamic environments.
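The interplay of the switching coefficient and elite preservation can be illustrated with a minimal loop skeleton (minimization assumed). The exploration and exploitation moves below are deliberately simplified placeholders, not the paper's five search strategies; only the φ/ψ switch and the greedy retention rule of Eq. (22) are the point here.

```python
import numpy as np

def ncrbmo_skeleton(fitness, lb, ub, n_pop=30, n_iter=100, psi=0.5, rng=None):
    """Skeleton of an NCRBMO-style loop: a random switching coefficient phi
    selects exploration (phi > psi) or exploitation (phi <= psi), and elite
    preservation keeps a position only when its fitness improves."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.uniform(lb, ub, (n_pop, len(lb)))
    fit = np.apply_along_axis(fitness, 1, X)
    for _ in range(n_iter):
        best = X[fit.argmin()]
        phi = rng.random()                                  # switching coefficient
        if phi > psi:                                       # exploration: broad jumps
            X_new = X + rng.normal(0, 1, X.shape) * (ub - lb) * 0.1
        else:                                               # exploitation: pull to best
            X_new = X + rng.random(X.shape) * (best - X)
        X_new = np.clip(X_new, lb, ub)
        fit_new = np.apply_along_axis(fitness, 1, X_new)
        better = fit_new < fit                              # elite preservation (Eq. 22)
        X[better], fit[better] = X_new[better], fit_new[better]
    return X[fit.argmin()], fit.min()
```

Because worse candidates are never accepted, the best-so-far fitness is monotonically non-increasing, which is exactly the stability property the text attributes to elite preservation.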
Flowchart and pseudocode of the NCRBMO algorithm
Figure 4 illustrates the streamlined flowchart for the NCRBMO algorithm; simultaneously, Table 3 presents the corresponding pseudo-code.
Streamlined flowchart of the NCRBMO algorithm. The optimization process initiates with parameter initialization and the establishment of the population Xit. The switching coefficient φ determines when the algorithm transitions to extensive global exploration and when it shifts to more focused local exploitation. Distinct search strategies are applied at various stages to locate optimal or near-optimal solutions. In each iterative cycle, the current best-found solution is compared with the historical best, retaining the superior option to update Xit+1. The global optimization proceeds until the predefined termination criteria are satisfied, ultimately converging on the global optimum.
Adaptive extreme learning machine (AELM)
ELM is an efficient technique for training single-layer feedforward networks: hidden layer weights and biases are initialized stochastically, while output layer weights are derived via least squares, yielding high training efficiency and rapid response to abrupt events. Nevertheless, the benchmark ELM struggles with deep, complex nonlinear relationships. To enhance the performance of ELM in addressing complex nonlinear dynamic problems, this research incorporates an adaptive activation function. The key procedures of AELM can be summarized as follows:
Step 1: The initialization of model parameters. Let the input layer have N neurons, while the hidden and output layers have K and Q neurons, respectively. The input vector, represented as zj = [zj1, zj2, …, zjn]T, produces the output of the AELM as follows:
where βi represents the weight of the i-th activation unit, governing its contribution to the final output. zj denotes the j-th input feature. ωi is the weight associated with the input feature zj, controlling its influence on the activation function. bi is the bias term, which adjusts the activation output to allow for non-zero activations even when the input is zero, thereby enhancing the model’s expressive capacity. The specific form of the adaptive Sigmoid activation function is given by σ(⋅).
where parameter a determines the gradient of the activation function, thereby affecting its activation speed in various regions, while the bias term b shifts the function and modifies its output range. Both a and b are continuously optimized during the training process, enabling the function σ(⋅) to adapt to diverse information patterns. In each iteration, a and b are directly embedded into the search vectors of the NCRBMO, undergoing dynamic adjustment alongside the evolution of the swarm intelligence algorithm. This allows the activation function morphology to adaptively align with the frequency-domain characteristics of the input wind power sequences, realizing the synchronous optimization of “activation slope” and “response range.”
The output equation for the AELM employing an adaptive activation function is defined as:
Step 2: The hidden layer output matrix H is calculated on an element-by-element basis via the adaptive Sigmoid function.
Step 3: The weight matrix β of the output layer of AELM can be directly computed using the least squares method. The relationship is given by the following matrix equation.
where parameter H+ represents the Moore-Penrose pseudo-inverse of matrix H, and matrix Y denotes the target output of the model.
Computational complexity
The computational complexity associated with the ICEEMDAN-NCRBMO-AELM hybrid prediction system comprises three key constituents:
where Tseries is the length of the series, and Inoise is the noise reconstruction iterations of ICEEMDAN. Npop is the size of the population, Ndim is the dimensionality, and Ttotal refers to the total iterations of NCRBMO. Ssample is the sample size, while Hneuron indicates the number of hidden neurons in AELM.
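The complexity expression itself is not reproduced above; assuming standard per-stage costs for each component (FFT-based sifting for ICEEMDAN, per-individual per-dimension updates for NCRBMO, and a pseudo-inverse-dominated solve for AELM), a plausible form consistent with the listed symbols is:

```latex
% Hedged reconstruction: the paper's exact expression is not quoted here;
% this assumes conventional per-stage costs for each module.
\mathcal{O}\Big(
  \underbrace{I_{noise}\, T_{series}\log T_{series}}_{\text{ICEEMDAN}}
  \;+\;
  \underbrace{T_{total}\, N_{pop}\, N_{dim}}_{\text{NCRBMO}}
  \;+\;
  \underbrace{S_{sample}\, H_{neuron}^{2}}_{\text{AELM}}
\Big)
```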
Results and discussion
This section will rigorously assess the optimization potential of the proposed algorithm and the predictive efficacy of the proposed system under multi-seasonal meteorological and environmental volatility conditions. The experiments are compared with advanced methods in the field, and the results are comprehensively analyzed using standard evaluation metrics.
Optimization effectiveness assessment of the NCRBMO algorithm
Data description and experimental implementation details
The IEEE CEC 2017 benchmark test suite, developed by the IEEE Congress on Evolutionary Computation, is a standardized collection of numerical optimization problems designed as a general testing platform, particularly suitable for evaluating intelligent optimization algorithms. The collection includes four types of test benchmarks: unimodal (F1), multimodal (F3–F10), hybrid (F11–F20), and composite (F21–F30). Notably, the F2 function was officially omitted, as its pathological behavior compromised an unbiased assessment of algorithmic efficacy.
Wind power forecasting, particularly across seasons, faces complex challenges due to climatic patterns that vary with the season and unpredictable fluctuations in wind speed. These factors collectively present a predictive task with high dimensionality, multimodality, nonlinearity, and non-convexity, closely aligning with the optimization problems posed in the CEC 2017 dataset. To further highlight the performance of NCRBMO, a comparative analysis is conducted with several established algorithms: (Parrot Optimizer, PO, 2024)27, (Hippopotamus Optimization, HO, 2024)28, (Black-Winged Kite Optimization, BKA, 2024)29, (Arctic Puffin Optimization, APO, 2024)30, (Chaotic Sand Cat Swarm Optimization, SCSO, 2023)31, (Dung Beetle Optimization, DBO, 2023)32, and (Whale Optimization, WOA, 2016)33. All test benchmarks were evaluated at 30 dimensions. Each algorithm was run for 30 trials, with each trial utilizing 500 iterations and a population comprising 30 individuals. All experiments in this research were conducted on a computing platform equipped with a 13th Gen Intel® Core™ i9-13900HX CPU, an NVIDIA GeForce RTX 4060 GPU, and 32 GB of RAM, operating under a 64-bit Windows 11 environment.
Comparative analysis of optimization performance
Table 5 summarizes the key parameter configurations of the proposed and comparison algorithms, encompassing the algorithm names, parameter designations, functional descriptions, and the specific impact of each parameter on algorithmic performance. Tables 6, 7, and 8 summarize the optimization performance across all benchmark functions. The rankings were calculated using the standard Friedman test34. The mean value was chosen as the reference metric for ranking to verify whether the algorithm can exhibit consistent performance across multiple runs. Table 8 presents the final Friedman mean rankings. The results demonstrate that NCRBMO outperformed all competitors in 26 out of 29 benchmark tests, securing the first rank in 89% of the tests. The final Friedman rank for NCRBMO was 1.172414, placing it at the top among all algorithms. Additionally, the statistical significance of the inter-algorithmic variations was ascertained through P-value computation. A smaller P-value (typically < 0.05) results in the rejection of the null hypothesis. The results show that NCRBMO’s P-value was 8.0138e-28 (approaching 0), indicating that the performance differences were statistically significant and could not be attributed to random factors.
The experiment also employs Cliff’s Delta effect size in conjunction with Bootstrap confidence intervals to comprehensively evaluate the performance discrepancies between the algorithms, rather than relying exclusively on P-values and ranks. An effect size of |δ| ≥ 0.474 signifies a substantial performance gap between the evaluated algorithm and NCRBMO on the specific test function, whereas |δ| < 0.147 indicates a negligible performance disparity. The aforementioned results are statistically summarized in Tables 9 and 10, respectively. Cliff’s Delta effect size, as a non-parametric metric, directly reflects the net probability of our NCRBMO algorithm outperforming the competitive algorithms. Additionally, the 95% confidence interval computed via Bootstrap resampling provides an estimation range for the algorithm’s average performance, reflecting the precision and stability of the differences. The interval width reflects the precision of the estimated average performance; a narrower confidence interval denotes superior performance stability on the test function.
Figure 5(a) presents a radar chart for a visual comparison of the algorithms’ rankings across various functions. The minimal area occupied by NCRBMO’s plot on the chart indicates its superior overall performance. Figure 5(b)-(d) depict the variation in convergence speed, reflecting NCRBMO’s ability to achieve faster convergence within a reduced time frame while simultaneously maintaining a lower fitness value. Figure 6 illustrates the statistical results of the single-run and average execution times for various optimizers under identical configurations and parameter settings, providing robust data verification for computational efficiency. Tables 11 and 12 provide the sensitivity analysis of the critical algorithmic parameters, namely the switching coefficient φ and the Tent mapping parameter α. The former governs the seamless transition between distinct algorithmic phases, while the latter determines the uniformity and diversity of the initial population; optimal performance is achieved when both parameters are assigned a value of 0.5.
Comparative analysis
The classic WOA achieves convergence by simulating the whale’s “spiral update” mechanism. In multimodal problems, WOA is prone to premature convergence and requires continuous fine-tuning of the convergence factor. The PO designs a stochastic framework incorporating multiple strategies; however, this randomness lacks specificity in tasks requiring fine-tuned optimization, constraining solution accuracy. In contrast, NCRBMO utilizes a switching coefficient to dynamically switch between exploration and exploitation. This mechanism enables NCRBMO to conduct in-depth refinement in specific areas while maintaining a global search focus, thereby enhancing solution accuracy and balancing the depth and breadth of exploration. The BKA and DBO derive their optimization strategies from the foraging strategy demonstrated by the black-winged kite and the migratory behavior of the dung beetle, respectively. Both exhibit a high sensitivity to the selection of initial solutions and struggle to escape local optimum traps. In contrast, NCRBMO innovatively employs a population initialization method that combines multistage mapping with inverse generation, leading to a marked improvement in the quality and uniformity of the population at its genesis. The introduction of a normal cloud model adds stochastic perturbations to the search process, overcoming the constraints of fixed search trajectories. This facilitates the algorithm’s escape from local optima and promotes rapid convergence. Even when stalled in local optima, the algorithm utilizes elite preservation to eliminate suboptimal solutions, ensuring the global optimality of the final solution.
Optimization performance assessment results of the NCRBMO algorithm on the challenging IEEE CEC 2017 benchmark dataset. (a) Radar chart comparing NCRBMO against rival algorithms on the IEEE CEC benchmark functions. The benchmark comprises a total of 29 functions (F1-F30). Performance is evaluated according to the standard Friedman test’s mean ranking (positions 1–8), and a reduced radar area signifies better optimization results. (b)-(d) Convergence rate comparison for NCRBMO against its competitors on the IEEE CEC benchmark functions. The figure’s left panel displays the function types, while its right panel presents the convergence curve. Superior optimization performance is characterized by faster convergence and lower fitness values.
Comparison of execution times of different optimization algorithms on the CEC 2017 benchmark functions. The tests were conducted under identical configuration settings: population size = 30, maximum iterations = 500, and problem dimension = 30. (a) Execution times of the algorithms on 29 benchmark functions (Note: F2 excluded); (b) Average execution time of each algorithm.
In summary, the four categories of benchmark functions in the IEEE CEC 2017 suite effectively characterize optimization landscapes of varying complexities, which closely align with the intrinsic challenges of multi-seasonal wind power forecasting. Specifically, to address the strong nonlinearity driven by meteorological factors, the hybrid functions integrate multimodality and asymmetry to capture the complex time-varying patterns of wind data. Considering the noise and outliers induced by extreme weather, the stochastic perturbations and asymmetric designs within these functions provide a rigorous baseline for assessing the algorithm’s robustness against real-world data volatility. Furthermore, the AELM is highly dependent on the configuration of its key parameters, making its tuning essentially a complex, high-dimensional, and multimodal numerical optimization problem. An ideal optimizer should, therefore, possess two core capabilities: strong global exploration ability and high convergence efficiency. Experimental results demonstrate that NCRBMO exhibits notable potential in both aspects, achieving a top ranking in 89% of the benchmark tests while demonstrating rapid convergence behavior. This evidence strongly supports the rationale for employing NCRBMO in the parameter tuning of AELM.
Engineering application: multi-seasonal forecasting of wind power generation
Dataset description
The multi-seasonal wind power dataset employed in this study was collected from the JSFD003 turbine at a large-scale wind farm in Jiangsu Province, China, in 202435. The total dataset spans 8,760 h, encompassing over 35,000 sampling points throughout the year with a 15-minute sampling interval. This dataset effectively captures synoptic-scale fluctuations (typically with a 2–7-day cycle), diurnal variations (the land-sea breeze effect caused by day-night temperature differences), and turbulent fluctuation characteristics. Complying with the IEC 61400-25 standard for wind energy data analysis, the data provides a comprehensive representation of the region’s typical seasonal characteristics. Specifically, the Jiangsu region is characterized by a typical monsoon climate. During spring, the atmospheric circulation transitions from the cold winter monsoon to the warm, moist airflow of summer, with the circulation shift inducing wind speed fluctuations. This period of data is essential for validating the stability of the prediction model under seasonal transitions and varying wind speed conditions.
Actual wind energy production originating within a major wind facility in Jiangsu, China, across various seasonal periods. The unique geographic and climatic features of the Jiangsu region, characterized by a lengthy coastline and reliable wind speeds, establish an advantageous natural environment for the generation of wind energy. The input variables derived from meteorological and environmental data comprise wind speed and direction at various heights (10 m, 30 m, 50 m, 70 m), hub height wind speed and direction, ambient temperature (°C), atmospheric pressure (hPa), and relative humidity (%), etc. This collection of data captures critical characteristics such as diurnal temperature differences, the influence of land-sea breezes, and atmospheric turbulence, which are essential for assessing system stability across seasonal variations.
Summer is dominated by the southeast monsoon, featuring high temperatures and humidity; strong wind speed variations provide favorable conditions for wind power generation but are also accompanied by extreme weather. During this period, the data exhibits stronger stochasticity, intermittency, and turbulence characteristics. Autumn is characterized by a dry and stable climate with good wind speed consistency, making it an ideal period for validating the model’s long-term forecasting capability under low-volatility and stable operating conditions. Winter is dominated by the northwest monsoon, bringing cold, dry air and strong wind speeds, marking the peak period for wind power generation. Assessing the model’s performance and stability under extreme wind speed conditions is particularly essential during this time. Due to commercial sensitivity and privacy restrictions, we are unable to disclose the wind power dataset in its raw form with precise geographic identifiers. To ensure transparency and reproducibility, a strictly anonymized derivative dataset has been deposited in a public repository, which is accessible via the data availability statement. The derivative dataset retains the key information required for validating the model’s effectiveness, including all meteorological variables, time-series partitions, and actual power statistics.
To facilitate seasonal analysis, four representative months—March (spring), June (summer), September (autumn), and December (winter)—were selected for the experiments, with each seasonal window containing approximately 2,920 valid data points, as illustrated in Fig. 7. The dataset was partitioned and cross-validated in strict chronological order to ensure that future data were not utilized for training, thereby preventing data leakage. This study adopted an 80/20 ratio for dataset partitioning, where the first 80% served as the training set for model training and the remaining 20% functioned as an independent validation set to evaluate predictive performance. Specifically, a seasonal time-window strategy was employed, utilizing historical data from the first 25 days of each month for training to achieve short-term wind power forecasting for the subsequent 5 to 6 days. This design fully accounts for the non-stationarity and seasonal feature mapping of wind energy data, aligning with the technical requirements of power systems for short-term power forecasting. Regarding missing or abnormal meteorological data, the system utilized a time-series-based interpolation method, filling gaps with data from adjacent time points to minimize the impact of missing data on prediction accuracy.
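The gap-filling step described above (interpolating missing or anomalous readings from adjacent time points) can be sketched as simple linear time-series interpolation; the system's exact interpolator is not specified, so this is one plausible minimal realization.

```python
import numpy as np

def fill_gaps(t, y):
    """Fill NaN gaps in a time series by linear interpolation between the
    nearest valid adjacent sampling points."""
    y = np.asarray(y, dtype=float)
    mask = np.isnan(y)
    filled = y.copy()
    # interpolate only at missing timestamps, using valid (t, y) pairs
    filled[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return filled
```

For a 15-minute sampling grid, `t` would simply be the sample index, so a single missing point is replaced by the mean of its two neighbours.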
Dataset preprocessing
The preprocessing procedure for the dataset encompasses data importation, decomposition, feature selection, splitting, and normalization. Anomalous data points are first filtered out to guarantee analysis integrity. The input data is decomposed into intrinsic mode functions to represent distinct frequency components. Using the sliding-window technique, temporal data is transformed into a set of learning samples, generating feature vectors and input–output pairs for model training. The first 80% of the dataset is used for training, while the remaining 20% is used for validation and performance evaluation. Subsequent to feature construction, both sets are scaled to the [0, 1] interval for consistent scaling. Despite preprocessing, some outliers remain, enabling the assessment of NCRBMO’s effectiveness in reducing the noise-induced performance degradation of the SLFN.
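The sliding-window sample construction and [0, 1] scaling described above can be sketched as follows; scaling statistics are taken from the training split only, so the validation set cannot leak information into training (the window length is a hypothetical choice for illustration).

```python
import numpy as np

def make_windows(series, window):
    """Sliding window: each feature vector holds `window` consecutive
    values and the target is the value immediately after the window."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

def minmax_scale(train, other):
    """Scale both splits to [0, 1] using the training set's min/max only,
    preventing leakage from the held-out split."""
    lo, hi = train.min(), train.max()
    return (train - lo) / (hi - lo), (other - lo) / (hi - lo)
```

In the full system each ICEEMDAN mode would be windowed and scaled in the same way before being fed to the AELM.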
Model generalization appraisal
K-fold cross-validation, as a standardized evaluation framework, is employed to gauge the model’s capacity for generalization. In each of the K iterations, K−1 subsets are used for training and the remaining subset for validation. The final performance is obtained by averaging the K validation outcomes, using RMSE as the primary evaluation metric. The training and validation sets are strictly partitioned in temporal order to prevent future data leakage. A value of K = 5 was chosen to achieve an optimal trade-off between computational overhead and stability of the evaluation.
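A chronologically ordered K-fold split, in which each fold trains only on data preceding its validation block, can be generated as below. This is one common expanding-window scheme; the paper's exact partitioning scheme beyond "strict temporal order" is not detailed, so treat it as an illustrative sketch.

```python
import numpy as np

def chrono_kfold(n, k=5):
    """Yield (train_idx, val_idx) pairs for expanding-window K-fold
    cross-validation: validation blocks move forward in time, and the
    training set never contains samples from the future."""
    edges = np.linspace(0, n, k + 2, dtype=int)  # k+1 roughly equal blocks
    for j in range(1, k + 1):
        yield np.arange(0, edges[j]), np.arange(edges[j], edges[j + 1])
```

With K = 5, the first fold trains on the earliest block and validates on the next; later folds grow the training window, mirroring how a deployed forecaster accumulates history.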
Evaluation metrics
R-square (R2): R2 measures the proportion of variance in the dependent variable explained by the model. Its value typically ranges from [0, 1], where a value closer to 1 indicates a superior goodness-of-fit and a stronger explanatory power for data variability.
Mean absolute error (MAE): MAE quantifies the average magnitude of absolute residuals in the same units as the target variable. It provides a linear penalty for all errors, making it an intuitive indicator of overall prediction accuracy without overemphasizing extreme deviations.
Root mean square error (RMSE): RMSE is the square root of the average squared differences between predicted and actual values. Due to its quadratic scoring, it is more sensitive to large errors than MAE, thus serving as a critical metric for penalizing significant outliers.
Mean absolute percentage error (MAPE): MAPE expresses the average relative error as a percentage, offering a scale-independent measure of precision. It is particularly useful for comparing model performance across different data scales, with lower percentages reflecting higher relative accuracy.
where N refers to the count of samples, Pmeas, i represents the actual measured power. Ppred, i signifies the forecasted value at the i-th time.
Accuracy: To provide an intuitive measure of prediction performance, an accuracy metric is derived from the MAPE.
This derivative metric facilitates a straightforward interpretation of the model’s reliability, where a higher value signifies a closer proximity of the predicted results to the actual observations.
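The metrics above can be computed directly; the accuracy formula Accuracy = 100% − MAPE follows from the text (e.g., the reported 82.781% accuracy and 17.219% MAPE sum to 100%). The small `eps` guard against zero actual power is an implementation assumption.

```python
import numpy as np

def metrics(y_true, y_pred, eps=1e-8):
    """R^2, MAE, RMSE, MAPE (%) and the MAPE-derived accuracy metric."""
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mape = float(100.0 * np.mean(np.abs(err) / (np.abs(y_true) + eps)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape,
            "Accuracy": 100.0 - mape, "R2": 1.0 - ss_res / ss_tot}
```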
Orthogonality index (OI): OI quantifies the independence of IMF components and reflects the degree of cross-frequency interference. OI values greater than 0.2 signify a substantial degree of mode mixing, while a value of less than 0.1 indicates effective modal separation.
Correlation coefficient (CC): CC assesses the consistency of IMF components with their associated true frequencies in the original signal.
Residual energy ratio (RER): RER quantifies the unextracted portion of the initial signal’s total energy post-decomposition into IMFs. A high RER indicates insufficient extraction of low-frequency information.
where the parent signal at time t is denoted by y(t). rn(t) corresponds to the value of the n-th residual signal. ∥⋅∥ is defined as the squared L2 norm.
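The decomposition-quality indices can be sketched as follows. Definitions of the orthogonality index vary slightly across the EMD literature, so the cross-energy form below is one common variant rather than necessarily the paper's exact formula; RER follows directly from the squared-L2 description above.

```python
import numpy as np

def orthogonality_index(imfs):
    """Cross-term energy of the IMF set relative to total signal energy
    (one common OI definition; small values mean little mode mixing)."""
    imfs = np.asarray(imfs)
    total = np.sum(np.sum(imfs, axis=0) ** 2)       # energy of reconstruction
    cross = 0.0
    for i in range(len(imfs)):
        for j in range(i + 1, len(imfs)):
            cross += abs(np.sum(imfs[i] * imfs[j]))  # pairwise overlap
    return 2.0 * cross / total

def residual_energy_ratio(signal, residual):
    """RER: energy left in the residual r_n(t) relative to the parent y(t)."""
    return float(np.sum(np.asarray(residual) ** 2)
                 / np.sum(np.asarray(signal) ** 2))
```

Two sinusoids at well-separated frequencies give OI near zero, matching the text's reading that OI < 0.1 indicates effective modal separation.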
Assessment of prediction results and error analysis
For a clear demonstration of the superior performance of the ICEEMDAN-NCRBMO-AELM hybrid prediction system, a comparative analysis was conducted with six advanced or established hybrid approaches: (CPO-BITCN-BIGRU, 2024)36, (PSO-BP, 2023)25, (QRBI-LSTM, 2022)23, (CNN+Stacked+LSTM, 2022)24, (CEEMDAN-iMPA-BiLSTM, 2021)22, and (BKA-Transformer, 2024)29. Table 13 summarizes the specific classifications and parameter configurations of the hybrid forecasting approaches employed for comparison. This comprehensive comparison highlights the superiority of our system in data decomposition, long-term sequence modeling, multi-scale feature extraction, hyperparameter optimization, and capturing stochastic fluctuations and nonlinearity.
Figure 8(a) depicts the decomposition results of multi-seasonal wind power data using the ICEEMDAN method employed in our system, along with the resulting curves of the IMF components. Figure 8(b)-(e) present the comparative prediction curves of various hybrid approaches. The results show that the ICEEMDAN-NCRBMO-AELM system’s prediction curve is closest to the real-world wind power curve across a range of seasonal and environmental contexts. Figure 8(f) presents a bar chart that visually compares the evaluation metrics of the prediction results for each hybrid approach. Table 14 summarizes the specific seasonal evaluation metrics of the prediction results for each hybrid approach shown in Fig. 8(f). Experimental results show that despite the high randomness and intermittency associated with extreme weather in Jiangsu during June, the system achieved a prediction accuracy of 82.781% on the June validation set, with a MAPE of 17.219%, demonstrating its reliability under severe meteorological conditions. The system’s multi-seasonal average accuracy reached 86.455%, with a mean R2 of 0.893. MAE and RMSE were reduced to 4.049 and 7.196, respectively, consistently demonstrating high predictive precision across all evaluated metrics. Table 15 presents the average seasonal prediction metrics on the validation set, highlighting the system’s comprehensive performance. In contrast, the benchmark ELM, lacking data decomposition and optimization mechanisms, exhibited the worst performance, with a MAPE of 41.179%, 27.634 percentage points higher than our system’s. Table 16 summarizes the actual computational execution times of all hybrid methods applied to multi-seasonal wind power datasets in real-time environments, thereby highlighting the computational efficiency of each model and enhancing the numerical validation of the system’s overall performance.
Figure 9 illustrates the residual box plots of various hybrid forecasting methods derived from the June wind power dataset. The upper and lower boundaries of the box represent the 75th percentile (Q3) and the 25th percentile (Q1), respectively, with the box height (interquartile range, IQR = Q3 − Q1) reflecting the dispersion of the middle 50% of the residuals. Outliers (indicated by red ‘+’ symbols) are defined as data points exceeding Q3 + 1.5 × IQR or falling below Q1 − 1.5 × IQR, representing samples where predictions significantly deviate. The horizontal line within the box denotes the median (50th percentile), reflecting the central tendency of the residuals; the incorporated mean line (green dashed line) displays the arithmetic average of all residuals, indicating the direction and magnitude of systematic bias. The reference lines at ± 3σ (red dotted lines) are based on the standard deviation of the residuals σ, providing boundaries for extreme errors. A high-precision prediction model is characterized by a narrow box and a median line that is near zero (unbiased) or exhibits minimal deviation.
Table 17 tabulates the metric results of the ablation studies in multi-seasonal wind power forecasting tasks to systematically elucidate the primary contribution of each constituent module to the performance enhancement of the hybrid framework.
Comparison of wind power forecast profiles across multiple seasons using ICEEMDAN-NCRBMO-AELM and other representative hybrid prediction approaches. (a) Results of ICEEMDAN decomposition on multi-seasonal wind energy production data. (b)-(e) Comparison of forecast curves for representative months. (f) Bar chart comparing the performance metrics for multi-seasonal forecasting between ICEEMDAN-NCRBMO-AELM and other representative hybrid prediction approaches on the validation dataset.
Experimental results demonstrate that a forecasting architecture integrating efficient data decomposition, advanced intelligent optimization mechanisms, and rapid adaptive modeling achieves optimal performance in short-term wind power forecasting, and that it exhibits markedly enhanced stability under volatile climatic and extreme operational conditions. Table 18 compares different data decomposition methods: ICEEMDAN improves the CC by 17.72% over traditional CEEMDAN, indicating a superior ability to separate high-frequency and low-frequency components and to reduce cross-frequency interference, while OI and RER decreased by 73.21% and 66.67%, respectively, confirming that ICEEMDAN mitigates mode mixing while preserving the low-frequency components representing long-term seasonal trends.
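The CC in Table 18 presumably measures how faithfully the summed decomposition components reconstruct the original series. A minimal check under that assumption (the `reconstruction_cc` helper and the toy components are illustrative):

```python
import numpy as np

def reconstruction_cc(signal, imfs):
    """Pearson correlation between the original series and the sum of its
    decomposition components; a lossless decomposition yields CC = 1."""
    recon = np.sum(imfs, axis=0)
    return np.corrcoef(signal, recon)[0, 1]

# Toy signal: a fast oscillation plus a slow trend, standing in for IMF + residue
t = np.linspace(0.0, 1.0, 200)
fast = np.sin(2 * np.pi * 5 * t)
slow = 0.1 * t
cc = reconstruction_cc(fast + slow, np.vstack([fast, slow]))
```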
Comparative analysis
Seasonal variations are manifested through a series of coupled, time-varying meteorological patterns that precisely expose structural deficiencies in existing predictive models. For the CPO-BiTCN-BiGRU model, during the spring and autumn seasons, the correlation between instantaneous wind speed and antecedent states diminishes abruptly; its inherent strong temporal coupling forces an excessive reliance on invalidated historical dependencies, hindering rapid response to “step-wise” transitions triggered by transiting weather systems and resulting in significant lagging errors at the onset and termination of fluctuations. In the case of deep stacking models like CNN+Stacked LSTM, the gradients propagated through deep networks during high-frequency seasonal peaks become abnormally large due to the chain rule of differentiation. Seasonal high volatility directly triggers periodic gradient explosions, causing unstable weight updates and divergent training processes in specific seasons, which manifest as non-physical oscillations in predictive outputs. For the PSO-BP model, “particles” tend to cluster around the optimal solution of the current season; during the transition to summer, while the center and morphology of the data distribution shift, the particle swarm remains entrapped in the local optima of the previous season due to historical inertia, failing to effectively explore the novel solution space and leading to systematic biases. Regarding the CEEMDAN-iMPA-BiLSTM, the fixed noise amplitude in CEEMDAN fails to adjust for decomposing non-stationary signals during seasonal transitions. Increased diurnal temperature variations in late spring cause confusion between daily fluctuations and synoptic-scale oscillations, leading to mode mixing where “seasonal components” retain short-term noise and “detail components” are contaminated by trend information. 
Furthermore, the BiLSTM, predicated on the assumption of symmetry between past and future data, lacks the capacity to model asymmetric seasonal boundary transitions, resulting in cumulative errors. For quantile regression models like QRBI-LSTM, high and low quantile objectives remain relatively consistent during the stable summer season; however, in volatile spring and autumn, the fluctuation patterns of extreme wind speeds (high quantiles) differ drastically from those of low-to-moderate speeds (mid-to-low quantiles), intensifying the conflict between disparate quantile regression targets.
Notably, the BKA-Transformer model demonstrates considerable potential. On one hand, the multi-head self-attention mechanism within the Transformer bypasses the sequential recursion constraints of recurrent neural networks, enabling direct modeling of global dependencies across arbitrary spans within meteorological sequences such as wind speed and temperature. This facilitates an enhanced capacity to capture long-period fluctuation patterns characterizing seasonal and synoptic scales while fundamentally circumventing gradient disappearance or explosion issues inherent in long-term sequence dependencies. On the other hand, as an efficient optimization algorithm, BKA adaptively identifies the optimal configuration for critical Transformer parameters that align with the data distribution characteristics of different seasons, thereby endowing the model with dynamic seasonal adaptability.
Unlike the CEEMDAN-iMPA-BiLSTM hybrid approach that uses traditional CEEMDAN, the proposed system decomposes input data across multiple scales using ICEEMDAN, enabling each component to capture specific trend characteristics. ICEEMDAN adopts a dynamic noise-weighting scheme while continuously updating residual signals, markedly suppressing mode mixing between frequency-disparate components. This effectively isolates long-term trends (e.g., seasonal variations) from short-term fluctuations (e.g., abrupt wind speed and pressure changes induced by extreme weather), enhancing the physical clarity of the decomposition results and the interpretability of the model. Compared with existing deep learning-based hybrid approaches, the AELM model directly computes output layer weights via the closed-form solution of the Moore-Penrose pseudoinverse, minimizing computational overhead. By eschewing backpropagation and gradient descent, AELM fundamentally circumvents exploding gradient problems inherent in deep neural networks due to chain rule differentiation, endowing the system with stability. The incorporation of adaptive activation functions dynamically modulates function morphology in response to shifting data distribution characteristics, bolstering the system’s capacity to accommodate non-stationarity in heterogeneous data streams. Furthermore, the NCRBMO dynamically adjusts input-layer weights and biases through its iterative optimization, mitigating output variability in SLFNs induced by random initialization. This process concurrently refines the nonlinear mapping architecture within hidden layers and the alignment accuracy between feature space representations and training data distributions. Finally, K-fold cross-validation mitigates reliance on specific sample partitioning, substantially enhancing the system’s generalization.
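The closed-form output-weight computation described above is the core of any ELM variant. The sketch below is a minimal illustration, not the paper's AELM: input weights are random and fixed, output weights come from the Moore-Penrose pseudoinverse, and the adaptive activation is reduced to a single fixed slope parameter `alpha` for brevity (all names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit_predict(X_train, y_train, X_test, n_hidden=32, alpha=1.0):
    """Minimal ELM sketch: random hidden layer, no backpropagation.
    `alpha` stands in for an adaptive activation-shape parameter."""
    n_features = X_train.shape[1]
    W = rng.standard_normal((n_features, n_hidden))  # random input weights (fixed)
    b = rng.standard_normal(n_hidden)                # random hidden biases
    act = lambda Z: np.tanh(alpha * Z)               # slope-parameterised activation
    H = act(X_train @ W + b)                         # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y_train               # closed-form output weights
    return act(X_test @ W + b) @ beta

# Toy regression task (hypothetical data)
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] + 0.5 * X[:, 1]
pred = elm_fit_predict(X[:150], y[:150], X[150:])
```

In the full system, NCRBMO would search over such parameters rather than leaving them fixed, and the training runs inside a K-fold cross-validation loop; both are omitted here to keep the single-pass closed-form solution visible.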
Comparative analysis of residual box plots for diverse hybrid forecasting methods based on June wind power data. The green dashed line denotes the arithmetic mean of the residuals, while the red dotted lines provide a ±3σ reference based on the standard deviation σ of the residuals. The interquartile range, represented by the box height, indicates the dispersion of the central 50% of the residuals. High-precision forecasting models are characterized by narrow boxes, with the median line positioned near zero or exhibiting minimal deviation.
Conclusions
This research presents a competitive hybrid prediction system, ICEEMDAN-NCRBMO-AELM, which integrates data decomposition and intelligent computation technologies, significantly improving the precision of multi-seasonal wind power forecasting and providing reliable decision support for renewable energy management under sustainability goals.
During the decomposition stage, ICEEMDAN accurately decouples the non-stationary and nonlinear constituents inherent in wind power series. Through dynamic noise weighting and residual updates, it effectively suppresses mode mixing, and the resulting intrinsic mode functions carry distinct physical significance, enhancing data interpretability. In the global optimization stage, NCRBMO introduces a population initialization strategy combining multiphase mapping and inverse generation, overcoming uneven population distribution. The algorithm further incorporates a normal cloud model for dynamic perturbation to break free from a fixed search pathway; even when trapped in local optima, it eliminates suboptimal solutions through an elite preservation mechanism, while the switching coefficient φ balances exploration breadth and depth. Evaluations on the publicly available IEEE CEC benchmark suite indicate that, in 89% of trials, NCRBMO outperforms competing state-of-the-art algorithms, achieving a Friedman ranking score of 1.172414 and demonstrating strong global optimization ability and a swift convergence rate. During the hybrid forecasting stage, NCRBMO is employed for the iterative optimization of the SLFN, determining the optimal output weights and biases. AELM leverages an adaptive activation function to characterize the distributions of diverse data, enhancing the system’s ability to handle the non-stationarity of heterogeneous data streams, and K-fold cross-validation further augments the system’s generalization capacity. The multi-seasonal wind power forecasting results for the Jiangsu region indicate that our system robustly captures the intricate fluctuation features and periodic seasonal patterns in wind data, achieving a mean prediction accuracy of 86.455%, a 27.634-percentage-point improvement over the benchmark ELM.
Meanwhile, the mean R2 reaches 0.893, with MAE and RMSE reduced to 4.049 and 7.196, respectively, surpassing the other representative hybrid approaches. These results underscore the accuracy and reliability of our system, offering a fresh research perspective on multi-seasonal wind power forecasting and critical support for advancing renewable energy management under diverse and challenging meteorological conditions.
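The normal cloud model underlying NCRBMO's dynamic perturbation (noted above) admits a compact generator: each cloud drop is drawn from N(Ex, En′²), where En′ is itself drawn from N(En, He²). A sketch under that standard formulation from cloud model theory (helper name and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def normal_cloud_drops(Ex, En, He, n):
    """Generate n drops from a normal cloud model (Ex, En, He):
    x ~ N(Ex, En'^2) with En' ~ N(En, He^2)."""
    En_prime = rng.normal(En, He, size=n)        # second-order randomness
    return rng.normal(Ex, np.abs(En_prime))      # drops centred on expectation Ex

# Drops cluster around Ex with spread controlled jointly by En and He
drops = normal_cloud_drops(0.0, 1.0, 0.1, 20000)
```

In an optimizer, `Ex` would typically track a promising solution while `En` and `He` control the perturbation radius and its uncertainty, giving a search step whose dispersion adapts rather than staying fixed.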
Although this research has achieved significant progress in multi-seasonal wind power forecasting, certain limitations remain that necessitate further investigation. Current experiments are primarily based on data from a single wind farm within a specific monsoon climate zone; thus, future research will extend this framework to multiple wind farms across diverse climatic regions (e.g., arid inland or plateau regions) to comprehensively evaluate the model’s generalization capability. Furthermore, subsequent plans involve integrating the prediction module into real-time grid dispatching systems or operational tools to verify its practical utility in online environments. Finally, it is essential to conduct in-depth cost-benefit analyses to translate forecasting accuracy improvements into concrete economic indicators, quantifying the system’s actual contribution to reducing ancillary service costs and enhancing wind power integration.
Data availability
The datasets generated and analyzed during the current study, along with the custom source code, are publicly available in the GitHub repository: https://github.com/Pro-cai/SR-NCRBMO-DATASET and https://github.com/Pro-cai/SR-ICEEMDAN-NCRBMO-AELM.
References
Zhang, Y. et al. A short-term wind energy hybrid optimal prediction system with denoising and novel error correction technique. Energy 254, 124378. https://doi.org/10.1016/j.energy.2022.124378 (2022).
Chen, X. et al. Global perspectives on wind energy innovation: Policy impacts and component-level analysis. Energy 319, 135000. https://doi.org/10.1016/j.energy.2025.135000 (2025).
Antonini, E. G. A. et al. Identification of reliable locations for wind power generation through a global analysis of wind droughts. Commun. Earth Environ. 5(1), 103. https://doi.org/10.1038/s43247-024-01260-7 (2024).
Global Wind Energy Council (GWEC). Global Wind Report 2024. https://www.gwec.net/ (2024).
Wang, J. et al. A novel combined forecasting model based on neural networks, deep learning approaches, and multi-objective optimization for short-term wind speed forecasting. Energy 251, 123960. https://doi.org/10.1016/j.energy.2022.123960 (2022).
Tuncar, E. A., Sağlam, Ş & Oral, B. A review of short-term wind power generation forecasting methods in recent technological trends. Energy Rep. 12, 197–209. https://doi.org/10.1016/j.egyr.2024.06.006 (2024).
Yu, L. et al. A complexity-trait-driven rolling decomposition-reconstruction-ensemble model for short-term wind power forecasting. Sustain. Energy Technol. Assess. 49, 101794. https://doi.org/10.1016/j.seta.2021.101794 (2022).
Hassan, A. Y. et al. Wind cube optimum design for wind turbine using meta-heuristic algorithms. Alex. Eng. J. 61(6), 4911–4929. https://doi.org/10.1016/j.aej.2021.09.059 (2022).
Liu, H. & Chen, C. Data processing strategies in wind energy forecasting models and applications: A comprehensive review. Appl. Energy 249, 392–408. https://doi.org/10.1016/j.apenergy.2019.04.188 (2019).
Hu, Y. et al. Temporal collaborative attention for wind power forecasting. Appl. Energy 357, 122502. https://doi.org/10.1016/j.apenergy.2023.122502 (2024).
Wang, Y. et al. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 304, 117766. https://doi.org/10.1016/j.apenergy.2021.117766 (2021).
Huachen, L. et al. Hybrid prediction method for solar photovoltaic power generation using normal cloud parrot optimization algorithm integrated with extreme learning machine. Sci. Rep. (1), 6491. https://doi.org/10.1038/s41598-025-89871-8 (2025).
Wang, A. et al. Random-forest based adjusting method for wind forecast of WRF model. Comput. Geosci. 155, 104842. https://doi.org/10.1016/j.cageo.2021.104842 (2021).
Zheng, L. et al. Short-term wind power prediction model based on WRF-RF model. In 2023 8th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA) 26–28 (IEEE, 2023). https://doi.org/10.1109/ICCCBDA56900.2023.10154834
Cao, Y. et al. Wind power ultra-short-term forecasting method combined with pattern-matching and ARMA-model. In 2013 IEEE Grenoble Conference 1–4 (IEEE, 2013). https://doi.org/10.1109/PTC.2013.6652257
Ahn, E. J. & Hur, J. A short-term forecasting of wind power outputs using the enhanced wavelet transform and arimax techniques. Renew. Energy 212, 394–402. https://doi.org/10.1016/j.renene.2023.05.048 (2023).
Malakouti, S. M. Estimating the output power and wind speed with ML methods: A case study in Texas. Case Stud. Chem. Environ. Eng. 7, 100324. https://doi.org/10.1016/j.cscee.2023.100324 (2023).
Malakouti, S. M. Prediction of wind speed and power with LightGBM and grid search: Case study based on SCADA system in Turkey. Int. J. Energy Prod. Manag. 8(1), 35–40. https://doi.org/10.18280/ijepm.080105 (2023).
Malakouti, S. M. Improving the prediction of wind speed and power production of SCADA system with ensemble method and 10-fold cross-validation. Case Stud. Chem. Environ. Eng. 8, 100351. https://doi.org/10.1016/j.cscee.2023.100351 (2023).
Malakouti, S. M. et al. Predicting wind power generation using machine learning and CNN-LSTM approaches. Wind Eng. 46(6), 1853–1869. https://doi.org/10.1177/0309524X221113013 (2022).
Malakouti, S. M. et al. Advanced techniques for wind energy production forecasting: Leveraging multi-layer perceptron + Bayesian optimization, ensemble learning, and CNN-LSTM models. Case Stud. Chem. Environ. Eng. 10, 100881. https://doi.org/10.1016/j.cscee.2024.100881 (2024).
Peng, T. et al. An integrated framework of Bi-directional long-short term memory (BiLSTM) based on sine cosine algorithm for hourly solar radiation forecasting. Energy 221, 119887. https://doi.org/10.1016/j.energy.2021.119887 (2021).
Wang, J. et al. A novel ensemble probabilistic forecasting system for uncertainty in wind speed. Appl. Energy 313, 118796. https://doi.org/10.1016/j.apenergy.2022.118796 (2022).
Elizabeth Michael, N. et al. Short-term solar power predicting model based on multi-step CNN stacked LSTM technique. Energies 15(6), 2150. https://doi.org/10.3390/en15062150 (2022).
Qiao, J. et al. Application research on the prediction of tar yield of deep coal seam mining areas based on PSO-BPNN machine learning algorithm. Front. Earth Sci. 11, 1227154. https://doi.org/10.3389/feart.2023.1227154 (2023).
Fu, S. et al. Red-billed blue magpie optimizer: A novel metaheuristic algorithm for 2D/3D UAV path planning and engineering design problems. Artif. Intell. Rev. 57(6), 134. https://doi.org/10.1007/s10462-024-10716-3 (2024).
Lian, J. et al. Parrot optimizer: Algorithm and applications to medical problems. Comput. Biol. Med. 108064. https://doi.org/10.1016/j.compbiomed.2024.108064 (2024).
Amiri, M. H. et al. Hippopotamus optimization algorithm: A novel nature-inspired optimization algorithm. Sci. Rep. 14(1), 5032. https://doi.org/10.1038/s41598-024-54910-3 (2024).
Wang, J. et al. Black-winged kite algorithm: A nature-inspired meta-heuristic for solving benchmark functions and engineering problems. Artif. Intell. Rev. 57(4), 98. https://doi.org/10.1007/s10462-024-10723-4 (2024).
Wang, W. et al. Arctic puffin optimization: A bio-inspired metaheuristic algorithm for solving engineering design optimization. Adv. Eng. Softw. 195, 103694. https://doi.org/10.1016/j.advengsoft.2024.103694 (2024).
Kiani, F. et al. Chaotic sand cat swarm optimization. Mathematics 11(10), 2340. https://doi.org/10.3390/math11102340 (2023).
Xue, J. & Shen, B. Dung beetle optimizer: A new meta-heuristic algorithm for global optimization. J. Supercomput. 79(7), 7305–7336. https://doi.org/10.1007/s11227-022-04959-6 (2023).
Mirjalili, S. & Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 95, 51–67. https://doi.org/10.1016/j.advengsoft.2016.01.008 (2016).
Jearsiripongkul, T. et al. An improved transient search optimization algorithm for building energy optimization and hybrid energy sizing applications. Sci. Rep. 14(1), 17644. https://doi.org/10.1038/s41598-024-68239-4 (2024).
Huachen, L. et al. Hybrid forecasting system for renewable energy generation incorporating seasonal variability under sustainable development targets. Alex. Eng. J. 128, 949–975. https://doi.org/10.1016/j.aej.2025.08.009 (2025).
Abdel-Basset, M., Mohamed, R. & Abouhawwash, M. Crested Porcupine Optimizer: A new nature-inspired metaheuristic. Knowl. Based Syst. 284, 111257 https://doi.org/10.1016/j.knosys.2023.111257 (2024).
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 11975177).
Author information
Contributions
Conceptualization, Huachen Liu and Changlong Cai; methodology, Huachen Liu; software, Huachen Liu and Chao Tang; validation, Pangyue Li; formal analysis, Huachen Liu; investigation, Mingwei Zhao; resources, Huachen Liu and Yichen Ma; data curation, Yang Li and Ben Tu; writing—original draft preparation, Huachen Liu; writing—review and editing, Huachen Liu, Xinyan Zheng, and Yanmou Wang; visualization, Huachen Liu and Pangyue Li; supervision, Changlong Cai and Haifeng Liang; project administration, Changlong Cai and Minghui Chen; funding acquisition, Changlong Cai.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, H., Cai, C., Li, P. et al. Hybrid prediction system for reliable multi-seasonal sustainable energy generation under meteorological and environmental volatility. Sci Rep 16, 8637 (2026). https://doi.org/10.1038/s41598-026-40486-7