Introduction

Wind speed modeling is crucial in meteorology, with significant applications in optimizing renewable energy, designing robust structures, and improving weather forecasting models1,2. Accurate prediction of wind speeds at altitudes of 10 and 100 m is particularly important for these applications3,4. Despite advancements in the field, simultaneously predicting wind speeds at these two heights remains challenging due to the distinct atmospheric dynamics involved5,6.

At 10 m, wind speed is heavily influenced by surface roughness and obstacles such as buildings, trees, and terrain7. These factors create localized turbulence and variability, resulting in less predictable wind patterns. Conversely, at 100 m, wind speed is less affected by surface obstructions, leading to a more stable and consistent profile8,9. These differences in atmospheric conditions necessitate a modeling approach that accounts for the unique dynamics of each height.

Wind speed modeling at various heights has been explored using diverse methodologies, ranging from traditional numerical weather prediction (NWP) models to advanced machine learning techniques.

NWP models rely on physical principles and mathematical equations to simulate atmospheric processes10. Examples include widely used models such as WRF and ECMWF11,12. NWP models provide detailed and accurate forecasts but are computationally intensive, requiring extensive data sets and high computational power13,14. Despite their success in predicting wind speeds, their high computational cost limits flexibility and accessibility12,15.

Computational fluid dynamics (CFD) models are physics-based models that simulate fluid flows and their interaction with surfaces, making them effective for studying wind behavior around complex structures such as buildings and turbines16. While CFD models can capture detailed flow patterns and turbulence, they are computationally demanding and require specialized resources, so they are typically limited to localized studies or specific scenarios17,18.

Machine learning (ML) techniques, such as artificial neural networks (ANN), support vector machines (SVM), and deep learning models, have gained popularity for modeling complex, nonlinear relationships in large datasets19,20. Deep learning models like CNNs and RNNs are particularly effective in identifying patterns in wind speed data21,22. These ML models can adapt and improve over time, but they are computationally intensive and difficult to interpret23,24.

Hybrid approaches combining physics-based models with machine learning are also being explored. These models leverage the detailed understanding from NWP and CFD models along with the adaptability of ML techniques. While hybrid models can improve forecasting accuracy and reduce computational demands, they present challenges in integrating and balancing the different modeling approaches25,26.

Despite advancements in traditional and machine learning methods, predicting wind speeds at multiple heights continues to pose challenges. Traditional models like NWP and CFD are effective but computationally demanding and inflexible17,18. ML models, while promising, struggle to provide simultaneous predictions at multiple altitudes and often lack interpretability23,24. Hybrid models, which combine data-driven27 and physics-based approaches, offer improved accuracy but still face challenges in efficiency and generalization across varied conditions25,26. Thus, there is a need for a unified model capable of predicting wind speeds at both 10 and 100 m, leveraging advanced computing while addressing the limitations of current models.

To address these challenges, this study introduces the brain emotional learning based on basic and functional memories (BELBFM) model, inspired by the emotional learning mechanisms of the mammalian brain. BELBFM leverages basic and functional memories to create an adaptive framework for modeling wind behavior. By integrating inputs from sensory, thalamic, cortical, amygdala, and orbitofrontal components, the model captures the complexities of wind speed dynamics with enhanced accuracy and computational efficiency.

The contributions of this study are as follows:

  • Unified dual-height modeling: BELBFM predicts wind speeds at both 10 and 100 m, addressing the distinct atmospheric dynamics at these heights.

  • Innovative methodology: The model employs an ensemble learning framework, combining outputs from various memory units trained on layered wind speed data, optimized for performance.

  • Efficient training: A correlation-based data pruning technique significantly reduces the training dataset, enhancing computational efficiency without compromising accuracy.

  • Real-world applicability: The model demonstrates potential for applications in renewable energy management, weather forecasting, and disaster preparedness.

By addressing the limitations of existing methods, BELBFM provides a novel solution for simultaneous wind speed prediction at multiple altitudes. The rest of the paper is structured as follows: Sect. 2 introduces the proposed model, Sect. 3 details its implementation, Sect. 4 presents the results, and Sect. 5 discusses the findings and concludes the study.

Methodology: brain emotional learning based on basic and functional memories (BELBFM)

Over the past three decades, a novel approach inspired by the mammalian brain’s emotional learning mechanisms has been developed for modeling and forecasting complex nonlinear systems28. Brain Emotional Learning-Based Models (BELMs) stem from neuroscience, psychology, AI, and computational modeling29. Psychologists like John Watson and neuroscientists such as Joseph LeDoux, who studied the amygdala’s role in fear conditioning, laid the groundwork for this field30. In 2001, Balkenius and Morén introduced a computational model of the interaction between the amygdala and orbitofrontal cortex in emotional conditioning as the first BELM31. This model laid the foundation for further advancements. In 2004, Lucas and his team introduced BELBIC (brain emotional learning-based intelligent controller)32, which was applied to domains such as washing machine control33 and dynamic system prediction34. Subsequent improvements enabled applications in intelligent control, prediction, and emotional learning. For instance, BELMs have been used for DC motor speed control35, earthquake prediction36, emotion recognition37,38, and Alzheimer’s diagnosis39. BELMs have also been integrated with other intelligent methods, such as fuzzy neural networks to enhance performance40. Recent developments include applications in humanoid robots41, active noise cancellation systems42, and time-series prediction43. Hybrid models combining BELMs with deep learning and fuzzy techniques have further expanded their real-world applicability38,44. From simulating brain processes to addressing complex systems, BELMs have evolved into powerful tools for tackling increasingly sophisticated challenges when combined with modern AI techniques.

As mentioned before, various computational models have been inspired by the human brain. One of these important models is the amygdala-orbitofrontal subsystem model45. The amygdala-orbitofrontal subsystem, which is the main basis of many computational models, has a simple structure. As shown in Fig. 1, this subsystem consists of four interconnected components: the sensory cortex, thalamus, amygdala, and orbitofrontal cortex. Sensory inputs are processed through the thalamus and sensory cortex before reaching the amygdala and orbitofrontal cortex. The amygdala generates emotional responses (E), while the orbitofrontal cortex modulates these responses based on feedback (REW) to refine decision-making and control in dynamic environments. Various architectures of the amygdala-orbitofrontal subsystem have been presented and used in the applications mentioned above.

Fig. 1
figure 1

Graphical description of the amygdala-orbitofrontal subsystem45.

BELMs and their advancements represent a significant evolution in computational modeling, offering robust solutions for addressing complex, real-world challenges.

Building on BELM advancements, this paper proposes Brain Emotional Learning Based on Basic and Functional Memories (BELBFM), a novel variant of the amygdala-orbitofrontal subsystem augmented with specialized memory for the amygdala and orbitofrontal parts, to model wind speeds simultaneously at 10 and 100 m. The architecture of the proposed model (depicted in Fig. 2) consists of five main components: Sensory input (SI), Thalamus (TH), Sensory cortex (SC), Amygdala (AMIG), and Orbitofrontal cortex (OFC).

  • SI: Receives and labels input signals (wind components u and v) for wind speeds at both 10 and 100 m.

  • TH: Calculates wind speeds at both altitudes, then passes these results to the SC block. Additionally, it sends an output indicating the wind speed type (10 or 100 m) to the F unit in the AMIG.

  • SC: Generates input-target vectors required for training, evaluation, and exploitation phases and sends them to both the OFC and AMIG blocks.

  • OFC: Includes two basic memory units (O1 and O2) and two functional memory units (WO1 and WO2).

  • AMIG: Contains a basic memory unit (A), a functional memory unit (WA), and a fusion unit (F). Basic memory units are trained with data from the SC block, while functional memory units store performance history based on error rates. The F unit integrates the results from the basic and functional memory units.

Fig. 2
figure 2

Architecture of the proposed BELBFM model, illustrating the five main components: Sensory input (SI), Thalamus (TH), Sensory cortex (SC), Amygdala (AMIG), and Orbitofrontal cortex (OFC). Each component is designed to process and integrate input data for wind speed prediction at multiple altitudes by utilizing both basic and functional memories. This architecture leverages emotional learning mechanisms to enhance prediction accuracy and adaptability.

This architecture emulates emotional learning processes and aims to enhance wind speed prediction accuracy by combining diverse memory types and leveraging past performance.

BELBFM implementation

Main dataset

This study relies on atmospheric data obtained from the ERA5 reanalysis dataset. ERA5, developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), provides a detailed picture of Earth’s atmosphere since 1940. With a spatial resolution of 0.25 degrees and hourly updates, it captures even small-scale wind variations, which are critical for precise forecasting. In the equatorial region, each grid cell represents an area of approximately 29 × 29 km46. The focus is on the u-component and v-component of wind speed at 10 and 100 m above ground level, measured in meters per second, to calculate surface wind speed.
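For readers who wish to reproduce the data retrieval, a minimal sketch of a Climate Data Store (CDS) API request for these four variables is shown below. The helper name `era5_request` and the output filename are illustrative and not part of the paper; the variable names follow the ERA5 single-levels catalogue.

```python
def era5_request(years, area):
    """Build a CDS API request dict for u/v wind components at 10 m and
    100 m. `area` is [north, west, south, east] in degrees."""
    return {
        "product_type": "reanalysis",
        "variable": [
            "10m_u_component_of_wind", "10m_v_component_of_wind",
            "100m_u_component_of_wind", "100m_v_component_of_wind",
        ],
        "year": [str(y) for y in years],
        "month": [f"{m:02d}" for m in range(1, 13)],
        "day": [f"{d:02d}" for d in range(1, 32)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "area": area,           # [N, W, S, E]
        "format": "netcdf",
    }

# Retrieval requires a registered CDS account and credentials, e.g.:
# import cdsapi
# cdsapi.Client().retrieve("reanalysis-era5-single-levels",
#                          era5_request(range(2001, 2021), [31, 47, 23, 59]),
#                          "pg_wind.nc")
```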

Feature selection

In this study, a neighborhood-based approach for feature selection was employed, considering the influence of neighboring cells on wind speed at specific locations. The neighborhood-based pattern (NBP) ensures systematic data collection across the entire area, providing comprehensive spatial coverage of wind speeds.

To extract relevant features from the wind speed data, an effective NBP must be defined. As shown in Fig. 3, various cellular patterns, such as 3 × 3, 5 × 5, and 7 × 7, can be used to identify the appropriate features. Wind patterns often undergo significant variations over short distances due to topographical features such as mountains and valleys. Using larger cellular patterns (e.g., 7 × 7) gathers more information from neighboring points, enabling the model to better capture complex and extensive wind patterns that may influence wind speed at the central point. However, larger cellular patterns incur higher computational costs and longer processing times. Conversely, modeling with smaller grids (e.g., 3 × 3) offers lower computational costs and requires less processing time and resources. However, it may fail to capture spatial details effectively, reducing prediction accuracy.

Fig. 3
figure 3

Cellular patterns for neighborhood-based feature selection. (a) is a 3 × 3 pattern, (b) is a 5 × 5 pattern, and (c) is a 7 × 7 pattern.

To balance accuracy, complexity, and computational efficiency, larger grids can be utilized in a pruned manner. With a simple argument, it can be assumed that the cellular layers closer to the center of the pattern have the greatest effect on the central cell of the pattern. The cells located in the farther layers of the cellular network can be pruned alternately. This selection method ensures that the wind effect in this layer is distributed evenly, with a gradual decrease from the cells closer to the center to the more distant cells, while the information from the neighboring cells closer to the center, which have a stronger wind effect, remains prioritized.

In this study, a pruned 5 × 5 grid pattern is employed, as illustrated in Fig. 4. While a regular 5 × 5 pattern includes 25 features, the pruned version reduces this number to 17. Feature vectors based on this pruned pattern are constructed according to Eq. (1). In this equation, \({C1}_{\left(t\right)}^{\left(i\right)}\) denotes the wind speed in the central cell at time t, and the superscript i indicates the position of the pattern in the database grid. \({C2}_{\left(t\right)}^{\left(i\right)}\) to \({CX}_{\left(t\right)}^{\left(i\right)}\) represent the wind speeds in the neighboring cells of \({C1}_{\left(t\right)}^{\left(i\right)}\) within the neighborhood-based pattern, and \({C1}_{(t+1)}^{\left(i\right)}\) is the wind speed in the central cell at time t + 1, used as the target.

$$\mathit{Feature\ Vector}_{\left(t\right)}^{\left(i\right)}:\left\{\begin{array}{l}\text{Inputs}=[{C1}_{\left(t\right)}^{\left(i\right)},{C2}_{\left(t\right)}^{\left(i\right)},\dots,{C17}_{\left(t\right)}^{\left(i\right)}]\\ \text{Target}={C1}_{(t+1)}^{\left(i\right)}\end{array}\right.$$
(1)
Fig. 4
figure 4

Selected features in the 5 × 5 grid used for generating the input vector.
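The pruned-neighborhood extraction of Eq. (1) can be sketched in Python as follows. The exact cells retained in Fig. 4 are not reproduced here, so the mask below (the inner 3 × 3 block plus alternate cells of the outer ring, 17 cells in total) is an illustrative assumption, and features are emitted in row-major rather than centre-first order.

```python
import numpy as np

# Illustrative pruned 5x5 mask: the inner 3x3 block is kept in full and the
# 16-cell outer ring is thinned to alternate cells (9 + 8 = 17 features).
# The exact cells pruned in Fig. 4 may differ; this mask is an assumption.
MASK = np.array([
    [1, 0, 1, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 1],
], dtype=bool)

def feature_vectors(speed, t):
    """Build (inputs, target) pairs per Eq. (1) from a gridded wind-speed
    array `speed` of shape (time, lat, lon) at time step t.
    Note: features come out in row-major order, not centre-first."""
    pairs = []
    grid, nxt = speed[t], speed[t + 1]
    for i in range(2, grid.shape[0] - 2):
        for j in range(2, grid.shape[1] - 2):
            window = grid[i - 2:i + 3, j - 2:j + 3]
            inputs = window[MASK]   # 17 neighbourhood features
            target = nxt[i, j]      # centre cell at t + 1
            pairs.append((inputs, target))
    return pairs
```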

Region and time range selection

The Persian Gulf (PG) is a shallow sea bordered by the mountainous coastlines of Iran and the flat shores of the Arabian Peninsula, with an average depth of only 35 m, reaching up to 180 m in the northern part47. Positioned in a subtropical high-pressure zone, the PG experiences a dry climate characterized by low rainfall and high evaporation, making it particularly vulnerable to the impacts of climate change48. The PG’s climate is influenced by Mediterranean weather systems and the Indian monsoon, with two main seasons (summer and winter) and brief spring and fall transitions49,50.

One of the key climatic features of the PG is the shamal, a strong northwesterly wind that blows year-round and significantly affects the region’s weather. These winds exhibit distinct patterns in summer and winter51. During late spring, low-pressure thermal systems form over southern Iran and Saudi Arabia, while a high-pressure ridge extends from the Mediterranean eastward, creating a pressure gradient that generates shamal winds. In winter (November to March), these winds, linked to mid-latitude weather systems, are stronger, reaching speeds of 15–20 m per second and leading to dust storms and reduced visibility. Winter shamal winds are more intense than those in summer, influencing the region’s precipitation-evaporation balance50,52.

Given the distinct seasonal patterns of shamal winds and their significant climatic impact, the PG serves as an ideal location for this research. Figure 5 shows the PG region. Data from 2001 to 2020 for the PG region (23° to 31°N, 47° to 59°E) were used for training and testing sets, while data from 2021 to 2023 were employed to evaluate the model’s performance across different time periods and locations.

Fig. 5
figure 5

Region of the Persian Gulf (23° to 31°N, 47° to 59°E).

Train and test datasets generation

We retrieved U and V wind component data from the ERA5 dataset to calculate wind speed in meters per second. Using the neighborhood model described in the “Feature Selection” section, input-target vector pairs were extracted from the wind speed data. The initial training dataset consisted of approximately 350 million records for the PG region, spanning the years 2001 to 2020.

The large size of the training dataset presents a significant challenge for the modeling process. Record pruning is a technique designed to reduce the size of large training datasets by removing redundant records53. This approach involves analyzing feature vectors to identify and eliminate those that are highly correlated. Highly correlated feature vectors can introduce redundancy and provide little to no additional information to the model. By pruning these vectors, the dataset is refined to retain only the most informative and unique features.

In this study, a correlation-based pruning approach was employed, utilizing the Spearman correlation coefficient. This coefficient measures both the strength and direction of monotonic relationships between records, making it particularly suitable for identifying highly representative records within the dataset. The process is as follows:

  a) A threshold value between 0 and 1 is determined.

  b) The Spearman correlation coefficient of each record is calculated against the entire dataset.

  c) Records with a maximum absolute correlation coefficient less than or equal to the threshold are retained as training records, while the remaining records are designated as testing records.

To apply the record pruning technique, a threshold value of 0.55 for the Spearman correlation coefficient was determined through a trial-and-error approach. After the pruning process, the number of records in the training dataset was reduced from approximately 350 million to 242,628 records—less than 0.07% of the original dataset size. The reduced dataset was then used as the training dataset, while the remaining records were designated as the test dataset.
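A simplified sketch of the pruning procedure above: Spearman correlation is computed here as the Pearson correlation of ranks, and each record is compared greedily against records already kept rather than against the entire dataset, which is a simplification of the paper's procedure for illustration only.

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks
    (simple argsort ranks; ties are not specially handled here)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

def prune_records(records, threshold=0.55):
    """Greedy sketch of correlation-based record pruning: a record joins
    the training set only if its maximum absolute Spearman correlation
    with every record kept so far is <= threshold; otherwise it becomes
    a test record."""
    train, test = [], []
    for rec in records:
        rec = np.asarray(rec, float)
        if all(abs(spearman(rec, kept)) <= threshold for kept in train):
            train.append(rec)
        else:
            test.append(rec)
    return train, test
```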

Additionally, to ensure the model’s robustness and generalizability, a separate test dataset was created using data from a different time period (2021 to 2023) that was excluded from the training dataset. Detailed descriptions of the training and test datasets are provided in (Table 1).

Table 1 Description of used datasets.

Model generation

Input labeling

Input signals are received in the SI block and labeled based on Eq. (2), where \(\:{SI}_{\left(t\right)}^{S}\) represents the wind components (u and v) corresponding to 10-meter and 100-meter speeds at time t.

$$SI_{\left(t\right)}^{S}:\left\{\begin{array}{l}v10_{\left(t\right)}\\ u10_{\left(t\right)}\\ v100_{\left(t\right)}\\ u100_{\left(t\right)}\end{array}\right.$$
(2)

Wind speed calculation in TH block

In the TH block, the 10 and 100 m wind speeds are calculated from the labeled wind components using Eq. (3).

$$TH_{\left(t\right)}^{S}:\left\{\begin{array}{l}{Speed\_10}_{\left(t\right)}=\sqrt{{u10}_{\left(t\right)}^{2}+{v10}_{\left(t\right)}^{2}}\\ {Speed\_100}_{\left(t\right)}=\sqrt{{u100}_{\left(t\right)}^{2}+{v100}_{\left(t\right)}^{2}}\end{array}\right.$$
(3)
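The SI labelling and the TH computation of Eqs. (2) and (3) amount to a magnitude calculation; a minimal sketch (`th_block` is an illustrative name):

```python
import numpy as np

def th_block(u10, v10, u100, v100):
    """TH block (Eq. (3)): wind-speed magnitudes at both heights from the
    labelled u/v components delivered by the SI block."""
    speed_10 = np.hypot(u10, v10)     # sqrt(u10^2 + v10^2)
    speed_100 = np.hypot(u100, v100)  # sqrt(u100^2 + v100^2)
    return speed_10, speed_100
```

Because `np.hypot` is vectorized, the same call works on full ERA5 grids as well as scalars.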

Input-target vector generation

The SC block generates the input-target vectors needed for training and evaluation based on Fig. 4 and Eqs. (4) to (6). Subsequently, \({SC}_{\left(t\right)}^{O1}\) and \({SC}_{\left(t\right)}^{O2}\) are sent to the OFC block, and \({SC}_{\left(t\right)}^{A}\) is sent to the AMIG block. In these equations, the values of \({C}_{t}^{i}\) are the wind speeds at the positions specified in Fig. 4.

$$SC_{\left(t\right)}^{O1}=\left\{\begin{array}{l}\text{Input Vector}:[{C}_{t}^{1},\dots,{C}_{t}^{17}]\\ \text{Target}:{C}_{t+1}^{1}\end{array}\right\}\ \text{with Speed 10 data}$$
(4)
$$SC_{\left(t\right)}^{O2}=\left\{\begin{array}{l}\text{Input Vector}:[{C}_{t}^{1},\dots,{C}_{t}^{17}]\\ \text{Target}:{C}_{t+1}^{1}\end{array}\right\}\ \text{with Speed 100 data}$$
(5)
$$SC_{\left(t\right)}^{A}=\left\{\begin{array}{l}\text{Input Vector}:[{C}_{t}^{1},\dots,{C}_{t}^{17}]\\ \text{Target}:{C}_{t+1}^{1}\end{array}\right\}\ \text{with both Speed 10 and 100 data}$$
(6)

Basic memory units generation

Multilayer neural network models were used to create the basic memory units, labeled O1, O2, and A. These models were trained on the Train 1, Train 2, and Train 3 datasets, respectively (see Table 1). To find the best models, a range of hyperparameters was explored, as listed in Table 2. Table 2 also shows the hyperparameters of the best models for the basic memories O1, O2, and A.

Table 2 The range of searched hyperparameters and the architecture of the best models.

Generation of functional memory units

Evaluating the performance of machine learning models is essential. In this research, we use several error metrics: standard deviation of error (SDE), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R-squared, or R2). SDE measures the dispersion of errors between predicted and actual values; a lower SDE indicates better model accuracy and stability54. MSE is the average of squared differences between predicted and actual values, with smaller values indicating better accuracy; it is sensitive to outliers due to the squaring of errors55. RMSE is similar to MSE but expresses the error in the same units as the data; it penalizes larger errors more heavily and provides an average measure of error56. MAE measures the average absolute difference between predicted and actual values; it is easy to interpret but less sensitive to large errors than RMSE57. MAPE expresses forecast accuracy as a percentage, which makes it easy to understand; a lower MAPE indicates better accuracy, although it does not account for the direction of errors (over- or under-prediction)54,55. R2 is a statistical measure representing the proportion of the variance in the dependent variable that can be predicted from the independent variables in a regression model; in other words, it indicates how well the independent variables explain the variability of the dependent variable. While a high R2 suggests a good fit, it does not necessarily mean the model is the best one: a model can have a high R2 yet still be unsuitable due to overfitting, where the model becomes too complex for the data. For this reason, R2 alone is usually not sufficient to evaluate models58.
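These metrics can be computed directly; the sketch below (with `error_metrics` as an illustrative helper) assumes nonzero actual values so that MAPE is well defined:

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Error metrics used to score the basic memory units.
    Assumes y_true contains no zeros (needed for MAPE)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "SDE": np.std(err),                           # spread of errors
        "MSE": mse,
        "RMSE": np.sqrt(mse),                         # same units as data
        "MAE": np.mean(np.abs(err)),
        "MAPE": 100 * np.mean(np.abs(err / y_true)),  # percentage error
        "R2": 1 - ss_res / ss_tot,
    }
```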

To evaluate the performance of basic memory units, these metrics (SDE, MAE, MSE, RMSE, and MAPE) are applied to wind speed data at heights of 10 m and 100 m. Performance coefficients (W) are calculated for each basic memory unit (O1, O2, and A) and are updated as the error metrics change, reflecting the predictive ability of each unit.

To calculate the values of the performance coefficients in the functional memory units WO1, WO2, and WA, the trained models {X} were evaluated with data from the Test 1 and Test 2 datasets (see Table 1). Finally, the performance coefficients of the functional memories are calculated using Eqs. (7) to (9), where \({Z}_{X}^{S}\) is the value of the error metric {Z} for the basic memory unit {X} and wind speed {S}.

$$Y_{X_{j}}^{S_{i}}=1-\frac{Z_{X_{j}}^{S_{i}}}{\sum_{j=1}^{3}Z_{X_{j}}^{S_{i}}}$$
(7)
$$T_{X_{j}}^{S_{i}}=\min_{X}\left(Z_{X}^{S_{i}}\right)+\frac{Y_{X_{j}}^{S_{i}}-\min_{X}\left(Y_{X}^{S_{i}}\right)}{\max_{X}\left(Y_{X}^{S_{i}}\right)-\min_{X}\left(Y_{X}^{S_{i}}\right)}$$
(8)
$$W\left(Z\right)_{X_{j}}^{S_{i}}=\frac{T_{X_{j}}^{S_{i}}}{\sum_{j=1}^{3}T_{X_{j}}^{S_{i}}}$$
(9)
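Eqs. (7) to (9) can be implemented for one error metric {Z} and one wind speed {S} as follows. The helper name `performance_coeffs` is illustrative, and the sketch assumes the three units do not all share the same Y value, so the min-max normalization of Eq. (8) is well defined.

```python
import numpy as np

def performance_coeffs(z):
    """Performance coefficients of Eqs. (7)-(9) for one error metric Z
    and one wind speed S. `z` holds the metric values for the three
    basic memory units (O1, O2, A). Lower error -> higher coefficient."""
    z = np.asarray(z, float)
    y = 1 - z / z.sum()                                  # Eq. (7)
    t = z.min() + (y - y.min()) / (y.max() - y.min())    # Eq. (8)
    return t / t.sum()                                   # Eq. (9)
```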

According to Eqs. (10) to (12), each functional memory unit holds ten performance coefficients. Table 5 shows the values of the performance coefficients for the functional memory units.

$$WO1_{t}=\begin{bmatrix}W_{t}^{\left(SDE_{S10}^{O1}\right)}&W_{t}^{\left(MAE_{S10}^{O1}\right)}&W_{t}^{\left(MSE_{S10}^{O1}\right)}&W_{t}^{\left(RMSE_{S10}^{O1}\right)}&W_{t}^{\left(MAPE_{S10}^{O1}\right)}\\ W_{t}^{\left(SDE_{S100}^{O1}\right)}&W_{t}^{\left(MAE_{S100}^{O1}\right)}&W_{t}^{\left(MSE_{S100}^{O1}\right)}&W_{t}^{\left(RMSE_{S100}^{O1}\right)}&W_{t}^{\left(MAPE_{S100}^{O1}\right)}\end{bmatrix}$$
(10)
$$WO2_{t}=\begin{bmatrix}W_{t}^{\left(SDE_{S10}^{O2}\right)}&W_{t}^{\left(MAE_{S10}^{O2}\right)}&W_{t}^{\left(MSE_{S10}^{O2}\right)}&W_{t}^{\left(RMSE_{S10}^{O2}\right)}&W_{t}^{\left(MAPE_{S10}^{O2}\right)}\\ W_{t}^{\left(SDE_{S100}^{O2}\right)}&W_{t}^{\left(MAE_{S100}^{O2}\right)}&W_{t}^{\left(MSE_{S100}^{O2}\right)}&W_{t}^{\left(RMSE_{S100}^{O2}\right)}&W_{t}^{\left(MAPE_{S100}^{O2}\right)}\end{bmatrix}$$
(11)
$$WA_{t}=\begin{bmatrix}W_{t}^{\left(SDE_{S10}^{A}\right)}&W_{t}^{\left(MAE_{S10}^{A}\right)}&W_{t}^{\left(MSE_{S10}^{A}\right)}&W_{t}^{\left(RMSE_{S10}^{A}\right)}&W_{t}^{\left(MAPE_{S10}^{A}\right)}\\ W_{t}^{\left(SDE_{S100}^{A}\right)}&W_{t}^{\left(MAE_{S100}^{A}\right)}&W_{t}^{\left(MSE_{S100}^{A}\right)}&W_{t}^{\left(RMSE_{S100}^{A}\right)}&W_{t}^{\left(MAPE_{S100}^{A}\right)}\end{bmatrix}$$
(12)

Fusion of the results

In Eq. (9), each performance coefficient is constrained between zero and one, and the sum of the performance coefficients for each error metric {Z} and wind speed {S} equals one. This allows the combined outputs of the basic and functional memory units to be calculated separately for each error metric and wind speed. The model’s final output, represented by the F unit in the AMIG block, for each input with wind speed {S} at time t is obtained by averaging these combined outputs, as shown in Eqs. (13) through (18).

$$F_{t}^{\left(SDE_{S}\right)}=O1_{t}\cdot WO1_{t}^{\left(SDE_{S}^{O1}\right)}+O2_{t}\cdot WO2_{t}^{\left(SDE_{S}^{O2}\right)}+A_{t}\cdot WA_{t}^{\left(SDE_{S}^{A}\right)}$$
(13)
$$F_{t}^{\left(MAE_{S}\right)}=O1_{t}\cdot WO1_{t}^{\left(MAE_{S}^{O1}\right)}+O2_{t}\cdot WO2_{t}^{\left(MAE_{S}^{O2}\right)}+A_{t}\cdot WA_{t}^{\left(MAE_{S}^{A}\right)}$$
(14)
$$F_{t}^{\left(MSE_{S}\right)}=O1_{t}\cdot WO1_{t}^{\left(MSE_{S}^{O1}\right)}+O2_{t}\cdot WO2_{t}^{\left(MSE_{S}^{O2}\right)}+A_{t}\cdot WA_{t}^{\left(MSE_{S}^{A}\right)}$$
(15)
$$F_{t}^{\left(RMSE_{S}\right)}=O1_{t}\cdot WO1_{t}^{\left(RMSE_{S}^{O1}\right)}+O2_{t}\cdot WO2_{t}^{\left(RMSE_{S}^{O2}\right)}+A_{t}\cdot WA_{t}^{\left(RMSE_{S}^{A}\right)}$$
(16)
$$F_{t}^{\left(MAPE_{S}\right)}=O1_{t}\cdot WO1_{t}^{\left(MAPE_{S}^{O1}\right)}+O2_{t}\cdot WO2_{t}^{\left(MAPE_{S}^{O2}\right)}+A_{t}\cdot WA_{t}^{\left(MAPE_{S}^{A}\right)}$$
(17)
$$F_{t}^{S}=\left(F_{t}^{\left(SDE_{S}\right)}+F_{t}^{\left(MAE_{S}\right)}+F_{t}^{\left(MSE_{S}\right)}+F_{t}^{\left(RMSE_{S}\right)}+F_{t}^{\left(MAPE_{S}\right)}\right)/5$$
(18)

The process outlined in Eqs. (13) through (18) to generate the final model output is summarized in Eq. (19):

$$F_{t}^{S}=\left(\sum O1_{t}\cdot WO1_{t}^{S}+\sum O2_{t}\cdot WO2_{t}^{S}+\sum A_{t}\cdot WA_{t}^{S}\right)/5$$
(19)
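Because the five per-metric combinations of Eqs. (13) to (17) are averaged in Eq. (18), the summarized form of Eq. (19) reduces to summing each unit's five weights; a minimal sketch, with `fuse` as an illustrative name:

```python
import numpy as np

def fuse(o1, o2, a, WO1, WO2, WA):
    """F-unit fusion of Eq. (19): combine the three basic-memory outputs
    with the per-metric weight rows for one wind speed S, then average
    over the five metrics. WO1/WO2/WA are length-5 vectors holding that
    height's SDE, MAE, MSE, RMSE, and MAPE weights."""
    WO1, WO2, WA = (np.asarray(w, float) for w in (WO1, WO2, WA))
    return (o1 * WO1.sum() + o2 * WO2.sum() + a * WA.sum()) / 5
```

As a sanity check, when all three units are weighted equally (1/3 for every metric), the fusion collapses to the simple mean of the three outputs.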

Model exploitation

After the training phase, the model advances to deployment. During deployment, as in training, the input signal passes through the SI and TH blocks before reaching the SC block. However, because new data arrive without targets, the SC block produces only a single input vector, as described in Eq. (20).

$$SC_{t}=\text{Input Vector}:\left[{C}_{t}^{1},\dots,{C}_{t}^{17}\right]$$
(20)

The feature vector \(\:{SC}_{t}\) is then sent to the basic memory units O1, O2, and A, and the outputs of these units are forwarded to the F unit. In the F unit, the final result is computed using Eq. (18) or Eq. (19).

Brief overview of the BELBFM components and interactions

Table 3 provides a brief description of the BELBFM blocks and units, along with their roles, inputs, and outputs. Furthermore, to better illustrate how the BELBFM model works, the following pseudo-code is provided:

Table 3 Description of BELBFM Components and their interactions.
  • Input: U10, V10, U100, V100 (wind components at 10 and 100 m heights).

  • Output: Predicted wind speeds (S10, S100).

Initial preparation

  a) Define the study region and time range.

  b) Select the grid-based database with wind speed data (e.g., ERA5 with hourly updates).

  c) Define a neighborhood-based pattern to generate feature vectors (e.g., a 5 × 5 grid).

Feature extraction

  a) Load the ERA5 reanalysis dataset.

  b) Extract the U and V wind components at 10 and 100 m heights.

  c) Generate labeled input signals in the SI block (U10, V10, U100, V100).

  d) Calculate wind speed at both altitudes in the TH block.

  e) Extract input-target vectors based on the defined neighborhood-based pattern in the SC block.

Model training

  a) Split input-target vectors to generate train and test datasets:

     i. Create train and test datasets based on a correlation-based pruning approach.

     ii. Generate extra test datasets for evaluation of the final model.

  b) Train basic memory units:

     i. O1: Train on 10 m wind speed data (S10).

     ii. O2: Train on 100 m wind speed data (S100).

     iii. A: Train on combined wind speed data from both heights (S10 and S100).

  c) Optimize hyperparameters for each unit model (e.g., layer size, activation functions).

Evaluate basic memory units and update functional memory weights

  a) Calculate performance using error metrics (e.g., SDE, MAE, and MAPE) for O1, O2, and A.

  b) Update functional memory weights (WO1, WO2, WA) based on the error metrics.

Fusion of results

  a) Combine outputs from the basic and functional memory units using:

     i. Weighted sum (preferred method).

     ii. Simple mean (optional).

Model exploitation

  a) Process new input data through the SI, TH, and SC blocks and the O1, O2, and A units.

  b) Predict final wind speeds (S10, S100) using the fusion mechanism (F unit).

Output results

  a) Validate model predictions against the test dataset.

  b) Evaluate performance using error metrics to ensure accuracy and reliability.

Results

The modeling and testing processes were performed on a desktop computer with the specifications outlined in Table 4. Hyperparameter optimization took approximately 5 h, while the final model training was completed in just 3 min.

Table 4 Specifications of software and hardware used for modeling process.

During the training phase, the three basic memory units O1, O2, and A were trained on the Train 1, Train 2, and Train 3 datasets, respectively. The trained models were then evaluated on the Test 1 and Test 2 datasets to determine the performance coefficients of the functional memories WO1, WO2, and WA, shown in Table 5. The final model was created by combining the outputs of the basic memory units and the functional memories.

Table 5 Values of performance coefficients of functional memory units.

To assess the effectiveness of the functional memories in the final model, the F unit combined the unit outputs using both the weighted sum and the simple mean; the simple mean is given in Eq. (21).

$$F_{t}^{S}=\left({O1}_{t}+{O2}_{t}+{A}_{t}\right)/3$$
(21)

During deployment, the final model was tested using data from Test 3, Test 4, Test 5, and Test 6. Table 6 displays results from tests with the basic memory units (O1, O2, and A) using Test 1, Test 2, Test 4, and Test 5.

Table 6 Test results of the basic memory units.

Key findings include the following:

  • Model O1, trained on 10 m wind speed data, demonstrates cross-height predictive capability by effectively predicting 100 m wind speed. Similarly, O2, trained on 100 m data, performs well in predicting 10 m wind speed.

  • Model A, trained on both 10 m and 100 m data, shows better accuracy in predicting 10 m wind speeds.

  • The models generally exhibit lower error metrics with recent data (2021–2023) compared to earlier periods (2001–2020), suggesting improved performance with newer data.

  • O1 consistently outperforms O2 and A, indicating that predicting wind speeds at higher altitudes (100 m) is more complex than at lower altitudes (10 m), requiring advanced data integration.

Table 7 evaluates the performance of the BELBFM model in predicting wind speeds at both 10 and 100 m, using six statistical error metrics (SDE, MAE, MSE, RMSE, MAPE, and R2) across two time periods: 2001–2020 and 2021–2023. The analysis includes three basic memory units (O1, O2, A) and two combination methods (Simple Mean and Weighted Sum).

Table 7 Results of the BELBFM model on train and test datasets.

Key insights:

  • Training phase (2001–2020, Train 3): Model A, trained on both heights, achieves the lowest MAPE (27.841), indicating superior accuracy over models O1 and O2.

  • Test phase (2001–2020, Test 3): The Weighted Sum method outperforms the Simple Mean across all error metrics, indicating enhanced accuracy by leveraging additional information from the base models.

  • Test phase (2021–2023, Test 6): In this recent data period, the Weighted Sum method once again outperforms the Simple Mean, demonstrating lower error metrics across the board and confirming the model’s adaptability to recent data and consistency in predictive accuracy.

Furthermore, to evaluate the performance of the BELBFM model, its results were compared with those of other regression models. For this purpose, several regression models were trained using the Train 3 dataset (2001–2020) and evaluated with the Test 6 dataset (2021–2023) using the Regression Learner Toolbox in MATLAB R2023R software. Table 8 compares the performance of BELBFM with that of these models. An analysis of the table highlights the superior performance of BELBFM compared to the alternative models. Notably, BELBFM achieves the lowest RMSE (0.6002) and MAE (0.4480), along with the highest R2 value (0.9506), indicating its exceptional ability to explain variance in wind speed data. In contrast, models such as Neural Networks and Gaussian Process Regression show comparatively higher error rates and lower predictive accuracy.

Table 8 Results of the BELBFM model vs. known regression models on the test dataset (test 6).

Discussion and conclusion

The BELBFM model represents a significant advancement in wind speed prediction, utilizing an innovative ensemble learning methodology that outperforms traditional approaches such as numerical weather prediction (NWP) and computational fluid dynamics (CFD). By integrating outputs from memory units trained on distinct wind speed data layers with optimized performance coefficients, BELBFM achieves superior predictive accuracy. Its emotionally-inspired learning principles effectively capture the inherent nonlinearities of atmospheric processes, facilitating reliable wind speed forecasts across diverse conditions.

A notable contribution of this study is BELBFM’s dual-height wind speed modeling capability, predicting wind speeds at both 10 and 100 m. This feature provides a comprehensive understanding of wind behavior across critical altitudes, crucial for applications in renewable energy optimization, climate modeling, and weather forecasting. The model’s flexible regression approach, based on a neighborhood feature set, minimizes computational demands and improves accuracy by prioritizing the most relevant data points, setting it apart from conventional time series models.

Performance metrics underscore the model’s predictive power. Despite the challenges of predicting wind speeds at 100 m due to numerous influencing factors, BELBFM’s dual-height modeling effectively utilizes 10-meter data for accurate 100-meter predictions, and vice versa. Model A, which integrates data from both heights, demonstrates high accuracy in predicting 10-meter wind speeds. Its consistent performance across two periods (2001–2020 and 2021–2023) suggests strong adaptability to new data while mitigating concerns about overfitting. Furthermore, the Weighted Sum method consistently outperforms the Simple Mean method, enhancing predictive accuracy across diverse datasets.

Comparative analysis with other well-known regression models, as detailed in Table 8, highlights BELBFM’s superiority. It achieves the lowest RMSE and MAE values and the highest R2, underscoring its exceptional ability to explain variance in wind speed data effectively.

BELBFM’s correlation-based data pruning technique is another valuable feature, reducing the training dataset to less than 0.07% of its original size while retaining representative samples. This efficient process accelerates the modeling workflow without compromising prediction quality, making the model particularly suitable for real-time applications in renewable energy management, weather forecasting, and disaster preparedness.

The model also demonstrates computational efficiency. By leveraging a pruned 5 × 5 grid during feature selection, BELBFM significantly reduces the number of processed features. The feature selection phase has a time complexity of O(n·k), while the training phase for each memory unit’s multilayer perceptron (MLP) has a complexity of O(T·L·N²), where n is the number of data points, k the number of neighboring cells, T the number of iterations, L the number of layers, and N the neurons per layer. The Spearman correlation-based pruning itself costs O(n²), but it drastically reduces the dataset to less than 0.07% of its original size, further cutting the overall computational demands. These optimizations ensure both accuracy and efficiency, completing hyperparameter tuning within five hours and model training in under three minutes on standard hardware.

Despite its strengths, the model’s reliance on ERA5 data may introduce variability in performance, contingent upon the quality and resolution of the input data. Further research is needed to validate BELBFM’s effectiveness across diverse regions and incorporate additional atmospheric variables for refined predictions. Enhancing adaptability to real-time data and changing weather conditions could further expand its utility.

The novelty of BELBFM lies in its innovative architecture and methodological advancements. By integrating basic and functional memory units with adaptive emotional learning mechanisms, the model effectively captures the nonlinear and dynamic nature of atmospheric processes. Its dual-height predictive capability addresses a critical gap in simultaneous wind speed modeling at varying altitudes, providing enhanced utility for renewable energy optimization and climate modeling. The incorporation of correlation-based data pruning ensures computational efficiency, setting BELBFM apart as a practical solution for real-time applications. This framework underscores the transformative potential of brain-inspired learning methods in advancing wind speed prediction and related meteorological applications.

In conclusion, BELBFM exemplifies the potential of emotionally-inspired learning models to advance meteorological research. Its performance, adaptability, and efficiency open promising opportunities for deployment across wind energy forecasting and broader environmental fields.