Abstract
Accurate electricity price forecasting is essential for optimizing market operations, enhancing resource allocation, and ensuring sustainable energy management in volatile and complex markets. This research introduces a comprehensive ensemble meta-modeling framework that integrates machine learning techniques with SHAP (SHapley Additive exPlanations) for enhanced interpretability and PCA (Principal Component Analysis) for effective dimensionality reduction. The methodology capitalizes on the complementary strengths of predictive models such as XGBoost, LSTM, and CNN to address the non-linear and temporal intricacies of electricity price datasets. Two ensemble approaches were implemented: (1) Weighted Averaging, assigning weights inversely proportional to model RMSE, achieving an RMSE of 2.126761, and (2) Meta-Model Ensemble, employing Linear Regression, achieving superior accuracy with an RMSE of 1.939032. SHAP analysis provided actionable insights into model contributions, highlighting XGBoost and LSTM as key components. Furthermore, error trajectory analysis demonstrated the robustness of the ensembles in minimizing cumulative forecasting errors over time. This study contributes to the field by combining advanced machine learning models, ensemble strategies, and explainability frameworks to deliver an interpretable, high-performing electricity price forecasting system. The results inform policy-making and lay the foundation for scalable, data-driven energy market solutions.
Introduction
The electricity market is a dynamic and multifaceted system influenced by factors such as consumption, generation patterns, weather, and regulatory frameworks. As renewable energy sources become increasingly prominent in electricity production, accurately forecasting prices has grown crucial for market stakeholders and policymakers. The inherent volatility and multidimensionality of electricity pricing require sophisticated methodologies to untangle contributing factors. Recent advancements in machine learning (ML) and deep learning (DL) have demonstrated their capability to model complex, non-linear relationships in electricity price forecasting. For example, XGBoost1, LSTM2, and CNN3 have shown promise in predictive tasks involving time series and multi-factorial data structures4,5,6,7,8. However, while these methods improve prediction accuracy, they often act as “black-box” models9,10,11, making it challenging to interpret how various factors contribute to their predictions12,13,14. This research aims to bridge this interpretability gap using SHAP and advanced ensemble techniques.
Forecasting electricity prices presents unique challenges due to the interplay of diverse factors, such as electricity consumption, renewable energy penetration, and weather variability15,16,17. These influences are further compounded by the intermittent nature of renewables and fluctuating market conditions18,19. Traditional forecasting models often fail to adequately capture these dynamics, necessitating more robust and interpretable methods. Moreover, while deep learning models such as LSTM and CNN excel in capturing temporal dependencies, their lack of interpretability limits their practical utility in market settings. Addressing these challenges requires a framework that not only achieves high predictive accuracy but also provides transparent insights into the factors driving these forecasts13,20.
The primary objective of this research is to identify the key factors influencing electricity prices in Spain. In Spain, electricity prices are shaped by a combination of factors, including renewable energy generation, market coupling with neighboring countries, and fluctuations in natural gas prices. Spain’s significant reliance on renewable sources such as wind and solar adds complexity to forecasting, as these sources are intermittent and sensitive to weather patterns. Prices become more reactive to natural gas and carbon pricing as the renewable share grows, because gas-fired plants are often used to balance supply and demand. By contrast, Iran’s electricity market differs markedly due to its heavy dependence on fossil fuel-based power generation, primarily natural gas and oil. Electricity prices in Iran are tightly linked to global oil prices through its subsidized fuel costs. Unlike Spain, where renewables are steadily expanding, Iran’s power sector continues to rely on cheap and abundant fossil fuels, which keep electricity prices relatively stable and often low. Nevertheless, fluctuations in global oil prices can still affect the market indirectly, influencing government subsidies and the energy balance. Oil prices also have a more direct impact on overall inflation and economic conditions, which in turn influence energy demand and electricity prices. The relationship between oil costs and electricity price forecasting is therefore more tightly intertwined in Iran than in Spain, where renewable energy and market dynamics play a larger role in driving prices. Consequently, while the machine learning models used in this study (XGBoost, LSTM, CNN)18,21 may perform similarly in both regions, the predictive features that drive electricity prices in each country differ substantially.
For example, oil price fluctuations would likely be more influential in Iran’s electricity price forecasting than in Spain’s, where natural gas prices, renewable generation patterns, and cross-border electricity trading have a more significant impact. Additionally, the study will utilize SHAP (SHapley Additive exPlanations) to interpret model predictions. This will provide insights into feature importance and enhance transparency. Finally, the research aims to demonstrate the effectiveness of Principal Component Analysis (PCA) in reducing dimensionality and improving computational efficiency within the forecasting process22,23.
This study presents a novel, interpretable framework for electricity price forecasting by combining ensemble modeling with explainability techniques. By leveraging SHAP, the research offers transparent insights into the predictive contributions of various factors, addressing the long-standing interpretability challenges associated with ML models6,21. The findings contribute to energy economics and forecasting research, offering a scalable methodology applicable to dynamic energy markets worldwide.
Research gap and motivation
Although recent studies have employed ensemble models, hybrid DL networks, or SHAP-based explainability methods, most have focused on only one of these aspects and have not presented a full integration of dimensionality reduction (PCA), CNN–LSTM hybrid models, XGBoost, and Meta-Model Ensembles. Moreover, most studies have applied SHAP analysis only to individual models and have not examined the role of each model within the final ensemble in an explanatory manner. The existing research gap is therefore the lack of a coherent, explainable framework that can jointly analyze dimensionality reduction, deep learning models, tree-based models, and the final ensemble. This study is designed to fill this gap. The key contributions of this study are summarized as follows:
-
Providing an integrated electricity price forecasting framework including PCA, CNN–LSTM, XGBoost, and Meta-Ensemble models.
-
Simultaneously combining three types of models (Tree-Based + Hybrid DL + Time-Distributed MLP), which has been rarely reported in the literature.
-
Providing a two-level SHAP analysis including: Feature analysis for individual models and Analysis of the contribution of each model to the final Ensemble.
-
Providing a comprehensive comparison of model performance including RMSE, Loss, and cumulative error, which is not usually reported in previous studies.
-
Showing the impact of PCA on the stability and speed of models, which has been less investigated alongside Ensemble methods.
Related works
Electricity price forecasting has garnered significant attention due to its critical role in energy market operations. Recent advancements in machine learning have introduced interpretable and robust techniques for tackling this complex problem. Notably, Shapley Additive Explanations (SHAP) have been extensively employed to interpret the predictive power of machine learning models, making them more transparent for practical applications. For instance, SHAP has been applied to XGBoost models for extracting spatial effects21,24,25 and understanding feature contributions in areas like concrete compressive strength predictions26,27.
Deep learning has emerged as a transformative approach in time-series forecasting, providing state-of-the-art performance in modeling temporal dependencies. Comprehensive surveys highlight the versatility of architectures like LSTMs and encoder-decoder frameworks for multivariate time-series forecasting5,18,28,29,30. Hybrid models, which combine LSTMs with feature selection algorithms, have demonstrated superior accuracy in predicting day-ahead electricity prices under market coupling conditions19. Similarly, attention-based mechanisms have been pivotal in capturing temporal patterns effectively28.
The integration of ensemble learning has further strengthened forecasting frameworks. Techniques such as weighted averaging6,31,32 and meta-model ensembles19 capitalize on the strengths of multiple base models to deliver enhanced accuracy, particularly in markets influenced by renewable energy sources. These ensemble methods often outperform individual models by balancing their strengths and mitigating weaknesses. Dimensionality reduction techniques like Principal Component Analysis (PCA) have also proven instrumental in simplifying complex datasets while preserving essential information. In energy forecasting, PCA has been effectively used to improve prediction stability and manage feature interactions22. The combination of PCA and ensemble learning represents a promising avenue for advancing electricity price prediction.
Finally, explainable AI has become indispensable in understanding the factors driving electricity market fluctuations33,34. Studies using SHAP4 and related methodologies have unraveled intricate market behaviors, bridging the gap between model accuracy and interpretability. These advancements collectively underscore the importance of integrating interpretable and hybrid techniques to address the challenges of electricity price forecasting.
While various studies have used Ensemble or SHAP, most of them have only investigated one type of base model (such as XGBoost or an LSTM architecture)24,31. Some studies have used SHAP for feature analysis, but have not provided an analysis of the contribution of the models to the final Ensemble. Also, many related works have not used PCA or combined it with explainability methods. The present study provides a more comprehensive and explainable framework than the existing works by presenting a Meta-Model Ensemble including CNN–LSTM, XGBoost, TDM, and Encoder–Decoder along with PCA and multilayer SHAP analysis.
Methodology
This study utilizes a dataset spanning four years, comprising electricity consumption, generation, prices, and weather conditions in Spain. Data sources include ENTSOE for consumption and generation, Red Eléctrica de España (REE) for pricing, and Open Weather API for weather data from Spain’s five largest cities. Models were evaluated using metrics such as RMSE to quantify predictive performance. To enhance interpretability, SHAP analysis was employed to rank feature importance, while PCA was used to streamline data preprocessing by reducing dimensionality24,28,35,36. This study aimed to forecast electricity prices by leveraging advanced machine learning techniques, preprocessing strategies, and explainability tools. The methodology involved several systematic steps, as outlined below.
Data gathering and analyzing
The dataset used in this research includes four years of hourly data related to electricity consumption, generation, and weather conditions in Spain. It includes information on electricity consumption and generation, sourced from ENTSOE, a publicly accessible platform for Transmission System Operator (TSO) data. Pricing information was obtained from Red Eléctrica de España (REE), the Spanish TSO responsible for managing electricity market operations. Additionally, weather data for Spain’s five largest cities was collected through the Open Weather API. A distinctive feature of this dataset is the inclusion of hourly forecasts for electricity consumption and prices, provided by TSOs, which allows comparisons with real-world predictions currently in use within the industry.
The raw dataset used in this study underwent a meticulous preprocessing pipeline to ensure its quality, relevance, and readiness for effective machine-learning applications. Since raw data often contains imperfections such as missing values, irrelevant information, and inconsistencies, addressing these issues was a critical first step toward reliable forecasting. As shown in Fig. 1, the electricity load for the first two weeks of 2015, in terms of megawatt-hours, is depicted, providing a visual representation of the hourly consumption data.
The process began with the removal of irrelevant or unusable columns. This step involved identifying data fields that lacked significance or contributed noise to the analysis. By excluding such columns, the dimensionality of the dataset was reduced, improving computational efficiency and ensuring that the model focused on meaningful features. Handling missing values was the next challenge. Missing data points, common in real-world datasets, can disrupt the training process and degrade model performance. To address this, an interpolation technique was applied, leveraging existing data trends to estimate missing entries. This method ensured continuity in the dataset while maintaining its overall statistical integrity. Outlier detection and removal was another essential preprocessing step. Outliers, caused by measurement errors or rare anomalies, can distort model training and result in biased predictions. We employed statistical techniques, such as z-score analysis and the interquartile range (IQR), to identify and handle these extreme values. Once detected, outliers were replaced with NaN values to ensure the dataset represented typical market conditions and remained free of disruptive anomalies.
Before merging the df_energy and df_weather datasets, we addressed the outliers in the ‘pressure’ and ‘wind_speed’ columns. Boxplots were used to visualize these outliers, and extreme values were replaced with NaNs. Specifically, for the ‘pressure’ column, values above 1051 hPa and below 931 hPa were set to NaN, as these extremes fall outside typical atmospheric pressure ranges. Similarly, for ‘wind_speed’, values above 50 m/s were replaced with NaN, as this exceeds the highest wind speeds recorded in the region. This issue is illustrated in Figs. 2 and 3.
After replacing the outliers with NaNs, linear interpolation was applied to estimate the missing values, ensuring data continuity and maintaining statistical integrity. Once cleaned, both the ‘pressure’ and ‘wind_speed’ columns were ready for use in the modeling phase.
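The cleaning rule described above can be sketched in pandas. This is a minimal illustration using the thresholds reported in the text (931–1051 hPa for pressure, 0–50 m/s for wind speed); the toy values and the helper name `clean_column` are assumptions for demonstration, not the study’s actual code:

```python
import numpy as np
import pandas as pd

def clean_column(s: pd.Series, lower: float, upper: float) -> pd.Series:
    """Replace values outside [lower, upper] with NaN, then linearly interpolate."""
    cleaned = s.mask((s < lower) | (s > upper), np.nan)
    return cleaned.interpolate(method="linear", limit_direction="both")

# Toy weather frame; the real study uses hourly Open Weather API data.
df_weather = pd.DataFrame({
    "pressure":   [1013.0, 1015.0, 2000.0, 1012.0],  # 2000 hPa is an outlier
    "wind_speed": [5.0, 120.0, 6.0, 7.0],            # 120 m/s is an outlier
})
df_weather["pressure"] = clean_column(df_weather["pressure"], 931, 1051)
df_weather["wind_speed"] = clean_column(df_weather["wind_speed"], 0, 50)
```

Masking to NaN before interpolating keeps the two steps (flagging and imputation) separable, which makes the thresholds easy to audit.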
An essential part of the preprocessing involved merging multiple datasets to create a unified framework for analysis. The energy dataset, which included electricity consumption, generation, and pricing data, was integrated with weather data collected from Spain’s five largest cities. This integration allowed the models to analyze the relationships between electricity market dynamics and external weather factors, which are known to have a significant influence on energy consumption and pricing patterns.
To gain further insight into these relationships, we plotted the ‘rain_1h’ and ‘rain_3h’ data for Bilbao (Figs. 4 and 5). In Fig. 4, we examined the actual electricity prices from the first month of 2015 to the first month of 2019, focusing on six-month intervals to assess how the actual prices varied over time. Figure 5 shows the rainfall data (in millimeters) over the same period, specifically for six-month intervals, to better understand the impact of rainfall on electricity demand and price fluctuations.
Figure 6 illustrates the hourly actual electricity prices alongside their weekly rolling mean over the 2015–2019 time horizon37. This plot allows for the analysis of the fluctuations in electricity prices as well as the smoothing effect of the rolling mean, helping to identify more stable trends amid the price volatility.
Stationarity checks were performed as the final step of preprocessing. For time-series modeling, it is crucial to work with data that exhibits stationarity, where statistical properties such as mean and variance remain constant over time. The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test was applied to evaluate the stationarity of the dataset. Non-stationary data can lead to unreliable model outputs, so any detected trends or seasonality were addressed through techniques such as differencing or detrending. To further investigate the behavior of the data and understand the seasonal trends in electricity prices, we also analyzed the electricity price series by comparing the actual monthly prices with their 1-year lagged counterparts. This comparison helps to identify any long-term seasonal patterns and cyclical behavior in the electricity price data, which can significantly influence forecasting models.
Figure 7 illustrates the actual electricity price at monthly frequency alongside its 1-year lagged series. This plot provides a clear visualization of how the electricity price fluctuates over time and how past prices from the same period in the previous year are compared to the current price. It helps confirm the presence of seasonality and trends, which is an important aspect to address before applying forecasting models.
Actual electricity price over the time horizon 2015 to 201938.
Through this comprehensive preprocessing workflow, the dataset was transformed into a robust and reliable resource, capable of supporting advanced machine learning models for electricity price forecasting. These steps ensured that the data was clean, consistent, and enriched, providing a solid foundation for capturing complex patterns and producing accurate predictions.
Overall framework of the proposed methodology
Figure 8 illustrates the overall workflow of the proposed electricity price forecasting framework. The process starts with data collection, followed by preprocessing steps including handling missing values, outlier removal, and feature engineering. The next stage involves model selection and training using various machine learning and deep learning approaches, such as XGBoost, LSTM, and CNN. Finally, ensemble techniques are applied to enhance forecasting accuracy, and SHAP analysis is utilized for model interpretability.
Methods
In this paper, a diverse set of machine learning models was implemented to capture the intricate patterns and dependencies present in the data, aiming to improve the accuracy of electricity price forecasting. These models were chosen for their ability to handle both linear and nonlinear relationships in the data, as well as their capacity to deal with sequential and time-dependent features, which are crucial for this type of forecasting task. Each model was selected and fine-tuned based on its unique strengths to ensure optimal performance in forecasting electricity prices.
XGBoost (Extreme Gradient Boosting) was one of the first models used. XGBoost is a powerful gradient-boosting algorithm that has become a standard in machine learning due to its efficiency and effectiveness in handling large datasets with complex, nonlinear relationships. In this paper, the model was configured with key parameters such as a learning rate (eta) of 0.03 and a maximum tree depth of 180. The learning rate controlled the step size at each iteration, while the maximum depth controlled the complexity of the individual trees. The model’s performance was evaluated using RMSE (Root Mean Squared Error), which is a commonly used metric to assess prediction accuracy by measuring the average squared difference between predicted and actual values. XGBoost’s ability to model feature interactions and its built-in handling of missing data made it a reliable and efficient choice for electricity price forecasting.
The next set of models involved Long Short-Term Memory (LSTM) networks, which are specifically designed to work with sequential data, making them an ideal choice for time-series forecasting. LSTM networks are capable of learning long-range dependencies in the data, which is particularly important in time series data where future values often depend on past observations. LSTM networks have a unique architecture that allows them to retain information over time, making them suitable for capturing long-term trends and fluctuations in electricity prices. To improve their predictive power, a Stacked LSTM architecture was used, which stacked multiple LSTM layers to create a deeper network. The deeper structure of the stacked LSTM allowed the model to capture more complex features and temporal relationships, thus enhancing its ability to forecast long-term trends and cyclical patterns in the data.
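A minimal Keras sketch of the stacked arrangement follows; the window of 24 hourly steps and 10 input features are illustrative assumptions, not the paper’s exact configuration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS, N_FEATURES = 24, 10   # assumed lookback window and feature count

model = keras.Sequential([
    layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    layers.LSTM(64, return_sequences=True),  # lower layer passes full sequence upward
    layers.LSTM(32),                          # upper layer summarizes the sequence
    layers.Dense(1),                          # next-hour price
])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(8, TIMESTEPS, N_FEATURES).astype("float32")
pred = model.predict(X, verbose=0)
```

The key detail is `return_sequences=True` on every LSTM layer except the last, so each stacked layer receives the full hidden sequence of the layer below.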
In addition to LSTM-based models, Convolutional Neural Networks (CNNs) were also employed to capture localized patterns within the data. CNNs are typically used for image recognition, but their ability to extract hierarchical features from data has also proven valuable in time-series forecasting. CNNs apply convolutional layers to scan the input data and extract local features, which can then be used to detect short-term dependencies in electricity prices. To combine the strengths of CNNs and LSTMs, a CNN-LSTM hybrid model was developed. The CNN component of this model first performed feature extraction from the raw time-series data by identifying short-term patterns, while the LSTM component handled the temporal dependencies and long-term forecasting. This hybrid approach allowed the model to leverage the strengths of both architectures—CNN for feature extraction and LSTM for modeling sequential patterns—resulting in more accurate predictions.
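The hybrid can be sketched along these lines in Keras; layer widths, kernel size, and the causal padding are illustrative assumptions:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(24, 10)),                                 # assumed window
    layers.Conv1D(32, kernel_size=3, activation="relu",
                  padding="causal"),        # CNN: short-term local patterns
    layers.MaxPooling1D(2),                 # compress the sequence
    layers.LSTM(32),                        # LSTM: longer temporal dependencies
    layers.Dense(1),
])

X = np.random.rand(4, 24, 10).astype("float32")
pred = model.predict(X, verbose=0)
```

Causal padding keeps each convolution output from seeing future timesteps, which matters for forecasting tasks.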
Additionally, the Time Distributed Multi-Layer Perceptron (TDM) was used to process sequential data. This model applied a fully connected multi-layer perceptron (MLP) structure across each time step in the sequence, effectively analyzing the data both spatially and temporally. The TDM was particularly useful in handling sequential dependencies in the data while ensuring that the model could adapt to variations in the time series.
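A minimal Keras sketch of the time-distributed MLP idea is given below; shapes and layer widths are assumptions:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(24, 10)),
    # The same Dense (MLP) weights are applied independently at every timestep.
    layers.TimeDistributed(layers.Dense(32, activation="relu")),
    layers.Flatten(),
    layers.Dense(1),
])

X = np.random.rand(4, 24, 10).astype("float32")
pred = model.predict(X, verbose=0)
```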
Another advanced approach implemented in this paper was the Encoder-Decoder architecture. This architecture is commonly used in tasks involving complex input-output relationships, such as machine translation or time-series forecasting. The encoder compresses the input sequence into a fixed-size latent representation, which captures the essential information from the input data. The decoder then reconstructs the output sequence from this latent representation. The encoder-decoder structure is particularly effective at modeling long-term dependencies and extracting meaningful features from long sequences of data. In this paper, the encoder-decoder model was used to capture complex temporal patterns in the electricity price forecasting task.
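In Keras, a sequence-to-sequence sketch of this idea might look as follows; the single-step output horizon and layer sizes are assumptions:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS_IN, TIMESTEPS_OUT, N_FEATURES = 24, 1, 10   # assumed horizons

model = keras.Sequential([
    layers.Input(shape=(TIMESTEPS_IN, N_FEATURES)),
    layers.LSTM(64),                          # encoder: fixed-size latent vector
    layers.RepeatVector(TIMESTEPS_OUT),       # repeat latent for each output step
    layers.LSTM(64, return_sequences=True),   # decoder: unrolls the output sequence
    layers.TimeDistributed(layers.Dense(1)),  # one price per output step
])

X = np.random.rand(4, TIMESTEPS_IN, N_FEATURES).astype("float32")
pred = model.predict(X, verbose=0)
```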
To further enhance the accuracy of the predictions, ensemble methods were employed, combining the outputs of multiple models to reduce the overall error. The ensemble model used a weighted averaging technique, where each model’s output was weighted based on its RMSE performance on the validation set. Models that performed better, i.e., those with lower RMSE, were assigned higher weights, allowing them to have a more significant influence on the final prediction. This weighted averaging approach helped to minimize errors by leveraging the strengths of the individual models, each of which captured different aspects of the data. Furthermore, a meta-model was employed, where a linear regression model was trained using the outputs of all the base models. The meta-model combined the predictions of the base models, learning to assign the appropriate weights to each model’s contribution. This meta-model approach helped to further improve the accuracy of the final predictions by combining the complementary strengths of all the base models.
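Both ensemble strategies can be sketched directly; the simulated base-model outputs below are stand-ins for the validation-set predictions described above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
y_val = rng.normal(50, 10, 200)                      # stand-in validation prices
# Simulated base-model predictions with different error levels.
preds = {
    "xgb":      y_val + rng.normal(0, 2.0, 200),
    "cnn_lstm": y_val + rng.normal(0, 2.5, 200),
    "lstm":     y_val + rng.normal(0, 3.5, 200),
}
rmse = {m: np.sqrt(np.mean((p - y_val) ** 2)) for m, p in preds.items()}

# (1) Weighted averaging: weights inversely proportional to each model's RMSE.
inv = {m: 1.0 / r for m, r in rmse.items()}
total = sum(inv.values())
weighted = sum((inv[m] / total) * preds[m] for m in preds)

# (2) Meta-model: linear regression stacked on the base-model outputs.
X_stack = np.column_stack(list(preds.values()))
meta = LinearRegression().fit(X_stack, y_val)
meta_pred = meta.predict(X_stack)

rmse_weighted = np.sqrt(np.mean((weighted - y_val) ** 2))
rmse_meta = np.sqrt(np.mean((meta_pred - y_val) ** 2))
```

Because any single base model lies in the span of the stacked regression, the meta-model’s in-sample RMSE can never exceed that of the best individual model.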
The performance of all the models was evaluated using Root Mean Squared Error (RMSE), which is a standard metric for regression tasks. The RMSE measures the average magnitude of errors between predicted and actual values, with lower values indicating better model performance. Among the individual models, the CNN-LSTM hybrid model outperformed the rest, demonstrating its ability to capture both short-term and long-term dependencies effectively. However, the ensemble model, which combined the predictions of all the base models, achieved the best overall performance. By leveraging the unique strengths of each model, the ensemble strategy was able to provide more accurate and robust predictions.
To further ensure the robustness of the models, cross-correlation analysis was performed to examine the relationships between different features in the dataset. This analysis helped to identify the strength and direction of dependencies between the input features and the target variable (electricity prices). Additionally, Pearson correlation analysis was conducted to quantify the linear relationships between the features and the target. This analysis highlighted the most influential features for electricity price forecasting, such as TSO price forecasts, actual electricity consumption, and weather conditions. These features were found to have the strongest correlation with electricity prices, and their inclusion in the model significantly improved forecasting accuracy.
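The Pearson screening step can be sketched with pandas. The column names and the synthetic generating process below are assumptions chosen so that the TSO price forecast dominates, mirroring the finding reported above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "tso_price_forecast": rng.normal(50, 8, n),
    "temperature":        rng.normal(15, 5, n),
})
df["load"] = 20_000 + 300 * df["temperature"] + rng.normal(0, 500, n)
df["price"] = (0.9 * df["tso_price_forecast"]
               + 0.001 * df["load"]
               + rng.normal(0, 1, n))

# Linear relationships between each feature and the target, strongest first.
corr = (df.corr(method="pearson")["price"]
          .drop("price")
          .sort_values(ascending=False))
```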
Finally, to ensure the transparency and interpretability of the models, Shapley Additive exPlanations (SHAP) were employed. SHAP is a popular technique for explaining the output of machine learning models by attributing the prediction to each feature. It provides a way to understand how much each feature contributed to a particular prediction, allowing for greater insight into the model’s decision-making process. In this project, SHAP was used to answer two key questions: first, which model contributed the most to the ensemble’s decision-making process, and second, which features were the most important in predicting electricity prices. The analysis revealed that the CNN-LSTM model was the most influential in the ensemble’s predictions, playing a central role in determining the final forecast. Furthermore, the SHAP analysis identified the key features driving the price fluctuations, including TSO price forecasts, electricity consumption, and weather conditions. These insights into the underlying factors influencing electricity prices provided valuable transparency into the model’s predictions and helped to increase trust in the forecasting process. The use of SHAP also enabled the identification of potential areas for further improvement in the model, providing a foundation for future research and refinement.
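For the linear-regression meta-model, SHAP values have an exact closed form under feature independence, phi_j = w_j (x_j − E[x_j]), which makes the "which model contributed most" question answerable without approximation. The sketch below (with stand-in base-model outputs) computes these attributions and verifies SHAP's local-accuracy property, i.e. that the base value plus the attributions reconstructs each prediction:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
y = rng.normal(50, 10, 200)
# Stand-in outputs of three base models with different noise levels.
X = np.column_stack([y + rng.normal(0, s, 200) for s in (2.0, 2.5, 3.5)])

meta = LinearRegression().fit(X, y)

# Exact SHAP values for a linear model: phi_j = w_j * (x_j - E[x_j]).
base_value = meta.predict(X.mean(axis=0, keepdims=True))[0]
phi = meta.coef_ * (X - X.mean(axis=0))   # one row of attributions per sample

# Local accuracy: base value + sum of attributions equals each prediction.
recon = base_value + phi.sum(axis=1)
```

Averaging |phi| column-wise then ranks the base models by their contribution to the ensemble, which is the per-model view of SHAP used in this study.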
Ethics approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Forecasting framework and data splitting
To ensure a reliable forecasting setup, a strictly chronological data-splitting strategy was employed. The dataset was divided as follows:
-
Training set: January 2015 – December 2017.
-
Validation set: January 2018 – June 2018.
-
Test set: July 2018 – December 2019.
This prevents information leakage and reflects real-world forecasting conditions.
A rolling-origin expanding-window forecasting scheme was adopted. After each prediction step, the model was retrained using all available past observations up to that point. The forecasting horizon was defined as one hour ahead, consistent with operational price forecasting routines in electricity markets.
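The split and expanding-window scheme can be sketched with pandas. This is a minimal illustration with random stand-in prices; the retraining loop is shown over a few daily origins rather than every hour to keep it cheap, and the model fitting itself is elided:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2015-01-01", "2019-12-31 23:00", freq="h")
df = pd.DataFrame({"price": np.random.rand(len(idx)) * 60 + 20}, index=idx)

# Strictly chronological split (no shuffling -> no information leakage).
train = df.loc["2015-01-01":"2017-12-31"]
val   = df.loc["2018-01-01":"2018-06-30"]
test  = df.loc["2018-07-01":"2019-12-31"]

# Rolling-origin expanding window: at each origin, all past data is available
# and the target is the next hour.
origins = pd.date_range("2018-07-01", "2018-07-05", freq="D")
for origin in origins:
    history = df.loc[:origin]                        # everything up to the origin
    target = df.loc[origin + pd.Timedelta(hours=1)]  # 1-hour-ahead target
    # model.fit(history, ...) and model.predict(...) would go here
```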
Handling price spikes and structural breaks
Electricity prices often exhibit abrupt spikes and structural shifts. Instead of removing these events, they were retained to preserve the true market behavior. Price values beyond 3 standard deviations were flagged as extreme events and smoothed using an IQR-based outlier marking combined with rolling-median correction, ensuring robustness without distorting the signal. Additionally, periods with known market rule changes were timestamped so the models could implicitly learn regime-dependent behavior. A robustness analysis was performed by evaluating model RMSE separately for high-volatility segments, showing that the ensemble remained stable across different regimes.
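The flagging-and-smoothing step can be sketched as follows; the injected spike, the 3×IQR fence, and the 25-hour rolling window are illustrative assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
price = pd.Series(50 + rng.normal(0, 3, 500))
price.iloc[100] = 250.0    # injected price spike

# Flag extreme events (> 3 standard deviations from the mean).
z = (price - price.mean()) / price.std()
spike_mask = z.abs() > 3

# IQR-based marking, combined with the z-score flags.
q1, q3 = price.quantile([0.25, 0.75])
iqr = q3 - q1
iqr_mask = (price < q1 - 3 * iqr) | (price > q3 + 3 * iqr)
flagged = spike_mask | iqr_mask

# Rolling-median correction: only flagged points are replaced, so the
# surrounding signal is left undistorted.
smoothed = price.copy()
smoothed[flagged] = price.rolling(25, center=True, min_periods=1).median()[flagged]
```

Using a median (rather than a mean) for the correction keeps the replacement value robust even when the spike itself falls inside the window.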
Hyperparameter tuning strategy
To ensure a fair and unbiased comparison among models, a unified and systematic hyperparameter tuning strategy was employed. A two-stage optimization procedure was applied:
-
1.
Random Search was used to explore broad parameter ranges for each model.
-
2.
The best 10 candidates were further refined using Bayesian Optimization, targeting RMSE on the validation set.
For each model, 30–50 configurations were evaluated, and no test data was used during tuning. All models were trained using the same forecasting horizon, input features, and chronological training/validation split to guarantee comparability. Table 1 shows the search ranges of the hyperparameters used for the different models. As can be seen, for each model a set of key parameters is selected and specific ranges are considered for the search. These ranges are used in the hyperparameter tuning stage, during the combined process of random search and Bayesian optimization, to establish a fair and equal comparison between all models. Determining these ranges plays an important role in preventing overfitting, increasing stability, and ensuring that the final performance of the models is optimal. In addition, all models:
-
Were trained on the same rolling-origin expanding window,
-
Used the same input representation,
-
Predicted the same 1-hour-ahead horizon,
-
Used RMSE on the validation set as the selection criterion,
-
and were retrained from scratch after each tuning round.
This ensures that differences in performance reflect the modeling capabilities rather than differences in training conditions.
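The first stage of the tuning procedure can be sketched as a plain random search over broad ranges; `GradientBoostingRegressor` is used here as a lightweight stand-in for the actual models, the data are synthetic, and the Bayesian refinement of the top 10 candidates (stage two) is only indicated in a comment:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 6))
y = X @ rng.normal(size=6) + rng.normal(0, 0.5, 600)
X_tr, y_tr = X[:400], y[:400]          # chronological training block
X_val, y_val = X[400:], y[400:]        # validation block (test data untouched)

# Stage 1: random search over broad ranges (30 configurations here).
results = []
for _ in range(30):
    params = {
        "learning_rate": 10 ** rng.uniform(-2.5, -0.5),  # log-uniform range
        "max_depth": int(rng.integers(2, 8)),
        "n_estimators": int(rng.integers(100, 400)),
    }
    m = GradientBoostingRegressor(random_state=0, **params).fit(X_tr, y_tr)
    rmse = mean_squared_error(y_val, m.predict(X_val)) ** 0.5
    results.append((rmse, params))

results.sort(key=lambda t: t[0])
top10 = results[:10]   # Stage 2 would refine these with Bayesian optimization
```

Validation RMSE is the only selection criterion, so the held-out test period plays no role in tuning.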
Several meta-learners were tested for stacking, including:
-
Linear Regression
-
Ridge Regression
-
Support Vector Regression (SVR)
-
Random Forest Regressor
-
XGBoost Regressor
Although more complex models achieved lower training error, they consistently showed overfitting on the validation set, particularly SVR and XGBoost-meta. Linear Regression provided:
-
1.
the lowest validation-set RMSE,
-
2.
the highest stability across volatile periods,
-
3.
and full interpretability, enabling SHAP-based analysis of model contributions.
For these reasons, Linear Regression was selected as the final meta-learner. Table 2 shows the performance of the tested meta-learners in the model aggregation process. As the table shows, several models, including linear regression, ridge, SVR, random forest, and XGBoost, were evaluated as meta-learners. Comparing the errors on the validation and test data shows that although some more complex models achieve lower training error, they often suffer from overfitting. In contrast, linear regression, with the lowest and most stable RMSE on the validation and test data, was the best choice as the final meta-learner in the aggregation framework of this study.
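The meta-learner screening can be sketched with scikit-learn. The stand-in base-model outputs below mimic models of differing accuracy; the XGBoost meta-learner is omitted here to keep the sketch dependency-light:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
y = rng.normal(50, 10, 400)
# Stand-in base-model predictions with four different noise levels.
X = np.column_stack([y + rng.normal(0, s, 400) for s in (2.0, 2.5, 3.5, 4.0)])
X_tr, y_tr, X_val, y_val = X[:300], y[:300], X[300:], y[300:]

candidates = {
    "linear": LinearRegression(),
    "ridge":  Ridge(alpha=1.0),
    "svr":    SVR(),
    "rf":     RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    scores[name] = np.sqrt(np.mean((model.predict(X_val) - y_val) ** 2))
```

On such near-linear stacking problems the flexible learners gain little and tend to overfit, which is consistent with the selection of linear regression reported above.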
Results and discussion
In this section, the performance of various machine learning models for hour-ahead electricity price forecasting is analyzed using the Root Mean Squared Error (RMSE) and loss metrics. Models evaluated include XGBoost, LSTM, Stacked LSTM, CNN, CNN-LSTM, Time Distributed MLP (TDM), and Encoder-Decoder. The effectiveness of ensemble methods, such as Weighted Averaging and Meta Model Ensemble, is also discussed, showcasing their potential for enhancing forecasting accuracy. The robustness assessment indicates that although price spikes increased short-term prediction error for individual models, the meta-ensemble consistently maintained lower cumulative error during periods of extreme volatility. This confirms the ability of the ensemble framework to generalize across structural breaks and high-uncertainty intervals.
Comparison of model performance and selection of the best approach
To comprehensively assess the effectiveness of the various models, we summarize their predictive performance using RMSE, training loss, and validation loss. Table 3 presents a comparative analysis of all implemented models. The results show that XGBoost achieved an RMSE of 2.225, outperforming deep learning models such as CNN (2.872) and Stacked LSTM (3.948). Its strong performance can be attributed to the following factors:
1. Efficient Feature Utilization: unlike LSTM and CNN, which require extensive time-series dependencies, XGBoost effectively leverages the most critical input features, making it more adaptable to different market conditions.
2. Computational Efficiency: XGBoost's optimized tree-based structure allows for parallel computation, significantly reducing training time compared to deep learning models.
3. Better Generalization: the model effectively balances bias and variance, avoiding the overfitting that was evident in Stacked LSTM (RMSE: 3.948).
4. Explainability and Interpretability: unlike black-box deep learning models, XGBoost provides direct feature-importance insights through SHAP analysis, making it a more transparent and practical choice for electricity price forecasting.
Based on these results, XGBoost is selected as the best individual model. However, the ensemble approaches (Weighted and Meta-Model Ensembles) achieved further improvements by leveraging multiple models’ strengths. The following sections provide a more detailed analysis of each model’s performance.
Model performance
The RMSE results, along with the training and validation losses for each model, provide valuable insights into their predictive capabilities and limitations.
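As a reference for the per-model results below, the RMSE metric used throughout can be computed as:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: the square root of the mean squared forecast error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Worked example on three hypothetical hourly prices (EUR/MWh):
print(rmse([50.0, 52.0, 48.0], [51.0, 50.0, 49.0]))  # sqrt((1 + 4 + 1) / 3) = sqrt(2)
```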
LSTM results
The LSTM model, trained with a learning rate of 0.001 and 100 epochs, demonstrates its potential for capturing temporal dependencies in data. However, its performance, with a test RMSE of 2.195, reflects the challenges posed by the complex electricity price prediction task. The loss values during training and validation are depicted in Fig. 9, highlighting the model’s convergence over epochs. Similarly, the RMSE values across training and evaluation datasets, shown in Fig. 10, underline the model’s overall predictive performance.
Stacked LSTM results
The deeper architecture of the Stacked LSTM model resulted in a higher RMSE of 3.948, indicating overfitting despite its capacity for intricate feature learning. Training and validation loss trends for this model are presented in Fig. 11, while the corresponding RMSE values are depicted in Fig. 12. These figures suggest that the stacked structure captured complex features at the expense of generalization.
CNN results
The CNN model achieved a moderate RMSE of 2.872, reflecting its ability to extract local patterns in the time-series data. However, its limited capacity for modeling long-term dependencies constrained its effectiveness. Training and validation loss trends (Fig. 13) and RMSE metrics (Fig. 14) confirm its steady but moderate performance.
CNN-LSTM results
The CNN-LSTM hybrid model, with an RMSE of 2.227, combined the strengths of CNNs and LSTMs to capture both short-term and long-term dependencies. The loss curves for this model are shown in Fig. 15, while its RMSE results are presented in Fig. 16. These figures demonstrate the improved performance of the hybrid approach over standalone CNN or LSTM models.
Time-distributed MLP (TDM) results
The TDM model, which applies MLPs across sequential data, achieved an RMSE of 2.316. The training and validation loss trends (Fig. 17) and RMSE metrics (Fig. 18) highlight its potential for handling sequential dependencies, though it lagged behind CNN-LSTM in capturing long-term trends.
Encoder–decoder results
The Encoder-Decoder model, designed to capture intricate temporal dependencies, had an RMSE of 2.562. Despite its advanced architecture, challenges in tuning its parameters limited its effectiveness. Loss values during training and validation (Fig. 19) and RMSE trends (Fig. 20) illustrate its overall performance and convergence behavior.
To address the limitations of individual models, ensemble methods were employed, leveraging the complementary strengths of multiple models.
Weighted averaging ensemble
This ensemble technique combined predictions from all base models using weights inversely proportional to their RMSE, yielding an RMSE of 2.126761 and demonstrating its capability to reduce errors effectively. It outperformed every individual model, although the margin over the strongest base models, XGBoost and LSTM, was modest given their strong standalone predictive power. Figure 21 compares the RMSE values of all base models and the ensembles, illustrating how the ensemble approaches integrate the strengths of individual models. Notably, the Weighted Averaging Ensemble achieves markedly lower errors than CNN and Stacked LSTM, further emphasizing its effectiveness.
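The inverse-RMSE weighting scheme is a simple normalization; the sketch below uses the base-model RMSE values reported in this study:

```python
import numpy as np

def inverse_rmse_weights(rmses):
    """Weights inversely proportional to each model's RMSE, normalized to sum to 1."""
    inv = 1.0 / np.asarray(rmses, float)
    return inv / inv.sum()

# Base-model RMSEs as reported in the text (rounded).
model_rmses = {"XGBoost": 2.225, "LSTM": 2.195, "CNN": 2.872, "CNN-LSTM": 2.227,
               "TDM": 2.316, "Encoder-Decoder": 2.562, "Stacked LSTM": 3.948}
w = inverse_rmse_weights(list(model_rmses.values()))
for name, wi in zip(model_rmses, w):
    print(f"{name:16s} {wi:.3f}")   # lower RMSE -> larger weight
```

As Fig. 22 also shows, the lowest-RMSE models (LSTM, XGBoost) receive the largest weights under this scheme.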
Meta model ensemble
The Meta Model Ensemble employed a Linear Regression meta-model to combine predictions from base models. This technique achieved the best performance, with an RMSE of 1.939032, underscoring its superior ability to generate robust forecasts. The meta-model leveraged the complementary strengths of the base models, producing a significant improvement in forecasting accuracy. In Fig. 22, the average weights assigned to each base model in the Weighted Averaging Ensemble are visualized. It becomes evident that models with lower RMSEs, such as XGBoost and LSTM, were assigned higher weights, reflecting their importance in ensemble prediction.
Ensemble contributions: SHAP analysis
To gain deeper insights into the Meta Model Ensemble, SHAP (SHapley Additive exPlanations) analysis was applied. This technique evaluates the contribution of each base model to the ensemble's final predictions. As shown in Fig. 23, XGBoost and LSTM emerge as the most influential models, confirming their critical role in the meta-model's performance. Meanwhile, models such as CNN and Stacked LSTM provided complementary contributions, enhancing overall accuracy.
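For a linear meta-model with independent features, SHAP values admit a closed form, phi_j = w_j * (x_j - E[x_j]), which makes per-model attributions like those in Fig. 23 easy to reproduce without the shap package. The weights below are hypothetical placeholders, not the fitted meta-model coefficients:

```python
import numpy as np

def linear_shap(weights, intercept, X_background, x):
    """Exact SHAP values for a linear model with independent features.

    Returns (phi, base_value), where phi[j] = w_j * (x_j - mean_j) and
    base_value is the average prediction over the background data.
    Additivity holds exactly: base_value + phi.sum() == f(x).
    """
    mu = X_background.mean(axis=0)
    phi = weights * (x - mu)                      # per-model contribution
    base_value = float(mu @ weights + intercept)  # average ensemble prediction
    return phi, base_value

rng = np.random.default_rng(1)
X = rng.normal(40, 5, size=(200, 4))      # columns = base-model predictions (synthetic)
w = np.array([0.45, 0.35, 0.12, 0.08])    # hypothetical meta-model weights
b = 0.5
phi, base = linear_shap(w, b, X, X[0])
f_x = float(X[0] @ w + b)
print(np.isclose(base + phi.sum(), f_x))  # SHAP additivity holds exactly
```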
Interpretation of SHAP values in the context of electricity market dynamics
In this section, the SHAP results are analyzed not only from a machine learning perspective but also in light of the real mechanisms of the Spanish electricity market. This approach elevates the model outputs from a purely numerical analysis to a deeper understanding of energy-system behavior.
The role of TSO day-ahead forecast
SHAP analysis shows that this feature has the most positive effect on the model output, which is consistent with market logic. The Spanish electricity market operates within the European Market Coupling framework, and day-ahead prices, determined through auctions based on the EUPHEMIA algorithm, are usually the most accurate estimate of tomorrow's market conditions.
When the TSO forecast increases, the SHAP values indicate an increase in the final price, a behavior fully consistent with observed market realities. The reason for this alignment lies in the physical-economic mechanism of the market: the pricing algorithm integrates forecasted demand, renewable generation, and fuel costs, so an increase in this variable acts as a strong signal of price growth.
The role of actual load
SHAP analysis shows that peak-hour consumption has the greatest positive effect on the model output. This result is fully consistent with the logic of the Spanish electricity market: Spain has very distinct daily consumption cycles, especially the afternoon and evening peaks, when demand increases significantly.
During these hours, gas-fired power plants with high marginal costs typically enter the supply curve. The increase in consumption therefore brings more expensive units into the market, which in turn raises the clearing price, mirroring the positive SHAP values.
This pattern is more intense during winter periods and heatwaves, as the need for heating or cooling significantly increases demand, reinforcing the role of consumption in determining market prices.
Impact of wind and solar generation
SHAP analysis shows that increased renewable generation carries negative SHAP values, meaning that growth in clean generation tends to lower market prices. This result is consistent with the actual behavior of the Spanish energy system: Spain has a high share of wind and solar, notably wind in Castilla-La Mancha and solar in Andalusia.
On stormy nights, when wind generation increases, prices tend to fall. Also, during the middle of the day, high solar generation leads to a Midday Price Dip, as a significant part of the supply curve is covered by very low-cost units.
This pattern is exactly what SHAP also shows and is quite consistent with the Merit Order Effect—the displacement of more expensive power plants by less expensive generation.
Temperature, wind, and precipitation (weather variables)
SHAP analysis shows that low temperatures produce positive SHAP values and thus raise prices, while strong winds produce negative SHAP values, via increased wind generation, and lower them. High precipitation usually has a slightly negative effect as well, since it reduces consumption and changes the thermal behavior of buildings.
These results are fully consistent with the logic of the Spanish heat and power market. A significant part of winter electricity demand in Spain is related to electric heating; therefore, a sudden drop in temperature can significantly increase consumption and push the system towards the use of gas units with higher marginal costs, which in turn increases market prices. On the other hand, an increase in wind generation or a decrease in demand due to precipitation is associated with a downward pressure on prices, and this behavior is also precisely observed in the SHAP patterns.
Why does the CNN-LSTM model contribute the most to SHAP ensemble?
The per-model SHAP analysis of the ensemble shows that CNN-LSTM produces the largest contribution to the final prediction. This behavior is understandable given the model's structure: the CNN component extracts short-term patterns such as spikes and ramp events, while the LSTM component learns longer-term trends, including weekly cycles and seasonal changes.
Given that electricity prices are multi-scale in nature and have both fast fluctuations and slow trends, the combination of CNN and LSTM provides an optimal balance between these two types of patterns. This is why this model excelled during the volatile market periods of 2018–2019, where simultaneous detection of short-term spikes and long-term structural changes was essential for accurate forecasting.
Overall, the SHAP results not only demonstrate the importance of features, but also directly reflect the real mechanisms of the Spanish electricity market, such as Market Coupling, Merit Order Effect, the role of renewable fluctuations, thermal-electrical load and consumption cycles. This shows that the model has correctly learned not only statistical patterns, but also the economic-physical structure of the market.
Statistical significance analysis of the difference in model performance (Diebold–Mariano test)
To assess whether the differences in model performance are statistically significant, the Diebold–Mariano (DM) test was applied to the forecast-error distributions. The test was performed pairwise between the baseline models, and between each baseline model and the Meta-Model Ensemble. The results are presented in Table 4.
According to the results, the difference between the Meta-Model Ensemble and all baseline models was statistically significant (p < 0.05). Therefore, the superiority of the final model is not only proven based on the RMSE values but also on the significant difference in the error distribution.
It was also observed that some differences between the baseline models—for example LSTM and CNN-LSTM—are not statistically significant. This indicates that the closeness of the RMSE of these models is due to the similarity of their temporal error behavior.
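A minimal implementation of the DM test under squared-error loss, using the normal approximation for the p-value, is sketched below. The error series are synthetic illustrations, not the study's actual residuals:

```python
import numpy as np
from math import erfc, sqrt

def diebold_mariano(e1, e2, h=1):
    """Diebold-Mariano test for equal predictive accuracy under squared-error loss.

    e1, e2: forecast-error series of the two competing models.
    h: forecast horizon; autocovariances up to lag h-1 enter the long-run variance.
    Returns (DM statistic, two-sided p-value via the normal approximation).
    A negative statistic means model 1 is more accurate.
    """
    d = np.asarray(e1, float) ** 2 - np.asarray(e2, float) ** 2  # loss differential
    n = len(d)
    d_bar = d.mean()
    # Long-run variance of d_bar (Newey-West style, lags 0..h-1).
    gamma = [np.sum((d[k:] - d_bar) * (d[:n - k] - d_bar)) / n for k in range(h)]
    var_dbar = (gamma[0] + 2 * sum(gamma[1:])) / n
    dm = d_bar / sqrt(var_dbar)
    p = erfc(abs(dm) / sqrt(2))  # two-sided normal p-value
    return dm, p

rng = np.random.default_rng(2)
e_meta = rng.normal(0, 1.9, 2000)   # hypothetical meta-ensemble errors
e_base = rng.normal(0, 2.9, 2000)   # hypothetical weaker base-model errors
dm, p = diebold_mariano(e_meta, e_base)
print(dm < 0, p < 0.05)             # the lower-error model is significantly better
```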
Cumulative error comparison
The cumulative error plot in Fig. 24 compares the temporal error accumulation of the base models against the ensemble methods. Both ensembles show a consistent reduction in cumulative errors, with the Meta Model Ensemble maintaining the lowest error growth over time. This visualization underscores how combining models reduces forecasting errors more effectively than relying on any individual model.
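A cumulative-error comparison of the kind shown in Fig. 24 can be reproduced on synthetic error series as follows (the magnitudes are illustrative only, loosely scaled to the reported RMSEs):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
errors = {                           # hypothetical absolute hourly forecast errors
    "CNN":           np.abs(rng.normal(0, 2.9, n)),
    "XGBoost":       np.abs(rng.normal(0, 2.2, n)),
    "Meta-Ensemble": np.abs(rng.normal(0, 1.9, n)),
}
# Running total of absolute error over time; flatter growth = more robust model.
cumulative = {name: np.cumsum(e) for name, e in errors.items()}
for name, c in cumulative.items():
    print(f"{name:13s} final cumulative |error| = {c[-1]:.1f}")
```

A consistently lower-error model accumulates error more slowly, so its curve stays below the others for the whole horizon, which is the pattern the Meta Model Ensemble exhibits in Fig. 24.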
Hyperparameter tuning and analysis
Each model underwent extensive hyperparameter tuning to optimize its performance. For instance, XGBoost was fine-tuned with a learning rate of 0.03 and a maximum tree depth of 180, balancing complexity and generalization. Similarly, CNN-LSTM and Encoder-Decoder architectures required careful adjustments to their learning rates and epoch settings to maximize predictive accuracy.
Conclusion
This study focused on enhancing hour-ahead electricity price forecasting by employing a diverse set of machine-learning models and optimizing their performance through ensemble techniques. After evaluating multiple models, including XGBoost, LSTM, Stacked LSTM, CNN, CNN-LSTM, Time Distributed MLP (TDM), and Encoder-Decoder, it was clear that no single model was consistently the best across all evaluation metrics. However, the XGBoost model performed notably well with an RMSE of 2.225155, and the LSTM model showed a slightly improved performance with an RMSE of 2.194644. While more complex models like Stacked LSTM yielded higher errors (RMSE of 3.948107), simpler models like CNN (RMSE of 2.871703) and CNN-LSTM (RMSE of 2.226790) demonstrated strong results. Despite these individual successes, the most significant improvement in predictive performance came from the ensemble approaches.
The Weighted Ensemble model, which combined the individual model predictions weighted by their respective RMSE values, resulted in an RMSE of 2.126761, significantly reducing the error compared to individual models. However, the Meta Model Ensemble outperformed all other methods, achieving the best result with an RMSE of 1.939032. This outcome highlights the power of ensemble strategies in improving model accuracy by harnessing the complementary strengths of various models. The meta-model, trained using a linear regression approach, was able to effectively combine the outputs of the base models, resulting in a more reliable and accurate forecast.

A key contribution of this study was the application of the Shapley Additive exPlanations (SHAP) framework, which provided a clear and interpretable understanding of the factors influencing electricity price predictions. Through SHAP analysis, we identified that TSO price forecasts, actual electricity consumption, and weather conditions were the most influential variables affecting the accuracy of the predictions. This insight not only reinforces the transparency of the model's decision-making process but also offers valuable practical information for market participants and policymakers seeking to understand the underlying drivers of electricity price fluctuations.

While individual models provided valuable insights, it was the ensemble methods that led to the most accurate predictions in this study. The Meta Model Ensemble emerged as the best-performing approach, demonstrating the effectiveness of combining different models. Additionally, the use of SHAP provided much-needed transparency and interpretability, offering a deeper understanding of the factors influencing electricity prices. This research underscores the importance of leveraging multiple models in combination with explainable AI techniques for both improving forecasting accuracy and ensuring trust in predictive models within the energy sector39.
According to the Diebold–Mariano test, the superiority of the Meta-Model Ensemble over the other models is not only evident in the RMSE values but also statistically significant. It can therefore be concluded that the final model reliably provides better performance in electricity price forecasting.
Further directions
While this study provides a solid foundation for electricity price forecasting, several opportunities for future research remain. First, the exploration of more diverse and advanced ensemble techniques, such as stacked generalization or boosting, could further enhance the performance of the model. Additionally, incorporating external factors such as policy changes, economic indicators, or real-time demand fluctuations could help refine the predictions and make them even more robust. Another area of improvement is the exploration of deep reinforcement learning (DRL) techniques, which have been successfully applied in other domains to handle dynamic, sequential decision-making problems like energy price forecasting. By integrating DRL with ensemble methods, future models could be trained to adapt to changing market conditions over time. Moreover, expanding the dataset to include more granular time intervals, such as 15-minute or 30-minute intervals, could provide further insights into short-term fluctuations and improve the accuracy of hour-ahead predictions. Finally, the application of these models to different geographical regions, with varying electricity markets and consumption patterns, would allow for broader generalizations and comparisons, providing a more comprehensive understanding of the methods’ scalability and adaptability across diverse environments.
In conclusion, the potential for machine learning in electricity price forecasting is vast, and there remains much to explore in terms of model refinement, integration of additional features, and application to different energy markets. The findings of this study pave the way for future advancements in predictive analytics within the energy sector, contributing to more efficient market operations and better-informed decision-making.
Data availability
The datasets generated or analyzed during the current study are not publicly available but are available from the corresponding author, Sina Samadi Gharehveran, upon reasonable request.
References
Zahiri, M., Shirini, K. & Samadi Gharehveran, S. Network traffic analysis with machine learning for faster detection of distributed denial of service attack. J. Adv. Def. Sci. Technol. 10.1001.1.26762935.1402.14.4.6.2 (2024).
Oskouei, A. G. et al. Efficient superpixel-based brain MRI segmentation using multi-scale morphological gradient reconstruction and quantum clustering. Biomed. Signal Process. Control. 100, 107063. https://doi.org/10.1016/j.bspc.2024.107063 (2025).
Gheibi, Y. et al. CNN-Res: deep learning framework for segmentation of acute ischemic stroke lesions on multimodal MRI images. BMC Med. Inf. Decis. Mak. 23 (1), 192. https://doi.org/10.1186/s12911-023-02289-y (2023).
Samadi Gharehveran, S. A review of energy management of multi-microgrid power systems in the presence of uncertainty of distributed generation resources. Power Control Data Process. Syst. 2(4), e731409. https://doi.org/10.30511/pcdp.2025.2072671.1046 (2025).
Cini, A., Marisca, I., Zambon, D. & Alippi, C. Graph deep learning for time series forecasting. ACM Comput. Surv. 57(12), 1–34. https://doi.org/10.1145/3742784 (2025).
Nemati, R., Shirini, K. & Samadi Gharehveran, S. FER-HA: a hybrid attention model for facial emotion recognition. J. Supercomput. 81(16), 1485. https://doi.org/10.1007/s11227-025-07983-4 (2025).
Taherihajivand, A., Shirini, K. & Samadi Gharehveran, S. An overview of product performance prediction using artificial algorithms. Agricultural Mechanization 9(3), 1–14. https://doi.org/10.22034/jam.2024.61899.1276 (2024).
Sattari, M. T. et al. Modeling daily and monthly rainfall in Tabriz using ensemble learning models and decision tree regression. Clim. Change Res. 5 (18), 31–48. https://doi.org/10.30488/ccr.2024.433394.1192 (2024).
Shirini, K. & Samadi Gharehveran, S. Balancing time and cost in resource-constrained project scheduling using meta-heuristic approach. J. Agric. Mach. https://doi.org/10.22067/jam.2023.81735.1157 (2024).
Sattari, M. T., Shirini, K. & Javidan, S. Evaluating the efficiency of dimensionality reduction methods in improving the accuracy of water quality index modeling in Qizil-Uzen river using machine learning algorithms (1982). https://doi.org/10.22098/mmws.2023.12434.1241
Taherihajivand, A., Shirini, K. & Samadi Gharehveran, S. Weed detection in fields using convolutional neural network based on deep learning. Agric. Eng. 47(1), 129–142. https://doi.org/10.22055/agen.2024.45327.1688 (2024).
Arora, R. K., Soni, A., Tiwari, A. & Mandadapu, P. A dual approach to forecasting in the Irish day-ahead market: time series and machine learning techniques. Int. J. Inf. Technol. 17(6), 3185–3195. https://doi.org/10.1007/s41870-024-02353-4 (2025).
Belenguer, E., Segarra-Tamarit, J., Pérez, E. & Vidal-Albalate, R. Short-term electricity price forecasting through demand and renewable generation prediction. Math. Comput. Simul. 229, 350–361. https://doi.org/10.1016/j.matcom.2024.10.004 (2025).
Ahrari, M. et al. A security-constrained robust optimization for energy management of active distribution networks with presence of energy storage and demand flexibility. J. Energy Storage. 84, 111024. https://doi.org/10.1016/j.est.2024.111024 (2024).
Gharehveran, S. S., Ghassemzadeh, S. & Rostami, N. Two-stage resilience-constrained planning of coupled multi-energy microgrids in the presence of battery energy storages. Sustainable Cities Soc. 83, 103952. https://doi.org/10.1016/j.scs.2022.103952 (2022).
Gharehveran, S. S., Zadeh, S. G. & Rostami, N. Resilience-oriented planning and pre-positioning of vehicle-mounted energy storage facilities in community microgrids. J. Energy Storage. 72, 108263. https://doi.org/10.1016/j.est.2023.108263 (2023).
Gharehveran, S. S. & Zadeh, S. G. Investigation of the effects of transmission system cooperation operators in electric energy networks. J. Crit. Rev. 7(1) (2020).
El-Azab, H. A. I. et al. Seasonal forecasting of the hourly electricity demand applying machine and deep learning algorithms impact analysis of different factors. Sci. Rep. 15, 9252. https://doi.org/10.1038/s41598-025-91878-0 (2025).
Samadi Gharehveran, S. & Nasiri, M. Resilient planning against disturbances and optimal location determination for mobile energy storage systems in smart microgrids. Passive Def. 16(2), 69–80 (2025).
Chen, D., Lin, X. & Qiao, Y. Perspectives for artificial intelligence in sustainable energy systems. Energy 318, 134711. https://doi.org/10.1016/j.energy.2025.134711 (2025).
Shao, D. & Zoh, K. Study on the spatial distribution patterns and formation mechanism of religious sites based on XGBoost-SHAP and spatial econometric models: a case study of the Yangtze River Delta, China. J. Asian Archit. Build. Eng. 24(6), 5853–5874. https://doi.org/10.1080/13467581.2024.2431318 (2025).
Saeedi, N. et al. Prediction of electrical energy consumption using principal component analysis and independent components analysis. J. Supercomput. 81, 1072. https://doi.org/10.1007/s11227-025-07505-2 (2025).
Mohamed, M., Fouda, M. M., Fadlullah, Z. M., Abdelfattah, R. & Ibrahem, M. I. A Bayesian-optimized LSTM model for day-ahead load price forecasting in the ERCOT market. IEEE Open J. Comput. Soc. 6, 1001–1011. https://doi.org/10.1109/OJCS.2025.3580107 (2025).
Ismail, Z., Alali, A. S., Muhammad, A., Ashraf, M. & Abdellatif, S. O. Introducing a novel figure of merit for evaluating stability of perovskite solar cells: Utilizing long short-term memory neural networks. IEEE Access 13, 49735–49749. https://doi.org/10.1109/ACCESS.2025.3550658 (2025).
Shirini, K., Kordan, M. B. & Gharehveran, S. S. Impact of learning rate and epochs on Lstm model performance: a study of chlorophyll-a concentrations in the Marmara sea. J. Supercomput. 81(1), 1–18. https://doi.org/10.1007/s11227-024-06806-2 (2025).
Cihan, M. T. & Cihan, P. Bayesian-optimized ensemble models for geopolymer concrete compressive strength prediction with interpretability analysis. Buildings 15(20), 3667. https://doi.org/10.3390/buildings15203667 (2025).
Gharehveran, S. S. et al. Deep learning-based demand response for short-term operation of renewable-based microgrids. J. Supercomputing. 80 (18), 26002–26035. https://doi.org/10.1007/s11227-024-06407-z (2024).
Binte Habib, A., Alam, M. G. R. & Uddin, M. Z. AUNET (Attention-Based Unified Network): Leveraging attention-based N-BEATS for enhanced univariate time series forecasting. IEEE Access 13, 95184–95217. https://doi.org/10.1109/ACCESS.2025.3574459 (2025).
Zaki Dizaji, H. et al. Modeling variables affecting the yield of sugarcane fields using deep recurrent neural network. Iran. J. Biosystems Eng. https://doi.org/10.22059/ijbse.2025.378958.665557 (2025).
Gharehveran, S. S., Shirini, K. & Abdolahi, A. Optimizing energy storage solutions for grid resilience: a comprehensive overview. https://doi.org/10.5772/intechopen.1006499 (2025).
Jin, K. et al. Robust power management capabilities of integrated energy systems in the smart distribution network including linear and non-linear loads. Sci. Rep. 15, 6615. https://doi.org/10.1038/s41598-025-89817-0 (2025).
Shirini, K., Taherihajivand, A. & Samadi Gharehveran, S. A review of algorithms for solving the project scheduling problem with resource-constrained considering agricultural problems. Agricultural Mechanization. 8 (1), 1–14. https://doi.org/10.22034/jam.2023.55751.1227 (2023).
Shirini, K., Aghdasi, H. S. & Saeedvand, S. Multi-objective aircraft landing problem: a multi-population solution based on non-dominated sorting genetic algorithm-II. J. Supercomputing. 80 (17), 25283–25314. https://doi.org/10.1007/s11227-024-06385-2 (2024).
Shirini, K., Aghdasi, H. S. & Saeedvand, S. A comprehensive survey on multiple-runway aircraft landing optimization problem. Int. J. Aeronaut. Space Sci. 25 (4), 1574–1602. https://doi.org/10.1007/s42405-024-00747-z (2024).
Mystakidis, A. et al. A multi-energy meta-model strategy for multi-step ahead energy load forecasting. Electr. Eng. 107, 9675–9699. https://doi.org/10.1007/s00202-025-02995-y (2025).
Shirini, K., Hajivand, A. T. & Gharehveran, S. S. A novel deep learning-based method for potato leaf disease classification. 9th Adv. Eng. Days. 9, 462–464 (2024).
Nazaré, T., Zhao, Y., Browne, J. & Nepomuceno, E. Energy efficiency in NARMAX models for reduced carbon footprint. IEEE Trans. Ind. Appl. https://doi.org/10.1109/TIA.2025.3591586 (2025).
Mudassir, M., Bennbaia, S., Unal, D. & Hammoudeh, M. Time-series forecasting of Bitcoin prices using high-dimensional features: a machine learning approach. Neural Comput. Appl. 37(28), 22979–22993 (2025).
Shirini, K., Aghdasi, H. S. & Saeedvand, S. Modified imperialist competitive algorithm for aircraft landing scheduling problem. J. Supercomput. https://doi.org/10.1007/s11227-024-05999-w (2024).
Funding
No funding was received for this work.
Author information
Authors and Affiliations
Contributions
Authors contribution statement: Amirhosein Hayati : Methodology, Investigation, Software, Writing-Original draft. Sina Samadi Gharehveran: Conceptualization, Methodology, Writing-Reviewing and Editing, Supervision. Kimia Shirini: Methodology, Conceptualization, Writing-Reviewing and Editing.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hayati, A., Gharehveran, S.S. & Shirini, K. Electricity price forecasting with ensemble meta-models and SHAP explainers: a PCA-driven approach. Sci Rep 16, 6466 (2026). https://doi.org/10.1038/s41598-026-35839-1