Abstract
As solar photovoltaic (PV) penetration in smart grids increases, accurate and computationally efficient short-term forecasting becomes essential for operational planning and real-time energy management. However, PV power output is highly nonlinear and stochastic because of changing weather conditions, which degrades the performance of single forecasting models. This work proposes a stacked ensemble regression model that combines Gradient Boosting and XGBoost (Extreme Gradient Boosting) as base learners, with Ridge Regression as the meta-learner, for very short-term PV power prediction. The model takes meteorological and operational parameters as inputs, such as temperature, humidity, wind profile, cloud cover distribution, and solar conditions. Standard preprocessing steps (missing value imputation, feature selection, and normalization) are applied to support stable model training. An empirical study is carried out on real-world PV generation data, and the results are compared, using k-fold cross-validation, with popular gradient boosting algorithms (Gradient Boosting, XGBoost, LightGBM (Light Gradient Boosting Machine), and CatBoost (Categorical Boosting)) and with neural network models (multilayer perceptron (MLP) and long short-term memory (LSTM)). The proposed stacked ensemble improves predictive accuracy, achieving MAE = 0.042 ± 0.002, MSE = 0.0031 ± 0.0002, and R² = 94% ± 1% under the experimental conditions used in this work. Nonparametric tests (the Wilcoxon signed-rank test and the Friedman test) show that these improvements are statistically significant (p < 0.05). Moreover, the inference latency of the proposed model is low, which indicates its suitability for near-real-time deployment in real-world smart grid scenarios.
The experimental results indicate that lightweight ensemble learning can be a competitive and practical alternative to more complex deep learning methods for short-term PV power forecasting when data availability and computational budget are taken into account.
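The abstract reports each metric as a mean ± spread over the cross-validation folds. The following is a minimal sketch of how MAE, MSE, and R² could be computed per fold and then aggregated. It assumes the "±" term is the sample standard deviation across folds, which the abstract does not state. The function names and the three toy folds are hypothetical and are not the study's data or code.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Mean and sample standard deviation of per-fold scores (assumed meaning of '±')."""
    return mean(scores), stdev(scores)


# Hypothetical held-out (targets, predictions) pairs for three folds.
# These values are invented for illustration; they are not the paper's data.
folds = [
    ([0.0, 0.5, 1.0, 0.5], [0.1, 0.5, 0.9, 0.5]),
    ([0.2, 0.4, 0.6, 0.8], [0.2, 0.5, 0.6, 0.7]),
    ([0.0, 0.3, 0.9, 0.6], [0.0, 0.3, 0.8, 0.7]),
]

mae_mean, mae_std = summarize([mae(t, p) for t, p in folds])
mse_mean, mse_std = summarize([mse(t, p) for t, p in folds])
r2_mean, r2_std = summarize([r2(t, p) for t, p in folds])

print(f"MAE = {mae_mean:.3f} ± {mae_std:.3f}")
print(f"MSE = {mse_mean:.4f} ± {mse_std:.4f}")
print(f"R²  = {100 * r2_mean:.1f}% ± {100 * r2_std:.1f}%")
```

Reporting R² as a percentage, as the abstract does, is only a rescaling of the usual value, which is at most 1. In a real evaluation the per-fold predictions would come from the fitted stacked model rather than from hard-coded lists.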
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- PV: Photovoltaic
- ML: Machine learning
- GB: Gradient boosting
- XGBoost: Extreme gradient boosting
- LightGBM: Light gradient boosting machine
- CatBoost: Categorical boosting (developed by Yandex)
- LSTM: Long short-term memory
- MAE: Mean absolute error
- MSE: Mean squared error
- R²: Coefficient of determination
- RNN: Recurrent neural network
- AI: Artificial intelligence
- ARIMA: Autoregressive integrated moving average
- SARIMA: Seasonal autoregressive integrated moving average
- ANN: Artificial neural network
- SVM: Support vector machine
- GBM: Gradient boosting machine
- ReLU: Rectified linear unit
- CNN: Convolutional neural network
- GAN: Generative adversarial network
- BERT: Bidirectional encoder representations from transformers
- SR: Stacked regressor
- IQR: Interquartile range
- Q1: First quartile (25th percentile)
- Q3: Third quartile (75th percentile)
- ε: Residual
Funding
This work received no external funding.
Author information
Contributions
T. Mariprasath and Rohini G: Conceptualization, Methodology, Software, Visualization, Investigation, Writing - Original draft preparation. Seif Al Bustanji: Data curation, Validation, Supervision, Resources, Writing - Review & Editing. Ievgen Zaitsev: Project administration, Supervision, Resources, Writing - Review & Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Rohini, G., Mariprasth, T., Bustanji, S.A. et al. A stacked Gradient Boosting–XGBoost ensemble with ridge meta-learner for accurate short-term solar PV power forecasting in smart grids. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47042-3
DOI: https://doi.org/10.1038/s41598-026-47042-3


