MCrossFormer: multi-level cross-scale transformer for photovoltaic power and lifespan prediction

Sun, JiaWen; Yang, WenZhong; Yin, YaBo; Sa, JinHai; Wen, JianQiang; Wei, FuYuan; Zhou, JianLi; Ma, Hui

doi:10.1038/s41598-025-27638-x

Article
Open access
Published: 16 December 2025

JiaWen Sun¹,
WenZhong Yang¹,
YaBo Yin¹,
JinHai Sa¹,
JianQiang Wen²,
FuYuan Wei¹,
JianLi Zhou^3,4 &
Hui Ma¹

Scientific Reports volume 15, Article number: 43925 (2025)

Abstract

Accurate prediction of photovoltaic (PV) module lifespan and power output is essential for ensuring system reliability and economic viability. However, this task remains challenging due to two main factors: the complex coupling of degradation mechanisms under varying environmental stresses, and the multi-scale temporal characteristics inherent in PV power data. To address these challenges, this study proposes an integrated approach combining a weighted power degradation coupling model with a deep learning-based forecasting framework. The research first systematically analyzes how key environmental factors–such as temperature, humidity, ultraviolet radiation, and thermal cycling–individually and interactively affect module degradation. Building on this physical understanding, we develop a neural network model capable of capturing multi-scale temporal patterns in power generation data. Specifically, we propose a novel Multi-level Cross-scale Transformer (MCrossFormer) architecture to overcome the limited generalization ability of traditional PV power prediction models. It adopts three parallel encoder-decoder structures to capture the trend, periodic, and closeness characteristics, respectively. Within each encoder-decoder module, we design a long short-distance attention mechanism, which consists of a Short Distance Attention (SDA) module, a Long Distance Attention (LDA) module, and a Multilayer Perceptron (MLP), to dynamically identify and capture critical patterns from PV power time series data. Extensive experiments on three public benchmarks show that the proposed MCrossFormer achieves significant and consistent improvements over state-of-the-art models, underscoring its effectiveness in practical forecasting scenarios.

Introduction

With the continuous development of the global economy, fossil energy sources such as petroleum, coal, and natural gas are rapidly depleting. These non-renewable resources are increasingly unable to meet the rising global energy demand, contributing to a growing energy crisis¹. At the same time, global challenges such as climate change and ecological degradation have pushed countries around the world to actively pursue sustainable energy alternatives. Among these alternatives, solar energy stands out due to its wide availability, zero emissions, low cost, and abundant reserves². It plays a crucial role in addressing both energy security and environmental concerns. In particular, the photovoltaic (PV) industry–especially crystalline silicon-based PV technology–has witnessed significant growth in recent years. As shown in Fig. 1, global installed PV capacity has increased dramatically over the past decade, reaching 2,200 GW by the end of 2024. Continued advances in PV technology are not only improving system performance but also driving down costs, making solar power more accessible and economically viable^3,4. Together, these factors are accelerating the global shift toward a cleaner, low-carbon energy structure^5,6.

PV modules are the core components of PV power generation systems, tasked with converting solar energy into electrical energy. However, as PV systems scale up, reliability issues related to PV modules have become more pronounced. Because they are directly exposed to natural conditions, PV modules must operate under harsh and variable environments, facing continuous environmental stress⁷. Additionally, external mechanical stresses–such as those caused during transportation, installation, and maintenance–can also lead to physical damage. Over time, both environmental and mechanical stresses contribute to performance degradation, ultimately shortening the service life of PV modules. The causes of performance degradation can be broadly categorized into internal and external factors. Internal factors include defects in encapsulation materials, imperfections from the manufacturing process, and electrical stress effects. External factors involve temperature fluctuations, high humidity, ultraviolet radiation, extreme temperature differences, and hail. Among these, external environmental factors are often the dominant cause of power output degradation, directly impacting long-term module performance⁸.

To ensure the operational reliability of PV modules, they must undergo rigorous quality testing based on International Electrotechnical Commission (IEC) standards before leaving the factory⁹. However, while IEC tests are standardized and systematic, they do not accurately reflect the actual service life of PV modules¹⁰. This is mainly because the stress levels applied during testing are much higher and more uniform than those encountered under real-world operating conditions. In practice, PV modules are exposed to a wide range of complex and unpredictable environmental stresses, including temperature fluctuations, humidity, ultraviolet (UV) radiation, and thermal cycling–all of which may act simultaneously and interactively over time¹¹. These combined stresses create coupled effects on module performance, making degradation a nonlinear and time-dependent process. As a result, experimental conditions in IEC tests fail to fully capture the random and interactive nature of real-world environments, leading to a noticeable gap between lab-tested and actual field performance^12,13. Moreover, the reliability and service life of PV modules directly affect the energy output and overall efficiency of PV power plants. Therefore, analyzing the failure mechanisms of performance degradation under real-world conditions is essential. This enables more accurate reliability assessments and lifespan predictions, providing both theoretical and data-driven support for the investment, construction, and long-term operation and maintenance of PV power systems. Ultimately, this approach helps extend module lifespans, improve power generation efficiency, and enhance economic returns¹⁴. However, previous studies inadequately address coupling effects and non-linearity modeling, often resulting in unreliable model predictions.

However, research on the performance degradation of PV modules struggles to directly reflect its systemic impact on the overall power generation capacity of PV power plants. On this basis, accurate power output forecasting effectively captures the temporal dynamics of PV generation, enabling optimized operation and maintenance strategies while providing reliable capacity assessments for grid dispatch. This capability ultimately enhances the stability and efficiency of the power system¹⁵.

Over the past decades, deep learning models have attracted significant attention in PV power prediction due to their strong capability in handling complex time series data. For example, Husein et al.¹⁶ reviewed the prediction accuracy of various deep learning approaches–including LSTM, Transformer, and CNN–for PV time series forecasting. Their experimental results show that the attention mechanism in Transformer models enables better capture of temporal dependencies, thereby enhancing forecasting accuracy. However, a key limitation in current PV power prediction models remains: data-driven deep learning methods struggle to effectively capture multi-scale temporal features, particularly the distinct patterns at hourly and daily time scales¹⁷. This shortcoming may compromise the models’ adaptability and flexibility in real-world PV power generation scenarios characterized by diverse and variable conditions.

This study addresses the challenges in predicting power degradation and lifespan of photovoltaic (PV) modules. It investigates these issues by examining the failure mechanisms under diverse environmental stresses and analyzing the coupling relationships between different degradation modes. In addition, because lifespan prediction methods struggle to reflect the overall power generation capacity of PV power plants, we design a deep-learning-based PV power forecasting model. Existing deep learning approaches struggle to capture multi-scale temporal dependencies in PV power generation. To address this gap, we introduce the Multi-level Cross-scale Transformer (MCrossFormer). Its dedicated long-short distance attention mechanism is designed to capture these complex patterns, thereby improving forecasting precision. MCrossFormer incorporates three parallel encoder-decoder branches, each designed to model a distinct component of the PV power time series: trend, seasonality (periodicity), and short-term fluctuations (closeness). Within each branch, both long-distance and short-distance attention modules are used to capture temporal dependencies across multiple time scales. Moreover, the model employs multi-scale feature extraction and fusion to further improve its generalization ability, enabling more accurate and adaptable PV power forecasting.

The main contributions of our study can be summarized as follows:

PV Module Failure Mechanisms Analysis: We propose a lifespan prediction method by analyzing the failure mechanisms of PV modules under various environmental stresses and the coupling relationships among diverse forms of power degradation.
Multi-Level Cross-Scale Transformer Model: To fully capture the multi-scale temporal characteristics, we propose a novel multi-level cross-scale Transformer model for PV power prediction. To the best of our knowledge, this work represents the first exploration of fine-grained, multi-scale feature fusion methods within the PV industry.

To validate the effectiveness of our proposed approaches, we conduct extensive experiments on three real-world datasets to evaluate both the lifespan prediction accuracy and power forecasting performance. The experimental results demonstrate that our methods achieve superior performance and consistently outperform other state-of-the-art techniques across multiple evaluation metrics.

The remainder of this paper is organized as follows. Section “Related work” briefly reviews related work. Section “PV module lifespan prediction model” presents the lifespan prediction of PV modules, and Section “PV power prediction model” describes the problem formulation and the details of our proposed prediction model. Section “Experimental details” presents the detailed experiment setup, followed by the analysis of the results in Section “Experimental results”. Finally, we conclude our study in Section “Conclusion”.

Related work

Power degradation mechanisms of PV modules

Performance degradation of PV modules refers to the irreversible decline in output power during operational service¹⁸. While manufacturers have implemented advanced solutions such as improved sealing, structural reinforcements, and anti-aging technologies to boost durability, PV modules still face relentless exposure to severe outdoor conditions. This constant exposure subjects them to complex combinations of environmental stresses that collectively drive performance degradation. These include temperature fluctuations, humidity variations, UV radiation, dust accumulation, and other synergistic factors. Such environmental stresses are inherently random and volatile, making the degradation process unpredictable¹⁹. Moreover, the resulting power loss often reflects complex coupling effects among different degradation mechanisms. Therefore, studying output power degradation under combined environmental stresses remains a key and ongoing challenge in the field of PV reliability engineering.

In existing research on the degradation mechanisms of PV module output power, several studies have examined the effects of limited environmental stresses on power degradation. For example, Rajput et al.²⁰ assessed how factors such as dust accumulation, partial shading, and humidity influence the reliability and performance degradation of PV modules during outdoor exposure. Mahdi et al.²¹ investigated PV failures and degradation mechanisms while also summarizing relevant fault detection techniques used in most PV systems. Similarly, Khan et al.²² compared the performance of different models addressing PV performance degradation and reliability. However, these studies fail to comprehensively capture the combined effects of complex environmental stresses on PV performance degradation under real-world outdoor conditions. As a result, they remain insufficient for accurately assessing long-term reliability or predicting the service lifetime of PV modules in practical applications.

PV module lifetime prediction methods

Currently, the main methods for PV module lifetime prediction are statistical methods, physical approaches, and machine learning techniques²³.

Statistical methods

Statistical methods for lifetime prediction assume that selected parameters follow specific probability distribution models²⁴. By fitting these distribution functions to PV module power degradation data, the service lifetime of the modules can be estimated. Commonly used distributions include the exponential, normal, and Weibull distributions. For instance, Wu et al.²⁵ applied the Weibull distribution to model PV module lifetime under the influence of UV radiation, analyzing how different UV dose levels impact power degradation. However, the effectiveness of statistical methods depends on making reasonable assumptions about the characteristics of the degradation data to ensure appropriate distribution selection. Moreover, these methods are generally limited to short-term predictions, as they fail to capture the coupled relationships between degradation parameters over time²⁶. This limitation often leads to significant deviations between predicted and actual values in the later stages of module operation²⁷.

Physical methods

Due to stochastic fluctuations in environmental stresses, the performance degradation of PV modules exhibits inherent uncertainty and nonlinear behavior^28,29. Stochastic process approaches make use of the time-dependent characteristics of PV output power to model degradation as a stochastic process. This helps reduce prediction bias caused by uncertain factors and improves the accuracy of lifetime predictions³⁰. For example, Kaaya et al.²³ used physical models to evaluate multiple sources of uncertainty in PV degradation and lifetime prediction, including climate variable estimation, module reliability, and degradation rate modeling. However, the effectiveness of these methods depends heavily on the proper selection of stochastic process functions. Without careful model design, the ability to accurately characterize uncertainty in performance degradation diminishes, leading to suboptimal prediction results³¹.

Machine learning methods

Machine learning (ML) models are capable of estimating mapping functions between input and output variables and have proven effective in solving complex physical problems³². In the context of PV modules, ML models can extract degradation trends from historical data to estimate long-term power degradation rates, thereby enabling accurate lifetime prediction. Depending on modeling methodology, ML techniques can be broadly classified into two categories: traditional machine learning methods–such as Random Forests (RF), k-Nearest Neighbors (kNN), and Support Vector Regression (SVR)–and deep learning methods, including Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). For example, Wen et al.³³ developed an LSTM-based model to predict PV power output using inputs such as temperature, humidity, wind speed, Global Horizontal Irradiance (GHI), and Diffuse Horizontal Irradiance (DHI). However, ML methods typically require large volumes of high-quality training data to build reliable models. In scenarios with limited real-world data, their generalizability tends to suffer, reducing prediction effectiveness.

Deep learning-based PV power prediction

The forecasting of PV power offers an effective solution to mitigate power output fluctuations, enhance grid stability, and support more accurate power generation planning. In recent years, deep learning-based methods have gained significant attention for their ability to model complex temporal patterns in PV data³⁴. For instance, Akarapon et al.³⁵ employed various machine learning and deep learning models–including Multilayer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNNs)–as core techniques for PV power forecasting. Christos et al.³⁶ proposed a novel deep learning framework that integrates power generation data from multiple interconnected countries to improve forecasting accuracy. Similarly, Guo et al.³⁷ utilized quantile regression for probabilistic prediction, combining CNNs with a bidirectional gated recurrent unit (BiGRU) and an attention mechanism to enhance forecasting performance. Cortez et al. applied a combination of LSTM and autoencoder networks to refine prediction results by learning the errors between predicted and actual values. However, autoencoders tend to be sensitive to input data distribution shifts, which may affect robustness. In addition, Husein et al.¹⁶ conducted a comparative review of various deep learning models, including LSTM, Transformer, and CNNs, for PV time series forecasting. Their experimental results showed that the Transformer model demonstrated the best performance in capturing temporal dependencies. Despite these advances, most existing methods overlook the importance of multi-scale features in PV time series data, which limits their generalization ability and leaves room for further improvement.

PV module lifespan prediction model

Degradation rate models for four degradation modes

Our proposed model accounts for four types of power degradation in PV modules, each associated with a specific environmental stressor or combination of stressors. These include: (A) Thermal-humidity degradation caused by temperature and humidity; (B) Ultraviolet (UV) radiation degradation caused by temperature and ultraviolet radiation; (C) Temperature cycling degradation caused by temperature and thermal cycling; (D) Humidity cycling degradation caused by temperature and humidity cycling. The corresponding output power degradation rate models for each of these degradation modes are described as follows.

(A) Thermal-humidity degradation mode

$$\begin{aligned} R_{(T, R H)}=A_{(T, R H)}\left( r h_{e f f}\right) ^\alpha \exp \left( -\frac{E_{(T, R H)}}{k_B T}\right) , \end{aligned}$$

(1)

where $R_{(T, R H)}$ represents the failure reaction rate of PV modules under the influence of temperature and humidity. $A_{(T, R H)}$ denotes the pre-exponential factor, which is related to material properties. $E_{(T, R H)}$ is the activation energy (unit: eV) required for the degradation of PV module output power due to temperature and humidity effects. $r h_{e f f}$ stands for the ambient relative humidity. $\alpha$ is the influence coefficient of relative humidity on PV module output power degradation; a higher value of $\alpha$ indicates a more severe impact of relative humidity on power output degradation. $k_B$ is the Boltzmann constant ($8.62 \times 10^{-5}$ eV/K), and $T$ represents the ambient temperature (unit: K).

(B) UV radiation degradation mode

$$\begin{aligned} R_{(T, U V)}=A_{(T, U V)}(U V)^\beta \exp \left( -\frac{E_{(T, U V)}}{k_B T}\right) , \end{aligned}$$

(2)

where $R_{(T, U V)}$ represents the failure reaction rate of PV modules under ultraviolet radiation influence. $A_{(T, U V)}$ denotes the pre-exponential factor related to material properties. UV stands for ultraviolet radiation intensity. $E_{(T, U V)}$ is the activation energy (in eV) required for PV module output power degradation caused by UV radiation. $\beta$ is the influence coefficient of UV radiation on PV module degradation. A higher $\beta$ value indicates more severe impact of UV radiation on PV module power degradation.
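Eqs. (1) and (2) share one form: a power-law stress term multiplied by an Arrhenius factor. The following minimal sketch evaluates that form numerically. The function name `arrhenius_power_rate` and every parameter value are placeholders chosen for illustration only; they are not fitted values from this study.

```python
import math

K_B = 8.62e-5  # Boltzmann constant in eV/K, as quoted for Eq. (1)

def arrhenius_power_rate(pre_factor, stress, exponent, activation_ev, temp_k):
    """A * stress**exponent * exp(-E / (k_B * T)): the shared form of Eqs. (1) and (2).

    For Eq. (1) `stress` is the relative humidity and `exponent` is alpha;
    for Eq. (2) `stress` is the UV intensity and `exponent` is beta.
    """
    return pre_factor * stress ** exponent * math.exp(-activation_ev / (K_B * temp_k))

# Hypothetical inputs: same humidity, two ambient temperatures.
r_cool = arrhenius_power_rate(1.0e6, 60.0, 1.5, 0.7, 298.15)  # 25 degC
r_hot = arrhenius_power_rate(1.0e6, 60.0, 1.5, 0.7, 318.15)   # 45 degC
print(f"rate ratio (45 degC vs 25 degC): {r_hot / r_cool:.2f}")
```

Because the stress term is identical in both calls, the printed ratio isolates the Arrhenius temperature dependence, which is governed only by the assumed activation energy.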

(C) Thermo-mechanical cycling degradation mode

$$\begin{aligned} R_{\left( T_{\max }, \Delta T\right) }=A_{\left( T_{\max }, \Delta T\right) }(\Delta T)^\gamma f_{\left( T_{\max }, \Delta T\right) } \exp \left( -\frac{E_{\left( T_{\max }, \Delta T\right) }}{k_B T_{\max }}\right) , \end{aligned}$$

(3)

where $R_{\left( T_{\max }, \Delta T\right) }$ represents the failure reaction rate of PV modules under thermal cycling. $A_{\left( T_{\max }, \Delta T\right) }$ denotes the pre-exponential factor related to material properties. $\Delta T$ is the temperature difference between maximum and minimum temperatures. $f_{\left( T_{\max }, \Delta T\right) }$ indicates the thermal cycling frequency; higher cycling frequencies lead to more severe PV module output power degradation. $E_{\left( T_{\max }, \Delta T\right) }$ is the activation energy (unit: eV) required for output power degradation caused by thermal cycling effects. $\gamma$ represents the influence coefficient of thermal cycling on PV module degradation. A higher $\gamma$ value corresponds to a more significant impact of thermal cycling on power degradation.

(D) Humidity cycling degradation mode

$$\begin{aligned} R_{(T, \Delta R H)}=A_{(T, \Delta R H)}\left( \Delta r h_{e f f}\right) ^\delta f_{(T, \Delta R H)} \exp \left( -\frac{E_{(T, \Delta R H)}}{k_B T}\right) , \end{aligned}$$

(4)

where $R_{(T, \Delta R H)}$ represents the failure reaction rate of PV modules under humidity cycling stress. $A_{(T, \Delta R H)}$ denotes the pre-exponential factor related to material properties. $\Delta r h_{e f f}$ stands for the difference between maximum and minimum relative humidity. $f_{(T, \Delta R H)}$ indicates the humidity cycling frequency. $E_{(T, \Delta R H)}$ is the activation energy (unit: eV) required for output power degradation caused by humidity cycling effects. $\delta$ represents the influence coefficient of humidity cycling on PV module degradation. A higher $\delta$ value corresponds to a more significant impact of humidity cycling on power degradation.
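The two cycling modes, Eqs. (3) and (4), add a cycling-frequency factor to the power-law and Arrhenius terms. The sketch below shows how the rate scales with frequency and with the size of the swing. The helper `cycling_rate` and all numbers are hypothetical and serve only to illustrate the functional form.

```python
import math

K_B = 8.62e-5  # Boltzmann constant in eV/K

def cycling_rate(pre_factor, swing, exponent, frequency, activation_ev, temp_k):
    """A * swing**exponent * f * exp(-E / (k_B * T)): the shared form of Eqs. (3) and (4).

    Eq. (3): swing = Delta T, exponent = gamma, temp_k = T_max.
    Eq. (4): swing = Delta rh_eff, exponent = delta, temp_k = ambient T.
    """
    arrhenius = math.exp(-activation_ev / (K_B * temp_k))
    return pre_factor * swing ** exponent * frequency * arrhenius

# Hypothetical thermal-cycling inputs (not values from the paper).
base = cycling_rate(1.0e3, 30.0, 2.0, 1.0, 0.5, 333.15)
double_freq = cycling_rate(1.0e3, 30.0, 2.0, 2.0, 0.5, 333.15)
double_swing = cycling_rate(1.0e3, 60.0, 2.0, 1.0, 0.5, 333.15)
print(f"2x frequency -> {double_freq / base:.1f}x rate")
print(f"2x swing     -> {double_swing / base:.1f}x rate (exponent 2)")
```

The rate is linear in the cycling frequency, whereas its sensitivity to the swing is set by the influence coefficient, here an assumed exponent of 2.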

Since PV modules operate in outdoor environments, the coupling relationships between various environmental stressors and among different power degradation modes remain insufficiently understood. As a result, the overall power degradation of PV modules cannot be accurately estimated by simply summing the individual degradation rates in a linear manner. The four degradation rate models described earlier are each applicable to specific, isolated degradation modes. However, when interactions occur between multiple degradation processes, these individual models cannot be directly applied, as they do not account for the synergistic effects among stressors.

Weighted power degradation coupling model for integrated environments

Based on the failure mechanisms of PV modules and their real-world operating conditions, this paper proposes an output power degradation rate model that incorporates weighted values reflecting the influence of multiple environmental stressors. This model enables the quantification of coupling relationships among different degradation modes. The actual power degradation rate of PV modules under combined environmental stress conditions is expressed by,

$$\begin{aligned} R_{\textrm{PV}}=\left( 1+R_{(T, R H)}\right) ^m \cdot \left( 1+R_{(T, U V)}\right) ^n \cdot \left( 1+R_{\left( T_{\max }, \Delta T\right) }\right) ^p \cdot \left( 1+R_{(T, \Delta R H)}\right) ^l-1 , \end{aligned}$$

(5)

where $R_{\textrm{PV}}$ represents the actual output power degradation rate of the PV module. $R_{(T, R H)}$, $R_{(T, U V)}$, $R_{\left( T_{\max }, \Delta T\right) }$, and $R_{(T, \Delta R H)}$ denote the power degradation rates under the thermal-humidity, ultraviolet radiation, temperature cycling, and humidity cycling degradation modes, respectively, as defined in Eqs. (1)–(4). The coefficients m, n, p and l are the corresponding weighting factors that quantify the contribution of each respective degradation mode to the overall power degradation rate. These weighting factors are initially assigned based on established expert knowledge.
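To make the coupling in Eq. (5) concrete, the following sketch combines four mode rates multiplicatively and compares the result with a plain linear sum. The function name `coupled_rate`, the rates, and the unit weights are illustrative assumptions rather than values used in this study.

```python
def coupled_rate(rates, weights):
    """Eq. (5): product over modes of (1 + R_i)**w_i, minus 1."""
    total = 1.0
    for rate, weight in zip(rates, weights):
        total *= (1.0 + rate) ** weight
    return total - 1.0

# Hypothetical per-mode rates: thermal-humidity, UV, thermal cycling, humidity cycling.
rates = [0.004, 0.003, 0.002, 0.001]
weights = [1.0, 1.0, 1.0, 1.0]  # m, n, p, l (placeholder values)

coupled = coupled_rate(rates, weights)
print(f"coupled rate: {coupled:.6f}, linear sum: {sum(rates):.6f}")
```

With unit weights the coupled rate slightly exceeds the linear sum, because the product form includes the cross terms between modes. A weight of zero removes a mode from the product entirely.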

The exponential degradation model for the output power of PV modules

The actual degradation rate of PV module output power does not follow a simple linear relationship with the degree of degradation. To better capture the nonlinear characteristics of degradation behavior, this paper adopts an exponential-based function to model PV power output degradation, as shown in Eq. (6),

$$\begin{aligned} \frac{P_{m m p}(\textrm{t})}{P_{m m p}(0)}=1-\exp \left( -\left( \frac{B}{R_{\textrm{PV}} \cdot t}\right) ^\mu \right) , \end{aligned}$$

(6)

where $P_{m m p}(0)$ represents the initial maximum power point (MPP) power of the PV module, while $P_{m m p}(t)$ denotes the MPP power at time t. The parameter B is the power sensitivity coefficient, which generally varies depending on the material properties of the PV module. The symbol $\mu$ represents the shape parameter.

Since the lifespan of a PV module is typically defined as the time it takes for its maximum power point (MPP) power to degrade to 80% of its initial value, the MPP power, service life, and power degradation rate can be mathematically linked. Based on this relationship, the lifespan prediction model for PV modules proposed in this paper is formulated as follows:

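$$\begin{aligned} t=\frac{B(-\ln 0.2)^{-\frac{1}{\mu }}}{R_{\textrm{PV}}}, \end{aligned}$$

(7)

where $R_{\textrm{PV}}$ is calculated in Eq. (5). This expression follows from setting the power ratio in Eq. (6) to 0.8, which gives $\exp \left( -\left( B /\left( R_{\textrm{PV}} \cdot t\right) \right) ^\mu \right) =0.2$, and solving for $t$; the positive quantity $-\ln 0.2=\ln 5$ keeps the fractional power real-valued.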
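The relation between the degradation curve of Eq. (6) and the lifespan expression can be checked numerically. The sketch below inverts Eq. (6) at the 80% threshold using the positive quantity $-\ln 0.2$ (so that the fractional power is real) and confirms that the resulting time reproduces a power ratio of 0.8. The function names and the values of $B$, $\mu$, and $R_{\textrm{PV}}$ are hypothetical.

```python
import math

def power_ratio(t, rate_pv, b, mu):
    """Eq. (6): P(t) / P(0) = 1 - exp(-(B / (R_PV * t))**mu)."""
    return 1.0 - math.exp(-((b / (rate_pv * t)) ** mu))

def lifespan(rate_pv, b, mu, threshold=0.8):
    """Time at which the power ratio falls to `threshold`.

    Solving Eq. (6) gives (B / (R_PV * t))**mu = -ln(1 - threshold),
    hence t = B * (-ln(1 - threshold))**(-1/mu) / R_PV.
    """
    return b * (-math.log(1.0 - threshold)) ** (-1.0 / mu) / rate_pv

# Placeholder parameters, chosen only for illustration.
R_PV, B, MU = 0.01, 0.5, 0.3
t_life = lifespan(R_PV, B, MU)
print(f"lifespan: {t_life:.2f} time units, ratio there: {power_ratio(t_life, R_PV, B, MU):.3f}")
```

The time unit of the result is whatever unit the degradation rate is expressed in, for example years if the rate is per year.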

PV power prediction model

In this section, we first provide a brief formulation of the PV power prediction problem. The detailed explanation of the proposed model is presented in the subsequent sections.

Problem formulation

In this study, we address the problem of high-dimensional time series forecasting. Let $\varvec{X}^{Tr}\in \mathbb {R}^{T_{Tr} \times D}=[\varvec{x}_1^{Tr},\varvec{x}_2^{Tr},...,\varvec{x}_T^{Tr}]^{\top }$ denote the trend component of the PV power data, $\varvec{X}^{P} \in \mathbb {R}^{T_{P} \times D}=[\varvec{x}_1^{P},\varvec{x}_2^{P},...,\varvec{x}_T^{P}]^{\top }$ denote the periodic component, and $\varvec{X}^{C}\in \mathbb {R}^{T_{C} \times D}=[\varvec{x}_1^{C},\varvec{x}_2^{C},...,\varvec{x}_T^{C}]^{\top }$ denote the closeness component. These three components correspond to weekly, daily, and hourly time series patterns, respectively. Here, $T_{Tr}, T_{P}, T_{C}$ denote the numbers of time steps associated with the weekly trend sequences, daily periodic sequences, and hourly closeness sequences, respectively.

The objective of this study is to formulate the task as an h-step-ahead forecasting problem. Specifically, we aim to learn a predictor $f(\cdot )$ that estimates the future PV power output $\varvec{\hat{Y}}\in \mathbb {R}^{h \times D}=[\varvec{y}_{T+1},\varvec{y}_{T+2},...,\varvec{y}_{T+h}]^{\top }$. Also, our proposed deep learning model quantifies the minor drift of key physical parameters (m, n, p, l) caused by long-term degradation mechanisms. Then, the formal objective function is defined as follows:

$$\begin{aligned} \varvec{\hat{Y}}=f(\varvec{x}_{1}^{Tr},\varvec{x}_{2}^{Tr},...,\varvec{x}_{T}^{Tr};\varvec{x}_{1}^{P},\varvec{x}_{2}^{P},...,\varvec{x}_{T}^{P};\varvec{x}_{1}^{C},\varvec{x}_{2}^{C},...,\varvec{x}_{T}^{C};m,n,p,l), \end{aligned}$$

(8)

where h denotes the prediction horizon ahead of the current timestep. In PV power forecasting, h usually ranges from hours to months.
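One plausible way to assemble the trend, periodic, and closeness inputs from a single hourly series is sketched below. The text does not specify the sampling scheme, so the strides used here (168 hours for weekly, 24 hours for daily, 1 hour for closeness) and the helper name `build_inputs` are assumptions made for illustration.

```python
def build_inputs(series, t, n_close, n_period, n_trend, day=24, week=168):
    """Return (trend, periodic, closeness) histories for target index t.

    Assumed scheme: closeness takes the most recent hours, periodic takes the
    same hour on previous days, trend takes the same hour in previous weeks.
    """
    if t - n_trend * week < 0 or t > len(series):
        raise ValueError("not enough history for the requested window sizes")
    closeness = [series[t - k] for k in range(n_close, 0, -1)]
    periodic = [series[t - k * day] for k in range(n_period, 0, -1)]
    trend = [series[t - k * week] for k in range(n_trend, 0, -1)]
    return trend, periodic, closeness

# Synthetic series: three weeks of hourly values where value == hour index.
series = list(range(24 * 7 * 3))
trend, periodic, closeness = build_inputs(series, t=500, n_close=3, n_period=2, n_trend=2)
print(trend, periodic, closeness)
```

Because the synthetic values equal their own indices, the printed lists show directly which past hours feed each branch.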

Model description

Figure 2 presents an overview of the proposed Multi-level Cross-scale Transformer (MCrossFormer), which aims to model the complex temporal dependencies of power output data. Transformers excel at capturing long-range dependencies in sequential data through self-attention mechanisms, overcoming the limitations of RNNs and CNNs in modeling distant contextual relationships. This architecture enables superior parallelization and more effective learning of complex temporal patterns compared to traditional recurrent models. Motivated by this, we design a novel Multi-level Cross-scale Transformer architecture with a long-short distance attention mechanism to improve prediction accuracy. The framework consists of three parallel encoder-decoder structures, which are designed to capture the trend, periodic, and closeness features of PV power data at weekly, daily, and hourly levels, respectively. Within each encoder-decoder module, we design a Long Short Distance Attention (LSDA) mechanism, which consists of three key modules: a Short Distance Attention (SDA) module, a Long Distance Attention (LDA) module, and a Multilayer Perceptron (MLP). The two attention modules are complementary: SDA focuses on capturing local dependencies within the power output series, while LDA is designed to model global characteristics across the generation data. Finally, the outputs from the weekly, daily, and hourly branches are fused to generate the final prediction.

In the following sections, we will introduce each component of the architecture in detail.

Long short distance attention module

Multi-head attention enables the model to simultaneously learn diverse dependency patterns across different representation subspaces, thereby improving its ability to capture varied contextual relationships. To exploit this strength in PV power forecasting, we adopt a multi-head attention mechanism to extract meaningful temporal features from PV power time series. The self-attention module is divided into two complementary components: Short-Distance Attention (SDA) and Long-Distance Attention (LDA). In SDA, adjacent embeddings are grouped into contiguous segments of length G; that is, all input tokens are evenly divided into multiple non-overlapping windows of fixed size. Self-attention is then computed independently within each window, allowing the model to effectively capture local dependencies.

In contrast, LDA samples embeddings at fixed intervals I to construct segments of length S, where $G = S / I$, enabling the capture of longer-range patterns. In effect, a “diluted” sampling is applied to all input tokens: tokens are subsampled from the entire feature map at a fixed stride, forming a sparse set, and global self-attention is then performed on this sparse token set. Both SDA and LDA apply standard self-attention mechanisms within their respective segments to model intra-segment dependencies. This alternating structure enables the network to simultaneously capture both local and global information at different scales, leading to richer feature representations.
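The difference between the two grouping schemes can be illustrated with token indices alone. The sketch below assumes a one-dimensional sequence of S tokens, contiguous windows of size G for SDA, and strided groups with interval I for LDA, so that each LDA group also holds G = S / I tokens. The attention step is deliberately simplified (scalar tokens, no learned query, key, or value projections, single head), and all function names are hypothetical rather than taken from the MCrossFormer implementation.

```python
import math

def sda_groups(num_tokens, group_size):
    """Short-distance grouping: contiguous, non-overlapping windows."""
    if num_tokens % group_size:
        raise ValueError("num_tokens must be divisible by group_size")
    return [list(range(s, s + group_size)) for s in range(0, num_tokens, group_size)]

def lda_groups(num_tokens, interval):
    """Long-distance grouping: tokens sampled every `interval` positions."""
    if num_tokens % interval:
        raise ValueError("num_tokens must be divisible by interval")
    return [list(range(offset, num_tokens, interval)) for offset in range(interval)]

def self_attention(values):
    """Softmax self-attention on scalar tokens with identity projections."""
    out = []
    for q in values:
        scores = [q * k for k in values]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        out.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return out

def grouped_attention(tokens, groups):
    """Apply self-attention independently inside each index group."""
    result = [0.0] * len(tokens)
    for idx in groups:
        for i, a in zip(idx, self_attention([tokens[i] for i in idx])):
            result[i] = a
    return result

tokens = [float(i) for i in range(12)]  # S = 12
local_out = grouped_attention(tokens, sda_groups(12, 4))   # G = 4
global_out = grouped_attention(tokens, lda_groups(12, 3))  # I = 3, G = 12 / 3 = 4
print(sda_groups(12, 4))
print(lda_groups(12, 3))
```

Each SDA group spans four neighbouring positions, whereas each LDA group spans the whole sequence at a stride of three, which is what lets the second scheme relate distant time steps at the same attention cost.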

Fusion module

To account for the varying influence of different temporal components, we integrate the outputs from the hourly, daily, and weekly modules using a weighted fusion mechanism. This approach allows the model to assign appropriate significance to each temporal scale based on its contribution. The final prediction is formulated as:

$$\begin{aligned} \hat{\varvec{Y}} = \varvec{W}^{Tr} \hat{\varvec{Y}}^{Tr} + \varvec{W}^{P} \hat{\varvec{Y}}^{P} + \varvec{W}^{C} \hat{\varvec{Y}}^{C}, \end{aligned}$$

(9)

where $\hat{\varvec{Y}}^{Tr}$, $\hat{\varvec{Y}}^{P}$, and $\hat{\varvec{Y}}^{C}$ denote the outputs of the trend (weekly), periodic (daily), and closeness (hourly) components, respectively, and $\varvec{W}^{Tr}$, $\varvec{W}^{P}$, and $\varvec{W}^{C}$ are learnable weight matrices used to adjust the relative contribution of each temporal component to the final prediction.
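As a small numerical illustration of Eq. (9), the sketch below fuses three branch outputs with fixed weights. It rests on assumptions: the text does not state whether the weights act element-wise or by matrix multiplication, so an element-wise product is assumed here, the function name `fuse` is hypothetical, and the weights are hand-picked constants rather than learned parameters.

```python
def fuse(y_trend, y_period, y_close, w_trend, w_period, w_close):
    """Element-wise weighted sum of the three branch outputs (assumed form of Eq. 9)."""
    lengths = {len(y_trend), len(y_period), len(y_close),
               len(w_trend), len(w_period), len(w_close)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")
    return [wt * yt + wp * yp + wc * yc
            for yt, yp, yc, wt, wp, wc
            in zip(y_trend, y_period, y_close, w_trend, w_period, w_close)]


# Two forecast steps; the trend branch is weighted most heavily (illustrative values).
fused = fuse([1.0, 2.0], [3.0, 4.0], [5.0, 6.0],
             [0.5, 0.5], [0.25, 0.25], [0.25, 0.25])
print(fused)  # [2.5, 3.5]
```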

Loss function

In our framework, the proposed MCrossFormer is trained by minimizing the mean squared error (MSE) between the predicted values $\hat{\varvec{Y}}_i$ and the corresponding ground truth values $\varvec{Y}_i$. The training objective is defined as follows:

$$\begin{aligned} L(\theta ) = \frac{1}{2} \sum _{i=1}^{M} (\hat{\varvec{Y}}_i - \varvec{Y}_i)^2 \end{aligned}$$

(10)

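where $\theta$ denotes all trainable parameters in MCrossFormer and M is the number of samples. As written, Eq. (10) is a sum of squared errors scaled by 1/2; it differs from the MSE only by the constant factor $M/2$, so both objectives have the same minimizer.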
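The objective in Eq. (10) can be evaluated directly on a few scalar predictions. This is a minimal sketch with hypothetical function names (`half_sse`, `mse`) and made-up values, intended only to show how the written objective relates to the mean squared error.

```python
def half_sse(predictions, targets):
    """Objective as written in Eq. (10): half the sum of squared errors."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return 0.5 * sum((p - t) ** 2 for p, t in zip(predictions, targets))


def mse(predictions, targets):
    """Mean squared error over the same samples."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


preds = [1.0, 2.0, 4.0]
truth = [1.0, 3.0, 2.0]
print(half_sse(preds, truth))  # 2.5
# The two quantities differ by the constant factor M / 2 (here M = 3).
print(abs(half_sse(preds, truth) - (len(truth) / 2) * mse(preds, truth)) < 1e-12)
```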

Experimental details

Datasets description

The examined grid-connected PV installations were implemented through the World Bank’s Africa Renewable Energy Access initiative. This initiative strengthened renewable energy capabilities at Kwame Nkrumah University of Science and Technology (KNUST) and partner institutions. The 20 kWp array integrates five distinct module technologies at 4 kWp per type: amorphous silicon (a-Si), monocrystalline silicon (mc-Si), polycrystalline silicon (pc-Si), Heterojunction with Intrinsic Thin-layer (HIT, combining a-Si and mc-Si), and Copper Indium Sulfide (CIS). Module selection intentionally encompassed both crystalline and thin-film configurations, including silicon and non-silicon materials.

In addition, this study conducts extensive comparative experiments using three real-world datasets, with details shown in Table 1.

Yamansu Dataset: Monthly power generation data collected from 125 zones and 2 inverters at the TBEA Yamansu PV Power Plant, spanning from April 2023 to April 2025.
Beitashan Dataset: Hourly operational data from Inverter #1 at the Three Gorges Beitashan PV Plant, ranging from January 1, 2021 to March 31, 2025. It includes three key features: DC voltage, DC current, and DC power.
Shichengzi Dataset: Hourly operational data from Inverter #1 at the Three Gorges Shichengzi Power Station, ranging from January 1, 2021 to March 31, 2025. It includes three key features: DC voltage, DC current, and DC power.

Table 1 The details of three real-world datasets.

Experimental setting

Data cleaning and preprocessing

To improve the accuracy of PV module degradation rate calculations and reduce the effects of seasonal variation and extreme weather on output power, this study uses annual metrics to quantify environmental stresses. Specifically, six key environmental parameters are computed for each PV installation: annual average ambient temperature, annual average relative humidity, annual average ultraviolet irradiation, annual peak ambient temperature, annual average cyclic temperature, and annual average cyclic humidity. Initial data filtration is conducted on the performance parameters of the PV plant, involving batch deletion and correction of erroneous records. For remaining outliers or noise not fully resolved during this step, median filtering is applied. In this process, each data point is replaced with the median of its own value and a defined number of neighboring observations, which reduces anomalies and produces a smoothed dataset. Finally, Min-Max normalization is used to scale the numerical data to the interval [0, 1]. Furthermore, we transform metadata such as holidays and the day of the week via one-hot encoding.
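The three preprocessing steps described above (median filtering, Min-Max normalization, and one-hot encoding) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the window size of 3, the truncated windows at the series boundaries, and the function names are all hypothetical, since the text does not specify them.

```python
from statistics import median


def median_filter(values, window=3):
    """Replace each point with the median of itself and its neighbours."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                 # windows are truncated at the edges (assumption)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


def min_max_scale(values):
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]          # constant series: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(index, num_classes):
    """Encode a categorical index, e.g. day of week with num_classes=7."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(num_classes)]


power = [10.0, 12.0, 95.0, 11.0, 13.0]        # 95.0 plays the role of an outlier
smoothed = median_filter(power)
print(smoothed)                  # [11.0, 12.0, 12.0, 13.0, 12.0]
print(min_max_scale(smoothed))   # [0.0, 0.5, 0.5, 1.0, 0.5]
print(one_hot(2, 7))             # [0, 0, 1, 0, 0, 0, 0]
```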

Hyperparameters

We monitor module temperatures using PT100 temperature sensors mounted on the rear surfaces of the modules. Environmental parameters, including ambient temperature and plane-of-array irradiance, are captured by SMA’s Sunny SensorBox. Inverter performance metrics and SensorBox environmental readings are collated and stored in the Sunny WebBox data logger. Comprehensive installation details are given in ref.³⁸. In addition, grid search is used to tune all adjustable hyperparameters. For baseline models, we use the parameter settings provided in their original papers.

For our proposed MCrossFormer model, the key hyperparameters are set as follows: a batch size of 64, an initial learning rate of $1\times 10^{-3}$, and the Adam optimizer. The model dimension, $d_{\textrm{model}}$, is set to 128, and the number of attention heads, $N_{\textrm{head}}$, is 8. We employ 2 encoder layers and 1 decoder layer, with a dropout rate of 0.1. The model is trained for a maximum of 300 epochs, with an early stopping patience of 10 epochs to prevent overfitting. Additionally, we set the prediction horizon to $h = 1$, which corresponds to forecasting PV power for the next hour on the Beitashan and Shichengzi datasets, and for the next month on the Yamansu dataset. Our model employs an open-loop forecasting strategy, in which predictions are generated exclusively from historical data without feeding previously predicted values back as inputs for subsequent time steps. In the result tables, the best results are highlighted in bold. All experiments are performed on a Linux server equipped with four NVIDIA TITAN X (Pascal) GPUs, each with 12 GB of memory.
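The early stopping rule (patience of 10 epochs, at most 300 epochs) can be made concrete with a short sketch. The text does not state which quantity is monitored, so the validation loss is assumed here; the function name `stopping_epoch` and the loss values are hypothetical.

```python
def stopping_epoch(val_losses, patience=10, max_epochs=300):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    # No early stop was triggered: training runs to the end of the schedule.
    return min(len(val_losses), max_epochs)


# The loss improves for 3 epochs and then plateaus at a worse value.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
print(stopping_epoch(losses))  # 13: ten non-improving epochs after epoch 3
```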

Baseline methods

To evaluate the effectiveness of our proposed method, we conduct a comparative analysis against several existing models commonly used for time series prediction.

GRU: A sequential model based on Gated Recurrent Units (GRU), known for its simple structure and broad application in time series forecasting.
LSTM: An advanced variant of recurrent neural networks that effectively captures long-term dependencies, addressing the limitations of traditional RNNs.
Transformer³⁹: Utilizes the attention mechanism to model both short-term and long-term dependencies in time series data.
Informer⁴⁰: Introduces the ProbSparse attention mechanism to reduce computational and memory costs while maintaining performance.
iTransformer⁴¹: Applies attention and feed-forward layers to transposed input dimensions in order to better capture multivariate correlations.

Metrics

Root Mean Square Error (RMSE): RMSE measures the square root of the average of squared differences between predicted and actual values. It is calculated as:

$$\begin{aligned} RMSE = \sqrt{\frac{1}{n}\sum _{i=1}^{n}(y_i - \hat{y}_i)^2} \end{aligned}$$

Mean Absolute Error (MAE): MAE represents the average of absolute differences between predicted and actual values. It is defined as:

$$\begin{aligned} MAE = \frac{1}{n}\sum _{i=1}^{n}|y_i - \hat{y}_i| \end{aligned}$$

Mean Absolute Percentage Error (MAPE): MAPE expresses the prediction accuracy as a percentage by calculating the mean of absolute percentage errors. It is formulated as:

$$\begin{aligned} MAPE = \frac{1}{n}\sum _{i=1}^{n}\left| \frac{y_i - \hat{y}_i}{y_i}\right| \times 100\% \end{aligned}$$

where n is the number of observations, $y_i$ represents the actual values, and $\hat{y}_i$ denotes the predicted values.
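The three metrics can be computed directly from their definitions. The sketch below uses only the Python standard library; the function names and the sample values are illustrative. Because MAPE divides by $y_i$, it is undefined whenever an actual value is zero, so the sketch raises an error in that case rather than guessing at a convention the text does not specify.

```python
import math


def _check(actual, predicted):
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(actual, predicted):
    """Root mean square error."""
    _check(actual, predicted)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    _check(actual, predicted)
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]
print(round(rmse(actual, predicted), 3))  # 14.142
print(round(mae(actual, predicted), 3))   # 13.333
print(round(mape(actual, predicted), 3))  # 6.667
```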

Experimental results

Lifespan prediction of PV module

To validate the accuracy of the PV module lifespan prediction model, this study first calibrates the model parameters using field-measured maximum power point data from the Jinyang Solar Power Plant. The calibrated model is then verified using measured data from the Zuoyun Mingyang Solar Power Plant. Six key environmental parameters from the Mingyang PV power station are used as inputs to the lifespan prediction model: annual average ambient temperature of 16.17°C, annual average ambient humidity of 70.11%, annual average maximum ambient temperature of 23.99°C, annual average ultraviolet radiation of 4.2 kWh/m², annual average cycling temperature of 15.05°C, and annual average cycling humidity of 56%. The predicted lifespan of the PV modules is illustrated in Fig. 3, where the light blue shaded area represents the 90% confidence interval of the prediction.

The prediction results are shown in Table 2. We present comparative experimental results for 20 samples, where both target and prediction values represent normalized output power data. As observed from the table, the weighted power degradation coupling model achieves accurate predictions for both PV module power degradation and lifespan. The absolute relative deviations between the predicted values and the field-measured data points are all within 1.43%, and the average relative deviation is within 1.17%.

Table 2 Life prediction results considering the coupling relationship of weight values.

Prediction performance of PV power

We systematically evaluated the performance of seven forecasting models on three real-world PV datasets and comprehensively measured the prediction accuracy of different models using average ranking. As shown in Table 3, MCrossFormer significantly outperforms the current best baseline model iTransformer across all three evaluation metrics: RMSE, MAE, and MAPE.

As shown in Table 3, ARIMA exhibits the poorest performance among the baseline models, primarily due to its limited ability to capture complex nonlinear patterns and long-term dependencies in the data. While both GRU and LSTM can effectively model long-term temporal dependencies through their gated mechanisms, their inherent sequential computation nature leads to two critical drawbacks: firstly, they cannot achieve parallel computation, significantly impacting training efficiency; secondly, they still face gradient vanishing problems when modeling ultra-long sequences, which is particularly evident in scenarios requiring analysis of annual PV power generation data.

The Transformer model, based on the multi-head attention mechanism, breaks through the limitations of sequential computation by enabling parallel access to information from any position in the sequence. However, the standard Transformer’s performance in PV forecasting still lags behind Informer and iTransformer. The fundamental bottleneck lies in its full self-attention mechanism–as sequence length increases, computational complexity grows quadratically, while global attention to all time points introduces substantial noise that obscures the physically meaningful patterns in PV generation data.

Informer effectively reduces computational overhead through its ProbSparse attention mechanism, enabling the model to focus on the most relevant temporal correlations. Its unique attention distillation technique refines key features layer by layer, forming more compact and physically meaningful temporal representations. iTransformer further breaks through the limitations of pure temporal dimension modeling by incorporating channel-wise feature extraction to construct flexible representations of multivariate time series. This architecture enables better generalization across different datasets and input structures.

The proposed MCrossFormer surpasses existing state-of-the-art methods across multiple metrics, an advantage derived from two key innovations. First, through carefully designed attention allocation strategies, MCrossFormer balances the capture of local details and global trends. In practical PV forecasting applications, this mechanism enables the model to both sensitively respond to power fluctuations caused by sudden weather changes and accurately grasp seasonal generation patterns. Second, MCrossFormer uniquely employs three parallel processing paths, each specializing in different temporal patterns: trend, periodicity, and short-term behavior. This modular architecture enables specialized modeling for different types of PV generation characteristics: the trend path captures the slow decline in generation efficiency due to component aging; the periodicity path learns regular patterns caused by sunrise/sunset and seasonal changes; while the short-term path handles rapid fluctuations caused by weather transitions. By intelligently recombining these specialized analysis results, more accurate power prediction is achieved.

Table 3 Prediction performance comparisons on three real-world datasets.

Visualization analysis

In this section, we carry out a detailed visual analysis to assess the prediction accuracy of our proposed MCrossFormer model in comparison with the best-performing baseline, iTransformer. Fig. 4 presents one-month-ahead PV power forecasts for the Yamansu dataset. Figs. 5 and 6 show one-hour-ahead predictions for the Beitashan and Shichengzi datasets, respectively. For each dataset, the ground truth PV power values are shown as black solid lines, while predictions from MCrossFormer and iTransformer appear as red and blue solid lines, respectively. MCrossFormer demonstrates remarkable consistency in following actual PV power trends across both long-term (monthly) and short-term (hourly) forecasts, as shown in the figures. For instance, in Fig. 4, the red prediction line closely follows the black ground truth line, even during sharp peaks, drops, and rapid changes, particularly around data points 10–15 where power generation exhibits high volatility. MCrossFormer outperforms iTransformer in hourly prediction scenarios (Figs. 5 and 6) by more accurately capturing rapid power fluctuations with minimal lag. This improved responsiveness to sudden changes is particularly valuable for real-time grid operations. Unlike iTransformer which shows tendencies toward oversmoothing, our model’s multi-scale design provides the dual capability of maintaining short-term sensitivity while preserving long-term modeling capacity. This dual capability is particularly valuable for applications requiring both strategic planning and operational control. These visual comparisons underscore not only the model’s statistical accuracy but also its practical reliability in real-world PV power forecasting scenarios. Furthermore, MCrossFormer’s enhanced capacity to track complex temporal patterns indicates its adaptability to changing weather, seasonal shifts, and operational uncertainties frequently encountered in solar power generation.

Computational efficiency analysis

Suppose the sequence length is denoted by L and the feature dimension by D. In the attention mechanism, the computations involving the Q, K, V projections and the attention weight calculation have a time complexity of $O(L\cdot D^2 + L^2 \cdot D)$. The Feedforward Neural Network (FNN) contributes a time complexity of $O(L\cdot D^2)$. Thus, the overall computational complexity for each encoder-decoder module is $O(L\cdot D^2 + L^2 \cdot D)$. Because the number of parallel branches is fixed at three, the total computational complexity of MCrossFormer can be summarized as $\mathcal{O}(L^2 \cdot D + D^2 \cdot L)$.

As a result, training time grows quadratically with sequence length L and feature dimension D. The space complexity of MCrossFormer is $\mathcal {O}(L\cdot D^2)$, indicating that the parameter scale increases quadratically with the feature dimension and linearly with the sequence length.
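To see how the two terms of the time complexity behave, the sketch below evaluates them for concrete sizes. It counts only the leading-order terms named in the analysis, ignoring constant factors, the number of layers, and the number of heads; the function name `encoder_cost_terms` and the sequence lengths of 24 and 48 are assumptions chosen for illustration, while D = 128 follows the reported model dimension.

```python
def encoder_cost_terms(seq_len, d_model):
    """Leading-order operation counts for one attention block (constants dropped)."""
    projection = seq_len * d_model ** 2   # Q, K, V projections and feed-forward: L * D^2
    attention = seq_len ** 2 * d_model    # attention weight computation: L^2 * D
    return projection, attention


short = encoder_cost_terms(24, 128)
long_ = encoder_cost_terms(48, 128)
print(short)  # (393216, 73728)
print(long_)  # (786432, 294912)
# Doubling the sequence length doubles the projection term and quadruples the attention term.
print(long_[0] // short[0], long_[1] // short[1])  # 2 4
```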

Despite the model’s architectural sophistication, its complexity scales manageably, with both time and space requirements growing at controlled rates. This analysis indicates that the performance improvements achieved by MCrossFormer come with reasonable computational overhead, making it practically viable for real-world PV power forecasting applications.

Conclusion

This study focuses on PV module lifetime prediction and power degradation modeling through an integrated theoretical and computational approach. We first establish a comprehensive theoretical foundation for analyzing degradation mechanisms under multi-stress environmental conditions, explicitly characterizing the coupling relationships between distinct degradation modes. Building on this foundation, we propose a novel MCrossFormer architecture that includes three specialized encoder-decoder branches, each designed to model trend, seasonality, and short-term fluctuations, respectively. The integration of long-short distance attention mechanisms allows the model to simultaneously capture global dependencies and local patterns. Extensive validation on three real-world datasets demonstrates that the proposed model consistently outperforms state-of-the-art benchmarks in PV power forecasting.

Future work will focus on refining the lifespan prediction model for PV modules operating in extreme or unique environmental scenarios to further enhance model applicability. In parallel, with continued advancements in deep learning, we will also explore its broader applications in PV power prediction to improve forecasting performance.

Data availability

The datasets analysed during the current study, including the Yamansu Dataset, Dongfeng Dataset, and Shichengzi Dataset, are not publicly available due to licensing constraints and institutional agreements, but they are available from the corresponding author on reasonable request.

References

Machlev, R. et al. Explainable artificial intelligence (xai) techniques for energy and power systems: Review, challenges and opportunities. Energy and AI 9, 100169 (2022).
Wu, H. & Xu, Z. Multi-energy load forecasting in integrated energy systems: A spatial-temporal adaptive personalized federated learning approach. IEEE Transactions on Ind. Informatics 20, 12262–12274 (2024).
Cao, Yisheng, Yong, Pei, Yu, Juan & Yang, Zhifang. Stacking algorithm based framework with strong generalization performance for ultra-short-term photovoltaic power forecasting. Energy 322, 135599 (2025).
Cao, Yisheng, Liu, Gang, Luo, Donghua, Bavirisetti, Durga Prasad & Xiao, Gang. Multi-timescale photovoltaic power forecasting using an improved Stacking ensemble algorithm based LSTM-Informer model. Energy 283, 128669. https://doi.org/10.1016/j.energy.2023.128669 (2023).
Harbi, I. et al. Model-predictive control of multilevel inverters: Challenges, recent advances, and trends. IEEE Transactions on Power Electron. 38, 10845–10868 (2023).
Cao, Yisheng, Liu, Gang, Sun, Jian, Bavirisetti, Durga Prasad & Xiao, Gang. PSO-Stacking improved ensemble model for campus building energy consumption forecasting based on priority feature selection. J. Build. Eng. 72, 106589. https://doi.org/10.1016/j.jobe.2023.106589 (2023).
Mayer, M. J. & Yang, D. Pairing ensemble numerical weather prediction with ensemble physical model chain for probabilistic photovoltaic power forecasting. Renew. Sustain. Energy Rev. 175, 113171 (2023).
Shin, Dongha, Ha, Eungyu, Kim, Taeoh & Kim, Changbok. Short-term photovoltaic power generation predicting by input/output structure of weather forecast using deep learning. Soft Comput. 25, 771–783 (2021).
Deb, M., Yang, Z., Haas, K. & Wang, T. Hydrokinetic tidal energy resource assessment following international electrotechnical commission guidelines. Renew. Energy 229 (2024).
Li, Yifan et al. WNPS-LSTM-Informer: A Hybrid Stacking model for medium-term photovoltaic power forecasting with ranked feature selection. Renew. Energy 244, 122687 (2025).
Wu, H., Xu, Z., Wang, M. & Jia, Y. Full-model-free adaptive graph deep deterministic policy gradient model for multi-terminal soft open point voltage control in distribution systems. J. Mod. Power Syst. Clean Energy 12, 1893–1904 (2024).
Adli, H. K. et al. Forecasting and analysis of solar power output from integrated solar energy and iot system. In 2021 5th International Conference on Informatics and Computational Sciences (ICICoS), 222–226 (IEEE, 2021).
Zhang, Z et al. A Multi-Task End-to-End Multivariate Long-sequence Time Series Prediction Model for Load Forecasting. IEEE Transactions on Smart Grid 1–1, https://doi.org/10.1109/TSG.2025.3605653 (2025).
Prema, V., Bhaskar, M. S., Almakhles, D., Gowtham, N. & Rao, K. U. Critical review of data, models and performance metrics for wind and solar power forecast. IEEE Access 10, 667–688 (2021).
Wu, H. & Xu, Z. Prototype federated reinforcement learning for voltage regulation in distribution systems with physics-aware spatial-temporal graph perception. IEEE Transactions on Sustainable Energy (2025).
Husein, M., Gago, E., Hasan, B. & Pegalajar, M. Towards energy efficiency: A comprehensive review of deep learning-based photovoltaic power forecasting strategies. Heliyon 10, e33419. https://doi.org/10.1016/j.heliyon.2024.e33419 (2024).
Li, H., Ren, Z., Xu, Y., Li, W. & Hu, B. A multi-data driven hybrid learning method for weekly photovoltaic power scenario forecast. IEEE Transactions on Sustainable Energy 13, 91–100 (2021).
Aghaei, M. et al. Review of degradation and failure phenomena in photovoltaic modules. Renew. Sustain. Energy Rev. 159, 112160 (2022).
Singh, R., Sharma, M. & Yadav, K. Degradation and reliability analysis of photovoltaic modules after operating for 12 years: A case study with comparisons. Renew. Energy 196, 1170–1186 (2022).
Rajput, P. et al. A comprehensive review on reliability and degradation of pv modules based on failure modes and effect analysis. Int. J. Low-Carbon Technol. 922–937 (2024).
Al Mahdi, H., Leahy, P. G., Alghoul, M. & Morrison, A. P. A review of photovoltaic module failure and degradation mechanisms: causes and detection techniques. In Solar 4, 43–82 (2024).
Ullah Khan, Z. et al. A review of degradation and reliability analysis of a solar PV module. IEEE Access 12. https://doi.org/10.1109/ACCESS.2024.3432394 (2024).
Kaaya, I., Ascencio-Vásquez, J., Weiss, K.-A. & Topič, M. Assessment of uncertainties and variations in pv modules degradation rates and lifetime predictions using physical models. Sol. Energy 218, 354–367 (2021).
Khan, Z. U. et al. A review of degradation and reliability analysis of a solar pv module. IEEE Access 12, 185036–185056 (2024).
Wu, Z., Lv, S., Song, H. & Yun, M. Statistical modeling of uv-induced pv module power degradation based on acceleration tests. IEEE J. Photovoltaics 10, 144–149 (2019).
Di Leo, P., Ciocia, A., Malgaroli, G. & Spertino, F. Advancements and challenges in photovoltaic power forecasting: A comprehensive review. Energies 18, 2108 (2025).
Huang, C. & Li, Y. A short-term prediction method for pv power generation based on svm weather classification and pso-bp neural network. In 2023 IEEE 2nd International Power Electronics and Application Symposium (PEAS), 2544–2549 (IEEE, 2023).
Akram, M. W., Li, G., Jin, Y. & Chen, X. Failures of photovoltaic modules and their detection: A review. Appl. Energy 313, 118822 (2022).
Liu, W., Xu, B., Liu, Y., Li, S. & Yan, W. A field-function methodology predicting the service lifetime of photovoltaic modules. Renew. Sustain. Energy Rev. 192, 114266 (2024).
Malik, A. & Haque, A. Reliability and lifetime assessment of pv inverter with monte carlo approach. In 2023 Saudi Arabia Smart Grid (SASG), 1–7 (IEEE, 2023).
Wang, X. et al. Research progress of battery life prediction methods based on physical model. Energies 16, 3858 (2023).
Nassreddine, G., El Arid, A. & Nasseredine, M. Solar pv power prediction system based on machine learning approach. In 2023 IEEE international conference on Energy Technologies for Future Grids (ETFG), 1–7 (IEEE, 2023).
Wen, L., Zhou, K., Yang, S. & Lu, X. Optimal load dispatch of community microgrid with deep learning based solar power and load forecasting. Energy 171, 1053–1065 (2019).
Lin, W., Zhang, B., Li, H. & Lu, R. Multi-step prediction of photovoltaic power based on two-stage decomposition and bilstm. Neurocomputing 504, 56–67 (2022).
Lertwiputh, A., Watcharopas, C. & Wattuya, P. Forecasting of photovoltaic power using deep learning. In Proceedings of the 2024 6th Asia Conference on Machine Learning and Computing, 1–6, https://doi.org/10.1145/3690771.3690774 (2025).
Dimitriadis, C. N., Passalis, N. & Georgiadis, M. C. A deep learning framework for photovoltaic power forecasting in multiple interconnected countries. Sustain. Energy Technol. Assessments 77, 104330. https://doi.org/10.1016/j.seta.2025.104330 (2025).
Guo, W., Xu, L., Wang, T., Zhao, D. & Tang, X. Photovoltaic power prediction based on hybrid deep learning networks and meteorological data. Sensors 24, 1593 (2024).
Quansah, D. A., Adaramola, M. S., Appiah, G. K. & Edwin, I. A. Performance analysis of different grid-connected solar photovoltaic (pv) system technologies with combined capacity of 20 kw located in humid tropical climate. Int. J. Hydrog. Energy 42, 4626–4635. https://doi.org/10.1016/j.ijhydene.2016.10.119 (2017).
Vaswani, A. et al. Attention is all you need. In Adv. Neural Inf. Process. Syst. (2017).
Zhou, H. et al. Informer: Beyond efficient transformer for long sequence time-series forecasting. In The Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Virtual Conference, 35, 11106–11115 (AAAI Press, 2021).
Liu, Y. et al. itransformer: Inverted transformers are effective for time series forecasting. In International Conference on Learning Representations (2024).

Funding

This work was supported by the “Tianshan Talent” Research Project of Xinjiang (No. 022TSYCLJ0037), the National Natural Science Foundation of China (No. 62262065, 62341206, 62476233), the Science and Technology Program of Xinjiang (No. 2022B01008), the National Key R&D Program of China (No. 2022ZD0115800), and the Key R&D Program of Xinjiang (No. 2023B01033).

Author information

Authors and Affiliations

School of Computer Science and Technology, Xinjiang University, Urumqi, 830046, China
JiaWen Sun, WenZhong Yang, YaBo Yin, JinHai Sa, FuYuan Wei & Hui Ma
China Three Gorges Renewables (Group) Co., Ltd. Xinjiang Branch Company, Urumqi, 830011, China
JianQiang Wen
School of Economics and Management, Xinjiang University, Urumqi, 830046, China
JianLi Zhou
Engineering Research Center of Northwest Energy Carbon Neutrality (ERCNECN), Ministry of Education, Urumqi, 830046, China
JianLi Zhou

Contributions

Jiawen Sun, Yabo Yin, Jinhai Sa, and Fuyuan Wei wrote the main manuscript text, Jianqiang Wen provided experimental data from the company, Hui Ma revised the manuscript and added experiments, and Wenzhong Yang is the corresponding author, and he managed and funded its preparation. All authors reviewed the manuscript.

Corresponding author

Correspondence to WenZhong Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Sun, J., Yang, W., Yin, Y. et al. MCrossFormer: multi-level cross-scale transformer for photovoltaic power and lifespan prediction. Sci Rep 15, 43925 (2025). https://doi.org/10.1038/s41598-025-27638-x

Received: 06 August 2025
Accepted: 05 November 2025
Published: 16 December 2025
Version of record: 16 December 2025
DOI: https://doi.org/10.1038/s41598-025-27638-x
