iMOE: prediction of second-life battery degradation trajectory using interpretable mixture of experts

Huang, Xinghao; Tao, Shengyu; Liang, Chen; Tang, Yining; Chen, Jiawei; Shi, Junzhe; Li, Yuqi; Xia, Bizhong; Zhou, Guangmin; Zhang, Xuan

doi:10.1038/s41467-026-69369-1

Article
Open access
Published: 09 February 2026

Nature Communications volume 17, Article number: 2549 (2026)

Abstract

Retired electric vehicle batteries offer immense potential to support energy infrastructure stability in underdeveloped regions through second-life use, but uncertainties in battery degradation behaviors pose major safety concerns. This work proposes an interpretable mixture of experts (iMOE) network that predicts battery degradation trajectories using partial, field-accessible signals in a single cycling operation. iMOE leverages an adaptive multi-degradation prediction module to classify battery degradation modes using expert weight synthesis learned from battery capacity-voltage and relaxation data. The module produces latent degradation trend embeddings, which are input to a use-dependent recurrent network for long-term degradation trajectory prediction. Validated on three typical use patterns (consistent operating histories, deeply aged batteries with unknown prior use, and uncertain second-life conditions), covering 295 batteries, 93 use conditions, and 84,213 cycles, iMOE achieves an average mean absolute percentage error (MAPE) of 0.95% with a 0.43 ms inference time for life-long battery degradation trajectory prediction. Compared to state-of-the-art Informer and PatchTST, it reduces computational time and MAPE by 50% and 77%, respectively. Compatible with data sampling in random state of charge regions, iMOE supports degradation trajectory prediction over a 150-cycle time horizon with 1.50% and 6.26% MAPE on average and at maximum, respectively. Notably, iMOE can operate effectively even with training data pruned to 5 MB while retaining 0.95% MAPE. Broadly, this network offers a deployable, history-free solution for battery degradation trajectory prediction at the time of second-life deployment, redefining how second-life energy storage systems are sensed, evaluated, controlled, and integrated for sustainable energy infrastructures at scale.

Introduction

Lithium-ion batteries have been the core energy storage medium for electric vehicles (EVs) and renewable energy systems¹. However, it is projected that by 2030, 120 GWh of in-service batteries will reach end of life (EOL) and be retired from primary uses such as EVs, raising considerable economic and environmental concerns^2,3. Reusing and recycling these retired EV batteries has been identified as an emerging solution to improve affordability and sustainability across the battery lifecycle, particularly for energy infrastructures in underdeveloped regions^4,5,6,7.

Degradation trajectory is a widely adopted indicator for characterizing battery degradation and can be critical to safety during extended reuse or repurposing, i.e., the second life of retired batteries^8,9,10,11. For example, retired batteries undergo degradation trajectory evaluation before second-life use based on cell-, module-, or pack-level application requirements^12,13. Conventional non-data-driven methods rely on destructive disassembly or full-cycle capacity tests^14,15, inflicting extra damage or requiring additional labor. Notably, these state measurement methods are economically impractical for retired batteries because of prohibitive time, monetary, and environmental costs at scale^1,16,17,18. Data-driven methods have shown promise in predicting degradation trajectories from non-destructive electrical signal data, but they typically require 5–40% of full-lifecycle historical data^19,20,21,22. Moreover, uncertain second-life use conditions can differ significantly from those in first life, rendering conventional degradation trajectory prediction models ineffective: their training data come from constant-condition tests or typical dynamic cycling tests, whereas the degradation trajectory is highly path-dependent^23,24,25. Lu et al. leveraged at least one full cycle of capacity-voltage curves to predict degradation trajectories under uncertain future operating scenarios by learning the relationship between use condition and battery capacity²³. Yet the method is limited by the stochastic state of charge (SOC) of retired batteries and by its demanding requirement for full discharge-charge data. 
Moreover, the unavailability of historical cycling data of retired batteries complicates the tracing of initial conditions of degradations, such as impedance rise, loss of lithium inventory, and loss of active material^26,27,28,29, challenging degradation trajectory prediction effectiveness. At the policy level, global initiatives such as the Battery Data Genome Initiative³⁰ and the European Commission’s July 4, 2025, Delegated Regulation on battery recycling³¹ aim to enhance data standardization, traceability, and material recovery across the battery lifecycle. However, policy advances still face practical challenges due to frequent ownership change and uncertain second-life use conditions of retired batteries, emphasizing the need for degradation prediction frameworks that rely solely on field-accessible data and adapt to diverse operational environments.

Additionally, because second-life batteries degrade in a highly non-linear manner, incorporating physics knowledge into models has gained increasing attention. Recent advances in sensor-based measurements include X-ray imaging³², electrochemical impedance, optical fiber sensing³³, acoustic sensing³⁴, and partial charging³⁵, which serve as data inputs to data-driven methods. Nevertheless, most sensing techniques remain at the laboratory stage and are invasive³³. Meanwhile, purely data-driven methods struggle to capture internal physical information of retired batteries³⁶. Given that the internal physical state is particularly critical for safety-sensitive second-life applications, the “black-box” nature and lack of interpretability of data-driven methods for battery state prediction should be addressed. Although progress has been made in physics-informed neural networks^37,38, the promise lies in the additional and explicit integration of physical laws into the feature engineering process^39,40, neural network loss functions³⁶, and transferability metrics⁴¹. However, the complex physical constraints involve numerous parameters and prior physical understanding from the modeling of retired batteries, which are still hardly available post-retirement¹². Tao et al. employed physics-informed machine learning to achieve full degradation trajectory prediction using early-cycle data, reducing data requirements while still relying on historical data²². In recent years, the mixture-of-experts (MOE) architecture, a core component in large language models, has demonstrated notable performance in learning heterogeneous data representations. By decomposing complex tasks into sub-tasks handled by “expert” modules with specialized task knowledge, MOE captures multi-level and nonlinear feature representations. The idea of decomposing a task into multiple sub-tasks naturally aligns with the coupled degradation mechanisms of batteries. 
However, in the absence of physical interpretability, the learning outcomes of the model are confined to purely statistical correlations and fail to reflect the true physical mechanisms underlying degradation processes, particularly with diverse cathode chemistries⁴², historical usages⁴³, and uncertain second-life operating conditions²³. Incorporating physical knowledge into the MOE architecture anchors data-driven learning outcomes to typical electrochemical processes, thereby enhancing model interpretability and enabling mechanism-aware degradation trajectory prediction.

To fill this gap, this work proposes an interpretable mixture of experts (iMOE) network for predicting degradation trajectories of retired batteries under uncertain second-life use complexities without requiring historical data. As shown in Fig. 1a, the model uses electrical signal data from randomly sampled SOC regions, acquired where retired batteries are collected, and interactively incorporates assumed future use conditions to predict degradation trajectories over extended time horizons in future second-life use. Figure 1b illustrates the designed adaptive multi-mode degradation prediction (AMDP) module, which integrates the weights of multiple expert systems learned from capacity-voltage and relaxation data to classify the degradation modes of retired batteries. AMDP outputs latent degradation trend embedding vectors that are forwarded into a feature-operational recurrent neural network (FORNN), enabling interpretable and extendable prediction of degradation trajectories. Figure 1c illustrates the model’s applications in battery reuse and recycling, highlighting a safety-aware allocation of second-life batteries based on their predicted degradation trajectories toward a sustainable battery circular economy. Without historical data, the proposed approach achieves an average mean absolute percentage error (MAPE) of 0.88% in predicting degradation trajectories under uncertain and diversified second-life use scenarios, spanning 295 batteries, 93 use conditions, and 84,213 cycles. Notably, the required input data can be easily extracted from a random SOC region, without charging or discharging retired batteries to a prescribed SOC level. The proposed method can extend degradation trajectory prediction over a 150-cycle horizon in future second-life use with a MAPE of 1.50%. 
This work underscores transformative potential of incorporating physics prior principles into predictive health management of retired batteries, enabling broader proactive fault detection and reliability assessment across complex, safety-critical systems, such as large-scale energy infrastructures, with careful consideration of durability, safety, and sustainable operation under data-scarce and data-heterogeneous conditions.

**Fig. 1: Model motivation, model architecture, and model deployment.**

Results

Methodology overview

The objective of this study is to compute battery degradation trajectories under uncertain future use conditions using one cycle of data collected at random SOC regions. We use partial charging curves, relaxation voltage, and future conditions (charge/discharge current, temperature) as required model inputs (specific extraction methods are detailed in “Methods”). Input signal selection prioritizes two critical aspects: first, the controllability of the charging process in second-life batteries makes signal acquisition easier; second, relaxation voltage measurements remain unaffected by charge/discharge operations, thus enhancing model applicability. Six physics-informed features extracted from both relaxation voltage and partial charging curves (detailed in Supplementary Figs. 1–6 and Supplementary Notes 1–2) characterize polarization and degradation states. These features drive the model to classify batteries with different aging modes into distinguishable latent subspaces. Notably, physics-informed feature extraction can extend beyond the charging and relaxation voltage curves discussed here; the extracted features serve as the inputs of the AMDP module.

A two-phase collaborative mechanism achieves deep integration of physical insights and data-driven approaches: (1) AMDP, and (2) FORNN. The AMDP module employs “degradation-router” routing to assign expert network weights based on the physical features, adaptively modeling degradation modes under dominant mechanisms for degradation trend representation⁴⁴, which can be formulated as:

$$\mathrm{Trend}_{i}=\sum_{j=1}^{N_{\mathrm{expert}}}G_{j}\left(\boldsymbol{F}_{i}\right)\cdot \mathrm{Expert}_{j}\left(\boldsymbol{Q}_{i}\right)$$

(1)

where $\mathrm{Trend}_{i}$ denotes the computed degradation trend vector for the $i$-th battery sample. It is important to note that “Trend” is a high-dimensional feature vector embedded through a deep neural network, encoding the latent feature representation of the battery’s current health state and short-term degradation trend. Its length is the same as that of the predicted degradation trajectory. $\boldsymbol{F}_{i}$ represents the physics-informed features extracted from the field cycle, and $\boldsymbol{Q}_{i}$ is the partial charging curve (possibly with random initial SOC) for the field cycle. $N_{\mathrm{expert}}$ is the total number of expert networks, and $G_{j}\left(\boldsymbol{F}_{i}\right)$ is the router weight associated with $\mathrm{Expert}_{j}$, determined by a degradation-router mechanism. $\mathrm{Expert}_{j}\left(\boldsymbol{Q}_{i}\right)$ denotes a specific degradation trend estimator under the $j$-th dominant degradation mode.
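The weighted sum in Eq. (1) can be illustrated with a minimal NumPy sketch. The softmax router, the linear experts, and all array sizes except the six features and five experts mentioned in the text are assumptions made for illustration; the actual router and expert networks are deep networks whose details are not given here.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax, used here as a stand-in router (assumption)."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def moe_trend(features, charge_curve, router_w, expert_ws):
    """Eq. (1): Trend_i = sum_j G_j(F_i) * Expert_j(Q_i).

    features     : F_i, physics-informed features, shape (n_feat,)
    charge_curve : Q_i, partial charging curve, shape (curve_len,)
    router_w     : hypothetical linear router weights, shape (n_expert, n_feat)
    expert_ws    : hypothetical linear experts, each of shape (horizon, curve_len)
    """
    gate = softmax(router_w @ features)                           # G_j(F_i)
    expert_out = np.stack([w @ charge_curve for w in expert_ws])  # Expert_j(Q_i)
    trend = gate @ expert_out                                     # weighted sum over experts
    return trend, gate

rng = np.random.default_rng(0)
n_expert, n_feat, curve_len, horizon = 5, 6, 32, 10
features = rng.normal(size=n_feat)
charge_curve = rng.normal(size=curve_len)
router_w = rng.normal(size=(n_expert, n_feat))
expert_ws = [rng.normal(size=(horizon, curve_len)) for _ in range(n_expert)]

trend, gate = moe_trend(features, charge_curve, router_w, expert_ws)
print(gate.round(3), trend.shape)
```

Because the router sees only the features while the experts see only the charging curve, the gate vector can be inspected on its own, which is what the later expert-weight analysis relies on.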

The FORNN module iteratively integrates AMDP predictions regarding latent degradation trends with assumed future load conditions cycle-by-cycle, enabling proactive degradation trajectory computation without historical data, which can be formulated as:

$$\hat{\boldsymbol{S}}_{i}=\mathrm{FORNN}\left(\mathrm{Trend}_{i},\boldsymbol{Cond}_{i}\right)$$

(2)

where $\hat{\boldsymbol{S}}_{i}$ represents the final predicted degradation trajectory (e.g., capacity fade curve) for the $i$-th battery sample over the computation time horizon, $\mathrm{Trend}_{i}$ is the short-term trend output from the AMDP module, and $\boldsymbol{Cond}_{i}$ contains future usage conditions such as charge/discharge current, which can be assumed available. The FORNN leverages its sequential modeling capability to capture how future load profiles affect long-term degradation starting from the current state.
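A cycle-by-cycle rollout in the spirit of Eq. (2) might look like the sketch below. It uses a plain tanh recurrent cell with random weights as a stand-in; the real FORNN cell, its parameters, and the normalization of the condition values are not specified in this section, so every name and number here is hypothetical.

```python
import numpy as np

def rollout(trend, conditions, w_h, w_x, w_out):
    """Unroll a simple tanh RNN over future cycles (illustrative stand-in for FORNN).

    trend      : Trend_i from the AMDP module, shape (horizon,)
    conditions : Cond_i, assumed future conditions per cycle, shape (horizon, n_cond)
    w_h, w_x   : hypothetical recurrent and input weights
    w_out      : hypothetical readout weights, shape (hidden,)
    """
    h = np.zeros(w_h.shape[0])                             # history-free initial state
    preds = np.empty(len(trend))
    for t in range(len(trend)):
        x_t = np.concatenate(([trend[t]], conditions[t]))  # trend value + conditions
        h = np.tanh(w_h @ h + w_x @ x_t)                   # state carries past cycles forward
        preds[t] = w_out @ h                               # predicted value for cycle t
    return preds

rng = np.random.default_rng(1)
horizon, n_cond, hidden = 10, 3, 8
trend = rng.normal(size=horizon)
w_h = 0.3 * rng.normal(size=(hidden, hidden))
w_x = 0.3 * rng.normal(size=(hidden, 1 + n_cond))
w_out = rng.normal(size=hidden)

# Two assumed future scenarios: (charge rate, discharge rate, temperature), normalized.
cond_a = np.tile([0.33, 1.0, 0.0], (horizon, 1))
cond_b = np.tile([1.00, 1.0, 1.0], (horizon, 1))

preds_a = rollout(trend, cond_a, w_h, w_x, w_out)
preds_b = rollout(trend, cond_b, w_h, w_x, w_out)
print(np.abs(preds_a - preds_b).max())
```

The same trend vector yields different trajectories under the two scenarios, which is the behavior the text attributes to conditioning on future load profiles.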

These two modules have distinct but complementary strengths: AMDP computes latent degradation trends from learned degradation modes, while FORNN deals with second-life use complexities (see Supplementary Fig. 7 for the detailed machine learning pipeline). The degradation routing simultaneously considers physics-informed features reflecting the degradation modes currently present and future load profiles to select optimal experts, forming a closed-loop architecture of physics-driven specialization and data-driven fusion. Theoretically, iMOE’s design philosophy benefits from the success of collaborative learning^1,45, where the weighted collaboration of multiple networks enhances prediction robustness^46,47,48. In principle, iMOE applies to degradation trajectory computation across the entire lifecycle under arbitrary use conditions, including performance computation in other lifetime stages despite considerably different internal degradation modes.

Dataset

This work addresses the challenge of computing degradation trajectories for retired batteries under uncertain future use conditions using only partial cycling data collected in the field, i.e., no historical data is assumed. To validate computation performance when historical and future operating conditions are consistent but historical data is unavailable, we select the Uniform-Life (UL) dataset⁴⁹. To validate computation performance when the historical operating conditions are unknown and the batteries are in a deeply aged state, we select the late-stage degradation (LSD) dataset⁴³. To validate computation performance when the second-life use conditions change significantly compared with their first-life counterparts, we select the two-phase second-life (TPSL) dataset²³.

The UL dataset comprises three batches totaling 130 commercial 18650 cells tested under different use conditions, with each batch cycled under consistent conditions throughout its entire lifespan until EOL. Batch 1 consists of cells with a LiNi$_{0.86}$Co$_{0.11}$Al$_{0.03}$O$_{2}$ positive electrode (NCA battery), 3500 mAh nominal capacity, and cutoff voltages of 2.65–4.2 V. Batch 2 contains cells with a LiNi$_{0.83}$Co$_{0.11}$Mn$_{0.07}$O$_{2}$ positive electrode (NCM battery), 3500 mAh nominal capacity, and cutoff voltages of 2.5–4.2 V. Batch 3 includes cells with a positive electrode of 42 (3) wt% Li(NiCoMn)O$_{2}$ blended with 58 (3) wt% Li(NiCoAl)O$_{2}$ (NCM + NCA battery), 2500 mAh nominal capacity, and cutoff voltages of 2.5–4.2 V. All cells underwent cycling in thermal chambers at three temperatures (25 °C, 35 °C, 45 °C) with variable charge rates (from 0.25 C to 4 C) and a fixed 1 C discharge rate. Each battery experienced identical operational profiles throughout its full lifecycle until reaching EOL (see Supplementary Table 1 for details). Degradation trajectories in Fig. 2a demonstrate cycling variability.

The LSD dataset contains 86 commercial batteries with LiNi$_{0.5}$Co$_{0.2}$Mn$_{0.3}$O$_{2}$ cathodes and graphite anodes. These 2.4 Ah batteries (3.7 V nominal) underwent two-phase testing. In Phase 1, the batteries were grouped into 16 distinct charge-discharge protocols to induce diverse degradation behaviors as the state of health (SOH) decreased from 100% to 80% (details in Supplementary Table 2). In Phase 2, all 86 cells were re-cycled from 80% to 50% SOH under a unified low-rate protocol (0.5 C charge/0.2 C discharge) to simulate stationary energy-storage operation. Degradation trajectories in Fig. 2b demonstrate Phase 2 cycling variability. We predict the deep degradation trajectories of the Phase 2 batteries under unknown and varying historical operating conditions.

The TPSL dataset contains 77 batteries in 18650 format with LiCoO$_{2}$ and LiNi$_{0.5}$Co$_{0.2}$Mn$_{0.3}$O$_{2}$ cathodes and graphite anodes, subjected to diverse second-life scenarios in which they undergo both randomized and standardized operating conditions following their primary usage phase. These 2.4 Ah batteries (3.7 V nominal, 3.0–4.2 V cutoffs) underwent two-phase testing. Phase 1 involved 20 cycles of 0.5 C constant-current constant-voltage (CCCV) charging and 2 C discharging. Phase 2 divided the batteries into two subgroups: (1) TPSL-Random (55 cells), subjected to stochastic charging profiles in which the current rate (1 C/2 C/3 C) changed randomly every 5 cycles following a uniform distribution, combined with a fixed 3 C discharge; (2) TPSL-Fixed (22 cells), cycled under predetermined charge/discharge current combinations (1 C/2 C/3 C) (details in Supplementary Table 3). Figure 2c shows degradation trajectories under uncertain use conditions. In the TPSL-Fixed dataset, different magnitudes of charge and discharge currents lead to vastly different degradation trajectories. Meanwhile, in the TPSL-Random dataset, variations in the magnitude of the charging current can cause the maximum discharge capacity to differ by more than 0.3 Ah between adjacent cycles, invalidating the assumption of an identical operational history for reliable prediction. Figure 2d reveals substantial capacity dispersion at cycles 21, 60, and 100 after the load variation, demonstrating how uncertain loads interact with battery degradation. Figure 2e details the input data construction process, where each sample combines raw charging curve sequences with physics-informed features (see “Methods” (data processing) and Supplementary Note 2).

Model performance and generalization capability

To validate the robustness of the proposed method under unknown historical data and uncertain second-life use conditions, we conduct evaluations in four real-world application scenarios (UL, LSD, TPSL-Random, TPSL-Fixed). iMOE adaptively computes nonlinear degradation modes under uncertain use conditions through the AMDP module; see Supplementary Table 4 for the associated model parameters.

Performance metrics (RMSE, MAPE, R²) are defined in “Methods” (Evaluation metric). We emphasize that all experiments use current-cycle data with random initial SOC regions extracted from field conditions to compute degradation trajectories spanning dozens to hundreds of cycles, rather than EOL points (see the prediction horizon selection in Supplementary Note 3). Because this is a time-series computation task, we benchmark model performance against the state-of-the-art (SOTA) sequence computation models^50,51 PatchTST and Informer (see Supplementary Note 4 and Supplementary Table 4 for detailed information). As illustrated in Fig. 3a, existing research demonstrates that battery degradation can be divided into three typical yet distinct phases: solid electrolyte interphase (SEI) formation, SEI thickening, and lithium plating^8,52. During the battery’s lifecycle, the formation and thickening of the SEI layer affect capacity degradation in the early stages, while lithium-ion deposition typically occurs in the later stages of use and is one of the main causes of sharp capacity decline. However, it is important to note that the capacity knee at the EOL is not always directly related to lithium plating. In some cases, the capacity decline at the EOL may also be influenced by other degradation mechanisms, such as SEI thickening, loss of active material, or electrolyte degradation, which can contribute to the sharp capacity drop observed in later stages. The degradation patterns across these phases differ significantly in both available capacity and subsequent aging rates, especially in the slope of the degradation trajectory, with each phase reflecting distinct degradation behaviors^53,54. Supplementary Fig. 8 illustrates that the definition of degradation stages and mechanisms can be regarded as typical and reasonable. 
We randomly select representative samples from each degradation stage to analyze performance, where each sample reflects distinct dominant degradation modes with varying capacity retention and aging rates. While PatchTST and Informer show errors across early, middle, and late degradation phases, iMOE demonstrates superior stability, delivering accurate current-cycle capacity estimation while generating trajectory predictions consistent with phase-specific degradation rates. Supplementary Figs. 9–11 validate iMOE’s effective prediction capability across full lifecycle samples.
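For reference, the three metrics named above can be computed as in the following sketch. It uses the standard textbook definitions, which is an assumption: the exact formulas are given in “Methods” (Evaluation metric), not in this section, and the capacity values below are made up for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, in the units of y (e.g., Ah)."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical capacity trajectory (Ah) and a prediction that is 1% too high everywhere.
y_true = np.array([2.0, 1.9, 1.8, 1.7])
y_pred = 1.01 * y_true

print(rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

A uniform 1% overestimate gives a MAPE of about 1%, which gives a sense of scale for the sub-1% to 3% values reported in this section.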

Figure 3b reveals that on the TPSL data, baseline methods show limited capability in capturing long-term trends after future load changes and fail completely during initial load transitions. In contrast, iMOE effectively identifies transitional characteristics between phases by integrating future usage conditions. Supplementary Figs. 12–15 further demonstrate iMOE’s successful learning of relationships between future operating conditions and capacity performance. On the LSD dataset, the baseline models exhibit larger errors in predicting the later stages of degradation trajectories. This is mainly due to the unknown and varying operating conditions in the first phase, which result in highly heterogeneous degradation behaviors during deep aging. In contrast, our proposed method, benefiting from the AMDP module, achieves more stable and reliable prediction performance.

Figure 3c demonstrates that iMOE exhibits significant and stable advantages in both MAPE and R² in the cross-dataset performance comparison. For the UL dataset, iMOE achieves its worst prediction result on the NCA-25-1-1 dataset with an average MAPE of 0.98%, while maintaining MAPE below 1.00% across all UL conditions, showcasing robust prediction performance. PatchTST delivers the second-best results with an average MAPE of 0.85% and a maximum MAPE of 1.55% in the UL dataset, indicating that its patch-based tokenization strategy captures aging variations more effectively than single-point sampling. Informer performs the worst in this scenario, with an average MAPE of 1.48% and a maximum MAPE of 2.13%, suggesting that its sparse attention mechanism may lose critical information and fail to detect subtle temporal differences in voltage-capacity curves during such extreme prediction tasks. In the LSD dataset, iMOE achieves an average MAPE of 1.82% for batteries in the deep degradation stage, whereas Informer reaches an average MAPE of 2.68%. This result indicates that purely data-driven models struggle to make accurate predictions under the unknown and varying operating conditions of first-phase usage. In the TPSL dataset, iMOE shows even greater performance advantages over baseline methods, achieving an average RMSE of 0.05 compared to 0.38 for PatchTST and 0.39 for Informer, further confirming the necessity of incorporating future operating conditions in second-life applications.

In Fig. 3d, the scenario analysis confirms that iMOE achieves an average MAPE of 0.52% with a standard deviation of 0.28% for the UL dataset, 1.81% with a standard deviation of 0.13% for the LSD dataset, 2.96% with a standard deviation of 0.44% for the TPSL-Fixed dataset, and 2.81% with a standard deviation of 0.15% for the TPSL-Random dataset. iMOE performs better on the UL dataset with continuous operating history than on the LSD dataset with unknown historical conditions and deep degradation, or on the TPSL dataset with highly variable operating conditions (see Supplementary Table 5 for detailed information). This is primarily because continuous operation provides more stable degradation patterns, allowing the model to capture battery aging behaviors more accurately. In contrast, the deep degradation in the LSD dataset exhibits stronger nonlinearity due to differences in aging mechanisms, while the random load switching in the TPSL dataset introduces additional cycle-level nonlinear complexities, increasing the difficulty of degradation trajectory computation. Nevertheless, all MAPE values remain below 3.00%, demonstrating that the proposed model can effectively compute degradation trajectories under uncertain future conditions using only current-cycle data without relying on historical records.

In Fig. 3e, we analyze the relationship between model computational efficiency and performance to meet the real-world deployment requirements. After training, iMOE requires 0.43 ms to infer the entire degradation trajectory in the coming hundreds of cycles incorporating future operating conditions for a single retired battery, while achieving an MAPE of 0.93%. In comparison, baseline methods PatchTST and Informer required 0.58 ms and 0.87 ms per battery, with MAPEs of 4.15% and 4.73%, respectively. iMOE thus achieves the most robust prediction accuracy while emphasizing lightweight real-world deployment.
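As a worked check on these figures, and assuming the relative reductions are computed directly from the values quoted above, pairing inference time against Informer and MAPE against PatchTST gives:

$$1-\frac{0.43}{0.87}\approx 50.6\%,\qquad 1-\frac{0.93}{4.15}\approx 77.6\%$$

which is in line with the roughly 50% and 77% reductions stated in the abstract. The other two pairings follow in the same way: $1-0.43/0.58\approx 25.9\%$ for inference time against PatchTST and $1-0.93/4.73\approx 80.3\%$ for MAPE against Informer.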

Unlike existing approaches that rely on fixed voltage or SOC regions and require time-consuming SOC recalibration, we validate iMOE’s robustness across five random initial SOC regions. In Fig. 3f, model prediction performance was relatively poorer at an initial SOC of 50%, with MAPEs of 1.44%, 3.35%, and 3.43% for the UL, TPSL-Random, and TPSL-Fixed datasets respectively, accompanied by standard deviations of 0.84%, 0.14%, and 0.50%. Reducing the initial SOC to 20% improved average performance by 63.8%, 16.2%, and 13.7%, respectively.
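Reading the reported improvements as relative reductions in MAPE (an assumption, since the text does not spell out the formula), the MAPEs implied at an initial SOC of 20% can be worked out as follows:

$$\begin{aligned}
\text{UL:}\quad & 1.44\%\times(1-0.638)\approx 0.52\%\\
\text{TPSL-Random:}\quad & 3.35\%\times(1-0.162)\approx 2.81\%\\
\text{TPSL-Fixed:}\quad & 3.43\%\times(1-0.137)\approx 2.96\%
\end{aligned}$$

These values agree with the per-dataset averages reported for Fig. 3d (0.52%, 2.81%, and 2.96%).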

The observed performance disparity has two causes. From a machine learning perspective, less training data is available at higher initial SOC. More fundamentally, from a physicochemical standpoint, high-SOC data segments lack adequate open-circuit voltage feature information, which hinders the model’s ability to differentiate degradation modes such as lithium inventory loss and active material loss. Furthermore, as the computations for the UL dataset do not involve cycle-level condition changes, the model outcomes rely more heavily on the data volume at the initial SOC, resulting in more pronounced performance degradation as the initial SOC increases.

Rationalization of statistical model performance

Here we explore the fundamental mechanisms underlying performance improvements. In Fig. 4a, we first randomly select NCA battery test samples from the UL dataset to visualize the weight assignments of the degradation router throughout their full lifecycle. Without requiring historical data, the degradation router partitions battery lifecycle degradations into three distinct phases using partial cycling data, which is evidenced by existing literature^8,52. It is observed that early-retirement samples are predominantly governed by Expert Networks 1 and 4, mid-life retired batteries exhibit higher weights for Expert Networks 2 and 3, while late-life retired batteries are primarily regulated by Expert Network 5. These results demonstrate iMOE’s capability to identify evolving degradation patterns, with clear functional specialization among expert networks for different degradation stages. Mechanistically, this suggests Expert Networks 1, 3, and 5, respectively, dominate initial SEI layer formation, subsequent thickening, and lithium plating processes, consistent with prior research on primary battery degradation modes⁵². Expert Network 4 maintains high weights during both early and late degradation phases, potentially reflecting its statistical correlation with high degradation rates, i.e., slope of the degradation trajectories, rather than specific mechanistic contributions.

Figure 4b takes Battery 1 as an example to deeply analyze the dynamic correlation between individual cycling samples and expert network weight assignments during capacity degradation. Single-cycle samples at similar degradation stages exhibit clustering characteristics in expert weight allocation, demonstrating highly consistent expert category selection. Meanwhile, the degradation router adaptively adjusts the weights of different expert networks based on the features of individual cycling samples, thereby achieving differentiated prediction of degradation variations both between cycles and across degradation stages. Supplementary Fig. 16 further confirms the observed expert weight dependencies across different battery material systems and second-life use conditions.

We perform t-SNE dimensionality reduction on expert weights across the entire test battery cohort (details in Supplementary Note 5) to visualize the separability of decision boundaries among expert networks. In Fig. 4c, despite significant inter-cell variability, samples from early and late degradation stages form distinct decision boundaries, and Supplementary Figs. 17–19 demonstrate that iMOE generalizes across battery material systems and operating conditions. These findings raise a critical question: can the trained iMOE model directly classify retired batteries based on degradation router-generated weight patterns? In essence, answering this question is an interpretability analysis that shows why AMDP works well.

In Fig. 4d, for the 130 UL dataset batteries retired at different SOH levels (classified solely based on expert weights, see Supplementary Note 6 for criteria), those retired at 95% SOH achieve an “Excellent” classification confidence of 91.5%, while batteries retired at 75% SOH receive a “Degraded” rating of 96%. This further demonstrates that the model uses different primary experts for prediction based on the dominant degradation mechanisms at various stages, thereby enabling the reverse classification of retired batteries at different degradation stages according to expert selection. Supplementary Figs. 20–22 and Supplementary Table 6 visualize the expert weight distributions across different SOH levels, with marked differences that further demonstrate the novelty and simplicity of the AMDP module, which can classify the degradation mode using a few electrical measurements. Figure 4e examines post-classification computation accuracies of degradation trajectories. iMOE achieves MAPEs of 0.47%, 0.52%, and 0.47% for batteries retired at 95%, 85%, and 75% SOH, respectively. While predictions remain robust for 95%–85% SOH, samples at 75% SOH show the largest outliers, likely due to coupling of accumulated and other unobserved degradation mechanisms.

In Fig. 4f, controlled experiments with noise addition validate feature importance and expert network specialization. By individually applying Gaussian noise ($\sigma=1$) to each physical feature while holding the other features at their original values, we observe errors elevated above the baseline MAPE of 1.13%. Supplementary Figs. 23–24 present the performance analysis under conditions of missing values and reduced feature inputs. For the same amount of injected noise, the feature whose perturbation causes the largest decrease in degradation trajectory computation accuracy is regarded as the most important. The results reveal that features extracted from the relaxation voltage, which reflects internal resistance, are more resilient to noise than those from capacity-voltage curves. The mean values of the capacity-voltage curves are particularly sensitive: after adding noise, the MAPE increases from 1.13% to 1.38%, a 22% relative increase, because these means directly reflect charge storage capacity per voltage increment. The performance decrease caused by misrouting samples to expert networks not specialized for their specific degradation phases (e.g., late-stage samples erroneously routed to Expert Network 1 instead of 5) confirms the functional divergence among expert networks, enabling history-free computation of the degradation mode. Supplementary Figs. 25–26 show reduced specialization when degradation trajectories are geometrically similar, while demonstrating that iMOE’s superior performance does not strictly rely on any single physical feature. Supplementary Fig. 27 and Supplementary Note 7 present the SHAP and Spearman importance analysis for different features. We further analyze the model results after removing features with lower SHAP importance in Supplementary Fig. 28. The results indicate that, in certain scenarios, redundant features can be removed to balance computational efficiency and accuracy.
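The one-feature-at-a-time perturbation procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `predict` stands in for any trained model, the toy linear model and the data are synthetic, and the function name `noise_sensitivity` is hypothetical.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)) * 100.0)

def noise_sensitivity(predict, X, y, sigma=1.0, seed=0):
    """Perturb one feature at a time with Gaussian noise and return the
    MAPE increase over the unperturbed baseline for each feature."""
    rng = np.random.default_rng(seed)
    baseline = mape(y, predict(X))
    increases = []
    for j in range(X.shape[1]):
        X_noisy = X.copy()
        # Only feature j is perturbed; all other features keep their values.
        X_noisy[:, j] += rng.normal(0.0, sigma, size=X.shape[0])
        increases.append(mape(y, predict(X_noisy)) - baseline)
    return baseline, np.array(increases)

# Toy stand-in model (assumption): the output depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
weights = np.array([0.5, 0.05, 0.0])
predict = lambda X: 10.0 + X @ weights

rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(500, 3))
y = predict(X)  # noiseless labels, so the baseline MAPE is zero

baseline, increases = noise_sensitivity(predict, X, y)
ranking = np.argsort(-increases)  # most important feature first
```

In this toy setting the ranking recovers the order of the weights, which mirrors the criterion used in the text: the feature whose perturbation raises the error most is treated as the most important.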

In Fig. 4g, we investigate the contribution of the AMDP module to prediction accuracy using an ablation experiment. When AMDP is replaced with a simple linear layer while the other components are kept unchanged, comparative analysis of the embedded representations (see Supplementary Note 8 for details) shows that the ablated model can roughly distinguish degradation stages but fails to capture subtle SOH variations within similar stages. This results in discontinuous embeddings and unreasonable trajectory predictions, highlighting AMDP’s critical role in enabling robust long-term predictions under history-free settings. Supplementary Fig. 29 directly shows the magnitudes of different cycling trend metrics. By decoding the implicit semantics of expert weights, iMOE transforms from a prognostic tool into an interpretable decision engine, bypassing traditional sequential assessment workflows that require extensive data curation and model calibration in real-world deployment.

Uncertainty analysis

Under the assumption of available historical data and consistent second-life operational conditions, we extend our analysis to scenarios where partial historical data is accessible, using polynomial fitting and MLP-based approaches to map historical trajectories to future degradation (see implementation details in Supplementary Note 9). While this assumption significantly simplifies the prediction task, it does not align with the complex data composition of real-world applications. In Fig. 5a, without requiring battery-specific mechanistic analysis, a simple polynomial regression and a single-layer MLP achieve MAPEs of 0.79% and 0.93%, respectively. We also demonstrate iMOE’s capability to compute degradation trajectories from historical maximum capacity data to validate its adaptability to data-accessible scenarios (see Supplementary Fig. 30 and Supplementary Note 10 for implementation details). With 10 historical maximum capacity points accessible, iMOE achieves an average MAPE of 0.53% when predicting the next 50 cycles. We stress that these history-dependent methods become infeasible when historical data is unavailable in real-world settings, underscoring the practical limitations of such assumptions.

To quantify the contributions of AMDP and FORNN, we designed two ablation studies: (1) replacing AMDP with a linear layer (iMOE-linear); and (2) substituting FORNN with a standard RNN (iMOE-woFO). After replacing the AMDP architecture with a linear layer, the overall MAPE increased from 0.95% to 1.19%, which aligns with the observation in Fig. 4e that the embedding vectors became indistinguishable between different degraded samples after removing the AMDP architecture. This demonstrates that the adaptive integration of predictions from different degradation modes contributes to robust and high-precision estimation. When FORNN was replaced with a standard RNN, the model exhibited a significant decline in prediction performance on the TPSL dataset. This further underscores the necessity of incorporating future load conditions in second-life scenarios, as future operating conditions often differ from first-life service patterns.
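For readers unfamiliar with history-dependent baselines, the polynomial approach mentioned in this section can be sketched as below. This is only an illustration: the capacity data are synthetic, the degree and the function name `polynomial_forecast` are assumptions, and the 10-point history and 50-cycle horizon are borrowed from the text for convenience. The authors' actual baseline is described in Supplementary Note 9.

```python
import numpy as np

def polynomial_forecast(history, horizon, degree=2):
    """Fit a polynomial to past capacities and extrapolate `horizon` cycles ahead."""
    cycles = np.arange(len(history))
    coeffs = np.polyfit(cycles, history, deg=degree)
    future_cycles = np.arange(len(history), len(history) + horizon)
    return np.polyval(coeffs, future_cycles)

# Synthetic, noiseless quadratic capacity fade (assumption, for illustration only).
all_cycles = np.arange(60)
capacity = 2.0 - 1e-3 * all_cycles - 2e-5 * all_cycles**2

history, future = capacity[:10], capacity[10:]
forecast = polynomial_forecast(history, horizon=50)
mape = float(np.mean(np.abs(forecast - future) / future) * 100.0)
```

The sketch makes the dependence on history explicit: without the `history` array there is nothing to fit, which is exactly the limitation the text stresses for second-life batteries.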

Figure 5b evaluates iMOE performance under limited training data access (data partitioning details are provided in Supplementary Note 11). When training data was reduced from 100% to 40%, model performance remained stable on the UL dataset, achieving an average MAPE of 0.75% with a 0.56% standard deviation. However, performance on the TPSL dataset decreased approximately linearly, with the average MAPE increasing to 3.86% with a 0.88% standard deviation. When training data was reduced to 20% (32 batteries in total, pruned training data size 5MB), both datasets exhibited a noticeable increase in error and uncertainty, with MAPE reaching 1.07% for UL and 5.30% for TPSL. This experiment indicates that learning the impact of future operating conditions on battery degradation trajectories requires higher data curation costs, whereas the UL dataset shows relatively lower dependency on data access. Despite this data access challenge, Supplementary Tables 7–9 compare baseline model performance under scarce data access, validating iMOE’s superior generalization capability in few-shot learning scenarios.

The decomposition of degradation subspaces and dominant mechanism prediction depend on the number of experts and TopK selection (see “Methods”). Figure 5c examines the impact of the number of experts on iMOE performance. iMOE is relatively insensitive to this hyperparameter: across experiments with five distinct numbers of expert networks, the average MAPE was 0.98%. With 4 experts, the model performed comparatively worse, with an average MAPE of 1.03%, while the 6-expert configuration was less stable, with a standard deviation of 1.04%. After balancing practical deployment requirements against performance metrics, we adopted a configuration with 5 experts and TopK = 2.

Different application scenarios have varying requirements for prediction horizons. In practical applications, it is desirable to predict degradation trajectories as far as possible while maintaining acceptable performance. As shown in Fig. 5d, we present results across different prediction lengths. When the prediction length is reduced, model performance improves, achieving an average MAPE of 0.80%. In most cases, model performance degrades with increasing prediction length. Nevertheless, even when predicting 150 future cycles from current-cycle data, the average MAPE remains below 1.50%, demonstrating iMOE’s robustness in long-term forecasting. Supplementary Tables 10–12 compare our method’s performance with baseline approaches under extended prediction horizons. The results show that all models exhibit performance degradation as prediction length increases. When predicting future 100-cycle degradation trajectories, PatchTST and Informer achieve average MAPEs of 8% and 7.98%, respectively, while iMOE maintains a lower average MAPE of 1.47%, enabling more accurate long-term predictions. We further analyzed the model’s predictive performance under unknown and varying historical operating conditions at different levels of deep degradation. As shown in Fig. 5e, across six SOH intervals from 80% to 50%, the model achieved consistently good prediction accuracy, with an average MAPE of 1.81% and a standard deviation of 0.16%. However, when SOH fell within the 55–50% range, the prediction accuracy decreased, with an average MAPE of 2.00%, reflecting the increased difficulty of modeling degradation trajectories when multiple degradation mechanisms interact at advanced aging stages.

Discussion

The existing research paradigm of degradation trajectory prediction predominantly relies on historical data access and assumes identical future use conditions, which often does not hold for second-life batteries¹. This paradigm requires either expensive lifecycle data acquisition or time-consuming capacity calibration tests post-retirement, and it raises safety concerns because it cannot account for second-life use uncertainty. The proposed iMOE network, after offline training, uses data already available in battery management systems in deployment scenarios to directly predict degradation trajectories under assumed but uncertain future second-life use conditions, without relying on historical data and without extra offline testing. The success of iMOE stems from adaptively and interpretably classifying degradation modes, which statistically correspond to the geometric slope of the degradation trajectory at the observation point, to inform use-condition-dependent degradation trajectory prediction in extended second-life use.

Through case studies involving four usage scenarios (i.e., UL, LSD, TPSL-Random, TPSL-Fixed), 295 independent cells, and 93 second-life operating conditions, we demonstrate that iMOE achieves a promising lifecycle MAPE of 0.95% in computing degradation trajectories of 50 future cycles using only real-time online data, while reducing computation time by up to 50% compared to the SOTA time-series models. The average MAPE across all validated datasets is reduced by 77% compared to PatchTST and Informer. iMOE is compatible with both fixed and random SOC regions, allowing seamless integration with second-life batteries that exhibit random initial SOC distributions at the collection field. By adaptively integrating degradation mode classification from the proposed AMDP module with assumed uncertain future use conditions, the battery degradation trajectory prediction horizon can be extended to 150 cycles, with an average MAPE of 1.50% and a maximum MAPE below 6.26%. Hyperparameter sensitivity analysis on the number of experts confirms the model’s robustness under uncertain second-life use conditions, with an average MAPE of 0.98%. The interpretability of iMOE is demonstrated by visualizing latent embeddings of the expert weights, which represent degradation modes and make the model physically interpretable.

The iMOE model shows great potential to transform the battery reuse and recycling industry by moving from a manually assisted testing approach to an automated, data-driven decision support system. However, it must be acknowledged that the iMOE approach primarily relies on electrochemical characteristics (such as voltage-capacity relationships and relaxation voltage) for macroscopic diagnostics. While extensive validation confirms the physical understanding of these features^19,49, they essentially remain statistical correlations rather than physical causality. Future work should focus on non-invasive in-situ sensing signals^33,34,55, such as vibration sensing, strain sensing, ultrasonic signals, and fiber optic sensing, to enhance internal state observation for physics-informed modeling, rather than the physically interpretable modeling presented in this work. Building a “physics-statistics” fusion framework has the potential to improve the interpretability of iMOE³⁶. For example, parameters from battery electrochemical or equivalent circuit models could be updated online as an intermediate physical interpretation layer for degradation trajectory prediction. Integrating the primary governing equations of battery modeling with data-driven methods would also provide physical interpretability³⁹.

We acknowledge the assumption that future use conditions can be specified in advance. This assumption is reasonable for verifying the effectiveness of iMOE under predefined scenarios with considerable uncertainties, specifically consistent operating histories, deeply aged batteries with unknown prior use, and varying second-life conditions, with 93 use conditions being validated. However, although practical applications such as home energy storage and integrated photovoltaic-storage systems often have pre-defined operational plans, the actual operating conditions of battery energy storage systems remain even more uncertain and are difficult to predict^56,57. Although this work has discussed the importance of introducing future use conditions in the FORNN structure, the analysis of the uncertainty of future use conditions needs to be extended to highly random and time-varying use scenarios. Thus, future work could focus on introducing load conditions at finer temporal resolutions to ensure the validity of predicted use conditions of retired batteries. Another aspect is that random initial SOC is an important factor in the data availability of retired batteries. Although the effectiveness of iMOE in predicting degradation trajectories under different initial SOC conditions has been validated, the initial SOC of retired batteries is more uncontrollable in practice because acquiring such information is time-consuming. Future work could consider SOC normalization during the feature extraction phase or introduce an SOC-aware routing mechanism in the model to better handle differences in battery degradation under different SOC states⁵⁸. Although this work has considered four very different application conditions, namely UL, LSD, TPSL-Random, and TPSL-Fixed, more extreme conditions need to be investigated, given that the second-life application of batteries is safety-critical. In particular, non-steady-state use conditions, such as batteries with thermal runaway or internal short circuits, are not included in the datasets, so robustness to them has not been assessed. Considering the bias of the training data is crucial, and it is necessary to assess the uncertainty and confidence intervals of the predicted degradation trajectories, as UL and TPSL data hardly cover all environmental fluctuations under actual service conditions, such as thermal non-uniformity at the battery pack level. Future work is suggested to collect real-world data that cover different manufacturers, degradation mechanisms, and cell-to-cell variations.

This work demonstrates the potential of history-free, proactive degradation trajectory computation amid second-life use uncertainty, reducing the need for time-consuming and costly offline testing while ensuring safety compliance. The iMOE has demonstrated technical feasibility in retired battery deployment decision-making at scale by laying groundwork for light, interpretable and generalizable battery management models. Broadly, this work underscores the promise of integrating physics insights into predictive health management models for early malfunction detection and reliability assessment, paving the way for safer and more durable operation of critical infrastructures such as energy storage systems.

Methods

Data preparation

Input signals

We consider two signals in each cycle: a partial charging profile that starts at a random voltage (i.e., a random initial SOC) and ends at the cutoff voltage, and a relaxation voltage curve recorded immediately after charging. Let $I\left(t\right)$, $V\left(t\right)$, and $t$ denote the instantaneous current, voltage, and time, respectively. We track the battery voltage from ${V}_{\min }$, which varies across samples due to random SOC, up to the cutoff voltage ${V}_{{end}}$.

Partial charging curve

We divide voltage range $[{V}_{\min },{V}_{{end}}]$ into $M$ increments of size $\Delta V$. The partial charging curve $q$ denotes the cumulative input charge as a function of voltage:

$${{{\boldsymbol{q}}}}=\left[{q}_{0}\left(V\right),{q}_{1}\left(V\right),\cdots,{q}_{M}\left(V\right)\right],{q}_{i}\left(V\right)={\int }_{t({V}_{\min })}^{t({V}_{\min }+i\Delta V)}\left|I\left(t\right)\right|{dt}$$

(3)

In experiments, we record 50 segments at equal intervals (i.e., $M=50$), yielding a 50-dimensional sequence.
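Equation (3) can be evaluated numerically from sampled current and voltage. The sketch below is one possible discretization, not the authors' code: it assumes the voltage rises monotonically during charging, uses trapezoidal integration and linear interpolation, and runs on a synthetic constant-current profile. The function name `partial_charging_curve` is hypothetical.

```python
import numpy as np

def partial_charging_curve(t, current, voltage, v_end, m=50):
    """Cumulative charge at m+1 equally spaced voltages from voltage[0] to v_end.

    Assumes voltage rises monotonically during charging (needed by np.interp).
    Charge is returned in ampere-seconds if t is in seconds and current in amperes.
    """
    # Cumulative trapezoidal integral of |I(t)| dt, starting at zero.
    abs_i = np.abs(current)
    charge = np.concatenate(
        ([0.0], np.cumsum(0.5 * (abs_i[1:] + abs_i[:-1]) * np.diff(t)))
    )
    v_grid = np.linspace(voltage[0], v_end, m + 1)  # V_min + i * dV, i = 0..M
    return v_grid, np.interp(v_grid, voltage, charge)

# Synthetic constant-current charge (assumption): 1 A for 3600 s,
# with voltage rising linearly from 3.6 V to 4.2 V.
t = np.linspace(0.0, 3600.0, 721)
current = np.ones_like(t)
voltage = 3.6 + 0.6 * t / 3600.0

v_grid, q = partial_charging_curve(t, current, voltage, v_end=4.2)
```

Following the indexing of Eq. (3), the sketch returns $M+1$ values, of which $q_0$ is zero by construction because its integration interval is empty.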

Relaxation voltage curve

To capture the relaxation behavior, we measure the open-circuit voltage ${V}_{k}(t)$ for $k=0,1,\ldots,M$ at 30-minute intervals after charging concludes:

$${{{\boldsymbol{v}}}}=\left[{V}_{0}\left(t\right) \, {V}_{1}\left(t\right)\cdots {V}_{M}\left(t\right)\right]$$

(4)

This sequence of length $M+1$ reflects how the polarized voltage recovers over time.

Feature engineering

From the partial charging and relaxation voltage signals, we extract 12 features derived from physical considerations, including polarization overcharge, relaxation slope, and voltage plateau transitions. All 12 features are normalized to $\left[0,1\right]$ before entering the model. The final input to the iMOE consists of the partial charging curve ${{\boldsymbol{q}}}\in {{\mathbb{R}}}^{50}$ and the feature vector ${{\boldsymbol{F}}}_{i}\in {{\mathbb{R}}}^{12}$ derived from both the charging and relaxation profiles.

Model architecture

The proposed interpretable mixture of experts network consists of two modules, an Adaptive Multi-degradation Prediction (AMDP) module and a Future-Operation Recurrent Neural Network (FORNN), to account for both current-cycle degradation phenomena and prospective operating conditions that may arise in second-life applications.

Adaptive multi-degradation prediction (AMDP)

Formally, let ${{\boldsymbol{F}}}_{i}\in {{\mathbb{R}}}^{12}$ be the physics-informed features for the $i$-th sample, and ${{\boldsymbol{q}}}=\left[{q}_{0}\left(V\right),{q}_{1}\left(V\right),\cdots,{q}_{M}\left(V\right)\right]$ be its partial charging curve. The AMDP module defines $E$ expert networks, each denoted ${\rm{Expert}}_{\rm{j}}$, $j=1,2,\ldots,E$. A degradation mode router $g$ maps ${{\boldsymbol{F}}}_{i}$ to a probability weight vector $g\left({{\boldsymbol{F}}}_{i}\right)$ across these experts:

$$g\left({{{{\boldsymbol{F}}}}}_{{{{\boldsymbol{i}}}}}\right)={Softmax}({{{\rm{TopK}}}}\left({Softmax}\left({{\rm{H}}}\left({{{{\boldsymbol{F}}}}}_{i}\right)\right),k\right))$$

(5)

where ${\rm{H}}\left({{\boldsymbol{F}}}_{i}\right)\in {{\mathbb{R}}}^{E}$ is a feed-forward transformation with a noise term:

$${{\rm{H}}}\left({{{{\boldsymbol{F}}}}}_{i}\right)={{{{\boldsymbol{F}}}}}_{i}{{{{\boldsymbol{W}}}}}_{g}+\psi \, {{{\rm{Softplus}}}}\left({{{{\boldsymbol{F}}}}}_{i}{{{{\boldsymbol{W}}}}}_{{{\rm{noise}}}}\right)$$

(6)

where ${{\boldsymbol{W}}}_{g}\in {{\mathbb{R}}}^{12\times E}$ and ${{\boldsymbol{W}}}_{{\rm{noise}}}\in {{\mathbb{R}}}^{12\times E}$ are learned parameters, and $\psi \sim {\mathcal{N}}(0,1)$ is drawn from a standard Gaussian. The top-$k$ experts are selected and their weights are normalized by a ${Softmax}$ function.
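The routing in Eqs. (5) and (6) can be sketched in NumPy as follows. This is an illustrative reading rather than the released implementation: masking the non-selected experts with $-\infty$ before the outer Softmax and drawing one noise value per logit are assumptions, the weights are random rather than learned, and the function name `route` is hypothetical.

```python
import numpy as np

def softmax(x):
    z = x - np.max(x, axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softplus(x):
    return np.logaddexp(0.0, x)  # log(1 + exp(x)), overflow-safe

def route(F, W_g, W_noise, k, rng):
    """Noisy top-k routing weights following Eqs. (5)-(6).

    F: (batch, 12) features; W_g, W_noise: (12, E). Returns (batch, E) weights.
    """
    # psi ~ N(0, 1); one draw per logit (assumption).
    psi = rng.standard_normal((F.shape[0], W_g.shape[1]))
    H = F @ W_g + psi * softplus(F @ W_noise)  # Eq. (6)
    probs = softmax(H)                         # inner Softmax
    # TopK: keep the k largest entries per row and mask the rest with -inf
    # (assumed convention) so the outer Softmax assigns them zero weight.
    kth = np.sort(probs, axis=-1)[:, -k][:, None]
    masked = np.where(probs >= kth, probs, -np.inf)
    return softmax(masked)                     # outer Softmax

rng = np.random.default_rng(0)
E, k = 5, 2                                  # 5 experts and TopK = 2, as adopted in the paper
F = rng.uniform(0.0, 1.0, size=(4, 12))      # 4 synthetic samples, features in [0, 1]
W_g = rng.normal(size=(12, E))               # random stand-ins for learned parameters
W_noise = rng.normal(size=(12, E))
g = route(F, W_g, W_noise, k, rng)
```

Each row of `g` sums to one and has exactly `k` non-zero entries, so only the selected experts contribute to the fused prediction.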

Retired batteries may undergo multiple degradation mechanisms, e.g., SEI formation, SEI thickening, loss of active material, and lithium plating, that can coexist or interact. To flexibly and specifically model these different pathways, the AMDP module adopts an MOE structure that fuses predictions from multiple expert submodels, each “specialized” in a certain degradation mode. Within the AMDP-enabled MOE framework, an expert is not confined to a single degradation mode; instead, it is represented as a weighted superposition of countably many implicitly defined “typical degradation modes”. Let the $j$-th expert ${\rm{Expert}}_{\rm{j}}$ be denoted as:

$${{{\rm{Exper}}}}{{{{\rm{t}}}}}_{{{{\rm{j}}}}}\left({q}_{i}\right)={\sum }_{\theta \in \Omega }{w}_{i}\left(\theta \right) \, \phi \left({q}_{i};\theta \right)$$

(7)

where ${q}_{i}$ represents the partial charging sequence extracted under a random initial SOC condition, $\phi \left({q}_{i};\theta \right)$ is the basis response associated with the latent degradation mode indexed by $\theta$, and ${w}_{i}\left(\theta \right)$ is the weight that ${\rm{Expert}}_{\rm{j}}$ assigns to the corresponding degradation mode. The output is a short-term degradation trend (within a small region around the computation point) under the assumption that the $j$-th mechanism is dominant. The final AMDP output for sample $i$ is a weighted sum:

$${Trend}_{i}={\sum }_{j=1}^{E}{g}_{j}\left({{\boldsymbol{F}}}_{i}\right){\rm{Expert}}_{\rm{j}}\left({{\boldsymbol{q}}}_{i}\right):{{\mathbb{R}}}^{N}\to {{\mathbb{R}}}^{L}$$

(8)

where ${Trend}_{i}\in {{\mathbb{R}}}^{1\times L}$ represents the preliminary degradation trend output by the AMDP module, $L$ is the length of the predicted future degradation trajectory, ${\rm{Expert}}_{\rm{j}}$ denotes the predictor for a specific degradation mode, and ${g}_{j}\left({{\boldsymbol{F}}}_{i}\right)$ is the routing weight for expert $j$. This step produces a latent degradation trend vector ${Trend}_{i}$ for subsequent cycles, reflecting the dominant mechanism(s) indicated by ${{\boldsymbol{F}}}_{i}$. Hence, the model adaptively fuses multiple degradation mode subspaces into a short-term trend embedding for subsequent long-horizon computation.
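The weighted fusion in Eq. (8) amounts to a single matrix-vector product once the expert outputs are stacked. The sketch below illustrates this under loud assumptions: each expert is replaced by a hypothetical random linear map, the routing weights are hand-picked to mimic a TopK = 2 selection, and the horizon $L=50$ is chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
E, M, L = 5, 50, 50  # experts, curve length, horizon (L = 50 is illustrative)

q = rng.uniform(0.0, 1.0, size=M)  # synthetic partial charging curve

# Hypothetical experts: one linear map from R^M to R^L each,
# standing in for the trained expert networks.
expert_weights = rng.normal(scale=0.1, size=(E, M, L))
expert_outputs = np.einsum("m,eml->el", q, expert_weights)  # row j holds Expert_j(q)

# Routing weights with only two active experts, as TopK = 2 would produce.
g = np.array([0.0, 0.7, 0.0, 0.3, 0.0])

trend = g @ expert_outputs  # Eq. (8): sum over j of g_j * Expert_j(q)
```

Because three of the five routing weights are zero, the resulting trend is determined entirely by the two selected experts.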

Future-operation recurrent neural network (FORNN)

Second-life applications involve future load conditions that can alter the degradation pace as well as the maximum available capacity in the current cycle. To include these influences, we employ a recurrent neural network that takes as input the AMDP output ${Trend}_{i}$ and a set of future load parameters:

$${C}_{i}=\left[\left({I}_{{{\rm{charge}}}}^{\left(1\right)},{I}_{{{\rm{discharge}}}}^{\left(1\right)},{T}^{\left(1\right)}\right),\ldots,\left({I}_{{{\rm{charge}}}}^{\left(L\right)},{I}_{{{\rm{discharge}}}}^{\left(L\right)},{T}^{\left(L\right)}\right)\right]$$

(9)

where ${C}_{i}\in {{\mathbb{R}}}^{L\times 3}$, and ${I}_{{{\rm{charge}}}}^{\left(\ell \right)}$, ${I}_{{{\rm{discharge}}}}^{\left(\ell \right)}$, and ${T}^{\left(\ell \right)}$ represent the charge current, discharge current, and temperature in future cycle $\ell$. Each future load sequence has length $L$, equal to the length of the degradation trajectory to be predicted. We concatenate these load vectors with the short-term trend from AMDP:

$${{{{\boldsymbol{X}}}}}_{{{{\boldsymbol{i}}}}}={{{\rm{Concat}}}}\left({Tren}{{d}}_{i},{C}_{i}\right)$$

(10)

The LSTM processes ${X}_{i}\in {{\mathbb{R}}}^{L\times 4}$ sequentially to predict the capacity trajectory $\hat{{S}_{i}}$ over $L$ future cycles:

$$\hat{{S}_{i}}={{\rm{LSTM}}}\left({{{{\boldsymbol{X}}}}}_{{{{\boldsymbol{i}}}}}\right)$$

(11)

By iterating cycle by cycle, FORNN accommodates a user-chosen prediction horizon, merging the degradation-mode signature from AMDP with the load sequence the battery will experience under the complexities and uncertainties of second-life use.
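The data flow of Eqs. (9) to (11) can be sketched with a minimal single-layer LSTM written in NumPy. Everything beyond the tensor shapes is an assumption made for illustration: the weights are random and untrained, the hidden size, the linear read-out to a scalar capacity, the gate ordering, and the placeholder load values are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(X, W, U, b, w_out, b_out):
    """Run a single-layer LSTM over X (L, d_in) and map each hidden state to a scalar.

    W: (d_in, 4*h), U: (h, 4*h), b: (4*h,), w_out: (h,), b_out: float.
    Gate order in the stacked weights: input, forget, cell candidate, output.
    """
    h_dim = U.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    outputs = []
    for x_t in X:  # one step per future cycle
        z = x_t @ W + h @ U + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h @ w_out + b_out)  # assumed linear read-out per cycle
    return np.array(outputs)

rng = np.random.default_rng(2)
L, hidden = 50, 16  # horizon and hidden size (illustrative)

trend = np.linspace(1.0, 0.95, L)  # placeholder AMDP trend, shape (L,)
C = np.column_stack([              # future load conditions, Eq. (9)
    np.full(L, 1.0),               # charge current per cycle (placeholder value)
    np.full(L, 2.0),               # discharge current per cycle (placeholder value)
    np.full(L, 25.0),              # temperature per cycle (placeholder value)
])
X = np.column_stack([trend, C])    # Eq. (10): shape (L, 4)

W = rng.normal(scale=0.1, size=(4, 4 * hidden))
U = rng.normal(scale=0.1, size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)
w_out = rng.normal(scale=0.1, size=hidden)
S_hat = lstm_forward(X, W, U, b, w_out, 0.0)  # Eq. (11): shape (L,)
```

Changing any column of `C` changes the predicted trajectory even when `trend` is fixed, which is the mechanism by which future operating conditions enter the prediction.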

Model training

We train the AMDP and FORNN modules jointly by minimizing a loss function that balances trajectory accuracy with degradation router diversity. Let ${S}_{i}$ be the ground-truth capacity trajectory of length $L$ for sample $i$, where each capacity label is the maximum discharge capacity of the corresponding cycle. Let $\hat{{S}_{i}}$ be the corresponding model output. The trajectory fidelity objective is

$${\ell }_{{{\rm{traj}}}}=\frac{1}{N}{\sum }_{i=1}^{N}{{{{\rm{||}}}}\hat{{S}_{i}}-{S}_{i}{{{\rm{||}}}}}^{2}$$

(12)

where the summation is over the training set of size $N$.

Because MOE can exhibit “expert collapse” (where one expert is used for all samples), we include a regularization term to encourage usage across experts. For a mini-batch, the router weight assigned to expert $j$ is averaged as

$${A}_{j}={\sum }_{i\in {{\rm{Batch}}}}{g}_{j}\left({{{{\boldsymbol{F}}}}}_{{{{\boldsymbol{i}}}}}\right)$$

(13)

We encourage these averages to stay balanced by penalizing their squared coefficient of variation:

$${\ell }_{{router}}=\frac{{{{\rm{Var}}}}\left({A}_{1},{A}_{2},\ldots,{A}_{E}\right)}{{\left({{{\rm{Mean}}}}\left({A}_{1},{A}_{2},\ldots,{A}_{E}\right)\right)}^{2}+{{{\rm{\varepsilon }}}}}$$

(14)

where $\varepsilon$ is a constant set to 10, and ${{\rm{Var}}}$ and ${{\rm{Mean}}}$ are the variance and mean operators, respectively. The total loss is:

$$\ell=\alpha \, {\ell }_{{{\rm{traj}}}}+\beta \, {\ell }_{{router}}$$

(15)

with $\alpha$ and $\beta$ controlling the trade-off between trajectory accuracy and expert diversity. We set $\alpha=0.75$ and $\beta=0.25$ in our experiments.
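The loss in Eqs. (12) to (15) can be written compactly in NumPy. The sketch below is an illustration under stated assumptions: the variance is taken as the population variance (`np.var` with its default `ddof=0`), the function names are hypothetical, and the routing matrices are synthetic. The values of $\alpha$, $\beta$, and $\varepsilon$ are the ones stated in the text.

```python
import numpy as np

def trajectory_loss(S_hat, S):
    """Eq. (12): mean over samples of the squared L2 error of each trajectory."""
    return float(np.mean(np.sum((S_hat - S) ** 2, axis=1)))

def router_loss(G, eps):
    """Eqs. (13)-(14): variance over squared mean of per-expert batch usage.

    G: (batch, E) routing weights; eps is the constant in Eq. (14).
    """
    A = G.sum(axis=0)  # Eq. (13)
    return float(np.var(A) / (np.mean(A) ** 2 + eps))

def total_loss(S_hat, S, G, eps, alpha=0.75, beta=0.25):
    """Eq. (15) with the alpha and beta values reported in the text."""
    return alpha * trajectory_loss(S_hat, S) + beta * router_loss(G, eps)

eps = 10.0  # value stated in the text

balanced = np.full((8, 5), 0.2)  # every expert used equally
collapsed = np.zeros((8, 5))
collapsed[:, 0] = 1.0            # one expert takes every sample

S = np.ones((8, 50))             # synthetic trajectories with a perfect prediction
loss_balanced = total_loss(S, S, balanced, eps)
loss_collapsed = total_loss(S, S, collapsed, eps)
```

With a perfect trajectory prediction, the balanced router incurs essentially no penalty while the collapsed router is penalized, which is the behaviour the regularization term is meant to produce.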

Evaluation metrics

Root mean square error (RMSE)

$${{{\rm{RMSE}}}}=\sqrt{\frac{1}{N}{\sum }_{i=1}^{N}{\left(\hat{{S}_{i}}-{S}_{i}\right)}^{2}}$$

(16)

Mean absolute percentage error (MAPE)

$${{\rm{MAPE}}}=\frac{1}{N}{\sum }_{i=1}^{N}\frac{\left|\hat{{S}_{i}}-{S}_{i}\right|}{{S}_{i}}\times 100\%$$

(17)

Coefficient of determination (R²)

$${R}^{2}=1-\frac{{\sum }_{i=1}^{N}{\left(\hat{{S}_{i}}-{S}_{i}\right)}^{2}}{{\sum }_{i=1}^{N}{\left({S}_{i}-\bar{S}\right)}^{2}}$$

(18)

where, $N$ is the total number of samples (i.e., cycles), ${S}_{i}$ and $\hat{{S}_{i}}$ are the true and computed capacity, respectively. $\bar{S}$ is the sample mean of the ground-truth capacities in the computation horizon.
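The three metrics in Eqs. (16) to (18) translate directly into NumPy. The following sketch uses a small synthetic example; the capacity values and function names are illustrative assumptions.

```python
import numpy as np

def rmse(s_true, s_pred):
    """Eq. (16): root mean square error."""
    return float(np.sqrt(np.mean((s_pred - s_true) ** 2)))

def mape(s_true, s_pred):
    """Eq. (17): mean absolute percentage error, in percent."""
    return float(np.mean(np.abs(s_pred - s_true) / s_true) * 100.0)

def r2(s_true, s_pred):
    """Eq. (18): coefficient of determination."""
    ss_res = np.sum((s_pred - s_true) ** 2)
    ss_tot = np.sum((s_true - np.mean(s_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic capacities for four cycles (illustrative values).
s_true = np.array([2.00, 1.98, 1.96, 1.94])
s_pred = np.array([2.02, 1.98, 1.94, 1.94])

scores = {"RMSE": rmse(s_true, s_pred),
          "MAPE": mape(s_true, s_pred),
          "R2": r2(s_true, s_pred)}
```

Note that MAPE divides by the true capacity, so it is only well defined when all ${S}_{i}$ are non-zero, which holds for capacity measurements.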

Data availability

Raw data can be found here^23,43,49. Source data are provided with this paper as a Source Data file.

Code availability

The code used in this work has been deposited at the Zenodo repository here⁵⁹.

References

Tao, S. et al. Collaborative and privacy-preserving retired battery sorting for profitable direct recycling via federated machine learning. Nat. Commun. 14, 8032 (2023).
Global_EV_Outlook_2025.
Zhu, J. et al. End-of-life or second-life options for retired electric vehicle batteries. Cell Rep. Phys. Sci. 2, 100537 (2021).
Wu, W., Lin, B., Xie, C., Elliott, R. J. R. & Radcliffe, J. Does energy storage provide a profitable second life for electric vehicle batteries? Energy Econ. 92, 105010 (2020).
Ren, Y. et al. Hidden delays of climate mitigation benefits in the race for electric vehicle deployment. Nat. Commun. 14, 3164 (2023).
Innocenti, A., Beringer, S. & Passerini, S. Cost and performance analysis as a valuable tool for battery material research. Nat. Rev. Mater. 9, 347–357 (2024).
Duffner, F., Wentker, M., Greenwood, M. & Leker, J. Battery cost modeling: A review and directions for future research. Renew. Sustain. Energy Rev. 127, 109872 (2020).
Han, X. et al. A review on the key issues of the lithium ion battery degradation among the whole life cycle. eTransportation 1, 100005 (2019).
Tian, H., Qin, P., Li, K. & Zhao, Z. A review of the state of health for lithium-ion batteries: Research status and suggestions. J. Clean. Prod. 261, 120813 (2020).
Attia, P. M. et al. Closed-loop optimization of fast-charging protocols for batteries with machine learning. Nature 578, 397–402 (2020).
Li, W., Zhang, H., Van Vlijmen, B., Dechent, P. & Sauer, D. U. Forecasting battery capacity and power degradation with multi-task learning. Energy Storage Mater. 53, 453–466 (2022).
Tao, S. et al. Generative learning assisted state-of-health estimation for sustainable battery recycling with random retirement conditions. Nat. Commun. 15, 10154 (2024).
Ma, R. et al. Pathway decisions for reuse and recycling of retired lithium-ion batteries considering economic and environmental functions. Nat. Commun. 15, 7641 (2024).
Tao, S. et al. Rapid and sustainable battery health diagnosis for recycling pretreatment using fast pulse test and random forest machine learning. J. Power Sources 597, 234156 (2024).
Li, J. et al. Degradation pattern recognition and features extrapolation for battery capacity trajectory prediction. IEEE Trans. Transp. Electrific. 10, 7565–7579 (2024).
Roman, D., Saxena, S., Robu, V., Pecht, M. & Flynn, D. Machine learning pipeline for battery state-of-health estimation. Nat. Mach. Intell. 3, 447–456 (2021).
He, K. et al. A novel quick screening method for the second usage of parallel-connected lithium-ion cells based on the current distribution. J. Electrochem. Soc. 170, 030514 (2023).
Tian, J., Xiong, R., Shen, W., Lu, J. & Sun, F. Flexible battery state of health and state of charge estimation using partial charging data and deep learning. Energy Storage Mater. 51, 372–381 (2022).
Severson, K. A. et al. Data-driven prediction of battery cycle life before capacity degradation. Nat. Energy 4, 383–391 (2019).
Tan, R. et al. Forecasting battery degradation trajectory under domain shift with domain generalization. Energy Storage Mater. 72, 103725 (2024).
Zhao, H., Meng, J. & Peng, Q. Early perception of Lithium-ion battery degradation trajectory with graphical features and deep learning. Appl. Energy 381, 125214 (2025).
Tao, S. et al. Non-destructive degradation pattern decoupling for early battery trajectory prediction via physics-informed learning. Energy Environ. Sci. https://doi.org/10.1039/D4EE03839H (2025).
Lu, J. et al. Battery degradation prediction against uncertain future conditions with recurrent neural network enabled deep learning. Energy Storage Mater. 50, 139–151 (2022).
Rogge, M. & Jossen, A. Path-dependent ageing of lithium-ion batteries and implications on the ageing assessment of accelerated ageing tests. Batter. Supercaps 7, e202300313 (2024).
Kim, I. et al. Degradation path prediction of lithium-ion batteries under dynamic operating sequences. Energy Environ. Sci. 18, 3784–3794 (2025).
Pinson, M. B. & Bazant, M. Z. Theory of SEI formation in rechargeable batteries: capacity fade, accelerated aging and lifetime prediction. J. Electrochem. Soc. 160, A243–A250 (2013).
Fly, A. & Chen, R. Rate dependency of incremental capacity analysis (dQ/dV) as a diagnostic tool for lithium-ion batteries. J. Energy Storage 29, 101329 (2020).
Xiong, R., Wang, P., Jia, Y., Shen, W. & Sun, F. Multi-factor aging in Lithium Iron phosphate batteries: mechanisms and insights. Appl. Energy 382, 125250 (2025).
Wang, J. et al. Degradation of lithium ion batteries employing graphite negatives and nickel–cobalt–manganese oxide + spinel manganese oxide positives: part 1, aging mechanisms and life estimation. J. Power Sources 269, 937–948 (2014).
Ward, L. et al. Principles of the battery data genome. Joule 6, 2253–2271 (2022).
Delegated regulation - EU − 2025/606 - EN - EUR-Lex.
Heenan, T. M. M. et al. Mapping internal temperatures during high-rate battery applications. Nature 617, 507–512 (2023).
Huang, J., Boles, S. T. & Tarascon, J.-M. Sensing as the key to battery lifetime and sustainability. Nat. Sustain. 5, 194–204 (2022).
Fan, J. et al. Wireless transmission of internal hazard signals in Li-ion batteries. Nature 641, 639–645 (2025).
Zhang, C. et al. Flexible method for estimating the state of health of lithium-ion batteries using partial charging segments. Energy 295, 131009 (2024).
Wang, F., Zhai, Z., Zhao, Z., Di, Y. & Chen, X. Physics-informed neural network for lithium-ion battery degradation stable modeling and prognosis. Nat. Commun. 15, 4332 (2024).
Navidi, S., Thelen, A., Li, T. & Hu, C. Physics-informed machine learning for battery degradation diagnostics: a comparison of state-of-the-art methods. Energy Storage Mater. 68, 103343 (2024).
Aykol, M. et al. Perspective—Combining physics and machine learning to predict battery lifetime. J. Electrochem. Soc. 168, 030525 (2021).
Wang, F. et al. Inherently interpretable physics-informed neural network for battery modeling and prognosis. IEEE Trans. Neural Netw. Learn. Syst. 1–15 https://doi.org/10.1109/TNNLS.2023.3329368 (2024).
Zhang, Y., Feng, X., Zhao, M. & Xiong, R. In-situ battery life prognostics amid mixed operation conditions using physics-driven machine learning. J. Power Sources 577, 233246 (2023).
Su, L. et al. Data sufficiency for transferable lithium-ion battery periodical SOH estimation under resource constraints. Cell Rep. Phys. Sci. 6, 102901 (2025).
Fu, S. et al. Data-driven capacity estimation for lithium-ion batteries with feature matching based transfer learning method. Appl. Energy 353, 121991 (2024).
Wang, S., Gao, F. & Tian, H. Deep sorting of reused batteries for enabling long-term consistency grouping with unknown prior conditions. Cell Rep. Phys. Sci. 6, 102657 (2025).
Li, R. et al. The importance of degradation mode analysis in parameterising lifetime prediction models of lithium-ion battery degradation. Nat. Commun. 16, 2776 (2025).
Hu, Y., Liu, P., Zhu, P., Cheng, D. & Dai, T. Adaptive multi-scale decomposition framework for time series forecasting. In Proc. AAAI Conference on Artificial Intelligence (AAAI, 2025).
Shi, X. et al. Time-MoE: billion-scale time series foundation models with mixture of experts. In Proc. 13th International Conference on Learning Representations (ICLR, 2025).
Shazeer, N. et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. In Proc. 5th International Conference on Learning Representations (ICLR, 2017).
Fedus, W., Zoph, B. & Shazeer, N. Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res. 23, 1–39 (2022).
Zhu, J. et al. Data-driven capacity estimation of commercial lithium-ion batteries from voltage relaxation. Nat. Commun. 13, 2261 (2022).
Nie, Y., Nguyen, N. H., Sinthong, P. & Kalagnanam, J. A time series is worth 64 words: long-term forecasting with transformers. In Proc. 11th International Conference on Learning Representations (ICLR, 2023).
Zhou, H. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. AAAI 35, 11106–11115 (2021).
Schuster, S. F. et al. Nonlinear aging characteristics of lithium-ion cells under different operational conditions. J. Energy Storage 1, 44–53 (2015).
Zhao, M., Zhang, Y. & Wang, H. Battery degradation stage detection and life prediction without accessing historical operating data. Energy Storage Mater. 69, 103441 (2024).
Desai, T., Gallo, A. J. & Ferrari, R. M. G. Multi timescale battery modeling: integrating physics insights to data-driven model. Appl. Energy 393, 126040 (2025).
Huang, J. et al. Operando decoding of chemical and thermal events in commercial Na(Li)-ion cells via optical sensors. Nat. Energy 5, 674–683 (2020).
Geslin, A. et al. Dynamic cycling enhances battery lifetime. Nat. Energy https://doi.org/10.1038/s41560-024-01675-8 (2024).
Pozzi, A., Incremona, A. & Toti, D. Imitation learning-driven approximation of stochastic control models. Appl. Intell. 55, 838 (2025).
Pozzi, A., Incremona, A. & Toti, D. Neural network-based imitation learning for approximating stochastic battery management systems. IEEE Access 13, 71041–71052 (2025).
Huang, X. iMOE: Prediction of second-life battery degradation trajectory using interpretable mixture of experts, terencetaothucb/Prediction-of-second-life-battery-degradation-trajectory-using-iMOE. Zenodo https://doi.org/10.5281/zenodo.18061208 (2025).

Acknowledgements

This work was funded by the National Natural Science Foundation of China (Grant No. 51877120) (B.X.), the Key Scientific Research Support Project of the Shanxi Energy Internet Research Institute (Grant No. SXEI2023A002) (X.Z.), the Meituan Scholar Program-International Collaboration Project (Grant No. 202209A) (X.Z.), the Tsinghua Shenzhen International Graduate School-Shenzhen Pengrui Young Faculty Program of Shenzhen Pengrui Foundation (Grant No. SZPR2023007) (G.Z.), the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2023B1515120099) (G.Z.), and the National Natural Science Foundation of China (No. 92572103).

Funding

Open access funding provided by Chalmers University of Technology.

Author information

These authors contributed equally: Xinghao Huang, Shengyu Tao.

Authors and Affiliations

Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Xinghao Huang, Shengyu Tao, Chen Liang, Bizhong Xia, Guangmin Zhou & Xuan Zhang
Department of Civil and Environmental Engineering, UC Berkeley, Berkeley, CA, USA
Shengyu Tao & Junzhe Shi
Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden
Shengyu Tao
Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
Yining Tang
Institute for Artificial Intelligence, Peking University, Beijing, China
Jiawei Chen
Department of Materials Science and Engineering, Stanford University, Stanford, CA, USA
Yuqi Li
Center of International Innovation for Technology and Science, Shenzhen, Guangdong, China
Xuan Zhang

Contributions

Xinghao Huang: Writing – review and editing, writing – original draft, visualization, validation, methodology, formal analysis, conceptualization. Shengyu Tao: Writing – review and editing, writing – original draft, visualization, validation, methodology, formal analysis, conceptualization, supervision, project administration. Chen Liang: Writing – review and editing, visualization. Yining Tang: Writing – review and editing, visualization. Jiawei Chen: Writing – review and editing, methodology, formal analysis. Junzhe Shi: Writing – review and editing, methodology, formal analysis. Yuqi Li: Writing – review and editing, methodology, formal analysis. Bizhong Xia: Resources, supervision, project administration, funding acquisition, conceptualization. Guangmin Zhou: Supervision, project administration, writing – review and editing, conceptualization. Xuan Zhang: Supervision, project administration, writing – review and editing, conceptualization.

Corresponding authors

Correspondence to Shengyu Tao, Bizhong Xia, Guangmin Zhou or Xuan Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Md Sazzad Hosen, Maksim Subbotin and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Transparent Peer Review file (PDF)

Source data

Source Data (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Huang, X., Tao, S., Liang, C. et al. iMOE: prediction of second-life battery degradation trajectory using interpretable mixture of experts. Nat Commun 17, 2549 (2026). https://doi.org/10.1038/s41467-026-69369-1

Received: 10 July 2025
Accepted: 28 January 2026
Published: 09 February 2026
Version of record: 18 March 2026
DOI: https://doi.org/10.1038/s41467-026-69369-1