Introduction

The advancement of lithium-ion (Li-ion) batteries has been driven by the demand for high-performance energy storage technologies1,2,3,4. As the need intensifies for batteries with not only higher power density but also enhanced cycle stability5,6, attention has turned to silicon (Si) as a promising candidate.

Silicon boasts a remarkable theoretical capacity of 4200 mAh/g, surpassing conventional graphite anodes (372 mAh/g) by more than tenfold, positioning it as a viable anode material for Li-ion batteries7. However, Si undergoes significant volume expansion (300%) during lithiation and delithiation, leading to pulverization, loss of electrical contact, unstable solid electrolyte interface (SEI) formation, and rapid capacity degradation, thus limiting its practical application in industrial Li-ion batteries8,9.

To tackle these challenges, researchers have dedicated significant efforts10, exploring various strategies including nanostructured Si11,12, porous Si13,14, Si-carbon composite materials15,16,17, and innovative structural designs18,19,20,21. While these methods have shown promise in mitigating volume fluctuations and enhancing the electrochemical performance of Si anodes, their widespread adoption is impeded by the substantial costs and intricate synthesis processes involved8. One effective approach to address the challenges associated with Si-based anodes is to develop advanced binder materials capable of accommodating the significant mechanical strain induced by Si volume changes22. Binders not only provide structural integrity to the electrode23 but also facilitate efficient electron and ion transport within the electrode matrix24,25, aid in the dispersion of active materials26, and promote the development of a stable SEI film27,28,29.

To enhance the mechanical integrity and performance of Si-based anodes, incorporating polar functional groups into polymer binders has been explored. These groups can form strong interactions, such as hydrogen bonding and electrostatic attractions, with Si particles, thereby improving structural cohesion. In addition to these interactions, metal–ligand coordination plays a significant role in designing self-healing binders. This dynamic bonding involves reversible interactions between metal ions and ligands, enabling the material to autonomously repair and maintain its mechanical properties under stress30. For instance, elastomers utilizing various metal–ligand bonds have shown excellent self-healing properties due to the full reconstruction of these bonds during the healing process31. A 3D conductive polyaniline binder doped with phytic acid was synthesized and used with silicon nanoparticles, achieving 1137 mAh g⁻¹ after 500 cycles at 1 A g⁻¹. Polyaniline enhances structural stability and conductivity, offering insights for commercial high-performance lithium-ion batteries32.

One innovative approach involves the synthesis of a composite binder composed of multi-hydroxyl polyvinyl alcohol (PVA) and electroactive squaric acid (SA). The crosslinking reactions between SA and PVA chains form a robust binder network that stabilizes both the macro- and micro-structures of Si electrodes. This composite binder has demonstrated improved initial coulombic efficiency, enhanced rate performance, and prolonged cycling stability in Si anodes33. Molecularly engineered conductive polymer binders, including star-like polyaniline (s-PANi), cross-linked polyaniline (c-PANi), and linear polyaniline (l-PANi), were developed by He et al.34. Among these, s-PANi demonstrated superior performance, delivering a reversible capacity of 1776 mAh g⁻¹ after 100 cycles at 500 mA g⁻¹. This was attributed to its 3D-conjugated conductive network, which better accommodates the large strain induced by cycling compared to cross-linked or linear structures.

A cross-linked polyacrylamide (PAM) binder was developed, forming a 3D polar network via a condensation reaction with citric acid. This binder significantly improved tensile properties and adhesion, enabling the silicon anode to achieve a reversible capacity of 1280 mAh g⁻¹ after 600 cycles at 2.1 A g⁻¹ and 770.9 mAh g⁻¹ after 700 cycles at 4.2 A g⁻¹. The study presents a cost-effective binder engineering strategy that enhances the long-term cycling stability and performance of silicon anodes35. A double-wrapped binder with rigid polyacrylic acid (PAA) inside and elastic bifunctional polyurethane (BFPU) outside was designed to manage stress in silicon anodes. PAA dissipates stress during lithiation, while BFPU buffers residual stress and self-heals microcracks, improving cycling life for high-energy-density batteries36.

Among the various materials, poly(vinylidene fluoride) (PVDF) has long been the conventional binder of choice for Li-ion batteries (LIBs) utilizing carbonaceous anodes, prized for its electrochemical stability, binding capacity, and facilitation of Li-ion transport. However, its effectiveness wanes when applied to alloy-based anodes due to weak interactions with active materials and the current collector, resulting in electrode instability under the significant volume changes37.

In contrast, both natural polymers (e.g., gelatin38, carboxymethyl cellulose (CMC)39, alginate40, pectin41) and synthetic polymers (e.g., poly(acrylic acid) (PAA)42, poly(vinyl alcohol) (PVA)43,44, polyacrylonitrile (PAN)45) have demonstrated efficacy in enhancing the cyclability of Si37, Si/C, and Si/Gr composite anodes8,46. However, their one-dimensional configuration limits interaction with the Si surface and fails to adequately restrain swelling in the electrolyte47,48. Branched polysaccharides such as gum arabic49 and xanthan gum50, and graft copolymers like NaPAA-gCMC51, PAL-NaPAA52, and PVDF-g-PtBA53, offer additional contact points with active materials for effective stress dissipation. Nevertheless, the absence of interchain connections leads to polymer chain slipping during volume expansion, which compromises the electrode's long-term stability.

To address these challenges, 3D polymer networks have been engineered, featuring multiple interchain connections that sustain mechanical integrity to withstand cycling-induced stress54,55,56. These network binders, characterized by covalent and noncovalent interactions such as hydrogen bonding and electrostatic interactions, effectively restrict continuous movement of Si particles and introduce a self-healing effect57,58. The self-healing capability not only restores the binder’s original structure but also enhances cell efficacy by re-establishing conduction pathways and mechanical resilience7. The exploration of self-healing properties in binders holds significant promise for advancing battery technologies, particularly in enclosed and intricate systems like energy storage devices.

The design of 3D polymer networks for self-healing binders relies heavily on the presence of specific structural features and functional groups that enable dynamic interactions, such as hydrogen bonding, electrostatic interactions, and covalent bonds. These interactions are critical for maintaining the structural integrity of silicon anodes during charge/discharge cycles and for restoring mechanical properties after damage. For example, functional groups like carbonyl oxygens, ether groups, and hydrogen bond donors/acceptors play a key role in facilitating self-healing by forming reversible bonds that can break and reform under stress. These insights from experimental studies provide a foundation for understanding the mechanisms behind self-healing binders.

However, identifying the optimal binder through experimental design remains a time- and resource-intensive, often prohibitively expensive process. The rapid advancements in artificial intelligence (AI) and machine learning (ML) present a considerable opportunity to revolutionize the design loop of polymeric binders for Si-based anodes. These innovations have the potential to significantly expedite and enhance the binder selection process, thereby improving performance and reducing the number of iterative experimental stages currently required.

This study builds upon previous binder design methodologies by addressing a critical gap in the field: the reliance on costly and time-consuming experimental approaches. While prior research has explored various binder materials (e.g., natural and synthetic polymers) and their mechanical properties, our approach uniquely targets the identification of key functional groups that enhance self-healing capabilities and capacity retention. We create a synergy between experimental insights and computational predictions by incorporating these structural and functional properties into our computational methods. This allows us to systematically identify the key design principles for high-performance binders, accelerating the discovery of features that positively impact the state of health (SoH), or capacity retention, of silicon-based LIBs.

Here, we have developed an ML model to predict the functional groups with the greatest impact on the capacity retention of Si-based Li-ion batteries, leveraging a dataset comprising 80 cross-linked polymer binders extracted from 450 articles detailing experimental tests on binders for silicon anodes in LIBs.

Our results revealed that an increase in the number of carbonyl oxygens (excluding carboxyl), secondary amine functional groups, and two interconnected rings positively influences the model's performance. Additionally, a reduction in the number of aromatic nitrogens and tertiary amine functional groups also positively impacts the model. To our knowledge, our research represents the first exploration of this specific area and has the potential to pave the way for the development of more efficient and enduring Li-ion batteries.

Dataset and machine learning models

Dataset

For this study, we systematically reviewed 450 articles detailing experimental tests on binders for silicon anodes in LIBs. From this pool, 160 articles reporting on the utilization of cross-linked binders were identified, considering their superior intrinsic performance. Among these, a subset of 80 binders was selected based on the feasibility of rendering their chemical structures using the KingDraw program. To ensure the robustness of our models, we used a standard train-test split along with cross-validation to divide our data for training, validation, and testing. The structures generated by the KingDraw program were saved in .mol chemical format and subsequently converted to Simplified Molecular-Input Line-Entry System (SMILES) chemical format using the OpenBabel platform. SMILES notation is a concise line notation method for representing chemical species using short ASCII strings. It offers a clear depiction of molecular structures based on established chemical rules. In the realm of machine learning, SMILES format is widely employed for constructing datasets, storing molecules, predicting molecular properties, designing drugs, and other chemical tasks. Its standardized structure facilitates analysis using machine learning algorithms, contributing to advancements in chemical research and applications.
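As an illustration of this conversion step, the short sketch below reads a .mol file and emits canonical SMILES. The authors used the OpenBabel platform; RDKit (already used later for descriptor calculation) is shown here as an equivalent route, and the file name is a hypothetical placeholder.

```python
# Minimal sketch of the .mol -> SMILES conversion step, using RDKit as an
# alternative to the OpenBabel platform described in the text.
from rdkit import Chem

def mol_file_to_smiles(path: str) -> str:
    """Read a KingDraw-exported .mol file and return its canonical SMILES."""
    mol = Chem.MolFromMolFile(path)
    if mol is None:
        raise ValueError(f"Could not parse {path}")
    return Chem.MolToSmiles(mol)

# Hypothetical file name for one of the 80 binder structures
print(mol_file_to_smiles("binder_01.mol"))
```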

Fingerprints

To evaluate polymer performance, we calculated the ratio of capacity after one hundred cycles to the initial cycle’s capacity, expressed as a percentage, which served as the target measure. The RDKit library, tailored for chemical computations, facilitated the computation of properties for each polymer. Initially considering 114 properties from the SCBDD site, we retained 73 after filtering out features with zero or undefined values, focusing on characteristics such as functional groups and molar weight.
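A minimal sketch of this featurization and target construction is shown below, assuming a table of binder SMILES strings with hypothetical first- and hundredth-cycle capacities. The RDKit fragment counters and molecular-weight descriptor stand in for the kind of properties described above; the full 114-descriptor set from the SCBDD site is not reproduced here.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors, Fragments, Lipinski

def featurize(smiles: str) -> dict:
    """Compute a small subset of RDKit descriptors for one binder structure."""
    mol = Chem.MolFromSmiles(smiles)
    return {
        "mol_wt": Descriptors.MolWt(mol),
        "carbonyl_o_excl_cooh": Fragments.fr_C_O_noCOO(mol),  # carbonyl O excluding carboxyl
        "secondary_amine": Fragments.fr_NH1(mol),
        "tertiary_amine": Fragments.fr_NH0(mol),
        "ether": Fragments.fr_ether(mol),
        "carboxylic_acid": Fragments.fr_COO(mol),
        "aromatic_rings": Lipinski.NumAromaticRings(mol),
        "rotatable_bonds": Lipinski.NumRotatableBonds(mol),
    }

# Hypothetical entry: repeat-unit SMILES with first- and 100th-cycle capacities (mAh/g)
data = pd.DataFrame([{"smiles": "CC(C(=O)O)OCCNC(=O)C=C", "cap_1": 2100.0, "cap_100": 1500.0}])
data["retention_pct"] = 100.0 * data["cap_100"] / data["cap_1"]   # target variable
X = pd.DataFrame([featurize(s) for s in data["smiles"]])          # feature matrix
y = data["retention_pct"]
```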

Alternatively, we evaluated a workflow that did not require running the RDKit library locally for feature extraction. In this approach, we used the KingDraw program to obtain .mol outputs, which were then processed through the OpenBabel site to convert them into SMILES format. The SMILES representations were subsequently inputted into the SCBDD site for the descriptor calculations, which that platform performs using the RDKit library. While the outcomes of this alternative method aligned with those of the initially mentioned approach, the direct RDKit-based approach stood out for its time efficiency.

Machine-learning models

In the realm of supervised learning, two distinct approaches were explored: regression and classification. Regression models, a cornerstone of predictive analytics, aim to establish relationships between independent and dependent variables, enabling quantitative predictions. For regression analysis, we employed random forest, ridge, and neural network algorithms. These models play a crucial role in identifying patterns within datasets and predicting numerical outcomes, especially when dealing with a relatively high number of features compared to the available data.

In contrast, for classification analysis, we utilized random forest algorithms, support vector machines, and neural networks to categorize data into distinct classes, providing a framework for sorting and labeling information. These algorithms were chosen for their ability to handle the complexity introduced by the relatively high number of features.

Results and discussion

Regression models performance and validation

In the Random Forest algorithm for regression, after fitting the data with the StandardScaler function and utilizing 5-fold cross-validation with 100 estimators, the results obtained were as follows: Mean Train R-squared: 0.83 and Mean Test R-squared: −0.07. To improve upon these results, a simpler method of data division (train-test split) was employed, allocating 20% of the data for testing. This approach yielded relatively better results than the previous setup, with Train R-squared reaching 0.82 and Test R-squared achieving 0.22. Furthermore, we analyzed feature importance and optimized hyperparameters (e.g., the number of trees, maximum depth, and minimum samples per leaf), but the model's performance on the test data remained suboptimal, highlighting the need for further refinement. Random forest, while powerful, requires large datasets to perform well. Here, the relatively small dataset and the high-dimensional feature set may make it difficult for the model to identify meaningful patterns.
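The two evaluation protocols described above can be sketched as follows, assuming `X` and `y` hold the descriptor matrix and capacity-retention target built earlier; the random seed and other defaults are assumptions rather than reported settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rf = make_pipeline(StandardScaler(), RandomForestRegressor(n_estimators=100, random_state=0))

# Protocol 1: 5-fold cross-validation, reporting mean train/test R^2
cv = cross_validate(rf, X, y, cv=5, scoring="r2", return_train_score=True)
print(np.mean(cv["train_score"]), np.mean(cv["test_score"]))

# Protocol 2: simple hold-out split with 20% of the data reserved for testing
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rf.fit(X_tr, y_tr)
print(rf.score(X_tr, y_tr), rf.score(X_te, y_te))
```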

In the Ridge algorithm, combined with feature selection, the data were scaled using the RobustScaler algorithm, and 20% of the data were designated for testing. Subsequently, the SelectKBest function was used to identify 25 important features, which were then passed to the Ridge algorithm. The outcomes revealed a Mean Train R-squared of 0.10 and a Mean Test R-squared of 0.00. Ridge regression likely struggles here because it assumes a linear relationship, which may not fit the data well if there are non-linearities or interactions between features.
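A sketch of this ridge baseline is shown below; the scoring function passed to SelectKBest (f_regression) and the regularization strength are assumptions, since the study does not report them.

```python
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

ridge = make_pipeline(
    RobustScaler(),                              # outlier-resistant scaling
    SelectKBest(score_func=f_regression, k=25),  # keep the 25 highest-scoring features
    Ridge(alpha=1.0),
)
ridge.fit(X_tr, y_tr)
print(ridge.score(X_tr, y_tr), ridge.score(X_te, y_te))   # train / test R^2
```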

Moving on to the neural network algorithm, the data were scaled using the RobustScaler algorithm, followed by division into 15-fold cross-validation sets for training and testing. A five-layer feed-forward neural network with the tanh activation function was employed. The results indicated a training R-squared of 0.88 and a testing R-squared of about 0.84. Figure 1 illustrates the changes in the loss function with respect to the number of epochs executed by the algorithm. The fluctuations observed in Fig. 1 could be attributed to various factors such as local gradients, the learning rate, the number of nodes, the depth of the network, and the limited amount of training data in each category. Nonetheless, given the overall decreasing trend depicted in the graph, the efficiency and accuracy of the model are affirmed.
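The sketch below illustrates one way to set up such a network with TensorFlow/Keras, assuming `X` and `y` are the feature matrix and retention target; the layer widths, optimizer, epoch count, and batch size are illustrative assumptions, as only the depth and activation function are reported.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold
from sklearn.preprocessing import RobustScaler

def build_regressor(n_features: int) -> tf.keras.Model:
    """Five dense layers with tanh activations, ending in a single retention output."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(128, activation="tanh"),
        tf.keras.layers.Dense(64, activation="tanh"),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(16, activation="tanh"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

X_arr, y_arr = np.asarray(X, dtype="float32"), np.asarray(y, dtype="float32")
for train_idx, test_idx in KFold(n_splits=15, shuffle=True, random_state=0).split(X_arr):
    scaler = RobustScaler().fit(X_arr[train_idx])
    model = build_regressor(X_arr.shape[1])
    model.fit(scaler.transform(X_arr[train_idx]), y_arr[train_idx],
              validation_data=(scaler.transform(X_arr[test_idx]), y_arr[test_idx]),
              epochs=200, batch_size=8, verbose=0)
```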

Fig. 1

Changes in the training and test loss of the regression neural network algorithm as a function of the number of epochs.

Grounded in game theory, SHAP values were calculated to assign importance to each feature, with positive values indicating a positive impact on the prediction and negative values signifying a negative impact. The magnitude of these values quantifies the strength of the respective effects59,60.
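A sketch of this attribution step is given below for the trained regression network, assuming `model`, `X_scaled`, and `feature_names` carry over from the preceding regression sketch; KernelExplainer is used here as a model-agnostic choice, since the exact SHAP explainer used in the study is not specified.

```python
import shap

# model, X_scaled, and feature_names are assumed from the regression sketch above
predict_fn = lambda data: model.predict(data, verbose=0).ravel()   # scalar output per sample
background = shap.sample(X_scaled, 20)                             # small background set
explainer = shap.KernelExplainer(predict_fn, background)
shap_values = explainer.shap_values(X_scaled)
shap.summary_plot(shap_values, X_scaled, feature_names=feature_names)
```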

According to Fig. 2, obtained from SHAP analysis, an increase in the following features positively influences the model's performance: the number of carbonyl oxygens (excluding carboxyl), secondary amine functional groups, two interconnected rings, methoxy and ester functional groups, carboxylic acid (including unsaturated carboxylic acids), the number of carbonyl oxygens, and the number of benzene and aromatic rings. Additionally, a reduction in rotatable bonds, the number of unsaturated and saturated carbocycles, the number of radical electrons, the number of aromatic nitrogens, and the tertiary amine functional group also positively impacts the model.

Fig. 2

The effect of features on the model performance using SHAP analysis on regression neural network algorithm.

Classification models performance and validation

In this section, we delve into the outcomes of machine learning classification models. To facilitate analysis, we categorized capacity efficiency values into three groups: “low,” “good,” and “perfect,” representing varying levels of capacity retention after 100 cycles. Specifically, values ranging from 0 to 34.6 were labeled as “low,” those from 34.6 to 69.33 as “good,” and those from 69.33 to 104 as “perfect.” This classification framework aids in comprehending how effectively the models predict capacity efficiency across diverse retention levels. In the Support Vector Machine (SVM) algorithm, the data underwent scaling using the StandardScaler function and was subsequently partitioned using 5-fold cross-validation. The Poly kernel was selected due to its minimal error compared to other available options.
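The labelling scheme and SVM baseline can be sketched as follows, with `X` the descriptor matrix and `y` the retention percentages; the bin edges follow the thresholds above, while the SVC defaults (degree, regularization) are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Three-class labels from the retention thresholds: 0 = "low", 1 = "good", 2 = "perfect"
labels = np.digitize(np.asarray(y), bins=[34.6, 69.33])

svm = make_pipeline(StandardScaler(), SVC(kernel="poly"))
cv = cross_validate(svm, X, labels, cv=5,
                    scoring=["accuracy", "precision_weighted", "recall_weighted", "f1_weighted"],
                    return_train_score=True)
print(np.mean(cv["test_accuracy"]), np.mean(cv["test_f1_weighted"]))
```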

For the SVM, the train accuracy, precision, recall, and F1-score were 0.68, 0.73, 0.68, and 0.64, respectively, while the test metrics were 0.54, 0.59, 0.54, and 0.44, indicating moderate performance but poor generalization. For the Random Forest algorithm, the data was standardized using the StandardScaler command and split using 5-fold cross-validation. The number of estimators was fixed at 5.

The Random Forest model achieved strong training performance (accuracy: 0.91, precision: 0.92, recall: 0.91, F1-score: 0.91) but showed significant overfitting, with test metrics dropping to 0.52, 0.63, 0.52, and 0.51 for accuracy, precision, recall, and F1-score respectively. While the SVM and random forest models achieved low test accuracies (SVM: 0.54, Random Forest: 0.52), they provide valuable insights into the dataset’s complexity and the challenges of predicting binder performance. These baseline results reinforced the need for more sophisticated models, such as neural networks, to effectively handle the high-dimensional and complex nature of the dataset.

In the neural network algorithm with feature selection, the data was scaled using the StandardScaler and partitioned using 20-fold cross-validation. Subsequently, features were selected using Lasso with an alpha value of 0.0003, retaining the top 50 features. The alpha value of 0.0003 was determined through manual experimentation, where we tested a range of alpha values and selected the one that provided the best balance between feature selection and model performance. This value was chosen because it effectively reduced the number of features while maintaining a high level of predictive accuracy. A 3-layer neural network was then constructed, with 150 epochs, a batch size of 2, and a patience of 5. The neural network with Lasso feature selection demonstrated excellent performance on both training (accuracy: 0.95, precision: 0.95, recall: 0.95, F1-score: 0.95) and test data (accuracy: 0.90, precision: 0.90, recall: 0.90, F1-score: 0.89), highlighting its superior generalization ability and high accuracy without overfitting or underfitting.
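A sketch of this pipeline is shown below, assuming `X` and the three-class `labels` from the earlier sketches; the ranking of features by absolute Lasso coefficient and the hidden-layer widths are assumptions consistent with, but not taken from, the description above.

```python
import numpy as np
import tensorflow as tf
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.0003, max_iter=10000).fit(X_scaled, labels)
top50 = np.argsort(np.abs(lasso.coef_))[::-1][:50]     # indices of the 50 strongest coefficients
X_sel = X_scaled[:, top50]

clf = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_sel.shape[1],)),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(3, activation="softmax"),    # low / good / perfect
])
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
early = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
clf.fit(X_sel, labels, epochs=150, batch_size=2, validation_split=0.2, callbacks=[early], verbose=0)
```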

This model achieves a commendable accuracy of approximately 90% in correctly predicting the data within each group (low, good, perfect). Figure 3 illustrates the cost function of the neural network classification algorithm against the number of epochs. The fluctuations observed in this graph can be attributed to the significant number of parameters involved in the algorithm, as well as the limited amount of test data and the complexity of the model. It is noteworthy that despite these fluctuations, the results obtained with the current parameter values surpass all previous implementations. Thus, these specific parameter values were selected for the algorithm.

Fig. 3

Changes in the training and test loss of the neural network classification algorithm with feature selection as a function of the number of epochs.

The confusion matrix of this algorithm is depicted in Fig. 4. Furthermore, the absolute impact of important features on each group was explored through SHAP analysis (Fig. 5).

Fig. 4

Confusion matrix of the neural network algorithm with feature selection for the training and test data.

Fig. 5

SHAP analysis and display of the absolute magnitude of the effect of important features on each perfect, good, and low group for the neural network with feature selection.

According to Fig. 5, the carboxylic acid functional group, carbonyl oxygens, ether functional group, average molecular weight, and saturated carbocycles exhibit the greatest absolute effect on the "perfect" group. The neural network algorithm without feature pre-selection underwent the same execution steps as the previous model, with the exception that all features were fed to the network. The algorithm was trained for 250 epochs with a batch (category) size of 16, and the number of training epochs before adjusting the learning rate was set to 50. The performance metrics demonstrated a Mean Train Accuracy of 0.97 and a Mean Test Accuracy of 0.96.
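A sketch of this full-feature network is given below, assuming `X_scaled` and `labels` as before; the layer sizes and the learning-rate reduction factor are assumptions, whereas the epoch count, batch size, and 50-epoch patience follow the description above.

```python
import tensorflow as tf

full_clf = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_scaled.shape[1],)),
    tf.keras.layers.Dense(128, activation="tanh"),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
full_clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Wait 50 epochs without validation-loss improvement before reducing the learning rate
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", patience=50, factor=0.5)
full_clf.fit(X_scaled, labels, epochs=250, batch_size=16,
             validation_split=0.2, callbacks=[reduce_lr], verbose=0)
```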

Fig. 6

Changes in the training and test loss of the neural network algorithm without feature pre-selection as a function of the number of epochs.

The hyperparameters for the neural network, including the number of layers, neurons, epochs, batch size, and patience, were manually chosen based on iterative experimentation and performance evaluation. While we did not employ automated optimization techniques like grid search or Bayesian optimization, the selected configuration (e.g., a five-layer network with the tanh activation function) demonstrated strong performance, achieving a test accuracy of 96%. To prevent overfitting, we relied on techniques such as early stopping (with a patience parameter) and robust data splitting into training and test sets. Although additional regularization methods like dropout or weight decay were not tested in this study, the high test accuracy and consistent performance across training and test data suggest that the model generalizes well.

Figure 6 illustrates the changes in the loss function with respect to the number of epochs. To reduce loss fluctuations in the model without feature selection, the following measures were taken:

  • Cross-validation: a 15-fold cross-validation was implemented to ensure a balanced and robust evaluation of model performance across different splits of the dataset.

  • Early stopping: the EarlyStopping callback monitored validation loss and halted training when no further improvement was detected, helping to avoid overfitting and stabilize loss trends.

  • Robust scaling: the dataset was scaled using the RobustScaler to mitigate the influence of outliers and improve convergence stability during training.

  • Random seed setting: fixed random seeds were applied across libraries (NumPy, TensorFlow, and Random) to ensure consistent and reproducible results.

  • Network architecture: a balanced architecture with progressively narrowing dense layers and the tanh activation function contributed to smoother learning curves.
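The first few of these measures can be sketched in a few lines, assuming `X` is the feature matrix; the seed value itself is arbitrary.

```python
import random
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import RobustScaler

SEED = 42                                    # arbitrary fixed value for reproducibility
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

X_robust = RobustScaler().fit_transform(X)   # outlier-resistant scaling of the feature matrix
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                              restore_best_weights=True)
```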

In comparison to the algorithm with feature selection described earlier, the fluctuations in loss are reduced, and even at its peak the loss remains relatively small, approximately 0.12, which can be considered negligible. In Fig. 7, the confusion matrix of the neural network algorithm is presented, providing insights into the accuracy of predictions across the different groups. For the training data, only one misclassification is observed (a polymer belonging to the "good" group is incorrectly predicted as a member of the "low" group), while for the test data, all groups are accurately predicted. According to Fig. 8, carbonyl oxygens, saturated carbocycles, the ester functional group, average molecular weight, and exact molecular weight exert the greatest absolute effect on the "perfect" group.

Fig. 7

Confusion matrix of the neural network algorithm without feature pre-selection.

Fig. 8

Impact of some important features on each perfect, good, and low group for the neural network without feature pre-selection according to SHAP analysis.

Here, the neural network without feature selection outperformed the version with feature selection, achieving a test accuracy of 0.96 compared to 0.90. This suggests that the full set of features may contain valuable information that is lost during feature selection, even when using methods like Lasso regression. While we attempted various manual refinements to the feature selection process, none resulted in better performance than the model without feature selection. This could indicate that the feature selection criteria need further refinement or that the interactions between features are too complex to be captured by simple selection methods. Alternatively, the neural network may inherently handle high-dimensional data more effectively, leveraging all available features to identify subtle patterns that contribute to improved performance.

It is important to note that categorizing the binders into three distinct capacity retention groups (low, good, and perfect) was intended to improve interpretability and allow a more meaningful inspection of binder efficiency. Consequently, some class imbalance was introduced unintentionally. We addressed this issue to some extent through careful data splitting and cross-validation; future work may explore techniques such as data augmentation or synthetic data generation to balance the dataset. Despite this, our results demonstrate that the model was still able to achieve high precision and reliable predictions. We recognize that a more balanced dataset could further enhance model performance, and we hope that future research, either by us or by other researchers, will develop a more comprehensive and balanced dataset to refine these findings.

Guidelines for designing self-healing binders

Guidelines for design provide valuable insights and heuristics, offering logical approaches to crafting polymers that meet specific property criteria. In this study, we introduce a set of design principles for selecting functional groups aimed at enhancing the self-healing capabilities and prolonging the lifespan of Si-based LIBs. Here, SHAP analysis is employed as a standardized and unbiased method to comprehensively elucidate the influence of each feature on the model’s prediction. Positive and negative feature contributions reveal the factors that most strongly influence binder capacity retention, guiding material scientists in optimizing chemical compositions or cross-linking strategies to enhance performance. According to the results of the algorithms, the presence of polar functional groups (e.g., -OH, -COOH, -C(=O)NH2, and others depicted in Fig. 9) in the polymer is crucial when designing a more efficient binder for Si anodes. These groups exhibit strong interactions with the silanol groups on the Si surface and facilitate the dissolution of the binder in the solvent for streamlined processing. For example, the presence of carbonyl oxygens (excluding carboxyl groups) could lead to strong hydrogen bonds and electrostatic interactions, which enhance the self-healing properties of binders. These interactions help maintain structural integrity during the large volume changes of silicon anodes, preventing pulverization and improving cycle life. Ether groups contribute to the flexibility and elasticity of polymer chains, allowing the binder to accommodate volume changes during charge/discharge cycles. Additionally, ether groups can participate in hydrogen bonding, further enhancing the self-healing capabilities of the binder. Based on these findings, we propose the following design principles for self-healing polymer binders:

  • Incorporate functional groups: Prioritize the inclusion of carbonyl oxygens, ether groups, and hydrogen bond donors/acceptors to enhance self-healing and mechanical properties.

  • Optimize polymer architecture: Design polymers with interconnected rings and balanced rigidity/flexibility to accommodate volume changes and dissipate stress effectively.

  • Leverage polar interactions: Use polar functional groups to improve adhesion, solubility, and electrochemical stability.

These insights not only guide the design of binders for silicon anodes but also provide a framework for developing advanced materials for other energy storage systems.
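As a practical illustration, the sketch below applies these principles as a quick screen on a hypothetical candidate repeat unit, counting the groups the SHAP analysis favours increasing or reducing; it is a heuristic filter, not a substitute for the trained model.

```python
from rdkit import Chem
from rdkit.Chem import Fragments, Lipinski

def screen_binder(smiles: str) -> dict:
    """Count the functional groups highlighted by the SHAP analysis for one candidate."""
    mol = Chem.MolFromSmiles(smiles)
    return {
        # groups the analysis favours increasing
        "carbonyl_o_excl_cooh": Fragments.fr_C_O_noCOO(mol),
        "ether": Fragments.fr_ether(mol),
        "carboxylic_acid": Fragments.fr_COO(mol),
        "secondary_amine": Fragments.fr_NH1(mol),
        # groups the analysis favours reducing
        "tertiary_amine": Fragments.fr_NH0(mol),
        "aromatic_nitrogens": Fragments.fr_Ar_N(mol),
        "rotatable_bonds": Lipinski.NumRotatableBonds(mol),
    }

print(screen_binder("OCC(O)COC(=O)C=C"))     # hypothetical candidate repeat unit
```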

Fig. 9

Structural factors of binders with the most positive effect on battery capacity retention after 100 cycles relative to the first cycle.

The primary limitation of this study was the relatively small dataset of 80 cross-linked polymer binders. While this dataset was carefully curated from 450 articles, its size may restrict the generalizability of the ML models. A larger and more diverse dataset would improve the robustness of the predictions and enable the models to capture more complex patterns. While our ML models provide valuable insights, the predictions have not yet been validated experimentally. We plan to synthesize and test the top-performing binders predicted by our model to validate their performance experimentally.

Conclusion

Considering that network-structured polymers held together by permanent covalent bonds exhibit higher bond energies than dynamic covalent bonds (such as disulfide bridges, alkoxyamine bonds, Diels–Alder adducts, hydrazone bonds, and boronic ester bonds) and non-covalent supramolecular interactions (including hydrogen bonding, host-guest interactions, metal coordination bonds, and electrostatic interactions), and therefore cannot reversibly break and re-form to heal damage, it is advisable to prioritize polymers containing non-covalent bonds in their composition to enhance the capacity retention of Si-based LIBs during extended cycles.

In light of the results from the regression neural networks algorithm, which demonstrated superior performance compared to other algorithms, the following considerations are recommended for the design of polymer binders (where applicable) to leverage their self-healing properties for prolonged lifespan and enhanced capacity of LIBs based on Si anodes:

  • Increasing the presence of carbonyl oxygens (excluding carboxylic acid), secondary amine functional groups, two interconnected rings, methoxy functional groups, carboxylic acid (including unsaturated carboxylic acids), the number of carbonyl oxygens, and the number of benzene and aromatic rings.

  • Reducing the number of rotatable bonds, unsaturated and saturated carbocycles, radical electrons, aromatic nitrogen, and tertiary amine functional groups.