Introduction

Chemical mixtures are widely used in chemical processes and devices such as energy storage and conversion1,2,3,4, chemical reactions and catalysis5,6,7, and environmental engineering8,9,10. Often, the mixture chemistry and compositions are carefully designed to achieve higher device performances. In particular, battery electrolytes, as mixtures of salts and solvents, have been optimized to facilitate ion transport, prevent electron transfer, and stabilize electrode-electrolyte interfaces for an energy-dense and durable battery system11,12,13,14.

The design and optimization of electrolyte mixtures remain challenging due to the complexity of mixture chemistry and compositions, as well as the high experimentation cost15,16. Physics-based modeling offers a solution by probing the underlying mixture physics and rationalizing the design principles for high-performing mixtures. Among physics-based mixture modeling techniques, molecular simulation is a powerful tool to study the interactions and dynamic evolution inside a complex mixture system, but it can be limited to time and length scales due to its high computational costs17,18. Alternatively, chemical physicists proposed empirical function relationships to describe mixture physics. For example, Redlich-Kister (R-K) polynomials19 were designed for modeling mixture thermodynamics, and Arrhenius equation20 was proposed to describe the temperature dependence of chemical reactions and other dynamic behaviors. Although they may provide decent model accuracy and indicate intrinsic physical behaviors such as reaction energy barriers, these empirical relationships are lacking in predictive power when new chemical species are provided. Emerging data-driven methods21,22,23,24,25,26,27 can potentially bridge the gap in the predictive modeling of electrolyte mixtures28,29,30,31. Notably, with a linear regression method, Kim et al.32 discovered a strong correlation between the oxygen content in battery electrolytes and lithium-metal-cell Coulombic efficiencies. Bradford et al.33 developed a graph machine learning model of solid polymer electrolytes (SPEs) and predicted ionic conductivities of thousands of new SPEs. Furthermore, the differentiability of modern deep learning models provides a new opportunity for unifying physics-based and data-driven models34,35,36,37,38,39. 
In particular, Guan proposed a general differentiable framework merging thermodynamic modeling and deep learning for multi-component mixtures, in which all thermodynamic observables, including thermochemical quantities and phase equilibria, can be auto-differentiated, allowing the models to be learned by gradient-based optimization37. It was subsequently extended to a more comprehensive framework of differentiable materials modeling and design, including the full processing-structure-properties-performance relationships40.

In this work, we leverage geometric deep learning (GDL) for battery electrolyte modeling and optimization. In GDL, the necessary geometric priors are applied as constraints on the model space to improve model efficiency41,42. In particular, we develop a differentiable GDL model of chemical mixtures, DiffMix, and apply it to guide robotic experimentation towards fast-charging battery electrolytes. The GDL component is designed to transform the molecular species, compositions, and environment conditions into physical coefficients in predefined mixture physics laws, where the Redlich-Kister (R-K) mixing theory and the Vogel-Fulcher-Tammann (VFT) model are selected for mixture thermodynamic and transport properties, respectively. Specifically, we leverage graph neural networks43 for molecular-level representation learning and design a customized neural network to learn mixture representations while preserving the permutation invariance of mixture components in DiffMix. We test the predictive power of DiffMix on a non-electrolyte binary mixture dataset of excess molar enthalpies and excess molar volumes, and thereafter on a large-scale simulation dataset of electrolyte ionic conductivities. We compare our model with its purely data-driven counterparts and show superior prediction accuracy and robustness. Further, with our previously built robotic experimentation setup, Clio16, we demonstrate a differentiable optimization of battery electrolyte mixtures based on the gradient information from DiffMix auto-differentiation44. We improve the ionic conductivity values by over 18.8% within 10 experimental steps in the evaluated chemical space, enabling the fast-charging design of battery systems. Our method extends the modeling techniques of battery electrolyte mixtures by unifying physics models and geometric deep learning and by realizing the differentiable optimization of battery electrolyte properties.

DiffMix: combining physics and geometric deep learning for modeling chemical mixtures

Our model, DiffMix, combines physics and geometric deep learning to build a differentiable and predictive model for chemical mixtures, as shown in Fig. 1a. Taking as input the chemical graphs, compositions, and environment condition vector (g, x, E), DiffMix processes them with two components, geometric deep learning Gθ(g, x, E) and physics laws f(·, x, E), and then outputs the mixture property Pm = f(Gθ(g, x, E), x, E) in an end-to-end differentiable framework.

Fig. 1: Differentiable Modeling and Optimization of Chemical Mixtures with DiffMix.

a Model Architecture. Input: chemical graphs g = {gi}, compositions x = {xi} and environment condition vector E (e.g., temperature, pressure). Output: mixture property Pm. DiffMix combines geometric deep learning (GDL) and physics laws in a differentiable framework. The GDL component, Gθ(·), transforms (g, x, E) to coefficients in physics laws, {CRK, Ci}, via graph convolutional operations and MixtureNet, which convolves both chemical identities and (x, E) to learn mixture representations. θ is the set of learnable parameters. Two example physics laws, f({CRK, Ci}, x, E), the Vogel-Fulcher-Tammann (VFT) law and the Redlich-Kister (R-K) mixing law, are included here and can be generalized. Overall, the mixture property output can be written as Pm = f(Gθ(g, x, E), x, E). b Detailed architecture of MixtureNet. Input is \((\mathbf{g}',\mathbf{x},\mathbf{E})\), where \(\mathbf{g}'\) is the graph embedding after graph convolutions. Input is processed by fully connected neural networks (FCNN), SubNet and PairNet, to learn the per-substance and pairwise-interaction embeddings, {si} and {pij}, respectively. Mixture embedding m is created by applying a pooling operator to {si, pij}, followed by another FCNN, MixNet, to produce physical coefficients. The design of the pooling operator and MixNet depends on the preselected physics laws. c Differentiable optimization and robotic experimentation for fast-charging battery electrolytes. With a DiffMix model trained on battery electrolyte ionic conductivities, auto-differentiation provides the gradient \(\frac{dP_{\mathrm{m}}}{dx_{\mathrm{i}}}\) with respect to the input compositions. We run a gradient-ascent algorithm on composition space and guide a robotic experimentation setup, Clio16, for fast-charging battery electrolyte design.

Physics models for thermodynamics of mixing and ion transport

The selection of physics models, f(·, x, E), depends on the mixture properties of interest. Here, we take the mixing thermodynamics of binary non-electrolyte mixtures and ion transport of multicomponent electrolyte mixtures as examples, which can be further generalized to other forms45,46.

To describe the thermodynamics of mixing of non-electrolyte mixtures, a polynomial expansion can be used to represent the excess function of mixing ΔPm, i.e., the difference between the mixture thermodynamic quantity Pm and the linear combination of the pure-component properties, Σi xiPi, where Pi is the property of species i. This approach has been successfully applied in differentiable thermodynamic modeling37, with the Redlich-Kister (R-K) polynomial19 being a popular choice:

$$\Delta P_{\mathrm{m}}=\sum_{i<j}\left[x_{\mathrm{i}}x_{\mathrm{j}}\sum_{k=0}^{N}C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}(x_{\mathrm{i}}-x_{\mathrm{j}})^{k}\right]$$
(1)

where xi and xj are mole fractions of species i and j, and \(C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}\) is the R-K polynomial coefficient of order k between the two species. Equation (1) preserves the permutation invariance of chemical species i and j when the odd-order coefficients follow the parity rule of permutation, i.e., they change sign when i and j are swapped. The mixture thermodynamic property Pm can be further obtained by:

$$P_{\mathrm{m}}=\sum_{i<j}\left[x_{\mathrm{i}}x_{\mathrm{j}}\sum_{k=0}^{N}C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}(x_{\mathrm{i}}-x_{\mathrm{j}})^{k}\right]+\sum_{i}x_{\mathrm{i}}P_{\mathrm{i}}$$
(2)

Equation (2) preserves permutation invariance over mixture components and can be applied to a wide range of mixture thermodynamic properties.
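As an illustration of Equation (2), the following Python sketch evaluates the R-K mixing law for given coefficients and checks numerically that relabeling the two components of a binary mixture leaves the result unchanged when the parity rule \(C_{\mathrm{RK},\mathrm{ji}}^{\mathrm{k}}={(-1)}^{k}C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}\) is applied. The function names, data layout, and numerical values are hypothetical choices for this example and are not taken from the DiffMix implementation.

```python
from itertools import combinations


def rk_mixture_property(x, pure, coeffs):
    """Equation (2): linear mixing term plus Redlich-Kister excess term.

    x      : list of mole fractions x_i (assumed to sum to 1)
    pure   : list of pure-component properties P_i
    coeffs : dict mapping (i, j) with i < j to [C^0, C^1, ..., C^N]
    """
    ideal = sum(xi * pi for xi, pi in zip(x, pure))
    excess = 0.0
    for i, j in combinations(range(len(x)), 2):
        poly = sum(c * (x[i] - x[j]) ** k for k, c in enumerate(coeffs[(i, j)]))
        excess += x[i] * x[j] * poly
    return ideal + excess


def swap_binary(x, pure, coeffs):
    """Relabel the two components of a binary mixture.

    The coefficients follow the parity rule C_ji^k = (-1)^k * C_ij^k.
    """
    flipped = [c * (-1) ** k for k, c in enumerate(coeffs[(0, 1)])]
    return x[::-1], pure[::-1], {(0, 1): flipped}


# Hypothetical inputs, for illustration only (not fitted to any data).
x = [0.3, 0.7]
pure = [10.0, 20.0]
coeffs = {(0, 1): [4.0, -1.5, 0.5]}  # N = 2

p_original = rk_mixture_property(x, pure, coeffs)
p_swapped = rk_mixture_property(*swap_binary(x, pure, coeffs))
print(p_original, p_swapped)  # the two values agree up to rounding
```

In this sketch the coefficients are fixed numbers; in a learned setting they would instead be produced by a model from the mixture inputs.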

On the ion transport properties, we focus on the ionic conductivities of battery electrolytes. A higher ionic conductivity will reduce the ion transfer resistance between electrodes and lessen the formation of electrolyte concentration polarization, therefore enabling fast-charging battery applications47. Here, we select the Vogel-Fulcher-Tammann (VFT) model to capture the temperature dependence48 as:

$$P_{\mathrm{m}}=C_{1}e^{-\frac{C_{2}}{T-C_{3}}}$$
(3)

where T is the temperature and {Ci} is a set of physical coefficients.
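To make the temperature dependence in Equation (3) concrete, the short Python sketch below evaluates the VFT expression at two temperatures. The coefficient values are hypothetical and chosen only for illustration, and the use of kelvin with T > C3 is an assumption of this example rather than a detail stated above.

```python
import math


def vft(T, c1, c2, c3):
    """Equation (3): P_m = C1 * exp(-C2 / (T - C3)).

    This sketch assumes T and C3 share the same temperature unit and T > C3.
    """
    if T <= c3:
        raise ValueError("this VFT sketch assumes T > C3")
    return c1 * math.exp(-c2 / (T - c3))


# Hypothetical coefficients, for illustration only (not fitted to any data).
C1, C2, C3 = 100.0, 400.0, 150.0

kappa_low = vft(253.15, C1, C2, C3)   # -20 degrees Celsius, expressed in kelvin
kappa_high = vft(303.15, C1, C2, C3)  # 30 degrees Celsius, expressed in kelvin
print(kappa_low, kappa_high)
```

With a positive C2, the predicted property increases monotonically with temperature above C3 and approaches C1 at high temperature.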

Geometric deep learning (GDL) to learn mixture representations

Equations (2) and (3) describe thermodynamic and ion transport laws that conventionally rely on empirical fitting to experimental data to obtain the physical coefficients, {CRK, Ci}. However, the functional relationship between the mixture input (g, x, E) and the physical coefficients {CRK, Ci} remains unknown. The GDL component is therefore introduced to replace the physical coefficients with learnable GDL functions, {CRK, Ci} = Gθ(g, x, E). In this way, the mixture physics model becomes predictive and fully differentiable from chemical structures to properties. The first step in the GDL component is a graph convolution43 transformation over each component graph gi to obtain the graph-level feature vector \(\mathbf{g}_{\mathrm{i}}'\) for component i, as shown in Fig. 1a. In the second step, \(\mathbf{g}_{\mathrm{i}}'\) is concatenated with the compositions and environment conditions and processed by MixtureNet to learn the mixture-level representations. The detailed model architecture of MixtureNet is shown in Fig. 1b. Each concatenated mixture component vector, vi = \([\mathbf{g}_{\mathrm{i}}',x_{\mathrm{i}},\mathbf{E}]\), passes through two weight-sharing fully connected neural networks (FCNN), SubNet and PairNet, to learn the per-substance and pairwise-interaction embeddings, {si} = {fsub(vi)} and {pij} = {fpair(vi, vj)}, respectively, where fsub(·) and fpair(·) represent the neural network transformations. Depending on the mixture physics laws, {si} and {pij} are combined in a certain form to produce the physical coefficients. For the VFT model in Equation (3) and battery electrolyte mixtures, the mixture feature vector m is created by pooling {si} and {pij}, m = [Σi xi si, Σij xi xj pij], which provides a permutation-invariant map from {si} and {pij} to m by concatenating the weighted sums of substance and pair embeddings. The physical coefficients in the VFT model, {Ci}, are a function of m via another FCNN, MixNet, {Ci} = {fmix(m)}. 
When no physics laws are applied, MixNet simply outputs the mixture property. A special case exists in the mixing law of thermodynamics and the R-K polynomial-based model in Equation (2), due to the intrinsic per-substance dependence of Pi and the pairwise-interaction dependence of \(\{C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}\}\). They can be modeled directly from SubNet and PairNet, and our implementation follows: {Pi} = {fs2p(si)} and \(\{|C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}|\}=\{(f^{\mathrm{s2RK}}([\mathbf{s}_{\mathrm{i}},\mathbf{s}_{\mathrm{j}}])+f^{\mathrm{s2RK}}([\mathbf{s}_{\mathrm{j}},\mathbf{s}_{\mathrm{i}}]))/2\}\), where fs2p(·) and fs2RK(·) are two separate FCNNs that learn the pure-component property from substance embeddings and the R-K coefficients from pairs of substance embeddings, respectively. An alternative implementation is to replace the concatenated substance embeddings [si, sj] with pij. We further introduce the mean pooling of permuted sequences by dividing the fs2RK(·) summation by two so that \(|C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}|=|C_{\mathrm{RK},\mathrm{ji}}^{\mathrm{k}}|\), where the absolute value symbol is added since the R-K coefficients follow permutation parity, \(C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}=C_{\mathrm{RK},\mathrm{ji}}^{\mathrm{k}}\cdot(-1)^{\mathrm{k}}\). We set \(C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}=|C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}|\) when chemical i’s molecular weight is smaller than j’s. No additional pooling operations are required. More details about the model implementation can be found in the Methods section. 
Note that the geometric deep learning concept refers to approaches that incorporate appropriate geometric priors, i.e., information on the structure space and symmetry properties of input signals41. In this work, we preserve permutation invariance over components with the mixture pooling operator in the VFT model and with the intrinsic permutation invariance of the R-K model. The preserved permutation invariance improves model robustness by yielding identical mixture properties when mixture component sequences are permuted. To confirm this, we reverse the co-solvent sequences and recalculate the loss on the permuted testing set, as reported in the Results section.
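The permutation invariance of the composition-weighted pooling can be checked with a small Python sketch. Here `f_sub` and `f_pair` are toy, hand-written stand-ins for SubNet and PairNet (hypothetical functions, not trained networks), and the pair sum runs over all ordered pairs with i ≠ j, which is an assumption made for this example so that invariance holds even for a non-symmetric pair function.

```python
from itertools import permutations


def f_sub(v):
    """Toy stand-in for SubNet: component vector -> substance embedding."""
    return [v[0] + v[1], v[0] * v[1]]


def f_pair(vi, vj):
    """Toy stand-in for PairNet: ordered pair of vectors -> pair embedding."""
    return [vi[0] * vj[0], vi[1] - 2.0 * vj[1]]


def mixture_embedding(features, x):
    """m = [sum_i x_i s_i, sum_{i != j} x_i x_j p_ij], concatenated."""
    n = len(x)
    s_sum = [0.0, 0.0]
    p_sum = [0.0, 0.0]
    for i in range(n):
        s = f_sub(features[i])
        s_sum = [a + x[i] * b for a, b in zip(s_sum, s)]
        for j in range(n):
            if i != j:
                p = f_pair(features[i], features[j])
                p_sum = [a + x[i] * x[j] * b for a, b in zip(p_sum, p)]
    return s_sum + p_sum  # list concatenation


# Hypothetical component vectors and mole fractions, for illustration only.
features = [[1.0, 0.5], [2.0, -1.0], [0.3, 4.0]]
x = [0.2, 0.5, 0.3]
m_ref = mixture_embedding(features, x)

# Largest deviation of the embedding over all orderings of the components.
max_dev = 0.0
for perm in permutations(range(len(x))):
    m = mixture_embedding([features[i] for i in perm], [x[i] for i in perm])
    max_dev = max(max_dev, max(abs(a - b) for a, b in zip(m, m_ref)))
print(max_dev)  # zero up to floating-point rounding
```

Because the embedding is built only from sums over components and pairs, any downstream network applied to it inherits the same invariance.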

In the graph convolution transformation, we perform an ablation study on the impact of 3-dimensional (3D) molecular structural information by introducing DimeNet++49,50, which operates on 3D molecular conformers generated and optimized by RDKit51, and comparing it with the original graph convolutions on 2-dimensional (2D) molecular graph objects. More details are described in the Methods section. Furthermore, to benchmark the effectiveness of combining physics laws with data-driven models, we design a purely data-driven baseline, GNN-only, a graph-neural-network model without physics incorporated, which is created by removing the mixture physics model from DiffMix. In the VFT variant of the GDL component, instead of outputting {Ci} as the physics law coefficients, the GNN-only model ignores the physics laws and directly produces the mixture properties. All models are evaluated on two thermodynamic datasets of binary non-electrolyte mixtures and one transport property dataset of battery electrolyte mixtures. The thermodynamic data include literature-curated excess molar enthalpies (631 data points) and excess molar volumes (1,069 data points). For electrolytes, we prepare an ionic conductivity dataset that contains 24,822 mixtures of single-salt-ternary-solvent electrolyte solutions, generated by the Advanced Electrolyte Model (AEM)52,53. More data generation details can be found in the Methods section. In the Supplementary Information (SI), we also test a data-driven variant with non-graph Morgan fingerprints for molecule representation54 and without permutation invariance of mixture component sequences.

DiffMix-guided robotic experimentation and optimization for battery electrolytes

Differentiability enables gradient-based optimization for materials modeling and design37,40. With auto-differentiation44 on a trained DiffMix model, we can conveniently obtain the gradient of the mixture property output with respect to the input compositions, \(\frac{dP_{\mathrm{m}}}{dx_{\mathrm{i}}}\), and thereafter navigate the mixture chemical space to optimize the mixture property objective. In Fig. 1c, we illustrate battery electrolyte optimization on a ternary co-solvent composition space, maximizing the ionic conductivity via a gradient-ascent algorithm and guiding our previously developed robotic experimentation setup, Clio16, to improve the electrolyte ion transport properties for fast-charging batteries. Note that other factors, such as electrochemical stability and interfacial reactivity, may also play an important role in fast-charging battery electrolyte design and need to be addressed in a future study.
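A minimal Python sketch of gradient ascent on a ternary composition space is given below. It is not the DiffMix or Clio workflow: `toy_conductivity` is a hypothetical smooth surrogate standing in for a trained model, central finite differences stand in for auto-differentiation, and the step size and the clip-and-renormalize treatment of the composition constraint are assumptions of this example.

```python
def toy_conductivity(x):
    """Hypothetical surrogate objective (NOT DiffMix), peaked at `target`."""
    target = (0.2, 0.5, 0.3)
    return 10.0 - 20.0 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def numerical_gradient(f, x, h=1e-6):
    """Central finite differences, standing in for auto-differentiation."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def gradient_ascent(f, x0, step=0.01, n_steps=10):
    """Follow the gradient uphill while keeping x a valid composition."""
    x = list(x0)
    history = [f(x)]
    for _ in range(n_steps):
        g = numerical_gradient(f, x)
        mean_g = sum(g) / len(g)
        # Remove the gradient component that would change sum(x), then step.
        x = [xi + step * (gi - mean_g) for xi, gi in zip(x, g)]
        # Clip to non-negative fractions and renormalize so they sum to 1.
        x = [max(xi, 0.0) for xi in x]
        total = sum(x)
        x = [xi / total for xi in x]
        history.append(f(x))
    return x, history


x_final, history = gradient_ascent(toy_conductivity, [0.8, 0.1, 0.1])
print(x_final, history[0], history[-1])
```

In practice the surrogate would be replaced by the trained model and the finite-difference gradient by its auto-differentiated counterpart; running several starting points in parallel gives a batch of trajectories.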

Results

Differentiable modeling on thermodynamic and transport properties of chemical mixtures

We start our result analysis with excess molar enthalpies (\(H_{\mathrm{m}}^{\mathrm{E}}\)) and excess molar volumes (\(V_{\mathrm{m}}^{\mathrm{E}}\)) of binary non-electrolyte mixtures. The performances of the DiffMix and GNN-only models are summarized in Table 1. First, we do not observe a significant performance boost when replacing the 2D graph convolutions with 3D-information-incorporated DimeNet++ models. This may be attributed to the fact that the datasets in our work are real-world mixtures, where an individual molecule may have spatially varying 3D coordinates depending on the local environment, so one fixed configuration generated by RDKit may not be representative enough. Second, we find that the DiffMix and DiffMix-3D models, built on the known physics prior, outperform the GNN-only and GNN-only-3D models. With the DiffMix model, we achieve mean-absolute-errors (MAEs) of 0.033 ± 0.009 (cm³/mol) and 5.10 ± 0.32 (J/mol) for excess molar volumes and excess molar enthalpies, respectively.

Table 1 Model Performance on Thermodynamic Properties and Ionic Conductivities

Further, we investigate the predictive power of DiffMix and DiffMix-3D on ionic conductivities (κ) of multi-component electrolyte solutions. With the 24,822 ionic conductivity data points, we train both models and compare them with the GNN-only and the GNN-only-3D baseline models. The prediction accuracy on the testing sets is shown in Table 1. Compared with thermodynamic results, the accuracy improvement by adding physics priors in DiffMix is not as significant here, which holds for both 2D and 3D model variants. This may be attributed to the limited physical capacity of the VFT model in Equation (3), but further investigation is required, such as testing alternative physics laws for ionic conductivities. Here, with DiffMix model, we achieve the MAE of 0.044 ± 0.005 (mS/cm) considering the maximum ionic conductivity above 12 (mS/cm) in the training set.

It is also worth noting that DiffMix-3D models train at least two times more slowly than DiffMix models across the tasks, while we do not observe a distinct performance improvement when incorporating 3D information in the graph convolution input. Therefore, we focus our further result analysis on the machine learning models without 3D information. First, we confirm the permutation invariance of the DiffMix and GNN-only models, given the identical loss values before and after permuting the component sequences for all tasks, as shown in Table 1. According to Table S1, the baseline model without permutation invariance reports permuted testing loss values that are significantly higher than the unpermuted testing loss in all tasks, indicating the importance of model invariance. Furthermore, the parity plots for the three tasks are shown in Fig. 2(a–c), indicating a high correlation between true and DiffMix-predicted values. In Fig. 2(d), we visualize the DiffMix-learned mixture features (m in Fig. 1b) for ionic conductivities with principal component analysis (PCA) in two dimensions, where we observe a smooth distribution of high and low κ values. The smooth mixture-feature distribution demonstrates the effectiveness of the graph convolution step, SubNet and PairNet, and pooling in distinguishing different mixture compositions, which can boost prediction accuracy after passing the features through MixNet.

Fig. 2: Prediction Accuracy Analysis of DiffMix.

Parity plots of a excess molar volume (\(V_{\mathrm{m}}^{\mathrm{E}}\)) testing dataset, b excess molar enthalpy (\(H_{\mathrm{m}}^{\mathrm{E}}\)) testing dataset, and c ionic conductivity (κ) testing dataset. GNN-only stands for the baseline graph-neural-network model without physics incorporated. The prediction mean-absolute-error (MAE) is reported following the model name in the legend. d Two-dimensional principal component analysis (PCA) of mixture features extracted from the trained DiffMix model on the full ionic conductivity dataset. Source data are provided as a Source Data file.

Physics model capacity and temperature extrapolation

For the mixture thermodynamics tasks, the polynomial order N in Equation (2) has so far been set to four. To study the dependence of model capacity on polynomial order, we vary the polynomial order as N = 0, 4, 9, 14 or fully remove the excess term in Equation (2). The latter essentially describes the linear mixing rule. The results are shown in Fig. 3a, b, where we also compare them with GNN-only to see the effectiveness of the added physics models. First, we observe a trend of decreasing testing errors when higher polynomial orders are introduced, i.e., when the capacity of the mixture physics model is increased. With N = 4, the MAEs for both \(V_{\mathrm{m}}^{\mathrm{E}}\) and \(H_{\mathrm{m}}^{\mathrm{E}}\) are reduced by more than half relative to those of the linear mixing model. However, the model performance plateaus as we further increase the polynomial-based model capacity. It is worth noting that the experimental uncertainty is around 0.005 (cm³/mol) and 5 (J/mol) for the two measurements. For the excess molar volume task, the plateauing behavior may be due to the fact that DiffMix accuracy is limited by the GDL model capacity. For the enthalpy task, however, it can also be attributed to the data uncertainty, considering that the DiffMix prediction MAE is close to the measurement error. In both cases, adding the zeroth-order interaction terms significantly improves on the worst-performing linear mixing model, and DiffMix (N = 0) matches the GNN-only baseline. This indicates that both the molar volume and enthalpy changes of mixing rely on the intermolecular interactions, modeled by the pairwise interaction coefficients in the R-K mixing laws. Further, Figure S2 shows the overall decreasing trend of the R-K polynomial coefficients \(\{C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}\}\) when 15 polynomial terms are included in the physics-based R-K model, explaining the plateauing pattern of the model accuracy.

Fig. 3: Physics Model Capacity and Extrapolation Analysis of DiffMix.

a, b Physics Law Analysis on Thermodynamic Data. Varying the polynomial order for a excess molar volume (\(V_{\mathrm{m}}^{\mathrm{E}}\)), and b excess molar enthalpy (\(H_{\mathrm{m}}^{\mathrm{E}}\)). In both cases, the regression mean-absolute-error (MAE) values of DiffMix models with N = 0, 4, 9, 14 are compared with those of GNN-only and linear mixing (LinearMix) models. GNN-only stands for the baseline graph-neural-network model without physics incorporated, and N represents the Redlich-Kister (R-K) polynomial order in Equation (2). The bar plots are generated after running an ensemble of 5 models, where the black lines display the standard deviations of results and each black dot represents one MAE value. c, d Model Extrapolation on Ionic Conductivity (κ) Data. c Prediction accuracy of the DiffMix and GNN-only models for interpolation and extrapolation cases. Regression MAE values are grouped by data points with the same temperature (T, in Celsius), and their standard deviations are denoted by the shaded region. The training is performed on low-temperature Advanced-Electrolyte-Model (AEM) data (≤20 °C), so the evaluations at higher temperatures are viewed as extrapolation. d Parity plots of DiffMix predictions against experimental measurements. Electrolyte solutions are made by dissolving LiPF6 salts in propylene carbonates (PC), ethylene carbonates (EC), and ethyl methyl carbonates (EMC) solvent mixtures55. Blue dots are interpolation cases, while red dots are extrapolation cases. Black solid lines are standard deviations of the reported experimental values. Source data are provided as a Source Data file.

In modeling the ionic conductivities (κ) of battery electrolytes, we test model extrapolation to higher temperatures, as shown in Fig. 3c, where we report the prediction MAEs grouped by temperature in the range of [-30 °C, 60 °C]. Note that our training is performed on the data with temperature range [-30 °C, 20 °C]. We notice that the interpolation MAE is close to 0 for both models, consistent with the low MAE results reported in Table 1. However, in the extrapolation test on the data generated above 20 °C, non-negligible errors are detected, and the MAE magnitudes are positively correlated with the temperature difference from 20 °C. The average MAE at 60 °C goes above 1 (mS/cm), two orders of magnitude higher than that in the interpolation case. Compared to the GNN-only baseline, we find superior accuracy with DiffMix. The average MAE drops from 0.39 (mS/cm) to 0.24 (mS/cm), from 0.83 (mS/cm) to 0.49 (mS/cm), from 1.28 (mS/cm) to 0.75 (mS/cm), and from 1.74 (mS/cm) to 1.03 (mS/cm), at T of 30, 40, 50, 60 °C, respectively. We further compare the DiffMix prediction results with the experimental measurements, as shown in the parity plot of Fig. 3d. Both the interpolation and extrapolation testing results of DiffMix are validated by experimental measurements for the solutions of lithium hexafluorophosphate (LiPF6) in ethylene carbonates (EC), propylene carbonates (PC), and ethyl methyl carbonates (EMC) solvent mixtures55. In the experiments, the salt concentration varied between 0.2 (mol/kg) and 2.1 (mol/kg), and the EC:PC ratio was varied with the (EC+PC):EMC ratio fixed at 3:7 and 1:1, respectively. We find a good agreement between DiffMix predictions and experiments, even in the extrapolation test with temperatures higher than 20 °C. 
Quantitatively, the R² and Pearson correlation coefficient values for the interpolation and extrapolation sets are (0.80, 0.94) and (0.75, 0.92), and the interpolation and extrapolation MAEs are 0.75 (mS/cm) and 1.07 (mS/cm), respectively. Based on the results in Fig. 3d, we conclude that the AEM-generated data provide an accurate basis to learn the complex electrolyte patterns via DiffMix at the given conditions.
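For readers comparing the three reported metrics, the Python sketch below computes MAE, R², and the Pearson correlation coefficient on a small hypothetical dataset. It assumes that R² denotes the coefficient of determination, 1 − SS_res/SS_tot, which is not stated explicitly above; under that definition a systematic offset lowers R² while leaving the Pearson coefficient unchanged, so R² can be noticeably smaller than the squared Pearson value.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical values, for illustration only (not data from this work).
y_true = [1.0, 2.0, 3.0, 4.0]
y_biased = [2.0, 3.0, 4.0, 5.0]  # perfectly correlated but offset by +1

print(mae(y_true, y_biased), r2_score(y_true, y_biased), pearson_r(y_true, y_biased))
```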

Differentiable battery electrolyte optimization with DiffMix and robotic experimentation

Fast charging of Li-ion batteries is impacted by electrolyte ionic conductivities, and electrolyte optimization can be challenging for battery design due to high experimentation costs16. With the trained DiffMix model, we test its capability to evaluate ionic conductivities and design electrolyte mixtures for high-performing Li-ion batteries. We select three types of electrolyte solutions as test cases and evaluate their ionic conductivities at 30 °C and varying co-solvent compositions. They are LiPF6 salt in solvent mixtures of (i) cyclic carbonates, including ethylene carbonates (EC), propylene carbonates (PC) and fluorinated ethylene carbonates (FEC), (ii) linear carbonates, including ethyl methyl carbonates (EMC), diethyl carbonates (DEC) and dimethyl carbonates (DMC) and (iii) cyclic and linear carbonates, including EC, PC, and DMC. We first show the ionic conductivity landscape of (i) in Fig. 4a by varying co-solvent compositions with fixed lithium mole fractions of 0.08, where we observe a moderate ionic conductivity peak up to 8 (mS/cm) in the EC-enriched region. Note that we treat the anions and cations separately when computing the mole fraction. Fig. 4b provides the conductivity landscape of electrolyte mixture (ii), where the highest κ values are observed in the DMC-enriched region. Here, we fixed the lithium mole fraction at 0.12 due to the low dielectric constants of linear carbonates and thus the low dissociation degree of lithium salts. According to the conductivity map of the electrolyte mixture (iii) shown in Fig. 4c, adding linear carbonate molecules into cyclic carbonate solvents can significantly increase the mixture ionic conductivities, where the maximum ionic conductivity is 14.39 (mS/cm) when PC:DMC:EC mole ratio is close to 0:0.70:0.30 with a fixed lithium-ion mole fraction of 0.08. 
It is worth noting that the training data is produced with a temperature lower than or equal to 20 °C (maximum is 12.6 (mS/cm)), but we are testing its generalization capability at 30 °C. We verify this result with the output of our data generator, AEM52, which provides the highest conductivity of 14.2 (mS/cm) at 0.082 to 0.085 lithium mole fraction with the given PC:DMC:EC ratio. This agrees well with the differentiable modeling result and indicates good generalization of DiffMix.

Fig. 4: Differentiable Battery Electrolyte Optimization with DiffMix and Robotic Experimentation.

Optimizing on the ionic conductivity (κ) landscape (temperature = 30 °C) for a LiPF6 salts dissolved in ethylene carbonates (EC), propylene carbonates (PC) and fluorinated ethylene carbonates (FEC) solvent mixtures (with fixed lithium-ion mole fraction of 0.08); b LiPF6 salts dissolved in ethyl methyl carbonates (EMC), diethyl carbonates (DEC) and dimethyl carbonates (DMC) solvent mixtures (with fixed lithium-ion mole fraction of 0.12); c LiPF6 salts dissolved in EC, PC and DMC solvent mixtures (with fixed lithium-ion mole fraction of 0.08). In each optimization case, a batch of four trajectories has been simulated starting from the dot sign and ending at the cross sign. The white arrows are the gradient information obtained by auto-differentiating the ionic conductivity against compositions through DiffMix. d Optimization curves of ionic conductivities in c along the four trajectories, where we include both DiffMix results and the robotic experimentation results generated by Clio. The shaded areas of the Clio curves represent the standard deviations of the measurement. Source data are provided as a Source Data file.

As previously introduced, gradient information is readily accessible by differentiating the trained DiffMix model. To illustrate this, we show the gradient vectors as arrows in the ionic conductivity landscapes in Fig. 4a–c. In Fig. 4a, c, we observe large gradients near the pure-EC region, indicating that adding a small amount of co-solvent can significantly improve the ionic conductivity. This can be explained by EC being solid-like at room temperature11. Another interesting observation is that the pure-DMC region in Fig. 4c displays a much larger gradient than that in Fig. 4b, from which we conclude that adding a small quantity of high-polarity cyclic carbonate solvents (EC, PC) could enable higher ionic conductivities. The gradient field map is generated instantaneously by auto-differentiating the DiffMix model trained on AEM-generated data, which saves expensive experimental costs by reducing the number of optimization steps. Based on the gradient information provided by DiffMix, we implement a gradient-ascent algorithm that iteratively increases the objective by following the gradient direction, with a batch of four starting points initialized in the mixture space. As shown in Fig. 4a–c, our optimization algorithm robustly identifies local maxima. This differentiable optimization framework further guides the robotic experimentation performed by our hardware setup, Clio. We extract the batch of four optimization trajectories in Fig. 4c and compare the ionic conductivities evaluated at each step by both DiffMix and Clio, as shown in Fig. 4d. The DiffMix and Clio results show good agreement between simulation and experimentation, and we obtain at least an 18.8% increase over the initial ionic conductivities. Note that the temperature of Clio is held at around 27 °C, which differs slightly from the DiffMix temperature of 30 °C. This may account for deviations between the experimental and simulation data. Across Fig. 4a–d, we focus on varying the solvent composition while fixing the lithium mole fraction. The fixed lithium mole fractions are selected for demonstration purposes and are close to the optimal lithium concentrations in the training data generated at lower temperatures (20 °C). To demonstrate the generalizability of our framework, we further obtain an improved ionic conductivity by fixing the solvent composition and optimizing the salt concentration, as described in the SI. These results demonstrate the capability of differentiable modeling of battery electrolytes, with which we can efficiently explore the chemical space of multi-component electrolyte mixtures.
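To make the idea of differentiating a conductivity model with respect to composition concrete, the following is a minimal sketch. It swaps in forward-mode automatic differentiation with dual numbers so that it runs without dependencies, whereas DiffMix itself is built on PyTorch. The surface `toy_kappa` and the helper names are hypothetical and stand in for the trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Promote plain numbers to Dual constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def toy_kappa(x_a, x_b):
    """Hypothetical conductivity surface; NOT the DiffMix model."""
    x_c = 1.0 - x_a - x_b  # third co-solvent fixed by the sum-to-one constraint
    return 4.0 * x_a * x_c + 3.0 * x_b * x_c + 2.0 * x_a * x_b


def gradient(f, x_a, x_b):
    """Seed one input at a time with derivative 1 to read off each partial."""
    d_a = f(Dual(x_a, 1.0), Dual(x_b, 0.0)).dot
    d_b = f(Dual(x_a, 0.0), Dual(x_b, 1.0)).dot
    return d_a, d_b


print(gradient(toy_kappa, 0.2, 0.3))
```

Evaluating `gradient` on a grid of compositions would give an arrow field of the kind drawn in Fig. 4a–c. Reverse-mode differentiation, as used in deep learning frameworks, returns all partial derivatives in a single backward pass and is the better choice when the model has many inputs.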

Discussion

In this work, focusing on battery electrolytes, we develop a GDL-based differentiable model for chemical mixtures, DiffMix, that combines the advantages of physics-based models and geometric deep learning and further guides robotic experimentation for practical electrolyte optimization. The evaluation results on thermodynamic data of binary non-electrolyte mixtures and ion transport data of electrolyte mixtures indicate that DiffMix preserves component-wise permutation invariance and enables more accurate and robust predictions than GNN-only and MixECFP (Table S1), as can be seen from the low MAEs. When extrapolated to high temperatures, DiffMix ionic conductivity predictions are more accurate than those of the GNN-only baseline, owing to the incorporation of a temperature-dependent VFT model. The experimental measurements and DiffMix predictions agree well with each other, even in the extrapolation case, which enables real-world applications of our trained model.

We further test the physics model capacity of the R-K thermodynamic mixing law in DiffMix by tuning the polynomial order N in Equation (2) and observe a plateauing behavior beyond N = 4. A similar trend is observed for both excess molar volumes and enthalpies: adding the zeroth-order interaction terms significantly improves on the worst-performing linear mixing model, indicating that strong intermolecular interactions may exist in the tested mixture systems. Although this demonstrates the flexibility of our model in terms of the functional forms of physics laws, future investigation is required to explore other types of thermodynamic and kinetic laws for mixtures.
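As an illustration of how the polynomial order enters an R-K mixing law, the sketch below evaluates the textbook form: a linear mixing term plus pairwise terms \(x_{\mathrm{i}}x_{\mathrm{j}}\sum_{\mathrm{k}}C_{\mathrm{ij}}^{\mathrm{k}}(x_{\mathrm{i}}-x_{\mathrm{j}})^{\mathrm{k}}\). Equation (2) is not reproduced in this section, so the exact parametrization used in DiffMix may differ. The function name and the data layout are assumptions, and in DiffMix the coefficients would come from neural networks rather than being supplied by hand.

```python
def rk_mixture_property(x, pure, coeffs):
    """Textbook Redlich-Kister mixing law (assumed form, see lead-in).

    x      : mole fractions of the components
    pure   : per-component parameters P_i (linear mixing term)
    coeffs : {(i, j): [C^0, C^1, ...]} pairwise coefficients for i < j
    """
    linear = sum(xi * pi for xi, pi in zip(x, pure))
    excess = 0.0
    for (i, j), c in coeffs.items():
        diff = x[i] - x[j]
        # k = 0 is the zeroth-order interaction term; higher k adds asymmetry.
        excess += x[i] * x[j] * sum(ck * diff ** k for k, ck in enumerate(c))
    return linear + excess


# Linear mixing only (no interaction terms) versus a two-term expansion.
print(rk_mixture_property([0.25, 0.75], [2.0, 4.0], {}))
print(rk_mixture_property([0.25, 0.75], [0.0, 0.0], {(0, 1): [4.0, 1.0]}))
```

An empty `coeffs` dictionary recovers the linear mixing model, and a single coefficient per pair adds the symmetric zeroth-order interaction discussed above.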

Because our model is built in a fully differentiable framework, gradient information is readily accessible for a trained DiffMix model. This further allows us to optimize ionic conductivity over the input space. Taking the input co-solvent composition as the variables, we identify peak ionic conductivity regions for various ternary co-solvent electrolyte chemical spaces that mix linear and cyclic carbonate solvents. The simulated trajectories have been used to guide the robotic experimentation performed by Clio, which successfully increases the ionic conductivity by at least 18.8%. It is worth noting that in this work we conduct the DiffMix-guided robotic experimentation in a two-step process: (1) training a DiffMix model with simulated AEM data and running the optimization on the modeled response surface, and (2) guiding Clio with the predefined optimization trajectory. Alternatively, especially when the simulation is not of high quality, a closed-loop optimization can be designed by retraining the DiffMix model every few iterations during experimental data collection, which may enable a more robust and adaptive optimization. Our work expands the modeling and optimization techniques for battery electrolyte mixtures by unifying physics laws and geometric deep learning in a differentiable framework.

So far, our discussion of battery electrolyte optimization has focused on improving ionic conductivity, a key factor that determines fast-charging behavior by reducing the concentration polarization and ion transfer resistance. Other factors, e.g., electrochemical stability and interfacial reactivity, may also be important and need to be considered in future work. The chemical space we explore in this work is well-benchmarked56,57, and a similar electrolyte recipe has shown improved fast-charging performance at the device level16. However, when we move to other chemical spaces, electrochemical device measurements are required with the optimized electrolyte compositions. Future work could also construct comprehensive datasets for properties beyond ionic conductivity to enable multi-objective optimization, to which our DiffMix framework is readily applicable with a modified multi-task objective function.

Methods

Data collection and generation

Thermodynamic and transport mixture property datasets were prepared for benchmarking the models developed in this work. The thermodynamic datasets include excess molar enthalpy58,59,60,61 and excess molar volume60,62,63,64,65,66,67,68 values curated from the literature. There are 631 data points for excess molar enthalpy, covering 34 unique mixture chemistries composed of 35 organic chemicals with varying compositions. For excess molar volume, there are 1069 binary mixture data points based on 28 unique mixtures composed of 25 organic chemicals with varying compositions. For ionic conductivities, we prepared a dataset that contains 24,822 mixtures of single-salt-ternary-solvent electrolyte solutions. The electrolyte components consist of two salt species, lithium hexafluorophosphate (LiPF6) and lithium bis((trifluoromethyl)sulfonyl)azanide (LiTFSI), and six organic carbonate solvents: ethylene carbonate (EC), propylene carbonate (PC), fluorinated ethylene carbonate (FEC), ethyl methyl carbonate (EMC), diethyl carbonate (DEC) and dimethyl carbonate (DMC). The electrolyte data were generated with one salt and any combination of three co-solvents, with the salt concentration taken from {0.025, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0} molal and the mass fraction of each solvent taken from {0, 0.2, 0.4, 0.6, 0.8, 1.0}. The temperature range of the training data is [-30 °C, 20 °C], and the temperature extrapolation cases, as shown in Fig. 3c, are evaluated at higher temperatures. The data generation was performed with the Advanced Electrolyte Model (AEM), which produces high-fidelity electrolyte data for the chemical species evaluated here52. In the collected datasets, molecule names are stored and can be converted to their Simplified Molecular Input Line Entry System (SMILES) format69, from which we can retrieve the topology and chemical information on atoms and bonds with RDKit51. We converted all compositions into mole fractions during model training. The availability of training datasets is discussed in the data availability statement.
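The composition grid and the conversion to mole fractions can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the three solvent mass fractions are assumed to sum to one, the salt is counted as a single species, and the molar masses are standard values rather than numbers taken from the paper.

```python
from itertools import product

# Standard molar masses in g/mol (assumed; not quoted in the paper).
MOLAR_MASS = {
    "EC": 88.06, "PC": 102.09, "FEC": 106.05,
    "EMC": 104.10, "DEC": 118.13, "DMC": 90.08,
}


def solvent_grid(levels=5, n_solvents=3):
    """Mass-fraction tuples on a 1/levels grid that sum to one."""
    return [tuple(k / levels for k in combo)
            for combo in product(range(levels + 1), repeat=n_solvents)
            if sum(combo) == levels]


def mole_fractions(mass_frac, molality):
    """Convert solvent mass fractions plus salt molality to mole fractions.

    mass_frac : {solvent name: mass fraction within the solvent blend}
    molality  : mol of salt per kg of solvent blend
    """
    # Moles of each solvent in 1 kg (1000 g) of the solvent blend.
    n_solv = {s: 1000.0 * w / MOLAR_MASS[s] for s, w in mass_frac.items()}
    total = molality + sum(n_solv.values())
    x = {s: n / total for s, n in n_solv.items()}
    x["salt"] = molality / total  # salt treated as one species (assumption)
    return x


print(len(solvent_grid()))
print(mole_fractions({"EC": 0.5, "DMC": 0.5}, 1.0))
```

If the lithium cation and the anion are counted as separate species, the denominator changes accordingly, so the salt entry here should not be read as the lithium-ion mole fraction used in the paper.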

Model implementation and training

For GNN-only and DiffMix, the atom features include one-hot encodings of atom type, number of heavy neighbors, formal charge, hybridization type, chirality, and number of implicit hydrogens, together with numerical information on ring structures, aromaticity, atomic mass, VdW radius, and covalent radius, giving a 97-dimensional feature vector. Note that no bond features are incorporated in our model, but they can be included in future work. The graph convolution cell is made up of 3 torch_geometric.nn.conv.GraphConv()43 steps, each of which is followed by ReLU and dropout layers (dropout rate, p = 0.25). The graph convolution cell ends with a global mean pooling layer and provides graph-level embeddings for the component molecules \(\mathbf{g}_{\mathrm{i}}^{\prime}\). These graph-level embeddings are then concatenated with compositions and environment conditions. For DiffMix-3D models, we replaced the aforementioned graph convolution with the DimeNet++49,50 model, where we followed the PyTorch Geometric implementation (torch_geometric.nn.DimeNetPlusPlus()) and applied the default parameters from the original official repository70 (emb_size: 128; out_emb_size: 256; int_emb_size: 64; basis_emb_size: 8; num_blocks: 4; num_spherical: 7; num_radial: 6). To combine the DimeNet++ block with the original DiffMix model, the DimeNet++ block must output graph-level embeddings of the same dimension (256) as the graph convolutional layers. The 3D molecular structures are constructed with the RDKit conformer generator (rdDistGeom.EmbedMolecule(mol, randomSeed=123); .GetConformer().GetPositions(), where mol is a molecule object) and then optimized with AllChem.MMFFOptimizeMolecule(). In the mixture ionic conductivity model, to reduce the computational cost, we chose a different hyperparameter set (emb_size: 16; out_emb_size: 256; int_emb_size: 8; basis_emb_size: 4; num_blocks: 2; num_spherical: 2; num_radial: 2), which is still more than 2 times slower than the DiffMix model. Note that RDKit fails to generate 3D coordinates for lithium hexafluorophosphate (LiPF6), so its 2D coordinates were manually downloaded from the PubChem database71. In the DiffMix-3D models for ionic conductivities, we treated each salt molecule as a whole, without separating lithium cations and anions, since no explicit charge information is encoded for atoms in the DimeNet++ input.

For the GNN-only model, no physics laws are incorporated, and therefore the MixtureNet output is the predicted mixture property. The layer dimensions of SubNet, PairNet, and MixNet are [N, N, N, N], [2N, 2N, 2N, N], and [2N, 2N, 4N, 2N, 1], respectively, where N = 256 + 1 + Nenv and Nenv is the dimension of the environment conditions. DiffMix is configured differently for thermodynamic and ion transport properties, since distinct mixing laws are selected. With the VFT model selected, MixNet is changed to [2N, 2N, 4N, 2N, 3], which outputs the three physical coefficients of the VFT model. For the thermodynamics of mixing, SubNet is used to obtain the per-substance embeddings, and two additional neural networks with hidden-layer dimensions [N, N] and [2N, 4N, 2N] produce the component-wise physical parameters {Pi} and pair-wise physical parameters \(\{C_{\mathrm{RK},\mathrm{ij}}^{\mathrm{k}}\}\); the input-layer and output-layer dimensions are N and 1 for the former, and 2N and the chosen R-K polynomial order for the latter.
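The MixNet layer widths quoted above follow directly from N = 256 + 1 + Nenv. The short sketch below computes them for the GNN-only head (one output) and the VFT head (three outputs). It assumes that the bracketed list gives consecutive fully connected layer widths and that temperature is the only environment condition (Nenv = 1); both are readings of the text rather than confirmed details, and the helper names are hypothetical.

```python
def mixnet_widths(n_env, n_out, emb_dim=256):
    """Layer widths [2N, 2N, 4N, 2N, n_out] with N = emb_dim + 1 + n_env."""
    n = emb_dim + 1 + n_env  # embedding + composition + environment conditions
    return [2 * n, 2 * n, 4 * n, 2 * n, n_out]


def dense_param_count(widths):
    """Weights plus biases of a plain fully connected stack (assumed layout)."""
    return sum(a * b + b for a, b in zip(widths, widths[1:]))


gnn_only = mixnet_widths(n_env=1, n_out=1)   # predicts the property directly
vft_head = mixnet_widths(n_env=1, n_out=3)   # predicts three VFT coefficients
print(gnn_only, dense_param_count(gnn_only))
print(vft_head, dense_param_count(vft_head))
```

Only the final layer differs between the two heads, so the physics-informed variant adds very few parameters relative to the GNN-only baseline.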

During training, we set the learning rate to 0.001 with a weight decay rate of \(10^{-4}\) in the PyTorch72 Adam optimizer. L1Loss is used as the loss function for backpropagation. We also applied an early stopping criterion, selecting the epoch with the lowest validation error to avoid overfitting. The ionic conductivity labels went through a logarithmic transformation before the loss values were computed, so that mixture properties spanning multiple orders of magnitude can be distinguished. All models were implemented with PyTorch72 and PyTorch Geometric73. For each mixture property, the full dataset was randomly split into training, validation, and testing sets in the ratio 8:1:1. The cross-validation results were reported based on an ensemble of 5 models with randomly varying splits.
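The data split and the log-transformed L1 objective can be sketched without any deep learning framework. The snippet below is illustrative only: the function names are hypothetical, the natural logarithm is an assumption (the base is not stated), and here both predictions and labels are passed in linear units, whereas the actual model may predict the log-transformed value directly.

```python
import math
import random


def split_811(items, seed):
    """Random 8:1:1 split into training, validation and testing subsets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def log_l1_loss(pred, target):
    """Mean absolute error between log-transformed conductivities."""
    return sum(abs(math.log(p) - math.log(t))
               for p, t in zip(pred, target)) / len(target)


# Five different seeds give the five randomly varying splits of an ensemble.
splits = [split_811(range(1000), seed) for seed in range(5)]
print([tuple(len(part) for part in s) for s in splits])

# A tenfold error costs the same at 0.01 and at 10 (arbitrary units).
print(log_l1_loss([0.1], [0.01]), log_l1_loss([100.0], [10.0]))
```

The last line shows why the transformation matters: in log space the loss measures relative error, so low-conductivity samples are not drowned out by high-conductivity ones.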

Differentiable optimization algorithm

With the gradient information generated from auto-differentiation, a gradient-ascent algorithm is developed to iteratively increase the electrolyte ionic conductivity, as shown in Fig. 4. Because the mole fractions of the three co-solvents must sum to 1, we only need to vary the mole fractions of the first two components, a and b. The gradient-based update is as follows:

$$\left[x_{\mathrm{a}}^{1},x_{\mathrm{b}}^{1}\right]=\left[x_{\mathrm{a}}^{0},x_{\mathrm{b}}^{0}\right]+\left[\frac{d\kappa}{dx_{\mathrm{a}}^{0}},\frac{d\kappa}{dx_{\mathrm{b}}^{0}}\right]\cdot\mathrm{dx}$$
(4)

where \([x_{\mathrm{a}}^{0},x_{\mathrm{b}}^{0}]\) and \([x_{\mathrm{a}}^{1},x_{\mathrm{b}}^{1}]\) are the initial and updated mole fractions (points) of a and b, respectively, and \([\frac{d\kappa}{dx_{\mathrm{a}}^{0}},\frac{d\kappa}{dx_{\mathrm{b}}^{0}}]\) is the gradient of the ionic conductivity κ at the initial point. The adaptive step size dx is calculated by dividing 0.02 by the L2 norm of the two-dimensional gradient, so that each regular update moves the point by a distance of 0.02. The optimizer must also respect the constraint that every mole fraction lies between 0 and 1. When the updated point \([x_{\mathrm{a}}^{1},x_{\mathrm{b}}^{1}]\) is outside the boundaries, we perform a two-step update as follows:

$$\left[x_{\mathrm{a}}^{1\mathrm{t}},x_{\mathrm{b}}^{1\mathrm{t}}\right]=\left[x_{\mathrm{a}}^{0},x_{\mathrm{b}}^{0}\right]+\left[\frac{d\kappa}{dx_{\mathrm{a}}^{0}},\frac{d\kappa}{dx_{\mathrm{b}}^{0}}\right]\cdot\left(1-x_{\mathrm{a}}^{0}-x_{\mathrm{b}}^{0}\right)/\mathrm{dx0}\cdot\mathrm{dx}$$
(5)
$$\left[x_{\mathrm{a}}^{2\mathrm{t}},x_{\mathrm{b}}^{2\mathrm{t}}\right]=\left[x_{\mathrm{a}}^{1\mathrm{t}},x_{\mathrm{b}}^{1\mathrm{t}}\right]+\left[\frac{d\kappa}{dx_{\mathrm{a}}^{0}}-\frac{d\kappa}{dx_{\mathrm{b}}^{0}},\frac{d\kappa}{dx_{\mathrm{b}}^{0}}-\frac{d\kappa}{dx_{\mathrm{a}}^{0}}\right]\cdot\left(x_{\mathrm{a}}^{1}+x_{\mathrm{b}}^{1}-1\right)/\mathrm{dx0}\cdot\mathrm{dx}$$
(6)

where \([x_{\mathrm{a}}^{1\mathrm{t}},x_{\mathrm{b}}^{1\mathrm{t}}]\) and \([x_{\mathrm{a}}^{2\mathrm{t}},x_{\mathrm{b}}^{2\mathrm{t}}]\) are the intermediate and updated points. Here, the first step moves closer to the boundary, while the second step is designed to move along the boundary, i.e., the zero-composition line of the third component. dx0 (0.02) is introduced to normalize the proxy distances of \([x_{\mathrm{a}}^{0},x_{\mathrm{b}}^{0}]\) and \([x_{\mathrm{a}}^{1},x_{\mathrm{b}}^{1}]\) from the zero-composition boundary of the third component, i.e., \((1-x_{\mathrm{a}}^{0}-x_{\mathrm{b}}^{0})\) and \((x_{\mathrm{a}}^{1}+x_{\mathrm{b}}^{1}-1)\). Note that this is an approximate derivation (due to the simplified distance formula), and the updated point \([x_{\mathrm{a}}^{2\mathrm{t}},x_{\mathrm{b}}^{2\mathrm{t}}]\) may still go beyond the boundary. Therefore, \(x_{\mathrm{a}}^{2\mathrm{t}}\) and \(x_{\mathrm{b}}^{2\mathrm{t}}\) are subsequently clipped to lie between 0 and 1 and, when their sum exceeds 1, normalized by that sum. This optimization algorithm performs well in all tested scenarios, as shown in Fig. 4. In Fig. 4, we visualize the optimization history by showing the explored trajectory every five steps until convergence (stopping criteria: the L2 norm of the gradient falls below 0.2 or 100 steps are reached). When guiding the experimentation robot, we conducted one measurement every two visualized optimization steps, i.e., every 10 simulation steps, until no significant ionic conductivity improvement was observed.
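The update rules in Equations (4)–(6), together with the clipping, renormalization and stopping criteria, can be sketched as below. This is one reading of the procedure rather than the authors' implementation: the two-step update is applied only when the third component would become negative, the remaining bounds are handled by clipping alone, and `toy_kappa` with its analytic gradient is a hypothetical surface standing in for the DiffMix prediction and its auto-differentiated gradient.

```python
import math

DX0 = 0.02  # reference step length dx0 quoted in the text


def toy_kappa(xa, xb):
    """Hypothetical smooth surface; NOT the DiffMix model."""
    xc = 1.0 - xa - xb
    return 4.0 * xa * xc + 3.0 * xb * xc + 2.0 * xa * xb


def toy_grad(xa, xb):
    """Analytic gradient of toy_kappa (DiffMix would use autodiff)."""
    return 4.0 - 8.0 * xa - 5.0 * xb, 3.0 - 5.0 * xa - 6.0 * xb


def ascent_step(x, grad):
    """One constrained gradient-ascent update of (x_a, x_b)."""
    xa0, xb0 = x
    ga, gb = grad
    norm = math.hypot(ga, gb)
    if norm == 0.0:
        return x
    dx = DX0 / norm                               # adaptive step size
    xa1, xb1 = xa0 + ga * dx, xb0 + gb * dx       # Eq. (4)
    if xa1 + xb1 > 1.0:                           # third component < 0
        s1 = (1.0 - xa0 - xb0) / DX0 * dx
        xa1t, xb1t = xa0 + ga * s1, xb0 + gb * s1  # Eq. (5): approach boundary
        s2 = (xa1 + xb1 - 1.0) / DX0 * dx
        xa1 = xa1t + (ga - gb) * s2               # Eq. (6): slide along it
        xb1 = xb1t + (gb - ga) * s2
    # Clip each mole fraction to [0, 1], then renormalize if the sum exceeds 1.
    xa1 = min(max(xa1, 0.0), 1.0)
    xb1 = min(max(xb1, 0.0), 1.0)
    total = xa1 + xb1
    if total > 1.0:
        xa1, xb1 = xa1 / total, xb1 / total
    return xa1, xb1


def optimize(grad_fn, start, max_steps=100, tol=0.2):
    """Iterate until the gradient norm drops below tol or max_steps is hit."""
    path = [start]
    for _ in range(max_steps):
        g = grad_fn(*path[-1])
        if math.hypot(*g) < tol:
            break
        path.append(ascent_step(path[-1], g))
    return path


path = optimize(toy_grad, (0.05, 0.05))
print(len(path) - 1, path[-1], toy_kappa(*path[-1]))
```

Because the step length is fixed at 0.02 regardless of the gradient magnitude, the trajectory advances at a constant pace across the composition triangle, which makes it straightforward to subsample points for experimental measurement.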

Automated experimental measurements of electrolyte properties

The ionic conductivity measurements in this work were performed with Clio, a custom-built robotic setup16. The ionic conductivity data were measured by electrochemical impedance spectroscopy (EIS) in a PTFE fixture chamber using a PalmSens4 impedance analyzer. The electrolytes were filled into the chamber between two symmetric Pt electrodes. The impedance of the cell was measured at five frequencies between 14 kHz and 800 kHz. The resistance of the sample is determined by evaluating the real part of the impedance at the frequency where the smallest phase difference is observed during measurement. To calculate the specific ionic conductivity of the sample, a cell constant is obtained through a single-point calibration using a known solution (acetonitrile and LiPF6). The specific ionic conductivity is then determined by dividing the inverse resistance by the cell constant. The temperature was managed via glove-box-wide heating and airflow and was 27.2 °C ± 0.3 °C. We note that this is slightly lower than the temperature used for the DiffMix predictions (30 °C); thus, temperature may account for deviations between experimental and modeled data.
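The data reduction described above can be sketched as follows, using the convention stated in the text that the conductivity equals the inverse resistance divided by the cell constant. The function names and all numerical values are hypothetical and serve only to show the order of operations.

```python
import cmath


def resistance_from_sweep(impedances):
    """Real part of Z at the frequency with the smallest |phase angle|."""
    z_best = min(impedances, key=lambda z: abs(cmath.phase(z)))
    return z_best.real


def calibrate_cell_constant(r_known, kappa_known):
    """Single-point calibration: cell constant = (1 / R) / kappa."""
    return (1.0 / r_known) / kappa_known


def specific_conductivity(resistance, cell_constant):
    """Inverse resistance divided by the cell constant, as in the text."""
    return (1.0 / resistance) / cell_constant


# Hypothetical five-point sweep (ohm); the second point is the most resistive.
sweep = [complex(120, -40), complex(105, -5), complex(101, -20),
         complex(99, -35), complex(98, -50)]
r_sample = resistance_from_sweep(sweep)

# Hypothetical calibration: a reference solution of 10 mS/cm reads 50 ohm.
k_cell = calibrate_cell_constant(r_known=50.0, kappa_known=10.0)
print(r_sample, specific_conductivity(r_sample, k_cell))
```

With this convention the conductivity carries the units of the calibration value, so the reference solution and the sample should be measured at the same temperature.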

Experimental methods: materials

The electrolyte salt (LiPF6) and solvents (PC, DMC, EC) used in this study were obtained from Linyi Gelon LIB Co. Ltd., anhydrous (<20 ppm) and battery grade (99.9% pure). The precursors and electrolyte stock solutions were prepared and stored in a dry Ar-filled glove box (<100 ppm oxygen, <10 ppm H2O). The stock solutions were made by first mixing the solvents in the desired mass ratios, then gradually adding salt to the solvents to reach the designated concentrations. The masses of the solutes and solvents were measured using a Denver Instrument PI-214.1 analytical balance. All solutions were mixed with a magnetic stir bar and magnetic stir plate in a glass beaker for at least half an hour after the last visible salt had dissolved. The solutions were then transferred to and stored in 60-mL amber glass vials with Sure/Seal septa lids.