Introduction

Ionic liquids (ILs) are generally described as salt-like materials, composed entirely of ions, with melting points below \(100^{\circ }\)C. Although they were introduced in 19141, it was not until the early 1990s that they received an explosion of attention due to their extraordinary properties, such as negligible volatility, low melting point, high polarity, high thermal stability, high ionic conductivity and structural designability2,3. The growing interest in these innovative fluids for the development of new technologies lies in their potential for use in a wide range of applications, including synthesis, gas adsorption, drug delivery, electrolytes for energy storage devices, and metal processing4. The huge number of possible combinations of cations and anions, as well as the extensive variation of their physicochemical properties, is the origin of the name “designer solvents”5. In general, it is agreed that the cation strongly impacts the properties of the IL and defines its stability, whereas the anion controls the chemistry and functionality of the IL. Therefore, understanding how the ions interact with each other is key to gaining a deep understanding of how the aforementioned properties arise and can be modulated.

In this context, computer modeling is a powerful and efficient tool in the search for the most promising IL for a given aim, making it possible to avoid expensive and repetitive experiments. For example, quantum chemical calculations are very useful for analyzing the interactions in ILs6,7,8,9, but these methods are not suitable for the characterization of liquid properties. On the other hand, molecular dynamics (MD) simulations, which compute the motion of each individual atom or molecule using Newton’s second law, are well-suited for the study of molecular liquids. Conventional MD allows one to reach time and length scales in the nano-regime, and has been extensively employed for exploring the structure and dynamics of ILs10,11,12. A crucial issue in MD simulations is the quality of the force field (FF), that is, the accuracy of the parameters that are used to estimate intra- and intermolecular forces. Depending on the molecular-level details they include, FFs can be classified as (i) all-atom (all atomic sites are explicitly included), (ii) united-atom (hydrogen atoms are gathered with the atoms to which they are bonded) and (iii) coarse-grained (a group of atoms is combined into a single interaction site) FFs. Some examples of very popular FFs are CHARMM13, AMBER14, MARTINI15, and GROMOS16, which are mainly employed in the simulations of biomolecules; or OPLS17 and COMPASS18, originally developed to simulate condensed matter. Most of them are fixed-charge FFs, that is, the effective partial charges are not modified depending on conformation and environment. This leads to a good performance when it comes to the prediction of the IL structural information, but calculated transport properties are generally lower than experimental measurements19. In order to overcome this difficulty, some of the aforementioned FFs have been modified to include polarization effects explicitly, at the expense of increasing the computational cost20,21,22.
Another noteworthy strategy that has been implemented in recent years to effectively account for polarization is to use non-polarizable FFs with reduced charges23,24, which offers dynamical properties with the right order of magnitude although it might not be able to reproduce the correct relaxation mechanisms and frequently at the cost of loss of structural prediction accuracy. New generation force fields that use machine learning to improve on the existent parametrizations have been developed in the recent years for water systems showing a promising step forward on the modeling of physical properties25,26. The usage of these techniques to refine the current force fields for ionic liquids seems a promising field and may offer a cost-effective alternative to polarizable MD.

One of the main drawbacks of applying classical MD simulations to the study of ILs is that they cannot deal with interactions that need an explicit quantum mechanical treatment of electrons, such as charge transfer, hydrogen bonding and chemical reactions. Currently, the best way to model those phenomena is to carry out ab initio molecular dynamics (AIMD) simulations, whose trajectories are generated with forces calculated “on the fly” from electronic structure density functional theory (DFT) calculations. However, the price to pay is the high computational cost of AIMD calculations, which decreases the accessible system size and simulation length by two orders of magnitude relative to classical MD. As a consequence, properties that require long simulation times and large system sizes, such as viscosity and conductivity, cannot be computed by means of AIMD simulations. Despite these restrictions, AIMD methodology has been employed for the characterization of ILs due to the immense value of the knowledge it provides27,28,29, but classical MD is still the choice for most problems involving ILs.

Fortunately, it seems that the existing gap between MD and AIMD can be successfully filled with the development of machine learning (ML) methods. The fast advances in ML and data science have led to the derivation of robust FFs that allow for calculations with DFT accuracy while keeping the high computational efficiency of classical FFs. These models learn the relevant structure-property relation from data obtained by running ab initio simulations, which allows describing the chemical behaviour without depending on any predefined mathematical expression. Thus, the accuracy of ML-based FFs is, in principle, only limited by the information provided by their training data30,31. We must note that while these models have the same algorithmic time complexity as classical force fields and are orders of magnitude faster than AIMD simulations, they are still much slower than their classical counterparts (10–100 times slower depending on the system, although more efficient implementations may lower this number).

There are several alternatives for the construction of ML-based FFs, such as Gaussian process regression or fully connected neural networks (NNs)32,33,34,35, and ML models are increasingly used in the field of materials science. In particular, it has been confirmed that they show great potential for large-scale simulations of solids, whereas the use of MD simulations combined with ML interatomic potentials (ML-MD) for exploring molecular liquids can be considered to be still in its infancy. Among the few examples that can be found in the literature, Magdǎu et al.36 presented a ML potential for the ethylene carbonate/ethyl methyl carbonate binary solvent, identifying the key challenges in developing this kind of model. Very recently, Shayestehpour and Zahn37 trained a ML interatomic potential for simulating deep eutectic solvents with thousands of atoms for \(2-3\) nanoseconds. For a mixture of 64 choline chloride and 128 urea molecules they obtained structural and dynamical properties in good agreement with the results from experiments and AIMD simulations. Moreover, He and coworkers38 developed a ML-based FF for 10 different IL pairs and one bulk of 1-ethyl-3-methylimidazolium tetrafluoroborate ([EMIM][BF4]) with 32 ionic pairs. After performing 1 ns-long ML-MD calculations, they reported a simulation efficiency comparable to classical FFs but with the precision of DFT methods. Also in the field of densely ionic solvents, Montes-Campos et al.39 introduced NeuralIL, a NNFF for ILs that was tested on 15 ionic pairs of ethylammonium nitrate (EAN) in 80 ps-long ML-MD simulations. NeuralIL can be employed to calculate quantities requiring long trajectories or large samples of configurations with ab initio accuracy and with orders-of-magnitude savings in computational cost.
In a more recent paper40, they showed how a committee of neural networks is able to estimate the error of the force prediction and how re-training the neural network with configurations extracted from MD performed with NeuralIL greatly increases the accuracy of the predicted forces. Concerning the use of ILs as electrolytes in lithium-ion batteries, a crucial aspect that must be considered is the formation and evolution of the solid-electrolyte interphase (SEI) layer between the anode and electrolyte, as a product of electrochemical decomposition. Nevertheless, very little is known about the SEI due to the complexity of carrying out in-situ characterization, making it an ideal issue for computational modeling. Although remarkable advances have been made to predictively model the fundamentals of the SEI41, it is a very complicated and heterogeneous structure that acts as an electronic insulator but as an ionic conductor, and whose growth spans multiple time- and length-scales, so none of the existing computational methods is completely suited to the task of simulating such a complex system. Thus, everything points to classical MD simulations powered by a NN potential as the most effective tool to gain a detailed understanding of the interface chemical reactivity leading to SEI formation and evolution. An appropriate knowledge of SEI chemical composition and reaction dynamics is of key importance to design optimal electrolyte/electrode combinations.

In this work, we used the latest version of NeuralIL to perform ML-MD simulations of 4 pure ILs: [EMIM][BF4], 1-ethyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide ([EMIM][TFSI]), EAN and butylammonium nitrate (BAN). The number of ionic pairs included in each system ranges from 128 to 512 and is specified in Table S1. In addition, we characterized a binary mixture composed of 480 ionic pairs of EAN and 60 of lithium nitrate (LiNO3). To our knowledge, this is the first time that an IL-inorganic salt mixture is studied by means of ML methods, which is of particular interest for energy storage devices such as batteries or supercapacitors. Using our NNFF we are able to run 100 ps-long ML-MD simulations with thousands of atoms and AIMD precision. Our ML model is trained on energies and forces and provides a prediction of both with a level of accuracy comparable to DFT and improved over classical FFs like OPLS-AA. Bond lengths were evaluated and observed to be close to the OPLS-AA and DFT values. Our results show that NeuralIL can successfully reproduce structural (radial distribution functions and hydrogen bond distributions) and dynamic (mean square displacements, cage autocorrelation functions and vibrational density of states) properties of pure ILs and their binary mixtures with lithium salts. This corrects systematic errors of classical non-polarizable MD in the study of transport properties of ILs and positions ML methods as a game changer for the optimization of electrolytes for energy storage devices. The dynamic properties computed with NeuralIL are also compared to results from polarizable MD simulations using the CL&Pol force field42,43. The results of NeuralIL and CL&Pol are similar, but those of NeuralIL are closer to the ab initio and experimental ones.
This work is a first step towards the analysis of the electrolyte reduction reactions that take place near the electrode surface and lead to the SEI complex region, since ML-based force fields can unravel the initial stages of SEI formation and the complex cascade of decomposition reactions that take place without being limited in size as AIMD simulations.

Results

In order to assess the ability of NeuralIL to reproduce the properties of ILs, the analysis was divided into three parts: molecular properties, structural properties and dynamical properties.

Molecular properties

In NeuralIL, as in the majority of NN-based force fields, the atoms only have information about their surroundings and their chemical element. Therefore, molecules are not explicitly defined and they emerge as a result of the interatomic forces. In order to analyze the ability of the force field to replicate the molecular properties, we analyzed the bond length distributions between the atoms that were bonded in the simulation with OPLS-AA. These distributions were computed for the simulations with the NeuralIL FF. In Fig. 1 they are compared to the equilibrium values of DFT calculations (vertical lines). These values were extracted from one of the training configurations, which was minimized using the same DFT settings as those used for the training dataset. Remarkably, the maxima of the bond length distributions lie only slightly above the corresponding DFT values (see Table 1). Moreover, as can be seen in the left panel, the bond length distributions between atoms of the \(\hbox {EMIM}^{+}\) cation are almost exactly the same regardless of the anion, which is also quite remarkable taking into account that the training processes for both systems are completely independent. Similar results are seen for the EAN and BAN systems, with only minor differences in the N-O bond of the \(\hbox {NO}_3^{\,\,-}\) anion and some expected differences in the bonds with carbon atoms of the two cations.

In addition, the bond-length distribution between carbon and nitrogen atoms in the \(\hbox {EMIM}^{+}\) cation was analyzed in detail. According to the OPLS parameterization, there are 4 different carbon atoms bonded to the nitrogen atoms of \(\hbox {EMIM}^{+}\) and, therefore, four different C-N bonds (see Fig. S2 top). On the other hand, NeuralIL only has information about the atomic element and, therefore, the four carbon atoms are treated equally in the FF. However, as can be seen in Fig. S2, NeuralIL is able to provide proper bond lengths for the four different types of bonds. This shows that NeuralIL is able to treat atoms of the same element differently depending on their environment, as expected for an ab initio-like calculation.

Fig. 1

Bond length distribution between the different atoms. Left, \(\hbox {EMIM}^{+}\) bond length distribution in the system with \(\hbox {BF}_{4}^{\,\,-}\) (solid) and \(\hbox {TFSI}^{-}\) (dashed). Right, comparison between the bond length distribution in EAN (solid) and BAN (dashed). Atom labels correspond to the name in OPLS-AA17.

Table 1 Average bond distances for the simulations with NeuralIL and OPLS-AA compared to their DFT values. The labels match those of Fig. 1.

Structural properties

Once the NeuralIL predictions of the intramolecular properties of the systems have been tested against their DFT counterparts, we must analyze its predictions of the intermolecular structural properties. Due to the scale of the systems, most of the comparisons were made with results of MD simulations with the OPLS force field.

The first property we can compare is the pair radial distribution function (RDF), whose usual expression is

$$\begin{aligned} \textrm{RDF}_{ab}(r) = \frac{V}{N_a N_b}\sum _{i=1}^{N_a}\sum _{j=1}^{N_b} \frac{\left< \delta (r_{ij}-r)\right>}{4\pi r^2}\,. \end{aligned}$$
(1)

This function was calculated between the anions and cations of the different systems. In order to compute the distances between the different species we used the following reference points: \(\hbox {EMIM}^{+}\), center of mass of the imidazolium ring; \(\hbox {EA}^{+}\), nitrogen atom; \(\hbox {BA}^{+}\), nitrogen atom; \(\hbox {NO}_{3}^{\,\,-}\), nitrogen atom; \(\hbox {TFSI}^{-}\), nitrogen atom; and \(\hbox {BF}_4^{\,\,-}\), boron atom. The results are shown in Fig. 2 for both the NeuralIL and OPLS-AA force fields. As can be seen there, the results for the ammonium-based ILs are very similar, showing only minor differences in the height of the first peak and, in the case of EAN, a broadening and smoothing of the second and third peaks.
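As an illustration of Eq. (1), the following minimal sketch estimates the RDF of a single frame by histogramming pair distances between two sets of reference points. It is not the analysis code used in this work: the function name, the cubic periodic box and the plain-Python implementation are assumptions made for clarity.

```python
import math

def radial_distribution(pos_a, pos_b, box, r_max, n_bins):
    """Histogram estimate of RDF_ab(r) for one frame in a cubic periodic box.

    pos_a, pos_b: lists of (x, y, z) reference points (for example ring
    centres of mass for cations and N or B atoms for anions).
    box: edge length of the cubic box; r_max should not exceed box / 2.
    Returns (bin_centres, g), with g tending to 1 for uncorrelated points.
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for a in pos_a:
        for b in pos_b:
            d2 = 0.0
            for k in range(3):
                d = a[k] - b[k]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            r = math.sqrt(d2)
            if 0.0 < r < r_max:
                counts[min(int(r / dr), n_bins - 1)] += 1
    # Prefactor V / (N_a N_b) of Eq. (1)
    norm = box ** 3 / (len(pos_a) * len(pos_b))
    centres, g = [], []
    for i, c in enumerate(counts):
        # Volume of the spherical shell covered by bin i (replaces 4*pi*r^2*dr)
        shell = 4.0 / 3.0 * math.pi * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
        centres.append((i + 0.5) * dr)
        g.append(norm * c / shell)
    return centres, g
```

In practice the histogram would be averaged over all frames of the trajectory, which corresponds to the ensemble average in Eq. (1).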

Fig. 2

Radial distribution functions between anions and cations of the different systems compared to their equivalent with the OPLS-AA and CL&Pol force fields.

However, the results for the imidazolium-based ILs do show some differences in the structure of the first peak (or group of peaks) of the RDFs. In particular, for the [EMIM][BF4] system, a secondary peak appears next to the first one. The origin of this secondary peak can already be guessed in the OPLS-AA results, which suggest that the same feature is present but merged with the first peak. This suggests that both FFs predict essentially the same structures, but those of NeuralIL are better defined due to the more localized interactions of this potential. In addition, this slight disagreement is also related to the poor OPLS-AA predictions of the hydrogen bond formation in [EMIM][BF4], as will be shown later.

On the other hand, the system that shows the most remarkable differences in the RDF between the classical FF and the NNFF is [EMIM][TFSI]. In this case, NeuralIL simulations show three well-defined peaks in the first solvation layer. The OPLS-AA simulation results also show three peaks in the first solvation layer, but they are much fuzzier (especially the first two) and show slightly altered positions with respect to the NeuralIL ones. Comparing the NeuralIL results with those of CL&Pol shows a very good agreement, especially for the [EMIM][BF4] system, with [EMIM][TFSI] showing a distribution in between those of OPLS-AA and NeuralIL.

Hydrogen bonding is a critical interaction for calculating the structural and dynamic properties of ILs, even those of aprotic ones44. These hydrogen bonds are very well characterized and reproduced by classical MD simulation for protic ILs (like EAN and BAN)45. On the other hand, some aprotic ILs show weak hydrogen bonds that do not fit into the more classic view of hydrogen bonding44,46. These hydrogen bonds are much more difficult to reproduce by classical MD simulations. In order to test the ability of the NeuralIL force field to predict the formation of hydrogen bonds, we calculated the distribution of Donor-Acceptor distances and Hydrogen-Acceptor-Donor angles. These distributions, which are normalized to the random distribution to take into account the non-uniformity of the angle distribution in 3D space, are shown in Fig. 3. This figure shows, as expected, almost identical results for the EAN and BAN systems. However, for [EMIM][TFSI] and [EMIM][BF4], the OPLS-AA simulations do not show results compatible with the formation of hydrogen bonds, in contrast to NeuralIL, which provides results that are consistent with previous DFT calculations44,46. Moreover, we performed DFT calculations of the different ionic pairs, with the same parameters as those of the training configurations, in order to compute the equilibrium configuration of the different hydrogen bonds. These values are also included in Fig. 3 and, as can be seen, they are very close to the maxima of the distributions but show a shorter distance than that of the MD simulations. This effect has already been reported in other studies using DFT and is due to the condensed phase elongating the hydrogen bonding distances47. There are, however, clear differences between the distributions of the protic and aprotic ILs. Due to their weaker interactions, the distributions for the two aprotic ILs ([EMIM][TFSI] and [EMIM][BF4]) shown in Fig. 3 are less localized, especially in the angular distribution.
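The angular normalization mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the three-point angle is generic (the middle argument is the vertex), coordinates are assumed to be already corrected for periodic images, and "normalized to the random distribution" is taken to mean dividing each bin by the fraction expected for randomly oriented vectors, which is proportional to \(\sin \theta\).

```python
import math

def angle_deg(p, vertex, q):
    """Angle p-vertex-q in degrees for three Cartesian points."""
    u = [p[k] - vertex[k] for k in range(3)]
    v = [q[k] - vertex[k] for k in range(3)]
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    # Clamp to [-1, 1] to guard against rounding before acos
    c = max(-1.0, min(1.0, dot / (nu * nv)))
    return math.degrees(math.acos(c))

def normalised_angle_histogram(angles_deg, n_bins=18):
    """Angle histogram on [0, 180] divided by the random expectation per bin."""
    width = 180.0 / n_bins
    counts = [0] * n_bins
    for a in angles_deg:
        counts[min(int(a / width), n_bins - 1)] += 1
    total = len(angles_deg)
    result = []
    for i, c in enumerate(counts):
        lo = math.radians(i * width)
        hi = math.radians((i + 1) * width)
        # Fraction of random orientations in the bin: integral of sin(t)/2
        random_fraction = 0.5 * (math.cos(lo) - math.cos(hi))
        result.append((c / total) / random_fraction)
    return result
```

With this normalization a structureless liquid gives a flat line at 1, so any peak reflects a genuine orientational preference rather than the larger solid angle available near 90 degrees.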

Fig. 3

Hydrogen bond length and angle distribution normalized to the random distribution for the different systems and force fields. Orange dots show the equilibrium value for gas phase calculations with the same DFT parameters as the training configurations.

Dynamic properties

One of the main limitations of the usage of classical FFs like OPLS-AA for the study of ILs is the absence of electron density polarization. This is known to produce much slower diffusive properties than their experimental counterparts48 and, while there are alternatives for adding the polarization term42,49, they require the reparameterization of the FF and additional terms to prevent the so-called polarization catastrophe43,50.

We computed the mean squared displacement (MSD) of the center of mass of the different ions for the systems with \(\hbox {EMIM}^{+}\) cations, and the results are shown in Fig. 4. Firstly, we can see that for both FFs the diffusion of \(\hbox {EMIM}^{+}\) in mixtures with \(\hbox {BF}_{4}^{\,\,-}\) is faster than in those with \(\hbox {TFSI}^{-}\). Moreover, both FFs predict similar ballistic regimes with durations of ca. 0.5 ps. Once the ballistic regime ends, all systems enter a subdiffusive regime where \(\text {MSD}\propto t^\gamma\) with \(\gamma < 1\). However, the diffusion during this regime is very different depending on the employed force field. With non-polarizable force fields, it usually takes around 1 ns to reach the diffusive regime48, but with NeuralIL we see that the diffusive regime is reached much earlier, which allows for an estimation of the diffusion coefficient with shorter simulations (this has already been observed in AIMD simulations51). This faster diffusion is expected, as NeuralIL naturally includes the polarization of the electronic density due to being trained with DFT calculations. The results using the CL&Pol force field, which models this polarization effect, are indeed of the same order of magnitude as those of NeuralIL. Moreover, we can compare the value for the [EMIM][TFSI] system with AIMD results from the literature51, extracted from Fig. 5 of that reference using the PlotDigitizer software52 (see Fig. 4). These ab initio results are also of the same order of magnitude as those of NeuralIL and CL&Pol, but are much closer to those of NeuralIL. The values for the diffusion coefficient can be seen in Table 2.
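A diffusion coefficient can be estimated from the linear part of an MSD curve. The sketch below assumes the standard three-dimensional Einstein relation, \(\text {MSD}(t) \approx 6Dt\), and a least-squares line fitted over a time window (the default of 10 to 90 ps follows the window quoted for Table 2). The function name and the input units are illustrative assumptions, not a description of the scripts used in this work.

```python
def diffusion_coefficient(times_ps, msd_A2, t_min=10.0, t_max=90.0):
    """Einstein-relation estimate D = slope / 6 from a linear fit of MSD(t).

    times_ps: times in ps; msd_A2: MSD values in Angstrom^2.
    Only points with t_min <= t <= t_max enter the fit.
    Returns D in m^2/s.
    """
    pts = [(t, m) for t, m in zip(times_ps, msd_A2) if t_min <= t <= t_max]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the fitting window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_m = sum(m for _, m in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in pts)
    slope = sxy / sxx              # Angstrom^2 / ps
    return slope / 6.0 * 1e-8      # 1 Angstrom^2/ps = 1e-8 m^2/s
```

The fit is only meaningful once the MSD has left the ballistic and subdiffusive regimes, which is why the start of the window matters as much as its length.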

Fig. 4

MSDs for the \(\hbox {EMIM}^{+}\) cation in mixtures with \(\hbox {TFSI}^{-}\) (brown lines) and \(\hbox {BF}_{4}^{\,\,-}\) (red lines) in the NeuralIL (solid lines) and OPLS-AA (dashed lines) simulations.

Table 2 Diffusion coefficients for \(\hbox {EMIM}^{+}\) with the different force fields. The coefficients were computed by fitting the MSDs from Fig. 4 between 10 and 90 ps.

Li-salt mixtures

To complete the systematic testing of NeuralIL, we also performed a simulation of a binary mixture of EAN with a 1/9 molar fraction of LiNO3. The training and simulation of this system were the same as for the pure ILs. The training was performed with 16 EAN and 2 LiNO3 ionic pairs and the simulation with 256 EAN and 32 LiNO3 ionic pairs.

We computed the RDFs between the anion and cation of the IL, and also that between \(\hbox {Li}^{+}\) and the anion of the IL. The results are shown in Fig. S4 and, as can be seen there, while the general structure is similar for both FFs, there are some remarkable differences between them. The main discrepancy is that, while both FFs show the same peaks for the Li-NO3 RDF, the relative intensity of the first two changes. In particular, OPLS-AA favours a monodentate configuration, with a larger coordination distance, while NeuralIL favours the bidentate configuration. It must be noted that this prominence of the bidentate configuration has been reported previously in the literature in neutron scattering experiments53. Therefore, once again the prediction of NeuralIL is more accurate than the OPLS-AA one. This change in the most favourable coordination distance is reflected in the anion-cation RDF, which shows similar structures but with a larger coordination distance in the OPLS-AA simulations. This could be an effect of \(\hbox {Li}^{+}\) ions being coordinated in the monodentate configuration. Moreover, the simulation with OPLS-AA predicts a secondary peak attached to the first one for the anion-cation RDF, while NeuralIL predicts a single peak. The correctness of the NeuralIL prediction is also backed up by the same experimental findings53.

From the dynamic point of view, we analyzed the MSD of \(\hbox {Li}^{+}\) ions with the different FFs (see Fig. 5). As expected, the diffusion with the NeuralIL force field is more than one order of magnitude faster than with OPLS-AA, with the results of CL&Pol also being faster than those of OPLS-AA but slower than those of NeuralIL. On the other hand, the shorter time diffusion is very similar between both FFs, including the valley observed in both MSDs at ca. 0.05 ps. The estimation of the diffusion coefficient from the MSD provides a value of \((4.5\pm 1.3)\cdot 10^{-11}\;\mathrm {m^2/s}\) for the NeuralIL FF, \((2.34\pm 0.84)\cdot 10^{-12}\;\mathrm {m^2/s}\) for OPLS-AA and \((7.6\pm 3.3)\cdot 10^{-12}\;\mathrm {m^2/s}\) for CL&Pol. The value reported in the literature is \(3.0\cdot 10^{-11}\;\mathrm {m^2/s}\)54, which is fully compatible with the prediction of NeuralIL, but one order of magnitude higher than the prediction of OPLS-AA (as expected48). The results for the polarizable force field are an improvement over the non-polarizable ones, but they are still too low. This may have to do with the transferability of the force field, which may need a re-parameterization depending on the concentration of lithium salt. This is also true for the NeuralIL force field, but the use of machine learning allows for a mostly automated way of parameterizing the force field.

Fig. 5

MSD for \(\hbox {Li}^{+}\) cations in the NeuralIL, OPLS-AA and CL&Pol simulations. Dashed lines correspond to the fittings for the diffusion coefficient.

Finally, we also computed the cage autocorrelation function (CACF) between the nitrate anions and the lithium ions, which is defined as

$$\begin{aligned} \textrm{CACF}(\tau ) = \frac{\langle \theta _{ij}(t)\theta _{ij}(t+\tau )\rangle }{\langle \theta _{ij}(t)\theta _{ij}(t)\rangle }\,, \end{aligned}$$
(2)

where the function \(\theta _{ij}(t)\) is defined for every pair of i-lithium, j-nitrate as

$$\begin{aligned} \theta _{ij}(t) = {\left\{ \begin{array}{ll} 0& \;\;r_{ij}(t) > r_{\textrm{cut}}\\ 1& \;\;r_{ij}(t) \le r_{\textrm{cut}}\\ \end{array}\right. } \end{aligned}$$
(3)

with \(r_{\textrm{cut}}\) obtained from the first minimum after the first solvation shell in the RDF. The \(\text {CACF}(\tau )\) represents the fraction of lithium ions that are still coordinated with the same anion after a time \(\tau\). This function is represented in Fig. S5 and, as can be seen, the decorrelation is much faster with the NeuralIL force field than with OPLS-AA. This change is, once again, due to the inclusion of polarization and has been reported previously when comparing non-polarizable and polarizable force fields48.
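Equations (2) and (3) translate almost directly into code. The following sketch is an assumption-laden illustration rather than the implementation used here: it takes a precomputed table of lithium-nitrate pair distances per frame (a hypothetical input layout), builds the indicator \(\theta _{ij}(t)\) with the cutoff, and averages over pairs and time origins.

```python
def cage_autocorrelation(distances, r_cut, max_lag):
    """CACF(tau) for tau = 0 .. max_lag frames.

    distances[t][p] is the distance of Li-nitrate pair p at frame t.
    The pair indicator theta is 1 when r <= r_cut and 0 otherwise (Eq. 3).
    Requires max_lag < number of frames and at least one pair inside r_cut.
    """
    n_frames = len(distances)
    theta = [[1 if r <= r_cut else 0 for r in frame] for frame in distances]

    def correlation(lag):
        # Average of theta(t) * theta(t + lag) over pairs and time origins
        total, n = 0, 0
        for t in range(n_frames - lag):
            for a, b in zip(theta[t], theta[t + lag]):
                total += a * b
                n += 1
        return total / n

    c0 = correlation(0)
    return [correlation(lag) / c0 for lag in range(max_lag + 1)]
```

For example, `cage_autocorrelation([[1.0, 5.0], [1.0, 5.0], [5.0, 5.0]], r_cut=2.0, max_lag=2)` returns `[1.0, 0.75, 0.0]`: the only coordinated pair survives one frame and is lost by the third.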

Proton transfer reactions

Finally, an additional weakness of traditional force fields is that they lack the ability to describe chemical reactions in the system. In particular, proton transfer reactions are of significant interest for ILs because protic ILs are synthesized by acid-base reactions. In fact, a new kind of ILs, called pseudo-ILs, is currently gaining traction due to their enhanced conductivity provided by proton transfer inside the liquid. In these pseudo-ILs, there is an equilibrium between an acid-base pair and their ionized form. One of the most studied is the one composed of methylimidazole (MIM) and acetic acid55:

$$\begin{aligned} \hbox {CH}_{3}\hbox {COOH} + \hbox {MIM} \rightleftharpoons \hbox {CH}_{3}\hbox {COO}^{-} + \hbox {HMIM}^{+} \end{aligned}$$

This system is of particular interest as its equilibrium has been theoretically characterized using DFT calculations56, which allows for a direct comparison with the results of the NNFF. In order to avoid a bias in the composition of the system towards the expected equilibrium value, the NNFF was trained starting from configurations far from equilibrium. The initial system was an equal mixture of [CH3COO]-\([\hbox {HMIM}]^{+}\) and MIM (no molecules of CH3COOH were present in the initial training set). The training system was composed of 8 pairs of [CH3COO]-\([\hbox {HMIM}]^{+}\) and 8 MIM molecules, while the production simulation contained (initially) 64 pairs of [CH3COO]-\([\hbox {HMIM}]^{+}\) and 64 MIM molecules. We must note that while no molecules of CH3COOH were present in the initial training set, they were present in the extended set, as they were predicted by the first-generation force field (see Figs. S1, S6). During the production run, proton transfers were observed between MIM and \(\hbox {HMIM}^{+}\) species (which do not alter the composition of the system) and between \(\hbox {HMIM}^{+}\) and \(\hbox {CH}_{3}\hbox {COO}^{-}\), generating CH3COOH molecules, which were not explicitly present in the initial training set. Therefore, it is possible to follow the evolution of the equilibrium coefficient

$$\begin{aligned} Q = \frac{[{\hbox {HMIM}^{+}}][{\hbox {CH}_{3}\hbox {COO}^{-}}]}{[{\hbox {MIM}}][{\hbox {CH}_{3}\hbox {COOH}}]}\,. \end{aligned}$$
(4)

The evolution of this coefficient during the simulation is represented in Fig. 6 together with the prediction from Ref.56, which establishes that at equilibrium the ionic species represent between 20 and \(30\%\) of the system. As can be seen, the system quickly changes its composition to reach a value just inside the theoretically predicted window.
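The bookkeeping behind such a coefficient can be sketched as follows, taking it as the product of the ionic concentrations over the product of the neutral ones for the reaction written above. Because two species appear in both the numerator and the denominator, the volume cancels and molecule counts can be used directly. The function names and the idea of tracking the number of net transfers `k` are illustrative assumptions; the default counts are those of the initial production configuration quoted in the text.

```python
import math

def reaction_quotient(n_hmim, n_acetate, n_mim, n_acetic_acid):
    """Quotient for CH3COOH + MIM <=> CH3COO- + HMIM+ from species counts.

    Ionic species go in the numerator and neutral species in the
    denominator. Returns infinity while no neutral acid or base is present.
    """
    denominator = n_mim * n_acetic_acid
    if denominator == 0:
        return math.inf
    return (n_hmim * n_acetate) / denominator

def composition_after_transfers(k, n_pairs=64, n_mim=64):
    """Species counts after k net HMIM+ -> CH3COO- proton transfers.

    Starts from n_pairs ionic pairs and n_mim neutral MIM molecules;
    each net transfer consumes one ion of each sign and creates one
    MIM and one CH3COOH molecule.
    """
    return {
        "HMIM+": n_pairs - k,
        "CH3COO-": n_pairs - k,
        "MIM": n_mim + k,
        "CH3COOH": k,
    }
```

Transfers between MIM and \(\hbox {HMIM}^{+}\) leave all four counts unchanged, so only the net number of transfers to acetate enters the quotient.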

Fig. 6

Evolution of the equilibrium coefficient defined by Eq. (4). In orange, the theoretical value range for the equilibrium constant given by Ref.56.

Having access to the full composition of the system also allows us to evaluate the kinetics of the reaction. A possibility would be to use the method proposed by Yamamoto et al.57, but that requires calculating the correlations of the time derivative of the system composition, which would need very large systems to produce stable values. Another, more manageable, way to analyze the reaction is through the potential of mean force (PMF). This can be computed for hydrogen atoms that are forming a hydrogen bond between \(\hbox {CH}_{3}\hbox {COO}^{-}\) and MIM molecules. Then we can compute the probability of the presence of the hydrogen atom at a distance from the middle of the two adsorption sites (negative values corresponding to the hydrogen atom being closer to the MIM molecule). The PMF can then be calculated as

$$\begin{aligned} \textrm{PMF}(r) = -k_BT\log [P(r)], \end{aligned}$$
(5)

where P(r) is the probability density of finding a hydrogen atom at a distance r from the midpoint. The result is represented in Fig. 7 and, as can be seen, the barrier for going from the ionic state to the neutral one is much lower than that of the reverse process (\(0.19\,\mathrm {kcal\,mol^{-1}}\) vs \(1.21\,\mathrm {kcal\,mol^{-1}}\)). This is compatible with the equilibrium favoring the neutral state and in line with values previously reported in the literature for gas phase calculations58.
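Equation (5) can be evaluated from a histogram of the sampled hydrogen positions. The sketch below is a minimal illustration under stated assumptions: the function name and binning are hypothetical, the default temperature of 298 K is the one quoted for the production runs, and energies are shifted so that the lowest value is zero, as in Fig. 7.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def pmf_from_samples(positions, r_min, r_max, n_bins, temperature=298.0):
    """PMF(r) = -kB*T*log P(r) from sampled positions along a coordinate.

    positions: sampled signed distances from the midpoint.
    Returns (bin_centres, pmf) in kcal/mol, referenced to the minimum.
    Empty bins are assigned an infinite PMF.
    """
    width = (r_max - r_min) / n_bins
    counts = [0] * n_bins
    for r in positions:
        if r_min <= r < r_max:
            counts[min(int((r - r_min) / width), n_bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples inside [r_min, r_max)")
    kt = K_B_KCAL * temperature
    pmf = []
    for c in counts:
        if c > 0:
            density = c / (total * width)  # normalised probability density
            pmf.append(-kt * math.log(density))
        else:
            pmf.append(math.inf)
    lowest = min(pmf)
    centres = [r_min + (i + 0.5) * width for i in range(n_bins)]
    return centres, [p - lowest for p in pmf]
```

Each barrier is then the difference between the PMF at the transition state and the PMF at the corresponding local minimum.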

Fig. 7

PMF as defined by Eq. (5). Energies are referenced to their minima. The dotted lines correspond to the local minima and the dashed line to the transition state.

Conclusions

We have shown that NNFFs, and in particular NeuralIL, are able to correctly reproduce the properties of densely charged fluids. Moreover, they are able to solve endemic problems of the usual physics-inspired force fields when applied to these charged systems. The most prominent of these deficiencies is the sluggish dynamics predicted by force fields like OPLS-AA, with a diffusion coefficient that is around one order of magnitude smaller than the experimental value. In contrast, NeuralIL is able to reproduce the correct value, even with shorter simulations, due to the built-in polarizability of the electronic density.

Additionally, the prediction of the structural properties is also improved when using NeuralIL, specifically the ability to reproduce the weak hydrogen bonds present in some molecules, like the imidazolium cations. These hydrogen bonds cannot be easily observed with the simplistic approach of physics-inspired force fields, which rely only on constant partial charges to reproduce these complex interactions. Although NNFFs come with an extra computational cost that cannot be underestimated, implementations such as NeuralIL are able to efficiently harness the computational resources available in GPUs, thus allowing us to simulate systems large enough to replicate the bulk properties in a matter of days.

With this work, we have shown that MD simulations with NNFFs offer unprecedented predictive power, accuracy and efficiency. In the near future, this computational tool is expected to play a key role in facilitating the optimization of energy storage devices.

Methods

The simulations were carried out using the most recent version of the NeuralIL FF40. The training of the committee of neural networks for each system was performed using the following algorithm (summarized in Fig. S1).

Firstly, an MD simulation with a conventional force field (OPLS-AA17) was performed with a limited number of atoms (see Table S1), with a production run of 20 ns in the NVT ensemble at 373.15 K. From this production run, 500 configurations were extracted at equally spaced time intervals. In order to ensure that the configurations are not too close to the minimum-energy ones, a random displacement taken from a uniform distribution with a 0.05-Å width was applied to each atom. DFT calculations using GPAW59 were then performed with the same parameters as in Ref.39. Finally, the committee of neural networks was trained on the 500 configurations using the VELO optimizer60 for 51 epochs.
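The frame selection and perturbation steps of this first stage can be sketched as below. This is not the actual NeuralIL training pipeline: the function names are hypothetical, and the "0.05-Å width" is interpreted here as an independent uniform displacement in \([-0.025, 0.025]\) Å for each Cartesian component, which is an assumption.

```python
import random

def pick_equally_spaced(n_frames, n_select=500):
    """Indices of n_select frames at equal spacing along a trajectory."""
    if n_frames < n_select:
        raise ValueError("trajectory has fewer frames than requested")
    step = n_frames // n_select
    return [i * step for i in range(n_select)]

def displace(positions, width=0.05, rng=random):
    """Add a uniform random displacement to every coordinate.

    positions: list of (x, y, z) tuples in Angstrom.
    width: total width of the uniform interval, centred at zero
           (assumed to apply to each Cartesian component independently).
    rng: object providing uniform(a, b), e.g. random.Random(seed).
    """
    half = width / 2.0
    return [tuple(x + rng.uniform(-half, half) for x in atom)
            for atom in positions]
```

Passing a seeded `random.Random` instance as `rng` makes the perturbed configurations reproducible, which is convenient when the DFT reference calculations need to be repeated.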

With this first version of the trained FF, a 250-fs MD simulation at 298 K was performed starting from each of the 500 training configurations using the NNFF. Then a DFT calculation with the same parameters as before was performed for each of the final configurations of the MD simulation. These new 500 configurations were added to the training pool and the training was repeated with the same optimizer and number of epochs. The result of this second training was used as the production FF for the MD simulations. The parity plots for all the trained force fields can be seen in Fig. S3.

In the production simulations with NeuralIL, the MD integration was performed using the python module jaxmd61. These simulations were carried out for much larger systems (see Table S1). The initial configuration for each system was provided by an MD simulation with OPLS in Gromacs62. The Gromacs simulations consisted of an energy minimization followed by a 1 ns stabilization in the NPT ensemble at 298 K and 1 atm. The final configuration of the stabilization was used as the input of the production run with the NeuralIL FF. These production runs were performed in the NVT ensemble at 298 K using a Nosé-Hoover thermostat63 with a total duration of 100 ps. All simulations were performed using a time step of 1 fs.

Finally, in order to compare the results obtained with NeuralIL to those of a more robust force field, MD simulations were carried out in OpenMM, version 7.6. The polarizable CL&Pol force field was employed to explicitly include the electronic degrees of freedom. The parametrizations of all species were the ones reported by the original authors42,43. Simulation boxes with the same dimensions and number of particles as the ones used in the NNFF simulations were created. These configurations were stabilized in the NVT ensemble for 5 ns using a temperature-grouped Nosé-Hoover thermostat64,65, where the translational degrees of freedom were thermostatted at 298 K while the relative motion of Drude particles and Drude cores was kept at 1 K. Then, 100 ps production runs were carried out to compare with the results of NeuralIL. The same 1 fs time step as in the other simulations was used.