Learning non-local molecular interactions via equivariant local representations and charge equilibration

Fuchs, Paul; Sanocki, Michał; Zavadlav, Julija

doi:10.1038/s41524-025-01790-4

Published: 16 September 2025

Paul Fuchs¹,
Michał Sanocki^1,2 &
Julija Zavadlav^1,2

npj Computational Materials volume 11, Article number: 287 (2025)

Abstract

Graph Neural Network (GNN) potentials relying on chemical locality offer near-quantum mechanical accuracy at significantly reduced computational costs. Message-passing GNNs model interactions beyond their immediate neighborhood by propagating local information between neighboring particles while remaining effectively local. However, locality precludes modeling long-range effects critical to many real-world systems, such as charge transfer, electrostatic interactions, and dispersion effects. In this work, we propose the Charge Equilibration Layer for Long-range Interactions (CELLI) to address the challenge of efficiently modeling non-local interactions. This novel architecture generalizes the classical charge equilibration (Qeq) method to a model-agnostic building block for modern equivariant GNN potentials. Therefore, CELLI extends the capability of GNNs to model long-range interactions while providing high interpretability through explicitly modeled charges. On benchmark systems, CELLI achieves state-of-the-art results for strictly local models. CELLI generalizes to diverse datasets and large structures while providing high computational efficiency and robust predictions.

Introduction

Machine Learning Interatomic Potentials (MLIPs) are powerful tools for modeling interatomic interactions. They can achieve near-quantum mechanical accuracy at a fraction of the computational cost¹ and linear scaling with the number of particles. Thus, highly scalable and accurate MLIPs can enable precise simulations of larger systems and allow computational studies of complex phenomena that would otherwise be computationally inaccessible^2,3,4,5. Particularly, equivariant Graph Neural Network (GNN) MLIPs such as Allegro⁶ and MACE⁷ are highly expressive models and can learn potential energy surfaces end-to-end from data⁸. Therefore, these models generalize well even for chemically highly diverse datasets^9,10. However, strictly local MLIPs, which assume that atomic interactions are dominated by their immediate environment¹, and message-passing MLIPs, which propagate information beyond the immediate environment¹¹, cannot model interactions beyond a strict effective cutoff radius¹². Thus, these effectively local MLIPs can accurately capture short-range interactions⁶, but cannot capture long-range electrostatic interactions, charge transfer, and dispersion effects^3,13.

Without additional mechanisms to address long-range interactions, the effective locality of most MLIPs greatly limits their application to many real-world scenarios^1,14. Long-range interactions are crucial in several key physical phenomena, including molecular aggregation, protein folding, or the behavior of ionic liquids^15,16. For instance, in proteins, long-range interactions have been shown to play a significant role in their structure and function, with most residues participating in such interactions^15,17, some of which can theoretically span up to 15 Å¹⁸. As a result, even MLIPs with near quantum-level accuracy for short-range interactions would be unable to fully capture the behavior of such proteins without additional schemes considering these long-range effects.

The challenge of modeling long-range effects is a recognized obstacle in developing MLIPs^1,19. Thus, several approaches to incorporate long-range effects into GNN MLIPs have been proposed. Reciprocal space methods model long-range interactions by processing structural²⁰ or learned features^12,21 in Fourier space. However, these methods based on lattice vectors have limited generalizability, as they struggle with differently oriented structures or other supercells and cannot easily be applied to simulations in realistic conditions². The Euclidean Fast Attention (EFA) scheme² overcomes this limitation and respects relevant physical symmetries but requires integrating possible lattice orientations over the unit sphere, increasing computational costs. Methods such as Long-Short-Range Message-Passing, RANGE, and Erwin aim to model long-range interactions by improving the efficiency of message passing by utilizing coarse-grained or hierarchical representations^22,23,24. However, these methods are not directly generalizable to bulk systems.

On the other hand, physics-driven approaches have been proposed. The simplest approaches treat short-range interactions with MLIPs separately from long-range interactions. Therefore, long-range contributions, such as van der Waals²⁵ or electrostatic interactions^26,27, are subtracted from MLIP training data and added to MLIP predictions. However, long-range interactions often correlate with the immediate environments of atoms. For example, capturing long-range interactions through electrostatic effects requires atomic charges that depend on the dynamic chemical environment. Therefore, methods have since emerged that predict charges^28,29 or directly model long-range interactions³⁰, using features of the local environment. Still, these point charge-based methods generally assume chemical locality, which becomes problematic in systems where non-local effects dominate³¹. Moreover, direct charge prediction often requires additional correction schemes to ensure charge conservation and prevent unphysical behavior^28,29. Thus, traditional methods fail to account for, e.g., local bonding environments and the global electrostatic landscape³². The Charge Equilibration Neural Network (CENT)³² was introduced by Ghasemi et al. and later adapted by Ko et al. and Shaidu et al. to address these challenges in coupling long-range and short-range effects. The CENT method globally distributes charges via the Charge Equilibration method (Qeq) based on electrostatic features of the local environment extracted via a Behler-type neural network²⁶. Therefore, the CENT method correlates short- and long-range effects. Moreover, explicitly predicting charge distributions can be advantageous, as charge transfer can be observed, and systems in external electric fields can be simulated using predicted charges. Still, the CENT method does not explicitly account for short-range non-electrostatic interactions. 
To overcome this issue, the fourth-generation high-dimensional neural network potentials (4GHDNNPs) comprise a second neural network that models short-range interactions dependent on the charge state³. This method accurately captures global charge distributions in simple systems with non-local effects but requires training an additional neural network on ambiguously defined reference point charges^3,13. Thus, alternative methods predict electrostatic features alongside the short-range corrections using a single Behler-type neural network and a higher-fidelity Qeq scheme without reference charges¹³, or replace the Qeq method with a self-consistent method to represent electrostatic interactions using well-defined Maximally Localized Wannier Function Centers³³. Still, these methods require multi-step training procedures and employ Behler-type neural networks relying on hand-crafted descriptors. Therefore, they are not simple to generalize to chemically diverse datasets⁸.

Previous machine-learning approaches are often costly, hard to scale, or not simple to generalize to chemically diverse systems. Thus, this work introduces the Charge Equilibration Layer for Long-range Interactions (CELLI), a novel architectural building block for equivariant GNN MLIPs. By generalizing the Qeq method to chemically diverse systems, CELLI enables MLIPs to model long-range interactions and condition short-range interactions on the local charge environment via learned representations. In a series of experiments with crucial charge-transfer and charge-state dependence³, we show that integrating CELLI with Allegro⁶ and MACE⁷ can overcome the inherent locality of state-of-the-art MLIPs. Moreover, on the OE62 dataset³⁴, we demonstrate that CELLI can generalize across a more diverse chemical space while only marginally increasing computational costs. In addition, we employ CELLI on subsets of the SPICE dataset³⁵ to prove that it can produce stable molecular dynamics simulations.

Results

Charge equilibration layer for long-range interactions (CELLI)

With the Charge Equilibration Layer for Long-range Interactions (CELLI), we introduce the classical non-local Qeq method to recent expressive equivariant GNN architectures. Similar to Ko et al., we split the total potential energy U = U_Coul + ΔU into an electrostatic component U_Coul and a correction ΔU. Following the CENT approach³², CELLI leverages the Qeq method, accounting for non-local charge transfer, to compute partial charges Q and electrostatic energy U_Coul (Subsection “Charge equilibration method (Qeq)”) using features of the GNN. Subsequently, the GNN learns the correction ΔU dependent on non-local features provided through the equilibrated partial charges embedded by CELLI. Thus, instead of learning charges and the potential energy through separate NNs³, CELLI enables flexible integration of Qeq-based charge prediction and non-local interaction modeling within a single model.

We describe CELLI using the example of the Allegro architecture⁶, visualized in Fig. 1. Allegro is a strictly local equivariant GNN that learns scalar features ${{\boldsymbol{x}}}_{ij}^{l}$ and tensorial features V_ij for the directed edges ij of a graph through a sequence of L tensor-product layers (outlined in Subsection “Graph neural networks”), which we call Interaction Layers in the following.

**Fig. 1: Charge equilibration layer for long-range interactions (CELLI).**

To extend Allegro, we insert one instance of CELLI between two of the Interaction Layers. CELLI embeds and applies the Qeq method to learn long-range electrostatic interactions and partial charges using the latent scalar features. The latent scalar features encode many-body information of the environment to an order that increases with the number of preceding Interaction Layers⁶. As global charge transfer might affect the local electronic structure and thus the local many-body interactions³, CELLI embeds the charge environment into the latent features ${{\boldsymbol{x}}}_{ij}^{l+1}$ passed to the subsequent Interaction Layers. These layers can correlate charge information from neighbor atoms up to an order determined by their number. Thus, the placement of CELLI within the Interaction Layers determines the body order (and, for models that are not strictly local, such as MACE³⁶, the receptive field) of both the local environment description passed to the Qeq method and the local charge environment description learned by the subsequent Interaction Layers. Finally, the readout layer uses the latent features to predict per-edge energies, which sum up to the correction potential ΔU.

Environment embedding

The latent features from the previous l Interaction Layers encode scalar descriptions of the local edge environments. We use a multi-layer perceptron ${{\rm{MLP}}}_{{\mathcal{R}}}$ to predict a per-edge contribution to the electronegativity and, optionally, to the hardness: $\left({\tilde{\chi }}_{ij},{\tilde{J}}_{ij}^{{\mathcal{R}}}\right)={{\rm{MLP}}}_{{\mathcal{R}}}\left({{\boldsymbol{x}}}_{ij}^{l}\right)$. Summing the contributions of all directed edges from i to its neighbors j, multiplied by a species-invariant learnable factor f, yields the electronegativity ${\chi }_{i}=f{\sum }_{j\in {\mathcal{N}}(i)}{\tilde{\chi }}_{ij}$ of particle i.
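The aggregation of per-edge contributions into a per-atom electronegativity can be illustrated with a minimal NumPy sketch. The function name, the toy graph, and the per-edge values below are hypothetical placeholders; in the actual model the per-edge values would come from the learned MLP.

```python
import numpy as np

def aggregate_electronegativity(chi_edge, senders, n_atoms, f=1.0):
    """chi_i = f * sum of chi_edge over all directed edges starting at atom i."""
    chi = np.zeros(n_atoms)
    # np.add.at performs an unbuffered scatter-add, so repeated indices accumulate.
    np.add.at(chi, senders, chi_edge)
    return f * chi

# Toy graph (hypothetical): 3 atoms, directed edges 0->1, 0->2, 1->0, 2->0.
senders = np.array([0, 0, 1, 2])
chi_edge = np.array([0.5, -0.2, 0.1, 0.3])  # made-up per-edge MLP outputs
chi = aggregate_electronegativity(chi_edge, senders, n_atoms=3, f=2.0)
```

Because the sum runs only over edges that start at atom i, the resulting electronegativity depends solely on the local environment of that atom.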

Species embedding

We encode the particle species to an environment-independent contribution to the hardnesses ${\tilde{J}}_{i}^{Z}$. To ensure positive hardnesses, we use the generalized softplus activation function ${\sigma }_{+}({x}_{1},\ldots ,{x}_{k})=\log (1+\exp ({x}_{1})+\ldots +\exp ({x}_{k}))$ to combine the species-dependent with the environment-dependent hardness contributions. Thus, we obtain the particle hardness ${J}_{i}={\sigma }_{+}\left({\tilde{J}}_{i}^{Z},{\sum }_{j\in {\mathcal{N}}(i)}{\tilde{J}}_{ij}^{{\mathcal{R}}}\right)$. Appropriate radii can crucially determine whether the optimization converges. Therefore, we base the charge radii on single-bond covalent radii ${\gamma }_{i}^{\exp }$³⁷ and learn a species-dependent parameter ${\tilde{s}}_{i}$ that defines a positive scaling factor, yielding ${\gamma }_{i}=\frac{{\sigma }_{+}({\tilde{s}}_{i})}{\log (2)}{\gamma }_{i}^{\exp }$. With ${\tilde{s}}_{i}=0$, this expression recovers the reference radius because ${\sigma }_{+}(0)=\log (2)$. Additionally, we embed species as features c_i(Z_i) for later use in the charge embedding.
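The generalized softplus and the radius scaling can be sketched in a few lines of NumPy. The helper names and the numerical inputs are illustrative assumptions; only the two formulas are taken from the text.

```python
import numpy as np

def generalized_softplus(*xs):
    """sigma_+(x_1, ..., x_k) = log(1 + exp(x_1) + ... + exp(x_k)), evaluated stably."""
    # The leading 0.0 contributes the "1 +" term, since exp(0) = 1.
    arrs = np.broadcast_arrays(0.0, *xs)
    return np.logaddexp.reduce(np.stack(arrs), axis=0)

def scaled_radius(s_tilde, gamma_ref):
    """gamma_i = sigma_+(s_tilde) / log(2) * gamma_ref."""
    return generalized_softplus(s_tilde) / np.log(2.0) * gamma_ref

# Hypothetical inputs: a species term and a summed environment term for the hardness.
hardness = generalized_softplus(-5.0, -5.0)           # positive even for negative inputs
radius_default = scaled_radius(0.0, gamma_ref=0.75)   # s_tilde = 0 keeps the reference radius
radius_larger = scaled_radius(1.0, gamma_ref=0.75)    # s_tilde > 0 enlarges the radius
```

Since the logarithm's argument always exceeds one, both the hardness and the scaling factor stay strictly positive regardless of the sign of the learned inputs.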

Charge equilibration method (Qeq)

The charge equilibration method takes the environment-dependent electronegativities, species- or environment-dependent hardnesses, and species-dependent charge radii to predict partial charges and the coulombic potential (Subsection “Charge equilibration method (Qeq)”). As different systems require different treatments of the long-range electrostatic interactions, the method optionally employs, e.g., the Smooth Particle Mesh Ewald method³⁸ for periodic systems.
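As a rough illustration of what a charge equilibration step computes, the sketch below solves the textbook Qeq problem: minimize $\sum_i \chi_i Q_i + \frac{1}{2}\boldsymbol{Q}^{T}A\boldsymbol{Q}$ subject to a fixed total charge, using a Lagrange multiplier. The matrix `A` (hardnesses on the diagonal, screened Coulomb kernel off the diagonal) is assumed to be assembled elsewhere; the function name and toy numbers are hypothetical, and the paper's own formulation in its Methods may differ in detail.

```python
import numpy as np

def solve_qeq(chi, A, q_total=0.0):
    """Minimize chi^T Q + 0.5 Q^T A Q subject to sum(Q) = q_total.

    Stationarity of the Lagrangian gives the linear system
        [A   1] [Q     ]   [-chi   ]
        [1^T 0] [lambda] = [q_total].
    """
    n = len(chi)
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = A
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([-np.asarray(chi, dtype=float), [q_total]])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n], solution[n]  # charges, Lagrange multiplier

# Toy two-atom system (made-up numbers): atom 0 is more electronegative.
chi = np.array([0.2, -0.2])
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
charges, lam = solve_qeq(chi, A, q_total=0.0)
```

In this toy case the more electronegative atom acquires the negative charge, and the constraint keeps the pair neutral. Every charge depends on every electronegativity through the inverse of the coupled system, which is the source of the non-locality.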

Charges embedding and latent feature update

We embed the equilibrated charge environment into the scalar features to provide non-local information to the network. Therefore, a first multi-layer perceptron MLP_Q generates charge-dependent features y_ij = MLP_Q(Q_i, Q_j, c_i, c_j) using the equilibrated charges and species embedding c_i of the central and neighbor atoms. These charge-dependent features y_ij are used by a second multi-layer-perceptron MLP_x to update the scalar features from the previous layer ${{\boldsymbol{x}}}_{ij}^{l}$ to ${{\boldsymbol{x}}}_{ij}^{l+1}={{\rm{MLP}}}_{{\boldsymbol{x}}}\left({{\boldsymbol{y}}}_{ij},{{\boldsymbol{x}}}_{ij}^{l}\right){p}_{{\rm{env}}}\left(\left\Vert {{\boldsymbol{R}}}_{i}-{{\boldsymbol{R}}}_{j}\right\Vert \right)$, where p_env is a polynomial envelope function.
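The two-stage update can be sketched with small, randomly initialized MLPs. Everything below is a hypothetical stand-in: the layer sizes, the `tanh` activation, and the specific polynomial envelope (the form popularized by DimeNet, with exponent `p = 6`) are assumptions chosen for illustration, not settings taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small MLP (illustrative only, not trained)."""
    return [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    for W, b in weights[:-1]:
        x = np.tanh(x @ W + b)
    W, b = weights[-1]
    return x @ W + b

def polynomial_envelope(r, r_cut, p=6):
    """Smooth cutoff that equals 1 at r = 0 and 0 at r = r_cut (assumed form)."""
    x = np.clip(r / r_cut, 0.0, 1.0)
    return (1.0 - (p + 1) * (p + 2) / 2 * x**p
            + p * (p + 2) * x**(p + 1) - p * (p + 1) / 2 * x**(p + 2))

def celli_feature_update(x_edge, q, c, senders, receivers, r, r_cut, mlp_q, mlp_x):
    # y_ij = MLP_Q(Q_i, Q_j, c_i, c_j)
    y_in = np.concatenate([q[senders, None], q[receivers, None],
                           c[senders], c[receivers]], axis=1)
    y = mlp(y_in, mlp_q)
    # x_ij^{l+1} = MLP_x(y_ij, x_ij^l) * p_env(r_ij)
    x_new = mlp(np.concatenate([y, x_edge], axis=1), mlp_x)
    return x_new * polynomial_envelope(r, r_cut)[:, None]

# Toy inputs: 3 atoms, 4 directed edges, 4 species features, 8 scalar edge features.
senders, receivers = np.array([0, 0, 1, 2]), np.array([1, 2, 0, 0])
q = np.array([-0.4, 0.3, 0.1])
c = rng.normal(size=(3, 4))
x_edge = rng.normal(size=(4, 8))
r = np.array([1.0, 6.0, 1.0, 6.0])
mlp_q, mlp_x = init_mlp([2 + 2 * 4, 16, 8]), init_mlp([8 + 8, 16, 8])
x_new = celli_feature_update(x_edge, q, c, senders, receivers, r, 6.0, mlp_q, mlp_x)
```

Multiplying by the envelope makes the updated features vanish smoothly for edges at the cutoff, so atoms entering or leaving the neighborhood do not cause discontinuities.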

Benchmark systems with strictly local models

First, we test our approach on four benchmark systems introduced by Ko et al. These systems were constructed to be unsolvable by strictly local and charge-independent methods, and they allow for visual inspection to verify that the model does not exhibit unphysical behavior. They comprise carbon chains, silver clusters, sodium chloride clusters, and gold dimers on a MgO(001) surface (Methods, Section “Systems and datasets”). They have also been used in related works, enabling a direct comparison with other approaches based on strictly local models^3,13,30.

In almost all cases, we observed significant improvements over the models presented by Ko et al., Shaidu et al., and Kim et al. (Table 1). This improvement might arise from the use of equivariant GNNs instead of the Behler-Parrinello Neural Networks utilized in 4GHDNN-based models, and from a learned embedding of the charge environment. In the Supplementary Information, we performed additional studies indicating that, in most cases, the highest accuracy is obtained with environment charge embedding rather than local charge embedding (see Supplementary Table 3) and with CELLI inserted in a central position in the model (see Supplementary Table 4). Moreover, in most cases (except for the AuMgO system, where CACE-LR employed an increased cutoff) we achieved substantial accuracy gains not only over the baseline Allegro model, where our RMSE is in some cases several orders of magnitude lower, but also over the non-4GHDNN-based CACE-LR model, which captures long-range effects solely through local feature augmentation. These results suggest that the improved performance stems from integrating environment-dependent charges via the Qeq mechanism into the equivariant GNN framework, enabling accurate modeling of long-range interactions. Notably, the largest error reductions were achieved with CELLI when using environment-dependent hardnesses, suggesting that this variant should be preferred in future applications. However, in some cases, the improved charge predictions from the environment-dependent hardness version did not lead to substantial changes in force or energy accuracy, indicating that prioritizing charge accuracy may not always result in a better overall model.

Table 1 Root mean square errors (RMSE) in units of meV/atom, meV/Å, and me, for CELLI applied to strictly local Allegro model in comparison to the baseline Allegro model and the previous local descriptor methods 4G³, LRSR¹³, and CACE-LR³⁰ modeling long-range interactions

In addition, we conducted experiments on these systems to verify whether the model could accurately represent long-range effects and the resulting changes in the PES. Allegro consistently fails these tests, showing significantly higher errors and producing unphysical results (see Fig. 2). In contrast, CELLI yields predictions that closely match DFT calculations and align with theoretical expectations. The results confirm that CELLI can effectively capture long-range charge transfer and electrostatics (carbon chains, NaCl clusters), handle differences between charged states (silver clusters), and accurately model energies and forces in charge-sensitive environments (gold dimers on MgO(001) surfaces). These findings underscore CELLI’s strength in representing critical phenomena in diverse systems.

**Fig. 2: Long-range and charge-dependent interactions benchmarks.**

Long-range interactions for message-passing models

To demonstrate that CELLI is applicable beyond strictly local architectures, we integrate it into the message-passing network MACE⁷. Using the scalar node features ${h}_{i}^{(l)}$ and Bessel radial basis edge embeddings e_rbf,ij (see Methods, Section “Graph neural networks”), we construct edge features ${x}_{ij}^{(0)}=({p}_{{\rm{env}}}({r}_{ij}){h}_{i}| | {e}_{{\rm{rbf}},ij})$, where p_env is the envelope function of the model and ∣∣ denotes concatenation. The output ${x}_{ij}^{(1)}$ from CELLI is aggregated into a weighted residual update of the scalar node features ${h}_{i}^{(l+1)}={h}_{i}^{(l)}+\varepsilon {\sum }_{j\in {\mathcal{N}}(i)}{x}_{ij}^{(1)}$, where ε is a learnable weight for message aggregation.
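The data flow of this integration can be sketched as follows. The CELLI layer itself is replaced by a hypothetical random linear map, and all dimensions and values are made up; only the concatenation and the residual aggregation follow the formulas in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def build_edge_features(h, e_rbf, env, senders):
    """x_ij^(0) = (p_env(r_ij) * h_i || e_rbf,ij)."""
    return np.concatenate([env[:, None] * h[senders], e_rbf], axis=1)

def residual_node_update(h, x_out, senders, eps):
    """h_i^(l+1) = h_i^(l) + eps * sum_j x_ij^(1)."""
    agg = np.zeros_like(h)
    np.add.at(agg, senders, x_out)  # scatter-add edge outputs onto central atoms
    return h + eps * agg

# Toy sizes (hypothetical): 3 atoms, 4 directed edges, 8 node features, 5 radial basis functions.
senders = np.array([0, 0, 1, 2])
h = rng.normal(size=(3, 8))
e_rbf = rng.normal(size=(4, 5))
env = np.array([0.9, 0.2, 0.9, 0.2])  # envelope values p_env(r_ij)

x0 = build_edge_features(h, e_rbf, env, senders)
# Stand-in for the CELLI layer: any map from x0 back to the node feature width.
W_placeholder = rng.normal(size=(x0.shape[1], h.shape[1]))
x1 = x0 @ W_placeholder
h_next = residual_node_update(h, x1, senders, eps=0.1)
```

With ε set to zero the update leaves the node features unchanged, so the residual form lets the network fall back to the baseline behavior.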

We assess the effect of CELLI using the benchmarks from the previous section, excluding the silver clusters due to already being fully contained in the receptive field of the strictly local Allegro model. Additionally, we compare our results to SpookyNet³¹ (Table 2).

Table 2 Root mean square errors (RMSE) in units of meV/atom, meV/Å, and me, for CELLI applied to the message-passing model MACE vs. baseline MACE and SpookyNet³¹

CELLI significantly enhances the performance of the baseline MACE model, in some cases achieving errors almost ten times lower. Both SpookyNet and CELLI(6) achieve particularly low errors compared to models with two message-passing steps, likely due to their deeper architectures with six message-passing layers, which allow them to capture complex interactions even in smaller systems. However, this depth may limit their applicability to larger systems, as using many message-passing layers can become computationally impractical⁶.

Interestingly, while CELLI tends to produce significantly larger errors in partial charge predictions than SpookyNet, these discrepancies do not consistently correlate with errors in energy or force predictions. This decoupling between partial charge accuracy and overall energy and force performance may reflect differences in how each model utilizes charge information, and it is consistent with previous findings that charge partitioning schemes may constrain model expressiveness¹³. Several factors could contribute: first, the predicted charges are solutions to a constrained quadratic minimization problem; second, partial charges do not fully capture the underlying electronic structure relevant to molecular energetics; and third, the incorporation of predefined physical equations, while interpretable, may reduce model flexibility.

Overall, this comparison underscores CELLI’s effectiveness in modeling systems with non-local interactions, even within message-passing neural networks.

Generalization to chemically diverse systems

The previous benchmarks consist of relatively small and simple systems. Therefore, they cannot demonstrate the generalizability across a wide chemical space and the advantageous scalability of our scheme with the size of the system. To this end, we included the OE62 dataset³⁴, which allows us to evaluate the performance of our model on larger and more diverse systems of varying sizes, providing a complementary assessment to the simpler benchmarks focused on single structures with different charge states. In addition, we evaluate the scalability of CELLI by measuring forward-pass times across systems with varying atom counts. We compare three versions of CELLI (two using Allegro and one using MACE) against four Allegro baselines, which differ in model size and the inclusion of an additional tensor product layer. We also include various MPNN architectures, including models that incorporate other long-range correction schemes: Ewald and Neural P³M^12,21 as well as baseline DimeNet++ with hyperparameters as in Kosmala et al. (Detailed Results in Supplementary Table 1).

Our results show that CELLI outperformed the baseline Allegro models (Table 3) with only a marginal increase in computational cost (Fig. 3). Notably, even the small CELLI variant outperformed the largest Allegro baseline model, demonstrating that applying the Qeq scheme is more effective than merely increasing model size, as certain effects cannot be captured without an appropriate long-range correction method. In fact, introducing CELLI into the strictly local Allegro model makes its performance comparable to MPNN architectures with state-of-the-art long-range correction schemes: CELLI combined with Allegro achieves results comparable to DimeNet++ with Neural P³M and significantly outperforms PaiNN models with Ewald and Neural P³M corrections, which have substantially more parameters and message-passing steps. Reducing the number of parameters and memory requirements of the models helps avoid memory-related issues³⁹ in large-scale simulations. Moreover, CELLI’s compatibility with a strictly local baseline model could increase its potential to scale efficiently across multiple GPUs³⁹. In the case of MACE, the improvement over the baseline is noticeably smaller, which is likely due to the already good performance of the baseline and the presence of message passing. While CELLI exhibits a slightly higher MAE than DimeNet++, it achieves a lower RMSE and substantially improved computational efficiency, reducing runtime by approximately a factor of two (Fig. 3). The discrepancy between the RMSE and MAE results could be due to more outliers in DimeNet++. Additionally, the MACE model has a higher potential to achieve high scalability in multi-GPU simulations than DimeNet++ and PaiNN, due to fewer message-passing steps. Thus, CELLI offers significant improvements for highly local baseline models with only a marginal increase in computational cost, which is mainly determined by the underlying architecture. Therefore, CELLI promises efficient and accurate MD simulations of large, complex structures.

Table 3 Summary of accuracy for all trained models (cutoff 0.6 nm) on the OE62 dataset compared to other long-range modeling approaches

**Fig. 3: Computational cost of CELLI.**

It is worth noting that in both the carbon chain benchmark and the OE62 dataset, CELLI combined with Allegro performs worse than the baseline MACE model without long-range corrections. These datasets feature small organic molecules where electrostatic interactions and long-range effects are not as pronounced as in other benchmark cases. Additionally, the OE62 dataset consists of ground-state geometries, further reducing the relevance of dynamic charge redistribution. Since these systems are largely dominated by local interactions, introducing message passing steps can improve the model more than CELLI, particularly when electrostatics play a limited role. Nevertheless, this case shows that CELLI can generalize to diverse chemical spaces and allows for a comparison to different models and long-range correction schemes.

Verifying simulation stability

Performing MD simulations requires models to be stable for many timesteps. To validate the robustness of CELLI, we perform a series of MD simulations at ambient conditions (see Methods section). To this end, we train a baseline and a CELLI-enhanced Allegro model on the SPICE dataset. We selected this dataset because it provides forces for non-equilibrium low- and high-energy structures. It therefore promotes model stability by covering conformations encountered in MD simulations more thoroughly than the OE62 dataset, which contains only minimum-energy structures.

Replacing one interaction layer with CELLI reduced the energy mean absolute error from 15.5 meV/atom to 9.4 meV/atom and the force mean absolute error from 81.4 meV/Å to 72.5 meV/Å. In the MD simulations, none of the 16 selected structures suffered from instabilities such as broken bonds or overlapping particles for either Allegro variant. Additionally, we analyzed the physical interpretability of the model by comparing the predicted electronegativities and hardnesses against computed references⁴⁰ and the predicted atomic radii against experimental references³⁷ (see Supplementary Fig. 2). The predictions correlate well with literature values within a physically reasonable margin (see Supplementary Note 1). Therefore, CELLI increases simulation accuracy for chemically diverse systems while remaining efficient, capturing physically meaningful interactions without introducing artifacts for samples unseen in training.

Discussion

This paper presents CELLI, a model-agnostic building block introducing the established Qeq method into highly descriptive equivariant GNN MLIPs. Using equivariant GNNs, CELLI can propose accurate parameters for chemically highly diverse environments. Through the Qeq method, CELLI integrates information about long-range electrostatic interactions and charge transfer into effectively local MLIPs. Therefore, CELLI offers a solution to the long-standing challenge of accurately modeling long-range interactions with MLIPs for chemically diverse systems and applications.

In a series of benchmark cases, we showed that strictly and effectively local MLIPs struggle with modeling long-range electrostatic effects and charge-state dependence. These models can effectively learn complex electrostatic environments through CELLI, significantly enhancing their predictive accuracy and physical validity. Moreover, we showed that CELLI can generalize to chemically diverse datasets and large molecules, marginally increasing the computational costs of the baseline model. Furthermore, in a series of molecular dynamics simulations, we demonstrated that CELLI provides robust predictions for samples unseen in training, which is crucial to running long and stable simulations.

Our method addresses crucial limitations of existing methods to model long-range interactions. On the one hand, by leveraging highly expressive equivariant GNNs, CELLI does not rely on hand-crafted descriptors as used in Behler-Parrinello type Neural Networks²⁶. Thus, CELLI-enhanced models can be trained end-to-end, making the Qeq approach applicable for modeling large and complex chemical systems. Moreover, end-to-end trained CELLI-enhanced models can learn representations for the charge environment, which is crucial to achieve state-of-the-art accuracies for strictly local GNNs. On the other hand, it is also significantly more cost-effective and generally applicable than other proposed machine learning methods. For example, CELLI does not require artificially defined periodicity and anisotropy as seen in lattice-based methods^2,12,21, but can be applied to systems with an arbitrary number of periodic dimensions. Additionally, CELLI can be applied to strictly local GNNs, which are highly parallelizable across multiple GPUs³⁹. Therefore, CELLI’s flexibility, in combination with high accuracy, generalizability, and efficiency, makes it ideal to run large-scale and accurate molecular dynamics simulations of complex systems under strict computational cost constraints.

In the future, we plan to interface CELLI with the large-scale molecular dynamics simulation framework LAMMPS⁴¹. LAMMPS provides efficient algorithms for charge equilibration and enables running molecular dynamics simulations in parallel on multiple GPUs. Therefore, this integration would simplify deploying CELLI to large-scale simulations. Moreover, we plan to extend CELLI with other physics-based priors^1,42,43, which might further reduce its costs while increasing accuracy and robustness. Finally, we plan to assess CELLI’s capabilities in predicting simulation observables, such as IR spectra in vacuum⁴⁴ and under electric fields⁴⁵, which require accurate modeling of dynamics and electrostatics.

Methods

Graph neural networks

Molecular systems can be represented as graphs by describing atoms as nodes and defining edges between neighboring atoms within a fixed cutoff radius, allowing GNNs to learn atom-centered representations. In the first step, GNNs embed this graph, assigning initial features ${{\boldsymbol{h}}}_{i}^{0}$ to the nodes and features x_ij to the edges from atom species Z_i and atom displacements. Subsequently, GNNs encode the graph by iteratively updating edge and node features that are finally read out to obtain node, edge, and graph property predictions.

The popular class of message-passing neural networks (MPNNs), first formalized by Gilmer et al., encodes the graph by iteratively performing message-passing

$${{\boldsymbol{m}}}_{i}^{l+1}=\sum _{j\in {\mathcal{N}}(i)}{{\mathcal{M}}}^{l}({{\boldsymbol{h}}}_{i}^{l},{{\boldsymbol{h}}}_{j}^{l},{{\boldsymbol{x}}}_{ij}),$$

(1)

$${{\boldsymbol{h}}}_{i}^{l+1}={{\mathcal{U}}}^{l}({{\boldsymbol{m}}}_{i}^{l+1},{{\boldsymbol{h}}}_{i}^{l}),$$

(2)

where ${{\mathcal{M}}}^{l}$ and ${{\mathcal{U}}}^{l}$ are learnable functions of the layer l. As messages ${{\boldsymbol{m}}}_{i}^{l}$ contain information from all graph neighbors $j\in {\mathcal{N}}(i)$ of a particle i, MPNNs pass information of each atom’s neighborhood along the graph. Therefore, message-passing gradually expands the atom’s receptive field and enables the capture of many-body correlations^6,46.
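The growth of the receptive field under Eqs. (1) and (2) can be made visible with a deliberately trivial choice of the learnable functions: the message is the neighbor's feature and the update is a residual sum. These choices, the chain graph, and the one-hot features are illustrative assumptions, not the functions used by any particular model.

```python
import numpy as np

def message_passing_layer(h, neighbors):
    """One step of Eqs. (1)-(2) with M = h_j (neighbor feature) and U = h_i + m_i."""
    m = np.zeros_like(h)
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            m[i] += h[j]
    return h + m

# Linear chain of 5 atoms; each atom is connected to its direct neighbors only.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
h0 = np.eye(5)  # one-hot features: column k marks information originating from atom k

h1 = message_passing_layer(h0, neighbors)
h2 = message_passing_layer(h1, neighbors)

reach_after_1 = np.flatnonzero(h1[0])  # atoms that influence atom 0 after one layer
reach_after_2 = np.flatnonzero(h2[0])  # ... and after two layers
```

After one layer atom 0 has seen atom 1, and after two layers it has also seen atom 2, although atom 2 is not a direct neighbor. The effective cutoff therefore grows with the number of layers but remains finite.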

Unfortunately, this information propagation complicates parallelized implementations of GNNs, e.g., in large-scale atomistic MD frameworks such as LAMMPS⁴¹. Therefore, strictly local architectures such as Allegro⁶ have been proposed. Conceptually, Allegro updates the directed edge features through the steps

$${{\boldsymbol{w}}}_{ij}^{l+1}=\sum _{k\in {\mathcal{N}}(i)}{{\mathcal{W}}}^{l}\left({{\boldsymbol{x}}}_{ij}^{l},{{\boldsymbol{x}}}_{ik}^{l}\right),$$

(3)

$${{\boldsymbol{x}}}_{ij}^{l+1}={{\mathcal{U}}}^{l}\left({{\boldsymbol{w}}}_{ij}^{l+1},{{\boldsymbol{x}}}_{ij}^{l}\right),$$

(4)

where ${{\boldsymbol{w}}}_{ij}^{l+1}$ contains information from all edges that originate from the same node. Corresponding to the message-passing framework, the function ${{\mathcal{W}}}^{l}$ encodes information about the environment of an edge into the update function ${{\mathcal{U}}}^{l}$. However, as two directed edges between nodes can contain different information (${{\boldsymbol{x}}}_{ij}^{l}\ne {{\boldsymbol{x}}}_{ji}^{l}$), no information is passed along the graph.
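The strict locality of this edge-based update can be checked on a toy graph. The following sketch is a simplification under stated assumptions: scalar edge features replace Allegro's equivariant features, the function 𝒲 is replaced by the feature of the second edge, and 𝒰 by an addition. The five-atom chain and the name `strictly_local_step` are hypothetical.

```python
def strictly_local_step(x, neighbors):
    """One strictly local update of directed edge features x[(i, j)].

    Placeholder choices (assumptions): W returns the feature of edge (i, k),
    and U adds the aggregate to the current edge feature.
    """
    new_x = {}
    for (i, j), x_ij in x.items():
        w_ij = sum(x[(i, k)] for k in neighbors[i])  # Eq. (3): edges sharing center i
        new_x[(i, j)] = x_ij + w_ij                  # Eq. (4): update from w_ij and x_ij
    return new_x


# Chain of five atoms 0-1-2-3-4; each atom only neighbors adjacent atoms.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# The initial feature of edge (i, j) only marks whether its end point j is atom 4.
x = {(i, j): (1.0 if j == 4 else 0.0) for i in neighbors for j in neighbors[i]}

for _ in range(10):
    x = strictly_local_step(x, neighbors)

print(x[(3, 4)] > 0.0, x[(2, 3)], x[(0, 1)])
```

Even after ten layers, the edges centered on atoms 0 and 2 remain zero, because each update only combines edges that originate from the same central atom. The marker on atom 4 therefore never leaves the neighborhood of atom 3.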

Efficient computation of electrostatic interactions

Electrostatic effects are commonly approximated by coulombic interactions. For a system of N charges Q with Gaussian density, located at the centers of the particles R, the coulombic interaction potential is

$${U}_{{\rm{Coul}}}({\boldsymbol{R}},{\boldsymbol{Q}})=\mathop{\sum }\limits_{i}^{N}\mathop{\sum }\limits_{j > i}^{N}\frac{{\rm{erf}}({\alpha }_{ij}{r}_{ij})}{{r}_{ij}}{Q}_{i}{Q}_{j}+\mathop{\sum }\limits_{i=1}^{N}\frac{2{\alpha }_{ii}}{\sqrt{\pi }}{Q}_{i}^{2},$$

(5)

where ${\alpha }_{ij}=\frac{1}{\sqrt{2}}{({\gamma }_{i}^{2}+{\gamma }_{j}^{2})}^{-1/2}$ depends on the radii γ_i of the charges separated by a distance ${r}_{ij}=\left\Vert {{\boldsymbol{R}}}_{i}-{{\boldsymbol{R}}}_{j}\right\Vert$³². These interactions extend over large distances because they decay only approximately as 1/r. Moreover, the contributions from distant charges must be accurately captured without truncation or oversimplification¹. Therefore, coulombic interactions are more challenging to model efficiently than, e.g., short-ranged van-der-Waals interactions.
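For small, non-periodic systems, Eq. (5) can be evaluated by direct summation. The sketch below follows Eq. (5) term by term as written, in units where the Coulomb constant equals one. The function names `alpha` and `coulomb_energy` and the example values are hypothetical, and the quadratic cost of the double loop is exactly what the methods in this section aim to avoid.

```python
import math


def alpha(gamma_i, gamma_j):
    """Width parameter alpha_ij defined below Eq. (5)."""
    return 1.0 / (math.sqrt(2.0) * math.sqrt(gamma_i ** 2 + gamma_j ** 2))


def coulomb_energy(positions, charges, gammas):
    """Direct O(N^2) evaluation of Eq. (5) for Gaussian charges (free boundaries)."""
    n = len(charges)
    energy = 0.0
    for i in range(n):
        # Second sum in Eq. (5): per-atom term proportional to Q_i^2.
        energy += 2.0 * alpha(gammas[i], gammas[i]) / math.sqrt(math.pi) * charges[i] ** 2
        for j in range(i + 1, n):
            # First sum in Eq. (5): erf-damped pair interaction for j > i.
            r_ij = math.dist(positions[i], positions[j])
            a_ij = alpha(gammas[i], gammas[j])
            energy += math.erf(a_ij * r_ij) / r_ij * charges[i] * charges[j]
    return energy


# Two opposite unit charges far apart: the pair term approaches Q_i * Q_j / r.
positions = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
energy = coulomb_energy(positions, charges=[1.0, -1.0], gammas=[0.5, 0.5])
print(energy)
```

At distances much larger than the charge radii, the error function saturates at one and the pair term reduces to the point-charge interaction, which is why the long-range tail cannot simply be truncated.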

Nevertheless, classical approaches have been proposed to model long-range interactions efficiently without computing direct pairwise interactions beyond a small cutoff. Essentially, these methods decompose the interaction potential into a rapidly decaying short-range part and a smooth but slowly decaying long-range part. The methods then treat the short-range part directly, like other short-range interactions. However, as the long-range part still accounts for contributions from distant charges, a more efficient computation requires a different treatment. For example, the Fast Multipole Method⁴⁷ hierarchically groups particles and computes distant interactions between these clusters collectively to achieve an O(N) scaling with respect to the number of particles. Especially for periodic systems, the Smooth Particle Mesh Ewald (SPME) method³⁸ computes long-ranged interactions more efficiently in reciprocal space. Similar to the short-ranged part in real space, the long-ranged part decays quickly in reciprocal space and can be truncated without losing accuracy. Additionally, by mapping charges to a grid with B-spline interpolation for smooth gradients and employing fast Fourier transforms, SPME achieves a computational complexity of $O(N\log N)$. Notably, SPME is not limited to periodic systems but can be generalized to systems with partially or fully non-periodic boundary conditions, e.g., to treat isolated clusters⁴⁸.
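The decomposition into short- and long-range parts can be made concrete with the error-function splitting commonly used in Ewald-type methods. The text above does not spell out the splitting function, so this particular choice, the function name `split_coulomb`, and the value of the splitting parameter are assumptions made for illustration only.

```python
import math


def split_coulomb(r, alpha):
    """Split 1/r into a short-range and a long-range part.

    Assumption: error-function splitting, 1/r = erfc(alpha r)/r + erf(alpha r)/r.
    """
    short_range = math.erfc(alpha * r) / r  # decays rapidly, can be cut off in real space
    long_range = math.erf(alpha * r) / r    # smooth and slowly decaying
    return short_range, long_range


alpha = 0.5  # hypothetical splitting parameter in inverse length units
for r in (1.0, 5.0, 10.0):
    sr, lr = split_coulomb(r, alpha)
    print(f"r={r:5.1f}  short={sr:.3e}  long={lr:.3e}  sum={sr + lr:.6f}")
```

The two parts always sum to 1/r, while the short-range part becomes negligible after a few multiples of 1/alpha. A larger splitting parameter shortens the real-space cutoff at the price of more reciprocal-space work.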

Charge equilibration method (Qeq)

Several approaches can compute long-ranged electrostatic interactions accurately and efficiently in many systems. Nevertheless, these interactions must be adequately parametrized for the respective systems by assigning partial charges to the atoms. Assigning fixed partial charges can introduce significant errors due to charge transfer induced by changes in the chemical environment³². Therefore, methods with dynamic partial charge assignment are necessary to accurately model molecular systems with significant electrostatic interactions.

To model environment-dependent partial charges, the Charge Equilibration (Qeq) method⁴⁰ proposes to redistribute charges in the system to minimize the total energy while maintaining charge conservation $\mathop{\sum }\nolimits_{i = 1}^{N}{Q}_{i}={Q}_{{\rm{tot}}}$. In the Qeq method, the contribution of charges to the total energy

$${U}_{{\rm{Qeq}}}({\boldsymbol{R}},{\boldsymbol{Q}})={U}_{{\rm{Coul}}}({\boldsymbol{R}},{\boldsymbol{Q}})+\mathop{\sum }\limits_{i=1}^{N}\left[{\chi }_{i}{Q}_{i}+\frac{{J}_{ii}}{2}{Q}_{i}^{2}\right]$$

(6)

consists of the coulombic interaction between charges U_Coul given in Eq. (5) and a second-order approximation of the charge-core interaction determined by the electronegativities χ_i and chemical hardnesses J_ii. Due to the form of the coulombic interaction, the charge energy is quadratic in Q. Consequently, the minimum of U_Qeq is the solution of the linear system

$$\left[{\left.\frac{{\partial }^{2}{U}_{{\rm{Coul}}}}{\partial {Q}_{i}\partial {Q}_{j}}\right| }_{{\boldsymbol{R}}}+{J}_{ii}\right]{Q}_{j}=-{\chi }_{i},$$

(7)

subject to the charge conserving equality constraint 1^TQ = Q_tot. For smaller systems, direct linear solvers can determine the optimal charges within a short runtime. However, due to the cubic scaling with the number of particles O(N³), several other approaches have been proposed to solve the system in quadratic⁴⁹ or quasi-linear time⁵⁰, leveraging efficient treatments of long-range interactions outlined in Section “Efficient computation of electrostatic interactions”.
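One common way to impose the charge constraint on Eq. (7) is a Lagrange multiplier, which enlarges the linear system by one row and one column. The following NumPy sketch shows this direct solve for a toy two-atom system. The function name `solve_qeq`, the matrix entries, and the electronegativities are made-up illustration values, and the sketch is not the solver used in CELLI.

```python
import numpy as np


def solve_qeq(a_matrix, chi, q_tot=0.0):
    """Minimize 0.5 * Q^T A Q + chi^T Q subject to sum(Q) = q_tot.

    a_matrix is the symmetric matrix of Eq. (7) (second derivatives of the
    coulombic energy plus the hardness on the diagonal). The constraint is
    enforced with a Lagrange multiplier, giving an (N + 1) x (N + 1) system.
    """
    a_matrix = np.asarray(a_matrix, dtype=float)
    chi = np.asarray(chi, dtype=float)
    n = chi.shape[0]

    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = a_matrix
    kkt[:n, n] = 1.0  # column of the Lagrange multiplier
    kkt[n, :n] = 1.0  # row enforcing charge conservation

    rhs = np.concatenate([-chi, [q_tot]])
    solution = np.linalg.solve(kkt, rhs)  # direct solve with cubic scaling
    return solution[:n], solution[n]


# Toy two-atom system with made-up parameters (assumption, arbitrary units).
a_matrix = [[2.0, 0.5], [0.5, 2.0]]
chi = [1.0, -1.0]
charges, multiplier = solve_qeq(a_matrix, chi, q_tot=0.0)
print(charges, charges.sum())
```

In this example, the atom with the larger electronegativity receives the negative charge, and the charges sum to the prescribed total. For large systems, the dense solve would be replaced by an iterative method combined with one of the long-range treatments mentioned above.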

Systems and datasets

The benchmark datasets for long-range and electrostatic interactions comprise four organic and inorganic systems with up to four different species in free and periodic boundary conditions³. For each system, DFT computations of energies and forces were obtained with the PBE functional, while charges were generated with Hirshfeld population analysis. The datasets are available at https://doi.org/10.24435/materialscloud:f3-yh.

The first benchmark system consists of neutral and charged carbon chains. C₁₀H₂ is a neutral linear chain of carbons terminated with hydrogen atoms, while ${{\rm{C}}}_{10}{{\rm{H}}}_{3}^{+}$ is obtained by protonating one end of the chain, leading to global charge redistribution. This system highlights how a given model accounts for long-range charge transfer caused by local perturbations.

The second benchmark involves triangular and linear silver trimers (Ag₃) with total charges of +1 and −1, respectively. These systems test the model’s ability to handle differences in charge states, geometries, and identification of energetically favorable conformations.

We also evaluated the sodium chloride clusters benchmark, consisting of ${{\rm{Na}}}_{8}{{\rm{Cl}}}_{8}^{+}$ and ${{\rm{Na}}}_{9}{{\rm{Cl}}}_{8}^{+}$. In these systems, moving a sodium atom along a predefined path reveals two distinct energy minima, which are sensitive to long-range electrostatics and charge redistribution and can demonstrate the model’s ability to accurately predict changes in the potential energy surface.

The final benchmark is a periodic system consisting of a gold dimer (Au₂) adsorbed on MgO(001) surfaces, both undoped and Al-doped. Two configurations were considered: “wetting,” where both Au atoms lie near Mg atoms, and “non-wetting,” where one Au atom binds to an O atom while the other remains farther away. These configurations assess the model’s ability to capture adsorption energies and forces in charge-sensitive periodic settings.

The OE62 dataset provides a diverse benchmark for the evaluation of our model, as it consists of 62,000 organic molecules extracted from the Cambridge Structural Database, with DFT-optimized geometries at the PBE level, including van der Waals corrections³⁴. OE62 spans a broad chemical space, with up to 174 atoms and 16 elements, offering a comprehensive test case for assessing the scalability and generalizability of models on chemically complex and diverse systems. The dataset is available at https://doi.org/10.14459/2019mp1507656.

The SPICE dataset⁵¹ (v2.0.1) spans a large chemical space of peptides and drug-like molecules consisting of 17 different chemical species, including low and high-energy conformations and systems with non-zero net charge. Each sample provides DFT-computed energies and forces using the ωB97M-D3(BJ) functional with dispersion correction and the def2-TZVPPD basis set, as well as MBIS charges. For this paper, we selected the Amino Acid Ligand, PubChem Sets, DES370K, DES Monomers, and Dipeptides subsets. The full dataset is available at https://doi.org/10.5281/zenodo.10975225.

Model optimization

We performed all experiments in the deep-learning framework JAX, using chemtrain⁵² to train the models. To this end, we adapted JAX-MD⁵³ and JAX-compatible implementations of Allegro⁵⁴, DimeNet++^55,56, and MACE^7,57.

The reference energies in the datasets contain large negative shifts. Therefore, we shift the reference energies U by species-dependent constant shifts U_s to obtain the target energies

$${\hat{U}}_{i}={U}_{i}-\mathop{\sum }\limits_{s=1}^{S}{U}_{s}{N}_{s,i}$$

(8)

where N_s,i counts the occurrences of species s in sample i. We determined the shifts U_s through a ridge-regression fit to the dataset.
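The shift fit can be sketched as a small ridge-regression problem on the species counts. The example below is a minimal illustration with NumPy. The regularization strength, the function name `fit_species_shifts`, and the toy energies are assumptions, since the text does not report these details.

```python
import numpy as np


def fit_species_shifts(counts, energies, ridge=1e-8):
    """Ridge-regression fit of the per-species shifts U_s.

    counts[i, s] is N_{s,i}, the number of atoms of species s in sample i,
    and energies[i] is the reference energy U_i. The ridge strength is a
    hypothetical value.
    """
    counts = np.asarray(counts, dtype=float)
    energies = np.asarray(energies, dtype=float)
    n_species = counts.shape[1]
    gram = counts.T @ counts + ridge * np.eye(n_species)
    return np.linalg.solve(gram, counts.T @ energies)


# Toy dataset with two species and three samples (made-up numbers).
counts = np.array([[4, 1], [6, 2], [2, 2]])
true_shifts = np.array([-0.5, -38.0])
energies = counts @ true_shifts

shifts = fit_species_shifts(counts, energies)
targets = energies - counts @ shifts  # Eq. (8)
print(shifts, targets)
```

Because the toy energies are exactly linear in the species counts, the fit recovers the shifts and the target energies vanish up to the small regularization bias. For real reference data, the residual carries the interaction energy that the model has to learn.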

We train the models via the Force Matching method^52,58. To this end, we optimize the parameters θ to minimize the loss function

$${\mathcal{L}}(\theta )=\frac{1}{D}\mathop{\sum }\limits_{i=1}^{D}\left[{\gamma }_{U}{\left\Vert {U}_{\theta }({{\boldsymbol{R}}}_{i})-{\hat{U}}_{i}\right\Vert }^{2}+\frac{{\gamma }_{F}}{3{N}_{i}}{\left\Vert {{\boldsymbol{F}}}_{\theta }({{\boldsymbol{R}}}_{i})-{\hat{{\boldsymbol{F}}}}_{i}\right\Vert }^{2}+\frac{{\gamma }_{Q}}{{N}_{i}}{\left\Vert {{\boldsymbol{Q}}}_{\theta }({{\boldsymbol{R}}}_{i})-{\hat{{\boldsymbol{Q}}}}_{i}\right\Vert }^{2}\right]$$

(9)

between the reference values ${\hat{U}}_{i},{\hat{{\boldsymbol{F}}}}_{i},{\hat{{\boldsymbol{Q}}}}_{i}$ and the model predictions U_θ, F_θ, Q_θ for the D samples R_i of the training dataset, where N_i is the number of atoms in sample i. We minimize the loss via stochastic optimization using the ADAM optimizer⁵⁹ and a polynomial step-size schedule with weight decay. The parameters γ_U, γ_F, γ_Q balance the contributions of the targets to the loss and are set separately for each problem. We monitor the convergence by empirically estimating the loss on a disjoint validation split and select the parametrization θ that yielded the lowest error on the validation split.
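The normalization in Eq. (9) can be made explicit with a plain-Python evaluation of the loss for a list of samples. This is a sketch under stated assumptions: the dictionary layout, the function name `force_matching_loss`, and the unit default weights are hypothetical, and an actual training run would evaluate the loss on batched arrays inside the optimizer.

```python
def force_matching_loss(predictions, references, gamma_u=1.0, gamma_f=1.0, gamma_q=1.0):
    """Evaluate Eq. (9) for a list of samples.

    Each sample is a dict with keys "U" (float), "F" (list of N_i force
    vectors with three components each) and "Q" (list of N_i charges).
    The default weights are placeholders.
    """
    total = 0.0
    for pred, ref in zip(predictions, references):
        n_atoms = len(ref["Q"])
        err_u = (pred["U"] - ref["U"]) ** 2
        err_f = sum(
            (f_pred - f_ref) ** 2
            for atom_pred, atom_ref in zip(pred["F"], ref["F"])
            for f_pred, f_ref in zip(atom_pred, atom_ref)
        )
        err_q = sum((q_pred - q_ref) ** 2 for q_pred, q_ref in zip(pred["Q"], ref["Q"]))
        # Forces are averaged over 3 * N_i components, charges over N_i atoms.
        total += gamma_u * err_u + gamma_f * err_f / (3 * n_atoms) + gamma_q * err_q / n_atoms
    return total / len(references)


# Single two-atom sample with made-up values.
reference = {"U": 0.0, "F": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "Q": [0.0, 0.0]}
prediction = {"U": 1.0, "F": [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "Q": [0.5, -0.5]}
loss = force_matching_loss([prediction], [reference])
print(loss)
```

Dividing the force and charge errors by the system size keeps large molecules from dominating the loss, so the weights balance the three targets independently of the number of atoms.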

Hyperparameters

Model cutoffs were chosen similarly to Ko et al.³ for the four benchmark systems and to Kosmala et al.¹² for the OE62 dataset (Supplementary Table 2). In the four benchmark systems and for the SPICE dataset, CELLI is replaced by an additional interaction layer to obtain the baseline Allegro model. For the OE62 dataset, CELLI is excluded without replacement (S, L) or replaced by an additional interaction layer (S+, L+) to obtain the Allegro baseline variants. For the MACE model, CELLI is always excluded without replacement. DimeNet++ hyperparameters are similar to those of Kosmala et al.¹², except for the loss function, which is chosen to comply with Eq. (9).

Timing OE62 forward passes

We evaluate the forward pass run times for all models on a single NVIDIA A100. For evaluating the computational performance, we partition the training split at [25, 50, 75, 115, 174] atoms per molecule and choose the 5000 largest structures. For each subset, we choose the maximum number of edges and triplets to account for the maximum required by any sample in the subset. We then time the forward pass for batch sizes of [1, 10, 25, 50, 100] for the ahead-of-time compiled model.

Simulating SPICE systems

We run MD simulations for 16 different systems drawn equally from the four different subsets PubChem Sets, Amino Acid Ligands, Dipeptides, and DES370K. Starting from a randomly selected conformation from the testing set, we run simulations for 1 ns with a step size of 0.5 fs at 300 K using a stochastic thermostat⁶⁰ with a friction coefficient of 100 ps⁻¹. Therefore, we perform 2 million update steps per model.

Data availability

The datasets used in this study are publicly available to download (see Methods). Adapted models, training, and evaluation scripts are not publicly available but may be made available to qualified researchers on reasonable request from the corresponding author.

Code availability

The software chemtrain used to train the models and perform MD simulations is publicly available at https://github.com/tummfm/chemtrain.

References

Anstine, D. M. & Isayev, O. Machine learning interatomic potentials and long-range physics. J. Phys. Chem. A 127, 2417–2431 (2023).
Frank, J. T., Chmiela, S., Müller, K.-R. & Unke, O. T. Euclidean fast attention: Machine learning global atomic representations at linear cost. Preprint at https://arxiv.org/abs/2412.08541 (2024).
Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021).
Röcken, S., Burnet, A. F. & Zavadlav, J. Predicting solvation free energies with an implicit solvent machine learning potential. J. Chem. Phys. 161, 234101 (2024).
Röcken, S. & Zavadlav, J. Accurate machine learning force fields via experimental and simulation data fusion. npj Comput. Mater. 10, 69 (2024).
Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).
Batatia, I., Kovács, D. P., Simm, G. N. C., Ortner, C. & Csányi, G. Mace: higher order equivariant message passing neural networks for fast and accurate force fields. Advances in Neural Information Processing Systems (NeurIPS), 2022.
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Kovács, D. P. et al. MACE-OFF: short-range transferable machine learning force fields for organic molecules. J. Am. Chem. Soc. 147, 17598–17611 (2025).
Yang, Z. et al. Efficient equivariant model for machine learning interatomic potentials. npj Comput. Mater. 11, 49 (2025).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning, vol. 70 of Proceedings of Machine Learning Research (eds Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).
Kosmala, A., Gasteiger, J., Gao, N. & Günnemann, S. Ewald-based long-range message passing for molecular graphs. In Proceedings of the 40th International Conference on Machine Learning, vol. 202 of Proceedings of Machine Learning Research (eds Krause, A. et al.) 17544–17563 (PMLR, 2023).
Shaidu, Y., Pellegrini, F., Küçükbenli, E., Lot, R. & de Gironcoli, S. Incorporating long-range electrostatics in neural network potentials via variational charge equilibration from shortsighted ingredients. npj Comput. Mater. 10, 47 (2024).
Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
Zhou, H.-X. & Pang, X. Electrostatic interactions in protein structure, folding, binding, and condensation. Chem. Rev. 118, 1691–1741 (2018).
Baskin, I., Epshtein, A. & Ein-Eli, Y. Benchmarking machine learning methods for modeling physical properties of ionic liquids. J. Mol. Liq. 351, 118616 (2022).
Gromiha, M. & Selvaraj, S. Importance of long-range interactions in protein folding. Biophys. Chem. 77, 49–68 (1999).
Galante, M. & Tkatchenko, A. Anisotropic van der Waals dispersion forces in polymers: Structural symmetry breaking leads to enhanced conformational search. Phys. Rev. Res. 5, L012028 (2023).
Frank, J. T., Unke, O. T. & Müller, K.-R. So3krates: equivariant attention for interactions on arbitrary length-scales in molecular systems. Advances in Neural Information Processing Systems (NeurIPS), 2022.
Yu, H., Hong, L., Chen, S., Gong, X. & Xiang, H. Capturing long-range interaction with reciprocal space neural network. Preprint at https://arxiv.org/abs/2211.16684 (2022).
Wang, Y. et al. Neural P³M: a long-range interaction modeling enhancer for geometric GNNs. In Advances in Neural Information Processing Systems (NeurIPS) 2024.
Li, Y. et al. Long-short-range message-passing: a physics-informed framework to capture non-local interaction for scalable molecular dynamics simulation. The Twelfth International Conference on Learning Representations (ICLR), 2024.
Caruso, A. et al. Extending the range of graph neural networks: relaying attention nodes for global encoding. Preprint at https://arxiv.org/abs/2502.13797 (2025).
Zhdanov, M., Welling, M. & van de Meent, J.-W. Erwin: a tree-based hierarchical transformer for large-scale physical systems. International Conference on Machine Learning (ICML), 2025.
Rowe, P., Deringer, V. L., Gasparotto, P., Csányi, G. & Michaelides, A. An accurate and transferable machine learning potential for carbon. J. Chem. Phys. 153, 034702 (2020).
Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
Coste, A., Slejko, E., Zavadlav, J. & Praprotnik, M. Developing an implicit solvation machine learning model for molecular simulations of ionic media. J. Chem. Theory Comput. 20, 411–420 (2023).
Song, Z., Han, J., Henkelman, G. & Li, L. Charge-optimized electrostatic interaction atom-centered neural network algorithm. J. Chem. Theory Comput. 20, 2088–2097 (2024).
Plé, T., Lagardère, L. & Piquemal, J.-P. Force-field-enhanced neural network interactions: from local equivariant embedding to atom-in-molecule properties and long-range effects. Chem. Sci. 14, 12554–12569 (2023).
Kim, D., King, D. S., Zhong, P. & Cheng, B. Learning charges and long-range interactions from energies and forces. Preprint at https://arxiv.org/abs/2412.15455 (2024).
Unke, O. T. et al. Spookynet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).
Ghasemi, S. A., Hofstetter, A., Saha, S. & Goedecker, S. Interatomic potentials for ionic systems with density functional accuracy based on charge densities obtained by a neural network. Phys. Rev. B 92, 045131 (2015).
Gao, A. & Remsing, R. C. Self-consistent determination of long-range electrostatics in neural network potentials. Nat. Commun. 13, 1572 (2022).
Stuke, A. et al. Atomic structures and orbital energies of 61,489 crystal-forming organic molecules. Sci. Data 7, 58 (2020).
Eastman, P. et al. SPICE, a dataset of drug-like molecules and peptides for training machine learning potentials. Sci. Data 10, 11 (2023).
Batatia, I. et al. The design space of E(3)-equivariant atom-centred interatomic potentials. Nat. Mach. Intell. 7, 56–67 (2025).
Pyykkö, P. & Atsumi, M. Molecular single-bond covalent radii for elements 1–118. Chem. A Eur. J. 15, 186–197 (2009).
Essmann, U. et al. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593 (1995).
Fuchs, P., Chen, W., Thaler, S. & Zavadlav, J. chemtrain-deploy: a parallel and scalable framework for machine learning potentials in million-atom MD simulations. J. Chem. Theory Comput. 15, 7550–7560 (2025).
Rappe, A. K. & Goddard III, W. A. Charge equilibration for molecular dynamics simulations. J. Phys. Chem. 95, 3358–3363 (1991).
Thompson, A. P. et al. LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
Thürlemann, M., Böselt, L. & Riniker, S. Regularized by physics: graph neural network parametrized potentials for the description of intermolecular interactions. J. Chem. Theory Comput. 19, 562–579 (2023).
Kabylda, A. et al. Molecular simulations with a pretrained neural network and universal pairwise force fields. J. Am. Chem. Soc. ASAP Article (2025).
Gastegger, M., Behler, J. & Marquetand, P. Machine learning molecular dynamics for the simulation of infrared spectra. Chem. Sci. 8, 6924–6935 (2017).
Gastegger, M., Schütt, K. T. & Müller, K.-R. Machine learning of solvent effects on molecular spectra and reactions. Chem. Sci. 12, 11473–11483 (2021).
Schütt, K. T., Unke, O. T. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. Proceedings of the 38th International Conference on Machine Learning. 9377–9388 (2021).
Greengard, L. & Rokhlin, V. A fast algorithm for particle simulations. J. Comput. Phys. 73, 325–348 (1987).
Martyna, G. J. & Tuckerman, M. E. A reciprocal space based method for treating long range interactions in ab initio and force-field-based calculations in clusters. J. Chem. Phys. 110, 2810–2821 (1999).
Nakano, A. Parallel multilevel preconditioned conjugate-gradient approach to variable-charge molecular dynamics. Comput. Phys. Commun. 104, 59–69 (1997).
Gubler, M., Finkler, J. A., Schäfer, M. R., Behler, J. & Goedecker, S. Accelerating fourth-generation machine learning potentials using quasi-linear scaling particle mesh charge equilibration. J. Chem. Theory Comput. 20, 7264–7271 (2024).
Eastman, P. et al. Spice 2.0.1 https://doi.org/10.5281/zenodo.10975225 (2024).
Fuchs, P., Thaler, S., Röcken, S. & Zavadlav, J. Chemtrain: Learning deep potential models via automatic differentiation and statistical physics. Comput. Phys. Commun. 310, 109512 (2025).
Schoenholz, S. S. & Cubuk, E. D. JAX MD: a framework for differentiable physics. J. Stat. Mech. Theory Exp. 2021, 124016 (2021).
Mario, G. & Daigavane, A. Allegro-Jax. https://github.com/mariogeiger/allegro-jax.
Thaler, S. & Zavadlav, J. Learning neural network potentials from experimental data via Differentiable Trajectory Reweighting. Nat. Commun. 12, 6884 (2021).
Gasteiger, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Machine Learning for Molecules Workshop at NeurIPS 2020.
Mario, G. & Daigavane, A. Mace-Jax. https://github.com/ACEsuit/mace-jax.
Ercolessi, F. & Adams, J. B. Interatomic potentials from first-principles calculations: the force-matching method. Europhys. Lett. 26, 583–588 (1994).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. The 3rd International Conference for Learning Representations, San Diego, 2015.
Goga, N., Rzepiela, A. J., de Vries, A. H., Marrink, S. J. & Berendsen, H. J. C. Efficient Algorithms for Langevin and DPD Dynamics. J. Chem. Theory Comput. 8, 3637–3649 (2012).


Acknowledgements

Funded by the European Union. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. This work was funded by the ERC (StG SupraModel) - 101077842 and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 534045056 and 561190767.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Multiscale Modeling of Fluid Materials, Department of Engineering Physics and Computation, TUM School of Engineering and Design, Technical University of Munich, Munich, Germany
Paul Fuchs, Michał Sanocki & Julija Zavadlav
Atomistic Modeling Center, Munich Data Science Institute, Technical University of Munich, Munich, Germany
Michał Sanocki & Julija Zavadlav


Contributions

P.F. and J.Z. conceptualized the study. P.F. developed the model and implemented the software. P.F. and M.S. performed the experiments and wrote the manuscript. M.S. analyzed the results. J.Z. supervised the project, reviewed the manuscript, provided resources, and acquired funding.

Corresponding author

Correspondence to Julija Zavadlav.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Fuchs, P., Sanocki, M. & Zavadlav, J. Learning non-local molecular interactions via equivariant local representations and charge equilibration. npj Comput Mater 11, 287 (2025). https://doi.org/10.1038/s41524-025-01790-4


Received: 18 June 2025
Accepted: 03 September 2025
Published: 16 September 2025
Version of record: 16 September 2025
DOI: https://doi.org/10.1038/s41524-025-01790-4

