Machine learning global atomic representations with Euclidean fast attention

Frank, J. Thorben; Chmiela, Stefan; Müller, Klaus-Robert; Unke, Oliver T.

doi:10.1038/s42256-026-01195-y

Article
Published: 25 March 2026

Nature Machine Intelligence volume 8, pages 388–402 (2026)

A preprint version of the article is available at arXiv.

Abstract

Long-range correlations are essential across numerous machine learning tasks, especially for data embedded in Euclidean space, where the relative positions and orientations of distant components are often critical for accurate predictions. Self-attention offers a compelling mechanism for capturing these global effects, but its quadratic complexity presents a significant practical limitation. This problem is particularly pronounced in computational chemistry, where the stringent efficiency requirements of machine learning force fields (MLFFs) often preclude accurately modelling long-range interactions. Here, to address this, we introduce Euclidean fast attention (EFA), a linear-scaling attention-like mechanism designed for Euclidean data, which can be easily incorporated into existing model architectures. A core component of EFA is our proposed Euclidean rotary positional encoding, which enables efficient representation of spatial information while preserving essential physical symmetries. We empirically demonstrate that EFA effectively captures diverse long-range effects, enabling EFA-equipped MLFFs to describe challenging chemical interactions for which conventional MLFFs yield incorrect results.
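The contrast between quadratic and linear scaling mentioned in the abstract can be illustrated with a generic kernelized linear attention in the style of Katharopoulos et al. (cited in the reference list). This is a minimal sketch and not the EFA algorithm itself: it omits the Euclidean rotary positional encoding and any treatment of physical symmetries. The function names, the `elu(x) + 1` feature map, and the toy inputs are assumptions chosen for illustration.

```python
import math
import random


def feature_map(x):
    # Positive feature map elu(x) + 1 (an illustrative choice, not taken from the article).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def quadratic_attention(Q, K, V):
    """Kernel attention that forms every query-key pair explicitly: O(N^2) in the number of items."""
    c = len(V[0])
    out = []
    for q in Q:
        fq = feature_map(q)
        weights = [sum(a * b for a, b in zip(fq, feature_map(k))) for k in K]
        norm = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / norm for j in range(c)])
    return out


def linear_attention(Q, K, V):
    """Same result, but key-value summaries are accumulated once and reused: O(N)."""
    d, c = len(K[0]), len(V[0])
    kv = [[0.0] * c for _ in range(d)]  # sum over items of phi(k) v^T
    ksum = [0.0] * d                    # sum over items of phi(k)
    for k, v in zip(K, V):
        fk = feature_map(k)
        for i in range(d):
            ksum[i] += fk[i]
            for j in range(c):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = feature_map(q)
        norm = sum(fq[i] * ksum[i] for i in range(d))
        out.append([sum(fq[i] * kv[i][j] for i in range(d)) / norm for j in range(c)])
    return out


# Toy inputs: 6 items, 4 feature channels, 3 value channels (hypothetical sizes).
random.seed(0)
Q = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
K = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
V = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(6)]

full = quadratic_attention(Q, K, V)
fast = linear_attention(Q, K, V)
max_diff = max(abs(a - b) for ra, rb in zip(full, fast) for a, b in zip(ra, rb))
```

Both functions compute the same output; the linear variant only reorders the sums so that no N × N matrix of pairwise weights is ever built, which is the general idea behind linear-scaling attention mechanisms.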

**Fig. 1: Overview of the central concepts of this work.**

**Fig. 2: Geometric expressiveness of EFA.**

**Fig. 3: Shortcomings of MP and scaling analysis.**

**Fig. 4: EFA for reactions and dimers.**

**Fig. 5: EFA for electronically delocalized effects.**

Data availability

The S_N2 data are taken from ref. ⁴⁷, the cumulene data are taken from ref. ³, the dimer data are taken from refs. ⁴⁸,⁷⁸, data for the non-local charge transfer benchmark are taken from ref. ⁴⁶ and BIGDML data are taken from ref. ⁵⁰. The k-chains data are available via GitHub at https://github.com/chaitjo/geometric-gnn-dojo. The data for distinguishability of local neighbourhoods are available via GitHub at https://github.com/google-research/e3x/blob/main/tests/nn/modules_test.py (starting from line 315) and were originally proposed in ref. ⁴³. Preprocessed datasets for use with EFA as well as newly generated datasets and source data for plots and tables are available via Zenodo at https://doi.org/10.5281/zenodo.14750285 (ref. ⁷⁹).

Code availability

An implementation of the EFA algorithm is publicly available via GitHub at https://github.com/thorben-frank/euclidean_fast_attention. The repository also includes code for training and evaluation. The code for the article is also available via Zenodo at https://doi.org/10.5281/zenodo.18171624 (ref. ⁸⁰).

References

Rupp, M., Tkatchenko, A., Müller, K.-R. & Von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).
von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning. Nat. Rev. Chem. 4, 347–358 (2020).
Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
Keith, J. A. et al. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev. 121, 9816–9872 (2021).
Fu, X. et al. Forces are not enough: benchmark and critical evaluation for machine learning force fields with molecular simulations. TMLR 2835–8856 (2023).
Karplus, M. & McCammon, J. A. Molecular dynamics simulations of biomolecules. Nat. Struct. Mol. Biol. 9, 646–652 (2002).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian Approximation Potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Schütt, K. T. et al. Machine Learning Meets Quantum Physics (Springer, 2020).
Unke, O. T. et al. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci. Adv. 10, eadn4397 (2024).
Kabylda, A. et al. Molecular simulations with a pretrained neural network and universal pairwise force fields. J. Am. Chem. Soc. 147, 33723–33734 (2025).
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Woods, L. et al. Materials perspective on Casimir and van der Waals interactions. Rev. Mod. Phys. 88, 045003 (2016).
Hermann, J., DiStasio, R. A. Jr & Tkatchenko, A. First-principles models for van der Waals interactions in molecules and materials: concepts, theory, and applications. Chem. Rev. 117, 4714–4758 (2017).
Stöhr, M., Van Voorhis, T. & Tkatchenko, A. Theory and practice of modeling van der Waals interactions in electronic-structure calculations. Chem. Soc. Rev. 48, 4118–4154 (2019).
Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2017); https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Dosovitskiy, A. et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR, 2021).
Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (ICLR, 2018).
von Glehn, I., Spencer, J. S. & Pfau, D. A self-attention ansatz for ab-initio quantum chemistry. In International Conference on Learning Representations (ICLR, 2023).
Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. Adv. Neural Inf. Process. Syst. 35, 16344–16359 (2022).
Dao, T. Flashattention-2: faster attention with better parallelism and work partitioning. In International Conference on Learning Representations (ICLR, 2024).
Tay, Y., Dehghani, M., Bahri, D. & Metzler, D. Efficient transformers: a survey. ACM Comput. Surveys 55, 1–28 (2022).
Yuan, J. et al. Native sparse attention: Hardware-aligned and natively trainable sparse attention. In Proc. 63rd Annual Meeting of the Association for Computational Linguistics (eds Che, W. et al.) 23078–23097 (Association for Computational Linguistics, 2025).
Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning (eds Daumé, H. III & Singh, Aarti) 5156–5165 (PMLR, 2020).
Choromanski, K. M. et al. Rethinking attention with performers. In International Conference on Learning Representations (ICLR, 2021).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/e2ad76f2326fbc6b56a45a56c59fafdb-Paper.pdf
Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Frank, J. T., Unke, O. T., Müller, K.-R. & Chmiela, S. A Euclidean transformer for fast and stable machine learned force fields. Nat. Commun. 15, 6539 (2024).
Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: model analysis in the tea challenge 2023. Chem. Sci. 16, 3720–3737 (2025).
Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: molecular dynamics in the tea challenge 2023. Chem. Sci. 16, 3738–3754 (2025).
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
Christensen, A. S., Bratholm, L. A., Faber, F. A. & Anatole von Lilienfeld, O. FCHL revisited: faster and more accurate quantum machine learning. J. Chem. Phys. 152, 044107 (2020).
Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).
Su, J. et al. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
Ewald, P. P. Die Berechnung optischer und elektrostatischer Gitterpotentiale. Ann. Phys. 369, 253–287 (1921).
Kosmala, A., Gasteiger, J., Gao, N. & Günnemann, S. Ewald-based long-range message passing for molecular graphs. In Proc. 40th International Conference on Machine Learning 17544–17563 (JMLR.org, 2023).
Frank, T., Unke, O. & Müller, K.-R. So3krates: equivariant attention for interactions on arbitrary length-scales in molecular systems. Adv. Neural Inf. Process. Syst. 35, 29400–29413 (2022).
Unke, O. T. & Maennel, H. E3x: E(3)-equivariant deep learning made easy. Preprint at https://arxiv.org/abs/2401.07595 (2024).
Liao, Y.-L. & Smidt, T. Equiformer: equivariant graph attention transformer for 3D atomistic graphs. Preprint at https://arxiv.org/abs/2206.11990 (2022).
Pozdnyakov, S. N. et al. Incompleteness of atomic structure representations. Phys. Rev. Lett. 125, 166001 (2020).
Leman, A. & Weisfeiler, B. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya 2, 12–16 (1968).
Joshi, C. K., Bodnar, C., Mathis, S. V., Cohen, T. & Lio, P. On the expressive power of geometric graph neural networks. In Proc. 40th International Conference on Machine Learning 15330–15355 (PMLR, 2023).
Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021).
Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
Donchev, A. G. et al. Quantum chemical benchmark databases of gold-standard dimer interaction energies. Sci. Data 8, 55 (2021).
Alon, U. & Yahav, E. On the bottleneck of graph neural networks and its practical implications. In International Conference on Learning Representations (ICLR, 2021).
Sauceda, H. E. et al. BIGDML—Towards accurate quantum machine learning force fields for materials. Nat. Commun. 13, 3733 (2022).
Muhli, H. et al. Machine learning force fields based on local parametrization of dispersion interactions: Application to the phase diagram of C₆₀. Phys. Rev. B 104, 054106 (2021).
Westermayr, J., Chaudhuri, S., Jeindl, A., Hofmann, O. T. & Maurer, R. J. Long-range dispersion-inclusive machine learning potentials for structure search and optimization of hybrid organic–inorganic interfaces. Digit. Discov. 1, 463–475 (2022).
Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
Morawietz, T., Sharma, V. & Behler, J. A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges. J. Chem. Phys. 136, 064103 (2012).
Grisafi, A. & Ceriotti, M. Incorporating long-range physics in atomic-scale machine learning. J. Chem. Phys. 151, 204105 (2019).
Pagotto, J., Zhang, J. & Duignan, T. Predicting the properties of salt water using neural network potentials and continuum solvent theory. Preprint at chemRxiv https://doi.org/10.26434/chemrxiv-2022-jndlx (2022).
Li, Y. et al. Long-short-range message-passing: a physics-informed framework to capture non-local interaction for scalable molecular dynamics simulation. In International Conference on Learning Representations (ICLR, 2024).
Loche, P. et al. Fast and flexible long-range models for atomistic machine learning. J. Chem. Phys. 162, 142501 (2025).
Rokhlin, V. Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 60, 187–207 (1985).
King, D. S., Kim, D., Zhong, P. & Cheng, B. Machine learning of charges and long-range interactions from energies and forces. Nat. Commun. 16, 8763 (2025).
Cheng, B. Latent Ewald summation for machine learning of long-range interactions. npj Comput. Mater. 11, 80 (2025).
Gori, M., Kurian, P. & Tkatchenko, A. Second quantization of many-body dispersion interactions for chemical and biological systems. Nat. Commun. 14, 8218 (2023).
Batatia, I., Schaaf, L. L., Csanyi, G., Ortner, C. & Faber, F. A. Equivariant matrix function neural networks. In International Conference on Learning Representations (ICLR, 2024).
Wang, Y. et al. Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing. Nat. Commun. 15, 313 (2024).
Lebedev, V. I. Quadratures on a sphere. USSR Comput. Math. Math. Phys. 16, 10–24 (1976).
Unke, O. et al. Se(3)-equivariant prediction of molecular wavefunctions and electronic densities. In Proc. 35th Conference on Neural Information Processing Systems (NeurIPS 2021) (Curran Associates, Inc., 2021); https://proceedings.neurips.cc/paper_files/paper/2021/file/78f1893678afbeaa90b1fa01b9cfb860-Paper.pdf
Weiler, M., Geiger, M., Welling, M., Boomsma, W. & Cohen, T. S. 3D steerable CNNs: learning rotationally equivariant features in volumetric data. In Proc. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018) (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/488e4104520c6aab692863cc1dba45af-Paper.pdf
Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at https://arxiv.org/abs/1802.08219 (2018).
Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. GitHub http://github.com/google/jax (2018).
Heek, J. et al. Flax: a neural network library and ecosystem for JAX. GitHub http://github.com/google/flax (2020).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Elfwing, S., Uchibe, E. & Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 107, 3–11 (2018).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Hessel, M. et al. Optax: composable gradient transformation and optimisation, in jax!. GitHub http://github.com/deepmind/optax (2020).
Tancik, M. et al. Fourier features let networks learn high frequency functions in low dimensional domains. Adv. Neural Inf. Process. Syst. 33, 7537–7547 (2020).
Eastman, P. et al. Spice, a dataset of drug-like molecules and peptides for training machine learning potentials. Sci. Data 10, 11 (2023).
Frank, J. T., Chmiela, S., Müller, K.-R. & Unke, O. T. Machine learning global atomic representations with Euclidean fast attention. Zenodo https://doi.org/10.5281/zenodo.14750285 (2026).
Frank, T. & Engelberger, F. thorben-frank/euclidean_fast_attention: publication. Zenodo https://doi.org/10.5281/zenodo.18171624 (2026).

Acknowledgements

This work was in part supported by the German Ministry for Education and Research (BMBF) under grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D and 01IS18037A. K.-R.M. was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (no. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University and no. 2022-0-00984, Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation). We thank S. Blücher and H. Maennel for helpful comments on the paper.

Author information

Authors and Affiliations

Google DeepMind, Berlin, Germany
J. Thorben Frank, Klaus-Robert Müller & Oliver T. Unke
Machine Learning Group, Technische Universität Berlin, Berlin, Germany
J. Thorben Frank, Stefan Chmiela & Klaus-Robert Müller
Berlin Institute for the Foundations of Learning and Data – BIFOLD, Berlin, Germany
J. Thorben Frank, Stefan Chmiela & Klaus-Robert Müller
Max Planck Institute for Informatics, Saarbrücken, Germany
Klaus-Robert Müller
Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea
Klaus-Robert Müller

Contributions

J.T.F. and O.T.U. developed theory and code for the EFA algorithm. O.T.U. supervised and guided the project. J.T.F., S.C., K.-R.M. and O.T.U. conceived and designed the experiments. J.T.F. and O.T.U. performed the experiments. J.T.F. analysed the data. J.T.F. and O.T.U. prepared the first version of the paper. All authors contributed to writing the final version of the paper.

Corresponding author

Correspondence to Oliver T. Unke.

Ethics declarations

Competing interests

The authors are inventors on a US patent application (Serial No. 18/897,283) and a PCT international patent application (No. PCT/US2025/046393) related to this work.

Peer review

Peer review information

Nature Machine Intelligence thanks Sheng Gong, Sergey Pozdnyakov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Supplementary information

Supplementary figures and tables as well as extended experiments on linear scaling of EFA kernel function (Supplementary Figs. 1 and 2) and comparison with other models (Supplementary Fig. 10).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Frank, J.T., Chmiela, S., Müller, KR. et al. Machine learning global atomic representations with Euclidean fast attention. Nat Mach Intell 8, 388–402 (2026). https://doi.org/10.1038/s42256-026-01195-y

Received: 12 December 2024
Accepted: 27 January 2026
Published: 25 March 2026
Version of record: 25 March 2026
Issue date: March 2026
DOI: https://doi.org/10.1038/s42256-026-01195-y