  • Article
Machine learning global atomic representations with Euclidean fast attention

A preprint version of the article is available at arXiv.

Abstract

Long-range correlations are essential across numerous machine learning tasks, especially for data embedded in Euclidean space, where the relative positions and orientations of distant components are often critical for accurate predictions. Self-attention offers a compelling mechanism for capturing these global effects, but its quadratic complexity presents a significant practical limitation. This problem is particularly pronounced in computational chemistry, where the stringent efficiency requirements of machine learning force fields (MLFFs) often preclude accurately modelling long-range interactions. Here, to address this, we introduce Euclidean fast attention (EFA), a linear-scaling attention-like mechanism designed for Euclidean data, which can be easily incorporated into existing model architectures. A core component of EFA is our proposed Euclidean rotary positional encoding, which enables efficient representation of spatial information while preserving essential physical symmetries. We empirically demonstrate that EFA effectively captures diverse long-range effects, enabling EFA-equipped MLFFs to describe challenging chemical interactions for which conventional MLFFs yield incorrect results.
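
The contrast between quadratic and linear scaling can be illustrated with a minimal NumPy sketch of generic kernelized linear attention in the spirit of ref. 25. It is not the EFA implementation and it omits the Euclidean rotary positional encoding entirely; the feature map (elu(x) + 1), the function names and the array sizes are assumptions chosen purely for illustration.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1: a strictly positive feature map (illustrative choice).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def quadratic_attention(q, k, v):
    # Builds the full N x N similarity matrix: O(N^2) time and memory.
    phi_q, phi_k = feature_map(q), feature_map(k)
    weights = phi_q @ phi_k.T                      # shape (N, N)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v                             # shape (N, D_v)


def linear_attention(q, k, v):
    # Same result, but the sum over keys is taken first, so the cost
    # grows linearly with the number of points N.
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v                               # shape (D, D_v), size independent of N
    k_sum = phi_k.sum(axis=0)                      # shape (D,)
    return (phi_q @ kv) / (phi_q @ k_sum)[:, None]


# Hypothetical toy inputs: 64 "atoms" with 8-dimensional features.
rng = np.random.default_rng(0)
n_atoms, dim = 64, 8
q = rng.normal(size=(n_atoms, dim))
k = rng.normal(size=(n_atoms, dim))
v = rng.normal(size=(n_atoms, dim))

out_quadratic = quadratic_attention(q, k, v)
out_linear = linear_attention(q, k, v)
max_diff = float(np.max(np.abs(out_quadratic - out_linear)))
```

Both functions agree up to floating-point error; only the order of the matrix products differs, so the linear variant never materializes the N × N matrix.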

Fig. 1: Overview of the central concepts of this work.
Fig. 2: Geometric expressiveness of EFA.
Fig. 3: Shortcomings of MP and scaling analysis.
Fig. 4: EFA for reactions and dimers.
Fig. 5: EFA for electronically delocalized effects.

Data availability

The SN2 data are taken from ref. 47, the cumulene data are taken from ref. 3, the dimer data are taken from refs. 48,78, data for the non-local charge transfer benchmark are taken from ref. 46 and BIGDML data are taken from ref. 50. The k-chains data are available via GitHub at https://github.com/chaitjo/geometric-gnn-dojo. The data for distinguishability of local neighbourhoods are available via GitHub at https://github.com/google-research/e3x/blob/main/tests/nn/modules_test.py (starting from line 315) and were originally proposed in ref. 43. Preprocessed datasets for use with EFA as well as newly generated datasets and source data for plots and tables are available via Zenodo at https://doi.org/10.5281/zenodo.14750285 (ref. 79).

Code availability

An implementation of the EFA algorithm is publicly available via GitHub at https://github.com/thorben-frank/euclidean_fast_attention. The repository also includes code for training and evaluation. The code for the article is also available via Zenodo at https://doi.org/10.5281/zenodo.18171624 (ref. 80).

References

  1. Rupp, M., Tkatchenko, A., Müller, K.-R. & Von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).

  2. von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning. Nat. Rev. Chem. 4, 347–358 (2020).

  3. Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).

  4. Keith, J. A. et al. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev. 121, 9816–9872 (2021).

  5. Fu, X. et al. Forces are not enough: benchmark and critical evaluation for machine learning force fields with molecular simulations. TMLR 2835–8856 (2023).

  6. Karplus, M. & McCammon, J. A. Molecular dynamics simulations of biomolecules. Nat. Struct. Biol. 9, 646–652 (2002).

  7. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).

  8. Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian Approximation Potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).

  9. Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).

  10. Schütt, K. T. et al. Machine Learning Meets Quantum Physics (Springer, 2020).

  11. Unke, O. T. et al. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci. Adv. 10, eadn4397 (2024).

  12. Kabylda, A. et al. Molecular simulations with a pretrained neural network and universal pairwise force fields. J. Am. Chem. Soc. 147, 33723–33734 (2025).

  13. Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).

  14. Woods, L. et al. Materials perspective on Casimir and van der Waals interactions. Rev. Mod. Phys. 88, 045003 (2016).

  15. Hermann, J., DiStasio, R. A. Jr & Tkatchenko, A. First-principles models for van der Waals interactions in molecules and materials: concepts, theory, and applications. Chem. Rev. 117, 4714–4758 (2017).

  16. Stöhr, M., Van Voorhis, T. & Tkatchenko, A. Theory and practice of modeling van der Waals interactions in electronic-structure calculations. Chem. Soc. Rev. 48, 4118–4154 (2019).

  17. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2017); https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

  18. Dosovitskiy, A. et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR, 2021).

  19. Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (ICLR, 2018).

  20. von Glehn, I., Spencer, J. S. & Pfau, D. A self-attention ansatz for ab-initio quantum chemistry. In International Conference on Learning Representations (ICLR, 2023).

  21. Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. Adv. Neural Inf. Process. Syst. 35, 16344–16359 (2022).

  22. Dao, T. FlashAttention-2: faster attention with better parallelism and work partitioning. In International Conference on Learning Representations (ICLR, 2024).

  23. Tay, Y., Dehghani, M., Bahri, D. & Metzler, D. Efficient transformers: a survey. ACM Comput. Surveys 55, 1–28 (2022).

  24. Yuan, J. et al. Native sparse attention: Hardware-aligned and natively trainable sparse attention. In Proc. 63rd Annual Meeting of the Association for Computational Linguistics (eds Che, W. et al.) 23078–23097 (Association for Computational Linguistics, 2025).

  25. Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning (eds Daumé, H. III & Singh, A.) 5156–5165 (PMLR, 2020).

  26. Choromanski, K. M. et al. Rethinking attention with performers. In International Conference on Learning Representations (ICLR, 2021).

  27. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).

  28. Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/e2ad76f2326fbc6b56a45a56c59fafdb-Paper.pdf

  29. Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).

  30. Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).

  31. Frank, J. T., Unke, O. T., Müller, K.-R. & Chmiela, S. A Euclidean transformer for fast and stable machine learned force fields. Nat. Commun. 15, 6539 (2024).

  32. Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: model analysis in the TEA Challenge 2023. Chem. Sci. 16, 3720–3737 (2025).

  33. Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: molecular dynamics in the TEA Challenge 2023. Chem. Sci. 16, 3738–3754 (2025).

  34. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).

  35. Christensen, A. S., Bratholm, L. A., Faber, F. A. & Anatole von Lilienfeld, O. FCHL revisited: faster and more accurate quantum machine learning. J. Chem. Phys. 152, 044107 (2020).

  36. Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).

  37. Su, J. et al. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).

  38. Ewald, P. P. Die Berechnung optischer und elektrostatischer Gitterpotentiale. Ann. Phys. 369, 253–287 (1921).

  39. Kosmala, A., Gasteiger, J., Gao, N. & Günnemann, S. Ewald-based long-range message passing for molecular graphs. In Proc. 40th International Conference on Machine Learning 17544–17563 (JMLR.org, 2023).

  40. Frank, T., Unke, O. & Müller, K.-R. So3krates: equivariant attention for interactions on arbitrary length-scales in molecular systems. Adv. Neural Inf. Process. Syst. 35, 29400–29413 (2022).

  41. Unke, O. T. & Maennel, H. E3x: E(3)-equivariant deep learning made easy. Preprint at https://arxiv.org/abs/2401.07595 (2024).

  42. Liao, Y.-L. & Smidt, T. Equiformer: equivariant graph attention transformer for 3D atomistic graphs. Preprint at https://arxiv.org/abs/2206.11990 (2022).

  43. Pozdnyakov, S. N. et al. Incompleteness of atomic structure representations. Phys. Rev. Lett. 125, 166001 (2020).

  44. Leman, A. & Weisfeiler, B. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya 2, 12–16 (1968).

  45. Joshi, C. K., Bodnar, C., Mathis, S. V., Cohen, T. & Lio, P. On the expressive power of geometric graph neural networks. In Proc. 40th International Conference on Machine Learning 15330–15355 (PMLR, 2023).

  46. Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021).

  47. Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).

  48. Donchev, A. G. et al. Quantum chemical benchmark databases of gold-standard dimer interaction energies. Sci. Data 8, 55 (2021).

  49. Alon, U. & Yahav, E. On the bottleneck of graph neural networks and its practical implications. In International Conference on Learning Representations (ICLR, 2021).

  50. Sauceda, H. E. et al. BIGDML—Towards accurate quantum machine learning force fields for materials. Nat. Commun. 13, 3733 (2022).

  51. Muhli, H. et al. Machine learning force fields based on local parametrization of dispersion interactions: Application to the phase diagram of C60. Phys. Rev. B 104, 054106 (2021).

  52. Westermayr, J., Chaudhuri, S., Jeindl, A., Hofmann, O. T. & Maurer, R. J. Long-range dispersion-inclusive machine learning potentials for structure search and optimization of hybrid organic–inorganic interfaces. Digit. Discov. 1, 463–475 (2022).

  53. Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).

  54. Morawietz, T., Sharma, V. & Behler, J. A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges. J. Chem. Phys. 136, 064103 (2012).

  55. Grisafi, A. & Ceriotti, M. Incorporating long-range physics in atomic-scale machine learning. J. Chem. Phys. 151, 204105 (2019).

  56. Pagotto, J., Zhang, J. & Duignan, T. Predicting the properties of salt water using neural network potentials and continuum solvent theory. Preprint at chemRxiv https://doi.org/10.26434/chemrxiv-2022-jndlx (2022).

  57. Li, Y. et al. Long-short-range message-passing: a physics-informed framework to capture non-local interaction for scalable molecular dynamics simulation. In International Conference on Learning Representations (ICLR, 2024).

  58. Loche, P. et al. Fast and flexible long-range models for atomistic machine learning. J. Chem. Phys. 162, 142501 (2025).

  59. Rokhlin, V. Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 60, 187–207 (1985).

  60. King, D. S., Kim, D., Zhong, P. & Cheng, B. Machine learning of charges and long-range interactions from energies and forces. Nat. Commun. 16, 8763 (2025).

  61. Cheng, B. Latent Ewald summation for machine learning of long-range interactions. npj Comput. Mater. 11, 80 (2025).

  62. Gori, M., Kurian, P. & Tkatchenko, A. Second quantization of many-body dispersion interactions for chemical and biological systems. Nat. Commun. 14, 8218 (2023).

  63. Batatia, I., Schaaf, L. L., Csanyi, G., Ortner, C. & Faber, F. A. Equivariant matrix function neural networks. In International Conference on Learning Representations (ICLR, 2024).

  64. Wang, Y. et al. Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing. Nat. Commun. 15, 313 (2024).

  65. Lebedev, V. I. Quadratures on a sphere. USSR Comput. Math. Math. Phys. 16, 10–24 (1976).

  66. Unke, O. et al. Se(3)-equivariant prediction of molecular wavefunctions and electronic densities. In Proc. 35th Conference on Neural Information Processing Systems (NeurIPS 2021) (Curran Associates, Inc., 2021); https://proceedings.neurips.cc/paper_files/paper/2021/file/78f1893678afbeaa90b1fa01b9cfb860-Paper.pdf

  67. Weiler, M., Geiger, M., Welling, M., Boomsma, W. & Cohen, T. S. 3D steerable CNNs: learning rotationally equivariant features in volumetric data. In Proc. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018) (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/488e4104520c6aab692863cc1dba45af-Paper.pdf

  68. Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).

  69. Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at https://arxiv.org/abs/1802.08219 (2018).

  70. Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. GitHub http://github.com/google/jax (2018).

  71. Heek, J. et al. Flax: a neural network library and ecosystem for JAX. GitHub http://github.com/google/flax (2020).

  72. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).

  73. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).

  74. Elfwing, S., Uchibe, E. & Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 107, 3–11 (2018).

  75. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

  76. Hessel, M. et al. Optax: composable gradient transformation and optimisation, in jax!. GitHub http://github.com/deepmind/optax (2020).

  77. Tancik, M. et al. Fourier features let networks learn high frequency functions in low dimensional domains. Adv. Neural Inf. Process. Syst. 33, 7537–7547 (2020).

  78. Eastman, P. et al. SPICE, a dataset of drug-like molecules and peptides for training machine learning potentials. Sci. Data 10, 11 (2023).

  79. Frank, J. T., Chmiela, S., Müller, K.-R. & Unke, O. T. Machine learning global atomic representations with euclidean fast attention. Zenodo https://doi.org/10.5281/zenodo.14750285 (2026).

  80. Frank, T. & Engelberger, F. thorben-frank/euclidean_fast_attention: publication. Zenodo https://doi.org/10.5281/zenodo.18171624 (2026).

Acknowledgements

This work was in part supported by the German Ministry for Education and Research (BMBF) under grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D and 01IS18037A. K.-R.M. was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (no. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University and no. 2022-0-00984, Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation). We thank S. Blücher and H. Maennel for helpful comments on the paper.

Author information

Contributions

J.T.F. and O.T.U. developed theory and code for the EFA algorithm. O.T.U. supervised and guided the project. J.T.F., S.C., K.-R.M. and O.T.U. conceived and designed the experiments. J.T.F. and O.T.U. performed the experiments. J.T.F. analysed the data. J.T.F. and O.T.U. prepared the first version of the paper. All authors contributed to writing the final version of the paper.

Corresponding author

Correspondence to Oliver T. Unke.

Ethics declarations

Competing interests

The authors are inventors on a US patent application (Serial No. 18/897,283) and a PCT international patent application (No. PCT/US2025/046393) related to this work.

Peer review

Peer review information

Nature Machine Intelligence thanks Sheng Gong, Sergey Pozdnyakov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary figures and tables, as well as extended experiments on the linear scaling of the EFA kernel function (Supplementary Figs. 1 and 2) and a comparison with other models (Supplementary Fig. 10).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Frank, J.T., Chmiela, S., Müller, KR. et al. Machine learning global atomic representations with Euclidean fast attention. Nat Mach Intell 8, 388–402 (2026). https://doi.org/10.1038/s42256-026-01195-y

  • DOI: https://doi.org/10.1038/s42256-026-01195-y
