Abstract
Long-range correlations are essential across numerous machine learning tasks, especially for data embedded in Euclidean space, where the relative positions and orientations of distant components are often critical for accurate predictions. Self-attention offers a compelling mechanism for capturing these global effects, but its quadratic complexity presents a significant practical limitation. This problem is particularly pronounced in computational chemistry, where the stringent efficiency requirements of machine learning force fields (MLFFs) often preclude accurate modelling of long-range interactions. To address this, we introduce Euclidean fast attention (EFA), a linear-scaling attention-like mechanism designed for Euclidean data, which can be easily incorporated into existing model architectures. A core component of EFA is our proposed Euclidean rotary positional encoding, which enables efficient representation of spatial information while preserving essential physical symmetries. We empirically demonstrate that EFA effectively captures diverse long-range effects, enabling EFA-equipped MLFFs to describe challenging chemical interactions for which conventional MLFFs yield incorrect results.
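To make the linear-scaling claim concrete, the following is a minimal sketch of the generic factorisation that rotary-style encodings rely on. It is not the authors' EFA implementation. A phase factor that depends on a relative position, exp(i k·(r_i − r_j)), splits into one factor per point, so a sum over all pairs can be replaced by a single shared sum. The function names, the wave vector `k` and the random test data are assumptions introduced purely for illustration.

```python
import cmath
import random


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pairwise_sum(positions, values, k):
    """Quadratic reference: out_i = sum_j exp(i k.(r_i - r_j)) * v_j."""
    out = []
    for r_i in positions:
        acc = 0j
        for r_j, v_j in zip(positions, values):
            diff = [a - b for a, b in zip(r_i, r_j)]
            acc += cmath.exp(1j * dot(k, diff)) * v_j
        out.append(acc)
    return out


def factorised_sum(positions, values, k):
    """Linear version of the same quantity, using
    exp(i k.(r_i - r_j)) = exp(i k.r_i) * exp(-i k.r_j)."""
    # One pass over all points builds the shared sum.
    global_sum = sum(
        cmath.exp(-1j * dot(k, r)) * v for r, v in zip(positions, values)
    )
    # Each output needs only its own phase times the shared sum.
    return [cmath.exp(1j * dot(k, r)) * global_sum for r in positions]


# Hypothetical test data: 50 random points in 3D with scalar values.
random.seed(0)
positions = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(50)]
values = [random.gauss(0, 1) for _ in range(50)]
k = [0.3, -0.1, 0.2]  # hypothetical wave vector, chosen arbitrarily

slow = pairwise_sum(positions, values, k)
fast = factorised_sum(positions, values, k)
max_err = max(abs(a - b) for a, b in zip(slow, fast))
print("largest deviation between the two versions:", max_err)
```

Both functions compute the same quantity, but the second touches each point a constant number of times instead of once per pair. Because only position differences enter, shifting every point by the same vector leaves the result unchanged. A single fixed wave vector is not rotation invariant, however; how EFA preserves the remaining symmetries is described in the full article and is not reproduced in this sketch.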
Data availability
The SN2 data are taken from ref. 47, the cumulene data are taken from ref. 3, the dimer data are taken from refs. 48,78, data for the non-local charge transfer benchmark are taken from ref. 46 and BIGDML data are taken from ref. 50. The k-chains data are available via GitHub at https://github.com/chaitjo/geometric-gnn-dojo. The data for distinguishability of local neighbourhoods are available via GitHub at https://github.com/google-research/e3x/blob/main/tests/nn/modules_test.py (starting from line 315) and were originally proposed in ref. 43. Preprocessed datasets for use with EFA as well as newly generated datasets and source data for plots and tables are available via Zenodo at https://doi.org/10.5281/zenodo.14750285 (ref. 79).
Code availability
An implementation of the EFA algorithm is publicly available via GitHub at https://github.com/thorben-frank/euclidean_fast_attention. The repository also includes code for training and evaluation. The code for the article is also available via Zenodo at https://doi.org/10.5281/zenodo.18171624 (ref. 80).
References
Rupp, M., Tkatchenko, A., Müller, K.-R. & von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).
von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning. Nat. Rev. Chem. 4, 347–358 (2020).
Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
Keith, J. A. et al. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev. 121, 9816–9872 (2021).
Fu, X. et al. Forces are not enough: benchmark and critical evaluation for machine learning force fields with molecular simulations. TMLR 2835–8856 (2023).
Karplus, M. & McCammon, J. A. Molecular dynamics simulations of biomolecules. Nat. Struct. Mol. Biol. 9, 646–652 (2002).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian Approximation Potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Schütt, K. T. et al. Machine Learning Meets Quantum Physics (Springer, 2020).
Unke, O. T. et al. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci. Adv. 10, eadn4397 (2024).
Kabylda, A. et al. Molecular simulations with a pretrained neural network and universal pairwise force fields. J. Am. Chem. Soc. 147, 33723–33734 (2025).
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Woods, L. et al. Materials perspective on Casimir and van der Waals interactions. Rev. Mod. Phys. 88, 045003 (2016).
Hermann, J., DiStasio, R. A. Jr & Tkatchenko, A. First-principles models for van der Waals interactions in molecules and materials: concepts, theory, and applications. Chem. Rev. 117, 4714–4758 (2017).
Stöhr, M., Van Voorhis, T. & Tkatchenko, A. Theory and practice of modeling van der Waals interactions in electronic-structure calculations. Chem. Soc. Rev. 48, 4118–4154 (2019).
Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2017); https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Dosovitskiy, A. et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR, 2021).
Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (ICLR, 2018).
von Glehn, I., Spencer, J. S. & Pfau, D. A self-attention ansatz for ab-initio quantum chemistry. In International Conference on Learning Representations (ICLR, 2023).
Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. Adv. Neural Inf. Process. Syst. 35, 16344–16359 (2022).
Dao, T. FlashAttention-2: faster attention with better parallelism and work partitioning. In International Conference on Learning Representations (ICLR, 2024).
Tay, Y., Dehghani, M., Bahri, D. & Metzler, D. Efficient transformers: a survey. ACM Comput. Surveys 55, 1–28 (2022).
Yuan, J. et al. Native sparse attention: Hardware-aligned and natively trainable sparse attention. In Proc. 63rd Annual Meeting of the Association for Computational Linguistics (eds Che, W. et al.) 23078–23097 (Association for Computational Linguistics, 2025).
Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning (eds Daumé, H. III & Singh, A.) 5156–5165 (PMLR, 2020).
Choromanski, K. M. et al. Rethinking attention with performers. In International Conference on Learning Representations (ICLR, 2021).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Proc. Advances in Neural Information Processing Systems (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/e2ad76f2326fbc6b56a45a56c59fafdb-Paper.pdf
Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Frank, J. T., Unke, O. T., Müller, K.-R. & Chmiela, S. A Euclidean transformer for fast and stable machine learned force fields. Nat. Commun. 15, 6539 (2024).
Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: model analysis in the tea challenge 2023. Chem. Sci. 16, 3720–3737 (2025).
Poltavsky, I. et al. Crash testing machine learning force fields for molecules, materials, and interfaces: molecular dynamics in the tea challenge 2023. Chem. Sci. 16, 3738–3754 (2025).
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
Christensen, A. S., Bratholm, L. A., Faber, F. A. & von Lilienfeld, O. A. FCHL revisited: faster and more accurate quantum machine learning. J. Chem. Phys. 152, 044107 (2020).
Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).
Su, J. et al. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
Ewald, P. P. Die Berechnung optischer und elektrostatischer Gitterpotentiale. Ann. Phys. 369, 253–287 (1921).
Kosmala, A., Gasteiger, J., Gao, N. & Günnemann, S. Ewald-based long-range message passing for molecular graphs. In Proc. 40th International Conference on Machine Learning 17544–17563 (JMLR.org, 2023).
Frank, T., Unke, O. & Müller, K.-R. So3krates: equivariant attention for interactions on arbitrary length-scales in molecular systems. Adv. Neural Inf. Process. Syst. 35, 29400–29413 (2022).
Unke, O. T. & Maennel, H. E3x: E(3)-equivariant deep learning made easy. Preprint at https://arxiv.org/abs/2401.07595 (2024).
Liao, Y.-L. & Smidt, T. Equiformer: equivariant graph attention transformer for 3D atomistic graphs. Preprint at https://arxiv.org/abs/2206.11990 (2022).
Pozdnyakov, S. N. et al. Incompleteness of atomic structure representations. Phys. Rev. Lett. 125, 166001 (2020).
Leman, A. & Weisfeiler, B. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya 2, 12–16 (1968).
Joshi, C. K., Bodnar, C., Mathis, S. V., Cohen, T. & Lio, P. On the expressive power of geometric graph neural networks. In Proc. 40th International Conference on Machine Learning 15330–15355 (PMLR, 2023).
Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021).
Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
Donchev, A. G. et al. Quantum chemical benchmark databases of gold-standard dimer interaction energies. Sci. Data 8, 55 (2021).
Alon, U. & Yahav, E. On the bottleneck of graph neural networks and its practical implications. In International Conference on Learning Representations (ICLR, 2021).
Sauceda, H. E. et al. BIGDML—Towards accurate quantum machine learning force fields for materials. Nat. Commun. 13, 3733 (2022).
Muhli, H. et al. Machine learning force fields based on local parametrization of dispersion interactions: Application to the phase diagram of C60. Phys. Rev. B 104, 054106 (2021).
Westermayr, J., Chaudhuri, S., Jeindl, A., Hofmann, O. T. & Maurer, R. J. Long-range dispersion-inclusive machine learning potentials for structure search and optimization of hybrid organic–inorganic interfaces. Digit. Discov. 1, 463–475 (2022).
Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
Morawietz, T., Sharma, V. & Behler, J. A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges. J. Chem. Phys. 136, 064103 (2012).
Grisafi, A. & Ceriotti, M. Incorporating long-range physics in atomic-scale machine learning. J. Chem. Phys. 151, 204105 (2019).
Pagotto, J., Zhang, J. & Duignan, T. Predicting the properties of salt water using neural network potentials and continuum solvent theory. Preprint at chemRxiv https://doi.org/10.26434/chemrxiv-2022-jndlx (2022).
Li, Y. et al. Long-short-range message-passing: a physics-informed framework to capture non-local interaction for scalable molecular dynamics simulation. In International Conference on Learning Representations (ICLR, 2024).
Loche, P. et al. Fast and flexible long-range models for atomistic machine learning. J. Chem. Phys. 162, 142501 (2025).
Rokhlin, V. Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 60, 187–207 (1985).
King, D. S., Kim, D., Zhong, P. & Cheng, B. Machine learning of charges and long-range interactions from energies and forces. Nat. Commun. 16, 8763 (2025).
Cheng, B. Latent Ewald summation for machine learning of long-range interactions. npj Comput. Mater. 11, 80 (2025).
Gori, M., Kurian, P. & Tkatchenko, A. Second quantization of many-body dispersion interactions for chemical and biological systems. Nat. Commun. 14, 8218 (2023).
Batatia, I., Schaaf, L. L., Csanyi, G., Ortner, C. & Faber, F. A. Equivariant matrix function neural networks. In International Conference on Learning Representations (ICLR, 2024).
Wang, Y. et al. Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing. Nat. Commun. 15, 313 (2024).
Lebedev, V. I. Quadratures on a sphere. USSR Comput. Math. Math. Phys. 16, 10–24 (1976).
Unke, O. et al. SE(3)-equivariant prediction of molecular wavefunctions and electronic densities. In Proc. 35th Conference on Neural Information Processing Systems (NeurIPS 2021) (Curran Associates, Inc., 2021); https://proceedings.neurips.cc/paper_files/paper/2021/file/78f1893678afbeaa90b1fa01b9cfb860-Paper.pdf
Weiler, M., Geiger, M., Welling, M., Boomsma, W. & Cohen, T. S. 3D steerable CNNs: learning rotationally equivariant features in volumetric data. In Proc. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018) (Curran Associates, Inc., 2018); https://proceedings.neurips.cc/paper_files/paper/2018/file/488e4104520c6aab692863cc1dba45af-Paper.pdf
Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at https://arxiv.org/abs/1802.08219 (2018).
Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. GitHub http://github.com/google/jax (2018).
Heek, J. et al. Flax: a neural network library and ecosystem for JAX. GitHub http://github.com/google/flax (2020).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Elfwing, S., Uchibe, E. & Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 107, 3–11 (2018).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Hessel, M. et al. Optax: composable gradient transformation and optimisation, in jax!. GitHub http://github.com/deepmind/optax (2020).
Tancik, M. et al. Fourier features let networks learn high frequency functions in low dimensional domains. Adv. Neural Inf. Process. Syst. 33, 7537–7547 (2020).
Eastman, P. et al. SPICE, a dataset of drug-like molecules and peptides for training machine learning potentials. Sci. Data 10, 11 (2023).
Frank, J. T., Chmiela, S., Müller, K.-R. & Unke, O. T. Machine learning global atomic representations with Euclidean fast attention. Zenodo https://doi.org/10.5281/zenodo.14750285 (2026).
Frank, T. & Engelberger, F. thorben-frank/euclidean_fast_attention: publication. Zenodo https://doi.org/10.5281/zenodo.18171624 (2026).
Acknowledgements
This work was in part supported by the German Ministry for Education and Research (BMBF) under grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D and 01IS18037A. K.-R.M. was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (no. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University and no. 2022-0-00984, Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation). We thank S. Blücher and H. Maennel for helpful comments on the paper.
Author information
Contributions
J.T.F. and O.T.U. developed theory and code for the EFA algorithm. O.T.U. supervised and guided the project. J.T.F., S.C., K.-R.M. and O.T.U. conceived and designed the experiments. J.T.F. and O.T.U. performed the experiments. J.T.F. analysed the data. J.T.F. and O.T.U. prepared the first version of the paper. All authors contributed to writing the final version of the paper.
Ethics declarations
Competing interests
The authors are inventors on a US patent application (Serial No. 18/897,283) and a PCT international patent application (No. PCT/US2025/046393) related to this work.
Peer review
Peer review information
Nature Machine Intelligence thanks Sheng Gong, Sergey Pozdnyakov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Supplementary information
Supplementary Information
Supplementary figures and tables, as well as extended experiments on the linear scaling of the EFA kernel function (Supplementary Figs. 1 and 2) and a comparison with other models (Supplementary Fig. 10).
About this article
Cite this article
Frank, J.T., Chmiela, S., Müller, KR. et al. Machine learning global atomic representations with Euclidean fast attention. Nat Mach Intell 8, 388–402 (2026). https://doi.org/10.1038/s42256-026-01195-y
DOI: https://doi.org/10.1038/s42256-026-01195-y


