
Perspective

Shifting sands of hardware and software in exascale quantum mechanical simulations

Abstract

The era of exascale computing presents both exciting opportunities and unique challenges for quantum mechanical simulations. Although the transition from petascale to exascale computing has been marked by a steady increase in computational power, it has also been accompanied by a shift towards heterogeneous architectures, with graphical processing units (GPUs) in particular gaining a dominant role. The exascale era therefore demands a fundamental shift in software development strategies. This Perspective examines the changing landscape of hardware and software for exascale computing, highlighting the limitations of traditional algorithms and software implementations in light of the increasing use of heterogeneous architectures in high-end systems. We discuss the challenges of adapting quantum chemistry software to these new architectures, including the fragmentation of the software stack, the need for more efficient algorithms (including reduced-precision versions) tailored for GPUs, and the importance of developing standardized libraries and programming models.
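To make the reduced-precision point concrete, the sketch below is a minimal, illustrative Python example (not a method from this Perspective): it splits a matrix product so that numerically large contributions are handled in double precision while the remaining small contributions are computed in single precision, in the spirit of the dynamic- and mixed-precision strategies used in GPU-accelerated quantum chemistry codes. The function name split_precision_matmul and the cutoff value are hypothetical choices made for this example.

    import numpy as np

    def split_precision_matmul(a, b, cutoff=1e-4):
        """Toy split-precision product C = A @ B.

        Elements of A with magnitude >= cutoff are multiplied in double
        precision; the remaining small elements are multiplied in single
        precision and promoted back to double when the two parts are summed.
        """
        a = np.asarray(a, dtype=np.float64)
        b = np.asarray(b, dtype=np.float64)
        a_large = np.where(np.abs(a) >= cutoff, a, 0.0)   # "important" part, float64
        a_small = (a - a_large).astype(np.float32)        # remainder, float32
        c_large = a_large @ b                             # double-precision product
        c_small = (a_small @ b.astype(np.float32)).astype(np.float64)
        return c_large + c_small

    # Example: the deviation from a full double-precision product is limited
    # to single-precision rounding on the small contributions only.
    rng = np.random.default_rng(0)
    a = rng.standard_normal((256, 256)) * 10.0 ** rng.uniform(-8, 0, (256, 256))
    b = rng.standard_normal((256, 256))
    print(np.max(np.abs(split_precision_matmul(a, b) - a @ b)))

In a production setting, the single-precision branch would be the one offloaded to GPU tensor cores, which is where the throughput gain of reduced precision comes from.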



Acknowledgements

The authors acknowledge partial support from the European Centre of Excellence in Exascale Computing TREX — Targeting Real Chemical Accuracy at the Exascale. This project has received funding in part from the European Union’s Horizon 2020 — Research and Innovation Program — under grant agreement no. 952165.

Author information


Contributions

All authors contributed to all aspects of this work.

Corresponding authors

Correspondence to Ravindra Shinde or William Jalby.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Physics thanks the anonymous referees for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shinde, R., Filippi, C., Scemama, A. et al. Shifting sands of hardware and software in exascale quantum mechanical simulations. Nat Rev Phys 7, 378–387 (2025). https://doi.org/10.1038/s42254-025-00823-7

