Generalizing deep learning electronic structure calculation to the plane-wave basis

Gong, Xiaoxun; Louie, Steven G.; Duan, Wenhui; Xu, Yong

doi:10.1038/s43588-024-00701-9

Brief Communication
Open access
Published: 03 October 2024

Nature Computational Science volume 4, pages 752–760 (2024)

Abstract

Deep neural networks capable of representing the density functional theory (DFT) Hamiltonian as a function of material structure hold great promise for revolutionizing future electronic structure calculations. However, a notable limitation of previous neural networks is their compatibility solely with the atomic-orbital (AO) basis, excluding the widely used plane-wave (PW) basis. Here we overcome this critical limitation by proposing an accurate and efficient real-space reconstruction method for directly computing AO Hamiltonian matrices from PW DFT results. The reconstruction method is orders of magnitude faster than traditional projection-based methods to convert PW results to the AO basis, and the reconstructed Hamiltonian matrices can faithfully reproduce the PW electronic structure, thus bridging the longstanding gap between the AO basis deep learning electronic structure approach and PW DFT. Advantages of the PW methods, such as high accuracy, high flexibility and wide applicability, thus can be all integrated into deep learning electronic structure methods without sacrificing these methods’ inherent benefits. This allows for the construction of large-scale and high-fidelity training datasets with the help of PW DFT results towards the development of precise and broadly applicable deep learning electronic structure models.

Main

Recent years have witnessed remarkable progress in the field of ab initio computation in combination with artificial intelligence^1,2,3,4. For instance, neural-network force fields can facilitate ab initio molecular dynamics simulations at large length and time scales^5,6,7,8,9 and have become almost indispensable in molecular dynamics simulations nowadays. There are also numerous deep learning models for studying various material properties^10,11,12,13. Recently, fruitful progress has been achieved in the generalization of deep learning methods from atomic structure calculations to electronic structure calculations. For instance, machine learning offers a pathway for designing accurate density functionals^{14,15,16,17,18} as well as for predicting electronic properties, such as charge density and local density of states^{19,20,21,22,23,24,25,26,27,28,29,30,31}. Deep learning methods have also been proposed to bypass the iterative solution of the Kohn–Sham equation of density functional theory (DFT) by directly predicting the converged DFT Hamiltonian under the atomic-orbital (AO) basis^{32,33,34,35,36,37,38,39,40,41,42,43}. All these methods substantially expand the scope of theoretical and computational materials research towards unprecedented accuracy and efficiency.

When compared with other approaches, the deep learning DFT Hamiltonian method has several benefits^33,34. First, eigenvalues and wavefunctions can be easily obtained from a one-shot diagonalization of the predicted sparse AO Hamiltonian matrix, from which all the DFT-based physical properties of materials can be derived. Furthermore, the method scales linearly with system size, and can be trained by DFT results for small-size structures and generalize to study unseen large-size structures with ab initio accuracy. The reason for these properties is that the AO Hamiltonian is a local and nearsighted physical quantity, which can be determined only by its nearby atomic environment^44,45 (also see Supplementary Sections 1 and 2). Thus, methods of this kind can be designed to break the accuracy–efficiency dilemma in electronic structure simulations and are particularly useful for large-scale material simulations that would otherwise demand formidable computational resources. Similar to the role neural-network force fields are playing in today’s molecular dynamics simulations, it is highly possible that future electronic structure simulations will also be primarily based on deep learning models of the DFT Hamiltonian.

However, the deep learning DFT Hamiltonian method faces a critical issue related to basis functions. The Hamiltonian is a quantum mechanical operator, which can be expressed as a matrix once a particular basis set is chosen. Commonly used basis sets in DFT are plane waves (PWs) and AOs. Up to now, all neural-network methods for DFT Hamiltonians have supported only the AO basis, because PWs extend over the entire space (Fig. 1a) and would destroy the aforementioned locality property. Nevertheless, PW-based methods usually offer higher accuracy than AO-based ones because the PW basis can usually be brought closer to completeness and is easier to converge. The PW basis is also preferable to the AO basis in terms of simplicity and flexibility. In fact, the majority of DFT calculations for solids are done using the PW basis. In this context, generalizing deep learning electronic structure calculation to the PW basis would be of critical importance to the future development of the field.

**Fig. 1: Idea of the deep learning DFT Hamiltonian under the PW basis and its applications to twisted bilayer graphene.**

In this work, we propose a real-space reconstruction method to reconstruct AO Hamiltonians based on PW DFT results. It is orders of magnitude faster than the traditional method of directly projecting the PW Hamiltonian or wavefunctions. Moreover, we show that the AO Hamiltonians generated using our method not only can faithfully reproduce the PW electronic structure but also can be very easily learned by neural-network models. Thus, the critical problem of the deep learning DFT Hamiltonian under the PW basis is solved (Fig. 1a). The high accuracy and efficiency of our method is beneficial for the construction of more versatile and accurate deep learning electronic structure calculation methods, which not only makes them accessible to a much broader scientific community, but also greatly enhances their suitability for general applications.

Results and discussion

Theory

The PW Hamiltonian and the AO Hamiltonian are actually the same physical quantity expressed under different basis sets. In principle, once we have the PW Hamiltonian ${H}_{{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })=\langle {\mathbf{k}}+{\mathbf{G}}| \hat{H}| {\mathbf{k}}+{{\mathbf{G}}}^{{\prime} }\rangle$, we can always obtain the corresponding AO Hamiltonian ${H}_{i\alpha ,\;j\beta }=\langle {\phi }_{i\alpha }| \hat{H}| {\phi }_{j\beta }\rangle$ by a change of basis, which can then be flexibly learned by current neural networks for AO Hamiltonians. However, there are several different ways of doing this. In Supplementary Section 3, we present a detailed review of the possible methods from literature as well as previous attempts at deep learning electronic structure calculation under the PW basis. Here we will briefly discuss three methods to convert PW Hamiltonians to the AO basis, and more details can be found in Methods.

We would naturally think of using the projection method, which is widely used to bridge the gap between PWs and AOs. The projection method was initially developed to evaluate the quality of the AO basis set^46,47,48 and has been adapted for various purposes, such as to analyze charge distribution and to interpret chemical bonding^{49,50,51,52,53,54,55}. The projection method can also be modified to directly convert a Hamiltonian from the PW basis to the AO basis:

$${H}_{i\alpha ,\;j\beta }=\sum _{{\mathbf{k}}{\mathbf{G}}{{\mathbf{G}}}^{{\prime} }}\langle {\phi }_{i\alpha }| {\mathbf{k}}+{\mathbf{G}}\rangle {H}_{{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })\langle {\mathbf{k}}+{{\mathbf{G}}}^{{\prime} }| {\phi }_{j\beta }\rangle .$$

(1)

Here the PW basis is normalized in the Born–von Kármán (BvK) supercell: $\langle {\mathbf{r}}| {\mathbf{k}}+{\mathbf{G}}\rangle =\exp (i({\mathbf{k}}+{\mathbf{G}})\cdot {\mathbf{r}})/\sqrt{N\varOmega }$, where k is a wavevector in the first Brillouin zone, G is a reciprocal lattice vector, N is the number of primitive unit cells forming the BvK supercell and Ω is the volume of the primitive unit cell. The AO basis function |ϕ_iα〉 is centered at atom i. There could be multiple basis functions (labeled by n) sharing the same angular momentum quantum number l and magnetic quantum number m. The index α is an abbreviation for n, l, m. Equation (1) will be referred to as the H_k(G, G′) projection method in this Brief Communication.

If the eigenvalues ε_nk and wavefunctions |ψ_nk〉 of the PW Hamiltonian are obtained, equation (1) can be further written as

$${H}_{i\alpha ,\;j\beta }=\sum _{n{\mathbf{k}}}\langle {\phi }_{i\alpha }| {\psi }_{n{\mathbf{k}}}\rangle {\varepsilon }_{n{\mathbf{k}}}\langle {\psi }_{n{\mathbf{k}}}| {\phi }_{j\beta }\rangle ,$$

(2)

where

$$\langle {\phi }_{i\alpha }| {\psi }_{n{\mathbf{k}}}\rangle =\sum _{{\mathbf{G}}}\langle {\phi }_{i\alpha }| {\mathbf{k}}+{\mathbf{G}}\rangle \langle {\mathbf{k}}+{\mathbf{G}}| {\psi }_{n{\mathbf{k}}}\rangle .$$

(3)

In this Brief Communication equation (2) will be referred to as the ψ_nk(G) projection method.

Although equations (1) and (2) offer straightforward ways of converting PW Hamiltonians to the AO basis, they suffer from low computational efficiency, and the reasons are as follows. Equation (1) involves two multiplications and summations over G, and the number of G vectors is usually very large. Equation (2) requires a large number of Bloch wavefunctions to converge. Moreover, they all scale cubically with respect to the number of atoms in the system (Methods), which limits their scope of applications.
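The convergence issue of the ψ_nk(G) projection can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional toy model (a single cosine potential at the Γ point and two Gaussian orbitals), none of which comes from the calculations reported here; the names `pw_coefficients` and `h_eq2` are illustrative only. In this toy setting the band sum of equation (2) reproduces equation (1) only when all bands of the PW space are included.

```python
import numpy as np

# Toy 1D cell of length L at the Gamma point; plane waves exp(iGx)/sqrt(L).
# All parameters below are illustrative assumptions, not values from the paper.
L = 10.0
m = np.arange(-40, 41)
G = 2.0 * np.pi * m / L

# PW Hamiltonian for V(x) = 2*v1*cos(2*pi*x/L): V(G - G') = v1 when |m - m'| = 1.
v1 = -0.5
H_pw = np.diag(0.5 * G**2) + v1 * (np.abs(m[:, None] - m[None, :]) == 1)

# Two Gaussian "orbitals"; <G|phi> is evaluated by quadrature on a fine grid.
x = np.linspace(0.0, L, 2048, endpoint=False)
dx = x[1] - x[0]

def pw_coefficients(center, width):
    phi = np.exp(-0.5 * ((x - center) / width) ** 2)
    return (np.exp(-1j * np.outer(G, x)) @ phi) * dx / np.sqrt(L)

C = np.stack([pw_coefficients(4.0, 0.6), pw_coefficients(5.5, 0.6)], axis=1)

# Equation (1): sandwich the PW Hamiltonian between the orbital coefficients.
H_eq1 = C.conj().T @ H_pw @ C

# Equation (2): spectral sum over the lowest `nbands` eigenstates.
eps, psi = np.linalg.eigh(H_pw)

def h_eq2(nbands):
    P = psi[:, :nbands].conj().T @ C          # P[n, j] = <psi_n|phi_j>
    return P.conj().T @ (eps[:nbands, None] * P)

err_few = np.abs(h_eq2(5) - H_eq1).max()
err_full = np.abs(h_eq2(len(eps)) - H_eq1).max()
print(f"5 bands: {err_few:.3e}   all bands: {err_full:.3e}")
```

Both routes require dense operations over the full set of G vectors, which is the cost that the real-space approach avoids.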

In fact, we can leverage locality in real space to considerably speed up the calculation. The Hamiltonian $H({\mathbf{r}},{{\mathbf{r}}}^{{\prime} })=\langle {\mathbf{r}}| \hat{H}| {{\mathbf{r}}}^{{\prime} }\rangle$ in real space under atomic units is^56,57

$$\begin{array}{l}H({\mathbf{r}},{{\mathbf{r}}}^{{\prime} })=-\displaystyle\frac{1}{2}{\nabla }^{2}\delta ({\mathbf{r}}-{{\mathbf{r}}}^{{\prime} })+[{V}_{{\rm{Har}}}({\mathbf{r}})+{V}_{{\rm{xc}}}({\mathbf{r}})+{V}_{{\rm{loc}}}({\mathbf{r}})]\delta ({\mathbf{r}}-{{\mathbf{r}}}^{{\prime} })\\\qquad\qquad\ \;+{V}_{{\rm{nloc}}}({\mathbf{r}},{{\mathbf{r}}}^{{\prime} }),\end{array}$$

(4)

where the various terms correspond to the kinetic energy, the Hartree potential, the exchange–correlation potential and the local and non-local parts of the pseudopotential, respectively. In this Brief Communication we are only considering semilocal functionals for exchange and correlation. The three terms in the square bracket will be referred to as the total effective local potential: V_eff(r) ≡ V_Har(r) + V_xc(r) + V_loc(r), which is periodic over unit cells. Usually, PW DFT programs directly store V_eff(r) or its Fourier transform V_eff(G) in memory, and ${V}_{{\rm{nloc}}}({\mathbf{r}},{{\mathbf{r}}}^{{\prime} })=\langle {\mathbf{r}}| {\hat{V}}_{{\rm{nloc}}}| {{\mathbf{r}}}^{{\prime} }\rangle$ can be read from pseudopotential files. Once we have H(r, r′), we can calculate the AO Hamiltonian directly in real space as follows:

$${H}_{i\alpha ,\;j\beta }=\langle {\phi }_{i\alpha }| {-}\frac{1}{2}{\nabla }^{2}| {\phi }_{j\beta }\rangle +\int{{\rm{d}}}^{3}r\,{\phi }_{i\alpha }^{* }({\mathbf{r}}){V}_{{\rm{eff}}}({\mathbf{r}}){\phi }_{j\beta }({\mathbf{r}})+\langle {\phi }_{i\alpha }| {\hat{V}}_{{\rm{nloc}}}| {\phi }_{j\beta }\rangle ,$$

(5)

which will be referred to as the real-space reconstruction method in this Brief Communication. The first and last terms can be very efficiently calculated using two-center integral techniques⁵⁸ (also see Methods). The most time-consuming part is the evaluation of the second term in the equation, which is integrated directly on an evenly spaced real-space grid. Since the AOs are local in real space, the integration region can be chosen such that both of the orbitals ϕ_iα(r) and ϕ_jβ(r) are non-zero. The number of grid points involved in the integration thus does not depend on the overall system size. Therefore, the time required to evaluate the above formula is proportional to the number of non-zero AO Hamiltonian matrix elements, which scales linearly with the number of atoms in the system. Although they are theoretically equivalent and will yield the same results when converged, the proposed real-space reconstruction method is much more efficient than the first two projection-based methods.
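The locality argument for the second term of equation (5) can be sketched in one dimension. This is a minimal illustration under stated assumptions: the cos² bump orbital, the cosine potential and the helper names `orbital` and `veff_element` are hypothetical stand-ins, not the numerical AOs or the implementation used in this work. The integral is restricted to grid points lying within the cutoff radius of both orbital centers, so the number of points touched stays fixed when the cell grows.

```python
import numpy as np

def orbital(x, center, rc):
    """Hypothetical strictly localized orbital: cos^2 bump, zero beyond rc."""
    r = np.abs(x - center)
    return np.where(r < rc, np.cos(0.5 * np.pi * r / rc) ** 2, 0.0)

def veff_element(v_grid, dx, ci, cj, rc):
    """Integrate phi_i * V_eff * phi_j over the overlap region of the two orbitals.

    Returns the integral and the number of grid points that were visited.
    """
    lo, hi = max(ci, cj) - rc, min(ci, cj) + rc
    if hi <= lo:                       # supports do not overlap
        return 0.0, 0
    idx = np.arange(int(np.ceil(lo / dx)), int(np.floor(hi / dx)) + 1)
    xs = idx * dx
    # V_eff is periodic, so grid indices are wrapped back into the cell.
    integrand = orbital(xs, ci, rc) * v_grid[idx % v_grid.size] * orbital(xs, cj, rc)
    return float(integrand.sum() * dx), idx.size

dx, rc, ci, cj = 0.05, 3.0, 8.0, 9.5
results = []
for cell_length in (20.0, 80.0):
    n = int(round(cell_length / dx))
    x = np.arange(n) * dx
    v_grid = -np.cos(2.0 * np.pi * x / 5.0)      # toy periodic potential
    local_value, n_points = veff_element(v_grid, dx, ci, cj, rc)
    # Reference: brute-force sum over every grid point in the cell.
    full_value = float(np.sum(orbital(x, ci, rc) * v_grid * orbital(x, cj, rc)) * dx)
    results.append((n, n_points, local_value, full_value))
    print(f"grid size {n}: {n_points} points used, "
          f"local = {local_value:.8f}, full = {full_value:.8f}")
```

Quadrupling the cell quadruples the grid but leaves the per-element work unchanged, so the total cost follows the number of non-zero matrix elements.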

It is worth noticing that none of the three methods described in this Brief Communication depend on the specific form of the AO basis functions. They only need to be separable into radial and angular parts, and the radial function needs to go to zero after a certain cutoff radius. This degree of freedom allows us to systematically improve the quality of the reconstructed AO Hamiltonian by customizing the AO basis using modern techniques such as a numerical AO basis^59,60,61,62. The most important design principle of the AO basis is that it must be compatible with the pseudopotentials used in the PW calculation, otherwise it is difficult for the reconstructed Hamiltonian to give an accurate description of the band structure.

Application to twisted bilayer graphene

The real-space reconstruction method provides a very efficient way to calculate the Hamiltonian under an AO basis set from PW DFT results. The resulting AO Hamiltonian not only can accurately reproduce the PW electronic structure, but also can be learned by neural networks, thus enabling deep learning electronic structure calculations under the PW basis. The effectiveness of this workflow depends on two factors: the quality of the reconstructed AO Hamiltonian, and the compatibility of the reconstructed Hamiltonian with deep learning methods. In our tests, the first is measured by comparison of the band structures of the AO Hamiltonian with those from direct PW DFT calculations. The second can be evaluated through checking the quality of the band structure predicted by the neural networks that learn from reconstructed AO Hamiltonians. In all calculations reported in this Brief Communication, PW DFT calculations are performed with the Quantum ESPRESSO package⁶³ using norm-conserving pseudopotentials⁶⁴. The results of these PW calculations are used to reconstruct AO Hamiltonians, where the AO basis is the numerical AOs generated using the SIESTA code⁵⁹ for the same set of pseudopotentials. Details of the convergence tests we have performed on the effect of the sizes of AO basis sets can be found in Supplementary Section 4.

The most remarkable capability of the deep learning DFT Hamiltonian method is that neural-network models can be trained on small structures and generalized to predict the Hamiltonians of much larger structures. In the study of bilayer graphene, the training set consists of 300 bilayer graphene supercells of size 4 × 4 with different stackings and random perturbation of each atom site. After we train the neural-network model on the reconstructed AO Hamiltonians from PW DFT results (Fig. 1b), we can use the model to systematically study Moiré twisted superstructures with arbitrary twist angle. We first benchmark the reconstructed Hamiltonian on one of the structures in the training set by plotting its band structure alongside that calculated using PWs. As shown in Fig. 1c, the two band structures agree very well. After training the neural-network model, we use it to study the well-known ‘magic-angle’ twisted bilayer graphene at θ = 1.08° with 11,164 atoms in a Moiré supercell. This system is of substantial interest to researchers because of the discovery of a series of correlated phenomena^65,66, but is particularly challenging for electronic structure calculations because of its large system size and large-scale corrugation patterns. However, with the deep learning DFT Hamiltonian method available, the computational cost can be greatly reduced³⁴ (also see Supplementary Section 5). As illustrated in Fig. 1d, the neural network trained on the reconstructed AO Hamiltonian manages to give very accurate predictions when compared with the PW DFT benchmark⁶⁷, with an error of only a few millielectronvolts. Moreover, when the neural network is trained on reconstructed AO Hamiltonians from PW DFT output, the predicted band structure (PW-NN in Fig. 1d) has better agreement with the PW DFT results by Lucignano et al.⁶⁷ compared to the case where the neural network is trained on Hamiltonians calculated by AO DFT³⁴ (AO-NN in Fig. 1d). This shows that the deep learning Hamiltonian interfaced to PW DFT can indeed give results that have higher accuracy. This high accuracy, when combined with the flexibility and wide applicability of the PW method, will greatly enhance the capability of deep learning ab initio calculations and will be highly beneficial for future research.

Application to bilayer MoS2

The three previously discussed methods to obtain the AO Hamiltonian from PW DFT results are equivalent when converged, but the real-space reconstruction method (equation (5)) is the most efficient. Here we compare these three methods in the study of the bilayer MoS₂ system (Fig. 2a). First we tested the reconstruction method on the AB-stacked bilayer unit cell consisting of six atoms, and the band structures obtained from the reconstructed AO Hamiltonian agree well with PW DFT results (Fig. 2b). We then plot the band structures given by the three different methods, and the results are shown in Fig. 2c. They are almost the same, except that the band structure given by the ψ_nk(G) projection method is slightly different from the other two because we are only using a finite number of bands in evaluation of equation (2).

**Fig. 2: Application of the deep learning DFT Hamiltonian under the PW basis to study twisted bilayer MoS₂.**

We further compared the computation times of the three methods. The systems we have studied here are bilayer MoS₂ structures with different numbers of atoms (unit cell with 6 atoms, 3 × 3 supercell with 54 atoms and Moiré twisted bilayer MoS₂ at θ = 13.17° and 9.43° with 114 and 222 atoms). The CPU times are shown in Fig. 2d along with the time for PW self-consistent field calculation. Note that the time of diagonalization for PW wavefunctions for the ψ_nk(G) projection method is also included in the total CPU time for that method. As expected, the two projection-based methods show roughly cubic scaling. They are even more time consuming than the full self-consistent field calculation. Conversely, thanks to the locality of the AO basis, the real-space reconstruction method achieves linear scaling and can be several orders of magnitude faster than the projection methods. This acceleration will become more prominent when we investigate large-size materials. Therefore, our method will be essential when we want to construct large-scale training sets including various kinds of material and structures of different sizes to train accurate and versatile neural-network models, whereas the projection methods would not be affordable for this purpose in terms of computational cost.

Now, we follow the same workflow as illustrated in Fig. 1b and test the performance of the neural network on bilayer MoS₂, which is a more challenging material system than bilayer graphene. PW DFT calculations are performed on 256 non-twisted 3 × 3 supercells of bilayer MoS₂ with different stacking configurations and random perturbations of each atom site, and the neural-network model is trained on the reconstructed AO Hamiltonians. We test the generalizability of the neural network to a fully relaxed θ = 6.01° twisted bilayer MoS₂ with 546 atoms in the Moiré supercell. Since the neural network is trained on reconstructed AO Hamiltonians, the predicted bands are compared with the bands of the AO Hamiltonian, and results are shown in Fig. 2e. The absolute energy differences are as small as 0.30 and 2.22 meV for the highest valence band and the lowest conduction band, respectively, where errors are averaged along the high-symmetry k path Γ–K–M–Γ. This is remarkable considering that only small non-twisted structures are included in the training set.

Discussion

Our approach to reconstructing an AO Hamiltonian from PW DFT results facilitates deep learning electronic structure calculations based on PW DFT results and combines the advantages of the PW method and the deep learning approaches. One direct impact of our work is that it makes the deep learning electronic structure method accessible to those who are already familiar with the PW method but have less experience in AO DFT. Another promising future application of our method is to build universal deep learning models that can handle diverse families of materials and give accurate predictions of their electronic structure. Such models can take advantage of the numerous materials databases that have already been set up. In fact, most of the materials databases of solids are built using the PW method, and they are thus made accessible through our reconstruction method. Moreover, the applicable scope of our method is not limited to PW DFT. The spirit of the change of basis can also be generalized to any implementation of Kohn–Sham DFT in order to interface it with deep learning approaches. Further, the PW methods are even more widely used to implement advanced methods beyond the DFT level, such as density functional perturbation theory to study electron–phonon interactions⁶⁸, many-body perturbation theory (such as the GW and GW-BSE methods) and time-dependent DFT for excited-state phenomena. Now, with our method to interface deep learning with PW methods, important generalizations of the deep learning approach to these advanced methods will become feasible in the foreseeable future.

Methods

Details of different ways to convert the PW Hamiltonian to the AO basis

The DFT Hamiltonian operator we are considering in this Brief Communication is given as equation (4) in the main text. Here, we will explain the non-local part of the pseudopotential ${V}_{{\rm{nloc}}}({\mathbf{r}},{{\mathbf{r}}}^{{\prime} })=\langle {\mathbf{r}}| {\hat{V}}_{{\rm{nloc}}}| {{\mathbf{r}}}^{{\prime} }\rangle$ in detail. It is constructed in a separable form known as the Kleinman and Bylander projectors⁶⁹:

$${\hat{V}}_{{\rm{nloc}}}=\sum _{i\alpha \beta }\left\vert {p}_{i\alpha }\right\rangle {D}_{i\alpha \beta }\left\langle {p}_{i\beta }\right\vert ,$$

(6)

where the summation over atom sites i is carried out over all atoms in the whole BvK supercell, and this will apply to the remainder of this section unless otherwise stated. The projector function |p_iα〉 is centered at atom i and can be separated into radial and angular parts:

$$\langle {\mathbf{r}}| {p}_{i\alpha }\rangle ={R}_{inl}^{{\rm{proj}}}(| {{\mathbf{r}}}_{i}| ){Y}_{lm}({\hat{{\mathbf{r}}}}_{i}),$$

(7)

where r_i ≡ r − R_i and R_i is the position of the ith atom. There could be multiple projector functions (labeled by n) sharing the same l and m. The matrix D_iαβ is non-zero only for α = β and is not system dependent (that is, unchanged in different atomic environments) if we only focus on norm-conserving pseudopotentials^70,71.

Here, we would like to discuss the locality of the Hamiltonian. The Hamiltonian in equation (4) is non-local (that is, it is non-zero when |r − r′| ≠ 0) because of the presence of the non-local projectors of the pseudopotential. However, the pseudopotential projector functions |p_iα〉 are, by construction, highly localized around each nucleus. Therefore, we would expect that the non-local part V_nloc(r, r′) is non-zero only within the core region. The nearsightedness of the Hamiltonian is closely related to its locality but is a different concept, which is discussed in Supplementary Section 2.

In practical DFT calculations, two common choices of basis functions for expanding the Hamiltonians and wavefunctions are PWs and AOs. In the remainder of this section, we will first briefly review the forms of the Hamiltonian matrix under both kinds of basis set. Then, we will discuss in detail the three methods of transforming a PW Hamiltonian to the AO basis mentioned in the main text. Finally, we will discuss an extension of our method to the projector augmented-wave (PAW) formalism^72,73. Details of the numerical techniques used to speed up the calculations are deferred to the next section.

PW Hamiltonian

The PW basis we are using here is normalized in the BvK supercell:

$$\langle {\mathbf{r}}| {\mathbf{k}}+{\mathbf{G}}\rangle =\frac{1}{\sqrt{N\varOmega }}\exp (i({\mathbf{k}}+{\mathbf{G}})\cdot {\mathbf{r}}).$$

(8)

Under the PW representation, the Kohn–Sham equation is written as

$$\sum _{{{\mathbf{G}}}^{{\prime} }}({H}_{{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })-{\varepsilon }_{n{\mathbf{k}}}{\delta }_{{\mathbf{G}}{{\mathbf{G}}}^{{\prime} }}){\psi }_{n{\mathbf{k}}}({{\mathbf{G}}}^{{\prime} })=0,$$

(9)

where ε_nk is the Kohn–Sham eigenvalue, |ψ_nk〉 is the corresponding eigenstate, ${H}_{{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })=\langle {\mathbf{k}}+{\mathbf{G}}| \hat{H}| {\mathbf{k}}+{{\mathbf{G}}}^{{\prime} }\rangle$ is the PW Hamiltonian matrix and ψ_nk(G) = 〈k + G∣ψ_nk〉 is the wavefunction.

The Hamiltonian can be written under the PW basis⁵⁶ as

$${H}_{{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })=\frac{1}{2}| {\mathbf{k}}+{\mathbf{G}}{| }^{2}{\delta }_{{\mathbf{G}}{{\mathbf{G}}}^{{\prime} }}+{V}_{{\rm{eff}}}({\mathbf{G}}-{{\mathbf{G}}}^{{\prime} })+{V}_{{\rm{nloc}},{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} }),$$

(10)

where

$${V}_{{\rm{eff}}}({\mathbf{G}})=\frac{1}{\varOmega }{\int}_{\varOmega }{{\rm{d}}}^{3}r\,{V}_{{\rm{eff}}}({\mathbf{r}})\exp (-i{\mathbf{G}}\cdot {\mathbf{r}}),$$

(11)

and the integral is carried out within the primitive unit cell. Because V_eff(r) is periodic over unit cells, it is convenient to discretize it on evenly spaced grid points in real space, and the Fourier transform can be efficiently calculated using the fast Fourier transform.
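The discretized form of equation (11) can be sketched with NumPy's FFT. This is a minimal illustration that assumes a hypothetical cubic cell and a single-cosine model potential rather than an actual self-consistent V_eff; with the forward-transform convention of `np.fft.fftn`, dividing by the number of grid points gives the Fourier coefficients V_eff(G).

```python
import numpy as np

# Hypothetical cubic cell of side a sampled on an n x n x n grid.
n, a = 16, 4.0
v0, v1 = -0.3, 0.1
x = np.arange(n) * a / n
X, _, _ = np.meshgrid(x, x, x, indexing="ij")

# Model potential V(r) = v0 + 2*v1*cos(2*pi*x/a); its only non-zero Fourier
# components are V(G=0) = v0 and V(G = +-2*pi/a along x) = v1.
v_r = v0 + 2.0 * v1 * np.cos(2.0 * np.pi * X / a)

# Equation (11) on the grid: (1/Omega) * integral -> (1/N_grid) * sum.
v_G = np.fft.fftn(v_r) / v_r.size

# Inverse of equation (11): back to real space.
v_back = np.fft.ifftn(v_G * v_r.size).real

print(v_G[0, 0, 0].real, v_G[1, 0, 0].real, v_G[-1, 0, 0].real)
```

The same pair of transforms converts between V_eff(G) and V_eff(r) when a code exposes only one of the two.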

The last term in equation (10) is

$${V}_{{\rm{nloc}},{\mathbf{k}}}({\mathbf{G}},{{\mathbf{G}}}^{{\prime} })=\sum _{i\alpha \beta }\langle {\mathbf{k}}+{\mathbf{G}}| {p}_{i\alpha }\rangle {D}_{i\alpha \beta }\langle\; {p}_{i\beta }| {\mathbf{k}}+{{\mathbf{G}}}^{{\prime} }\rangle ,$$

(12)

where the Fourier transform 〈k + G∣p_iα〉 can be calculated efficiently using an algorithm described in the next subsection.

Finally, we would like to point out that the total effective local potential is the only term in equation (10) that is system dependent and needs to be obtained from self-consistent field iterations. The kinetic energy term is trivial, and the non-local pseudopotential term can be built from data read from the pseudopotential file. In practice, most of the PW DFT codes store the quantity V_eff(G) or V_eff(r) instead of the full Hamiltonian matrix, which substantially saves memory.

AO Hamiltonian

The AO basis functions are centered on atomic sites and are separated into radial and angular parts, similar to the projector function defined in equation (7):

$$\langle {\mathbf{r}}| {\phi }_{i\alpha }\rangle ={\phi }_{i\alpha }({\mathbf{r}})={R}_{inl}(| {{\mathbf{r}}}_{i}| ){Y}_{lm}({\hat{{\mathbf{r}}}}_{i}),$$

(13)

where the radial function R_inl(r) can, in principle, take any arbitrary form and still be compatible with our reconstruction method, which will be described in a later section. It only needs to be local, which means that it goes to zero after a certain cutoff radius. This degree of freedom allows us to systematically improve the quality of the reconstructed AO Hamiltonian by customizing the AO basis using modern techniques such as a numerical AO basis^59,60,61,62. The most important design principle of the AO basis is that it must be compatible with the pseudopotential used in the PW calculation, otherwise it is difficult for the reconstructed Hamiltonian to give an accurate description of the band structure.

The Kohn–Sham equation under the AO basis is written as

$$\sum _{j\beta }({H}_{i\alpha ,\;j\beta }-{\varepsilon }_{n{\mathbf{k}}}{S}_{i\alpha ,\;j\beta }){\psi }_{n{\mathbf{k}},\;j\beta }=0,$$

(14)

where ${H}_{i\alpha ,\;j\beta }=\langle {\phi }_{i\alpha }| \hat{H}| {\phi }_{j\beta }\rangle$ is the Hamiltonian matrix, and S_iα, jβ = 〈ϕ_iα∣ϕ_jβ〉 is the overlap matrix. Notice that we have to include the overlap matrix here because the AO basis functions are typically not orthonormal.
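Because of the overlap matrix, equation (14) is a generalized eigenvalue problem. A minimal sketch of one standard way to solve it, assuming small dense matrices with made-up entries, is to orthogonalize with the Cholesky factor of S; the helper name `solve_generalized` is hypothetical, and production codes typically call a dedicated generalized eigensolver instead.

```python
import numpy as np

def solve_generalized(H, S):
    """Solve H c = eps S c for Hermitian H and positive-definite S."""
    Lc = np.linalg.cholesky(S)             # S = Lc Lc^dagger
    Linv = np.linalg.inv(Lc)
    H_orth = Linv @ H @ Linv.conj().T      # ordinary Hermitian problem
    eps, y = np.linalg.eigh(H_orth)
    c = Linv.conj().T @ y                  # back-transform; c^dagger S c = 1
    return eps, c

# Illustrative 2 x 2 matrices for two non-orthogonal orbitals (made-up numbers).
H = np.array([[-1.0, -0.4],
              [-0.4,  0.5]])
S = np.array([[1.0, 0.2],
              [0.2, 1.0]])

eps, c = solve_generalized(H, S)
residual = np.abs(H @ c - S @ c @ np.diag(eps)).max()
print("eigenvalues:", eps, " residual:", residual)
```

The eigenvectors returned this way are normalized with respect to S rather than to the identity.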

Reconstruction of AO Hamiltonian

In the main text, two projection-based methods are discussed: the H_k(G, G′) projection method (equation (1)) and the ψ_nk(G) projection method (equation (2)). Both methods scale cubically with system size, and here we discuss this scaling in detail. Equation (1) involves summations over two G vectors, and the summations are performed for all AO pairs (iα, jβ). The number of G vectors is usually very large and is proportional to the system size. The number of orbital pairs (iα, jβ) within a certain cutoff radius is also proportional to the system size. Thus the projection scales as O(N³), where N here denotes the number of atoms in the unit cell (not the number of primitive unit cells in the BvK supercell, as in equation (8)). According to our tests, the calculation time for equation (1) is sometimes longer than that for a full self-consistent DFT calculation. The second method, using equation (2), involves the evaluation of equation (3), which also scales as O(N³), because the number of AOs, the number of G vectors and the number of wavefunctions are all proportional to the system size. Because equation (3) involves only one summation over G, this method is usually more efficient than the first one using equation (1). However, to converge the band summation in equation (2), we have to include a relatively large number of bands n, which means we have to diagonalize the PW Hamiltonian for a large number of bands, including high-energy unoccupied bands that are typically not calculated by standard PW codes. Therefore, neither of the methods above is satisfactory in terms of efficiency.

In the main text, an efficient method is proposed to calculate the AO Hamiltonian directly in real space as equation (5). It can also be written as

$$\begin{array}{l}{H}_{i\alpha ,\;j\beta }=\langle {\phi }_{i\alpha }| {-}\displaystyle\frac{1}{2}{\nabla }^{2}| {\phi }_{j\beta }\rangle +\int{{\rm{d}}}^{3}r\,{\phi }_{i\alpha }^{* }({\mathbf{r}}){V}_{{\rm{eff}}}({\mathbf{r}}){\phi }_{j\beta }({\mathbf{r}})\\\qquad\qquad+\sum _{a\gamma \delta }\langle {\phi }_{i\alpha }| {p}_{a\gamma }\rangle {D}_{a\gamma \delta }\langle {p}_{a\delta }| {\phi }_{j\beta }\rangle ,\end{array}$$

(15)

where the terms 〈ϕ_iα|−½∇²|ϕ_jβ〉 and 〈ϕ_iα∣p_aγ〉 can be calculated very efficiently using the two-center integral technique⁵⁸, which will be described in the next subsection. The second term in the equation can be calculated directly on an evenly spaced real-space grid, and the integration region can be restricted to the region where both orbitals ϕ_iα(r) and ϕ_jβ(r) are non-zero. The number of grid points involved in the integration thus does not depend on the overall system size. Therefore, the time required to evaluate the above formula scales linearly with the system size. Here, we would like to note that sometimes we cannot directly obtain V_eff(r) from a PW DFT code, but have to convert V_eff(G) to real space using the inverse of equation (11). This involves the fast Fourier transform, which scales as O(N log N), but we only have to do it once, and the pre-factor is so small that the time is negligible when compared with those of other parts of the real-space reconstruction method, at least up to a few thousand atoms.

As pointed out before, V_eff(r) or V_eff(G) is the only term in the Hamiltonian that requires self-consistent field calculations. The quantities ∣p_aγ〉 and D_aγδ can be directly read from pseudopotential files. Therefore, for any material structure we are interested in, we only need to calculate V_eff(r) with the PW DFT codes to use equation (5) to construct the AO Hamiltonian. Obtaining this quantity is also convenient because most PW DFT codes directly store V_eff(r) or V_eff(G) in memory.
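
When only V_eff(G) is available, the conversion to the real-space grid is a single inverse FFT. The sketch below assumes the convention V_eff(r) = Σ_G V_eff(G) e^{iG·r} and assumes that the coefficients have already been placed on the FFT box in NumPy's frequency ordering. Equation (11) is not reproduced in this section, and PW codes differ in normalization and in how they store the G-vector list, so both points should be checked against the code at hand. The name `potential_to_real_space` is hypothetical.

```python
import numpy as np

def potential_to_real_space(v_g):
    """Return V(r) on the FFT grid from Fourier coefficients V(G).

    Assumes V(r) = sum_G V(G) exp(i G.r). numpy.fft.ifftn divides by the
    number of grid points, so that factor is multiplied back in.
    """
    return np.fft.ifftn(v_g) * v_g.size

# Test case: V(G) = 1/2 at G = +b1 and G = -b1 gives V(r) = cos(2*pi*x).
shape = (8, 6, 6)
v_g = np.zeros(shape, dtype=complex)
v_g[1, 0, 0] = 0.5
v_g[-1, 0, 0] = 0.5
v_r = potential_to_real_space(v_g)

x = np.arange(shape[0]) / shape[0]  # fractional coordinate along the first axis
expected = np.cos(2.0 * np.pi * x)[:, None, None] * np.ones(shape)
```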

Extension to the PAW method

Under the PAW formalism^72,73, the all-electron wavefunction |Ψ〉 is connected to the smooth pseudo-wavefunction $\left\vert \tilde{\Psi }\right\rangle$ by a linear transformation: $\left\vert \Psi \right\rangle ={\mathcal{T}}\left\vert \tilde{\Psi }\right\rangle$. Since the wavefunctions are changed by a linear transformation, any operator $\hat{A}$ in the PAW formalism is also changed according to the rule $\tilde{A}={{\mathcal{T}}}^{\dagger }\hat{A}{\mathcal{T}}$. The Kohn–Sham equation still takes the same form:

$$(\tilde{H}-{\varepsilon }_{n{\mathbf{k}}}\tilde{S})\left\vert {\tilde{\psi }}_{n{\mathbf{k}}}\right\rangle =0,$$

(16)

where the Hamiltonian operator is

$$\tilde{H}({\mathbf{r}},{{\mathbf{r}}}^{{\prime} })=\left(-\frac{1}{2}{\nabla }^{2}+{\tilde{V}}_{{\rm{eff}}}({\mathbf{r}})\right)\delta ({\mathbf{r}}-{{\mathbf{r}}}^{{\prime} })+{\tilde{V}}_{{\rm{nloc}}}({\mathbf{r}},{{\mathbf{r}}}^{{\prime} }),$$

(17)

with

$${\tilde{V}}_{{\rm{nloc}}}=\sum _{i\alpha \beta }\left\vert {\tilde{p}}_{i\alpha }\right\rangle {\tilde{D}}_{i\alpha \beta }\left\langle {\tilde{p}}_{i\beta }\right\vert ,$$

(18)

and the overlap operator is

$$\tilde{S}=1+\sum _{i\alpha \beta }\left\vert {\tilde{p}}_{i\alpha }\right\rangle {\tilde{Q}}_{i\alpha \beta }\left\langle {\tilde{p}}_{i\beta }\right\vert .$$

(19)

Comparing equations (4) and (6) with equations (17) and (18), we can see that the PAW Hamiltonian takes exactly the same form as in the case where the norm-conserving pseudopotential is used, so equation (15) can still be used. How the terms in equation (17) are evaluated differs between DFT codes, but this is beyond the scope of this Brief Communication. The one point that requires care is that the matrix ${\tilde{D}}_{i\alpha \beta }$ is now system dependent and needs to be obtained self-consistently. Thus, apart from obtaining ${\tilde{V}}_{{\rm{eff}}}({\mathbf{r}})$, we also need the matrix ${\tilde{D}}_{i\alpha \beta }$ from the DFT code when we convert the PW Hamiltonian to the AO basis.
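
In matrix form, the projector term of equation (15) is P†DP for each atom, where P collects the projector-orbital overlaps, and equation (19) adds an analogous P†QP term to the AO overlap matrix in the PAW case. A minimal dense-matrix sketch is given below. The function name `add_projector_terms`, the array layout and the random test data are assumptions for illustration; a practical implementation would exploit the sparsity of the overlaps.

```python
import numpy as np

def add_projector_terms(h_local, s_ao, proj, d_blocks, q_blocks=None):
    """Add the projector terms to the AO Hamiltonian and overlap matrices.

    proj[a]     : (n_proj_a, n_ao) array of overlaps <p_a,gamma | phi_j,beta>
    d_blocks[a] : (n_proj_a, n_proj_a) matrix D of atom a
    q_blocks[a] : same shape, matrix Q of atom a (PAW only, else None)
    """
    h = h_local.astype(complex)
    s = s_ao.astype(complex)
    for a, p in enumerate(proj):
        # <phi_i|p_gamma> is the complex conjugate of <p_gamma|phi_i>.
        h = h + p.conj().T @ d_blocks[a] @ p
        if q_blocks is not None:
            s = s + p.conj().T @ q_blocks[a] @ p
    return h, s

# Random placeholder data: 6 AOs, 2 atoms with 3 projectors each.
rng = np.random.default_rng(0)
n_ao = 6
h_local = rng.normal(size=(n_ao, n_ao))
h_local = h_local + h_local.T  # kinetic + local part, symmetric
s_ao = np.eye(n_ao)
proj = [rng.normal(size=(3, n_ao)) for _ in range(2)]
d_blocks = [np.diag(rng.normal(size=3)) for _ in range(2)]
q_blocks = [np.diag(rng.normal(size=3)) for _ in range(2)]

h_nc, s_nc = add_projector_terms(h_local, s_ao, proj, d_blocks)              # norm-conserving
h_paw, s_paw = add_projector_terms(h_local, s_ao, proj, d_blocks, q_blocks)  # PAW
```

In the norm-conserving case `d_blocks` comes from the pseudopotential file and the overlap is unchanged, whereas in the PAW case `d_blocks` would have to be taken from the converged DFT run, as noted above.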

Numerical techniques

Fourier transform of orbitals

Here we consider the Fourier transform of orbitals that can be separated into radial and angular parts: ${\phi }_{i\alpha }({\mathbf{r}})={R}_{inl}(| {\mathbf{r}}| ){Y}_{lm}(\hat{{\mathbf{r}}})$. This will be useful when computing the Fourier transform of projector functions 〈k + G∣p_iα〉 or AOs 〈k + G∣ϕ_iα〉. Using the identity

$${\mathrm{e}}^{i{\mathbf{k}}\cdot {\mathbf{r}}}=4\uppi \mathop{\sum }\limits_{l=0}^{\infty }\mathop{\sum }\limits_{m=-l}^{l}{i}^{l}{j}_{l}(kr){Y}_{lm}(\hat{{\mathbf{k}}}){Y}_{lm}^{\,* }(\hat{{\mathbf{r}}}),$$

(20)

where $r\equiv | {\mathbf{r}}| ,k\equiv | {\mathbf{k}}| ,\hat{\mathbf{r}}\equiv {\mathbf{r}}/r,\hat{\mathbf{k}}\equiv {\mathbf{k}}/k$ and j_l is the spherical Bessel function of order l, we can rewrite the Fourier transform as

$${\phi }_{i\alpha }({\mathbf{k}})\equiv \int{{\rm{d}}}^{3}r\,{\mathrm{e}}^{-i{\mathbf{k}}\cdot {\mathbf{r}}}{\phi }_{i\alpha }({\mathbf{r}})={(-i)}^{l}{R}_{inl}(| {\mathbf{k}}| ){Y}_{lm}(\hat{{\mathbf{k}}}),$$

(21)

where the radial part can be obtained using a spherical Bessel transformation

$${R}_{inl}(k)=4\uppi \mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}r\,{r}^{2}{j}_{l}(kr){R}_{inl}(r).$$

(22)

In practice, R_inl(k) can be computed on a radial grid up to a certain energy cutoff. The three-dimensional Fourier transform of the orbital at any k can then be evaluated easily and very efficiently using spline interpolation. If the orbital is not centered at the origin, we only need to add an additional phase factor.
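
The procedure of equations (21) and (22) can be sketched for an s-type orbital as follows. The Gaussian radial function, the grid parameters and the names `radial_transform` and `orbital_ft_s` are assumptions made so that the result can be compared with the closed-form transform. Only l = 0 is shown, so Y_00 = 1/√(4π) is hard-coded, and higher l would additionally need Y_lm and the factor (−i)^l.

```python
import numpy as np
from scipy.special import spherical_jn
from scipy.interpolate import CubicSpline

def radial_transform(l, r, radial_r, k):
    """R_l(k) = 4*pi * int dr r^2 j_l(k r) R_l(r), trapezoid rule on a uniform r grid."""
    dr = r[1] - r[0]
    weights = np.full(r.size, dr)
    weights[0] = weights[-1] = 0.5 * dr
    integrand = spherical_jn(l, np.outer(k, r)) * (r**2 * radial_r)  # shape (n_k, n_r)
    return 4.0 * np.pi * integrand @ weights

# Tabulate R(k) once on a radial k grid, then build a spline.
r = np.linspace(0.0, 10.0, 2001)
k_grid = np.linspace(0.0, 9.0, 181)
radial_k = radial_transform(0, r, np.exp(-r**2), k_grid)
radial_spline = CubicSpline(k_grid, radial_k)

def orbital_ft_s(kvec, center):
    """Fourier transform at arbitrary k of the s orbital displaced to `center`."""
    kvec = np.atleast_2d(np.asarray(kvec, float))
    k_norm = np.linalg.norm(kvec, axis=1)
    y00 = 1.0 / np.sqrt(4.0 * np.pi)
    phase = np.exp(-1j * kvec @ np.asarray(center, float))  # shift theorem
    return radial_spline(k_norm) * y00 * phase

# Closed-form transform of exp(-r^2) for comparison.
exact_radial_k = np.pi**1.5 * np.exp(-k_grid**2 / 4.0)
ft_value = orbital_ft_s([0.3, 0.4, 1.2], [0.0, 0.0, 0.0])[0]  # |k| = 1.3
exact_ft = np.pi**1.5 * np.exp(-1.3**2 / 4.0) / np.sqrt(4.0 * np.pi)
```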

It is also worth mentioning the inverse transformation here:

$${R}_{inl}(r)=\frac{1}{2{\uppi }^{2}}\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}k\,{k}^{2}{j}_{l}(kr){R}_{inl}(k).$$

(23)
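
The prefactor 1/(2π²) in equation (23) can be checked against equation (22) using the closure relation of the spherical Bessel functions, a standard identity that is not stated in the text above:

$$\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}r\,{r}^{2}{j}_{l}(kr){j}_{l}({k}^{{\prime} }r)=\frac{\uppi }{2{k}^{2}}\delta (k-{k}^{{\prime} }).$$

Substituting equation (23) into the right-hand side of equation (22) then gives

$$4\uppi \mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}r\,{r}^{2}{j}_{l}(kr)\left[\frac{1}{2{\uppi }^{2}}\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}{k}^{{\prime} }\,{k}^{{\prime} 2}{j}_{l}({k}^{{\prime} }r){R}_{inl}({k}^{{\prime} })\right]=\frac{2}{\uppi }\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}{k}^{{\prime} }\,{k}^{{\prime} 2}{R}_{inl}({k}^{{\prime} })\frac{\uppi }{2{k}^{2}}\delta (k-{k}^{{\prime} })={R}_{inl}(k),$$

so the two transformations are mutually inverse.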

Two-center integrals

Integrals of the product of two orbitals centered at two different positions are used frequently in AO calculations. Here we discuss a very efficient way to calculate this kind of integral, following Sankey and Niklewski⁵⁸. Consider two orbitals ϕ_iα(r) and ϕ_jβ(r) with α = (n₁l₁m₁), β = (n₂l₂m₂); their overlap integral is defined as

$$S({\mathbf{R}})\equiv \int{{\rm{d}}}^{3}r\,{\phi }_{i\alpha }^{* }({\mathbf{r}}){\phi }_{j\beta }({\mathbf{r}}-{\mathbf{R}}).$$

(24)

This integral in real space can be converted to the integral in Fourier space:

$$S({\mathbf{R}})=\int\frac{{{\rm{d}}}^{3}k}{{(2\uppi )}^{3}}\,{\phi }_{i\alpha }^{* }({\mathbf{k}}){\phi }_{j\beta }({\mathbf{k}}){\mathrm{e}}^{-i{\mathbf{k}}\cdot {\mathbf{R}}}.$$

(25)

Plugging in equations (20) and (21), we have

$$S({\mathbf{R}})=\mathop{\sum }\limits_{l=0}^{2{l}_{\max }}\mathop{\sum }\limits_{m=-l}^{l}{G}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2},lm}{S}_{l}(| {\mathbf{R}}| ){Y}_{lm}(\hat{{\mathbf{R}}}),$$

(26)

with l_max = max{l₁, l₂}, the Gaunt coefficients ${G}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2},lm}$ defined as

$${G}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2},lm}=\mathop{\int}\nolimits_{0}^{\uppi }\sin \theta \,{\rm{d}}\theta \mathop{\int}\nolimits_{0}^{2\uppi }{\rm{d}}\varphi \,{Y}_{{l}_{1}{m}_{1}}^{* }(\theta ,\varphi ){Y}_{{l}_{2}{m}_{2}}(\theta ,\varphi ){Y}_{lm}^{\,* }(\theta ,\varphi ),$$

(27)

and

$${S}_{l}(R)=\frac{{i}^{{l}_{1}-{l}_{2}-l}}{2{\uppi }^{2}}\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}k\,{k}^{2}{j}_{l}(kR){R}_{i{n}_{1}{l}_{1}}^{* }(k){R}_{j{n}_{2}{l}_{2}}(k).$$

(28)

In our calculations, S_l(R) is computed on a radial grid, so that all overlap integrals S(R) can be computed very efficiently using spline interpolation.
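
Equations (26) and (28) can be sketched for the simplest case of two s-type Gaussian orbitals ϕ(r) = e^{−r²}Y_00, for which only l = 0 contributes and the Gaunt coefficient reduces to 1/√(4π). The test orbitals, the grids and the names `trapezoid_weights` and `radial_overlap` are assumptions for illustration, and the closed-form transform of the Gaussian is used in place of a numerical R(k).

```python
import numpy as np
from scipy.special import spherical_jn
from scipy.interpolate import CubicSpline

def trapezoid_weights(x):
    """Trapezoid-rule weights on a uniform grid."""
    w = np.full(x.size, x[1] - x[0])
    w[0] = w[-1] = 0.5 * (x[1] - x[0])
    return w

def radial_overlap(l, l1, l2, k, rad1_k, rad2_k, dist):
    """S_l(R) of equation (28), tabulated on the array of distances `dist`."""
    kernel = spherical_jn(l, np.outer(dist, k))  # shape (n_R, n_k)
    integrand = kernel * (k**2 * np.conj(rad1_k) * rad2_k)
    phase = (1j) ** (l1 - l2 - l)
    return phase / (2.0 * np.pi**2) * (integrand @ trapezoid_weights(k))

# Radial transform of exp(-r^2) in closed form (assumed test orbital).
k = np.linspace(0.0, 12.0, 2401)
rad_k = np.pi**1.5 * np.exp(-k**2 / 4.0)

# Tabulate S_0(R) once on a radial grid, then interpolate.
dist = np.linspace(0.0, 8.0, 161)
s0_table = radial_overlap(0, 0, 0, k, rad_k, rad_k, dist).real
s0_spline = CubicSpline(dist, s0_table)

# S(R) = G_{00,00,00} * S_0(|R|) * Y_00, with both factors equal to 1/sqrt(4*pi).
gaunt_000 = 1.0 / np.sqrt(4.0 * np.pi)
y00 = 1.0 / np.sqrt(4.0 * np.pi)
overlap = gaunt_000 * float(s0_spline(1.23)) * y00

# Direct real-space result for two such Gaussians a distance 1.23 apart.
exact_overlap = (np.pi / 2.0) ** 1.5 * np.exp(-(1.23**2) / 2.0) / (4.0 * np.pi)
```

For general l₁ and l₂, the sum over (l, m) in equation (26) and the Gaunt coefficients of equation (27) would have to be included, but the tabulate-then-interpolate structure stays the same.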

The above technique can be extended to calculate kinetic matrix elements

$$T({\mathbf{R}})\equiv \int{{\rm{d}}}^{3}r\,{\phi }_{i\alpha }^{* }({\mathbf{r}})(-\frac{1}{2}{\nabla }^{2}){\phi }_{j\beta }({\mathbf{r}}-{\mathbf{R}}),$$

(29)

with slight modifications to equation (28):

$${T}_{l}(R)=\frac{{i}^{{l}_{1}-{l}_{2}-l}}{4{\uppi }^{2}}\mathop{\int}\nolimits_{0}^{\infty }{\rm{d}}k\,{k}^{\,4}{j}_{l}(kR){R}_{i{n}_{1}{l}_{1}}^{* }(k){R}_{j{n}_{2}{l}_{2}}(k).$$

(30)
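
The changes from equation (28) to equation (30) follow from the action of the kinetic operator in Fourier space. Since −½∇² multiplies each plane-wave component by k²/2, the analog of equation (25) is

$$T({\mathbf{R}})=\int\frac{{{\rm{d}}}^{3}k}{{(2\uppi )}^{3}}\,{\phi }_{i\alpha }^{* }({\mathbf{k}})\frac{{k}^{2}}{2}{\phi }_{j\beta }({\mathbf{k}}){\mathrm{e}}^{-i{\mathbf{k}}\cdot {\mathbf{R}}}.$$

The additional factor depends only on |k|, so the angular decomposition of equation (26) carries over unchanged with S_l replaced by T_l, while the radial integrand acquires the factor k²/2. This turns k² into k⁴ and the prefactor 1/(2π²) into 1/(4π²).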

Preparation of datasets

Bilayer graphene dataset

The structures in the training set are the same as those used in ref. ³³. There are 300 bilayer graphene 4 × 4 supercells with different interlayer stackings and random perturbations to atomic positions. The perturbations are uniformly distributed within ±0.1 Å along three Cartesian directions. The interlayer distance follows a normal distribution with mean 3.41 Å and s.d. 0.05 Å. The thickness of the unit cell along the non-periodic direction is chosen to be 20 Å. The PW DFT calculations are performed using the Perdew–Burke–Ernzerhof functional⁷⁴ with the Quantum ESPRESSO package⁶³ and norm-conserving Vanderbilt pseudopotential⁶⁴. Energy cutoffs are 80 Ry for the wavefunctions and 320 Ry for the charge density. A 3 × 3 grid is used for the k sampling of the supercell. The double-zeta plus polarization (DZP) basis for the carbon atom with nodes is generated using SIESTA⁵⁹, which includes two orbitals in the 2s shell, two orbitals in the 2p shell and one orbital in the 3d shell polarized from the 2p orbital.
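
The random structural perturbations described above can be sketched as follows. This is only an illustration of the stated distributions (uniform displacements within ±0.1 Å and a normally distributed interlayer distance with mean 3.41 Å and s.d. 0.05 Å). The function name `perturb_bilayer`, the order of the sampling steps and the placeholder in-plane coordinates are assumptions, and the sampling of interlayer stackings is omitted.

```python
import numpy as np

def perturb_bilayer(bottom_xy, top_xy, rng, max_disp=0.1, gap_mean=3.41, gap_sd=0.05):
    """Return perturbed Cartesian positions (in angstrom) of a bilayer and the sampled gap.

    bottom_xy, top_xy : (n, 2) in-plane coordinates of the two flat layers
    """
    gap = rng.normal(gap_mean, gap_sd)  # interlayer distance
    layers = []
    for xy, z in ((bottom_xy, 0.0), (top_xy, gap)):
        pos = np.column_stack([xy, np.full(len(xy), z)])
        # Independent uniform displacement along each Cartesian direction.
        pos = pos + rng.uniform(-max_disp, max_disp, size=pos.shape)
        layers.append(pos)
    return np.vstack(layers), gap

rng = np.random.default_rng(42)
# Placeholder coordinates only, not a real graphene lattice:
# a 4 x 4 supercell holds 32 atoms per layer.
flat_xy = rng.uniform(0.0, 9.8, size=(32, 2))
positions, gap = perturb_bilayer(flat_xy, flat_xy, rng)
```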

Bilayer MoS₂

PW DFT simulations are all performed with the Quantum ESPRESSO package using the Perdew–Burke–Ernzerhof functional with a wavefunction cutoff of 60 Ry and a charge density cutoff of 240 Ry. The unit cell is calculated using a 6 × 6 k sampling, and the k meshes for the supercells are reduced in proportion to the supercell sizes. The thickness of the unit cell along the non-periodic direction is chosen to be 20 Å. There are 256 3 × 3 supercells in the training set with different interlayer stackings and random perturbations to atomic positions. The perturbations are uniformly distributed within ±0.1 Å along three Cartesian directions. The average distance between the two Mo layers is 6.49 Å and the s.d. is 0.05 Å. The twisted structures are relaxed using the Perdew–Burke–Ernzerhof functional with van der Waals corrections from the DFT-D3 method⁷⁵. The AOs are the standard split-norm DZP basis for Mo and S atoms generated by SIESTA. The orbitals for the Mo atom include one orbital in the 4s shell, one orbital in the 4p shell, two orbitals in the 4d shell, two orbitals in the 5s shell and one orbital in the 5p shell polarized from the 5s shell. The orbitals for the S atom include two orbitals in the 3s shell, two orbitals in the 3p shell and one orbital in the 3d shell polarized from the 3p orbital. One additional diffuse orbital in the 4s shell with cutoff distance 8.0 a.u. is added to the S atom to capture the interlayer hybridization.

Reconstruction from PW Hamiltonian to AO basis

The cutoffs of the two-center integrals in equation (5) are taken to be the same as the wavefunction cutoff used in the PW calculations. The real-space grid for the real-space integrals in equation (5) is likewise taken to be the same size as the fast Fourier transform grid in the PW calculations. These settings are the same for all calculations in this Brief Communication.
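
As a rough guide to what these settings mean numerically, an energy cutoff in Rydberg corresponds to a maximum wavevector of √E_cut in bohr⁻¹, because the kinetic energy of a plane wave is |k|² in Rydberg atomic units. The sketch below applies this to the cutoffs of the bilayer graphene dataset. The function names are hypothetical, and the grid-size estimate assumes a lattice vector orthogonal to the other two and ignores the rounding to FFT-friendly sizes that PW codes usually apply.

```python
import math

def cutoff_wavevector(e_cut_ry):
    """Largest |k + G| in bohr^-1 allowed by an energy cutoff given in Rydberg."""
    return math.sqrt(e_cut_ry)

def min_fft_points(length_bohr, e_cut_ry):
    """Smallest number of grid points along an orthogonal axis of the given length
    that holds every G component up to the cutoff (indices -n_max .. n_max)."""
    n_max = math.floor(cutoff_wavevector(e_cut_ry) * length_bohr / (2.0 * math.pi))
    return 2 * n_max + 1

BOHR_PER_ANGSTROM = 1.0 / 0.529177210903

# 80 Ry wavefunction cutoff: upper limit of the radial k grid.
k_max_wfc = cutoff_wavevector(80.0)
# 320 Ry density cutoff along the 20 angstrom non-periodic direction.
n_z = min_fft_points(20.0 * BOHR_PER_ANGSTROM, 320.0)
```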

Data availability

The data used in the current study are available at Zenodo⁷⁶. PW DFT calculations in this study are performed with the Quantum ESPRESSO package (https://www.quantum-espresso.org/). AO basis functions are generated using the SIESTA package (https://siesta-project.org/siesta/index.html). Source Data for Figures 1 and 2 are available with this manuscript.

Code availability

The source code used in the current study is available at Zenodo⁷⁷, at GitHub (https://github.com/Xiaoxun-Gong/HPRO) and as Supplementary Software along with the manuscript. A demo of how to use it is also provided alongside the code.

References

Li, H., Xu, Y. & Duan, W. Ab initio artificial intelligence: future research of Materials Genome Initiative. Mater. Genome Eng. Adv. 1, e16 (2023).
von Lilienfeld, O. A. & Burke, K. Retrospective on a decade of machine learning for chemical discovery. Nat. Commun. 11, 4895 (2020).
Pederson, R., Kalita, B. & Burke, K. Machine learning and density functional theory. Nat. Rev. Phys. 4, 357–358 (2022).
Fiedler, L., Shah, K., Bussmann, M. & Cangi, A. Deep dive into machine learning density functional theory for materials science and chemistry. Phys. Rev. Mater. 6, 040301 (2022).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Zhang, L., Han, J., Wang, H., Car, R. & E, W. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Chen, C. & Ong, S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2, 718–728 (2022).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. Proc. Mach. Learn. Res. 70, 1263–1272 (2017).
Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).
Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations Vol. 14, 10758 (Curran, 2020).
Eickenberg, M., Exarchakis, G., Hirn, M., Mallat, S. & Thiry, L. Solid harmonic wavelet scattering for predictions of molecule properties. J. Chem. Phys. 148, 241732 (2018).
Lei, X. & Medford, A. J. Design and analysis of machine learning exchange–correlation functionals via rotationally invariant convolutional descriptors. Phys. Rev. Mater. 3, 063801 (2019).
Dick, S. & Fernandez-Serra, M. Machine learning accurate exchange and correlation functionals of the electronic density. Nat. Commun. 11, 3509 (2020).
Nagai, R., Akashi, R. & Sugino, O. Completing density functional theory by machine learning hidden messages from molecules. npj Comput. Mater. 6, 43 (2020).
Fujinami, M., Kageyama, R., Seino, J., Ikabata, Y. & Nakai, H. Orbital-free density functional theory calculation applying semi-local machine-learned kinetic energy density functional and kinetic potential. Chem. Phys. Lett. 748, 137358 (2020).
Bogojeski, M., Vogt-Maranto, L., Tuckerman, M. E., Müller, K.-R. & Burke, K. Quantum chemical accuracy from density functional approximations via machine learning. Nat. Commun. 11, 5223 (2020).
Yeo, B. C., Kim, D., Kim, C. & Han, S. S. Pattern learning electronic density of states. Sci. Rep. 9, 5879 (2019).
Ben Mahmoud, C., Anelli, A., Csányi, G. & Ceriotti, M. Learning the electronic density of states in condensed matter. Phys. Rev. B 102, 235130 (2020).
Brockherde, F. et al. Bypassing the Kohn–Sham equations with machine learning. Nat. Commun. 8, 872 (2017).
Tsubaki, M. & Mizoguchi, T. Quantum deep field: data-driven wave function, electron density generation, and atomization energy prediction and extrapolation with machine learning. Phys. Rev. Lett. 125, 206401 (2020).
Schmidt, E., Fowler, A. T., Elliott, J. A. & Bristowe, P. D. Learning models for electron densities with Bayesian regression. Comput. Mater. Sci. 149, 250–258 (2018).
Alred, J. M., Bets, K. V., Xie, Y. & Yakobson, B. I. Machine learning electron density in sulfur crosslinked carbon nanotubes. Compos. Sci. Technol. 166, 3–9 (2018).
Grisafi, A. et al. Transferable machine-learning model of the electron density. ACS Cent. Sci. 5, 57–64 (2018).
Fabrizio, A., Grisafi, A., Meyer, B., Ceriotti, M. & Corminboeuf, C. Electron density learning of non-covalent systems. Chem. Sci. 10, 9424–9432 (2019).
Lewis, A. M., Grisafi, A., Ceriotti, M. & Rossi, M. Learning electron densities in the condensed phase. J. Chem. Theory Comput. 17, 7203–7214 (2021).
Chandrasekaran, A. et al. Solving the electronic structure problem with machine learning. npj Comput. Mater. 5, 22 (2019).
Ellis, J. A. et al. Accelerating finite-temperature Kohn–Sham density functional theory with deep neural networks. Phys. Rev. B 104, 035120 (2021).
Dick, S. & Fernandez-Serra, M. Learning from the density to correct total energy and forces in first principle simulations. J. Chem. Phys. 151, 144102 (2019).
Fiedler, L. et al. Predicting electronic structures at any length scale with machine learning. npj Comput. Mater. 9, 115 (2023).
Schütt, K. T., Gastegger, M., Tkatchenko, A., Müller, K.-R. & Maurer, R. J. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat. Commun. 10, 5024 (2019).
Li, H. et al. Deep-learning density functional theory Hamiltonian for efficient ab initio electronic-structure calculation. Nat. Comput. Sci. 2, 367–377 (2022).
Gong, X. et al. General framework for E(3)-equivariant neural network representation of density functional theory Hamiltonian. Nat. Commun. 14, 2848 (2023).
Li, H. et al. Deep-learning electronic-structure calculation of magnetic superstructures. Nat. Comput. Sci. 3, 321–327 (2023).
Tang, Z. et al. Efficient hybrid density functional calculation by deep learning. Preprint at https://arxiv.org/abs/2302.08221 (2023).
Su, M., Yang, J.-H., Xiang, H.-J. & Gong, X.-G. Efficient determination of the Hamiltonian and electronic properties using graph neural network with complete local coordinates. Mach. Learn. Sci. Technol. 4, 035010 (2023).
Zhong, Y., Yu, H., Su, M., Gong, X. & Xiang, H. Transferable equivariant graph neural networks for the Hamiltonians of molecules and solids. npj Comput. Mater. 9, 182 (2023).
Yu, H., Xu, Z., Qian, X., Qian, X. & Ji, S. Efficient and equivariant graph networks for predicting quantum Hamiltonian. Proc. Mach. Learn. Res. 202, 40412–40424 (2023).
Wang, Z. et al. Machine learning method for tight-binding Hamiltonian parameterization from ab-initio band structure. npj Comput. Mater. 7, 11 (2021).
Gu, Q. et al. Deep learning tight-binding approach for large-scale electronic simulations at finite temperatures with ab initio accuracy. Nat. Commun. 15, 6772 (2024).
Kim, R. & Son, Y.-W. Transferable empirical pseudopotentials from machine learning. Phys. Rev. B 109, 045153 (2024).
Wang, Y. et al. Universal materials model of deep-learning density functional theory Hamiltonian. Sci. Bull. 69, 2514–2521 (2024).
Kohn, W. Density functional and density matrix method scaling linearly with the number of atoms. Phys. Rev. Lett. 76, 3168 (1996).
Prodan, E. & Kohn, W. Nearsightedness of electronic matter. Proc. Natl Acad. Sci. USA 102, 11635–11638 (2005).
Chadi, D. J. Localized-orbital description of wave functions and energy bands in semiconductors. Phys. Rev. B 16, 3572 (1977).
Sánchez-Portal, D., Artacho, E. & Soler, J. M. Projection of plane-wave calculations into atomic orbitals. Solid State Commun. 95, 685–690 (1995).
Sánchez-Portal, D., Artacho, E. & Soler, J. M. Analysis of atomic orbital basis sets from the projection of plane-wave results. J. Phys. Condens. Matter 8, 3859 (1996).
Segall, M. D., Pickard, C. J., Shah, R. & Payne, M. C. Population analysis in plane wave electronic structure calculations. Mol. Phys. 89, 571–577 (1996).
Dunnington, B. D. & Schmidt, J. R. Generalization of natural bond orbital analysis to periodic systems: applications to solids and surfaces via plane-wave density functional theory. J. Chem. Theory Comput. 8, 1902–1911 (2012).
Dunnington, B. D. & Schmidt, J. R. A projection-free method for representing plane-wave DFT results in an atom-centered basis. J. Chem. Phys. 143, 104109 (2015).
Maintz, S., Deringer, V. L., Tchougréeff, A. L. & Dronskowski, R. LOBSTER: a tool to extract chemical bonding from plane-wave based DFT. J. Comput. Chem. 37, 1030–1035 (2016).
Nelson, R. et al. LOBSTER: local orbital projections, atomic charges, and chemical-bonding analysis from projector-augmented-wave-based density-functional theory. J. Comput. Chem. 41, 1931–1940 (2020).
Aarons, J., Verga, L. G., Hine, N. D. M. & Skylaris, C.-K. Atom-projected and angular momentum resolved density of states in the ONETEP code. Electron. Struct. 1, 035002 (2019).
Kundu, S., Bhattacharjee, S., Lee, S.-C. & Jain, M. Population analysis with Wannier orbitals. J. Chem. Phys. 154, 104111 (2021).
Martin, R. M. Electronic Structure: Basic Theory and Practical Methods (Cambridge Univ. Press, 2004).
Sholl, D. S. & Steckel, J. A. Density Functional Theory: a Practical Introduction (Wiley, 2009).
Sankey, O. F. & Niklewski, D. J. Ab initio multicenter tight-binding model for molecular-dynamics simulations and other applications in covalent systems. Phys. Rev. B 40, 3979 (1989).
Soler, J. M. et al. The SIESTA method for ab initio order-N materials simulation. J. Phys. Condens. Matter 14, 2745 (2002).
Ozaki, T. Variationally optimized atomic orbitals for large-scale electronic structures. Phys. Rev. B 67, 155108 (2003).
Larsen, A. H., Vanin, M., Mortensen, J. J., Thygesen, K. S. & Jacobsen, K. W. Localized atomic basis set in the projector augmented wave method. Phys. Rev. B 80, 195112 (2009).
Bowler, D. R. et al. Highly accurate local basis sets for large-scale DFT calculations in CONQUEST. Jpn. J. Appl. Phys. 58, 100503 (2019).
Giannozzi, P. et al. Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).
van Setten, M. et al. The PseudoDojo: training and grading a 85 element optimized norm-conserving pseudopotential table. Comput. Phys. Commun. 226, 39–54 (2018).
Cao, Y. et al. Correlated insulator behaviour at half-filling in magic-angle graphene superlattices. Nature 556, 80–84 (2018).
Cao, Y. et al. Unconventional superconductivity in magic-angle graphene superlattices. Nature 556, 43–50 (2018).
Lucignano, P., Alfè, D., Cataudella, V., Ninno, D. & Cantele, G. Crucial role of atomic corrugation on the flat bands and energy gaps of twisted bilayer graphene at the magic angle θ ~ 1.08°. Phys. Rev. B 99, 195419 (2019).
Li, H. et al. Deep-learning density functional perturbation theory. Phys. Rev. Lett. 132, 096401 (2024).
Kleinman, L. & Bylander, D. M. Efficacious form for model pseudopotentials. Phys. Rev. Lett. 48, 1425 (1982).
Hamann, D. R., Schlüter, M. & Chiang, C. Norm-conserving pseudopotentials. Phys. Rev. Lett. 43, 1494 (1979).
Hamann, D. R. Optimized norm-conserving Vanderbilt pseudopotentials. Phys. Rev. B 88, 085117 (2013).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 78, 1396 (1997).
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H–Pu. J. Chem. Phys. 132, 154104 (2010).
Gong, X., Louie, S. G., Duan, W. & Xu, Y. Dataset for ‘Generalizing deep-learning electronic structure calculation to plane-wave basis’. Zenodo https://doi.org/10.5281/zenodo.13377497 (2024).
Gong, X., Louie, S. G., Duan, W. & Xu, Y. Code for ‘Generalizing deep-learning electronic structure calculation to plane-wave basis’. Zenodo https://doi.org/10.5281/zenodo.13377785 (2024).


Acknowledgements

This work was supported by the Basic Science Center Project of NSFC (grant 52388201), the National Science Foundation of the United States (grant DMR-2325410), the Ministry of Science and Technology of China (grant 2023YFA1406400), the National Natural Science Foundation of China (grants 12334003, 12421004 and 12361141826), the National Science Fund for Distinguished Young Scholars (grant 12025405), the Beijing Advanced Innovation Center for Future Chip Technology (ICFC) and the Beijing Advanced Innovation Center for Materials Genome Engineering. The calculations were done at the National Supercomputer Center in Tianjin using the Tianhe new generation supercomputer.

We thank W. Kim and Z. Tang for discussions.

Author information

Authors and Affiliations

State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing, China
Xiaoxun Gong, Wenhui Duan & Yong Xu
Department of Physics, University of California at Berkeley, Berkeley, CA, USA
Xiaoxun Gong & Steven G. Louie
Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Xiaoxun Gong & Steven G. Louie
Institute for Advanced Study, Tsinghua University, Beijing, China
Wenhui Duan
Frontier Science Center for Quantum Information, Beijing, China
Wenhui Duan & Yong Xu
RIKEN Center for Emergent Matter Science (CEMS), Wako, Japan
Yong Xu

Authors

Xiaoxun Gong
Steven G. Louie
Wenhui Duan
Yong Xu

Contributions

Y.X., W.D. and S.G.L. proposed the project and supervised X.G. in carrying out the research. All authors discussed the results. Y.X. and X.G. prepared the manuscript with input from the other co-authors.

Corresponding authors

Correspondence to Steven G. Louie, Wenhui Duan or Yong Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Attila Cangi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Jie Pan, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–5, Fig. 1 and Tables 1 and 2.

Peer Review File

Supplementary Software

Software developed and used in the study.

Source data

Source Data Fig. 1

Source data for plotting Fig. 1c,d.

Source Data Fig. 2

Source data for plotting Fig. 2b–e.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Gong, X., Louie, S.G., Duan, W. et al. Generalizing deep learning electronic structure calculation to the plane-wave basis. Nat Comput Sci 4, 752–760 (2024). https://doi.org/10.1038/s43588-024-00701-9


Received: 19 March 2024
Accepted: 04 September 2024
Published: 03 October 2024
Issue date: October 2024
DOI: https://doi.org/10.1038/s43588-024-00701-9
