Introduction

The band structure is a fundamental concept in condensed matter physics and materials science, essential for predicting and understanding material properties and phenomena. In the framework of Kohn-Sham density functional theory (DFT)1,2, band structure calculations typically involve three steps: (1) performing self-consistent field (SCF) electronic structure calculations on a uniform k-point grid {k}; (2) obtaining the Hamiltonian Hq on a nonuniform k-point grid (or path) {q}; (3) diagonalizing Hq to obtain eigenvalues. Due to the complexity of the density functional, it is often more efficient to interpolate Hq from Hk in the second step using Fourier interpolation:

$${H}_{{\bf{q}}}=\frac{1}{{N}_{k}}\sum _{{\bf{k}},{\bf{R}}}{H}_{{\bf{k}}}{e}^{{\rm{i}}({\bf{q}}-{\bf{k}}){\bf{R}}},$$
(1)

where R is the Bravais lattice vector, and Nk is the number of uniform k-points. In this paper, we focus on improving the accuracy of this interpolation.
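As a minimal sketch of Eq. (1), consider a scalar one-band chain with dispersion 1 + cos k, chosen here purely for illustration. The function name `fourier_interpolate`, the five-point grid, and the set of lattice vectors are assumptions of this example rather than part of the method. Because the real-space matrix elements vanish beyond nearest neighbours, the interpolation is exact in this case.

```python
import cmath
import math

def fourier_interpolate(h_k, k_points, r_vectors, q):
    """Eq. (1): H_q = (1/N_k) * sum over k and R of H_k * exp(i (q - k) R)."""
    n_k = len(k_points)
    total = 0j
    for hk, k in zip(h_k, k_points):
        for r in r_vectors:
            total += hk * cmath.exp(1j * (q - k) * r)
    return total / n_k

# Illustrative 1-D setup (lattice constant 1): uniform grid and matching lattice vectors
n_k = 5
k_points = [2 * math.pi * j / n_k for j in range(n_k)]
r_vectors = [-2, -1, 0, 1, 2]

# Scalar "Hamiltonian" on the uniform grid: on-site 1, nearest-neighbour hopping 0.5
h_k = [1 + math.cos(k) for k in k_points]

q = 0.3  # a point that is not on the uniform grid
h_q = fourier_interpolate(h_k, k_points, r_vectors, q)
exact = 1 + math.cos(q)
```

For a Hamiltonian with longer-range matrix elements, the same routine would only become accurate once the set of lattice vectors (and hence Nk) covers the range over which those elements are non-negligible.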

The success of interpolation relies on the smoothness of matrix elements in reciprocal space or their localization in real space. To clarify, when we refer to the localization of the Hamiltonian, we mean localization in R space, not in the band indices α, β. Specifically, for two unit cells located at Ri and Rj, Hαβ(Ri, Rj) decays to zero for sufficiently large |Ri − Rj|, regardless of the values of α and β. This is equivalent to ‖H(Ri, Rj)‖2 decaying to zero. A faster decay means the Hamiltonian is more localized in real space.

Given that the DFT Hamiltonian is typically large, it must be projected onto a smaller basis set for practical interpolation. However, while the original implicit DFT Hamiltonian is localized in real space, the projected explicit smaller Hamiltonian is not necessarily so. This can result in a slow decay of the matrix elements with respect to R, necessitating a very large Nk to achieve satisfactory interpolation accuracy. Thus, the challenge lies in constructing a small and localized Hamiltonian.

The maximally localized Wannier function (MLWF)3,4,5 is a powerful tool widely used for interpolation, known as Wannier interpolation (WI). As a compact basis set, MLWFs are optimized to be as localized as possible, ensuring that the projected Hamiltonian remains localized. WI is a popular interpolation method in condensed matter physics and plays a crucial role in constructing model Hamiltonians6,7 and computing various physical observables of solids8,9,10. However, constructing MLWFs is a challenging nonlinear optimization problem due to the presence of multiple local minima4. Consequently, the results can be sensitive to initial guesses, requiring users to have detailed knowledge of the system to provide a good starting point. Significant progress has been made in improving the robustness of numerical algorithms for finding localized Wannier functions11,12,13,14. One particularly robust approach is the selected columns of the density matrix (SCDM)15,16,17. However, constructing MLWFs remains challenging in certain cases, such as topological insulators18,19 and entangled band structures17,20,21.

Clearly, the Hamiltonian constructed from the “maximally localized wavefunction” is not necessarily maximally localized. By instead optimizing with the localization of the Hamiltonian as the target function, we can obtain a truly “maximally localized Hamiltonian”. In this work, we propose a new framework called Hamiltonian transformation (HT), specifically designed to directly localize the Hamiltonian. Unlike MLWFs, HT does not involve any optimization procedure at runtime. Instead, we design an invertible transform function f that transforms the Hamiltonian H into f(H), and optimize f during the algorithm design phase to ensure f(H) is as localized as possible. After diagonalizing f(H) and obtaining the transformed eigenvalues f(ε), the true eigenvalues can be recovered through the inverse transformation ε = f⁻¹(f(ε)). Notably, the same transform function f can also be applied within the WI framework, which yields an enhanced WI-SCDM-f scheme for more accurate model Hamiltonians.

HT offers two advantages over WI: (1) HT circumvents the complex optimization procedures required in WI by localizing the Hamiltonian through a pre-optimized transform function f, which we demonstrate to be universally applicable to all Hamiltonians; (2) by focusing on the localization of the Hamiltonian as the primary objective, HT achieves significantly higher accuracy (1 to 2 orders of magnitude better than WI-SCDM) in handling entangled bands. HT also has two disadvantages compared to WI: (1) HT cannot generate localized orbitals, which limits its ability to provide information about chemical bonds; (2) HT requires a larger basis set than WI, resulting in an interpolated Hamiltonian that is approximately an order of magnitude larger than that produced by WI. In summary, the balance of advantages and limitations makes HT a specialized method for band structure interpolation: it is more accurate, more robust, and faster than WI-SCDM. HT is particularly effective for systems with entangled or topologically obstructed bands.

Results

Designing the transform function f

We begin with an example to demonstrate that the degradation of localization in the Hamiltonian is caused by spectral truncation. For a 1-D atomic chain with nearest-neighbor interactions, the Hamiltonian T is a tridiagonal Toeplitz matrix22. The main diagonal elements of T are 1, and the lower and upper diagonal elements are 0.5, with all other elements being zero. The matrix T and its eigenvalue spectrum are shown in Fig. 1a, b. Although T itself is localized, its eigenvectors are non-local, oscillating between positive and negative values, and canceling each other out away from the diagonal. In a typical SCF calculation, only a few of the lowest eigenvalues (assumed to be those less than 1.5 here) are obtained, corresponding to the truncated eigenvalue spectrum shown in Fig. 1d. Reconstructing the Hamiltonian using only the truncated eigenvalues and eigenvectors results in a non-localized Hamiltonian, as shown in Fig. 1c. After truncation, the eigenvalue spectrum becomes discontinuous, and the remaining eigenvectors are unable to cancel each other out effectively, leading to a delocalized reconstructed T. A key observation is that by shifting the remaining eigenvalues downward by 1.5, we can restore continuity in the eigenvalue spectrum, as shown in Fig. 1f. The reconstructed T becomes significantly more localized, as illustrated in Fig. 1e.

Fig. 1: An example demonstrating that modifying eigenvalues can recover the localization of the Hamiltonian.

a Original tridiagonal Toeplitz Hamiltonian T for a 1-D atomic chain with nearest-neighbor interactions. b Corresponding eigenvalue spectrum of T. c Reconstructed Hamiltonian after spectral truncation, leading to delocalization. d Truncated eigenvalue spectrum with eigenvalues below 1.5. e Reconstructed Hamiltonian after shifting the remaining eigenvalues downward by 1.5, showing improved localization. f Adjusted eigenvalue spectrum after the shift, restoring continuity.
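The truncation experiment described above can be reproduced in a few lines. The sketch below assumes a chain of 60 sites (the text does not state the size) and uses the standard closed-form eigenpairs of a symmetric tridiagonal Toeplitz matrix, so no linear-algebra library is needed. The helper names and the "tail weight" measure (the sum of absolute values of one row at distance 10 or more from the diagonal) are illustrative choices, not quantities defined in the text.

```python
import math

N = 60          # chain length (illustrative choice)
CUTOFF = 1.5    # keep eigenvalues below this value, as in the text

def eigenpair(j):
    """Analytic eigenpair j (1-based) of the matrix with diagonal 1 and off-diagonals 0.5."""
    theta = j * math.pi / (N + 1)
    value = 1.0 + math.cos(theta)
    norm = math.sqrt(2.0 / (N + 1))
    vector = [norm * math.sin(i * theta) for i in range(1, N + 1)]
    return value, vector

PAIRS = [eigenpair(j) for j in range(1, N + 1)]

def reconstruct_row(row, transform, keep):
    """One row of sum_j transform(eps_j) v_j v_j^T, restricted to the kept eigenpairs."""
    out = [0.0] * N
    for value, vector in PAIRS:
        if keep(value):
            weight = transform(value) * vector[row]
            for col in range(N):
                out[col] += weight * vector[col]
    return out

def tail_weight(row_values, row, distance):
    """Sum of |entries| lying at least `distance` columns away from the diagonal."""
    return sum(abs(x) for col, x in enumerate(row_values) if abs(col - row) >= distance)

row = N // 2
full = reconstruct_row(row, lambda e: e, lambda e: True)
truncated = reconstruct_row(row, lambda e: e, lambda e: e < CUTOFF)
shifted = reconstruct_row(row, lambda e: e - CUTOFF, lambda e: e < CUTOFF)

tail_full = tail_weight(full, row, 10)
tail_truncated = tail_weight(truncated, row, 10)
tail_shifted = tail_weight(shifted, row, 10)
```

In this setup `tail_full` vanishes up to rounding, `tail_truncated` is sizeable, and `tail_shifted` is markedly smaller, which mirrors the qualitative difference between panels c and e.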

Therefore, the principle behind designing f is to ensure that it smooths the eigenvalue spectrum. We will demonstrate later that optimizing f is a multi-objective problem, making it difficult to determine the optimal form of f. A practical approach, therefore, is to design a family of f functions with adjustable parameters and compare their effects. We define f through its derivative:

$${f}_{a,n}^{{\prime} }(x)=\left\{\begin{array}{ll}0\quad &x\ge \varepsilon \\ \frac{1}{2}-\frac{\,\text{erf}(n(\frac{1}{2}+\frac{x-\varepsilon }{a}))}{2\text{erf}\,(\frac{n}{2})}\quad &\varepsilon -a\le x < \varepsilon \\ 1\quad &x < \varepsilon -a.\end{array}\right.$$
(2)

Here, ε represents the maximum eigenvalue in the SCF calculation, and erf(x) is the error function. The function f has two adjustable parameters, a and n. The parameter a ≥ 0 controls the width of the transition region (typically set in proportion to the energy range of the entangled bands), while n governs the smoothness of the function f; a larger n results in a smoother function. Integrating \({f}^{{\prime} }\) with the boundary condition f(ε) = 0 gives the closed form in Eq. (3), whose expressions are written with x measured relative to ε (that is, for ε = 0).

$${f}_{a,n}(x)=\left\{\begin{array}{ll}0\quad &x\ge \varepsilon \\ \frac{\frac{2a({e}^{-\frac{{n}^{2}}{4}}-{e}^{-\frac{{n}^{2}{(2x+a)}^{2}}{4{a}^{2}}})}{\sqrt{\pi }n}+(2x+a)\left(\,{\text{erf}}\,\left(\frac{n}{2}\right)-\,{\text{erf}}\,\left(n\left(\frac{x}{a}+\frac{1}{2}\right)\right)\right)}{4\,{\text{erf}}\,\left(\frac{n}{2}\right)}\quad &\varepsilon -a\le x < \varepsilon \\ x+a/2\quad &x < \varepsilon -a\end{array}\right..$$
(3)

Without loss of generality, we assume ε = 0 in the following discussion. The plots of fa=1,n(x) and \({f}_{a = 1,n}^{{\prime} }(x)\) are shown in Fig. 2(a) and (b), respectively. In Fig. 2(a), the piecewise function fa,n(x) consists of three parts: the right part, for x > 0, where fa,n(x) is set to 0, simulating the truncation of eigenvalues; the left part, for x < − 1, which is linear, ensuring that eigenvalues significantly less than 0 undergo only a constant shift; and the middle part, which acts as a smoother, providing a gradual transition between the two linear regions.

Fig. 2: The transform function and its derivative.

a The transform function fa,n(x) for different values of n, with the transition region width a = 1. As n increases, fa,n(x) becomes smoother. b The derivative \({f}_{a,n}^{{\prime} }(x)\). Higher values of n result in a more gradual change in slope.
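Equations (2) and (3) can be evaluated directly with the error function from the Python standard library. The sketch below is one possible transcription. The function names, the default ε = 0, and the restriction to a > 0 (the a = 0 limit would need separate handling) are assumptions of this example, and Eq. (3) is applied to the shifted argument x − ε.

```python
import math

def f_prime(x, a, n, eps=0.0):
    """Derivative of the transform function, Eq. (2); requires a > 0."""
    if x >= eps:
        return 0.0
    if x < eps - a:
        return 1.0
    return 0.5 - math.erf(n * (0.5 + (x - eps) / a)) / (2.0 * math.erf(n / 2.0))

def f(x, a, n, eps=0.0):
    """Transform function, Eq. (3), evaluated at y = x - eps; requires a > 0."""
    y = x - eps
    if y >= 0.0:
        return 0.0                      # truncated part of the spectrum
    if y < -a:
        return y + a / 2.0              # constant shift far below eps
    u = n * (y / a + 0.5)               # argument of erf in the transition region
    gauss = 2.0 * a * (math.exp(-n * n / 4.0) - math.exp(-u * u)) / (math.sqrt(math.pi) * n)
    linear = (2.0 * y + a) * (math.erf(n / 2.0) - math.erf(u))
    return (gauss + linear) / (4.0 * math.erf(n / 2.0))

# Consistency check: a central finite difference of f should reproduce f_prime
x0, a0, n0, h = -0.3, 1.0, 3.0, 1e-6
finite_difference = (f(x0 + h, a0, n0) - f(x0 - h, a0, n0)) / (2.0 * h)
```

The two branches of f meet continuously at x = ε − a, where both give −a/2, and at x = ε, where both give 0.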

Localization functional F

In this section, we introduce a functional F to quantitatively describe the localization properties of any sparse Hermitian Hamiltonian. In the plane-wave basis set, the DFT Hamiltonian is generally assumed to be a dense matrix. However, to achieve more accurate interpolation, we must adopt a sufficiently large k-point mesh, which is equivalent to using a larger supercell in real space. This enlargement ensures that for the farthest two unit cells, Ri and Rj, ‖H(Ri, Rj)‖2 becomes sufficiently small, avoiding overlap with periodic mirror images. In this case, the Hamiltonian effectively becomes a sparse matrix.

The basic approach to analyzing the decay properties of a sparse matrix involves approximating the transform function using polynomials and analyzing the expansion coefficients. Similar ideas have been applied to study the sparsity of density matrices23,24.

In the following discussion, we assume the band indices α, β of the Hamiltonian are fixed, thereby omitting them and simplifying Hαβ(Ri, Rj) to Hij. Consider an m-banded Hermitian matrix H with the following properties: (1) The eigenvalue spectrum σ(H) lies within the interval [−1, 1] (if not, H can be scaled to meet this requirement); (2) There exists an integer m ≥ 0 such that Hij = 0 when |i − j| > m. We define the k-th best approximation error of a continuous transform function f on the closed interval [−1, 1] (i.e., f ∈ C[−1, 1]) as

$${E}_{k}(f)=\inf \left\{\mathop{\max }\limits_{-1\le x\le 1}| f(x)-p(x)| :p\in {{\mathcal{P}}}_{k}\right\},$$
(4)

where \({{\mathcal{P}}}_{k}\) denotes the subspace of algebraic polynomials of degree at most k in C[−1, 1]. Let the indices i, j satisfy mk < |i − j| ≤ m(k + 1). Then, for any \({p}_{k}\in {{\mathcal{P}}}_{k}\), we have pk(H)ij = 0, because the l-th power of an m-banded matrix is at most ml-banded. Thus

$$\begin{array}{ll}\left\vert f{(H)}_{ij}\right\vert \,=\,\left\vert {[f(H)-{p}_{k}(H)]}_{ij}\right\vert \\\qquad\quad\;\;\, \le\, {\left\Vert f(H)-{p}_{k}(H)\right\Vert }_{2}=\mathop{\max }\limits_{x\in \sigma (H)}\left\vert f(x)-{p}_{k}(x)\right\vert\\\qquad\quad\;\;\,\le \mathop{\max }\limits_{-1\le x\le 1}\left\vert f(x)-{p}_{k}(x)\right\vert ,\end{array}$$
(5)

which means that

$$\left\vert f{(H)}_{ij}\right\vert \le {E}_{k}(f).$$
(6)

In Eq. (5) we have used

$$| {A}_{ij}| \le \sqrt{\sum _{i}| {A}_{ij}{| }^{2}}={\left\Vert A{e}_{j}\right\Vert }_{2}\le \mathop{\sup }\limits_{x\ne {\bf{0}}}\frac{{\left\Vert Ax\right\Vert }_{2}}{{\left\Vert x\right\Vert }_{2}}={\left\Vert A\right\Vert }_{2}.$$
(7)
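The step pk(H)ij = 0 used in Eq. (5) rests on the fact that powers of a banded matrix stay banded. A small numerical check, assuming a tridiagonal (m = 1) example of size 10 with hopping 0.5, is sketched below; the helper names are illustrative.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]

def bandwidth(matrix, tol=1e-14):
    """Largest |i - j| that holds an entry with magnitude above tol."""
    size = len(matrix)
    return max((abs(i - j) for i in range(size) for j in range(size)
                if abs(matrix[i][j]) > tol), default=0)

SIZE = 10
# 1-banded (tridiagonal) example whose spectrum lies inside [-1, 1]
H = [[0.5 if abs(i - j) == 1 else 0.0 for j in range(SIZE)] for i in range(SIZE)]

# Record the bandwidth of H^0, H^1, ..., H^4
power = [[1.0 if i == j else 0.0 for j in range(SIZE)] for i in range(SIZE)]
bandwidths = []
for k in range(5):
    bandwidths.append(bandwidth(power))
    power = matmul(power, H)
```

Since H^k has bandwidth k here, any polynomial of degree at most k has vanishing entries for |i − j| > k, which is the m = 1 case of the statement used above.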

The exact expression for the optimal pk is unknown, but we can approximate Ek(f) using Chebyshev polynomials. Approximation theory guarantees that Chebyshev polynomials are nearly optimal, and error bounds for the Chebyshev series are well-established for smooth functions25,26. Here we calculate exact error bounds for certain specific functions.

The expression of f in terms of the Chebyshev polynomial basis is given by:

$$f(x)=\frac{1}{2}{\alpha }_{0}+\mathop{\sum }\limits_{l = 1}^{\infty }{\alpha }_{l}{T}_{l}(x),$$
(8)
$${\alpha }_{l}=\frac{2}{\pi }\mathop{\int}\nolimits_{0}^{\pi }f(\cos \theta )\cos l\theta d\theta ,$$
(9)

where Tl(x) is the lth Chebyshev polynomial of the first kind. As a result, the decay properties of f(H) can be estimated by

$$\begin{array}{ll}\;\;\;| f{(H)}_{ij}| \le {E}_{k}(f)\le {\left\Vert \mathop{\sum }\nolimits_{l = k+1}^{\infty }{\alpha }_{l}{T}_{l}(x)\right\Vert }_{x\in [-1,1]}\\ =\frac{2}{\pi }{\left\Vert \mathop{\sum }\nolimits_{l = k+1}^{\infty }\cos l\theta \mathop{\int}\nolimits_{0}^{\pi }f(\cos t)\cos lt\,dt\right\Vert }_{\theta \in [0,\pi ]}\\ =c\,F[f,k],\end{array}$$
(10)

where c is a factor normalizing F[f, 0] to 1.
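A numerical estimate of F[f, k] follows from Eqs. (8)–(10) by computing the Chebyshev coefficients with a quadrature rule and taking the maximum of the remainder on a grid. The sketch below uses a midpoint rule in θ with 1000 nodes; the node count, the function names, and the two test functions (a spectrum with a jump of 0.5 and a continuous spectrum with a kink, i.e., the a = 0 limit) are choices of this example. The grid maximum only approximates the true supremum.

```python
import math

M = 1000  # number of midpoint quadrature nodes in theta (illustrative choice)
THETAS = [(j + 0.5) * math.pi / M for j in range(M)]

def chebyshev_coefficients(func, l_max):
    """alpha_l of Eq. (9) for l = 0..l_max, via the midpoint rule in theta."""
    values = [func(math.cos(t)) for t in THETAS]
    return [2.0 / M * sum(v * math.cos(l * t) for v, t in zip(values, THETAS))
            for l in range(l_max + 1)]

def tail_bound(func, k, coeffs):
    """Grid maximum of |f - (alpha_0 / 2 + sum_{l=1..k} alpha_l T_l)|, cf. Eq. (10)."""
    worst = 0.0
    for t in THETAS:
        partial = 0.5 * coeffs[0] + sum(coeffs[l] * math.cos(l * t) for l in range(1, k + 1))
        worst = max(worst, abs(func(math.cos(t)) - partial))
    return worst

def localization_functional(func, k_max):
    """F[f, k] for k = 0..k_max, normalized so that F[f, 0] = 1."""
    coeffs = chebyshev_coefficients(func, k_max)
    tails = [tail_bound(func, k, coeffs) for k in range(k_max + 1)]
    return [t / tails[0] for t in tails]

def jump(x):
    """Discontinuous spectrum: Theta(-x) (x - 0.5)."""
    return x - 0.5 if x < 0.0 else 0.0

def kink(x):
    """Continuous but non-differentiable spectrum: eigenvalues shifted to close the gap."""
    return min(x, 0.0)

F_jump = localization_functional(jump, 20)
F_kink = localization_functional(kink, 20)
```

Here `F_jump` levels off at a finite value while `F_kink` keeps decreasing with k, in line with the qualitative distinction between discontinuous and continuous spectra.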

Up to this point, we have obtained a functional F in Eq. (10) to analyze the localization properties of the Hamiltonian. F can be interpreted as follows: for any banded Hermitian matrix H with bandwidth m and eigenvalues in [−1, 1], if we apply a transformation f to H, then |f(H)ij| is bounded above by cF[f, k], where k is an integer satisfying mk < |i − j| ≤ m(k + 1). Although H is restricted to a banded matrix, the results presented in this section can be extended to general sparse matrices, provided that H is associated with a sparsely connected, degree-limited graph24.

Optimizing transform function f a,n

By substituting fa,n from Eq. (3) into F in Eq. (10), and using Eq. (8) to rewrite the infinite tail \(\mathop{\sum }\nolimits_{l = k+1}^{\infty }\) as f minus the constant term and the finite partial sum \(\mathop{\sum }\nolimits_{l = 1}^{k}\), we obtain the numerical results shown in Fig. 3.

Fig. 3: Decay of off-diagonal elements of transformed Hamiltonian.

a Decay properties of the m-banded Hermitian matrix H after transformation: |fa,n(H)ij| ≤ ca,nF[fa,n, k] for mk < |i − j| ≤ m(k + 1), where ca,n is a factor that normalizes F[fa,n, 0] to 1. We emphasize that the results apply to all m-banded Hermitian matrices. b, c Similar decay behavior as in (a), but with the transition region width a set to 0.5 and 0.25, respectively.

In Fig. 3a, the black solid line corresponds to the case where f(x) = Θ(−x)(x − 0.5), simulating a discontinuous eigenvalue spectrum with a gap of 0.5. This line does not decay to zero, indicating that, in some extreme cases, for the farthest two unit cells located at Ri and Rj, ‖H(Ri, Rj)‖2 converges to a nonzero value as Nk → ∞. The black dashed line represents F[f0,n, k], which corresponds to a continuous but non-differentiable spectrum. It decays rapidly for k ≤ 2, but more slowly for larger k. The colored solid lines in Fig. 3a represent F[f1,n, k]. These lines decay significantly faster than the black dashed line, indicating that the transform function f1,n is more effective than merely shifting the eigenvalues. Figure 3b, c show plots where the transition region width a is set to 0.5 and 0.25, respectively. These figures display similar behavior to the a = 1 case after rescaling, with larger a leading to faster decay of F.

There are two considerations when choosing the parameters a and n. First, each colored line in Fig. 3 exhibits an inflection point where F transitions from rapid to slower decrease. With small n, F decays quickly initially but reaches the inflection point early, leading to slower decay afterward. Conversely, larger n values result in a slightly slower initial decay but delay the inflection point, causing F to decay faster when k is sufficiently large. Second, for large a and n, the inverse function \({f}_{a,n}^{-1}(x)\) becomes ill-conditioned near x = 0, introducing more errors in the top bands. This necessitates including more bands in the SCF calculations. Based on our experience, setting n = 3 and

$$a=4(\mathop{\max }\limits_{{\bf{k}}}({\varepsilon }_{i{\bf{k}}})-\mathop{\min }\limits_{{\bf{k}}}({\varepsilon }_{i{\bf{k}}})),$$
(11)

where i is the index of the top band, provides a good balance between decay rate and the number of bands required for interpolation. Unless otherwise specified, our simulations will use this set of parameters. Further details regarding the choice of the parameter n are provided in the Results section.

Basis set transformation

The DFT Hamiltonian is usually too large to interpolate directly. We reduce the size of the Hamiltonian by changing to a relatively small, k-independent numerical basis set:

$${\psi }_{i{\bf{k}}}({\bf{r}})=\mathop{\sum }\limits_{\mu =1}^{{N}_{\mu }}{Q}_{\mu }({\bf{r}}){C}_{\mu ,i{\bf{k}}}.$$
(12)

Here, Nμ is the size of the basis set, \({\psi }_{i{\bf{k}}}({\bf{r}})={e}^{{\rm{i}}{\bf{k}}\cdot {\bf{r}}}{u}_{i{\bf{k}}}({\bf{r}})\) is the Bloch wavefunction in real space, and uik(r) is the periodic part within the unit cell. In Eq. (12), the decomposition is performed within the unit cell at R = 0, not the entire supercell. By using the basis set Q, we can perform Fourier interpolation on a smaller Nμ × Nμ matrix, making the process more efficient.

The simplest way to perform such a decomposition is singular value decomposition (SVD), but it is slow for large basis sets. We instead develop a specialized algorithm based on randomized QR factorization with column pivoting (QRCP)27, with technical details provided in Supplementary Material S1. Randomized QRCP is highly efficient, accounting for only a small fraction of the total computational time.

Compared to MLWFs, the basis functions Qμ(r) are independent of k, meaning that orbitals at all k-points share the same auxiliary basis. Changing to this basis set does not affect the decay properties of the Hamiltonian. On the other hand, a disadvantage of using Qμ(r) is that they are non-localized and cannot provide information about chemical bonds. Additionally, the size of this basis set is typically one order of magnitude larger than that of the Wannier basis set.
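To illustrate the decomposition in Eq. (12), the sketch below builds a shared orthonormal basis with a plain pivoted Gram-Schmidt procedure. This is a deliberately simplified stand-in, not the randomized QRCP algorithm used in this work, and it assumes real-valued orbitals sampled on a tiny grid. The function name and the toy orbitals are hypothetical.

```python
import math

def pivoted_gram_schmidt(vectors, tol=1e-10):
    """Greedy orthonormal basis Q for span(vectors) and coefficients C with v_i = sum_mu Q_mu C[mu][i]."""
    residuals = [list(v) for v in vectors]
    basis = []
    for _ in range(len(vectors)):
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        pivot = max(range(len(norms)), key=norms.__getitem__)
        if norms[pivot] < tol:
            break  # every orbital is already represented
        q = [x / norms[pivot] for x in residuals[pivot]]
        basis.append(q)
        # Remove the new direction from all residuals
        for r in residuals:
            overlap = sum(a * b for a, b in zip(q, r))
            for idx in range(len(r)):
                r[idx] -= overlap * q[idx]
    coeffs = [[sum(a * b for a, b in zip(q, v)) for v in vectors] for q in basis]
    return basis, coeffs

# Toy "orbitals" on a 4-point grid; the third is the sum of the first two
orbitals = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0],
            [1.0, 1.0, 1.0, 1.0]]
basis, coeffs = pivoted_gram_schmidt(orbitals)

# Rebuild each orbital from the shared basis, as in Eq. (12)
reconstructed = [[sum(basis[mu][r] * coeffs[mu][i] for mu in range(len(basis)))
                  for r in range(4)] for i in range(len(orbitals))]
max_error = max(abs(reconstructed[i][r] - orbitals[i][r])
                for i in range(len(orbitals)) for r in range(4))
```

Because the three toy orbitals span only two dimensions, the procedure stops after two basis functions; complex-valued orbitals would additionally require complex conjugation in the inner products.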

Hamiltonian transformation and time complexity

By combining the eigenvalue transformation function f with the change of basis set, we propose the Hamiltonian Transformation (HT) method to interpolate physical quantities such as the band structure. This method is outlined in Algorithm 1.

We construct the numerical basis Qμ(r) from DFT orbitals ψik(r) obtained on a uniform k-grid. To handle nonorthogonal orbitals from the projector augmented wave (PAW) method or ultrasoft pseudopotentials, HT computes the overlap matrix \({\tilde{S}}_{\mu \nu }\) in the basis Qμ(r) and builds the Hamiltonian Hk using the coefficients \({\tilde{C}}_{\mu ,i{\bf{k}}}\). An eigenvalue transform f then produces f(Hk) with enhanced real-space locality, which is Fourier-interpolated to the desired q-points. Finally, HT solves the generalized eigenproblem for f(Hq) with \({\tilde{S}}_{\mu \nu }\) and recovers the true eigenvalues via the inverse transform f⁻¹.

Algorithm 1

Hamiltonian transformation for band structure calculation

Input : uniform grid {k}, nonuniform path {q},

 eigenvalues {εik}, eigenvectors {ψik(r)}, overlap matrix \(S({\bf{r}},{{\bf{r}}}^{{\prime} })\)

Output: {εiq}

1. Construct the numerical basis set;

 ψik(r) = ∑μQμ(r)Cμ,ik;

2. Construct the explicit Hamiltonian;

\({\tilde{S}}_{\mu \nu }=\int\,d{\bf{r}}d{{\bf{r}}}^{{\prime} }{Q}_{\mu }^{* }({\bf{r}})S({\bf{r}},{{\bf{r}}}^{{\prime} }){Q}_{\nu }({{\bf{r}}}^{{\prime} })\);

\({\tilde{C}}_{\nu ,i{\bf{k}}}={\sum }_{\mu }{\tilde{S}}_{\nu \mu }{C}_{\mu ,i{\bf{k}}}\);

\(f({H}_{{\bf{k}},\mu \nu })={\sum }_{i}f({\varepsilon }_{i{\bf{k}}}){\tilde{C}}_{\mu ,i{\bf{k}}}{\tilde{C}}_{\nu ,i{\bf{k}}}^{* }\);

3. Fourier interpolate the Hamiltonian;

\(f({H}_{{\bf{q}},\mu \nu })=\frac{1}{{N}_{k}}{\sum }_{{\bf{k}},{\bf{R}}}f({H}_{{\bf{k}},\mu \nu }){e}^{{\rm{i}}({\bf{k}}-{\bf{q}})\cdot {\bf{R}}}\);

4. Diagonalize the interpolated Hamiltonian;

\(f({H}_{{\bf{q}},\mu \nu })={\sum }_{i}f({\varepsilon }_{i{\bf{q}}}){\tilde{C}}_{\mu ,i{\bf{q}}}{\tilde{C}}_{\nu ,i{\bf{q}}}^{* }\);

5. Recover the eigenvalues;

 εiq = f⁻¹(f(εiq));
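Step 5 requires the inverse of the transform function. Since no closed form for f⁻¹ is given, the sketch below inverts Eq. (3) numerically by bisection, using the fact that f is monotonically increasing below ε. The function names, the iteration count, and the convention of mapping non-negative inputs to ε are assumptions of this example; the text does not specify how the inverse is evaluated.

```python
import math

def f(x, a, n, eps=0.0):
    """Transform function, Eq. (3), evaluated at y = x - eps; requires a > 0."""
    y = x - eps
    if y >= 0.0:
        return 0.0
    if y < -a:
        return y + a / 2.0
    u = n * (y / a + 0.5)
    gauss = 2.0 * a * (math.exp(-n * n / 4.0) - math.exp(-u * u)) / (math.sqrt(math.pi) * n)
    linear = (2.0 * y + a) * (math.erf(n / 2.0) - math.erf(u))
    return (gauss + linear) / (4.0 * math.erf(n / 2.0))

def f_inverse(y, a, n, eps=0.0, iterations=200):
    """Numerical inverse of f on x <= eps (assumed convention: y >= 0 maps to eps)."""
    if y >= 0.0:
        return eps
    if y <= -a / 2.0:
        return eps + y - a / 2.0        # linear region: f(x) = (x - eps) + a / 2
    lo, hi = eps - a, eps               # transition region: bisect on the monotone f
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid, a, n, eps) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip for eigenvalues in the linear region and in the transition region
a, n = 1.0, 3.0
samples = [-3.0, -0.7, -0.2]
recovered = [f_inverse(f(x, a, n), a, n) for x in samples]
```

Eigenvalues very close to ε are recovered less reliably because f becomes nearly flat there, so small errors in the transformed eigenvalue translate into large errors in the recovered one.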

High-throughput accuracy tests

To verify the effectiveness of HT and compare it with WI, we perform high-throughput calculations using a database28 containing 200 materials that span a wide range of structural and chemical spaces. Among these materials, 187 have at least 6 bands around the Fermi level with entangled band structures and are selected for our tests. We use the SCDM method to construct MLWFs within the WI framework. The free parameters in the SCDM method are determined using an automatic projection procedure14,28. To evaluate the interpolation accuracy, we exclude the highest m bands and calculate the mean absolute error (MAE) of the remaining eigenvalues using:

$$\,{\text{MAE}}\,=\frac{\mathop{\sum }\nolimits_{i = 1}^{{N}_{b}-m}{\sum }_{{\bf{k}}}| {\varepsilon }_{i{\bf{k}}}^{\,{interpolation}\,}-{\varepsilon }_{i{\bf{k}}}^{\,{benchmark}\,}| }{{N}_{k}({N}_{b}-m)}.$$
(13)
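Equation (13) translates into a short routine. The sketch assumes the eigenvalues are stored as a list of bands ordered by energy, each band being a list over k-points; this layout, the function name, and the numbers in the example are illustrative only.

```python
def mean_absolute_error(interpolated, benchmark, m):
    """Eq. (13): MAE over all k-points and all bands except the highest m."""
    n_bands = len(interpolated) - m     # N_b - m bands enter the average
    n_k = len(interpolated[0])
    total = sum(abs(interpolated[i][k] - benchmark[i][k])
                for i in range(n_bands) for k in range(n_k))
    return total / (n_k * n_bands)

# Made-up eigenvalues: 3 bands, 2 k-points; the top band is excluded with m = 1
interpolated = [[0.0, 1.0], [2.0, 3.0], [9.0, 9.0]]
benchmark = [[0.1, 1.0], [2.0, 2.7], [0.0, 0.0]]
mae = mean_absolute_error(interpolated, benchmark, m=1)
```

The large deviation in the excluded top band does not enter the result, which is the purpose of discarding the highest m bands.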

In our calculations, we set m = 4 and use the non-self-consistent field (non-SCF) DFT band structures as the benchmark. Besides HT and WI-SCDM, we also test a combined approach where we apply the transformation function within the WI-SCDM method. Specifically, we transform the eigenvalues before applying WI-SCDM and then transform them back after the interpolation. We set n = 3 for the transform function f and refer to this method as “WI-SCDM-f”.

We compute the entangled band structures from the database using WI-SCDM, WI-SCDM-f, and HT, then calculate the MAE of the interpolated eigenvalues and present the cumulative frequency histogram of the MAE in Fig. 4a. The x-axis displays the MAE on a logarithmic scale from 10⁻⁵ to 10⁻¹, and the y-axis shows the frequency (count) of occurrences for each error magnitude. The overall distribution for each method forms a peak, emphasized by an envelope curve. WI-SCDM exhibits the largest errors, with its peak around 10⁻² eV. Through eigenvalue transformation, WI-SCDM-f slightly outperforms WI-SCDM, demonstrating that incorporating f into the WI-SCDM workflow yields more accurate model Hamiltonians. HT, however, significantly outperforms both, with its peak around 10⁻⁴ eV, indicating much lower errors. We also study the effect of n of the transform function f. As n increases from 1 to 4, the peak of the HT error distribution shifts progressively leftward. The largest improvement occurs between n = 1 and n = 3, with diminishing returns beyond n = 3, suggesting a practical optimum at n = 3.

Fig. 4: High-throughput mean absolute error (MAE) distribution and Hamiltonian decay behavior.

a Histogram of MAEs for WI-SCDM, WI-SCDM-f, and HT with n = 1–4 across 187 materials with entangled bands. HT yields the lowest MAEs, with the distribution shifting left as n increases, and outperforms WI-SCDM and WI-SCDM-f methods. The largest HT error is 7 × 10⁻³ eV for CBe2, which is further analyzed in Fig. 5(a). b Decay properties of Hamiltonians in high-throughput calculations. Generally, HT Hamiltonians exhibit faster decay than WI-SCDM and WI-SCDM-f Hamiltonians.

Furthermore, we present the decay properties of the Hamiltonians from high-throughput calculations in Fig. 4b. The x-axis represents R, and the y-axis shows ‖H(R, 0)‖2/‖H(0, 0)‖2, indicating the relative strength of Hamiltonian elements as a function of distance. Since we are interpolating entangled band structures, the Hamiltonian elements do not decay exponentially but rather exhibit an initial rapid decay within the first 20–30 Å, followed by a slower, long-range decay. The WI-SCDM and WI-SCDM-f tight-binding Hamiltonians are projected onto coarser k-point grids, resulting in fewer data points compared to the HT Hamiltonians. Both WI-SCDM and WI-SCDM-f Hamiltonians display a similar decay trend, with values ranging from 10⁻⁵ to 10⁻³ when R = 20 Å. In contrast, the HT Hamiltonians show a wider spread, ranging from 10⁻⁶ to 10⁻³ at R = 20 Å. Overall, we observe that the HT Hamiltonians exhibit the fastest decay rate.

To further analyze the performance of HT, we focus on CBe2, where HT exhibits the largest MAE among all 187 structures. Figure 5a shows the band-resolved MAE distribution for CBe2. In the high-throughput calculation of Fig. 4a, using a plane-wave cutoff energy Ecut of 45 Ry, HT reaches an MAE of 7 × 10⁻³ eV. Increasing Ecut to 90 Ry reduces the MAE of HT to below 10⁻³ eV for most bands. In contrast, both WI-SCDM and WI-SCDM-f show negligible change with Ecut, indicating their dominant error arises from the disentanglement procedure rather than plane-wave convergence. Therefore, the poor performance of HT on some materials is primarily due to insufficient cutoff energy. Raising Ecut significantly reduces the interpolation error.

Fig. 5: Case studies of interpolation accuracy and k-point convergence.

a Band-resolved MAE for CBe2, the material for which HT exhibits the largest errors in our dataset. Increasing the plane-wave cutoff energy Ecut from 45 Ry to 90 Ry dramatically reduces the HT MAE, while the MAEs of WI-SCDM and WI-SCDM-f remain essentially unchanged. b GW quasiparticle band structures for silicon, with HT showing the best agreement with the benchmark of inteqp. An extremely sparse k-point mesh is used here, and the significant errors in WI-SCDM and WI-SCDM-f indicate they require a much larger Nk to achieve sufficient accuracy. c MAE of silicon as a function of Nk. HT outperforms WI-SCDM and WI-SCDM-f, with its error rapidly decreasing as Nk increases.

We also observe that the MAE in Fig. 5a increases with the band index, with the highest-energy bands showing the largest errors. This band-dependent behavior arises from two factors: (1) the top bands are entangled with higher-energy bands that are excluded from the interpolation; (2) in HT, the slope of f vanishes near these bands, making the inverse transform f⁻¹ ill-conditioned in that region. This issue can be mitigated by including additional bands in the calculation and discarding them after interpolation.

The GW quasiparticle Hamiltonian is more non-local than the DFT Hamiltonian. We perform calculations on Si2 to compare the performance of different methods. To make the interpolation errors more apparent, we intentionally choose a very sparse k-point mesh (5 × 5 × 5). The results are shown in Fig. 5b. The red points represent benchmarks obtained using the inteqp method from BerkeleyGW29, which requires additional information (the orbitals on fine k-point grids) compared to WI-SCDM and HT. The WI-SCDM results (orange lines) display visible errors, but these errors are reduced after applying the transformation (green lines). The HT band structures (blue lines) show the best agreement with the red benchmark points. It should be noted that the errors shown in Fig. 5b do not indicate failure of the two WI-based methods; rather, they merely require a significantly larger Nk to achieve comparable accuracy.

We test the accuracy of HT and WI-SCDM with respect to Nk by performing DFT calculations on silicon, increasing Nk, and comparing their MAEs for the lowest 8 bands along the path between Γ and X. The results are shown in Fig. 5c. We observe that WI-SCDM exhibits the lowest accuracy, and introducing the transformation function improves its performance. However, both methods encounter a bottleneck: when Nk reaches a certain threshold, their MAEs decrease much more slowly and begin to oscillate. In contrast, HT is more accurate than both WI-SCDM and WI-SCDM-f, and its accuracy can be systematically improved by increasing Nk. Furthermore, the MAEs of HT in Fig. 5c display decay patterns similar to those of the lines in Fig. 3a. Specifically, when Nk is small, a smaller n leads to a smaller MAE, whereas when Nk is large, a larger n results in a smaller MAE. This similarity further verifies the theoretical results.

Computational time scaling and performance

The theoretical time complexity of HT is shown in Table 1. Here, Nr represents the number of real-space grid points, Nμ is the size of the new basis set, and Nk is the number of SCF k-points. Additionally, Nb and Nq denote the number of bands and the number of k-points in the band structure calculation, respectively. Assuming that Nr, Nμ, and Nb are proportional to the number of electrons Ne, and Nq remains constant, the total time complexity of HT is \({\mathcal{O}}\left({N}_{e}^{3}{N}_{k}\log ({N}_{k})\right)\). HT and WI share the same time complexity, but their speed differs due to two factors: HT does not rely on run-time optimization, while WI uses a smaller basis set.

Table 1 Theoretical time complexity of various procedures in Hamiltonian transformation

We perform tests on the Si8 system by varying Nk to compare the time complexity of HT and WI-SCDM. The tests are conducted on a single CPU core with parallelization disabled. In Fig. 6a, although HT uses a larger basis set, it is still faster, requiring less computational time and exhibiting a lower scaling of \({N}_{k}^{0.62}\). In contrast, WI-SCDM requires run-time optimization, making it slower and showing a scaling of \({N}_{k}^{0.96}\). Theoretically, HT is expected to scale linearly with Nk, but we observe sublinear scaling. The reason is that the key computational steps of HT depend on the size of numerical basis set Nμ instead of Nk, and Nμ scales sublinearly with respect to Nk. Specifically, as Nk approaches infinity, Nμ tends toward a constant. Additional tests on Nμ are provided in Supplementary Material S2. We expect that when Nk becomes large enough, the steps that scale linearly with Nk will dominate the computational time of HT, causing the observed results to align with the theoretical scaling.
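Scaling exponents such as those quoted above are typically extracted from a linear fit in log-log space. The sketch below shows one way to do this with an ordinary least-squares fit. The timings are synthetic, generated from an exact power law purely to check the routine; they are not the measured data of Fig. 6a, and the function name and grid sizes are illustrative.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of log(time) = log(c) + p * log(size); returns (p, c)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    prefactor = math.exp(mean_y - slope * mean_x)
    return slope, prefactor

# Synthetic timings (NOT measured data): an exact power law with exponent 0.62
n_k_values = [8, 27, 64, 125, 216]
synthetic_times = [2.0 * nk ** 0.62 for nk in n_k_values]
exponent, prefactor = fit_power_law(n_k_values, synthetic_times)
```

With real timings, the fitted exponent depends on the range of Nk covered, which is consistent with the expectation that the apparent scaling changes once the terms linear in Nk dominate.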

Fig. 6: Computational scaling and timing comparison of HT versus WI-SCDM.

a Computational time as a function of Nk for HT and WI-SCDM on the Si8 system, performed on a single CPU core. Despite using a larger basis set, HT demonstrates faster performance and a lower scaling compared to WI-SCDM. b Actual computational time in high-throughput calculations for HT and WI-SCDM. HT runs on a single CPU core, while WI-SCDM utilizes 16 and 32 CPU cores for different tasks. HT is more efficient for large systems, whereas WI-SCDM performs better for smaller systems.

Furthermore, we present the computational time for both HT and WI-SCDM in the high-throughput calculations, as shown in the cumulative frequency histogram of Fig. 6b. Currently, HT does not support MPI parallelization and runs on a single CPU core. The WI-SCDM calculations use 16 CPU cores for computing the overlap and projection matrices with pw2wannier90.x, and 32 CPU cores for constructing MLWFs with wannier90.x. The runtime for both methods typically falls between 10² and 10³ seconds, with WI-SCDM being faster for small systems but slower for larger ones. In HT, the primary bottleneck is the construction of overlap matrices and the explicit Hamiltonian when using the PAW method, which accounts for more than 50% of the total time.

Discussion

The localization of the Hamiltonian is the primary factor influencing interpolation accuracy. HT eliminates the need for the complex runtime optimization procedures required in WI by directly localizing the Hamiltonian through a pre-optimized eigenvalue transformation. By employing this transformation, HT restores the localization of the Hamiltonian and achieves significantly higher accuracy than WI-SCDM. In our tests, HT demonstrates superior performance in handling entangled bands and GW quasiparticle band structures, providing both improved accuracy and efficiency. HT offers a robust and efficient alternative to WI-SCDM, particularly for complex electronic structure calculations. Moreover, WI-SCDM-f, which integrates the transform function f with the WI-SCDM method, produces model Hamiltonians that are more accurate than those obtained by WI-SCDM alone.

Methods

Code implementation

The HT method is implemented in Quantum ESPRESSO (QE)30,31,32. Currently, NUFFT and iterative diagonalization are not yet implemented in the code; they are temporarily replaced by matrix multiplication and direct diagonalization, respectively. DFT calculations are performed using QE with the Perdew-Burke-Ernzerhof (PBE) functional within the generalized gradient approximation (GGA)33. Quasi-particle energies at the GW level are computed using BerkeleyGW29,34. Wannier interpolations are performed with Wannier90 (ref. 5).

Parameters of calculation

In the high-throughput calculations, pseudopotentials from the SSSP efficiency library (version 1.1, PBE functional)35 are used, along with the recommended energy cutoffs. The k-point mesh is chosen with a spacing of 0.2 Å⁻¹. For other DFT calculations, the optimized norm-conserving Vanderbilt (ONCV) pseudopotentials36 are used. In the test of Fig. 5b, we use a cutoff energy of 25 Ry and sp³ projections for constructing MLWFs. In the test of Fig. 5c, the cutoff energy is 100 Ry, with SCDM-μ = 10 and SCDM-σ = 2.