Abstract
Quantum computation of the energy of molecules and materials is one of the most promising applications of fault-tolerant quantum computers. Practical applications require development of quantum algorithms with reduced resource requirements. Previous work has mainly focused on quantum algorithms where the Hamiltonian is represented in second quantization with compact basis sets while existing methods in first quantization are limited to a grid-based basis. In this work, we present a new method to solve the generic ground-state chemistry problem in first quantization using any basis set. We achieve asymptotic speedup in Toffoli count for molecular orbitals, and orders of magnitude improvement using dual plane waves as compared to the second quantization counterparts. In some instances, our approach provides similar or even lower resources compared to previous first quantization plane wave algorithms that, unlike our approach, avoids the loading of the classical data. The developed methodology can be applied to variety of applications, where the matrix elements of a first quantized Hamiltonian lack simple circuit representation.
Similar content being viewed by others
Introduction
Fault-tolerant quantum computing holds the potential to push the boundaries of quantum chemistry calculations1,2,3,4,5,6,7,8,9,10,11 due to efficient quantum algorithms, such as quantum phase estimation (QPE)12 that, given a sufficient initial state preparation13, allows for a nearly exact estimation of the energy of a molecule or a material in polynomial time. Specifically, QPE with qubitization is the leading approach14,15,16 which requires the lowest quantum resources for quantum chemistry problems2,8,17,18,19. In order to use this algorithm, one must map the Hamiltonian and wavefunction into the qubit representation and also embed the Hamiltonian into the block of a larger unitary matrix using, for example, a linear-combination-of-unitaries (LCU) decomposition. The computational cost of the quantum algorithm is determined by the efficiency of these procedures, which in turn depends on the representation of the quantum chemistry Hamiltonian and whether it is in first or second quantization.
The most studied approach in the context of quantum computation is second quantization. In this formalism, the anti-symmetry of the electronic wavefunction is encoded into the creation and annihilation operators. These operators can be mapped into the qubit representation using, for example, the Jordan-Wigner transformation20,21. Conveniently, the occupation number wavefunction maps directly onto the qubit basis, requiring 2D qubits for a system with 2D spin orbitals. The resulting computational cost does not explicitly depend on the number of electrons. Many LCU decompositions of the second quantized Hamiltonian exist, including the sparse representation19, single factorization19, double factorization18,22, and tensor hyper-contraction17. All of these methods, based on the factorization of the two-body term, are valid for any basis functions and can take advantage of state-of-the-art quantum chemistry basis sets.
There is, however, a tempting alternative. The first quantization formalism requires \(N{\log }_{2}2D\) qubits to represent the wavefunction, where N is the number of electrons. As a result, for a fixed N, first quantization offers an exponential improvement in the scaling of the number of system qubits with respect to the number of orbitals23. Therefore, we can increase the number of orbitals to obtain a better approximation of the continuum limit, with little increase in the number of system qubits16. Furthermore, unlike methods developed in second quantization, the first quantization approach is equally applicable to bosonic problems with fixed number of particles as it is to fermionic ones; the only difference is the initial state preparation of the wavefunction which accounts for the symmetry.
Previous work on qubitization in first quantization is limited to plane-wave (PW) basis sets24,25 (for other approaches see refs. 26,27,28,29,30,31). The electronic integrals have a simple analytical form in the PW basis set, which makes it possible to avoid the loading of classical data and instead use quantum arithmetic circuits to estimate the parameters of the Hamiltonian on the quantum computer. These methods have been developed further32,33 to implement the Goedecker-Teter-Hutter pseudopotentials34,35. However, one of the main disadvantages of these approaches from an applications point of view is their inability to treat active regions of molecules or materials - one of the main tasks in computational quantum chemistry. While pseudopotentials do separate electrons into active (valence) and frozen (core) electrons, this is not the same as a more general scheme for active space construction used in quantum chemistry. The other disadvantage is that it might not be possible to incorporate modern pseudopotentials36,37 or PAW method38, which do not have a simple analytical representation, without the classical-data loading. Therefore, developing approaches in first quantization for advanced quantum chemistry basis sets and exploring its advantages over second quantization in the context of QPE is of great interest.
In this work, we present a qubitization-based QPE implementation of the quantum chemistry Hamiltonian in first quantization, with an arbitrary basis set. We provide the first explicit LCU decomposition in first quantization that makes use of a sparse representation of the Hamiltonian. While we focus on electronic Hamiltonians, our result is general for any Hamiltonian in first quantization. We then explore two basis sets: molecular orbitals spanned on Gaussian-type orbitals and dual plane waves (DPW)39,40,41. For molecular orbitals, we compare quantum resource estimates of QPE with qubitization for active space calculations between sparse implementations in first and second quantization19. One of the main quantum primitives in sparse qubitization is advanced quantum read only memory (QROAM19,42,43) which allows for a trade-off between the number of qubits and Toffoli gates. Irrespective of the QROAM implementation, we surprisingly observe a polynomial speedup with respect to the number of basis functions, D, in the Toffoli count for a given number of electrons. For the implementation of QROAM that minimizes the number of logical qubits, the first quantization approach naturally provides exponential improvements in the qubit count, while for the implementation of QROAM that minimizes the number of Toffoli gates, we observe similar logical qubit count in both first and second quantization. The speedup in Toffoli-gate count is due to a lower subnormalization factor of the sparse LCU in first quantization. For DPWs, we also analyze the uniform electron gas and materials in the physically relevant regimes, and for DPWs, we observe a speed up of many orders of magnitude in both the logical qubit count as well as Toffoli counts over the second quantization counterpart. For some cases, we also observe resource reduction as compared to a first quantization plane wave (PW) implementation25 which avoids the data loading. To our best knowledge, this is the first work that implements first quantization in any basis sets for qubitized QPE.
Results
LCU decomposition in first quantization
In first quantization, the Hamiltonian of interacting identical particles can be written as follows
Here, N is the number of particles, D is the number of basis functions (orbitals) and σ and τ are spin indices. The one-body and two-body matrix elements, hpq and hpqrs, are independent of particle number and spin. The subscript on the operator \({(\vert p\sigma \rangle \langle q\sigma \vert )}_{i}\) indicates that the operator acts on the ith particle:
where I2D is the identity operator acting on vector space \({{\mathbb{C}}}^{2D}\).
To block encode the Hamiltonian for QPE with qubitization, we can express this Hamiltonian as a linear combination of unitary matrices (LCU),
where Uα are unitary matrices with coefficients ωα. The one-norm of the LCU,
is the subnormalisation of the LCU block encoding.
As a choice for unitary matrices we consider Pauli strings, which are tensor products of the Pauli matrices:
To express the Hamiltonian as an LCU with Pauli strings, we use the generic Pauli LCU decomposition presented in ref. 44, which is valid for square matrices of dimension D = 2M where M is a natural number. The Pauli LCU decomposition of the one-body term then reads as
where \(\mu (p,q)=\mathop{\sum }\nolimits_{k=0}^{M-1}{p}_{k}\wedge {q}_{k}\), ∧ is the AND operation, Xa and Za are Pauli operators acting on qubit a:
and pk is the value of the kth bit of the standard binary representation of p, likewise for qk. The action on the spin index is the identity, so there are \(N\log D\) system qubits. The coefficients ωpq of the Pauli strings are given by an XOR transform of the indices followed by a Hadamard transform,
where ⊕ is the bitwise XOR operation and H is a Hadamard matrix, defined as:
In quantum computation, the Hadamard matrix is defined with a refactor of \(\frac{1}{\sqrt{2}}\) unlike our definition. While the Eq. (II.6) is written using the conventional definition of Pauli strings, we find it more convenient to cancel − i with i and work with a slightly different form:
For a real Hamiltonian, anti-Hermitian terms in Eq. (II.10) have coefficients ωpq equal to zero, and all remaining unitary matrices are also Hermitian. From now on, we will refer to this form of the decomposition as Pauli LCU. The two-body LCU decomposition is
where the two-body Pauli LCU coefficients ωpqrs, are
Eq. (II.11) is a generalization of ref. 44 to a 4-dimensional tensor and details of this generalization are given in “Two-body LCU derivation”. Eq. (II.8) and (II.12) are manipulated into a matrix multiplication so that the fast Walsh-Hadamard transform can be used to efficiently find the LCU coefficients44.
Coefficients that correspond to the same Pauli string can be combined to reduce the one-norm of the LCU and data loading cost. When p = q = 0 or when r = s = 0, the two-body LCU repeats Pauli strings from the one-body LCU. Moreover, all terms that give the identity may be removed; these terms shift the eigenvalues of the Hamiltonian by a constant and do not need to be input on the quantum computer. These changes give the canonicalized one-body LCU,
where
The canonicalized two-body LCU is
where
The removed terms proportional to the identity are
The cost of an LCU block encoding depends on the number of unique, non-zero coefficients, L, and the one-norm, λ. The upper bound of the number of non-zero terms from Eq. (II.10) and (II.11) is L = D4 + D2. The one-norm of the canonicalized LCU is
In the case of dense matrix elements hpqrs, such that \(\sum _{pqrs}| {h}_{pqrs}| =O({D}^{4})\), we expect λ = O(N2D3). This is because in Eq. (II.12), \({H}^{\otimes M}/\sqrt{D}\) is the unitary transformation, and there is an extra prefactor 1/D. We confirm this speedup numerically in “Molecular orbitals. Scaling properties for basic models” for dense matrix elements. However, for chemical applications, hpqrs is sparse and λ depends on the particular basis set and symmetries presented in the system. Our numerical simulations (“Molecular orbitals. Scaling properties for basic models” and “Scaling properties using dual plane waves”) show that for molecular orbitals we observe less than a 1/D speedup, while for dual plane waves we achieve more than 1/D compared to second quantization for a fixed N. If D = O(N), the one-norm in second quantization scales better than in first quantization in the examples we studied. However, for a fixed N, λ can be smaller than the one-norm in second quantization not only in the asymptotic regime, but also for a practical size of the basis set. We present detailed numerical analyzes for different basis sets in “Molecular orbitals. Scaling properties for basic models” and “Scaling properties using dual plane waves” and show that for dual plane waves this is indeed the case.
So far, our results are completely general for any Hamiltonian. We used no symmetries of the tensors hpq and hpqrs, or of the coefficients ωpq and ωpqrs, to derive Eq. (II.10) through Eq. (II.18). In “Matrix elements and symmetries in different basis sets”, we discuss the consequences of different symmetries and the choice of the basis functions on the LCU.
Matrix elements and symmetries in different basis sets
Arbitrary basis sets
In quantum chemistry, the two-body matrix elements of the Hamiltonian can be computed as
where ϕq(r) are molecular orbitals and V is the supercell in the case of a periodic solid, or is \({{\mathbb{R}}}^{3}\) for an isolated molecule. Often, these orbitals are chosen to be real and in this case the two-body matrix elements possess the 8-fold symmetry
One-body matrix elements are symmetric for real orbitals,
Data loading is often a dominant cost of qubitized QPE. To reduce this cost, symmetries can be exploited to load only the unique, non-zero LCU coefficients. Next, we discuss how the symmetries of real matrix elements, given in Eq. (II.20) and (II.21), affect the one- and two-body LCU coefficients in turn.
The one-body LCU coefficients do not retain the two-fold symmetry of the one-body matrix elements. However, the Hadamard and XOR transforms of the matrix elements convert the symmetry into zero coefficients. The number of non-zero matrix elements for the one-body term is less than or equal to \(\frac{1}{2}D(D+1)\)44. By examining Eq. (II.11), it can be seen that the two-body LCU coefficients retain a two-fold symmetry,
This symmetry simplifies the canonicalized one-body coefficients so that
Accounting for the XOR and Hadamard transform of the LCU, the remaining two-fold symmetry and canonicalisation, the number of unique, non-zero, two-body LCU coefficients is less than or equal to D(D + 1)(D − 1)(D + 2)/8. Indeed, assuming all original tensor elements are non-zeros, for each pair (p, q) that corresponds to non-zero value of ωpqrs we have K ≔ D(D + 1)/2 pairs of (r, s). This corresponds to K2 non-zero coefficients. Taking into account the symmetry of ωpqrs under (p, q) ↔ (r, s) and canonicalization (II.16), we obtained the total number of unique, non-zero coefficients to be (K2 − K)/2 = D(D + 1)(D − 1)(D + 2)/8.
Basis sets diagonalizing the Coulomb potential
We next consider the Hamiltonian in basis sets which diagonalize the Coulomb potential, such as dual plane waves. Then,
where Tpq and Vpr are symmetric matrices. This Hamiltonian transforms to a Pauli LCU as:
with the LCU coefficients
and
As before, symmetries of the original one-body matrix elements manifest in the sparsity of ωpq, while two-body coefficients γpr remain symmetric. Similarly to the previous case, there is double-counting of Pauli strings in one- and two-body terms. By removing contributions proportional to the identity, the canonicalized Hamiltonian in the Pauli basis can be written as follows
where the one-body and two-body matrix elements are now
Here we again disregard terms with identity Pauli-strings, since they only contribute a constant shift of the eigenvalue. An example of such basis sets are DPWs39,40,41, which exhibit additional translation symmetry. This symmetry manifests in the sparsity of Pauli LCU coefficients as will be numerically shown in “Scaling properties using dual plane waves”.
Block encoding
In qubitized QPE, the Hamiltonian is embedded into a block of the larger unitary matrix U with subnormalization factor λ:
This unitary is then used to construct the quantum-walk operator,
which has eigenvalues related to the spectrum of the Hamiltonian, \({\rm{Eigenvalue}}{[\hat{W}]}_{i}={e}^{\pm \arccos \left({E}_{i}/\lambda \right)}\)15,16.
In a circuit implementation of the block encoding of \(\hat{U}\), the value of the ‘\(*\)’ blocks is irrelevant. We will use a linear combination of unitaries (LCU) approach to constructing the block encoding. A block encoding can be constructed for an LCU \(\hat{H}=\mathop{\sum }\nolimits_{l=0}^{L-1}| {a}_{l}| {\hat{U}}_{l}\) by combining the operators PREP (state preparation) and SELECT:
Typically, the dominant cost is \(O(\sqrt{L}\lambda /{\epsilon }_{{\rm{QPE}}})\), which assumes a QROAM-based implementation of PREP19.
First, we will show the circuit constructions for the sparse qubitization in first quantization for arbitrary basis sets, based on the Hamiltonian (II.13) and (II.15), and then, we discuss how the circuits cost reduces for the Hamiltonian with a diagonal Coulomb interaction (II.28).
Arbitrary basis sets
Let us write both the one-body (II.13) and the two-body (II.15) Hamiltonian as a single sum of i ≠ j. For the one-body term, we use s = r = 0:
with coefficients
The inequality (p, q) < (r, s) should be understood as inequality of composite indices and selects one branch of the symmetry (p, q) ↔ (r, s). The factor of 1/(N − 1) in the one-body coefficients cancels the extra sum over j.
A priori, there are L = D4 coefficients apqrs to load. However, the coefficients are very sparse with many zeros or values close to zero that can be truncated, resulting in an effective L < D4, which can be harnessed by using ideas from the sparse qubitization scheme19. Therefore, we will load L coefficients al, along with index value p(l), q(l), r(l), s(l) for use in SELECT.
The SELECT operator. We begin by showing a circuit construction for a SELECT operator selecting \(\mathop{\prod }\nolimits_{k=0}^{M-1}{Z}_{iM+k}^{{q}_{k}}\). We denote the circuit as \({{\rm{SELECT}}}_{i,q}[\mathop{\prod }\nolimits_{k=0}^{M-1}{Z}_{iM+k}^{{q}_{k}}]\) where the indices are the control registers for the SELECT operation. Our approach is to iterate over i with a unary iteration gadget43. For each value of i, we apply \(M={\log }_{2}D\) CCZ gates, each of which targets the iM + k’th system qubit (k = 0, …, M − 1) and is controlled on the unary iterator qubit specifying the current value of i as well as the k’th qubit of the q register. This is illustrated in the following, where we only show the unary iterator qubit and hide all other ancillary qubits and inner workings of the unary iteration. Between the first two cnots, the unary iterator qubit is \(\left\vert 1\right\rangle\) if i = 0. Between the second two cnots, the unary iterator qubit is \(\left\vert 1\right\rangle\) if i = 1, etc.

The above circuit can be adjusted to implement \({\rm{SELECT}}_{i,p,q}[\mathop{\prod }\nolimits_{k=0}^{M-1}{X}_{iM+k}^{{p}_{k}}{Z}_{iM+k}^{{q}_{k}}]\). Rather than implementing it as a product of two circuits (II.36) for q and p, we combine them to save the cost of a second unary iteration: A CCX gate is added to the left of each CCZ gate in (II.36), and is controlled on the relevant qubit of the \(\vert p\rangle\) register instead of the \(\vert q\rangle\) register. This has cost
when using measurement based uncomputation for the unary iteration. The (N − 1) Toffolis and ancilla qubits are internal to the unary iteration.
For the LCU (II.34), we use the circuit twice to implement the product
The same circuit can be used for both one-body and two-body LCU terms, by setting the control registers r = s = 0 in the one-body terms, as is taken care of in (II.35).
The work of25 uses a different approach to SELECT: Rather than implementing gates for each value of i in turn (II.36), the part of the system register belonging to electron i is swapped towards the top, then the gates are implemented on that register once, and it is swapped back. In our case this is not advantageous as we are not carrying out arithmetic on the system register and the CCX and CCZ gates are cheaper than the swaps and extra unary iteration for the unswaps would be.
The PREPARE operator. As mentioned, we use a sparse scheme to reduce the data loading cost in the face of many zero coefficients or small coefficients that are truncated to zero. For this, the non-zero coefficients ap,q,r,s from the LCU (II.34) are indexed by l ∈ {0, …L − 1} in arbitrary order. We use the sparse scheme to construct the PREP operator with QROAM and coherent alias sampling, see refs. 17,19,45. The construction of the PREP circuit is largely unchanged as compared to sparse qubitization in second quantization and the main modification is an additional step for preparing the equal superposition over electron numbers. For completeness, we provide a detailed description below:
-
1.
Prepare an equal superposition state \(\frac{1}{\sqrt{L}}\mathop{\sum }\nolimits_{l=0}^{L-1}\left\vert l\right\rangle\) of dimension L via amplitude amplification (see “Amplitude amplification for equal superpositions”).
-
2.
Data lookup: Use a QROAM indexed on \(\vert l\rangle\) to load indices and sign \(\vert p,q,r,s,\theta \rangle\), alternative indices and sign, and keep probabilities for the coefficients apqrs. This is the dominant cost of PREP in both time and space.
-
3.
Coherent alias sampling: Perform swaps between indices and alternative indices based on an inequality test between keep probability and an equal superposition state. The sign qubit does not have to be swapped but the phase can be applied directly (omitted during uncomputation).
-
4.
Prepare an equal superposition state \(\frac{1}{\sqrt{N(N-1)}}\mathop{\sum }\nolimits_{i\ne j}^{N-1}\vert i\rangle \vert j\rangle\). This can be done by preparing an equal superposition state on \(2\lceil {\log }_{2}N\rceil\) qubits and doing amplitude amplification, see “Amplitude amplification for equal superpositions”.
The SELECT operator from (II.38) should be controlled on the successful outcome of the uniform superposition preparations in step 1 and 4, requiring an extra Toffoli gate. Uncomputation of PREP can make extensive use of measurement based uncomputation, such that the only Toffoli cost is that of uncomputing the QROAM and the equal superposition states. The walk operator required for QPE is the product of the block encoding and a reflection. The reflection must be on the flag qubits, which are \(\lceil {\log }_{2}L\rceil +2\) for the equal superposition state \(\frac{1}{\sqrt{L}}\mathop{\sum }\nolimits_{l=0}^{L-1}\vert l\rangle\) and its success qubit and rotation qubit for amplitude amplification, similarly \(2\lceil {\log }_{2}N\rceil +2\) for \(\frac{1}{\sqrt{N(N-1)}}\mathop{\sum }\nolimits_{i\ne j}^{N-1}\left\vert i\right\rangle \left\vert j\right\rangle\). The costs are summarized in Table 1.
Basis sets diagonalizing the coulomb potential
The Hamiltonian (II.28) for block-encoding can be rewritten as
where the combined matrix elements are
Compared to the costs for a general Hamiltonian shown in Table 1 resource requirements differ in several ways: The number of unique coefficients L is much smaller, less than or equal to D2 + D(D + 1)/2, and we never need to implement an XZXZ-type Pauli string, so the cost of the second SELECT operator does not have the controlled X-gates, which lowers the Toffoli cost to 2N + 3NM, as opposed to 2N + 4NM Toffolis. In addition we only need to specify 3 indices in the output of the QROAM register in the PREP step, allowing for m = ℵ + 2(3M + 1) instead of m = ℵ + 2(4M + 1). These Toffoli counts are also indicated in Table 1.
Scaling properties of Pauli LCU and quantum resource estimates
In this section, we first discuss the properties of the LCU decomposition and establish asymptotic scalings which we compare to existing methods. The properties that affect the overall scaling of QPE are the one-norm and the number of unique, non-zero coefficients. A good LCU minimizes both of these factors. We will calculate the one-norm and number of unique, non-zero coefficients for several systems, truncating either the matrix elements of the Hamiltonian or its LCU coefficients.
With these results, we then provide the quantum resource estimates of QPE with qubitization for a range of systems. We focus on single-shot QPE resource estimation and do not consider subdominant costs such as the antisymmetrization of the initial state16. To obtain the number of physical qubits, we will assume that the algorithm is run on superconducting qubits placed on a square grid. We consider the code cycle duration of 10−6 s and the physical error rate of 0.1% – conventional figures for superconducting qubits46,47. We employ the surface code following refs. 48,49,50,51,52,53,54; for details we refer the reader to Appendix D of ref. 55.
Implementation of qubitization-based QPE uses QROAM which allows for a trade-off between the number of logical qubits and the number of Toffoli gates. Therefore, we will consider two versions of the algorithm: one minimizes the number of logical qubits by not using additional copies of the input registers for data loading, which corresponds to setting κ1 = 1 in Table 1 (min-Qu). The other version of the algorithm chooses κ1 such that it minimizes the number of Toffoli gates needed instead (min-T).
The systems that we will consider are a dense, real Hamiltonian, the square-planar H4 molecule, a hydrogen chain, the \({[{{\rm{Fe}}}_{2}{{\rm{S}}}_{2}]}^{2-}\) complex, the uniform electron gas and the crystalline solids nickel oxide and diamond. We apply the DPW basis set to the periodic systems and for the others we use molecular orbitals. These systems range from test models to possible applications.
We compare our work primarily against the sparse, second quantization LCU of ref. 19 in “Molecular orbitals. Scaling properties for basic models” and “Quantum resource estimates for iron-sulfur complex”. We aim to establish what advantage first quantization brings over second quantization when matrix elements of the Hamiltonian are treated similarly without applying any additional factorization. We believe that this method is comparable to our method as a baseline from which other methods in the same quantization can be developed. Additionally, we will compare our results with the second quantization DPW method of ref. 41 and the PW approach of refs. 24,25 in “Resource estimates for materials”.
Molecular orbitals. Scaling properties for basic models
In order to benchmark this work against Berry’s approach19, our first point of comparison is generating random matrix elements to form a dense, real Hamiltonian. The one-body matrix elements have two-fold symmetry and the two-body matrix elements have eight-fold symmetry. This is a worst case scenario of scaling for a generic Hamiltonian. For this reason, we do not truncate these matrix elements nor the LCU coefficients, except for a 10−10 Ha cut-off for ‘zero’ coefficients. To compare the asymptotic scaling of these methods, only the dominant two-body term is considered. For a dense real Hamiltonian, we keep the number of electrons constant at N = 4, so that it does not affect the scaling in D. The one-norm scales as \({\mathcal{O}}({D}^{2.93})\) in first quantization and as \({\mathcal{O}}({D}^{4})\) in second quantization as shown in Fig. 1a. Figure 1d shows that both methods have \({\mathcal{O}}({D}^{4})\) unique, non-zero LCU coefficients as expected.
a For a dense, real Hamiltonian with a fixed number of electrons, the one-norm of first quantization scales as \({\mathcal{O}}({D}^{3})\) compared to \({\mathcal{O}}({D}^{4})\) for the equivalent second quantization method, where D is the number of orbitals. b Similarly, for H4 the one-norm scales better in first than in second quantization. c Hydrogen chains with varying number of atoms, NA, with the number of orbitals per atom, NA/D, being constant. The number of electrons, N, is proportional to the number of atoms and orbitals, N = O(NA) = O(D). Therefore, the \({\mathcal{O}}({N}^{2})\) dependence of the one-norm in first quantization leads to worse scaling. d For a dense, real Hamiltonian, the number of unique, non-zero terms is the same for first and second quantization. e For the H4 molecule, the number of unique non-zero terms is greater for first quantization. f The number of unique, non-zero terms has better scaling in second quantization than first for the Hydrogen chain.
Square planar H4 is a common molecule used to compare different algorithms. In this work, we aim to use the same bond lengths and truncation procedures as Lee et al.17, so that we can compare our scaling results. The H4 molecule has N = 4 electrons and a bond length of 2 Bohr radii. For the second quantization representation, we truncate the matrix elements so that there is an error of less than 5.0 × 10−5 Ha per hydrogen atom, using CCSD(T). We use the aug-cc-pVQZ56,57 basis with D = 4 to 128 orbitals, using Boys localization. For the first quantization representation, we truncate the LCU coefficients to achieve this error instead, because the Pauli LCU decomposition is reversible and it is the number of non-zero LCU coefficients that determine the computational cost of the algorithm. As before, we count all LCU coefficients below 10−10 as zero and we only consider the two-body term since we are interested in the asymptotic scaling of the methods. For the line of best fit, we use the final three data points. For H4, the one-norm has lower asymptotic scaling in first quantization. As shown in Fig. 1b, our method scales as \({\mathcal{O}}({D}^{2.26})\), while the Berry representation scales as \({\mathcal{O}}({D}^{3.09})\). The Berry representation has fewer unique, non-zero LCU coefficients, but a slightly worse asymptotic scaling of \({\mathcal{O}}({D}^{4.25})\) compared to \({\mathcal{O}}({D}^{4.06})\) is obtained, see Fig. 1e. However, we expect that the asymptotic scaling is actually \({\mathcal{O}}({D}^{4})\), which is obscured because the truncation improved some data points more than others.
The hydrogen chain is another common system used to compare methods. It was also considered in Lee et al.17 and we will similarly use a bond length of 1.4 Bohr radii and a truncation procedure that aims for a CCSD(T) error of 5.0 × 10−5 Ha per hydrogen atom employing STO-3G basis set. In such basis set, we have one orbital per atom, so N = D = NA, where NA is the number of atoms in the chain. As for H4, we truncate the matrix elements of the second quantization representation and the LCU coefficients of the first quantization representation. Unlike the previous two examples, the number of electrons increases with the number of orbitals for the hydrogen chain model. For the previous systems, the number of system qubits is smaller in first quantization. For the hydrogen chain, the dependence of the number of orbitals on the number of electrons turns the system qubit number advantage into a disadvantage. In this case, first quantization requires \(D\log D\) system qubits compared to 2D in second quantization. Because the number of electrons grows with the system size, the \({\mathcal{O}}({N}^{2})\) scaling of the first quantization one-norm, shown in Eq. (II.18), becomes an \({\mathcal{O}}({D}^{2})\) scaling. Therefore, the scaling of the one-norm in first quantization for this problem is about D2 worse than second quantization methods, as shown in Fig. 1c. There is no such dependence on the number of electrons in the number of non-zero LCU coefficients. However, the scaling of the Berry representation is superior at \({\mathcal{O}}({D}^{3.28})\), compared to \({\mathcal{O}}({D}^{3.97})\) for the first quantization representation, see Fig. 1f.
Quantum resource estimates for iron-sulfur complex
Iron-sulphur clusters are known for their structural versatility and are ubiquitous in biochemical systems58,59. They easily participate in redox reactions in which electrons are exchanged between reactants and as a result form a crucial part of electron transfer mechanisms that take place in several important enzymes. These clusters consist of a FenSm (n, m are some integers) core cluster to which a number of ligands are attached and in an enzyme several such clusters may form an electron transfer pathway to ensure the passage of electrons within the protein60,61. There are several other functions iron-sulfur clusters serve in biological systems, and they also represent a challenge to electronic structure theory calculations58,59. The characterization of magnetic interactions in these clusters is especially challenging. In a typical scenario, each iron in the cluster can be thought of as a site where several unpaired electrons with parallel spins are localized. The electromagnetic properties of the entire cluster then depend on the interactions between several of these sites. Antiferromagnetic coupling is often favored between two sites, in which case all electrons in one site are aligned in an antiparallel fashion compared to the electrons on the other site. In chemistry, this situation perhaps is one of the best candidates for finding strong correlation effects62,63. Consequently, such systems require demanding classical computational techniques for even a qualitatively accurate calculation, and even the simplest of these clusters, such as Fe2S2, often feature as test systems in classical computational studies64,65. Such clusters are also used as a benchmark molecule in the context of quantum computation13,66.
In this work, we compare the quantum resource estimates of our new approach with the second quantization sparse qubitization algorithm of Berry et al.19, considering the \({[{{\rm{Fe}}}_{2}{{\rm{S}}}_{2}]}^{2-}\) complex as a test system. While the active space we consider consists of only 14 electrons, this is a more realistic example than those we used in “Molecular orbitals. Scaling properties for basic models”. In order to define the active space Hamiltonian, we have followed methodology similar to refs. 13,64. Namely, we used the cc-pVTZ-DK basis set56,67,68,69 and carried out high-spin restricted open-shell Kohn-Sham calculations with the exact two-component (x2c) scalar-relativistic correction70,71,72 and the BP86 functional73,74. We achieve convergence with the second-order co-iterative augmented Hessian (CIAH) method75. Next, we localized the virtual and occupied spaces separately using Pipek-Mezey (PM) localization76,77. We choose 12 occupied orbitals for the active space, consisting of five 3d orbitals per iron atom and one 3pz orbital per bridging sulfur atom. The active space is then completed by selecting D − 12 orbitals from the virtual space based on their orbital energy such that D = 16, 32, 64, 128. The first 122 virtual space orbitals are all centered on iron or a bridging sulfur.
In order to carry out resource estimations, one has to distribute the error budget among all approximations used in the quantum algorithm17:
where ϵQPE is the error due to finite precision in QPE, ϵtrunc is the error due to truncation of the Hamiltonian and ϵPREP is the error in the state preparation (PREP circuit). We choose these parameters as follows17:
Our target is to estimate the ground state energy with precision of ϵtot = 0.1 mHa, since the energy difference between different spin states in \({[{{\rm{Fe}}}_{2}{{\rm{S}}}_{2}]}^{2-}\) is on the order of only few mHa (the singlet-triplet gap is ≈2.0 mHa)64. This is higher precision than that typically used in resource estimation for quantum chemistry (1.6 mHa). We truncate the first quantization LCU coefficients and second quantization Hamiltonian matrix elements to meet the target error, ϵtrunc, as estimated with an unrestricted MP2 (UMP2) calculation. We choose UMP2 method instead of CCSD(T) because UMP2 requires fewer computational resources and does not have problems with convergence, unlike CCSD(T) for large active spaces (128 orbitals).
Figure 2 shows the resource estimates for the two approaches, which both use QROAM that minimizes the Toffoli count (min-T). The number of Toffoli gates in first quantization scales asymptotically better by a factor of D0.4. This asymptotic speedup is due to better scaling of the Pauli LCU one-norm in first quantization. The number of logical qubits is similar in both methods because of the small truncation value, ϵtrunc. For D ≤ 1024, sparse qubitization in second quantization shows smaller Toffoli count. The reason is that the one-norm of the first-quantization LCU explicitly depends on the number of electrons squared and this gives a large prefactor. Only at more than 103 orbitals would we expect first quantization to show an advantage.
Resource estimates were obtained for the target accuracy ϵtot = 0.1 mHa for ground state energy estimation. a Second quantization requires fewer Toffoli gates for all data points, but first quantization scales better in D. b Only the last three points were used for the fit of logical qubits. First and second quantization require a similar number of logical qubits.
Table 2 shows the resource estimates when one chooses to minimize the number of logical qubits. As one can see, the number of qubits becomes smaller in first quantization compared to second quantization for more than 16 orbitals. For example, for 128 orbitals the number of qubits is 293 while in second quantization this number is 492. With larger system size there will be even larger improvement. However, the Toffoli count is again only asymptotically better in first quantization, scaling with order \({\mathcal{O}}({D}^{6.04})\) against \({\mathcal{O}}({D}^{6.23})\).
Scaling properties using dual plane waves
In order to take advantage of the more efficient algorithm for systems with diagonal Coloumb interactions outlined in “Basis sets diagonalizing the coulomb potential” we consider the DPW. This basis set is related to PW via a Fourier transform, so the basis functions are localized in real space around evenly spaced grid points (see refs. 39,40 and Appendix C of ref. 41 for details). This transformation diagonalizes the Coloumb interaction, bringing the Hamiltonian into the form of Eq. (II.24) and greatly reducing the number of coefficients that need to be loaded. The trade-off is that one typically needs far more DPWs to achieve the same basis set accuracy compared to Gaussian orbitals, but since the size of our system register only scales logarithmically with the basis size we expect DPW to be favorable for our algorithm.
The Pauli-LCU one-norm of the Hamiltonian (II.24) is bound by
We note that this bound might be too loose for DPW and in order to establish a more precise dependence on the number of basis functions, we have carried out numerical calculations. For this, we have considered diamond with a cubic cell made of 8 atoms. The one-norm of the Pauli LCU consists of one- and two-body contributions and it is bounded by the sum of the one-norms of the kinetic energy, nuclear-electron and electron-electron interactions
The total norm, as well as each individual contribution, is shown in Fig. 3a, b. The one-body term provides smaller contribution to the total norm than the two-body term. The nuclear-electron contribution is around an order of magnitude smaller than electron-electron contribution and demonstrates subdominant scaling in basis functions. Based on these numerical results, we can conclude that the total norm scales as
where the first contribution is due to the kinetic energy and the second term is due to the electron-electron interaction. We note that the calculations we have done might be in the pre-asymptotic regime and as a result this scaling might change for the larger basis sets. However, this is the regime which is of interest when such methodology will be used with pseudo-potentials. In comparison, the one-norm in second quantization with DPW41 scales as \({\mathcal{O}}({D}^{2})\) with respect to the number of DPWs and thus, we achieved a considerable speedup by using the same basis but employing first quantization. For a very small number of basis functions, the norm in second quantization is smaller than our norm but for the physically interesting regime we achieve orders of magnitude improvement as shown in Fig. 3c.
Diamond is represented by an 8-atom unit cell a First quantization using DPWs. The two-body term contributes more to the one-norm. b First quantization using DPWs. Electron-electron interactions contribute more to the one-norm than the electron-nuclear or kinetic terms. c In first quantization, the scaling of the one-norm with respect to the number of basis functions, D, is more favorable than second quantization. The norm of PW approach of ref. 25 has the most favorable scaling but in the pre-asymptotic regime the first quantization DPWs norm of this work is lower. d The number of unique, non-zero coefficients scales as \({\mathcal{O}}({D}^{1.84})\) in the number of basis functions, D, for first quantization using DPWs and as \({\mathcal{O}}(D)\) for second quantization using DPWs.
Apart from improvement in the one-norm, we also observe polynomial speedup in the space-time cost of the walk operator. Such cost consists of the SELECT and PREP circuits. In second quantization, the dominant cost will be SELECT in which the Toffoli-gate count scales linearly with the size of the basis set, D,43 while in our approach such cost scales logarithmically in D. The most dominant cost is then due to data loading in PREP, where both Toffoli and qubit numbers scale as \({\mathcal{O}}(\sqrt{\Gamma })\), where Γ is the amount of information needed to specify the Hamiltonian19. From Fig. 3d, we can see that \(\Gamma ={\mathcal{O}}({D}^{1.85})\) and as a result the Toffoli complexity scales sublinearly. We also note that we did not use any translation symmetries here and we expect that if one encorporates those, one can achieve \({\mathcal{O}}({D}^{1/2})\) scaling of PREP. When one uses QROM instead of QROAM to minimize the number of qubits, the space cost scales logarithmically in D, while the time cost of the block-encoding scales as \({\mathcal{O}}({D}^{1.85})\). Even in this case, we achieve smaller total resources, as will be shown in the next section.
Resource estimates for materials
In this section, we report quantum resource estimates for the uniform electron gas (UEG), diamond and nickel oxide (NiO). We compare our approach with the second quantization DPW algorithm43 and the first quantization PW algorithm described in ref. 25. The latter bypasses the data loading and uses quantum arithmetic to approximate the Hamiltonian matrix elements on the quantum computer. The DPW can be obtained by applying the unitary transformation to PW and as a result, the same size DPW and PW would provide the same accuracy and can therefore be compared directly. However, the algorithm from ref. 25 can only work with basis sets of size \({({2}^{n}-1)}^{3}\) for integer n, whereas our algorithm requires 23n basis states, and as a result, we cannot compare the identical basis sizes and instead use the closest matches. Since we do not introduce the error due to truncation of the Hamiltonian matrix elements, we distribute the total error, ϵtot, as ϵQPE = 15.8ϵtot/16 and ϵPREP = 0.2ϵtot/16. The rationale for this is to reduce the Toffoli count as much as possible which is mainly affected by ϵQPE.
We provide resource estimates for the UEG with N = 14, 54, 114 electrons in the classically challenging strongly correlated regime. In order to converge the total energy of 14 electrons with 1 mHa accuracy, one would need on the order of 2000 plane waves, and the strongly correlated regime appears when the Winger-Seitz radius, rs, is larger than 3.5 Bohr78. As shown in ref. 78, even for around 100 basis functions there is a significant deviation (a few tens of mHa) of approximate methods from i-FCIQMC79. The energy estimates in the regime with more than 1000 PWs and 14 electrons are not currently available with FCIQMC method due to high computational cost, although we do not state that this is impossible to reach with modern large-scale HPC. A larger system with N = 54, and rs ≥ 5 is also challenging regime for classical algorithms, where different flavors of quantum Monte-Carlo algorithms give energies differing by more than 1 mHa per electron. For N = 114, only few data points are available in the regime rs ≤ 2, and D ≤ 210978,80. Therefore, we consider rs = 5 and the number of basis functions, D, to be 4096. For the 14-electron system, we also present results for D = 512. Resource estimates for these systems are shown in Table 3. Comparison between the four algorithms show that our method substantially outperforms the previous DPW algorithm by Babbush et al.43, requiring only a fraction of logical qubits and reducing the Toffoli count by four orders of magnitude. The min-Qu algorithm also achieves the lowest logical qubit counts across all tested systems, by requiring a much smaller system register compared to the second quantization algorithm, whose system register scales linearly with the number of basis states. The number of logical qubits in the PW algorithm also scales logarithmically in D similarly to the min-Q algorithm, but the qubit overhead of the PW algorithm is larger, and it requires more logical qubits overall. This results into higher physical qubit requirements, despite having overall lower quantum volume for large basis sets.
Next, we consider two examples of realistic materials: diamond and nickel oxide (NiO) in cubic cells with 8 and 64 atoms. While diamond is a relatively simple material to describe using classical electronic structure methods, NiO is the quintessential system for studying the strong correlations due to presence of localized d-electrons and it is often used as a prototype system to test the accuracy of different electronic structure methodologies81,82,83,84,85. All-electron calculations of such a material would require enormous PW basis sets. For example, the much simpler Si would require on the order of 109 PW per Å3 to converge the mean-field eigenvalues with the accuracy of 0.01 eV using the regularized Coulomb potential86. Instead of analyzing such a large basis set, we will focus on the practical size of basis sets which are suitable87,88 for use with pseudopotentials36,37 or the projector augmented-wave method38. For example, a simple conversion from grid-spacing, h, to plane-wave cutoff \({E}_{{\rm{cut}}}=\frac{1}{2}{(\pi /h)}^{2}\)89, for 64-atom diamond and NiO cells would provide Ecut ≈ 750 eV and Ecut ≈ 550 eV kinetic energy cutoff, respectively. Resource estimates with the parameters described are shown in the Table 4. As one can see, the min-Qu algorithm provides the lowest number of logical qubits for small cell calculations and as a result the lowest number of physical qubits, apart from the case of diamond and nickel oxide with 64 atoms cell, where the PW algorithm has lower physical qubit count. However, the number of physical qubits in both approaches differ by less than 10%. The min-T algorithm provides the lowest Toffoli count in all cases apart from diamond represented with a 64-atom cell, where the first quantization PW approach is better. The interesting result is that for a large system with 1152 electrons and ≈ 32000 basis functions, the min-T algorithm which loads the data from QROAM requires only a factor of 1.8 more logical qubits as compared to the first quantization PW approach which completely avoids the data loading. Similarly to the results obtained for the UEG model, the first quantization approach with DPWs (both min-Qu and min-T versions) is significantly better than the second quantization approach regardless of quadratic dependence of the first quantized Hamiltonian on the number of electrons as indicated in Table 4, and allows calculation of realistic systems with lower physical qubit counts than any prior method.
In order to understand why our approach provides a lower Toffoli count for some systems compared to the PW algorithm of ref. 25, we report the Toffoli cost for construction of the quantum walk operator and the one-norm. The one-norm of our approach is lower up to a few tens of thousands of basis functions (Fig. 3d). In Table 5, we show the cost of block-encoding and the one-norm for each system. For diamond and nickel-oxide the norm of our approach is lower than that of ref. 25, and for NiO with 8 atoms in the unit cell, the cost of block encoding is also slightly lower. For NiO with 64 atoms, the block encoding cost is slightly larger by a factor of 1.2, but this cost is compensated by a lower norm which leads to overall smaller Toffoli cost. For diamond with a 64 atom cell, the norm is smaller by a factor of 1.3 but the block encoding cost is larger by a factor of 2.0, which makes the total Toffoli cost larger. For larger basis sets (more then 30,000 basis functions) and for systems studied here, we expect the PW approach of ref. 25 to be better due to lower asymptotic scaling.
Discussion
In this work, we have presented a Pauli LCU decomposition of the Hamiltonian in first quantization, for arbitrary basis sets. This simple form of the LCU allow us to construct an efficient block encoding which can be used with the sparse qubitization technique. In addition to the unary iteration over the electron number, our SELECT circuit involves only CCX and CCZ gates for selecting Pauli strings and the number of such gates scales only logarithmically in the size of the basis set. We observe that our approach achieves asymptotic polynomial speedup in the number of Toffoli gates compared to second quantization with the same basis set. When QROAM is implemented in such a way as to minimize the number of qubits then we see that already for an active space of 14 electrons and 32 orbitals of the \({[{{\rm{Fe}}}_{2}{{\rm{S}}}_{2}]}^{2-}\) molecule, the first quantization algorithm requires fewer qubits compared to its second quantization counterpart. We observe these asymptotic improvements because the one-norm of the Pauli LCU decomposition scales better, but the large prefactor due to explicit dependence on the number of electrons squared can only be overcome for large basis sets.
In recent years, several efficient LCU decompositions of the Hamiltonian in second quantization and corresponding block-encodings have been developed17,18,19,45,90. Exactly the same methods, such as factorization of the Hamiltonian’s matrix elements, can of course be applied in first quantization. It is of great importance to understand if the first quantization approach can bring advantage beyond asymptotic improvements over second quantization when such factorizations are applied.
We have also applied our method to the DPW basis set, which is more suitable for periodic systems. In that case, the SELECT and PREP circuits can be slightly optimized compared to the generic quantum chemistry case. The resource estimates for realistic materials show that in some cases we outperform in some metrics (either qubit or Toffoli counts) the previous first quantization approach of refs. 24,25, which used regular PW. Our approach is more efficient in Toffoli and qubit count than the second quantization approach with DPW43 on all considered examples. However, we recognize that there are regimes when second quantization proves advantageous, for example, when one considers systems with the same number of orbitals as the number of electrons (half-filling).
Our approach with DPW employs classical data loading, yet it still provides quantum resources comparable to the approach of refs. 24,25 in the physically interesting regimes. The reasons for this are the sub-linear scaling of the one-norm of the Pauli LCU in the basis set size and the diagonal form of the Coulomb interaction which significantly reduces the amount of information needed to specify the Hamiltonian. For example, for NiO we achieve a lower T-gate count and the number of logical qubits is only a factor of 1.8 larger than that obtained with pure PW approach, for a PW cutoff that is suitable for use with pseudopotentials. However, since we used data loading, our approach can be used with norm-conserving pseudopotentials, and projector augmented-wave method (Conventional PAW approach38 introduces the non-orthogonal transformations which do not allow for a simple Hamiltonian expression using plane wave or dual plane waves in first quantization Eq. (II.1) and as a result PAW introduces Hamiltonian with non-orthogonal eigenstates. However, the unitary version of PAW (UPAW)55 overcomes these issues.), because the only change would be in the calculation of matrix elements on a classical computer. As is shown in several density functional theory calculations (https://molmod.ugent.be/deltacodesdft)87,88,91, PAW approach demonstrates smaller approximation error than norm-conserving pseudopotentials and often requires fewer number of PW in order to reach the convergence. While there has been generalization of refs. 24,25 on norm-conserving Goedecker-Teter-Hutter pseudopotentials34,35, it is not clear if such an approach can be easily generalized for PAW method or even other pseudopotentials without the data loading.
We also speculate that the algorithms presented here can be further improved. For example, in the DPW basis set, there are only 3D unique Hamiltonian matrix elements and all other matrix elements could be restored by using the translational symmetry. We have not considered this here. One possible way to take this into account is to load unique original matrix elements from QROAM and then calculate the Pauli LCU coefficients (Eq. (II.26) and (II.27)) on the quantum computer (quantum Pauli decomposition). This would reduce the data loading which in turn would make Toffoli cost of the PREPARE to scale as \({\mathcal{O}}(D^{0.5})\) .
The Pauli LCU decomposition of the Hamiltonian (II.13), (II.15) is generic and it can be applied equally to either bosonic or fermionic or even mixed systems. This in turn opens up for possibilities to apply the approach developed here to other application areas such as studying vibrational properties of molecules and materials92,93,94,95. We also hope that the explicit form of the Pauli LCU decomposition can be used to reduce resources in other quantum algorithms such as Trotterization96,97 or, more generally, Hamiltonian simulation23.
Methods
Two-body LCU derivation
In this Section, we derive Eq. (II.11), starting from the second term on the right-hand side of Eq. (II.1). First, we express the tensor product of operators in terms of X and Z Pauli operators in the same way as ref. 44:
We have neglected the spin indices, because these cancel to the identity the same as in the one-body term. Next, we take out the factors of − 1, which are the matrix elements of Hadamard matrices (see definition Eq. (II.9)),
Substituting Eq. (VIII.2) into the two-body term of Eq. (II.1), we find
Next, we change sum index from p to g = p ⊕ q and from r to f = r ⊕ s, giving
where
Since i ≠ j, the two Pauli strings always commute, and as a result
This is the result of Eq. (II.11) with different labels for the indices: g, u, f, v are p, q, r, s, respectively. To turn the two-body LCU coefficients calculation into a matrix multiplication, we form a D2 × D2 Hadamard matrix:
If we rearrange the matrix elements with
the LCU coefficients can be found with matrix multiplication,
Using the fast Hadamard transform, calculating these matrix elements scales as \({\mathcal{O}}({D}^{4}\log D)\), rather than \({\mathcal{O}}({D}^{6})\) for basic matrix multiplication.
Amplitude amplification for equal superpositions
The equal superposition state
needed in PREP can be prepared trivially with Hadamards if L is a power of 2. Otherwise, it is prepared using amplitude amplification; the procedure for this state is described in ref. 17.
In this appendix, we will adapt that procedure and cost the preparation of the equal superposition state
also required in PREP. We use a single iteration of Grover search98 along the lines of Fig. 3 from ref. 17.
In order to achieve a success probability close to one, we augment (VIII.11) by an ancilla rotated by a classical tunable angle β, such that the desired state is
The starting point for Grover search of this state (including the rotated ancilla) is an equal superposition over all computational basis states
The circuit we use to perform the Grover iteration is the following:

It closely follows the textbook circuit98. Initially, the Hadamards prepare the state \(\vert \psi \rangle\). This is followed by a Grover iteration, consisting of reflections about the target state, \((I-2\vert \phi \rangle \langle \phi \vert )\), and about the initial state, \((I-2\vert \psi \rangle \langle \psi \vert )\). Finally, the desired state is flagged in the success qubit.
The overlap of desired state and initial state is
The output of a single iteration Grover circuit is a state98
where \(\vert {\phi }^{\perp }\rangle\) is some state orthogonal to \(\vert \phi \rangle\); the success probability has been amplified from \(\cos \alpha\) to \(\cos (3\alpha )\). Thus, one can ensure a high success probability by choosing β such that \(\cos (3\alpha )=1\Rightarrow \cos \alpha =-1/2\). This is always possible because
so it can always be scaled down by \(\sin (\beta /2+\pi /2)\) to reach \(\cos \alpha =-1/2\).
The optimal value of β must be approximated to accuracy ϵ when compiling the rotations RY(β). In principle, they could be compiled directly into a Clifford+T gate set99 using \(4{\log }_{2}(1/\epsilon )+O(\log \log (1/\epsilon ))\,T\)-gates. However,17 choose the approach of ref. 100 Appendix A, adding a binary representation of β into a phase-gradient ancilla register, which has a similar T count. Yet, the latter reference mentions that for a classically given β, like here, compilation into Clifford+T is slightly faster. For consistency with literature, we follow the approach of ref. 17 using the phase gradient register.
The cost of this circuit is as follows: The Hadamard gates are Clifford gates and therefore require no Toffolis. The two inequality tests (i < N) and (j < N) are implemented as described in appendix H of19, where we can save one Toffoli because N is constant, for a Toffoli cost of \(2(\lceil {\log }_{2}N\rceil -1)\). The final equality test requires \(\lceil {\log }_{2}N\rceil\) Toffolis. If ηN is the largest power of 2 that divides N, the cost of each i < N inequality test is reduced by ηN Toffolis (by ignoring the lowest \({\log }_{2}{\eta }_{N}\) bits/qubits of N and \(\left\vert i\right\rangle\)). Each test also requires \(\lceil {\log }_{2}N\rceil -1\) ancilla not depicted in (VIII.14), and can be uncomputed without additional Toffoli cost using the out-of-place adders from ref. 101. (Essentially, the first ⊕ (i < N) is only the computation step of the addition. The ancillas are kept alive during the CCCZ gate, and the second ⊕ (i < N) is the uncomputation step of the addition). The total cost of all the (in)equality tests in the circuit (VIII.14) is therefore \(6\lceil {\log }_{2}N\rceil -4{\eta }_{N}-4\) Toffolis and \(3\lceil {\log }_{2}N\rceil -3\) ancilla.
The rotation by β using bN bits of precision (typically bN = 8) costs bN − 3 Toffolis, since the angle is known classically, plus a one-time cost for preparing a catalytic state that can be reused for rotations throughout the circuit and is thus negligible. It also requires bN ancilla. The reflection around \(\vert \phi \rangle\) requires 2 Toffolis, the reflection around \(\vert \psi \rangle\) needs \(2\lceil {\log }_{2}N\rceil -1\) Toffolis, and the flagging of the success qubit requires another 2 Toffolis. The total Toffoli cost of the equal superposition circuit is therefore
Toffolis, and requires the use of \(3\lceil {\log }_{2}N\rceil +{b}_{N}-3\) ancillas in parallel. We can reuse the ancilla from the QROAM procedure for most of these ancilla, with two exceptions: We need to keep the phase-gradient register of size bN, so we do not have to prepare it every time, and the success qubit and rotation qubit flag the block encoding and therefore cannot be reused either. The number of additional ancilla we need for this procedure is therefore
Data availability
The electron integrals in molecular orbital basis sets and dual plane waves used to produce results for this paper are available at Zenodo repository [https://doi.org/10.5281/zenodo.13946484].
References
Aspuru-Guzik, A., Dutoi, A. D., Love, P. J. & Head-Gordon, M. Simulated quantum computation of molecular energies. Science 309, 1704–1707 (2005).
Reiher, M., Wiebe, N., Svore, K. M., Wecker, D. & Troyer, M. Elucidating reaction mechanisms on quantum computers. Proc. Natl. Acad. Sci. USA 114, 7555–7560 (2017).
Li, Z., Li, J., Dattani, N. S., Umrigar, C. J. & Chan, G. K.-L. The electronic complexity of the ground-state of the FeMo cofactor of nitrogenase as relevant to quantum simulations. J. Chem. Phys. 150, 024302 (2019).
Cao, Y. et al. Quantum chemistry in the age of quantum computing. Chem. Rev. 119, 10856–10915 (2019).
Bauer, B., Bravyi, S., Motta, M. & Chan, G. K.-L. Quantum algorithms for quantum chemistry and quantum materials science. Chem. Rev. 120, 12685–12717 (2020).
Liu, H. et al. Prospects of quantum computing for molecular sciences. Mater. Theory 6, 11 (2022).
Goings, J. J. et al. Reliably assessing the electronic structure of cytochrome P450 on today’s classical computers and tomorrow’s quantum computers. Proc. Natl. Acad. Sci. USA 119, e2203533119 (2022).
Blunt, N. S. et al. Perspective on the current state-of-the-art of quantum computing for drug discovery applications. J. Chem. Theory Comput. 18, 7001–7023 (2022).
Delgado, A. et al. Simulating key properties of lithium-ion batteries with a fault-tolerant quantum computer. Phys. Rev. A 106, 032428 (2022).
Baiardi, A., Christandl, M. & Reiher, M. Quantum computing for molecular biology. ChemBioChem 24, e202300120 (2023).
Santagati, R. et al. Drug design on quantum computers. Nat. Phys. 20, 549–557 (2024).
Kitaev, A. Y. Quantum measurements and the Abelian Stabilizer Problem. arXiv preprint quant-ph/9511026 (1995).
Lee, S. et al. Evaluating the evidence for exponential quantum advantage in ground-state quantum chemistry. Nat. Commun. 14, 1952 (2023).
Low, G. H. & Chuang, I. L. Hamiltonian simulation by qubitization. Quantum 3, 163 (2019).
Poulin, D., Kitaev, A., Steiger, D. S., Hastings, M. B. & Troyer, M. Quantum Algorithm for spectral measurement with lower gate count. Phys. Rev. Lett. 121, 010501 (2018).
Berry, D. W. et al. Improved techniques for preparing eigenstates of fermionic Hamiltonians. npj Quantum Inf. 4, 22 (2018).
Lee, J. et al. Even more efficient quantum computations of chemistry through tensor hypercontraction. PRX Quantum 2, 030305 (2021).
Von Burg, V. et al. Quantum computing enhanced computational catalysis. Phys. Rev. Res. 3, 033055 (2021).
Berry, D. W., Gidney, C., Motta, M., McClean, J. R. & Babbush, R. Qubitization of arbitrary basis quantum chemistry leveraging sparsity and low rank factorization. Quantum 3, 208 (2019).
Jordan, P. & Wigner, E. Über das paulische äquivalenzverbot. Z. f.ür. Phys. 47, 631–651 (1928).
Ortiz, G., Gubernatis, J. E., Knill, E. & Laflamme, R. Quantum algorithms for fermionic simulations. Phys. Rev. A 64, 022319 (2001).
Rocca, D. et al. Reducing the runtime of fault-tolerant quantum simulations in chemistry through symmetry-compressed double factorization. J. Chem. Theory Comput. 20, 4369–4653 (2024).
Abrams, D. S. & Lloyd, S. Simulation of many-body fermi systems on a Universal quantum computer. Phys. Rev. Lett. 79, 2586–2589 (1997).
Babbush, R., Berry, D. W., McClean, J. R. & Neven, H. Quantum simulation of chemistry with sublinear scaling in basis size. npj Quantum Inf. 5, 92 (2019).
Su, Y., Berry, D. W., Wiebe, N., Rubin, N. & Babbush, R. Fault-tolerant quantum simulations of chemistry in first quantization. PRX Quantum 2, 040332 (2021).
Kassal, I., Jordan, S. P., Love, P. J., Mohseni, M. & Aspuru-Guzik, A. Polynomial-time quantum algorithm for the simulation of chemical dynamics. Proc. Natl Acad. Sci. USA 105, 18681–18686 (2008).
Toloui, B. & Love, P. J. Quantum Algorithms for Quantum Chemistry based on the sparsity of the CI-matrix. arXiv preprint arXiv:1312.2579 (2013).
Whitfield, J. D. Communication: spin-free quantum computational simulations and symmetry adapted states. J. Chem. Phys. 139, 021105 (2013).
Whitfield, J. D. Unified views of quantum simulation algorithms for chemistry. ArXiv:1502.03771 [quant-ph] (2015).
Babbush, R. et al. Exponentially more precise quantum simulation of fermions in the configuration interaction representation. Quantum Sci. Technol. 3, 015006 (2018).
Chan, H. H. S., Meister, R., Jones, T., Tew, D. P. & Benjamin, S. C. Grid-based methods for chemistry simulations on a quantum computer. Sci. Adv. 9, eabo7484 (2023).
Zini, M. S. et al. Quantum simulation of battery materials using ionic pseudopotentials. Quantum 7, 1049 (2023).
Berry, D. W. et al. Quantum simulation of realistic materials in first quantization using non-local pseudopotentials. npj Quantum Inf. 10, 130 (2024).
Goedecker, S., Teter, M. & Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 54, 1703–1710 (1996).
Hartwigsen, C., Goedecker, S. & Hutter, J. Relativistic separable dual-space Gaussian Pseudopotentials from H to Rn. Phys. Rev. B 58, 3641–3662 (1998).
Hamann, D. R., Schlüter, M. & Chiang, C. Norm-conserving pseudopotentials. Phys. Rev. Lett. 43, 1494–1497 (1979).
Vanderbilt, D. Soft self-consistent pseudopotentials in a generalized eigenvalue formalism. Phys. Rev. B 41, 7892–7895 (1990).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Skylaris, C.-K., Mostofi, A. A., Haynes, P. D., Diéguez, O. & Payne, M. C. Nonorthogonal generalized wannier function pseudopotential plane-wave method. Phys. Rev. B 66, 035119 (2002).
Mostofi, A. A., Skylaris, C.-K., Haynes, P. D. & Payne, M. C. Total-energy calculations on a real space grid with localized functions and a plane-wave basis. Comput. Phys. Commun. 147, 788–802 (2002).
Babbush, R. et al. Low-depth quantum simulation of materials. Phys. Rev. X 8, 011044 (2018).
Low, G. H., Kliuchnikov, V. & Schaeffer, L. Trading t-gates for dirty qubits in state preparation and unitary synthesis. arXiv preprint arXiv:1812.00954 (2018).
Babbush, R. et al. Encoding electronic spectra in quantum circuits with linear T complexity. Phys. Rev. X 8, 041015 (2018).
Georges, T. N., Berntson, B. K., Sünderhauf, C. & Ivanov, A. V. Pauli decomposition via the fast walsh-hadamard transform. New J. Phys. 27, 033004 (2025).
Ivanov, A. V. et al. Quantum computation for periodic solids in second quantization. Phys. Rev. Res. 5, 013200 (2023).
Krinner, S. et al. Realizing repeated quantum error correction in a distance-three surface code. Nature 605, 669–674 (2022).
Acharya, R. et al. Suppressing quantum errors by scaling a surface code logical qubit. arXiv: 2207.06431 (2022).
Litinski, D. Magic state distillation: not as costly as you think. arXiv:1905.06903 [quant-ph] (2019).
Litinski, D. A game of surface codes: large-scale quantum computing with lattice surgery. Quantum 3, 128 (2019).
Gidney, C. & Fowler, A. G. Efficient magic state factories with a catalyzed \(\left\vert CCZ\right\rangle\) to \(2\left\vert T\right\rangle\) transformation. Quantum 3, 135 (2019).
Campbell, E. T. & Howard, M. Unified framework for magic state distillation and multiqubit gate synthesis with reduced resource cost. Phys. Rev. A 95, 022316 (2017).
Jones, C. Novel constructions for the fault-tolerant Toffoli gate. Phys. Rev. A 87, 022328 (2013).
Eastin, B. Distilling one-qubit magic states into Toffoli states. Phys. Rev. A 87, 032321 (2013).
Fowler, A. G., Mariantoni, M., Martinis, J. M. & Cleland, A. N. Surface codes: towards practical large-scale quantum computation. Phys. Rev. A 86, 032324 (2012).
Ivanov, A. V. et al. Quantum computation of electronic structure with projector augmented-wave method and plane wave basis set (2024). arXiv:2408.03159 [quant-ph].
Dunning Jr, T. H. Gaussian basis sets for use in correlated molecular calculations. i. the atoms boron through neon and hydrogen. J. Chem. Phys. 90, 1007–1023 (1989).
Kendall, R. A., Dunning, T. H. & Harrison, R. J. Electron affinities of the first-row atoms revisited. systematic basis sets and wave functions. J. Chem. Phys. 96, 6796–6806 (1992).
Beinert, H., Holm, R. H. & Munck, E. Iron-sulfur clusters: nature’s modular, multipurpose structures. Science 277, 653–659 (1997).
Rouault, T. A. (ed.) Iron-sulfur clusters in chemistry and biology, 2 (de Gruyter, 2017).
Volbeda, A. et al. Crystal structure of the nickel–iron hydrogenase from desulfovibrio gigas. Nature 373, 580–587 (1995).
Pandelia, M.-E. et al. Electronic structure of the unique [4fe-3s] cluster in o2-tolerant hydrogenases characterized by 57fe mössbauer and epr spectroscopy. Proc. Natl. Acad. Sci. USA 110, 483–488 (2013).
Izsák, R., Ivanov, A. V., Blunt, N. S., Holzmann, N. & Neese, F. Measuring electron correlation: the impact of symmetry and orbital transformations. J. Chem. Theory Comput. 19, 2703–2720 (2023).
Ganoe, B. & Shee, J. On the notion of strong correlation in electronic structure theory. Faraday Discussions 254, 53–75 (2024).
Sharma, S., Sivalingam, K., Neese, F. & Chan, G. K.-L. Low-energy spectrum of iron–sulfur clusters directly from many-particle quantum mechanics. Nat. Chem. 6, 927–933 (2014).
Dobrautz, W., Weser, O., Bogdanov, N. A., Alavi, A. & Li Manni, G. Spin-pure stochastic-casscf via guga-fciqmc applied to iron–sulfur clusters. J. Chem. Theory Comput. 17, 5684–5703 (2021).
Tazhigulov, R. N. et al. Simulating models of challenging correlated molecules and materials on the sycamore quantum processor. PRX Quantum 3, 040318 (2022).
Balabanov, N. B. & Peterson, K. A. Systematically convergent basis sets for transition metals. i. all-electron correlation consistent basis sets for the 3d elements sc-zn. J. Chem. Phys 123, 064107 (2005).
De Jong, W. A., Harrison, R. J. & Dixon, D. A. Parallel Douglas–Kroll energy and gradients in nwchem: estimating scalar relativistic effects using Douglas–Kroll contracted basis sets. J. Chem. Phys. 114, 48–53 (2001).
Woon, D. E. & Dunning Jr, T. H. Gaussian basis sets for use in correlated molecular calculations. iii. the atoms aluminum through argon. J. Chem. Phys. 98, 1358–1371 (1993).
Liu, W. Ideas of relativistic quantum chemistry. Mol. Phys. 108, 1679–1706 (2010).
Saue, T. Relativistic Hamiltonians for chemistry: a primer. ChemPhysChem 12, 3077–3094 (2011).
Peng, D. & Reiher, M. Exact decoupling of the relativistic Fock operator. Theor. Chem. Acc. 131, 1081 (2012).
Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 38, 3098–3100 (1988).
Perdew, J. P. Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys. Rev. B 33, 8822–8824 (1986).
Sun, Q. Co-iterative augmented hessian method for orbital optimization. arXiv preprint arXiv:1610.08423 (2016).
Lehtola, S. & Jónsson, H. Pipek–mezey orbital localization using various partial charge estimates. J. Chem. Theory Comput. 10, 642–649 (2014).
Pipek, J. & Mezey, P. G. A fast intrinsic localization procedure applicable for abinitio and semiempirical linear combination of atomic orbital wave functions. J. Chem. Phys. 90, 4916–4926 (1989).
Lee, J., Malone, F. D. & Morales, M. A. An auxiliary-field quantum Monte Carlo perspective on the ground state of the dense uniform electron gas: an investigation with Hartree-Fock trial wavefunctions. J. Chem. Phys. 151, 064122 (2019).
Shepherd, J. J., Booth, G. H. & Alavi, A. Investigation of the full configuration interaction quantum Monte Carlo method using homogeneous electron gas models. J. Chem. Phys. 136, 244101 (2012).
Kwon, Y., Ceperley, D. & Martin, R. M. Effects of backflow correlation in the three-dimensional electron gas: Quantum monte carlo study. Phys. Rev. B 58, 6800 (1998).
Cui, Z.-H., Zhu, T. & Chan, G. K.-L. Efficient implementation of ab initio quantum embedding in periodic systems: density matrix embedding theory. J. Chem. Theory Comput. 16, 119–129 (2020).
Zhang, S., Malone, F. D. & Morales, M. A. Auxiliary-field quantum Monte Carlo calculations of the structural properties of nickel oxide. J. Chem. Phys. 149, 164102 (2018).
Peng, H. & Perdew, J. P. Synergy of van der Waals and self-interaction corrections in transition metal monoxides. Phys. Rev. B 96, 100101 (2017).
Mitra, C., Krogel, J. T., Santana, J. A. & Reboredo, F. A. Many-body ab initio diffusion quantum Monte Carlo applied to the strongly correlated oxide NiO. J. Chem. Phys. 143, 164710 (2015).
Korotin, D. et al. Construction and solution of a Wannier-functions based Hamiltonian in the pseudopotential plane-wave framework for strongly correlated materials. Eur. Phys. J. B 65, 91–98 (2008).
Gygi, F. All-electron plane-wave electronic structure calculations. J. Chem. Theory Comput. 19, 1300–1309 (2023).
Lejaeghere, K. et al. Reproducibility in density functional theory calculations of solids. Science 351, aad3000 (2016).
Prandini, G., Marrazzo, A., Castelli, I. E., Mounet, N. & Marzari, N. Precision and efficiency in solid-state pseudopotential calculations. npj Comput. Mater. 4, 72 (2018).
Briggs, E. L., Sullivan, D. J. & Bernholc, J. Real-space multigrid-based approach to large-scale electronic structure calculations. Phys. Rev. B 54, 14362–14375 (1996).
Rubin, N. C. et al. Fault-tolerant quantum simulation of materials using bloch orbitals. PRX Quantum 4, 040303 (2023).
Bosoni, E. et al. How to verify the precision of density-functional-theory implementations via reproducible and universal workflows. Nat. Rev. Phys. 6, 45–58 (2023).
McArdle, S., Mayorov, A., Shan, X., Benjamin, S. & Yuan, X. Digital quantum simulation of molecular vibrations. Chem. Sci. 10, 5725–5735 (2019).
Sawaya, N. P. D. et al. Resource-efficient digital quantum simulation of d-level systems for photonic, vibrational, and spin-s Hamiltonians. npj Quantum Inf. 6, 1–13 (2020).
Sawaya, N. P. D., Paesani, F. & Tabor, D. P. Near- and long-term quantum algorithmic approaches for vibrational spectroscopy. Phys. Rev. A 104, 062419 (2021).
Trenev, D. et al. Refining resource estimation for the quantum computation of vibrational molecular spectra through Trotter error analysis. Quantum 9, 1630 (2025).
Suzuki, M. General theory of fractal path integrals with applications to many-body theories and statistical physics. J. Math. Phys. 32, 400–407 (1991).
Childs, A. M., Maslov, D., Nam, Y., Ross, N. J. & Su, Y. Toward the first quantum simulation with quantum speedup. Proc. Natl. Acad. Sci. USA 115, 9456–9461 (2018).
Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information: 10th Anniversary Edition (Cambridge University Press, 2010).
Ross, N. J. & Selinger, P. Optimal ancilla-free clifford+t approximation of z-rotations. arXiv preprint arXiv:1403.2975 (2014).
Sanders, Y. R. et al. Compilation of fault-tolerant quantum heuristics for combinatorial optimization. PRX Quantum 1, 020312 (2020).
Gidney, C. Halving the cost of quantum addition. Quantum 2, 74 (2018).
Sun, Q. et al. Recent developments in the PySCF program package. J. Chem. Phys. 153, 024109 (2020).
Sun, Q. et al. PySCF: the Python-based simulations of chemistry framework. WIREs Comput. Mol. Sci. 8, e1340 (2018).
Sun, Q. Libcint: an efficient general integral library for Gaussian basis functions. J. Comput. Chem. 36, 1664–1671 (2015).
Larsen, A. H. et al. The atomic simulation environment-"a python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Bahn, S. R. & Jacobsen, K. W. An object-oriented scripting interface to a legacy electronic structure code. Comput. Sci. Eng. 4, 56–66 (2002).
Mortensen, J. J. et al. GPAW: an open Python package for electronic structure calculations. J. Chem. Phys. 160, 092503 (2024).
Enkovaara, J. et al. Electronic structure calculations with GPAW: a real-space implementation of the projector augmented-wave method. J. Phys. Condens. Matter 22, 253202 (2010).
Mortensen, J. J., Hansen, L. B. & Jacobsen, K. W. Real-space grid implementation of the projector augmented wave method. Phys. Rev. B 71, 035109 (2005).
McClean, J. R. et al. Openfermion: the electronic structure package for quantum computers. Quantum Sci. Technol. 5, 034014 (2020).
Humphrey, W., Dalke, A. & Schulten, K. VMD: visual molecular dynamics. J. Mol. Graph. 14, 33–38 (1996).
Acknowledgements
We thank Nick Blunt for discussions, carefully reading the manuscript and providing valuable suggestions, as well as Andrew Patterson for assistance with quantum error correction resource estimates. We also thank Matt Ord for improvements to the resource estimation software and assistance deriving the LCU decomposition. The work presented in this paper was part funded by a grant from Innovate UK under the ‘Feasibility Studies in Quantum Computing Applications’ competition (Project Number 10074148). M.B. is a Sustaining Innovation Postdoctoral Research Associate at Astex Pharmaceuticals and thanks Astex Pharmaceuticals for funding, as well as his Astex colleague Patrick Schoepf for his support.
Author information
Authors and Affiliations
Contributions
T.N.G. derived LCU decomposition with input from A.V.I. and B.K.B. and wrote “LCU decomposition in first quantization” and “Two-body LCU derivation”. T.N.G. and A.V.I. carried out quantum chemistry calculations with input from R.I. T.N.G. carried out numerical analysisfor “Molecular orbitals. Scaling properties for basic models” and “Quantum resource estimates for iron-sulfur complex” with input from A.V.I. and R.I. M.B. and A.V.I. carried out numerics for “Scaling properties using dual plane waves” and “Resource estimates for materials”. A.V.I. and C.S. worked out the SELECT operator, C.S. derived the detailed cost of the algorithm and wrote “Block encoding. arbitrary basis sets” and “Amplitude amplification for equal superpositions”. M.B. contributed to the costing of amplitude amplification (“Amplitude amplification for equal superpositions”). R.I. wrote about the iron-sulphur complex. A.V.I. wrote the final version of the paper with input from all authors. A.V.I. conceived of and supervised the project.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Georges, T.N., Bothe, M., Sünderhauf, C. et al. Quantum simulations of chemistry in first quantization with any basis set. npj Quantum Inf 11, 55 (2025). https://doi.org/10.1038/s41534-025-00987-1
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41534-025-00987-1