Introduction

Quantum algorithms have the potential to provide speed-up over classical algorithms for some problems1,2. However, certain quantum algorithms may require non-trivial input states3,4, which in general, are challenging to prepare and may require exponentially many CX gates5 in the worst case. Quantum state preparation (QSP) is an active area of research within the space of quantum computation6,7,8. The resource overhead of general QSP problems for arbitrary states is known to be \({\mathcal{O}}({2}^{n})\) in the number of CX gates, where n is the number of qubits5,9. Within this limit, there are different techniques utilizing different resources, which can be useful depending on the context. For instance, if depth is more of a concern than space, ref. 10 provides an \({\mathcal{O}}({n}^{2})\) depth divide-and-conquer method to prepare arbitrary states at the cost of needing \({\mathcal{O}}({2}^{n})\) ancillary qubits. In ref. 11, the ancilla overhead is improved to \({\mathcal{O}}(n)\). For arbitrary QSP, the state-of-the-art deterministic protocol achieves \({\mathcal{O}}(\frac{23}{24}{2}^{n})\) CX scaling on even numbers of qubits12. The downside of this method is that it relies on the Schmidt decomposition of the n-qubit target state, a computationally expensive task, which has a classical run time of \({\mathcal{O}}({2}^{3n/2})\)13.

While arbitrary state preparation is exponential, there are states of practical interest that do not require exponential resources, even in the worst case14,15. Manual methods are strategies for QSP that use advanced knowledge of the target state to save resources in ways that might not be obvious when preparing arbitrary states. For example, GHZ states are highly entangled, but only require a linear number of CX gates. Similarly, the methods introduced in ref. 16 require \({\mathcal{O}}(kn)\) CX gates to prepare n-qubit Dicke states of Hamming weight k. For machine learning applications on classical data, quantum data loaders can prepare sparse amplitude encodings of d-dimensional real-valued vectors on d-qubits in \({\mathcal{O}}(\log d)\) circuit depth17. In ref. 18, Zhang et al. introduce a technique to prepare input states to the HHL algorithm3 based on the finite element method equations relevant to electromagnetic problems. Zhang et al.’s algorithm achieves \({\mathcal{O}}(n)\) depth scaling by exploiting the symmetry and sparsity of the relevant target states; however, these useful assumptions are not guaranteed for arbitrary states.

The state of the art ancilla free method for preparing arbitrary sparse states consisting of m 2n non-zero amplitude computational basis states is detailed in ref. 19, which produces circuits with \({\mathcal{O}}(nm)\) CX gates and runs in \({\mathcal{O}}(n{m}^{2}\log (m))\) time classically. This type of method is attractive for real-world applications of quantum computers because it can take advantage of sparsity while still remaining agnostic to particular details of the state. Other methods achieve this, but require ancillas20,21. The method using decision trees21 requires 1 ancilla qubit. The method in ref. 22 also achieves \({\mathcal{O}}(nm)\) CX gates. However, this method is based on Householder reflections and is more complicated than the one introduced in ref. 19.

In this work we show how continuous-time quantum walks (CTQWs) on dynamic graphs can be used to create arbitrary sparse (and dense) quantum states. CTQWs on graphs are a universal model of computation23 that excels at spatial searches24,25 and has applications in finance26, coherent transport on networks27, modeling transport in geological formations28, and combinatorial optimization29. In this model, the quantum state vector is evolved under the action of eiAt for some time t, where A is an adjacency matrix of some fixed unweighted graph. Such walks have been implemented natively on photonic chips30,31,32.

In 2019, CTQWs on dynamic graphs (i.e., graphs that may change as a function of time) were introduced and shown to also be universal for computation33 by implementing the universal gate set H, CX, and T.

In the original dynamic graph model, isolated vertices were propagated as singletons. However, a new model where isolated vertices are not propagated was introduced later in ref. 34. Simplification techniques were introduced that can reduce the length of the dynamic graph sequence if graphs in the sequence satisfy certain properties35. Furthermore, the authors of ref. 36 found that CTQWs on at most three dynamic graphs can be used to implement the equivalent of a universal gate set.

While prior work has focused on representing quantum gate operations as CTQWs, the opposite conversion – generating a quantum circuit from a CTQW – is less well-studied. Since CTQWs on dynamic graphs have yet to be natively implemented on hardware, and given recent progress in the development of gate-based quantum hardware37,38,39,40, converting CTQWs to the gate model is necessary to execute them on existing quantum hardware. Additionally, developing an algorithm for such a conversion offers an alternative way of thinking about circuit design, which may result in simpler circuits, compiler optimization techniques, and new ansatzes for QAOA41,42,43,44, MA-QAOA45,46,47 or VQE48,49,50, where the trainable parameters correspond to dynamic graph propagation times.

To this end, we develop an algorithm that converts CTQWs on single-edge graphs and single self-loop graphs, which can serve as a basis for arbitrary graphs, to a sequence of gates in the circuit model. This conversion algorithm is used as a foundation for a new deterministic arbitrary quantum state preparation framework, where the CX count of the resulting circuit is linear in the number of m non-zero amplitude computational basis states that comprise the target state and the number of qubits n. Figure 1 gives a general picture of the quantum walks QSP framework introduced in this paper. Our framework does not require ancillas.

Fig. 1: Alternating single-edge and self-loop quantum walks yields a framework for arbitrary QSP.
figure 1

We provide conversion methods to construct the gate-based circuit.

Our approach builds on the idea of exploiting sparsity to increase the efficiency of QSP techniques by approaching the problem from the perspective of dynamic quantum walks. As such, our method is best suited for the asymptotically sparse states (\(m={\mathcal{O}}(poly(n))\)), but can, in principle, also work for the asymptotically dense states (\(m={\mathcal{O}}({2}^{n})\)). The walk framework that we present here is flexible, scalable, and intuitive, building upon well-established graph-based algorithms to efficiently prepare sparse quantum states. We also show that the state of the art ancilla free merging states method of ref. 19 is encompassed in our CTQW state preparation framework.

A dynamic graph is defined as a sequence of ordered pairs \({\{({G}_{i},{t}_{i})\}}_{i = 1}^{\ell }\) for some \(\ell \in {\mathbb{N}}\) that consists of unweighted graphs Gi and corresponding propagation times ti. A CTQW on a dynamic graph is then a CTQW where the first walk is performed on graph G1 for time t1, followed by a walk on graph G2 for time t2, and so on until all graphs in the sequence have been walked upon. The final state \(\left\vert {\psi }_{\ell }\right\rangle\) is then given by

$$\left\vert {\psi }_{\ell }\right\rangle ={e}^{-i{A}_{\ell }{t}_{\ell }}\ldots {e}^{-i{A}_{2}{t}_{2}}{e}^{-i{A}_{1}{t}_{1}}\left\vert {\psi }_{0}\right\rangle .$$
(1)

where Ai is the adjacency matrix for graph Gi, that is a 0-1 valued symmetric matrix. For simplicity, we assume that each graph has 2n vertices for some \(n\in {\mathbb{N}}\), and each vertex represents a computational basis state.

As an example, consider the dynamic graph found in Fig. 2, where the adjacency matrices are

$$\begin{array}{rcl}{A}_{1}&=&\left(\begin{array}{cccc}0&0&0&0\\ 0&0&0&0\\ 0&0&0&1\\ 0&0&1&0\end{array}\right)\\ {A}_{2}&=&\left(\begin{array}{cccc}0&0&0&0\\ 0&0&0&0\\ 0&0&1&0\\ 0&0&0&1\end{array}\right)\end{array}$$
Fig. 2
figure 2

Dynamic CTQW implementation of CX gate.

If the walker starts in the initial state \(\left\vert {\psi }_{0}\right\rangle ={(a,b,c,d)}^{T}\), then the final state of this walker is

$$\left\vert {\psi }_{2}\right\rangle =\left(\begin{array}{cccc}1&0&0&0\\ 0&1&0&0\\ 0&0&i&0\\ 0&0&0&i\end{array}\right)\left(\begin{array}{cccc}1&0&0&0\\ 0&1&0&0\\ 0&0&0&-i\\ 0&0&-i&0\end{array}\right)\left(\begin{array}{c}a\\ b\\ c\\ d\end{array}\right)=\left(\begin{array}{c}a\\ b\\ d\\ c\end{array}\right)$$

which is equivalent to a CX gate in the circuit model.

Interestingly, CTQWs on dynamic graphs that correspond to operations in the gate model tend to have a similar form: some phase is added to particular basis states via self-loops on vertices of the graph, then they are connected by edges to allow for state transfer, followed by additional self-loops to eliminate unwanted phase. This general alternating sequence of graphs inspires the state preparation algorithm introduced later in this work.

This paper is organized as follows. In “Results”, we present an algorithm for converting single-edge and single self-loop CTQWs to the circuit model and give a general overview of our deterministic state preparation method. Next, we compare various walk orders to the sparse state preparation method from ref. 19 and to the uniformly controlled rotation method5,9 used by Qiskit. In both cases, we found walk orders that require fewer CX gates. Then we discuss the impact of the results and directions for future work in “Dscussion”. Finally, combinatorics methods for selecting walking orders that reduce the number of controlled gates in the QSP circuits and additional optimizations are presented in “Methods”.

Results

A standard basis for any adjacency matrix is given by the set of adjacency matrices corresponding to all single-edge walks and all single self-loop walks. We introduce the conversion methods that can construct the gate-based representations for these walks on n qubits. Throughout this paper, the considered states are represented in the standard computational basis set.

First, we introduce the conversion from single-edge CTQWs to the circuit model. The adjacency matrix for a single-edge graph connecting basis states \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) is given by

$$A(j,k)=\left\vert j\right\rangle \left\langle k\right\vert +\left\vert k\right\rangle \left\langle j\right\vert$$
(2)

and the corresponding CTQW for time t is

$$\begin{array}{ll}U(j,k;t)\,=\,{e}^{-itA(j,k)}=\cos (t)(\left\vert j\right\rangle \left\langle j\right\vert +\left\vert k\right\rangle \left\langle k\right\vert )\\\qquad\qquad\,\,-i\sin (t)(\left\vert j\right\rangle \left\langle k\right\vert +\left\vert k\right\rangle \left\langle j\right\vert )+\sum _{l\notin \{j,k\}}\left\vert l\right\rangle \left\langle l\right\vert \end{array}$$
(3)

This unitary transfers amplitude between states \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) and leaves all other states untouched.

If \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) differ in exactly one bit at position l, U(j, k; t) is the (n − 1)-controlled Rx(2t) gate

$$\,\text{Rx}\,(2t)=\left(\begin{array}{cc}\cos (t)&-i\sin (t)\\ -i\sin (t)&\cos (t)\end{array}\right)$$
(4)

where the target qubit is l, and the remaining qubits are either 0- or 1-controls corresponding to the remaining bits of \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\).

If the Hamming distance between \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) is greater than 1, using the CRx(2t) gate is still possible, but requires some extra overhead. First apply a sequence of CX gates where the control is any bit in which \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) differ, and the targets are the other bits in which \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) differ. Then apply the CRx(2t) gate as before and apply the reverse of the conjugating CX sequence. An example conversion of a single-edge walk to gates is shown in Fig. 3. The first CX gate transforms the two basis states such that their Hamming distance becomes equal to 1. The CRx creates the superposition, and the final CX restores the original basis states.

Fig. 3
figure 3

When a single-edge walk is between states with non-unit Hamming distance from each other, CX gates might be required in addition to the CRx gate.

The (n − 1)-controlled CRx gate is a multi-controlled special unitary, and can be decomposed in \({\mathcal{O}}(n)\) CX and single-qubit gates51.

Next, we introduce the conversion from single self-loop CTQWs to the circuit model. A single self-loop walk on the basis state \(\left\vert j\right\rangle\) is given by the adjacency matrix

$$A(j)=\left\vert j\right\rangle \left\langle j\right\vert ,$$
(5)

The corresponding CTQW for time t is

$$U(j;t)={e}^{-itA(j)}={e}^{-it}\left\vert j\right\rangle \left\langle j\right\vert +\sum _{k\ne j}\left\vert k\right\rangle \left\langle k\right\vert ,$$
(6)

which is a diagonal matrix with a single non-unit element on the diagonal.

From this equation, one can see that the self-loop walk on a single vertex is equivalent to adding phase to exactly one computational basis state. This operation corresponds to an application of an (n − 1)-controlled phase gate P for time − t

$$\,\text{P}\,(t)=\left(\begin{array}{cc}1&0\\ 0&{e}^{it}\end{array}\right),$$
(7)

where the target qubit can be arbitrarily chosen among the bits of j that are equal to 1 (or 0, with the appropriate X conjugation), and the remaining qubits are either 0- or 1-controls corresponding to the remaining bits of \(\left\vert j\right\rangle\).

However, the P(t) gate is not a special unitary. As such, without using ancilla qubits, its decomposition requires \({\mathcal{O}}({n}^{2})\) CX gates52. A more efficient (but also more nuanced) approach is to use an (n − 1)-controlled Rz(2t) gate

$$\,\text{Rz}\,(2t)=\left(\begin{array}{cc}{e}^{-it}&0\\ 0&{e}^{it}\end{array}\right)$$
(8)

which is a special unitary and can be decomposed in \({\mathcal{O}}(n)\) CX gates51.

At a first glance, an (n − 1)-controlled Rz gate affects two adjacent computational basis states, which makes it not equivalent to a self-loop walk in general. However, if only one of these two basis states exists in the state affected by the CRz gate, then it becomes essentially equivalent to the CP gate and can also be used to implement a self-loop walk.

As long as the state we apply the walk to (\(\left\vert {\psi }_{0}\right\rangle\) in Eq. (1)) has at least one zero-amplitude basis state \(\left\vert z\right\rangle\), not necessarily adjacent to \(\left\vert j\right\rangle\), one can use the same CX conjugation technique as described in the previous section for the Rx gate. This enables the interaction between \(\left\vert j\right\rangle\) and \(\left\vert z\right\rangle\) via the CRz gate and implements the self-loop walk on \(\left\vert j\right\rangle\).

In the case when \(\left\vert {\psi }_{0}\right\rangle\) is fully dense, i.e., all 2n basis states have non-zero amplitudes, the above method will not work and a CP gate will have to be used instead.

As it was mentioned in the Introduction, CTQWs on dynamic graphs are known to be universal33. However, the authors of ref. 33 use walks on arbitrary graphs in their proof of universality. In this section, we prove that using only single-edge and single self-loop walks is sufficient to decompose an arbitrary unitary.

Proposition 1

An arbitrary d × d unitary can be decomposed into a series of single self-loop and single-edge CTQWs.

Proof

The case of d = 1 is trivial so we consider d ≥ 2. An arbitrary unitary can be decomposed into a series of 2-level unitaries (unitaries that act non-trivially on two or fewer basis states, pages 189–191 of ref. 53).

Similarly to how an arbitrary single-qubit unitary can be decomposed as U = W(α)Rz(β)Rx(γ)Rz(δ) (page 175 of ref. 53), where W = eiαI, an arbitrary 2-level unitary U can be decomposed as

$$U={\text{W}}^{{\prime} }(\alpha ){\text{Rz}}^{{\prime} }(\beta ){\text{Rx}}^{{\prime} }(\gamma ){\text{Rz}}^{{\prime} }(\delta )$$
(9)

for some real numbers α, β, γ, δ, where the W′, Rz′ and Rx′ are 2-level unitaries whose action in the corresponding 2D subspace is equivalent to their single-qubit counterparts.

As it was shown in the introduced conversions from CTQWs to the circuit model, Eqs. (3) and (6), a single-edge CTQW corresponds to Rx′, and a sequence of two single self-loop CTQWs corresponds to W′ and Rz′. Thus, an arbitrary unitary can be decomposed into a series of single self-loop CTQWs and single-edge CTQWs.□

In this section, we describe how a series of CTQWs on dynamic graphs can be used to prepare an arbitrary quantum state.

Given the task of preparing a quantum state with m < 2n non-zero amplitudes on n qubits, we start by presenting a high-level overview of the framework to accomplish this task:

  1. 1.

    Construct a tree graph connecting the non-zero amplitudes of the target state in some way.

  2. 2.

    Starting from some node, and following a graph traversal order (see “Methods”), perform a self-loop walk for each encountered node and a single-edge quantum walk for each encountered edge.

  3. 3.

    Convert the sequence of the quantum walks to the circuit representation (which may include control reduction and CX forward- and backward-propagation, as described below).

Many different heuristics can be used to construct the sequence of walks in step 1 or choose a particular traversal order in step 2. Initially, the system starts with all amplitude in the root state, which can be constructed from the ground \({\left\vert 0\right\rangle }^{\otimes n}\) state by applying the X gates to the necessary qubits.

An arbitrary amount of amplitude (absolute value) from the source can be transferred to the destination for any pair of the connected states corresponding to each single-edge walk, and all non-zero amplitude states are connected transitively via a tree. Therefore, an arbitrary distribution of absolute values of amplitude (but not phases) can be established on the set of non-zero amplitude basis states via single-edge walks.

Each self-loop walk on a basis state \(\left\vert {z}_{i}\right\rangle\) establishes the correct phase of the corresponding coefficient ci, but does not change its absolute value of it. Therefore, performing self-loop walks on each non-zero amplitude basis state establishes the correct phases of the corresponding coefficients, without interfering with the absolute values established by the single-edge walk. Hence, an arbitrary quantum state can be prepared with a sequence of single-edge walks and self-loop walks, as described in the algorithm above.

In the context of state preparation, CRz gates can easily be used instead of CP gates to implement the self-loop walks, since every single-edge walk transfers the amplitude to a new basis state with zero amplitude. Thus, as described in the previous section, the same basis state can be used to establish the correct phase on the source state for the single-edge walk without additional CX conjugations. However, this will not apply to the leaf nodes (degree one vertices) in the tree, which will require another zero-amplitude state to interact with.

The tree contains \({\mathcal{O}}(m)\) nodes and edges, therefore, \({\mathcal{O}}(m)\) walks will be required. As described in the previous section, each walk is represented by a single multi-controlled gate (CRz or CRx) that has up to n − 1 controls. Each such gate can be implemented in \({\mathcal{O}}(n)\) CX gates51 and may need to be conjugated by \({\mathcal{O}}(n)\) additional CX gates due to the Hamming distance between the adjacent basis states. Therefore, the overall complexity of the algorithm, in terms of the required number of CX gates, is \({\mathcal{O}}(nm)\).

Next, we discuss the advantages of this framework. The CTQW approach to state preparation allows us to examine the problem from a graph-based point of view. This is a powerful perspective, as it enables us to leverage graph topology, such as connectivity patterns captured by algorithms like the minimum spanning tree, and enforce specific tree structures that can be helpful for developing new state preparation heuristics. For example, we can enforce a linear topology (see Fig. 5) or a star-shaped topology. Moreover, the state-of-the-art ancilla-free sparse state preparation method, referred to as Merging States (MS) method in ref. 19 fits in the CTQW framework we present in this paper. From examination, we determined that MS typically corresponds to star-shaped graphs.

Next, we show that the merging states method forms a tree graph representing a CTQW over the basis states of the target state with non-zero amplitude. Since MS eventually results in a single merged state, it must form a tree graph where nodes that are associated with the states being merged all feed into a single node. Thus, the MS is analogous to our splitting method described above, but run in reverse. When performing the merging or splitting, the basis states must be brought within a Hamming distance of 1. After merging or splitting of two basis states, the protocol can proceed in the rotated frame or be restored to the original basis. In MS, the protocol proceeds in the rotated frame. An example of MS represented in the CTQW framework is provided in Fig. 4. Second, the merging states method relies on a controlled-G gate for merging. This simply corresponds to a particular sequence of CTQWs, given by the pattern of single self-loop, single-edge, and single self-loop.

Fig. 4: Given a set of bit strings over which to prepare the superposition state, this walk-based family of methods applies a heuristic for choosing an ordering of the basis states and then applies a sequence of merging (or equivalently splitting) operations to reduce (expand) an initial state to a desired endpoint.
figure 4

Each merge (split) operation can be interpreted as a sequence of quantum walks. Ref. 19 uses the controlled G Gate as the primary merging operation, but this gate is equivalent to the action of a controlled P, controlled Rx, and controlled P gate, which are the CTQWs in this work.

The following section demonstrates that, while the asymptotic scaling is equivalent, the level of sparsity in the state to be prepared can play a large role in the best heuristic to choose. Indeed, there exists the possibility of mixing and matching heuristics during the ordering process to get the best performance across many situations. We leave such an investigation to future work.

In this section, we present the comparison between the results of numerical simulations obtained for different walk orders of our single-edge state preparation method, the Merging States (MS) method introduced in ref. 19, and Qiskit’s built-in state preparation method based on uniformly controlled rotations ref. 5,9. The error bars in the figures denote the 95% confidence interval for the estimate of the mean.

First, we compare the performance of different walk orders presented in “Methods”, which is shown in Fig. 5. The greedy methods use the initial ordering specified in parentheses. For the “Greedy Combined” results, we consider both Greedy and the method in parentheses and choose the one with the lower CX count.

Fig. 5: Comparison of walk orders with m = n.
figure 5

Greedy(MHS) (Greedy(Sorted)) uses MHS Linear (Sorted) as the initial ordering. Each data point is averaged over 1000 random states.

As seen from the figure, all considered walking orders have similar performance, with Random, Sorted, SHP, and MST requiring more controlled gates than the others. MHS Linear and Nonlinear outperform Greedy(Sorted) on average except at 5 qubits where only MHS Linear beats Greedy. As expected, Greedy(MHS) Combined performs the best on average.

In Fig. 6a and b, we compare the performance of the best walk order heuristic we found (MHS Nonlinear) to that of the Merging States method and Qiskit’s built-in method (which is based on ref. 9). Just as in the previous section, an average CX gate count for all methods is calculated for a set of 1000 randomly generated sparse states (m = n and m = n2 non-zero amplitude basis states) for every value of n from 5 up to 11.

Fig. 6
figure 6

CX count comparison between the single-edge method with the best walk order (MHS Nonlinear) presented in this paper, the state-of-the-art sparse state preparation method (Merging States), and Qiskit’s built-in state preparation method for (a) m = n and (b) m = n2. Each data point is averaged over 1000 randomly generated states.

As can be seen from Fig. 6a and b, Qiskit’s performance scales exponentially and is very similar regardless of the value of m. In fact, for m = n2 and m = 2n−1, Qiskit produces circuits with exactly the same number of CX gates regardless of the state being prepared.

In contrast to Qiskit, our method takes advantage of the sparsity of the target state and is expected to outperform Qiskit’s built-in method for any asymptotically sparse state (i.e., \(m={\mathcal{O}}(poly(n))\)) for sufficiently large values of n. Our method also outperforms MS for the considered cases, but our walk order heuristic seems to excel only in the linear case. From numerical examination, it appears that for m = n the gap increases, but for the m = n2 case, a crossover happens at n = 13. All the discussed CTQWs (including MS) have the same asymptotic scaling.

The exact code and data for these numerical experiments can be found in the linked repository (see Data Availability).

Discussion

In this work, we developed an algorithm that can be used to convert CTQWs on dynamic graphs that consist only of self-loops or single edges to the quantum gate model. This algorithm serves as the basis for a deterministic state preparation framework that has complexity of \({\mathcal{O}}(nm)\) where n is the number of qubits and m is the number of basis states in the target state. Our framework encompasses MS19. Furthermore, we introduce multiple methods that can reduce the CX count of the state preparation algorithm. We test various walk orders against the MS method and uniformly controlled rotation method9. We find that our MHS Nonlinear heuristic excels and requires fewer CX gates than the MS method when the target state has a linear number of non-zero amplitudes.

The single-edge framework we present here posts many interesting possible avenues for future investigation. As mentioned in “Methods”, the gate count for preparing an arbitrary state can be reduced depending on the order in which the state transfer occurs. Without control reduction, the optimal sequence of walks is given by the minimum spanning tree approach. Determining the optimal sequence of basis state transfers in the presence of control reduction is more complicated and could be an interesting direction for future research. Additionally, if a target state is comprised of basis states with symmetry, it may be possible to cleverly reduce the number of gates required to create the target state by exploiting symmetry.

Methods

One of the most powerful optimization methods we utilize is control reduction. The number of CX gates necessary to implement the multi-controlled Rz and Rx gates is proportional to the number of controls on those gates. As mentioned earlier, we might need up to n − 1 controls in the worst case, but depending on the walk that we want to implement and the basis states that we have already visited, we might need less than that. For example, when we perform the first walk from the root, we never need any controls, since there are no other basis states that would be affected by the Rx gate.

More generally,

Proposition 2

Let n denote the number of qubits and \(S=\{\left\vert {z}_{1}\right\rangle ,\left\vert {z}_{2}\right\rangle ,...,\left\vert {z}_{k}\right\rangle \}\) denote the set of visited nodes. Suppose we wish to perform a single-edge walk from \(\vert {z}_{j}\rangle \in S\) to \(\vert {z}_{\ell }\rangle \notin S\) and \(\vert {z}_{j}\rangle\) and \(\left\vert {z}_{\ell }\right\rangle\) have a Hamming distance of 1. Let b denote the qubit where \(\vert {z}_{j}\rangle\) is different from \(\left\vert {z}_{\ell }\right\rangle\), i.e., zj[b] ≠ z[b]. Let D = {{kzi[k] ≠ zj[k], kb}zi Szj} denote the set of differing bits of the visited nodes with zj excluding the qubit b. For the gate-based representation of a single-edge walk from basis state \(\vert {z}_{j}\rangle\) to \(\left\vert {z}_{\ell }\right\rangle\), it is sufficient to control the CRz or CRx gate on any hitting set of D, where the values of controls are equal to the corresponding bits of zj. A hitting set of D is a collection of elements h such that \(h\cap {d}_{i}\,\ne\, {{\emptyset}}\) for all di D.

Proof. Adding a control on an arbitrary qubit c to any gate makes the gate act only on the states conforming to the value of that control, i.e., \(\left\vert {z}_{k}\right\rangle\) such that zk[c] = 0 or 1, depending on the state of the control. By the definition of di D, controlling the CRz or CRx gate on any index e di with the value of control equal to zj[e] will ensure that the gate does not act on zi. Thus, controlling on any hitting set of D will ensure that the gate does not act on all zi S, except zj.□

When zj and z have a Hamming distance greater than 1, we first have to update \(S\to \tilde{S}\) with the conjugating CX gates, as described in “Results”. Then we apply the previously described control reduction method to \(\tilde{S}\). Note that in this case, any of the differing qubits can be used as a target for the CRz or CRx gates, with the appropriate CX conjugation. However, the target qubit cannot be chosen as a control; therefore, the choice of target affects the efficiency of control reduction. In the following sections, when a target qubit is not explicitly dictated by a particular walk order, we consider all options for the target and choose the one that results in the minimum number of controls.

In order to minimize the number of controls, one needs to minimize the size of the hitting set, i.e., solve the minimum hitting set (MHS) problem, which is known to be NP-complete. As such, no polynomial algorithm is known to solve it exactly, but a number of heuristics exist that can provide good suboptimal solutions quickly54,55. In practice, solving this problem for each walk drastically reduces the number of control qubits on many gates in the sequence (see Fig. 7). The MHS problems that we consider in the following sections are solved approximately.

Fig. 7: The average number of CX gates necessary to implement a random state with n qubits, with and without control reduction.
figure 7

Each state consists of m = n non-zero amplitudes, and each point is the average over 1000 random states.

Example

Consider the sequence of walks [[001, 111], [111, 110]]. Suppose we have implemented the first walk and are now implementing the second walk. The second walk transfers amplitude from 111 to 110. The current state is a superposition of \(\left\vert 001\right\rangle\) and \(\left\vert 111\right\rangle\). Thus, we want the walk to affect only the amplitude of basis state 111 and leave 001 alone. Therefore, we need the controls for the Rx gate to be satisfied only by 111. Controlling the CRx gate on qubit 0 accomplishes this.

Next, as described in “Results”, single-edge quantum walks between states with Hamming distance greater than one require conjugation with CX gates on both sides of the CRx gate. The CX gates on the left side bring the desired basis states to a Hamming distance of 1, where they interact via the CRx gate, and the ones on the right move the basis states back to their original positions. Rather than immediately moving the states back after the application of the CRx gate, one can instead keep working with the transformed basis and move the right branch of the conjugating CX gates to the end of the circuit, which allows many of those CX gates from different single-edge walks to cancel. This is called forward-propagation of CX gates. The backward-propagation of CX gates is the same method applied to the reverse of the circuit. The advantage of the backward-propagation is that the CX gates are moved to the beginning of the circuit. Since the initial state is the ground state, these CX gates act trivially and can be removed.

Example

Consider the sequence of walks [[001, 111], [011, 110]]. The first walk [001, 111] has a Hamming distance of 2. For target qubit 1, we require CX10 to bring the states to a Hamming distance of 1. Applying this to the basis states, we get [001, 011], [111, 010]. After applying the multicontrol gate(s) for the first walk, we do not rotate the basis states back to the original frame, but instead keep track of the CX rotation, which will be applied at the end of the circuit. This is an example of CX forward-propagation.

Next, we discuss leaf node considerations. The phase on the destination node is adjusted once we consider the first walk out of that node. However, when the destination node is a terminal (leaf) node, there are no other walks that start from it, and we still need to adjust its phase.

One way to solve this issue is to use a separate CRz gate, but it adds another multi-controlled gate to the circuit and the number of controls for it is likely to be greater compared to the walk into the leaf node due to the fact that we will need to distinguish the leaf from a larger number of non-zero amplitude states and control reduction will be less efficient.

An alternative approach is to combine this last CRz gate with the CRz and CRx gates from the walk leading into the leaf, as one gate that simultaneously transfers the probability amplitude and adjusts the phase on both ends of the walk. This reduces the number of multi-controlled operations from 3 to 1, but the downside of this approach is that the resulting gate will not belong to the SU(2) group. As such, the number of CX gates in its decomposition will be proportional to \({\mathcal{O}}({n}^{2})\), instead of \({\mathcal{O}}(n)\) necessary for the regular SU(2) gates, such as CRz or CRx.56 Asymptotically, it makes this approach less efficient for sufficiently large values of n, but for our dataset, it is still favorable.

Any tree constructed in the first step is, in principle, sufficient for state preparation, but not all trees result in the same CX gate count. Moreover, different walk orders on the same tree can result in a different number of CX gates as well. In this section, we consider some of the options for tree construction and walk order.

We first describe heuristics that typically perform best on average.

MHS Nonlinear

As discussed previously, control reduction is a powerful technique that allows to significantly simplify the state preparation circuit. The efficiency of it depends on how easily a particular state can be separated from the rest of the currently existing states with non-zero amplitudes, which in turn depends on the local density of states around the state that we want to separate. The basic idea of the following walk order heuristic is that to maximize the efficiency of control reduction, we need to walk through hard-to-separate states first, before the density of states around them increases, which allows us to separate them using a smaller number of controls. Since the target qubit has a significant effect on the CX count, it is determined during the walk order construction as well.

More specifically, we construct a sequence of walks in reverse time order, from last to first. Given a list of basis states, we choose the state that requires the smallest number of controls to differentiate it from the rest of the basis states by solving the Minimum Hitting Set (MHS) problem. When two states are equally easy to distinguish, we choose the state with the largest number of bits that are different from the rest. Let us call this state z1. The target qubit t is the MHS bit with the smallest frequency. The set differentiated by t is then ordered by the easiest element to distinguish via the MHS. When two elements are equally easy to distinguish, we order them by Hamming distance from z1. The easiest element is set as z2. The (z1, z2, t) that requires the fewest controls is added to the path. The number of controls required is upper bounded by the size of the MHS for z1 plus the size of the MHS for z2.

All the elements are then updated by the back-propagation of CX gates. z2 is removed from the list, and the process is repeated until every node is visited by the path. The circuit is constructed from the path of CTQWs in terms of the original basis states and target qubits and with CX backward-propagation (see Algorithm 2 for pseudocode).

MHS Linear

This protocol follows the same procedure as MHS Nonlinear except that in each step, we assign z1 of the previous pair as the z2 for the next pair. This results in a linear graph. In contrast, in MHS Nonlinear z2 of the next pair is chosen using the MHS methods.

Greedy insertion

The linear greedy algorithm takes an ordered list of basis states and sequentially finds the best position to insert the next basis state. The classical complexity of this algorithm is \({\mathcal{O}}(n{m}^{4}+C)\), where C is the classical complexity of the method that generates the initial ordering. Typically, the greedy insertion performs better when given an initial ordering that already performs well (see Alg. 3 for pseudocode).

The following are other orders we considered that on average, typically do not outperform state-of-the-art methods.

Minimum Spanning Tree

The most straightforward option to minimize the Hamming distance between the connected basis states is to build a Minimum Spanning Tree (MST) over the complete graph of target basis states, where each edge is weighted by the Hamming distance between its endpoints.

The cost of calculating all pairwise Hamming distances is \({\mathcal{O}}(n{m}^{2})\), while the MST itself can be built in \({\mathcal{O}}({m}^{2})\) with Prim’s algorithm. Thus, the overall classical complexity for this method is \({\mathcal{O}}(n{m}^{2})\).

In the case when the target state is fully dense, i.e., m = 2n, this method can be simplified, since the tree can be automatically built without calculating all pairwise distances by branching off sequentially in each dimension of the hypercube (see Fig. 8b). The same method can also be used for generally dense states, i.e., when \(m={\mathcal{O}}({2}^{n})\). In this case, the resulting tree will go through some zero-amplitude states, which is not optimal for the circuit, but allows to save classical computational resources.

Fig. 8: Example walks generated by different methods for the case of 3 qubits.
figure 8

The weight of an edge is the Hamming distance of the two nodes. A potential walk path chosen by a given method is shown in red. In Fig. 8a and b, the costs of the edges are suppressed except for the path. a MST for a sparse state. b MST for a dense state. c SHP for a sparse state. d SHP for a dense state.

Shortest Hamiltonian path

The MST approach minimizes the total Hamming distance for the walks, which corresponds to the optimal walk order if control reduction is not taken into account. However, when control reduction is applied, other walk orders may be more efficient since they may allow for additional control reduction.

Therefore, instead of MST, one could find the Shortest Hamiltonian Path (SHP) in the same graph of the target basis states as used to build the MST. This option is more computationally expensive, since the cost of finding SHP exactly is \({\mathcal{O}}(m!)\). However, approximate SHP heuristics can find suboptimal solutions much faster57.

Similar to MST, when m = 2n the procedure can be simplified by building a Hamiltonian path throughout the whole hypercube of basis states without calculating all pairwise distances. This path is the known as the Gray code. Examples of the quantum walks produced by SHP for the cases of sparse and dense states are shown in Fig. 8.

Sorted order

Another method to build a simple linear path without solving SHP is to connect the basis states sequentially in increasing order. This method is less optimal than SHP, but it is computationally inexpensive since the target basis states can be sorted in \({\mathcal{O}}(nm\log (m))\), and can provide us with a reasonably good walking order. For example, for a fully dense state, the average Hamming distance given by the sorted walk order is 2, which is much better than n/2 achieved by a random walk order.

A comparison of the different walking orders presented in this section for the case of m = n is shown in Fig. 5. We empirically validate the quantum walk order methods and find that on the prepared states, the MHS methods and greedy methods require the fewest CX gates. Greedy(MHS) Combined is the greedy method on the initial ordering of MHS Linear, and the better circuit between the two methods is used.

Lastly, a single edge between states \(\left\vert j\right\rangle\) and \(\left\vert k\right\rangle\) corresponds to a CRx gate, which introduces an imaginary phase on \(\left\vert k\right\rangle\). However, for the purpose of preparing a state with some real-valued amplitudes, a better approach is to use a CRy gate instead, which has the same action as CRx, except it does not introduce a complex phase. CRy keeps the amplitudes real and makes it unnecessary to use CRz or CP gates when the amplitude is transferred between the basis states with real coefficients. As such, for these states, one might want to use a special walk order that goes through all real-valued basis states before moving to the complex-valued ones. For the states where all amplitudes are real, this approach can reduce the total number of CX gates in the circuit by a factor of approximately 2. We plan to implement this in our code in the future.

Algorithm 1

Conversion

1: Convert from CTQWs to a circuit.

2: Input:

3: walks: The set of walks.

4: state: The target state.

5: Output:

6: The circuit that implements the quantum walks.

7:

8: walks ← Reverse of the walks.

9: circ ← Initial empty quantum circuit.

10: for z1, z2, target, pangle, xangle in walks do

11: if target is not specified then

12:   target ← {The qubit index that results in the smallest number of controls.}

13: end if

14:  {Apply the CX gates required to bring z1 and z2 to a Hamming distance of 1.}

15:  Determine mhs for z1, z2, target.

16:  if z2 is a leaf node then

17:  {Apply controlled U Gate to circ (corresponds to the phase, Rx, phase gate pattern described in the leaf node discussion in “Methods”.)}

18: else

19:   Apply controlled SU(2) Gate to circ.

20: end if

21:  Update the walks with the CX gates.

22: end for

23: return inverse of circ.

Algorithm 2

MHS Nonlinear

1: Input:

2: basis: A set of basis states to walk through.

3: Output:

4: A sequence of single-edge quantum walks [[z1, z2, target]i] through the provided basis states that minimizes the number of controls needed to implement the corresponding multi-controlled operations.

5:

6: path ← empty.

7: mutable_basis ← copy basis.

8: while size of basis > 1 do

9:  all_mhs_z1 ← {MHS for every element in mutable_basis.}

10: all_diffs ← {for every element zi in mutable_basis get the subset ki = mutable_basis/zi and construct a list of list of the bits that differ with zi.}

11: all_targets ← {For every element mhs in all_mhs_z1 get the smallest frequency index in the corresponding all_diffs[mhs]}.

12: all_basis_z2 ← {A list of list. For every element zi in mutable_basis get the subset ki mutable_basis/zi such that ki is only covered by the hitting set all_mhs_z1[zi] with the corresponding target all_targets[zi].}

13: all_mhs_z2 ← {A list of list of list. For every element basisi in all_basis_z2 get the MHS for each element in basisi.}

14   {Append to path[z1, z2, target] that corresponds to the smallest size(all_mhs_z1[z1]) + size(all_mhs_z2[z1][z2]). If pairs have the same size, use the one with the largest all_diffs[z1]). Note that z1 and z2 are in terms of the original basis.}

15:   Remove z2 from basis.

16:   Remove z2 from mutable_basis.

17:   Update mutable_basis with CX gates.

18: end while

19: return the inverse of path.

Algorithm 3

Greedy Insertion

1: The walks considered here all form linear graphs. Greedy finds the best position to insert the next basis state into the path sequence.

2: Input:

3: basis: Basis in some order.

4: Output:

5: The best circuit found.

6:

7: path ← Empty list

8: for z in basisdo

9: cxcount ← None.

10: curr_path ← None.

11: for idx in (length(path)+1) do

12:   temp_path ← copy path.

13:   Insert z in temp_path at idx.

14:   temp_cxcount ← CX count for constructed circuit with temp_path.

15:   if cxcount is None or temp_cxcount < cxcount then

16:    curr_pathtemp_path.

17:    cxcount ← temp_cxcount

18:   end if

19: end for

20: pathcurr_path.

21: end for

22: return circuit for path.