Abstract
The quantum Fourier transform (QFT) is a fundamental component in various quantum algorithms, including Shor’s factoring algorithm and the Harrow-Hassidim-Lloyd (HHL) algorithm for solving systems of linear equations. Efficient implementation of the QFT is essential for the practical realization of large-scale quantum algorithms, especially in fault-tolerant quantum computing. In fault-tolerant implementations, the Clifford + T gate library is the standard choice for building quantum circuits. As the most resource-intensive component within this framework, the T gate’s associated cost poses a significant challenge to the efficient implementation of the QFT and its dependent algorithms. While approximate QFT (AQFT) circuits reduce this cost, state-of-the-art implementations still require a T-count of \(8n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))\) and a T-depth of \(n{\text{log}}_{2}(n/\varepsilon )+O(n)\). Although these results represent a notable achievement, the associated resource cost remains a primary bottleneck for practical, large-scale quantum algorithms, motivating further optimization. To address this bottleneck, this paper introduces two novel \(n\)-qubit AQFT circuits with an approximation error of \(O(\varepsilon )\). Our first design, AQFT Circuit 1, halves the T-count to \(4n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))\) by constructing inverse phase gradient transformation (PGT) circuits without using additional non-Clifford gates and by implementing the inverse PGTs using quantum adders. Our second design, AQFT Circuit 2, reduces the T-depth to \(\frac{1}{2}n{\text{log}}_{2}(n/\varepsilon )+O(n)\) through parallelization of the inverse PGTs that add only \(O(n)\) additional T gates. For both AQFT circuits, the state-of-the-art linear-depth quantum adder is employed. 
We demonstrate that employing the linear-depth quantum adder provides advantages over the currently known logarithmic-depth quantum adder, not only in terms of T-count but also in T-depth optimization for the AQFT, particularly within the range \(3<n/\varepsilon <{10}^{13}\), which encompasses practical system sizes.
Introduction
The quantum Fourier transform (QFT) is considered one of the most versatile components in quantum algorithms. It plays a key role in fundamental quantum algorithms such as quantum phase estimation1, algorithms for hidden subgroup problems including Shor’s factoring algorithm2, the Harrow-Hassidim-Lloyd (HHL) algorithm for solving systems of linear equations3, and quantum amplitude estimation4, to name a few. The applications of QFT span various fields, including basic arithmetic operations5,6,7,8, cryptography9,10,11,12,13,14, signal processing15,16, quantum simulation17,18,19,20, quantum machine learning21,22,23,24,25, quantum finance26,27,28, and computational fluid dynamics29,30,31. Improving the efficiency of QFT implementations is therefore indispensable for enhancing the performance of various quantum algorithms and expanding their applicability.
Most quantum algorithms are designed based on the quantum circuit model, which describes a computation as a sequence of discrete quantum gates applied to qubits. Therefore, the efficient execution of these algorithms directly depends on the efficiency of their underlying circuits. This necessity has driven significant research into optimizing fundamental building blocks, particularly for resource-intensive tasks like quantum arithmetic32,33,34,35,36,37,38.
Large-scale quantum algorithms that utilize the QFT, such as Shor’s factoring algorithm2 and the HHL algorithm3, should be implemented in a fault-tolerant manner due to the fragility of quantum information. In fault-tolerant quantum computations, circuits are synthesized using universal and fault-tolerant gates. The Clifford + T gate library is generally chosen for this purpose in various promising error correction codes39,40. Within these fault-tolerance approaches, Clifford gates are easier to implement, often transversally. In contrast, T gates require more resource-intensive methods, such as magic state distillation41,42, and thus dominate the implementation cost43,44. Consequently, the T-gate cost, typically quantified by the number of T gates (T-count) and the depth of T gates (T-depth), has become a primary bottleneck for the efficient execution of large-scale quantum algorithms.
In fault-tolerant quantum computing, QFT is approximately implemented with an acceptable error, known as approximate quantum Fourier transform (AQFT). Typically, AQFT is implemented by removing all controlled rotation gates with angles below a certain threshold value. It has been demonstrated that, for effectively implementing various quantum algorithms utilizing QFT, applying AQFT instead of a full QFT often yields satisfactory results without significant performance penalties45,46,47,48. For instance, using AQFT with a threshold value of \(\pi /2^{8}\) is sufficient for factoring RSA-2048 with a \(95\%\) success rate48.
The basic implementation method for an \(n\)-qubit AQFT involves approximating the \(n\)-qubit QFT by omitting small-angle controlled rotations and decomposing the remaining controlled rotation gates into the Clifford + T gate library. This approach reduces the T-count for QFT implementation from \(O({n}^{2}\text{log}n)\) to \(O(n{\text{log}}^{2}n)\), where the dependence on the approximation error is omitted for brevity. For the semiclassical version of the AQFT49, which must be followed by measurement and therefore cannot be applied in the middle of a computation, it has been demonstrated that AQFT can be implemented with a T-count of \(O(n\text{log}n)\)50. In the case of fully coherent AQFT, Nam et al.51 achieved a T-count of \(O(n\text{log}n)\) using a method originally reported in Ref.52 that involves implementing the phase gradient transformation (PGT) with a quantum adder. They utilized Toffoli gates (specifically, relative phase Toffoli gates, measurements, and classically controlled gates) to construct PGT circuits and employed quantum adders from Ref.37. The T-count and T-depth of their \(n\)-qubit AQFT circuit with an approximation error of \(O(\varepsilon )\) are reported as \(8n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))\) and \(n{\text{log}}_{2}(n/\varepsilon )+5n-O({\text{log}}^{2}(n/\varepsilon )),\) respectively. While this work achieved an asymptotic scaling of \(O(n{\text{log}}_{2}(n/\varepsilon ))\), the large constant factors on the leading terms of both the T-count and T-depth indicate a clear potential for further practical optimization.
In this paper, we present two new fully coherent n-qubit AQFT circuits, AQFT Circuit 1 and AQFT Circuit 2, which achieve further optimization of T-count and T-depth while maintaining an approximation error of \(O(\varepsilon )\). AQFT Circuit 1 is designed to reduce the T-count. In this design, we construct the rotation gate layers that perform inverse PGTs without using Toffoli gates, which are non-Clifford gates and accounted for approximately half of the T-count in the AQFT circuit design in Ref.51. Next, we replace the rotation gate layers with quantum adders from Ref.37. On the other hand, AQFT Circuit 2 is designed to reduce the T-depth. In this design, we construct the rotation gate layers without Toffoli gates, pair and parallelize two rotation gate layers into one, and replace them with quantum adders from Ref.37.
As a result, AQFT Circuit 1 achieves a T-count of \(4n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))\) with a T-depth of \(n{\text{log}}_{2}(n/\varepsilon )+n-O({\text{log}}^{2}(n/\varepsilon ))\), and AQFT Circuit 2 achieves a T-depth of \(\frac{1}{2}n{\text{log}}_{2}(n/\varepsilon )+\frac{3}{2}n-O({\text{log}}^{2}(n/\varepsilon ))\) with a T-count of \(4n{\text{log}}_{2}(n/\varepsilon )+n-O({\text{log}}^{2}(n/\varepsilon )).\) One might argue that using logarithmic-depth quantum adders could reduce the T-depth in AQFT implementation at the expense of increasing T-count. However, we demonstrate that the linear-depth quantum adders described in Ref.37 offer advantages over the state-of-the-art logarithmic-depth quantum adders in Ref.38, even in terms of T-depth optimization. This conclusion is supported by a comparison of both approaches when applied to AQFT circuits for systems satisfying \(3<n/\varepsilon <{10}^{13}\).
Overview of the AQFT circuits and their advantages
In this paper, we introduce the construction of two novel n-qubit AQFT circuits based on the Clifford + T gate library. AQFT Circuit 1 is optimized to minimize the T-count, whereas AQFT Circuit 2 focuses on reducing the T-depth. These AQFT circuits approximate the QFT with an error of \(O(\varepsilon )\). Figure 1a,b present the schematic diagrams for AQFT Circuit 1 and AQFT Circuit 2, respectively.
Figure 1. Schematic diagram of AQFT circuits presented in this paper. In these circuits, we construct inverse PGTs, and implement them using quantum adders. The boxes labeled RUS represent “repeat until success” circuits from Ref.53. Each RUS circuit is used to prepare a special quantum state \(|{\psi }_{b+1}\rangle \equiv \frac{1}{\sqrt{{2}^{b+1}}} {\sum }_{k=0}^{{2}^{b+1}-1}{e}^{2\pi ik/{2}^{b+1}}|k\rangle ,\) where \(b=\lceil{\text{log}}_{2}(n/\varepsilon )\rceil\). The special states are required to perform inverse PGTs using quantum adders. (a) AQFT Circuit 1: designed to reduce T-count. All circuits labeled \({C}_{i}\) consist of Clifford gates. (b) AQFT Circuit 2: designed to reduce T-depth. Each circuit labeled \({C}_{i}{\prime}\) consists of Clifford gates with a maximum of two T gates. Note that the quantum adders are paired and parallelized.
Similar to the AQFT circuit described in Ref.51, the construction of our AQFT circuits involves creating \({Z}^{\theta }\) gate (See Eq. (1)) layers, each of which performs an inverse PGT, and implementing them using quantum adders.
To facilitate this process, a special quantum state \(|{\psi }_{b+1}\rangle \equiv \frac{1}{\sqrt{{2}^{b+1}}} {\sum }_{k=0}^{{2}^{b+1}-1}{e}^{2\pi ik/{2}^{b+1}}|k\rangle\) needs to be prepared, where \(b\) is chosen as \(\lceil{\text{log}}_{2}(n/\varepsilon )\rceil\). We employ the “repeat until success” circuits described in Ref.53 to prepare the special quantum state \(|{\psi }_{b+1}\rangle\). A detailed explanation of the inverse PGT and its execution through a quantum adder is provided in the Supplementary Material.
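The role of the special state can be illustrated with a small classical simulation. The sketch below is not the circuit of this paper: it only checks, for a toy register size, that adding a classical value \(x\) modulo \({2}^{m}\) into a register holding \(|{\psi }_{m}\rangle\) leaves the state unchanged up to the phase \({e}^{-2\pi ix/{2}^{m}}\), which is the phase-kickback mechanism an adder-based inverse PGT relies on. The function names and the list-of-amplitudes representation are assumptions made for illustration.

```python
import cmath
import math


def phase_gradient_state(m):
    """Amplitudes of |psi_m> = 2^(-m/2) * sum_k exp(2*pi*i*k / 2^m) |k>."""
    size = 2 ** m
    return [cmath.exp(2j * math.pi * k / size) / math.sqrt(size) for k in range(size)]


def add_constant(state, x):
    """Apply the permutation |k> -> |k + x mod 2^m> to a list of amplitudes."""
    size = len(state)
    shifted = [0j] * size
    for k, amplitude in enumerate(state):
        shifted[(k + x) % size] = amplitude
    return shifted


def kickback_phase(m, x):
    """Overlap <psi_m| ADD_x |psi_m>; it has unit modulus because psi_m is an eigenstate."""
    psi = phase_gradient_state(m)
    shifted = add_constant(psi, x)
    return sum(a.conjugate() * b for a, b in zip(psi, shifted))


if __name__ == "__main__":
    m, x = 4, 3
    print(kickback_phase(m, x), cmath.exp(-2j * math.pi * x / 2 ** m))
```

Because the phase depends on \(x\) only through \(x/{2}^{m}\), applying the adder with a quantum register as the addend imprints the corresponding product of \({Z}^{\theta }\)-type phases on that register.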
Before presenting detailed circuit designs and results, we provide a rough overview of the advantages of our AQFT circuits. AQFT Circuit 1 requires \(\sim n\) \((b+1)\)-qubit quantum adders to perform \(\sim n\) inverse PGTs. Constructing these quantum adders involves a T-count of \(\sim 4nb\). A key advantage of AQFT Circuit 1 is that the construction of inverse PGTs does not require additional non-Clifford gates, such as Toffoli gates (note that each “\({C}_{i}\)” box in Fig. 1a consists solely of Clifford gates). This omission of non-Clifford gates results in a reduction of the T-count from \(\sim 8nb\) to \(\sim 4nb\). AQFT Circuit 2 reduces the T-depth from \(\sim nb\) to \(\sim \frac{1}{2}nb\) compared to AQFT Circuit 1 by pairing and parallelizing the inverse PGTs, which requires only \(O(n)\) additional T gates. These additional T gates are included within the \({C}_{i}{\prime}\) boxes illustrated in Fig. 1b.
In the subsequent sections, for each AQFT circuit, we present the circuit design process, resource estimation focused on T-count and T-depth, and approximation error analysis.
AQFT circuit 1: optimized for T-count
Circuit design
Throughout this paper, we employ a compact notation for sequential CNOT gates that share a single control qubit but act on different target qubits. This “fan-out” notation is illustrated in Supplementary Material. This convention is used to improve the readability of complex circuits.
AQFT Circuit 1 is constructed through the following four steps:
(Step 1: QFT subcircuit decomposition) We begin by decomposing the subcircuits of the standard QFT circuit described in Ref.54. The initial structure of these subcircuits is illustrated on the left-hand side of Fig. 2. The right-hand side of Fig. 2 shows the decomposed structure of the QFT subcircuits. A detailed explanation of this decomposition process is provided in Supplementary Material. In the decomposed subcircuit, multiple \({Z}^{\theta }\) gate layers are present. However, in the subsequent step, we will consolidate these layers. Specifically, the \({Z}^{\theta }\) gates within the red boxes from all the decomposed QFT subcircuits in Fig. 2 will be combined into a single \({Z}^{\theta }\) gate layer and placed at the very front of the QFT circuit. Similarly, the \({Z}^{\theta }\) gates in the purple boxes from all the decomposed QFT subcircuits in Fig. 2 will be merged into a \({Z}^{\theta }\) gate layer at the very end of the QFT circuit.
Figure 2. Subcircuit decomposition for AQFT Circuit 1. The left-hand side illustrates a subcircuit of the standard QFT circuit described in Ref.54. This subcircuit is decomposed into the structure shown on the right-hand side. The \({Z}^{\theta }\) gates in the red boxes from all the decomposed QFT subcircuits will be gathered into a single \({Z}^{\theta }\) gate layer at the very front of the QFT circuit. This reordering is possible because the circuit in the green box has a diagonal matrix representation. The \({Z}^{\theta }\) gates in the purple boxes will be gathered into a single \({Z}^{\theta }\) gate layer at the very end of the QFT circuit. Note that the \({Z}^{\theta }\) gate layer in the green box is constructed without using Toffoli gates. This results in T-count reduction.
(Step 2: Constructing QFT circuit) We combine all the decomposed QFT subcircuits to construct a full QFT circuit. Next, all the \({Z}^{\theta }\) gates in the red boxes from the decomposed subcircuits in Fig. 2 are gathered at the very front of the QFT circuit. This reordering is possible because both the circuits in the green and red boxes of Fig. 2 have diagonal matrix representations and therefore commute with each other. Similarly, we gather the \({Z}^{\theta }\) gates in the purple boxes from the decomposed subcircuits in Fig. 2 at the very end of the QFT circuit. After this reorganization, the \({Z}^{\theta }\) gates at the very front and end of the QFT circuit have the form of \({Z}^{({2}^{k-1}-1)/{2}^{k}}\), where \(k\in \{1,2, \dots ,n\}\). We divide each \({Z}^{({2}^{k-1}-1)/{2}^{k}}\) gate into \({Z}^{1/2}\) and \({Z}^{-1/{2}^{k}}\) gates. Note that the \({Z}^{1/2}\) gate is a Clifford gate.
(Step 3: Approximation) We remove all the \({Z}^{-1/{2}^{k}}\) gates with \(k>b\) from the QFT circuit (See Fig. 3).
Figure 3. AQFT Circuit 1: \(n\)-qubit AQFT with an error of \(O(\varepsilon )\) optimized for T-count, where \(b=\lceil{\text{log}}_{2}(n/\varepsilon )\rceil\). This circuit is decomposed into single-qubit \(H\) and \({Z}^{\theta }\) gates, and standard two-qubit CNOT gates. To improve readability, the \({Z}^{\theta }\) gates are aligned vertically to represent a gate layer; this does not imply a multi-qubit control structure. Each \({Z}^{\theta }\) gate layer in the blue boxes performs an inverse PGT with the help of a \({Z}^{-1}\) gate. The effect of inserting \({Z}^{-1}\) gates can be canceled by inserting \(Z\) gates. Note that this circuit contains \((n+1)\) inverse PGTs and these are constructed without using additional non-Clifford gates. We implement each inverse PGT using a quantum adder from Ref.37. Since the circuit is constructed entirely from unitary gates, it is inherently reversible.
(Step 4: Inverse PGT implementation using quantum adder) In Fig. 3, each \({Z}^{\theta }\) gate layer in the blue boxes performs an inverse PGT with the help of a \({Z}^{-1}\) gate. The effect of inserting the \({Z}^{-1}\) gates can be canceled by inserting \(Z\) gates. We implement each inverse PGT using a quantum adder from Ref.37.
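The exponent bookkeeping in Steps 2 and 3 can be checked with exact rational arithmetic. The sketch below assumes the convention \({Z}^{\theta }=\text{diag}(1,{e}^{i\pi \theta })\), so that exponents of commuting \({Z}^{\theta }\) gates on the same qubit simply add; this convention and the helper names are assumptions for illustration, chosen to be consistent with \({Z}^{1/2}\) being a Clifford gate.

```python
from fractions import Fraction


def split_exponent(k):
    """Split the exponent (2^(k-1) - 1) / 2^k into a Clifford part and a small part."""
    total = Fraction(2 ** (k - 1) - 1, 2 ** k)
    clifford_part = Fraction(1, 2)        # Z^(1/2), a Clifford gate
    small_part = Fraction(-1, 2 ** k)     # Z^(-1/2^k), the gate subject to truncation
    assert clifford_part + small_part == total
    return clifford_part, small_part


def kept_small_exponents(n, b):
    """Exponents of the Z^(-1/2^k) gates that survive Step 3 (only k <= b is kept)."""
    return [Fraction(-1, 2 ** k) for k in range(1, n + 1) if k <= b]


if __name__ == "__main__":
    print(split_exponent(3))
    print(kept_small_exponents(n=10, b=4))
```

For \(k=1\) the total exponent is zero, so the two parts cancel and no gate is needed.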
T-count and T-depth estimation
In this section, we estimate the T-count and T-depth of AQFT Circuit 1. The T-count refers to the total number of T gates. The depth of a circuit refers to the number of sequential layers of gates that cannot be executed in parallel. Based on this, the T-depth is the number of sequential layers of T gates that cannot be parallelized, effectively ignoring the depth contributed by other gate types such as Clifford gates. The T-count and T-depth estimates presented in this paper are derived from the analytical formulas developed in the following sections, not from numerical simulations on a quantum simulator.
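To make the two cost measures concrete, the following minimal sketch computes them for a toy gate list. The list-of-tuples circuit format and the function names are assumptions made for illustration. The T-depth returned is that of the given gate order, with each gate scheduled as early as its qubits allow and no further optimization.

```python
T_GATES = ("T", "Tdg")


def t_count(circuit):
    """Total number of T (or T-dagger) gates in a list of (name, qubits) tuples."""
    return sum(1 for name, _ in circuit if name in T_GATES)


def t_depth(circuit):
    """Number of sequential T layers, ignoring the depth contributed by other gates."""
    layers_seen = {}  # qubit -> number of T layers on the longest path ending at it
    for name, qubits in circuit:
        level = max((layers_seen.get(q, 0) for q in qubits), default=0)
        if name in T_GATES:
            level += 1
        for q in qubits:
            layers_seen[q] = level
    return max(layers_seen.values(), default=0)


if __name__ == "__main__":
    toy = [("T", (0,)), ("T", (1,)), ("CNOT", (0, 1)), ("T", (1,))]
    print(t_count(toy), t_depth(toy))
```

In the toy circuit the first two T gates act on different qubits and share one layer, while the third must wait for the CNOT, giving a T-count of 3 and a T-depth of 2.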
Among the inverse PGTs in Fig. 3, the 3-qubit inverse PGT does not need to be replaced by a 3-qubit quantum adder, because it only requires a single T gate. Considering this, the T gates required to construct AQFT Circuit 1 are found in four parts:
(1) \((n-b+3)\) \((b+1)\)-qubit quantum adders,
(2) One \(k\)-qubit quantum adder for each \(k\in \{4, 5, \dots , b\}\),
(3) One T gate in the 3-qubit inverse PGT,
(4) T gates for preparing the special state \(|{\psi }_{b+1}\rangle\).
The \(t\)-qubit quantum adder in Ref.37 requires a T-count of \((4t-4)\) and a T-depth of \(t\). With this quantum adder, the T-count and T-depth in part (1) are \(4nb-4{b}^{2}+12b\) and \(nb+n-{b}^{2}+2b+3\), respectively. The T-count and T-depth in part (2) are \({\sum }_{k=4}^{b}\left(4k-4\right)=2{b}^{2}-2b-12\) and \({\sum }_{k=4}^{b}k={b}^{2}/2+b/2-6\), respectively. For part (4), \((b-2)\) \({Z}^{\theta }\) gates have to be synthesized with an error of \(\varepsilon /b\) for each gate (See Supplementary Material). The value \(\varepsilon /b\) is chosen so that AQFT Circuit 1 has an approximation error of \(O(\varepsilon )\). In the gate synthesis, we use \(RUS\) circuits from Ref.53, and the expected T-count to synthesize a \({Z}^{\theta }\) gate with a Fourier angle with an error of \(\varepsilon {\prime}\) is \(1.08{\text{log}}_{2}(1/\varepsilon {\prime})+17.5.\) Therefore, summing parts (1) through (4), the total T-count for AQFT Circuit 1 is
\[4nb-2{b}^{2}+10b-11+(b-2)\left(1.08{\text{log}}_{2}(b/\varepsilon )+17.5\right),\]
and the total T-depth for AQFT Circuit 1 is
\[nb+n-O({b}^{2}).\]
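The per-part costs listed in this section can be tallied numerically. The sketch below is a hedged illustration rather than the authors' tooling: the function names are hypothetical, the single T gate of part (3) is assumed to add one T layer, and the T-depth of the state preparation in part (4) is left out because its value is not given in the text above.

```python
import math


def adder_t_count(t):
    """T-count of the t-qubit linear-depth adder, as quoted from Ref.37."""
    return 4 * t - 4


def adder_t_depth(t):
    """T-depth of the t-qubit linear-depth adder, as quoted from Ref.37."""
    return t


def circuit1_resources(n, eps):
    b = math.ceil(math.log2(n / eps))
    # Part (1): (n - b + 3) adders on (b + 1) qubits.
    count = (n - b + 3) * adder_t_count(b + 1)
    depth = (n - b + 3) * adder_t_depth(b + 1)
    # Part (2): one k-qubit adder for each k in {4, ..., b}.
    count += sum(adder_t_count(k) for k in range(4, b + 1))
    depth += sum(adder_t_depth(k) for k in range(4, b + 1))
    # Part (3): a single T gate in the 3-qubit inverse PGT (assumed to add one layer).
    count += 1
    depth += 1
    # Part (4): expected T-count of (b - 2) RUS syntheses, each with error eps / b.
    state_prep = (b - 2) * (1.08 * math.log2(b / eps) + 17.5)
    return {
        "b": b,
        "t_count_adders": count,          # parts (1)-(3)
        "t_depth_adders": depth,          # parts (1)-(3); state preparation excluded
        "t_count_total": count + state_prep,
    }


if __name__ == "__main__":
    print(circuit1_resources(n=64, eps=0.01))
```

For parts (1) through (3) the tallies agree with the closed forms \(4nb-2{b}^{2}+10b-11\) and \(nb+n-{b}^{2}/2+5b/2-2\), which follow from adding the expressions stated for each part.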
Approximation error analysis
The approximation error is measured in the spectral norm54; its definition can be found in the Supplementary Material. We demonstrate that AQFT Circuit 1 has an error of \(O(\varepsilon )\) when used in place of a QFT circuit. The approximation error in AQFT Circuit 1 arises from two sources:
(1) Removing all the \({Z}^{-1/{2}^{k}}\) gates with \(k>b\),
(2) Approximating \({Z}^{\theta }\) gates to prepare the special state \(|{\psi }_{b+1}\rangle\) using \(RUS\) circuits.
We call the error from part (1) \({\varepsilon }_{1}^{(1)}\) and the error from part (2) \({\varepsilon }_{1}^{(2)}\).
The error for removing a \({Z}^{\theta }\) gate is
and this can be rewritten as
since
Therefore, the error \({\varepsilon }_{1}^{(1)}\) is bounded as follows:
Note that \(b\) is chosen as \(\lceil{\text{log}}_{2}(n/\varepsilon )\rceil\), thus we have \(n{2}^{-b}\le \varepsilon\).
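Two ingredients of this bound can be checked numerically. The sketch below assumes the convention \({Z}^{\theta }=\text{diag}(1,{e}^{i\pi \theta })\), under which the spectral-norm distance between \({Z}^{\theta }\) and the identity is \(|{e}^{i\pi \theta }-1|=2|\text{sin}(\pi \theta /2)|\le \pi |\theta |\). It also confirms that the choice of \(b\) gives \(n{2}^{-b}\le \varepsilon\). The function names are hypothetical.

```python
import cmath
import math


def removal_error(theta):
    """||Z^theta - I|| in the spectral norm, assuming Z^theta = diag(1, exp(i*pi*theta))."""
    return abs(cmath.exp(1j * math.pi * theta) - 1)


def choose_b(n, eps):
    """Truncation parameter b = ceil(log2(n / eps))."""
    return math.ceil(math.log2(n / eps))


if __name__ == "__main__":
    n, eps = 1024, 1e-3
    b = choose_b(n, eps)
    print("b =", b, " n * 2^-b =", n * 2.0 ** -b)
    for k in (b + 1, b + 2, b + 3):
        print(k, removal_error(-1 / 2 ** k), math.pi / 2 ** k)
```

Each removed gate contributes an error that shrinks geometrically in \(k\), which is why truncating at \(k>b\) keeps the accumulated error proportional to \(n{2}^{-b}\).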
When synthesizing \({Z}^{\theta }\) gates to prepare the special state \(|{\psi }_{b+1}\rangle\), each \({Z}^{\theta }\) gate is approximated with an error of \(\varepsilon /b\), resulting in \({\varepsilon }_{1}^{(2)}\le \left(b-2\right)\cdot \frac{\varepsilon }{b}<\varepsilon\). Consequently, the total error of implementing AQFT Circuit 1 instead of the QFT circuit is \(O(\varepsilon )\) because the two contributions add:
\[{\varepsilon }_{1}^{(1)}+{\varepsilon }_{1}^{(2)}=O(\varepsilon ).\]
AQFT circuit 2: optimized for T-depth
Circuit design
The construction of AQFT Circuit 2 involves the following four steps:
(Step 1: QFT subcircuit decomposition) We begin by pairing the subcircuits on the left-hand side of Fig. 2, as shown on the left-hand side of Fig. 4. These paired subcircuits are then decomposed into the structure illustrated on the right-hand side of Fig. 4. For the decomposition of a \(k\)-qubit subcircuit, \((k-1)\) ancilla qubits, initially in the state \({|0\rangle }^{\otimes (k-1)}\), are required. A detailed demonstration of this decomposition is provided in the Supplementary Material.
Figure 4. Subcircuit decomposition for AQFT Circuit 2. The \({Z}^{\theta }\) gates in the red boxes from all the QFT subcircuits will be gathered into a single \({Z}^{\theta }\) gate layer at the very front of the QFT circuit. This reordering is possible because the circuits in the red and green boxes have diagonal matrix representations. The \({Z}^{\theta }\) gates in the purple boxes will be gathered into a single \({Z}^{\theta }\) gate layer at the very end of the QFT circuit. Note that two \({Z}^{\theta }\) gate layers are paired and parallelized in the green box. This results in T-depth reduction.
These ancilla qubits serve as a temporary workspace to enable the parallel execution of gate layers that would otherwise be sequential. The state of the primary register is copied to the ancilla register, allowing two distinct operations to be applied simultaneously before the ancilla is returned to its initial state, thereby reducing the circuit’s overall T-depth.
In the decomposed structure, note that two \({Z}^{\theta }\) gate layers are paired and parallelized within the green box in Fig. 4. Similar to AQFT Circuit 1, in the next step, we will combine the \({Z}^{\theta }\) gates in the red boxes from all the decomposed QFT subcircuits in Fig. 4 into a single \({Z}^{\theta }\) gate layer at the very front of the QFT circuit. Additionally, the \({Z}^{\theta }\) gates in the purple boxes from all the decomposed QFT subcircuits in Fig. 4 will be gathered into a single \({Z}^{\theta }\) gate layer at the very end of the QFT circuit.
(Step 2: Constructing QFT circuit) We combine all the decomposed QFT subcircuits to construct a full QFT circuit. We modify the circuit using a method similar to that of AQFT Circuit 1. All the \({Z}^{\theta }\) gates in the red boxes from the decomposed subcircuits in Fig. 4 are gathered at the very front of the QFT circuit, and the \({Z}^{\theta }\) gates in the purple boxes from the decomposed subcircuits in Fig. 4 are gathered at the very end of the QFT circuit. Then, the \({Z}^{\theta }\) gates at the very front and end of the QFT circuit take the form of \({Z}^{({2}^{k-1}-1)/{2}^{k}}\), where \(k\in \{1,2, \dots ,n\}\). We divide each \({Z}^{({2}^{k-1}-1)/{2}^{k}}\) gate into \({Z}^{1/2}\) and \({Z}^{-1/{2}^{k}}\) gates.
(Step 3: Approximation) We remove all the \({Z}^{-1/{2}^{k}}\) gates with \(k>b\) from the QFT circuit (See Fig. 5). After the approximation, we no longer need \((n-1)\) ancilla qubits; instead, we only need \(b\) ancilla qubits.
Figure 5. AQFT Circuit 2: \(n\)-qubit AQFT with an error of \(O(\varepsilon )\) optimized for T-depth, where \(b=\lceil{\text{log}}_{2}(n/\varepsilon )\rceil\). This circuit employs \(b\) ancilla qubits, each of which is labeled as \({a}_{i}\) and initially in the state \(|0\rangle\). Each \({Z}^{\theta }\) gate layer in the blue boxes performs an inverse PGT with the help of a \({Z}^{-1}\) gate. The effect of inserting \({Z}^{-1}\) gates can be canceled by inserting \(Z\) gates. Note that this circuit contains \((n+1)\) inverse PGTs, and each yellow box contains only two T gates. We implement each inverse PGT using a quantum adder from Ref.37.
(Step 4: Inverse PGT implementation using quantum adder) In Fig. 5, each \({Z}^{\theta }\) gate layer in blue boxes performs an inverse PGT with the help of a \({Z}^{-1}\) gate. The effect of inserting the \({Z}^{-1}\) gates can be canceled by inserting \(Z\) gates. We implement each inverse PGT using a quantum adder from Ref.37.
In the construction of AQFT Circuit 2, we utilize the linear-depth quantum adder described in Ref.37. While one might argue that employing a logarithmic-depth quantum adder could further reduce the T-depth in the AQFT circuit, our findings indicate that the quantum adder from Ref.37 is more efficient in terms of T-depth compared to the currently known logarithmic-depth quantum adder when utilized for AQFT implementation within the range \(3<n/\varepsilon <{10}^{13}\). In Sect. “Disadvantages of using logarithmic-depth quantum adders”, we provide a comparative analysis of using the quantum adder from Ref.37 and the state-of-the-art logarithmic-depth quantum adder from Ref.38, which has the lowest T-depth among the logarithmic-depth adders at the time of writing.
T-count and T-depth estimation
Among the inverse PGTs in Fig. 5, the 3-qubit inverse PGT does not need to be replaced by a 3-qubit quantum adder, because it only requires a single T gate. Considering this, the T gates required to construct AQFT Circuit 2 are found in five parts:
(1) \((n-b+3)\) \((b+1)\)-qubit quantum adders,
(2) One \(k\)-qubit quantum adder for each \(k\in \{4, 5, \dots ,b\}\),
(3) One T gate in the 3-qubit inverse PGT,
(4) \((n-1)\) T gates in yellow boxes in Fig. 5,
(5) T gates for preparing two copies of the special state \(|{\psi }_{b+1}\rangle\).
In the following resource estimation, for convenience of calculation, we assume that \(n\) is even and \(b\) is odd. However, even without this assumption, the difference in resource calculation remains within \(O(1)\). The \(t\)-qubit quantum adder in Ref.37 requires a T-count of \((4t-4)\) and a T-depth of \(t\). Using this quantum adder, the T-counts in parts (1) through (4) are \(4nb-4{b}^{2}+12b,\) \({\sum }_{k=4}^{b}\left(4k-4\right)=2{b}^{2}-2b-12,\) \(1,\) and \(n-1,\) respectively. The T-depths required in parts (1) through (4) are \(\left\{2+(n-b+1)/2\right\}\cdot \left(b+1\right),\) \({\sum }_{k=2}^{(b-1)/2}\left(2k+1\right)={b}^{2}/4+b/2-15/4,\) \(1,\) and \(n-1,\) respectively. For T gates in part (5), \(2(b-2)\) \({Z}^{\theta }\) gates have to be synthesized with an error of \(\varepsilon /(2b)\) for each gate (See Supplementary Material). The value \(\varepsilon /(2b)\) is chosen to ensure that AQFT Circuit 2 has an approximation error of \(O(\varepsilon )\). As with AQFT Circuit 1, we use the \(RUS\) circuits from Ref.53. Consequently, summing parts (1) through (5), the total T-count for AQFT Circuit 2 is
\[4nb+n-2{b}^{2}+10b-12+2(b-2)\left(1.08{\text{log}}_{2}(2b/\varepsilon )+17.5\right),\]
and the total T-depth for AQFT Circuit 2 is
\[\frac{1}{2}nb+\frac{3}{2}n-O({b}^{2}).\]
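The same tally can be made for AQFT Circuit 2 from the per-part expressions stated above. This is a sketch under stated assumptions: the function names are hypothetical, \(n\) is taken even and \(b\) odd as in the text, and the T-depth of the state preparation in part (5) is excluded because its value is not given in the text above.

```python
import math


def circuit2_resources(n, eps):
    b = math.ceil(math.log2(n / eps))
    if n % 2 != 0 or b % 2 != 1:
        raise ValueError("this sketch follows the text and assumes n even, b odd")
    # T-counts of parts (1)-(4); a t-qubit adder costs 4t - 4 T gates (Ref.37).
    count = (n - b + 3) * (4 * (b + 1) - 4)          # part (1)
    count += sum(4 * k - 4 for k in range(4, b + 1))  # part (2)
    count += 1                                        # part (3)
    count += n - 1                                    # part (4), yellow boxes
    # T-depths of parts (1)-(4); paired adders run in parallel.
    depth = (2 + (n - b + 1) // 2) * (b + 1)                        # part (1)
    depth += sum(2 * k + 1 for k in range(2, (b - 1) // 2 + 1))     # part (2)
    depth += 1                                                      # part (3)
    depth += n - 1                                                  # part (4)
    # Part (5): expected T-count of 2(b - 2) RUS syntheses, each with error eps / (2b).
    state_prep = 2 * (b - 2) * (1.08 * math.log2(2 * b / eps) + 17.5)
    return {
        "b": b,
        "t_count_parts_1_to_4": count,
        "t_depth_parts_1_to_4": depth,   # state preparation excluded
        "t_count_total": count + state_prep,
    }


if __name__ == "__main__":
    print(circuit2_resources(n=64, eps=0.01))
```

For \(n=64\) and \(\varepsilon =1/100\) the T-depth of parts (1) through (4) is 501, compared with 842 for the corresponding adder parts of AQFT Circuit 1, at a cost of \(n-1=63\) extra T gates.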
Approximation error analysis
We prove that AQFT Circuit 2 has an error of \(O(\varepsilon )\) when implemented in place of a QFT circuit. The sources of the approximation error in AQFT Circuit 2 are identified in two parts:
(1) Removing all the \({Z}^{-1/{2}^{k}}\) gates with \(k>b\),
(2) Approximating \({Z}^{\theta }\) gates to prepare two copies of the special state \(|{\psi }_{b+1}\rangle\) using \(RUS\) circuits.
We call the error from part (1) \({\varepsilon }_{2}^{(1)}\) and the error from part (2) \({\varepsilon }_{2}^{(2)}\).
The error \({\varepsilon }_{2}^{(1)}\) is bounded in the same way as \({\varepsilon }_{1}^{(1)}\). When synthesizing \({Z}^{\theta }\) gates to prepare two copies of the special state \(|{\psi }_{b+1}\rangle\) for AQFT Circuit 2, each \({Z}^{\theta }\) gate is approximated with an error of \(\varepsilon /(2b)\), resulting in \({\varepsilon }_{2}^{(2)}\le 2\left(b-2\right)\cdot \frac{\varepsilon }{2b}<\varepsilon\). Consequently, the total error of implementing AQFT Circuit 2 instead of the QFT circuit is \(O(\varepsilon )\).
Disadvantages of using logarithmic-depth quantum adders
If logarithmic-depth quantum adders were applied in the configuration of the circuit shown in Fig. 5, the T-depth of the AQFT circuit would be reduced from \(O(n\text{log}(n/\varepsilon ))\) to \(O\left(n\text{log}\left(\text{log}\left(n/\varepsilon \right)\right)\right)\) at the expense of increasing the T-count. In this section, we compare the results of using the state-of-the-art logarithmic-depth quantum adder (In-FT-QCLA1 from Ref.38) with those of AQFT Circuit 2. Our analysis demonstrates that AQFT Circuit 2 is advantageous not only in terms of T-count but also T-depth. We refer to the AQFT circuit that uses the logarithmic-depth quantum adder in the configuration in Fig. 5 as AQFT Circuit 2’.
First, we present the T-count and T-depth of AQFT Circuit 2’. As for AQFT Circuit 2, the T gates required to construct AQFT Circuit 2’ are categorized in five parts:
(1) \((n-b+3)\) \((b+1)\)-qubit quantum adders,
(2) One \(k\)-qubit quantum adder for each \(k\in \{4, 5, \dots ,b\}\),
(3) One T gate in the 3-qubit inverse PGT,
(4) \((n-1)\) T gates in yellow boxes in Fig. 5,
(5) T gates for preparing two copies of the special state \(|{\psi }_{b+1}\rangle\).
The \(t\)-qubit quantum adder (In-FT-QCLA1 from Ref.38) requires a T-count of \(20t-8w(t)-8w(t-1)-4\lfloor {\text{log}}_{2}(t)\rfloor -4\lfloor {\text{log}}_{2}(t-1)\rfloor -8\), where \(w(t)=t-{\sum }_{i=1}^{\infty }\lfloor t/{2}^{i}\rfloor\) represents the number of ones in the binary expansion of \(t\), and a T-depth of \(2\lfloor {\text{log}}_{2}(t)\rfloor +2\lfloor {\text{log}}_{2}(t-1)\rfloor +2\lfloor {\text{log}}_{2}(t/3)\rfloor +2\lfloor {\text{log}}_{2}((t-1)/3)\rfloor +28\). For readability, we use the following notations:
In the following resource estimation, for convenience of calculation, we assume that \(n\) is even and \(b\) is odd. Even without this assumption, however, the difference in resource calculation is only \(O(1)\). Then, the T-count and T-depth in part (1) are \((n-b+3)\cdot T{C}_{adder}(b)\) and \(\{2+\frac{n-b+1}{2}\}\cdot T{D}_{adder}\left(b\right)\), respectively. The T-count and T-depth in part (2) are \({\sum }_{i=4}^{b-1}T{C}_{adder}(i)\) and \({\sum }_{i=2}^{(b-1)/2}T{D}_{adder}(2i)\), respectively. For T gates in part (5), \(2(b-2)\) \({Z}^{\theta }\) gates have to be synthesized with an error of \(\varepsilon /(2b)\) for each gate. The value \(\varepsilon /(2b)\) is chosen to ensure that AQFT Circuit 2’ has an approximation error of \(O(\varepsilon )\). Therefore, the total T-count for AQFT Circuit 2’ is
and the total T-depth for AQFT Circuit 2’ is
Comparing AQFT Circuit 2’ with AQFT Circuit 2, the T-depth appears to be reduced at the expense of an increased T-count. However, within the range of \(3<n/\varepsilon <{10}^{13}\), the inequality \(4n{\text{log}}_{2}b>\frac{1}{2}nb\) holds, indicating that the T-depth is not reduced. This can be observed in Fig. 6. In summary, using the linear-depth adder from Ref.37 for AQFT implementation offers advantages over the state-of-the-art logarithmic-depth quantum adder from Ref.38 in terms of both T-count and T-depth.
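The adder costs quoted in this section and the inequality above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the formulas are transcribed from the text for \(t\ge 4\), and the leading-term comparison treats \(b\) as the real number \({\text{log}}_{2}(n/\varepsilon )\) rather than its ceiling, which is an assumption.

```python
import math


def ones(t):
    """w(t): number of ones in the binary expansion of t."""
    return bin(t).count("1")


def floor_log2(x):
    """floor(log2(x)) for an integer x >= 1, computed without floating point."""
    return x.bit_length() - 1


def qcla_t_count(t):
    """T-count of the t-qubit In-FT-QCLA1 adder, as quoted from Ref.38."""
    return (20 * t - 8 * ones(t) - 8 * ones(t - 1)
            - 4 * floor_log2(t) - 4 * floor_log2(t - 1) - 8)


def qcla_t_depth(t):
    """T-depth of the same adder; floor(log2(t / 3)) equals floor_log2(t // 3) for t >= 3."""
    return (2 * floor_log2(t) + 2 * floor_log2(t - 1)
            + 2 * floor_log2(t // 3) + 2 * floor_log2((t - 1) // 3) + 28)


def linear_adder_wins_leading_term(ratio):
    """Check 4 * log2(b) > b / 2 with b = log2(n / eps); n cancels from both sides."""
    b = math.log2(ratio)
    return 4 * math.log2(b) > b / 2


if __name__ == "__main__":
    for t in (8, 16, 32):
        print(t, "QCLA:", qcla_t_count(t), qcla_t_depth(t), " linear:", 4 * t - 4, t)
    for ratio in (10, 1e6, 1e12, 1e14):
        print(ratio, linear_adder_wins_leading_term(ratio))
```

For an 8-qubit adder, for example, the quoted formulas give a T-count of 100 and a T-depth of 42 for the logarithmic-depth design, against 28 and 8 for the linear-depth one; the constant offset of 28 dominates at the register sizes relevant here.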
Figure 6. Comparison of T-depth for \(n\)-qubit AQFT implementations. In the figure, the blue solid lines represent the results from AQFT Circuit 2’ using the logarithmic-depth quantum adder from Ref.38, and the red dash-dot lines represent the results from AQFT Circuit 2 using the linear-depth quantum adder from Ref.37. The results here are derived from the analytical formulas, not from a numerical simulation on a quantum simulator. (a) \(\varepsilon =1/10.\) (b) \(\varepsilon =1/100.\) (c) \(\varepsilon =1/1000.\) (d) \(\varepsilon =1/10000\).
Results and discussion
In this paper, we introduced two novel \(n\)-qubit AQFT circuits with an approximation error of \(O(\varepsilon )\): AQFT Circuit 1 for T-count optimization and AQFT Circuit 2 for T-depth optimization. AQFT Circuit 1 is designed to minimize T-count, achieving a T-count of \(4n{\text{log}}_{2}\left(n/\varepsilon \right)-O({\text{log}}^{2}\left(n/\varepsilon \right))\), which is roughly half of the best-known result in Ref.51. This reduction is accomplished by constructing inverse PGT circuits without using Toffoli gates and by replacing them with quantum adders from Ref.37. On the other hand, AQFT Circuit 2 is designed to minimize T-depth, reducing it from \(n{\text{log}}_{2}\left(n/\varepsilon \right)+O(n)\) to \(\frac{1}{2}n{\text{log}}_{2}\left(n/\varepsilon \right)+O\left(n\right).\) This is achieved by pairing and parallelizing inverse PGTs using \(O(n)\) additional T gates. These results are summarized in Table 1.
As shown in Fig. 7, AQFT Circuit 1 is optimized for T-count, while AQFT Circuit 2 is optimized for T-depth. In terms of T-count, the difference between AQFT Circuit 1 and AQFT Circuit 2 is only \(O\left(n\right)\), which does not affect the leading-order term \(4n{\text{log}}_{2}(n/\varepsilon )\). Consequently, the two curves in Fig. 7a are nearly indistinguishable. However, AQFT Circuit 2 has the disadvantage of requiring \(O(\text{log}(n/\varepsilon ))\) more qubits than AQFT Circuit 1 (see Fig. 1). This additional resource requirement might be a limitation in scenarios with stringent hardware constraints.
Comparison of T-count and T-depth for \(n\)-qubit AQFT implementations with a fixed \(\varepsilon =1/100\). In the figure, the blue solid lines represent the results from Ref.51, the green dash-dot lines represent the results of AQFT Circuit 1, and the red dashed lines represent the results of AQFT Circuit 2. The results here are derived from the analytical formulas, not from a numerical simulation on a quantum simulator. (a) T-count comparison. The difference in T-count between AQFT Circuit 1 and AQFT Circuit 2 is only \(O\left(n\right)\), which does not affect the leading-order term \(4n{\text{log}}_{2}(n/\varepsilon )\). Consequently, the difference between them is subtle. (b) T-depth comparison.
Conclusion
In conclusion, this work successfully reduces the T gate cost of AQFTs by introducing two novel circuit designs. Our first circuit halves the T-count of the previous state-of-the-art by eliminating the need for additional non-Clifford gates in the PGT construction. Our second circuit similarly halves the T-depth through parallelization at the cost of additional ancilla qubits.
Despite these advancements, certain limitations remain. Both AQFT circuits still exhibit a leading-order complexity of \(O(n{\text{log}}_{2}\left(n/\varepsilon \right))\) for T-count and T-depth. Although employing logarithmic-depth quantum adders could theoretically further improve T-depth complexity, our analysis reveals that existing logarithmic-depth quantum adders do not yet offer practical advantages due to their inherently higher T-count and T-depth. Thus, future research should prioritize the development of novel logarithmic-depth quantum adders capable of reducing both T-count and T-depth simultaneously.
Nevertheless, our results achieve meaningful reductions in T-count and T-depth for AQFT implementations. Considering that T gates constitute the primary bottleneck in the implementation of fault-tolerant quantum algorithms, and that QFT is ubiquitous in quantum computing, our proposed circuits represent a valuable contribution toward improving the practical feasibility of large-scale quantum algorithms, including Shor’s factoring and the HHL algorithm.
Data availability
The data generated during the current study are available from the corresponding author on reasonable request.
References
Kitaev, Y. Quantum measurements and the Abelian stabilizer problem. https://arxiv.org/abs/quant-ph/9511026 (1995).
Shor, P. W. Algorithms for quantum computation: discrete logarithms and factoring. Proceedings of the 35th Annual Symposium on Foundations of Computer Science 124–134 (IEEE, 1994).
Harrow, A. W., Hassidim, A. & Lloyd, S. Quantum algorithm for linear systems of equations. Phys. Rev. Lett. 103, 150502 (2009).
Brassard, G., Hoyer, P., Mosca, M. & Tapp, A. Quantum amplitude amplification and estimation. Contemp. Math. 305, 53–74 (2002).
Draper, T. G. Addition on a quantum computer. https://arxiv.org/abs/quant-ph/0008033 (2000).
Ruiz-Perez, L. & Garcia-Escartin, J. C. Quantum arithmetic with the quantum Fourier transform. Quantum Inf. Process. 16, 1–14 (2017).
Şahin, E. Quantum arithmetic operations based on quantum Fourier transform on signed integers. Int. J. Quantum Inform. 18, 2050035 (2020).
Pavlidis, A. & Floratos, E. Quantum-Fourier-transform-based quantum arithmetic with qudits. Phys. Rev. A 103, 032417 (2021).
Da-Zu, H., Zhi-Gang, C. & Ying, G. Multiparty quantum secret sharing using quantum Fourier transform. Commun. Theor. Phys. 51, 221 (2009).
Yang, Y.-G., Xia, J., Jia, X. & Zhang, H. Novel image encryption/decryption based on quantum Fourier transform and double phase encoding. Quantum Inf. Process. 12, 3477–3493 (2013).
Yang, Y.-G., Jia, X., Sun, S.-J. & Pan, Q.-X. Quantum cryptographic algorithm for color images using quantum Fourier transform and double random-phase encoding. Inf. Sci. 277, 445–457 (2014).
Zhang, W.-W., Gao, F., Liu, B., Wen, Q.-Y. & Chen, H. A watermark strategy for quantum images based on quantum Fourier transform. Quantum Inf. Process. 12, 793–803 (2013).
Yang, H.-Y. & Ye, T.-Y. Secure multi-party quantum summation based on quantum Fourier transform. Quantum Inf. Process. 17, 129 (2018).
Tan, R.-C., Lei, T., Zhao, Q.-M., Gong, L.-H. & Zhou, Z.-H. Quantum color image encryption algorithm based on a hyper-chaotic system and quantum Fourier transform. Int. J. Theor. Phys. 55, 5368–5384 (2016).
Yin, H., Lu, D. & Zhang, R. Quantum windowed Fourier transform and its application to quantum signal processing. Int. J. Theor. Phys. 60, 3896–3918 (2021).
Vorobyov, V. et al. Quantum Fourier transform for nanoscale quantum sensing. npj Quantum Inf. 7, 124 (2021).
Abrams, D. S. & Lloyd, S. Quantum algorithm providing exponential speed increase for finding eigenvalues and eigenvectors. Phys. Rev. Lett. 83, 5162–5165 (1999).
Lidar, D. A. & Wang, H. Calculating the thermal rate constant with exponential speedup on a quantum computer. Phys. Rev. E 59, 2429–2438 (1999).
Aspuru-Guzik, A., Dutoi, A. D., Love, P. J. & Head-Gordon, M. Simulated quantum computation of molecular energies. Science 309, 1704–1707 (2005).
Kassal, I., Whitfield, J. D., Perdomo-Ortiz, A., Yung, M.-H. & Aspuru-Guzik, A. Simulating chemistry using quantum computers. Annu. Rev. Phys. Chem. 62, 185–207 (2011).
Lloyd, S., Mohseni, M. & Rebentrost, P. Quantum principal component analysis. Nat. Phys. 10, 631–633 (2014).
Rebentrost, P., Mohseni, M. & Lloyd, S. Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014).
Wiebe, N., Braun, D. & Lloyd, S. Quantum algorithm for data fitting. Phys. Rev. Lett. 109, 050505 (2012).
Schuld, M., Sinayskiy, I. & Petruccione, F. Prediction by linear regression on a quantum computer. Phys. Rev. A 94, 022342 (2016).
Kerenidis, I. & Prakash, A. Quantum gradient descent for linear systems and least squares. Phys. Rev. A 101, 022316 (2020).
Rebentrost, P., Gupt, B. & Bromley, T. R. Quantum computational finance: Monte Carlo pricing of financial derivatives. Phys. Rev. A 98, 022321 (2018).
Woerner, S. & Egger, D. J. Quantum risk analysis. npj Quantum Inf. 5, 15 (2019).
Stamatopoulos, N. et al. Option pricing using quantum computers. Quantum 4, 291 (2020).
Steijl, R. & Barakos, G. N. Parallel evaluation of quantum algorithms for computational fluid dynamics. Comput. Fluids 173, 22–28 (2018).
Gaitan, F. Finding flows of a Navier-Stokes fluid through quantum computing. npj Quantum Inf. 6, 61 (2020).
Meng, Z. & Yang, Y. Quantum computing of fluid dynamics using the hydrodynamic Schrödinger equation. Phys. Rev. Res. 5, 033182 (2023).
Noorallahzadeh, M., Mosleh, M., Ahmadpour, S., Pal, J. & Sen, B. A new design of parity preserving reversible Vedic multiplier targeting emerging quantum circuits. Int. J. Numer. Modell. 36, e3089 (2023).
Noorallahzadeh, M., Mosleh, M., Misra, N. K. & Mehranzadeh, A. A novel design of reversible quantum multiplier based on multiple-control toffoli synthesis. Quantum Inf. Process. 22, 167 (2023).
Ahmadpour, S.-S. et al. A new energy-efficient design for quantum-based multiplier for nano-scale devices in internet of things. Comput. Electr. Eng. 117, 109263 (2024).
Noorallahzadeh, M., Mosleh, M. & Datta, K. A new design of parity-preserving reversible multipliers based on multiple-control toffoli synthesis targeting emerging quantum circuits. Front. Comput. Sci. 18, 186908 (2024).
Noorallahzadeh, M. & Mosleh, M. Synthesis of a reversible quantum Vedic multiplier on IBM quantum computers. Sci. Rep. 15, 18897 (2025).
Gidney, C. Halving the cost of quantum addition. Quantum 2, 74 (2018).
Thapliyal, H., Muñoz-Coreas, E. & Khalus, V. Quantum circuit designs of carry lookahead adder optimized for T-count T-depth and qubits. Sust. Comput. 29, 100457 (2021).
Campbell, E. T., Terhal, B. M. & Vuillot, C. Roads towards fault-tolerant universal quantum computation. Nature 549, 172–179 (2017).
Fowler, A. G., Mariantoni, M., Martinis, J. M. & Cleland, A. N. Surface codes: Towards practical large-scale quantum computation. Phys. Rev. A 86, 032324 (2012).
Knill, E. Fault-tolerant postselected quantum computation: schemes. https://arxiv.org/abs/quant-ph/0402171 (2004).
Bravyi, S. & Kitaev, A. Universal quantum computation with ideal Clifford gates and noisy ancillas. Phys. Rev. A 71, 022316 (2005).
Aliferis, P., Gottesman, D. & Preskill, J. Quantum accuracy threshold for concatenated distance-3 codes. Quantum Inform. Comput. 6, 97–165 (2006).
Fowler, A. G., Stephens, A. M. & Groszkowski, P. High-threshold universal quantum computation on the surface code. Phys. Rev. A 80, 052312 (2009).
Barenco, A., Ekert, A., Suominen, K.-A. & Törmä, P. Approximate quantum Fourier transform and decoherence. Phys. Rev. A 54, 139–146 (1996).
Coppersmith, D. An approximate Fourier transform useful in quantum factoring. https://arxiv.org/abs/quant-ph/0201067 (2002).
Nam, Y. S. & Blümel, R. Performance scaling of Shor’s algorithm with a banded quantum Fourier transform. Phys. Rev. A 86, 044303 (2012).
Nam, Y. S. & Blümel, R. Scaling laws for Shor’s algorithm with a banded quantum Fourier transform. Phys. Rev. A 87, 032333 (2013).
Griffiths, R. B. & Niu, C.-S. Semiclassical Fourier transform for quantum computation. Phys. Rev. Lett. 76, 3228 (1996).
Goto, H. Resource requirements for a fault-tolerant quantum Fourier transform. Phys. Rev. A 90, 052318 (2014).
Nam, Y., Su, Y. & Maslov, D. Approximate quantum Fourier transform with O(nlog(n)) T gates. npj Quantum Inf. 6, 26 (2020).
Kitaev, A. Y., Shen, A. & Vyalyi, M. N. Classical and Quantum Computation (American Mathematical Society, 2002).
Bocharov, A., Roetteler, M. & Svore, K. M. Efficient synthesis of universal repeat-until-success quantum circuits. Phys. Rev. Lett. 114, 080502 (2015).
Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information (Cambridge University Press, 2010).
Acknowledgements
The authors thank Sangkyu Baek for his valuable comments.
Funding
This research was funded by Korea National Research Foundation (NRF) grant No. NRF-2023R1A2C1003570, RS-2023-00225385, RS-2024-00422330, AFOSR grant FA2386-21-1-0089, AFOSR grant FA2386-22-1-4052, and Amazon Web Services. This work was also supported by the National Quantum Laboratory at the University of Maryland (QLab).
Author information
Contributions
B. P. developed the main idea, conducted the theoretical analysis, prepared the figures, and drafted the manuscript under the supervision of D. A. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Park, B., Ahn, D. Reducing T-count and T-depth in approximate quantum Fourier transform circuits. Sci Rep 15, 37199 (2025). https://doi.org/10.1038/s41598-025-21087-2
DOI: https://doi.org/10.1038/s41598-025-21087-2






