Reducing T-count and T-depth in approximate quantum Fourier transform circuits

Park, Byeongyong; Ahn, Doyeol

doi:10.1038/s41598-025-21087-2

Article
Open access
Published: 24 October 2025

Byeongyong Park^1,2 &
Doyeol Ahn^1,2

Scientific Reports volume 15, Article number: 37199 (2025)

Abstract

The quantum Fourier transform (QFT) is a fundamental component in various quantum algorithms, including Shor’s factoring algorithm and the Harrow-Hassidim-Lloyd (HHL) algorithm for solving systems of linear equations. Efficient implementation of the QFT is essential for the practical realization of large-scale quantum algorithms, especially in fault-tolerant quantum computing. In fault-tolerant implementations, the Clifford + T gate library is the standard choice for building quantum circuits. As the most resource-intensive component within this framework, the T gate’s associated cost poses a significant challenge to the efficient implementation of the QFT and its dependent algorithms. While approximate QFT (AQFT) circuits reduce this cost, state-of-the-art implementations still require a T-count of $8n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))$ and a T-depth of $n{\text{log}}_{2}(n/\varepsilon )+O(n)$. Although these results represent a notable achievement, the associated resource cost remains a primary bottleneck for practical, large-scale quantum algorithms, motivating further optimization. To address this bottleneck, this paper introduces two novel $n$-qubit AQFT circuits with an approximation error of $O(\varepsilon )$. Our first design, AQFT Circuit 1, halves the T-count to $4n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))$ by constructing inverse phase gradient transformation (PGT) circuits without using additional non-Clifford gates and by implementing the inverse PGTs using quantum adders. Our second design, AQFT Circuit 2, reduces the T-depth to $\frac{1}{2}n{\text{log}}_{2}(n/\varepsilon )+O(n)$ by parallelizing the inverse PGTs, which adds only $O(n)$ T gates. Both AQFT circuits employ the state-of-the-art linear-depth quantum adder. We demonstrate that the linear-depth quantum adder has advantages over the currently known logarithmic-depth quantum adder for the AQFT, not only in T-count but also in T-depth, particularly within the range $3<n/\varepsilon <{10}^{13}$, which encompasses practical system sizes.

Introduction

The quantum Fourier transform (QFT) is considered one of the most versatile components in quantum algorithms. It plays a key role in fundamental quantum algorithms such as quantum phase estimation¹, algorithms for hidden subgroup problems including Shor’s factoring algorithm², the Harrow-Hassidim-Lloyd (HHL) algorithm for solving systems of linear equations³, and quantum amplitude estimation⁴, to name a few. The applications of QFT span various fields, including basic arithmetic operations^5,6,7,8, cryptography^{9,10,11,12,13,14}, signal processing^15,16, quantum simulation^17,18,19,20, quantum machine learning^{21,22,23,24,25}, quantum finance^26,27,28, and computational fluid dynamics^29,30,31. Improving the efficiency of QFT implementations is therefore indispensable for enhancing the performance of various quantum algorithms and expanding their applicability.

Most quantum algorithms are designed based on the quantum circuit model, which describes a computation as a sequence of discrete quantum gates applied to qubits. Therefore, the efficient execution of these algorithms directly depends on the efficiency of their underlying circuits. This necessity has driven significant research into optimizing fundamental building blocks, particularly for resource-intensive tasks like quantum arithmetic^{32,33,34,35,36,37,38}.

Large-scale quantum algorithms that utilize the QFT, such as Shor’s factoring algorithm² and the HHL algorithm³, should be implemented in a fault-tolerant manner due to the fragility of quantum information. In fault-tolerant quantum computations, circuits are synthesized using universal and fault-tolerant gates. The Clifford + T gate library is generally chosen for this purpose in various promising error correction codes^39,40. Within these fault-tolerance approaches, Clifford gates are easier to implement, often transversally. In contrast, T gates require more resource-intensive methods, such as magic state distillation^41,42, thus dominating the implementation cost^43,44. Consequently, the T-gate cost, typically quantified by the number of T gates (T-count) and the depth of T gates (T-depth), has become a primary bottleneck for the efficient execution of large-scale quantum algorithms.

In fault-tolerant quantum computing, QFT is approximately implemented with an acceptable error, known as approximate quantum Fourier transform (AQFT). Typically, AQFT is implemented by removing all controlled rotation gates with angles below a certain threshold value. It has been demonstrated that, for effectively implementing various quantum algorithms utilizing QFT, applying AQFT instead of a full QFT often yields satisfactory results without significant performance penalties^45,46,47,48. For instance, using AQFT with a threshold value of $\pi /2^{8}$ is sufficient for factoring RSA-2048 with a $95\%$ success rate⁴⁸.
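To make the truncation concrete, the following NumPy sketch builds the textbook QFT circuit for a small register, drops every controlled rotation $R_k$ (angle $2\pi /2^{k}$) whose index exceeds a cutoff, and measures the spectral-norm distance from the exact QFT. The function names, the `max_k` cutoff parameter, and the bit-ordering convention are illustrative assumptions rather than anything taken from the cited works, and the dense matrices restrict it to a handful of qubits.

```python
import numpy as np

def hadamard_on(q, n):
    """Hadamard on qubit q of n qubits (qubit 0 = most significant bit)."""
    h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    return np.kron(np.kron(np.eye(2**q), h), np.eye(2**(n - q - 1)))

def qft_circuit(n, max_k=None):
    """Dense matrix of the textbook QFT circuit.

    Controlled rotations R_k with k > max_k are skipped (max_k is a
    hypothetical cutoff parameter; None keeps every rotation).
    """
    dim = 2**n
    idx = np.arange(dim)
    bit = lambda q: (idx >> (n - 1 - q)) & 1   # value of qubit q in each basis index
    u = np.eye(dim, dtype=complex)
    for i in range(n):
        u = hadamard_on(i, n) @ u
        for j in range(i + 1, n):
            k = j - i + 1                      # controlled R_k, angle 2*pi / 2**k
            if max_k is not None and k > max_k:
                continue
            diag = np.where(bit(i) & bit(j), np.exp(2j * np.pi / 2**k), 1.0)
            u = diag[:, None] * u              # diagonal gate: scale the rows
    # Final qubit-order reversal, applied as a row permutation.
    rev = [int(format(x, f"0{n}b")[::-1], 2) for x in range(dim)]
    return u[rev, :]

def aqft_error(n, max_k):
    """Spectral-norm distance between the exact and truncated circuits."""
    return np.linalg.norm(qft_circuit(n) - qft_circuit(n, max_k), 2)
```

Calling `aqft_error(5, 3)` compares a 5-qubit QFT with the version that keeps only $R_2$ and $R_3$; by the triangle inequality the result cannot exceed the sum of the dropped rotation angles.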

The basic implementation method for an $n$-qubit AQFT involves approximating the $n$-qubit QFT by omitting small-angle controlled rotations and decomposing the remaining controlled rotation gates into the Clifford + T gate library. This approach reduces the T-count for QFT implementation from $O({n}^{2}\text{log}n)$ to $O(n{\text{log}}^{2}n)$, where the dependence on the approximation error is omitted for brevity. For the semiclassical version of the AQFT⁴⁹, which is followed by measurement and is therefore of limited use in the middle of a computation, it has been demonstrated that AQFT can be implemented with a T-count of $O(n\text{log}n)$⁵⁰. In the case of fully coherent AQFT, Nam et al.⁵¹ achieved a T-count of $O(n\text{log}n)$ using a method originally reported in Ref.⁵² that involves implementing the phase gradient transformation (PGT) with a quantum adder. They utilized Toffoli gates (specifically, relative phase Toffoli gates, measurements, and classically controlled gates) to construct PGT circuits and employed quantum adders from Ref.³⁷. The T-count and T-depth of their $n$-qubit AQFT circuit with an approximation error of $O(\varepsilon )$ are reported as $8n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))$ and $n{\text{log}}_{2}(n/\varepsilon )+5n-O({\text{log}}^{2}(n/\varepsilon ))$, respectively. While this work achieved an asymptotic scaling of $O(n{\text{log}}_{2}(n/\varepsilon ))$, the large constant factors on the leading terms of both the T-count and T-depth indicate a clear potential for further practical optimizations.

In this paper, we present two new fully coherent $n$-qubit AQFT circuits, AQFT Circuit 1 and AQFT Circuit 2, which achieve further optimization of T-count and T-depth while maintaining an approximation error of $O(\varepsilon )$. AQFT Circuit 1 is designed to reduce the T-count. In this design, we construct the rotation gate layers that perform inverse PGTs without using Toffoli gates, which are non-Clifford gates and account for approximately half of the T-count in the AQFT circuit design in Ref.⁵¹. Next, we replace the rotation gate layers with quantum adders from Ref.³⁷. On the other hand, AQFT Circuit 2 is designed to reduce the T-depth. In this design, we construct the rotation gate layers without Toffoli gates, pair and parallelize two rotation gate layers into one, and replace them with quantum adders from Ref.³⁷.

As a result, AQFT Circuit 1 achieves a T-count of $4n{\text{log}}_{2}(n/\varepsilon )-O({\text{log}}^{2}(n/\varepsilon ))$ with a T-depth of $n{\text{log}}_{2}(n/\varepsilon )+n-O({\text{log}}^{2}(n/\varepsilon ))$, and AQFT Circuit 2 achieves a T-depth of $\frac{1}{2}n{\text{log}}_{2}(n/\varepsilon )+\frac{3}{2}n-O({\text{log}}^{2}(n/\varepsilon ))$ with a T-count of $4n{\text{log}}_{2}(n/\varepsilon )+n-O({\text{log}}^{2}(n/\varepsilon )).$ One might argue that using logarithmic-depth quantum adders could reduce the T-depth in AQFT implementation at the expense of increasing T-count. However, we demonstrate that the linear-depth quantum adders described in Ref.³⁷ offer advantages over the state-of-the-art logarithmic-depth quantum adders in Ref.³⁸, even in terms of T-depth optimization. This conclusion is supported by a comparison of both approaches when applied to AQFT circuits for systems satisfying $3<n/\varepsilon <{10}^{13}$.

Overview of the AQFT circuits and their advantages

In this paper, we introduce the construction of two novel $n$-qubit AQFT circuits based on the Clifford + T gate library. AQFT Circuit 1 is optimized to minimize the T-count, whereas AQFT Circuit 2 focuses on reducing the T-depth. These AQFT circuits approximate the QFT with an error of $O(\varepsilon )$. Figure 1a,b present the schematic diagrams for AQFT Circuit 1 and AQFT Circuit 2, respectively.

Similar to the AQFT circuit described in Ref.⁵¹, the construction of our AQFT circuits involves creating layers of ${Z}^{\theta }$ gates (see Eq. (1)), each of which performs an inverse PGT, and implementing them using quantum adders.

$${Z}^{\theta }=\left(\begin{array}{cc}1& 0\\ 0& {e}^{i\pi \theta }\end{array}\right).$$

(1)

In order to facilitate this process, a special quantum state $|{\psi }_{b+1}\rangle \equiv \frac{1}{\sqrt{{2}^{b+1}}} {\sum }_{k=0}^{{2}^{b+1}-1}{e}^{2\pi ik/{2}^{b+1}}|k\rangle$ needs to be prepared, where $b$ is chosen as $\lceil{\text{log}}_{2}(n/\varepsilon )\rceil$. We employ “repeat until success” (RUS) circuits described in Ref.⁵³ to prepare the special quantum state $|{\psi }_{b+1}\rangle$. A detailed explanation of the inverse PGT and its execution through a quantum adder is provided in the Supplementary Material.
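The role of the special state can be illustrated numerically. The sketch below builds $|{\psi }_{m}\rangle$ as a plain amplitude vector and checks the standard phase-kickback property: adding an integer $x$ modulo $2^{m}$ into the register returns the same state multiplied by $e^{-2\pi ix/{2}^{m}}$. The helper names are hypothetical, $x$ is a classical integer here (in the circuit it is held in a quantum register), and the sign convention follows the state definition above; the paper's own construction is in its Supplementary Material.

```python
import math
import numpy as np

def choose_b(n, eps):
    """b = ceil(log2(n / eps)), as chosen in the text."""
    return math.ceil(math.log2(n / eps))

def phase_gradient_state(m):
    """Amplitudes of |psi_m> = 2^{-m/2} * sum_k exp(2*pi*i*k / 2^m) |k>."""
    dim = 2**m
    k = np.arange(dim)
    return np.exp(2j * np.pi * k / dim) / np.sqrt(dim)

def add_constant(state, x):
    """Modular adder |k> -> |k + x mod 2^m>, acting on an amplitude vector."""
    return np.roll(state, x)

# Small illustrative register: adding x only multiplies |psi_m> by a phase.
m = 5
psi = phase_gradient_state(m)
x = 11
kicked = add_constant(psi, x)
expected_phase = np.exp(-2j * np.pi * x / 2**m)
```

Here `kicked` equals `expected_phase * psi` up to rounding, which is why an adder acting on this state can stand in for a layer of phase rotations.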

Before presenting detailed circuit designs and results, we provide a rough overview of the advantages of our AQFT circuits. AQFT Circuit 1 requires $\sim n$ $(b+1)$-qubit quantum adders to perform $\sim n$ inverse PGTs. Constructing these quantum adders involves a T-count of $\sim 4nb$. A key advantage of AQFT Circuit 1 is that the construction of inverse PGTs does not require additional non-Clifford gates, such as Toffoli gates (note that each “${C}_{i}$” box in Fig. 1a consists solely of Clifford gates). This omission of non-Clifford gates results in a reduction of the T-count from $\sim 8nb$ to $\sim 4nb$. AQFT Circuit 2 reduces T-depth from $\sim nb$ to $\sim \frac{1}{2}nb$ compared to AQFT Circuit 1 by pairing and parallelizing the inverse PGTs, which requires only $O(n)$ additional T gates. These additional T gates are included within the ${C{\prime}}_{i}$ boxes illustrated in Fig. 1b.

In the subsequent sections, for each AQFT circuit, we present the circuit design process, resource estimation focused on T-count and T-depth, and approximation error analysis.

AQFT circuit 1: optimized for T-count

Circuit design

Throughout this paper, we employ a compact notation for sequential CNOT gates that share a single control qubit but act on different target qubits. This “fan-out” notation is illustrated in Supplementary Material. This convention is used to improve the readability of complex circuits.

AQFT Circuit 1 is constructed through the following four steps:

(Step 1: QFT subcircuit decomposition) We begin by decomposing the subcircuits of the standard QFT circuit described in Ref.⁵⁴. The initial structure of these subcircuits is illustrated on the left-hand side of Fig. 2. The right-hand side of Fig. 2 shows the decomposed structure of the QFT subcircuits. A detailed explanation of this decomposition process is provided in Supplementary Material. In the decomposed subcircuit, multiple ${Z}^{\theta }$ gate layers are present. However, in the subsequent step, we will consolidate these layers. Specifically, the ${Z}^{\theta }$ gates within the red boxes from all the decomposed QFT subcircuits in Fig. 2 will be combined into a single ${Z}^{\theta }$ gate layer and placed at the very front of the QFT circuit. Similarly, the ${Z}^{\theta }$ gates in the purple boxes from all the decomposed QFT subcircuits in Fig. 2 will be merged into a ${Z}^{\theta }$ gate layer at the very end of the QFT circuit.

(Step 2: Constructing QFT circuit) We combine all the decomposed QFT subcircuits to construct a full QFT circuit. Next, all the ${Z}^{\theta }$ gates in the red boxes from the decomposed subcircuits in Fig. 2 are gathered at the very front of the QFT circuit. This reordering is possible because both the circuits in the green and red boxes of Fig. 2 have diagonal matrix representations and therefore commute with each other. Similarly, we gather the ${Z}^{\theta }$ gates in the purple boxes from the decomposed subcircuits in Fig. 2 at the very end of the QFT circuit. After this reorganization, the ${Z}^{\theta }$ gates at the very front and end of the QFT circuit have the form of ${Z}^{({2}^{k-1}-1)/{2}^{k}}$, where $k\in \{1,2,\dots ,n\}$. We divide each ${Z}^{({2}^{k-1}-1)/{2}^{k}}$ gate into ${Z}^{1/2}$ and ${Z}^{-1/{2}^{k}}$ gates. Note that the ${Z}^{1/2}$ gate is a Clifford gate.
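The split at the end of Step 2 can be checked in a few lines, reading the gate exponent as $({2}^{k-1}-1)/{2}^{k}=1/2-1/{2}^{k}$, which is the reading under which the two stated factors multiply back to the original gate. The helper names below are hypothetical; `z_pow` simply transcribes Eq. (1).

```python
from fractions import Fraction
import numpy as np

def z_pow(theta):
    """Z^theta = diag(1, exp(i*pi*theta)), transcribed from Eq. (1)."""
    return np.diag([1.0, np.exp(1j * np.pi * float(theta))])

def split_exponent(k):
    """Return the exponents (1/2, -1/2^k); their sum is (2^(k-1) - 1) / 2^k."""
    return Fraction(1, 2), Fraction(-1, 2**k)

S_GATE = np.diag([1, 1j])                        # Clifford phase gate
T_GATE = np.diag([1, np.exp(1j * np.pi / 4)])    # non-Clifford T gate

# Example: k = 2 gives exponent 1/4, i.e. the T gate, written as Z^{1/2} Z^{-1/4}.
a, b = split_exponent(2)
recombined = z_pow(a) @ z_pow(b)
```

Since $Z^{1/2}$ is the Clifford phase gate, all of the non-Clifford cost sits in the $Z^{-1/{2}^{k}}$ factor, which is the part Step 3 truncates.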

(Step 3: Approximation) We remove all the ${Z}^{-1/{2}^{k}}$ gates with $k>b$ from the QFT circuit (see Fig. 3).

(Step 4: Inverse PGT implementation using quantum adder) In Fig. 3, each ${Z}^{\theta }$ gate layer in blue boxes performs an inverse PGT with the help of a ${Z}^{-1}$ gate. The effect of inserting the ${Z}^{-1}$ gates can be canceled by inserting $Z$ gates. We implement each inverse PGT using a quantum adder from Ref.³⁷.
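A short numerical check shows why a rotation layer and an adder are interchangeable in Step 4. It assumes one reading of the layer, namely a $Z^{-1/{2}^{k}}$ gate on bit $k$ for $k=0,\dots ,b$ with the $Z^{-1}$ gate as the $k=0$ term and bit 0 as the most significant bit; the exact layout is in Fig. 3 and is not reproduced here. Under that assumption, the phase a basis state $|x\rangle$ picks up equals the phase kicked back when $x$ is added into a $(b+1)$-qubit phase gradient register.

```python
import numpy as np

def rotation_layer_phase(x, b):
    """Phase acquired by |x> under Z^{-1/2^k} on bit k, k = 0..b (assumed layout)."""
    bits = [(x >> (b - k)) & 1 for k in range(b + 1)]   # bit 0 = most significant
    theta = -sum(bit / 2**k for k, bit in enumerate(bits))
    return np.exp(1j * np.pi * theta)

def adder_kickback_phase(x, b):
    """Phase from adding x (mod 2^(b+1)) into the state |psi_{b+1}>."""
    return np.exp(-2j * np.pi * x / 2**(b + 1))

# Compare the two phases for every (b+1)-bit input at a small example size.
b_example = 4
max_gap = max(
    abs(rotation_layer_phase(x, b_example) - adder_kickback_phase(x, b_example))
    for x in range(2**(b_example + 1))
)
```

`max_gap` is zero up to rounding, since $\sum_{k}{x}_{k}/{2}^{k}=x/{2}^{b}$ when $x=\sum_{k}{x}_{k}{2}^{b-k}$.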

T-count and T-depth estimation

In this section, we estimate the T-count and T-depth of AQFT Circuit 1. The T-count refers to the total number of T gates. The depth of a circuit refers to the number of sequential layers of gates that cannot be executed in parallel. Based on this, the T-depth is the number of sequential layers of T gates that cannot be parallelized, effectively ignoring the depth contributed by other gate types such as Clifford gates. The T-count and T-depth estimates presented in this paper are derived from the analytical formulas developed in the following sections, not from numerical simulations on a quantum simulator.
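The two metrics defined above can be computed mechanically for any gate list. The sketch below uses a hypothetical circuit format (a list of `(name, qubits)` tuples in execution order) and counts T layers along each qubit's causal history. It does not commute or reorder gates, so it reports the T-depth of the circuit as written, which may exceed that of an optimized schedule.

```python
def t_count_and_depth(gates):
    """Return (T-count, T-depth) for a list of (name, qubits) tuples.

    Only gates named "T" or "Tdg" add to either metric; every other
    gate just synchronizes the T-layer index of the qubits it touches.
    """
    layer = {}   # qubit -> number of T layers in its causal past
    count = 0
    for name, qubits in gates:
        level = max((layer.get(q, 0) for q in qubits), default=0)
        if name in ("T", "Tdg"):
            count += 1
            level += 1
        for q in qubits:
            layer[q] = level
    return count, max(layer.values(), default=0)

# Two T gates in parallel, a CNOT joining the qubits, then one more T gate.
example = [("T", (0,)), ("T", (1,)), ("CNOT", (0, 1)), ("T", (1,)), ("H", (2,))]
example_count, example_depth = t_count_and_depth(example)
```

For this example the T-count is 3 but the T-depth is only 2, because the first two T gates act on different qubits and share a layer.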

Among the inverse PGTs in Fig. 3, the 3-qubit inverse PGT does not need to be replaced by a 3-qubit quantum adder, because it only requires a single T gate. Considering this, the T gates required to construct AQFT Circuit 1 are found in four parts:

(1) $(n-b+3)$ $(b+1)$-qubit quantum adders,
(2) One $k$-qubit quantum adder for each $k\in \{4, 5, \dots , b\}$,
(3) One T gate in the 3-qubit inverse PGT,
(4) T gates for preparing the special state $|{\psi }_{b+1}\rangle$.

The $t$-qubit quantum adder in Ref.³⁷ requires a T-count of $(4t-4)$ and a T-depth of $t$. With the quantum adder, the T-count and T-depth in part (1) are $4nb-4{b}^{2}+12b$ and $nb+n-{b}^{2}+2b+3$, respectively. The T-count and T-depth in part (2) are ${\sum }_{k=4}^{b}\left(4k-4\right)=2{b}^{2}-2b-12$ and ${\sum }_{k=4}^{b}k={b}^{2}/2+b/2-6$, respectively. For part (4), $(b-2)$ ${Z}^{\theta }$ gates have to be synthesized with an error of $\varepsilon /b$ for each gate (see Supplementary Material). The value $\varepsilon /b$ is chosen so that AQFT Circuit 1 has an approximation error of $O(\varepsilon )$. In the gate synthesis, we use RUS circuits from Ref.⁵³, and the expected T-count to synthesize a ${Z}^{\theta }$ gate with a Fourier angle with an error of $\varepsilon {\prime}$ is $1.08{\text{log}}_{2}(1/\varepsilon {\prime})+17.5$, so that part (4) has an expected T-count of $(b-2)\left\{1.08{\text{log}}_{2}(b/\varepsilon )+17.5\right\}$. Therefore, the total T-count for AQFT Circuit 1 is

$$4nb-2{b}^{2}+b\left\{1.08 {\text{log}}_{2}\left(\frac{b}{\varepsilon }\right)+27.5\right\}-2.16{\text{log}}_{2}\left(\frac{b}{\varepsilon }\right)-45=4nb-O({b}^{2})$$

(2)

and the total T-depth for AQFT Circuit 1 is

$$nb+n-\frac{1}{2}{b}^{2}+\frac{5}{2}b+1.08{\text{log}}_{2}\left(\frac{b}{\varepsilon }\right)+15.5=nb+n-O({b}^{2})$$

(3)
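The estimates above are easy to evaluate for concrete sizes. The following sketch recomputes the adder contributions of parts (1) and (2) from the per-adder costs $(4t-4)$ and $t$, and evaluates the closed forms of Eqs. (2) and (3) as printed. The function name and the example values $n=2048$, $\varepsilon ={10}^{-3}$ are arbitrary illustrative choices.

```python
import math

def aqft1_resources(n, eps):
    """Evaluate the AQFT Circuit 1 estimates for given n and eps."""
    b = math.ceil(math.log2(n / eps))
    log_term = math.log2(b / eps)

    # Part (1): (n - b + 3) adders on (b + 1) qubits, each costing 4t - 4 T gates.
    full_adders = n - b + 3
    part1_count = full_adders * (4 * (b + 1) - 4)
    part1_depth = full_adders * (b + 1)

    # Part (2): one k-qubit adder for each k = 4, ..., b.
    part2_count = sum(4 * k - 4 for k in range(4, b + 1))
    part2_depth = sum(range(4, b + 1))

    # Closed forms transcribed from Eqs. (2) and (3).
    t_count = (4 * n * b - 2 * b**2 + b * (1.08 * log_term + 27.5)
               - 2.16 * log_term - 45)
    t_depth = n * b + n - 0.5 * b**2 + 2.5 * b + 1.08 * log_term + 15.5

    return {
        "b": b,
        "part1": (part1_count, part1_depth),
        "part2": (part2_count, part2_depth),
        "t_count": t_count,
        "t_depth": t_depth,
    }

res = aqft1_resources(2048, 1e-3)
leading_ratio = res["t_count"] / (4 * 2048 * res["b"])
```

At this size `leading_ratio` is very close to 1, which illustrates that the $4nb$ term dominates the T-count once $n$ is much larger than $b$.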

Approximation error analysis

The approximation error is defined in terms of the spectral norm⁵⁴; the definition can be found in the Supplementary Material. We demonstrate that AQFT Circuit 1 has an error of $O(\varepsilon )$ when used in place of a QFT circuit. The approximation error in AQFT Circuit 1 arises from two sources:

(1) Removing all the ${Z}^{-1/{2}^{k}}$ gates with $k>b$,
(2) Approximating ${Z}^{\theta }$ gates to prepare the special state $|{\psi }_{b+1}\rangle$ using RUS circuits.

We call the error from part (1) ${\varepsilon }_{1}^{(1)}$ and the error from part (2) ${\varepsilon }_{1}^{(2)}$.

The error for removing a ${Z}^{\theta }$ gate is

$$\underset{|\psi \rangle }{\text{max}}\Vert \left(I-{Z}^{\theta }\right)|\psi \rangle \Vert$$

(4)

and this can be rewritten as

$$\sqrt{2\left(1-\text{cos}\theta \pi \right)}=2\left|\text{sin}(\pi \theta /2)\right|<\left|\pi \theta \right|$$

(5)

since

$${\left(I-{Z}^{\theta }\right)}^{\dagger }\left(I-{Z}^{\theta }\right)=\left(\begin{array}{cc}0& 0\\ 0& 2\left(1-\text{cos}\theta \pi \right)\end{array}\right)$$

(6)
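Equations (4)–(6) can be confirmed numerically. The sketch below computes the spectral norm of $I-{Z}^{\theta }$ directly and compares it with the closed form $2\left|\text{sin}(\pi \theta /2)\right|$ and the bound $\left|\pi \theta \right|$; the function name is a hypothetical helper.

```python
import numpy as np

def removal_error(theta):
    """Spectral norm of I - Z^theta, i.e. the worst-case error of dropping the gate."""
    z = np.diag([1.0, np.exp(1j * np.pi * theta)])
    return np.linalg.norm(np.eye(2) - z, 2)

# Angles of the form -1/2^k, as removed in Step 3.
thetas = [-1 / 2**k for k in range(1, 11)]
errors = [removal_error(t) for t in thetas]
closed_forms = [2 * abs(np.sin(np.pi * t / 2)) for t in thetas]
bounds = [abs(np.pi * t) for t in thetas]
```

Each entry of `errors` matches the corresponding closed form and lies strictly below its bound, and the gap shrinks as the angle gets smaller.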

Therefore, the error ${\varepsilon }_{1}^{(1)}$ is bounded as follows:

$${\varepsilon }_{1}^{(1)}\le \left(n-b\right)\cdot {\sum }_{j=b+1}^{n}\frac{\pi }{{2}^{j}}<n{2}^{-b}\pi \le \varepsilon \pi$$

(7)

Note that $b$ is chosen as $\lceil{\text{log}}_{2}(n/\varepsilon )\rceil$, thus we have $n{2}^{-b}\le \varepsilon$.
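The bound in Eq. (7) can be evaluated for specific sizes to see how much slack it leaves. The sketch below computes the middle expression of the chain and the two quantities it is compared against, $n{2}^{-b}\pi$ and $\varepsilon \pi$; the function name and the example values are illustrative assumptions.

```python
import math

def truncation_bound(n, eps):
    """Return (b, (n - b) * sum_{j=b+1}^{n} pi / 2^j) for the chosen b."""
    b = math.ceil(math.log2(n / eps))
    # ldexp(pi, -j) = pi / 2^j; it underflows to 0.0 instead of raising for large j.
    tail = sum(math.ldexp(math.pi, -j) for j in range(b + 1, n + 1))
    return b, (n - b) * tail

# The last case has n / eps equal to a power of two, so n * 2^-b equals eps exactly.
cases = [(64, 1e-2), (2048, 1e-3), (1024, 2**-6)]
results = [(n, eps) + truncation_bound(n, eps) for n, eps in cases]
```

In every case the computed bound is below $n{2}^{-b}\pi$, which in turn does not exceed $\varepsilon \pi$; the two coincide only when $n/\varepsilon$ is an exact power of two.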

When synthesizing ${Z}^{\theta }$ gates to prepare the special state $|{\psi }_{b+1}\rangle$, each ${Z}^{\theta }$ gate is approximated with an error of $\varepsilon /b$, resulting in ${\varepsilon }_{1}^{(2)}\le \left(b-2\right)\cdot \frac{\varepsilon }{b}<\varepsilon$. Consequently, the total error of implementing AQFT Circuit 1 instead of the QFT circuit is $O(\varepsilon )$ because

$${\varepsilon }_{1}^{(1)}+{\varepsilon }_{1}^{(2)}<(1+\pi )\varepsilon$$

(8)