Introduction

Many complicated problems have recently been solved through quantum computation1,2. The key component of quantum computation is quantum entanglement3, which has been realized in different quantum systems4,5,6. These quantum systems are primarily based on superconducting qubits7, superconducting electronics8, ion traps, quantum optics, quantum-dot physics, atomic physics, and quantum cavities9. In a superconducting quantum system, 51-qubit one-dimensional cluster states have been realized with a fidelity of 0.637. Such progress is necessary to realize medium-scale quantum computing10. However, a common issue in all these quantum systems is noise. It is generally difficult to eliminate, and even describing it accurately is a challenge11,12.

The standard approach for describing quantum systems with noise is to perform quantum tomography13,14. However, the resources consumed in this procedure increase exponentially with the number of qubits in the system. To address this resource problem, several schemes have been proposed that exploit the structure of the density matrix or process matrix in quantum tomography. For example, matrix product state tomography uses the matrix product property to save resources when describing many-body quantum systems, and this approach has been verified by reconstructing a simulated 14-spin state in an experiment15. Another property of certain density matrices is low rank, which is exploited in some protocols16,17. Apart from the resource obstacle, another challenge is to obtain a physical density matrix or process matrix for a multi-qubit system in a relatively short time. To this end, many efficient estimators and algorithms have been developed to reconstruct the density matrix18,19. Similarly, some of these algorithms have been extended to identify or characterize the process of a system by calculating specific system parameters. This procedure is called quantum process tomography, and it aims to describe the dynamics of a system. By exploiting the sparsity of the process matrix, compressive sensing is used to reduce the number of copies of the state20. To achieve the same target, another efficient approach is adaptive quantum tomography, which updates the measurement settings based on previous measurement results21. This is similar to self-guided quantum tomography, where a gradient algorithm searches for the state by minimizing the infidelity22. This method has recently been verified experimentally23.

However, characterizing a system with no apparent structure remains challenging with these methods. For general quantum systems, the resource consumption is directly related to the dimension of the system, which grows exponentially with the number of qubits. Therefore, rather than estimating all the parameters of the system, we focus on estimating several key parameters.

One of these is the fidelity of the system24. It measures the similarity between the input state and the output state of the system and is often the target metric, as it requires significantly fewer resources and provides insight into the impact of noise on a quantum gate25,26,27. When the noise is weak, the fidelity is close to 1. Conversely, when the noise is strong, the fidelity approaches 0. Thus, fidelity is an advantageous figure of merit for characterizing a quantum system.

When the variation of the noise of a quantum system is relatively weak, the average fidelity is used and estimated by Monte Carlo sampling. The minimum experimental effort to estimate the average error of a quantum gate scales as \(2^n\) for an n-qubit system28. When the system consists of a set of quantum gates, for instance random sequences of Clifford gates, the randomized benchmarking protocol is used to characterize the average error rate or the average sequence fidelity of the system29. To further evaluate the average error of an individual quantum gate, interleaved randomized benchmarking was developed for the case in which the average noise variation over all Clifford gates is small30. This protocol has been extended to mid-circuit measurements. It interleaves mid-circuit measurements on an ancilla qubit on a control qubit. This technique efficiently characterizes the performance of quantum computation31. To evaluate specific noise in a quantum circuit, such as incoherent errors or time-dependent Markovian noise, the incoherent infidelity has been proposed. This tool is likewise restricted to the weak-noise regime32.

When the noise of the system is strong, neither the incoherent infidelity nor the average fidelity performs well. Since the minimum fidelity of the system might be far below the average fidelity in this scenario, fidelity in the general case is studied. For instance, a protocol has been proposed for cross-platform verification of fidelity. It requires local measurements in randomized product bases and has been verified with 10-qubit entangled states in a trapped-ion quantum simulator33. When both states are pure, the fidelity is calculated by combining the computational basis and the entanglement basis34. Meanwhile, the fidelity between a pure state and a mixed state is typically estimated by Direct Fidelity Estimation (DFE)35,36 in the Pauli basis. The number of copies of the state it consumes increases linearly with the dimension of the system. However, the resource consumption of this approach remains substantial for multi-qubit systems; the details are given in the first part of the appendix. Most of these protocols for state parameter estimation have been compared with each other in terms of the prior information used, complexity, assumptions, feasibility and so on37.

Here, we introduce a novel fidelity estimation approach to further minimize the resource consumption.

Optimal fidelity estimation

An optimal method is developed to minimize the number of copies of the state needed for fidelity estimation. The two states are denoted by \(\rho _1\) and \(\rho _2\), respectively. The state \(\rho _1\) is the target state, which is generally a pure state. It is known and stored in a classical memory, so its decomposition in any basis can be calculated for Eq. (1). The state \(\rho _2\) is the target state mixed with certain noise, so it is generally a mixed state. It is an unknown state encoded in a quantum system. The goal is to compare the unknown state \(\rho _2\) of a quantum system with the known target state \(\rho _1\). For an n-qubit system, the fidelity between \(\rho _1\) and \(\rho _2\) is defined as

$$\begin{aligned} F(\rho _{2},\rho _{1}) = \text {Tr}(\rho _{2}\rho _{1}) = \sum _{j=1}^{d^2} S_j \text {Tr}(W_j \rho _{2}) = \sum _{j=1}^{d^2} S_j \sum _{k=1}^{d} e_{jk} \text {Tr}(\Pi _{jk} \rho _{2}), \end{aligned}$$
(1)

where \(d=2^n\) is the dimension of the Hilbert space and \(S_j\) is the coefficient of the operator \(W_j\) in the decomposition \(\rho _1=\sum _j S_jW_j\), with \(W_j\) a tensor product of Pauli matrices and identity operators. \(\Pi _{jk}\) is the projection operator onto the k-th eigenstate of the j-th measurement setting, \(e_{jk}\) is the corresponding eigenvalue, and

$$\begin{aligned} p_{jk}=Tr(\Pi _{jk} \rho _{2}) \end{aligned}$$
(2)

is the probability of obtaining the k-th eigenstate. Since \(p_{jk}\) is not directly accessible, it is approximated by the relative frequency \(f_{jk}\). Similarly, when \(W_j\) contains an identity operator, \(Tr(W_j\rho _{2})\) is obtained as a linear combination of the relative frequencies of other operators \(W_{j'}\) that contain no identity operator.

Therefore, the fidelity with respect to the pure target state is obtained by substituting Eq. (2) into Eq. (1) and replacing \(p_{jk}\) by \(f_{jk}\),

$$\begin{aligned} F(\rho _{2},\rho _{1})=\sum _{kj}S_{j}e_{jk}f_{jk}. \end{aligned}$$
(3)
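As an illustration of Eq. (1), the following minimal Python sketch decomposes a known target state into Pauli strings and evaluates the fidelity from the Pauli expectation values of the second state. The function names (`pauli_strings`, `pauli_coefficients`, `fidelity_from_pauli`) and the depolarized Bell state are illustrative assumptions, not part of the original work, and exact expectation values stand in for measured relative frequencies.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices, including the identity.
PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_strings(n):
    """Yield (label, W_j) for all 4**n tensor products of Pauli operators."""
    for labels in itertools.product("IXYZ", repeat=n):
        w = np.array([[1.0 + 0j]])
        for s in labels:
            w = np.kron(w, PAULIS[s])
        yield "".join(labels), w

def pauli_coefficients(rho):
    """Coefficients S_j with rho = sum_j S_j W_j, i.e. S_j = Tr(W_j rho) / d."""
    d = rho.shape[0]
    n = int(round(np.log2(d)))
    return {label: np.trace(w @ rho).real / d for label, w in pauli_strings(n)}

def fidelity_from_pauli(rho1, rho2):
    """Evaluate F = sum_j S_j Tr(W_j rho2), the middle expression of Eq. (1)."""
    n = int(round(np.log2(rho1.shape[0])))
    coeffs = pauli_coefficients(rho1)
    return sum(coeffs[label] * np.trace(w @ rho2).real
               for label, w in pauli_strings(n))

# Hypothetical example: a Bell state as target and a depolarized copy as rho2.
bell = np.zeros((4, 4), dtype=complex)
bell[np.ix_([0, 3], [0, 3])] = 0.5
noisy = 0.9 * bell + 0.1 * np.eye(4) / 4
print(fidelity_from_pauli(bell, noisy), np.trace(bell @ noisy).real)
```

Both printed numbers should agree with \(Tr(\rho _2\rho _1)=0.925\) up to floating-point error, since the Pauli expansion is only a change of basis.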

In Eq. (3), each term of F is estimated from the measured relative frequencies. To analyze the accuracy of F, its deviation is calculated, which is mainly determined by the number of copies of the state. The goal of distributing a limited number of copies of the state is to control the deviation of the fidelity \(\Delta F\). Specifically, the standard deviation of the fidelity \(\Delta F\) is

$$\begin{aligned} \Delta F=\sqrt{\sum _{kj}(\frac{\partial F}{\partial f_{kj}})^2 D(f_{kj})^2}, \end{aligned}$$
(4)

in which \(D(f_{kj})\) is the standard deviation of \(f_{kj}\). When \(t_j\) copies of the state are projected onto the bases \(\Pi _{jk}\) of the measurement setting \(W_j\) (\(W_j=\sum _k e_{jk}\Pi _{jk}\)), the variance of the number of copies projected onto the basis \(\Pi _{jk}\) is \(t_jf_{kj}(1-f_{kj})\), where \(f_{kj}=t_{kj}/t_{j}\) and \(t_{kj}\) is the number of copies of the state detected on the basis \(\Pi _{jk}\). Therefore, the standard deviation of the relative frequency \(f_{kj}\), which follows from the binomial distribution of the copies of the state, is

$$\begin{aligned} D(f_{kj})=\sqrt{\frac{f_{kj}(1-f_{kj})}{t_j}}. \end{aligned}$$
(5)

Since \(\partial F/\partial f_{kj}=S_je_{jk}\) and \(e_{jk}^2=1\), Eq. (4) is rewritten as

$$\begin{aligned} \Delta F=\sqrt{\sum _{kj}(\frac{\partial F}{\partial f_{kj}})^2 D(f_{kj})^2} =\sqrt{\sum _{kj}(S_{j})^2(\sqrt{f_{kj}(1-f_{kj})/t_j})^2} =\sqrt{\sum _{kj}(S_{j})^2 f_{kj}(1-f_{kj})/t_j} \end{aligned}$$
(6)
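Eq. (6) can be evaluated directly once the coefficients, frequencies and copy numbers are given. The sketch below is a minimal illustration; the function name `fidelity_std` and the array layout (one row of frequencies per measurement setting) are assumptions made here for convenience.

```python
import numpy as np

def fidelity_std(S, f, t):
    """Standard deviation of F according to Eq. (6).

    S: shape (J,)   coefficients S_j of the target state
    f: shape (J, K) relative frequencies f_kj, one row per setting j
    t: shape (J,)   number of copies t_j spent on setting j
    """
    S = np.asarray(S, dtype=float)
    f = np.asarray(f, dtype=float)
    t = np.asarray(t, dtype=float)
    # Per-setting contribution S_j^2 * sum_k f_kj (1 - f_kj) / t_j.
    per_setting = S**2 * np.sum(f * (1.0 - f), axis=1) / t
    return float(np.sqrt(per_setting.sum()))

# Hypothetical numbers: quadrupling the copies halves the deviation.
print(fidelity_std([0.5], [[0.5, 0.5]], [100]))
print(fidelity_std([0.5], [[0.5, 0.5]], [400]))
```

The \(1/\sqrt{t_j}\) dependence visible here is what makes the allocation of copies across settings worth optimizing.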

Our target is to obtain a small deviation of F using the minimum number of copies of the state. Therefore, the following optimization problem is constructed:

$$\begin{aligned} \begin{aligned} \min _{t_j}&\sum _{j=1}^{d^2}t_j\\ \text {s}.\text{t}.&\sqrt{\sum _{kj}S_{j}^2\frac{f_{kj}(1-f_{kj})}{t_j}}\le \epsilon _0. \end{aligned} \end{aligned}$$
(7)

where \(\epsilon _0\) is the threshold on the deviation \(\Delta F\). Equivalently, for a given total number of copies of the state t, the standard deviation of the fidelity \(\Delta F\) should be as small as possible. Therefore, Eq. (7) is recast as

$$\begin{aligned} \begin{aligned} \min _{t_j}&\Delta F(t_j) \\ \text {s}.\text{t}.&\sum _{j=1}^{d^2} t_j = t \end{aligned} \end{aligned}$$
(8)

By substituting Eq. (6) into Eq. (8) and using \(\sum _{k}f_{kj}=1\), so that \(\sum _{k}f_{kj}(1-f_{kj})=1-\sum _{k}f_{kj}^2\), one obtains

$$\begin{aligned} \begin{aligned} \min _{R_j}&\sum _{j=1}^{d^2} \frac{S_j^2(1-\sum _{k=1}^{d}f_{kj}^2)}{R_j} \\ \text {s}.\text{t}.&\sum _{j=1}^{d^2} R_j=1 \end{aligned} \end{aligned}$$
(9)

where \(R_j=t_j/t\). The Lagrange multiplier method is employed to solve Eq. (9), which leads to minimizing

$$\begin{aligned} L= \sum _{j=1}^{d^2} \frac{S_j^2(1-\sum _{k=1}^{d}f_{kj}^2)}{R_j}- \lambda (\sum _{j=1}^{d^2} R_j-1). \end{aligned}$$
(10)

From Eq. (10), setting the partial derivative with respect to \(R_j\) to zero gives

$$\begin{aligned} \frac{\partial L}{\partial R_j}=-\frac{S_j^2(1-\sum _{k=1}^{d}f_{kj}^2)}{R_j^2}-\lambda =0. \end{aligned}$$
(11)

From Eq. (11), we arrive at

$$\begin{aligned} R_j^2=-\frac{S_j^2(1-\sum _{k=1}^{d}f_{kj}^2)}{\lambda }. \end{aligned}$$
(12)

The constraint in Eq. (9) reads

$$\begin{aligned} \sum _{j=1}^{d^2} R_j=1. \end{aligned}$$
(13)

By substituting Eq. (12) into Eq. (13), we obtain

$$\begin{aligned} \sqrt{-\lambda }=\sum _{j=1}^{d^2}|S_j|\sqrt{1-\sum _{k=1}^{d}f_{kj}^2}. \end{aligned}$$
(14)

By substituting Eq. (14) into Eq. (12), we have

$$\begin{aligned} R_j=\frac{|S_j|\sqrt{1-\sum _{k=1}^{d}f_{kj}^2}}{\sum _{j'=1}^{d^2}(|S_{j'}|\sqrt{1-\sum _{k=1}^{d}f_{kj'}^2})}. \end{aligned}$$
(15)
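The allocation of Eq. (15) can be checked numerically against the objective of Eq. (9). The following sketch is illustrative only: the helper names (`setting_weights`, `optimal_ratios`, `objective`) and the three-setting example values are assumptions, and settings whose outcome is deterministic are simply skipped in the objective.

```python
import numpy as np

def setting_weights(S, f):
    """K_j = S_j^2 (1 - sum_k f_kj^2) for each measurement setting j."""
    S = np.asarray(S, dtype=float)
    f = np.asarray(f, dtype=float)
    return S**2 * (1.0 - np.sum(f**2, axis=1))

def optimal_ratios(S, f):
    """Allocation R_j of Eq. (15): proportional to sqrt(K_j)."""
    root = np.sqrt(np.clip(setting_weights(S, f), 0.0, None))
    total = root.sum()
    if total == 0.0:
        # Every setting is deterministic, so any allocation gives zero variance.
        return np.full(root.shape, 1.0 / root.size)
    return root / total

def objective(S, f, R):
    """Objective of Eq. (9), sum_j K_j / R_j, ignoring settings with K_j = 0."""
    K = setting_weights(S, f)
    R = np.asarray(R, dtype=float)
    used = K > 0
    return float(np.sum(K[used] / R[used]))

# Hypothetical coefficients and frequencies for three settings.
S = [0.4, 0.1, 0.3]
f = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
R_opt = optimal_ratios(S, f)
R_uniform = np.full(3, 1.0 / 3.0)
print(R_opt, objective(S, f, R_opt), objective(S, f, R_uniform))
```

At the optimum the objective equals \((\sum _j\sqrt{K_j})^2\) with \(K_j=S_j^2(1-\sum _k f_{kj}^2)\), which by the Cauchy-Schwarz inequality is never larger than the value obtained with any other normalized allocation.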

Meanwhile, \(f_{kj}\) is approximated by \(Tr(\Pi _{jk}\rho _{1})\) when the noise is weak. To estimate \(f_{kj}\) more accurately, an adaptive approach is applied, which updates \(f_{kj}\) after measuring a small number of copies of the state \(\rho _2\). The approach consists of the following steps. First, the initial relative frequency is computed from the target state as \(f_{jk}^{0}=Tr(\rho _1\Pi _{jk})\). Second, a constant number of copies of the state \(\rho _2\) is measured; the outcomes follow the Born rule \(p_{jk}^{(0)}=Tr(\rho _2\Pi _{jk})\). Third, the numbers of copies of the state detected on the different bases of the same measurement setting are recorded as (\(t_{c1}^{w}\), \(t_{c2}^{w}\), \(t_{c3}^{w}\), ..., \(t_{ck}^{w}\), ..., \(t_{cd}^{w}\)), where \(t_{ck}^{w}\) is the number of copies detected on the k-th basis and w is the index of the current iteration. Fourth, the relative frequency is obtained as \(f_k^{w}=t_{ck}^{w}/\sum _{k=1}^{d}t_{ck}^{w}\). Finally, the estimate is updated according to \(p_{jk}^{w+1}=(wp_{jk}^{w}+f_k^{w})/(w+1)\). The procedure then returns to the second step and is repeated until the stopping criterion \(\max | p_{jk}^{w}-p_{jk}^{w+1}|<\delta\) is satisfied, where \(\delta\) is a small constant.
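The adaptive update described above can be sketched for a single measurement setting as follows. This is a minimal simulation under stated assumptions: the function name `adaptive_frequencies`, the batch size, the tolerance and the example probabilities are hypothetical, the measurement of \(\rho _2\) is replaced by multinomial sampling from its Born probabilities, and the counter starts at w = 1 so that the initial guess from \(\rho _1\) keeps the weight of one batch (the text does not fix the starting index).

```python
import numpy as np

def adaptive_frequencies(p_init, p_true, batch=50, tol=1e-3,
                         max_iter=10_000, seed=0):
    """Running-average update p^{w+1} = (w p^w + f^w) / (w + 1).

    p_init: initial guess Tr(rho_1 Pi_k) computed from the known target state
    p_true: Born probabilities Tr(rho_2 Pi_k), used only to simulate outcomes
    Returns the final estimate and the number of iterations performed.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p_init, dtype=float)
    p_true = np.asarray(p_true, dtype=float)
    for w in range(1, max_iter + 1):
        counts = rng.multinomial(batch, p_true)   # copies detected per basis
        f_w = counts / counts.sum()               # relative frequency f_k^w
        p_next = (w * p + f_w) / (w + 1)          # update rule from the text
        if np.max(np.abs(p_next - p)) < tol:      # stopping criterion
            return p_next, w
        p = p_next
    return p, max_iter

# Hypothetical example: the target predicts [1, 0], the noisy state gives [0.8, 0.2].
estimate, iterations = adaptive_frequencies([1.0, 0.0], [0.8, 0.2])
print(estimate, iterations)
```

Because each update is a convex combination of two normalized vectors, the estimate stays a valid probability distribution at every iteration.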

In the optimal fidelity estimation, accurately calculating the standard deviation of the fidelity \(\Delta F\) before performing measurements is challenging, as the relative frequency \(f_{kj}\) has not yet been obtained. The solution is to measure the state \(\rho _2\) with a small number of copies and obtain a rough estimate, with a large deviation, of \(p_{jk}^{(0)}=Tr(\rho _2\Pi _{jk})\). This estimated relative frequency \(p_{jk}^{(0)}\) is then used to compute the standard deviation of the fidelity \(\Delta F\) and to determine the number of additional copies of the state required for measurements in the different bases \(\Pi _{jk}\). This approach helps distribute the copies of the state in an optimal way.

When \(1-\sum _{k=1}^{d}f_{kj}^2\) is approximately the same constant for all j, Eq. (15) simplifies to

$$\begin{aligned} R_j\approx \frac{|S_j|}{\sum _{j'=1}^{d^2}|S_{j'}|}. \end{aligned}$$
(16)

The main steps of obtaining the fidelity by optimal fidelity estimation are summarized, and the pseudocode is shown in Fig. 1.

Figure 1

Main steps of optimal fidelity estimation for a single-qubit system using Pauli operators.

Furthermore, the resources used by the optimal fidelity estimation are analyzed. The solution of Eq. (7) is

$$\begin{aligned} & t_j=\lceil \frac{\sqrt{K_j}(\sum _{j'=1}^{d^2}\sqrt{K_{j'}})}{\epsilon _0^2}\rceil \nonumber \\ & <\frac{\sqrt{K_j}(\sum _{j'=1}^{d^2}\sqrt{K_{j'}})}{\epsilon _0^2}+1\end{aligned}$$
(17)

in which

$$\begin{aligned} & K_j=S_{j}^2(1-\sum _{k=1}^{d}f_{kj}^2) \end{aligned}$$
(18)

and

$$\begin{aligned} & K_{j'}=S_{j'}^2(1-\sum _{k'=1}^{d}f_{k'j'}^2). \end{aligned}$$
(19)
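Eqs. (17) and (18) translate into a short routine that returns the number of copies per setting for a target deviation. The sketch below is a hedged illustration: the names `copies_per_setting` and `achieved_deviation` and the two-setting example are assumptions, and the second helper simply re-evaluates the constraint of Eq. (7) with the rounded copy numbers.

```python
import numpy as np

def _weights(S, f):
    """K_j of Eq. (18), clipped at zero to absorb rounding errors."""
    S = np.asarray(S, dtype=float)
    f = np.asarray(f, dtype=float)
    return np.clip(S**2 * (1.0 - np.sum(f**2, axis=1)), 0.0, None)

def copies_per_setting(S, f, eps0):
    """Integer copy numbers t_j of Eq. (17) for a target deviation eps0."""
    root = np.sqrt(_weights(S, f))
    return np.ceil(root * root.sum() / eps0**2).astype(int)

def achieved_deviation(S, f, t):
    """Left-hand side of the constraint in Eq. (7) for given copy numbers."""
    K = _weights(S, f)
    t = np.asarray(t, dtype=float)
    used = K > 0                       # deterministic settings need no copies
    return float(np.sqrt(np.sum(K[used] / t[used])))

# Hypothetical example: the second setting has a deterministic outcome.
S = [0.5, 0.5]
f = [[0.5, 0.5], [1.0, 0.0]]
t = copies_per_setting(S, f, eps0=0.05)
print(t, achieved_deviation(S, f, t))
```

Rounding up with the ceiling function can only lower the deviation, so the constraint of Eq. (7) remains satisfied, while settings with \(K_j=0\) receive no copies at all.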

Here the scaling of the average number of copies of the state consumed is derived. The purity of the density matrix \(\rho\) is

$$\begin{aligned} & Tr(\rho ^2)=Tr(\sum _{j=1,j'=1}^{d^2}S_{j}W_{j}S_{j'}W_{j'})\end{aligned}$$
(20)
$$\begin{aligned} & =Tr(\sum _{j=1,j'=1}^{d^2}S_{j}S_{j'}\sigma _{j1}\sigma _{j'1}\otimes \sigma _{j2}\sigma _{j'2}\otimes ....\otimes \sigma _{jn}\sigma _{j'n} )\end{aligned}$$
(21)
$$\begin{aligned} & =\sum _{j=1,j'=1}^{d^2}S_{j}S_{j'}Tr(\sigma _{j1}\sigma _{j'1})Tr(\sigma _{j2}\sigma _{j'2})...Tr(\sigma _{jn}\sigma _{j'n})\end{aligned}$$
(22)
$$\begin{aligned} & =\sum _{j=1}^{d^2}S_{j}^{2}2^n. \end{aligned}$$
(23)

We consider two cases for the density matrix. The first case is a pure state, where the purity is equal to one.

$$\begin{aligned} & Tr(\rho ^2)=1. \end{aligned}$$
(24)

One has

$$\begin{aligned} & \sum _{j=1}^{d^2}S_{j}^2=\frac{1}{d}. \end{aligned}$$
(25)

On average,

$$\begin{aligned} & \overline{S_{j}^2}=\frac{1}{d^3}. \end{aligned}$$
(26)

Therefore,

$$\begin{aligned} & \overline{S_{j}}=\frac{1}{d^{1.5}}. \end{aligned}$$
(27)

From Eq. (18), one has

$$\begin{aligned} & \overline{K_{j}}\approx \frac{1}{d^{3}}. \end{aligned}$$
(28)

From Eq. (17), one obtains

$$\begin{aligned} & \overline{t_{j}}\approx \frac{1}{d\epsilon _{0}^2}. \end{aligned}$$
(29)

Therefore, the total number of copies of state is roughly

$$\begin{aligned} & \overline{\sum _{j=1}^{d^2}t_{j}}\approx \frac{d}{\epsilon _{0}^2}. \end{aligned}$$
(30)

For the second case, \(\rho\) is a mixed state, where the purity is less than one. Therefore, one has

$$\begin{aligned} & \sum _{j=1}^{d^2}S_{j}^{2}<\frac{1}{d}, \end{aligned}$$
(31)

From Eq. (17) and Eq. (18), the total number of copies of the state is given by

$$\begin{aligned} & \sum _{j=1}^{d^2}t_{j}<\frac{d}{\epsilon _{0}^{2}}. \end{aligned}$$
(32)
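The bound of Eq. (32) can also be checked without the averaging step. The short derivation below is a sketch that neglects the rounding in Eq. (17) and uses only \(0\le 1-\sum _k f_{kj}^2\le 1\), the Cauchy-Schwarz inequality over the \(d^2\) settings, and Eq. (23):

$$\begin{aligned} \sum _{j=1}^{d^2}t_j \approx \frac{(\sum _{j=1}^{d^2}\sqrt{K_j})^2}{\epsilon _0^2} \le \frac{d^2\sum _{j=1}^{d^2}K_j}{\epsilon _0^2} \le \frac{d^2\sum _{j=1}^{d^2}S_j^2}{\epsilon _0^2} = \frac{d\,Tr(\rho ^2)}{\epsilon _0^2} \le \frac{d}{\epsilon _0^2}. \end{aligned}$$

The last inequality becomes an equality only for a pure state, in agreement with the two cases discussed above.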

The advantage of the optimal fidelity estimation comes mainly from three aspects. First, the information of the target state is used to select the measurement settings. As a result, for a large number of states the number of selected measurement settings is much smaller than in direct fidelity estimation, so the limited number of copies of the state is used more efficiently. Second, instead of distributing copies of the state randomly over different measurement settings, optimal fidelity estimation uses the information of the target state \(\rho _1\) to obtain an initial estimate of the relative frequencies and calculates a fixed optimal ratio for distributing the copies of the state over the measurement settings. This saves many copies of the state, since certain measurement settings need not be measured at all. In contrast, direct fidelity estimation still spends a small number of copies of the state on each of these measurement settings. The number of measurement settings that need not be measured generally increases quickly with the dimension of the state. Lastly, optimal fidelity estimation places no restriction on the measurement basis, which can be selected freely. When the information of the target state \(\rho _1\) is used, a single measurement setting is enough to calculate the fidelity. In contrast, direct fidelity estimation is restricted to Pauli measurements, and the number of measurement settings is \(d^2\) for a d-dimensional system. The numerical comparison is presented in the second part of the numerical simulation results.

Numerical simulation results

Comparison with direct fidelity estimation by pauli measurement

In this section, our optimal fidelity estimation is numerically compared with traditional Direct Fidelity Estimation (DFE). Eq. (16) is applied to a single-qubit density matrix and a two-qubit density matrix, respectively.

For the single qubit system, the density matrix \(\rho _1\) is decomposed as

$$\begin{aligned} \rho _1 = \frac{1}{2}(I + S_x\sigma _x + S_y\sigma _y + S_z\sigma _z). \end{aligned}$$
(33)

where \(\sigma _x\), \(\sigma _y\), and \(\sigma _z\) are Pauli matrices, and \(S_x\), \(S_y\), and \(S_z\) are the corresponding coefficients. The total number of copies of the state is denoted as \(N_{c1}\). The optimal distribution of copies of the state over the three Pauli settings is given by \(N_{c1}R_x\), \(N_{c1}R_y\) and \(N_{c1}R_z\). Based on Eq. (16), \(R_x\), \(R_y\), and \(R_z\) are

$$\begin{aligned} R_x = \frac{|S_x|}{|S_x| + |S_y| + |S_z|}, \ R_y = \frac{|S_y|}{|S_x| + |S_y| + |S_z|}, \ R_z = \frac{|S_z|}{|S_x| + |S_y| + |S_z|}. \end{aligned}$$
(34)

Therefore, the fidelity between a single-qubit state \(\rho _1\) and \(\rho _2\) is estimated with the copies of the state allocated according to Eq. (34). A numerical simulation is performed to compare this protocol with direct fidelity estimation.

In the numerical simulation, a single-qubit state \(\rho _1\) is randomly generated. The real parts of the four elements of \(\rho _1\) are all roughly equal to 0.5 and the imaginary parts are nearly zero. \(\rho _1\) is taken as the target pure state, and \(\rho _2\) is produced by mixing Gaussian noise into \(\rho _1\). The true fidelity between the mixed state \(\rho _2\) and the target pure state \(\rho _1\) is denoted by \(F_{true}\); since both density matrices are known in the simulation, it is obtained directly. Then, direct fidelity estimation and optimal fidelity estimation are applied separately to estimate \(F_{true}\). The fidelity obtained from direct fidelity estimation is denoted by \(F_d\) and the one obtained from optimal fidelity estimation by \(F_{op}\). For a fair comparison, the number of copies of the state, the target state, the Gaussian noise and the measurement settings (Pauli measurements) are exactly the same for both methods. To reduce statistical fluctuations, the estimation is repeated 100 times and the average values of the gaps \(|F_d-F_{true}|\) and \(|F_{op}-F_{true}|\) are calculated, denoted by \(Ave(|F_d-F_{true}|)\) and \(Ave(|F_{op}-F_{true}|)\):

$$\begin{aligned} Ave(|F_d-F_{true}|)=\frac{\sum _{q=1}^{100}|F_{dq}-F_{true}|}{100}, \end{aligned}$$
(35)

where \(F_{dq}\) is the q-th estimated fidelity by direct fidelity estimation.

$$\begin{aligned} Ave(|F_{op}-F_{true}|)=\frac{\sum _{q=1}^{100}|F_{op,q}-F_{true}|}{100} \end{aligned}$$
(36)

where \(F_{op,q}\) is the q-th estimated fidelity by optimal fidelity estimation.
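The kind of comparison described above can be imitated with a small Monte Carlo sketch. It is not the simulation of the paper: the Bloch vectors, the depolarizing-type noise, the number of trials and the function names are hypothetical, and the baseline is a plain equal split of copies over the three Pauli settings rather than the sampling scheme of direct fidelity estimation. The weighted allocation follows Eq. (34) with the coefficients of Eq. (33) taken as the Bloch vector of \(\rho _1\).

```python
import numpy as np

def simulate_estimates(r1, r2, copies, trials, rng):
    """Simulate fidelity estimates F = (1 + sum_a r1_a <sigma_a>) / 2.

    r1: Bloch vector of the known target state rho_1
    r2: Bloch vector of the unknown state rho_2 (used only to sample outcomes)
    copies: number of copies measured in the X, Y and Z settings
    """
    estimate = np.full(trials, 0.5)
    for s, r, t in zip(r1, r2, copies):
        if s == 0 or t == 0:
            continue  # a setting with zero coefficient contributes nothing
        plus = rng.binomial(t, (1.0 + r) / 2.0, size=trials)  # +1 outcomes
        estimate += 0.5 * s * (2.0 * plus / t - 1.0)
    return estimate

def compare_allocations(n_copies=300, trials=5000, seed=1):
    """Mean absolute error for an equal split and for the Eq. (34) split."""
    rng = np.random.default_rng(seed)
    r1 = np.array([0.8, 0.0, 0.6])        # pure target state (unit vector)
    r2 = 0.9 * r1                         # hypothetical noisy state
    f_true = 0.5 * (1.0 + r1 @ r2)
    equal = np.full(3, n_copies // 3)
    ratios = np.abs(r1) / np.abs(r1).sum()
    weighted = np.round(n_copies * ratios).astype(int)
    err_equal = np.mean(np.abs(simulate_estimates(r1, r2, equal, trials, rng) - f_true))
    err_weighted = np.mean(np.abs(simulate_estimates(r1, r2, weighted, trials, rng) - f_true))
    return err_equal, err_weighted

print(compare_allocations())
```

In this toy setting the equal split spends a third of the copies on the \(\sigma _y\) setting, whose coefficient vanishes, so the weighted allocation is expected to give the smaller average gap for the same total number of copies.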

As shown in Fig. 2, \(Ave(|F_d-F_{true}|)\) is denoted by black points and \(Ave(|F_{op}-F_{true}|)\) by red points. The results show that \(Ave(|F_d-F_{true}|)\) is larger than \(Ave(|F_{op}-F_{true}|)\) for every number of copies of the state considered. When a large number of copies of the state is employed, both methods estimate the fidelity more accurately, since the deviation from the true fidelity \(F_{true}\) is smaller. Motivated by Eq. (31) and Eq. (32), the scaling between the number of copies of the state and the precision of the fidelity estimate is calculated for both methods. The number of copies of the state is \(0.06246/(|F-F_{true}|^2)\) for traditional direct fidelity estimation and \(0.03088/(|F-F_{true}|^2)\) for optimal fidelity estimation. Therefore, the number of copies of the state consumed by optimal fidelity estimation is roughly half of the number consumed by direct fidelity estimation in this case.

Figure 2

Comparison of the two fidelity estimation methods in the single-qubit case: the average value of \(|F_d-F_{true}|\) over 100 estimations by the traditional method (direct fidelity estimation) is represented by black dots, and the average value of \(|F_{op}-F_{true}|\) over 100 estimations by our optimal fidelity estimation is represented by red dots, where \(F_d\) is the fidelity estimated by direct fidelity estimation, \(F_{op}\) is the fidelity estimated by optimal fidelity estimation, and \(F_{true}\) is the true value of the fidelity. The error bars are the standard deviations of \(|F_d-F_{true}|\) or \(|F_{op}-F_{true}|\).

In addition, Eq. (16) is applied to the two-qubit density matrix as well. Firstly, the density matrix of the two-qubit state \(\rho\) is decomposed as

$$\begin{aligned} \rho = \frac{1}{4}(I \otimes I) + S_{ix}(I \otimes \sigma _x) + S_{iy}(I \otimes \sigma _y) + \cdots + S_{zz}(\sigma _z \otimes \sigma _z). \end{aligned}$$
(37)

where \(S_{ix}\), \(S_{iy}\), ..., \(S_{zz}\) are the corresponding coefficients for the different bases. Suppose the total number of copies of the state is \(N_{c2}\). The optimal distribution of these copies of the state over the different bases (\(\sigma _x\otimes \sigma _x\), \(\sigma _x\otimes \sigma _y\), and so on) is given by \(N_{c2}R_{xx}\), \(N_{c2}R_{xy}\), \(N_{c2}R_{xz}\), \(N_{c2}R_{yx}\), \(\cdots\), and \(N_{c2}R_{zz}\), where

$$\begin{aligned} R_{xx} = \frac{|S_{xx}|}{|S_{xx}| + |S_{xy}| + \cdots + |S_{zz}|}, \ \cdots \ R_{zz} = \frac{|S_{zz}|}{|S_{xx}| + |S_{xy}| + \cdots + |S_{zz}|}. \end{aligned}$$
(38)

Notice that some terms in Eq. (37) contain the identity operator I. The expectation values of these operators are obtained as linear combinations of the results of measurement operators containing no I. For example, the term \(S_{ix}(I \otimes \sigma _x)\) is accounted for by normalizing the remaining measurement results that involve \(\sigma _x\) on the second qubit. Therefore, the fidelity of a two-qubit system is obtained using all the measured expectation values.
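How an identity-containing term is recovered from a measurement without identity can be made concrete with a small sketch. The function name `expectations_from_counts` and the count tables are hypothetical; the input is the joint outcome table of one two-qubit Pauli setting, and the marginal of the table gives the expectation value of the operator in which one factor is replaced by I.

```python
import numpy as np

def expectations_from_counts(counts):
    """Expectation values from the outcome counts of a setting P1 (x) P2.

    counts[a, b] is the number of copies for which qubit 1 gave outcome a and
    qubit 2 gave outcome b, with index 0 -> eigenvalue +1, index 1 -> -1.
    Returns (<P1 (x) P2>, <P1 (x) I>, <I (x) P2>).
    """
    counts = np.asarray(counts, dtype=float)
    freq = counts / counts.sum()
    eig = np.array([1.0, -1.0])
    both = float(eig @ freq @ eig)          # product of the two eigenvalues
    first = float(eig @ freq.sum(axis=1))   # marginal over qubit 2
    second = float(eig @ freq.sum(axis=0))  # marginal over qubit 1
    return both, first, second

# Hypothetical counts from 100 copies measured in the sigma_x (x) sigma_x setting.
print(expectations_from_counts([[40, 10], [10, 40]]))
```

The same copies therefore serve three terms of Eq. (37) at once, which is why no separate measurement is needed for operators containing I.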

The optimal fidelity estimation is compared with DFE. In the numerical simulation, the two-qubit Schrödinger cat state is used, as shown in the second part of the appendix. A simulation similar to that of the single-qubit case is performed. The results show that \(Ave(|F_d-F_{true}|)\) is larger than \(Ave(|F_{op}-F_{true}|)\) for every number of copies of the state considered, as shown in Fig. 3.

Figure 3

Comparison of the two fidelity estimation methods for a two-qubit density matrix: each point is the average gap between the estimated fidelity and the true fidelity \(F_{true}\), calculated from 100 estimations. Black dots represent the estimation by the traditional method (direct fidelity estimation), while red dots represent the estimation by our optimal fidelity estimation. The error bars are the standard deviations of the gap between the estimated fidelity and the true fidelity.

Comparison of optimal fidelity by single measurement setting with direct fidelity estimation

When the measurement setting is not limited to Pauli measurements and a Positive Operator-Valued Measure (POVM) \(\Pi _{k}\) of arbitrary form can be prepared, the number of copies of the state needed for fidelity estimation is further reduced. Since the target state \(\rho _1\) is a pure state and already known before the measurement, we construct POVM bases with the same form as \(\rho _1\). The state \(\rho _2\) is then measured in the bases of \(\rho _1\), following the formula of the fidelity for a pure state, Eq. (1). As a result, the relative frequency is taken directly as the fidelity, which further reduces the resources.
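The single-setting idea can be sketched as a two-outcome measurement with elements \(\rho _1\) and \(I-\rho _1\), for which the probability of the first outcome equals \(Tr(\rho _2\rho _1)\). The code below is an illustration under assumptions: the function names, the random target state and the depolarizing-type noise with weight 0.45 are hypothetical, and the measurement is simulated by binomial sampling.

```python
import math
import numpy as np

def random_pure_state(d, rng):
    """Density matrix of a random pure state of dimension d."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

def projective_fidelity_estimate(rho1, rho2, copies, rng):
    """Estimate F as the relative frequency of the outcome 'rho_1'."""
    p = float(np.real(np.trace(rho1 @ rho2)))   # Born probability, equals F
    p = min(max(p, 0.0), 1.0)                   # guard against rounding
    return rng.binomial(copies, p) / copies

def copies_for_precision(fidelity, eps0):
    """Copies needed for a binomial standard deviation of at most eps0."""
    return math.ceil(fidelity * (1.0 - fidelity) / eps0**2)

rng = np.random.default_rng(0)
for d in (2, 4, 32):
    rho1 = random_pure_state(d, rng)
    rho2 = 0.55 * rho1 + 0.45 * np.eye(d) / d   # hypothetical noisy state
    f_true = float(np.real(np.trace(rho1 @ rho2)))
    print(d, f_true, projective_fidelity_estimate(rho1, rho2, 10_000, rng),
          copies_for_precision(f_true, 0.01))
```

Since the outcome is a single binomial variable, the required number of copies is about \(F(1-F)/\epsilon _0^2\), which depends on the fidelity but not on the dimension d.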

To verify the reduction of resources, we perform numerical estimations with the two methods separately. First, two different quantum states are randomly generated. One is a pure state, and the other is the pure state mixed with certain noise. The fidelity between the two states is around 0.55. By setting the gap between the true fidelity \(F_{true}\) and the estimated fidelity to 0.01, we simulate the estimation process on a computer and calculate the number of copies of the state needed to reach the accuracy 0.01 for both methods. After repeating this process 100 times, we calculate the average number of copies of the state consumed to estimate these fidelities and the mean square error of the number of copies of the state for the two methods separately. Similar estimations are repeated with the same accuracy gap of 0.01 between the estimated fidelity and the true fidelity for two-qubit, three-qubit, four-qubit and five-qubit density matrices. As shown in Figs. 4 and 5, black points represent the average number of copies of the state consumed by direct fidelity estimation, and red points represent the average number of copies of the state consumed by optimal fidelity estimation. The black points are all above the red points, which shows that optimal fidelity estimation consumes fewer copies of the state than direct fidelity estimation. For direct fidelity estimation, the required number of copies of the state increases exponentially with the number of qubits. In contrast, for optimal fidelity estimation the number of copies of the state grows more slowly than \(d/\epsilon _0^2\): it is roughly constant as the number of qubits increases and thus independent of the dimension d of the density matrix.

Figure 4

Comparison of the number of copies of the state required for density matrices of different dimensions by the two fidelity estimation methods. Random density matrices are used. The target state \(\rho _1\) is a pure state, and \(\rho _2\) is a mixed state. When only one measurement setting is used for optimal fidelity estimation, the number of copies of the state is calculated and represented by red dots. It is compared with the traditional method (direct fidelity estimation), denoted by black dots. Each point is the average number of copies of the state consumed over 100 estimations when the same accuracy of fidelity estimation is achieved; F denotes the fidelity estimated by either approach, and \(F_{true}\) is the true value of the fidelity. The error bars are the standard deviations of \(|F-F_{true}|\).

Figure 5

Comparison of the number of copies of the state required for density matrices of different dimensions by the two fidelity estimation methods. The data are exactly the same as in Fig. 4 and are plotted on a log scale.

Therefore, optimal fidelity estimation saves a large number of copies of the state.

Conclusion and outlook

An optimal method is developed for estimating the fidelity of a quantum state. It is compared with direct fidelity estimation under the same conditions. The results show that optimal fidelity estimation significantly reduces the number of copies of the state consumed while achieving the same level of accuracy in fidelity estimation. Optimal fidelity estimation is realized both with Pauli measurement settings and with random measurement settings. It is suitable for any state, including sparse states. In the future, more specific restrictions on the measurement basis setting and on different physical systems can be added to the optimization problem, so that the schemes can be easily applied in different quantum systems for detecting error and noise.