Abstract
The ability to identify cause–effect relations is an essential component of the scientific method. The identification of causal relations is generally accomplished through statistical trials where alternative hypotheses are tested against each other. Traditionally, such trials have been based on classical statistics. However, classical statistics becomes inadequate at the quantum scale, where a richer spectrum of causal relations is accessible. Here we show that quantum strategies can greatly speed up the identification of causal relations. We analyse the task of identifying the effect of a given variable, and we show that the optimal quantum strategy beats all classical strategies by running multiple equivalent tests in a quantum superposition. The same working principle leads to advantages in the detection of a causal link between two variables, and in the identification of the cause of a given variable.
Similar content being viewed by others
Introduction
Identifying causal relations is a fundamental primitive in a variety of areas, including machine learning, medicine, and genetics1,2,3. A canonical approach is to formulate different hypotheses on the cause–effect relations characterizing a given phenomenon, and test them against each other. For example, in a drug test some patients are administered the drug, while others are administered a placebo, with the scope of determining whether or not the drug causes recovery. Traditionally, causal discovery techniques have been based on classical statistics, which effectively describes the behavior of macroscopic variables. However, classical techniques become inadequate when dealing with quantum systems, whose response to interventions can strikingly differ from that of classical random variables4,5.
Recently, there has been a growing interest in the extension of causal reasoning to the quantum domain. Several quantum generalizations of the notion of causal network have been proposed6,7,8,9,10,11,12,13,14,15 and new algorithms for quantum causal discovery have been designed16,17,18,19,20. Besides its foundational relevance, the study of quantum causal discovery algorithms is expected to have applications in the emerging area of quantum machine learning21,22, in the same way as classical causal discovery algorithms have previously impacted classical artificial intelligence.
An intriguing possibility is that quantum mechanics may provide enhanced ways to identify causal links. A clue in this direction comes from refs. 17,18, where the authors show that certain quantum correlations are witnesses of causal relationships, in apparent violation of the classical tenet “correlation does not imply causation”. This observation suggests that quantum setups for testing causal relationships could overcome some of the limitations of existing classical setups. However, the type of advantage highlighted in refs. 17,18 only concerns a limited class of setups, where the experimenter is constrained to a subset of the possible interventions. If arbitrary interventions are allowed, this particular type of advantage disappears. A fundamental open question is whether quantum setups can offer an advantage over all classical setups, without any restriction on the experimenter’s interventions.
Here, we answer the question in the affirmative, proving that quantum features like superposition and entanglement can significantly speed up the identification of causal relations. We start from the task of deciding which variable, out of a list of candidates, is the effect of a given variable. We first analyze the problem in the classical setting, determining the performance of the best classical strategy. Then, we construct a quantum strategy that reduces the error probability by an exponential amount, doubling the decay rate of the error probability with the number of accesses to the relevant variables. Remarkably, the decay rate of our strategy is the highest achievable rate allowed by quantum mechanics, even if one allows for exotic setups where the order of operations is indefinite23,24. The key ingredient of the quantum speedup is the ability to run multiple equivalent experiments in a quantum superposition. The same working principle enables quantum speedups in a broader set of tasks, including, e.g., the task of deciding whether there exists a causal link between two given variables, and the task of identifying the cause of a given variable.
Results
Theory-independent framework for testing causal hypotheses
Here, we outline a framework for testing causal hypotheses in general physical theories25,26,27,28,29,30. In this framework, variables are represented as physical systems, each system with its set of states. The framework applies to theories satisfying the Causality Axiom28, stating that the probability of an event at a given time should not depend on choices of settings made at future times.
A causal relation between variable A and variable B is represented by a map describing how the state of B responds to changes in the state of A. If the map discards A and outputs a fixed state of B, then no causal influence can be observed. In all the other cases, some change of A will lead to an observable change of B. Hence, we say that A is a cause for B.
In general, the set of allowed causal relationships depends on the physical theory, which determines which maps can be implemented by physical processes. In classical physics, cause–effect relations can be represented by conditional probability distributions of the form p(b|a), where a and b are the values of the random variables A and B, respectively. In quantum theory, cause–effect relations are described by quantum channels, i.e., completely positive trace-preserving maps transforming density matrices of system A into density matrices of system B.
Given a set of variables, one can formulate hypotheses on the causal relationships among them. For example, consider a three-variable scenario, where variable A may cause either variable B or variable C, but not both. The causal relation is described by a process \({\cal C}\), with input A and outputs B and C. Here, we consider two alternative causal hypotheses: either A causes B but not C; or A causes C but not B. The problem is to distinguish between these two hypotheses without having further knowledge of the physical process responsible for the causal relation. This means that the process \({\cal C}\) is unknown, except for the fact that it must compatible with one and only one of the two hypotheses. Mathematically, the two hypotheses correspond to two sets of physical processes, and the problem is to determine which set contains the process \({\cal C}\).
In order to decide which hypothesis is correct, we assume that the experimenter has black box access to the physical process \({\cal C}\). The experimenter can probe the process for N times, intervening between one instance and the next, as illustrated in Fig. 1. In the end, a measurement is performed and its outcome is used to guess the correct hypothesis.
Testing causal hypotheses in the black box scenario. The unknown process \({\cal C}\) induces a causal relation between one input variable and two output variables. The experimenter probes the process for N times, intervening on the relevant variables at each time step. The first intervention is the preparation of a state Ψ, involving the input of the black box and, possibly, an additional reference system (top wire). The subsequent interventions \({\cal U}_i\) manipulate the output variables and prepare the inputs variables for the next steps. In the end, the output variables and the reference system are measured, and the measurement outcome is used to infer the causal relation
An important question is how fast the probability of error decays with N. The decay is typically exponential, with an error probability vanishing as perr(N) ≈ 2−RN for some positive constant R, which we call the discrimination rate. The operational meaning of the discrimination rate is the following. Given an error threshold ε, the error probability can be made smaller than ε using approximately N > log ε−1/R calls to the unknown process. The bigger the rate, the smaller the number of calls needed to bring the error below the desired threshold.
Since the explicit form of the process \({\cal C}\) is unknown, we take perr(N) to be the worst-case probability over all processes compatible with the two given causal hypotheses. If prior information over \({\cal C}\) is available, one may also consider a weaker performance measure, based on the average with respect to some prior. In the following, we stick to the worst case scenario, as it provides a stronger guarantee on the performance of the test.
Identifying causal intermediaries
A variable B is a causal intermediary for variable A if all the influences of A propagate through B. Physically, one can think of B as a slice of the future light cone of A, so that all causal influences of A must pass through B, as illustrated in Fig. 2. Mathematically, the fact that B is a causal intermediary means that there exists a process \({\cal C}\) from A to B such that for every other variable B′ and for every process \({\cal C}^\prime\) with input A and output B′ one can decompose \({\cal C}^\prime\) as \({\cal C}^\prime = {\cal R}\circ {\cal C}\), where \({\cal R}\) is a suitable process from B to B′.
Spacetime picture of a causal intermediary. Variable A is localized at a point in spacetime, and its causal influences propagate within its future light cone. Variable B is distributed over a section of the light cone of A and intercepts all the influences of A. Every other variable B′ that is affected by A and comes after B must be obtained from variable B through some physical process
The condition that a variable is a causal intermediary of another has a simple characterization in all physical theories where processes are fundamentally reversible, meaning that they can be modeled as the result of a reversible evolution of the system and an environment28. The reversibility condition is captured by the expression \({\cal C} = ({\cal I}_B \otimes {\mathrm{Tr}}_{E\prime }){\cal U}({\cal I}_A \otimes \eta _E)\), where variables E and E′ represent the environment (before and after the interaction), η is the initial state of the environment, TrE′ is the operation of discarding system E′28, and \({\cal U}\) is a reversible process from AE to BE′.
When the reversibility condition is satisfied, the variable A can be recovered from variables B and E′. If variable B is to be a causal intermediary of A, then the process \({\cal C}\) must be correctable, in the sense that its action can be undone by another process \({\cal R}\). In addition, if the state spaces of variables A and B are finite dimensional and of the same dimension, then the process \({\cal C}\) must be reversible. In classical theory, this means that \({\cal C}\) is an invertible function. In quantum theory, this means that \({\cal C}\) is a unitary channel, of the form \({\cal C}(\rho ) = U\rho U^\dagger\) for some unitary operator U.
In the following, we will consider the task of identifying which variable, out of a given set of candidates, is the causal intermediary of a given variable A. An important feature of this task is that it admits a complete analytical treatment, allowing us to rigorously prove a quantum advantage over all classical strategies. Besides its fundamental interest, this advantage could have applications to the task of monitoring the information flow in future quantum communication networks, allowing an experimenter to determine which node of a quantum network receives information from a given source node.
Optimal classical strategy
Suppose that A, B, and C are random variables with the same alphabet of size d < ∞. In this case, the fact that X ∈ {B, C} is a causal intermediary for A means that the map from A to X is a permutation. The first (second) causal hypothesis is that B (C) is a permutation of A, while C (B) is uniformly random. Other than this, no information about the functional relation between the variables is known to the experimenter. In particular, the experimenter does not know which permutation relates the variable A to its causal intermediary X.
Let us determine how well one can distinguish between the two hypotheses with a finite number of experiments. In principle, we should examine all sequential strategies as in Fig. 1. However, in classical theory the problem can be greatly simplified: the optimal discrimination rate can be achieved by a parallel strategy, wherein the N input variables are initially set to some prescribed set of values31.
The possibility of an error arises is when the randomly fluctuating variable accidentally takes values that are compatible with a permutation, so that the outcome of the test gives no ground to discriminate between the two hypotheses. The probability of such inconclusive scenario is equal to P(d, v)/dN, where v is the number of distinct values of A probed in the experiment and P(d, v) = d!/(d − v)! is the number of injective functions from a v-element set to a d-element set. The probability of confusion is minimal for v = 1, leading to the overall error probability
As a consequence, the rate at which the two causal hypotheses can be distinguished from each other is
A first quantum advantage
Classical systems can be regarded as quantum systems that lost coherence across the states of a fixed basis, consisting of the classical states. But what if coherence is preserved? Could a coherent superposition of classical states be a better probe for the causal structure?
If the causal relations are restricted to reversible gates that permute the classical states, coherence offers an immediate advantage. The experimenter can prepare N probes, each in the superposition \(|e_0\rangle = \mathop {\sum}\nolimits_{i = 0}^{d - 1} |i\rangle /\sqrt d\). Since the superposition is invariant under permutations, the unknown process will produce either N copies of the state |e0⟩⟨e0|⊗I/d or N copies of the state I/d⊗|e0⟩⟨e0|, depending on which causal hypothesis holds. Using Helstrom’s minimum error measurement32, the error probability is reduced to
Compared with the classical error probability (1), the error probability of this simple quantum strategy is reduced by a factor d, which does not change the rate, but could be significant when the size of the alphabet is large.
Let us consider the full quantum version of the problem. Three quantum variables A, B, and C, corresponding to d-dimensional quantum systems, are promised to satisfy one of two causal hypotheses: either (i) the state of B is obtained from the state of A through an arbitrary unitary evolution and the state of C is maximally mixed, or (ii) the state of C is obtained from the state of A through an arbitrary unitary evolution and the state of B is maximally mixed.
Despite the fact that now the cause–effect relation can be one of the infinitely many unitary gates, it turns out that the error probability (3) can still be attained. A universal quantum strategy, working for arbitrary unitary gates, is to prepare d particles in the singlet state
where \(\epsilon _{k_1k_2 \ldots k_d}\) is the totally antisymmetric tensor and the sum ranges over all vectors in the computational basis. Then, each of the d particles is used as an input to one use of the channel. Repeating the experiment for t times, and performing Helstrom’s minimum error measurement one can attain the error probability \(p_{{\mathrm{err}}}^{{\mathrm{coh}}} = (2d^N)^{ - 1}\), with N = td, independently of the unitary gate representing the cause–effect relationship. In summary, the quantum error probability is at least d times smaller than the best classical error probability, even if the cause–effect relationship is described by an arbitrary unitary gate.
Optimality among simple parallel strategies
We now show that the value (3) is optimal among all simple strategies where the unknown process is applied N times in parallel on N identical input systems, as in Fig. 3.
Simple parallel strategies. The unknown process \({\cal C}\) is probed for N times, acting in parallel on N identical systems, initially prepared in a correlated state Ψ
Optimality follows from a complementarity relation between the information about the causal structure and the information about the functional dependence between cause and effect. Suppose that the cause–effect dependence amounts to a unitary gate U in some finite set U. The ability of a state |Ψ〉 to probe the cause–effect dependence can be quantified by the probability \(p_{{\mathrm{guess}}}^{\mathrm{U}}\) of correctly guessing the unitary U from the state U⊗N|Ψ〉. When the set of possibly unitaries has sufficient symmetry, we find that the probability of error in identifying the causal structure satisfies the lower bound
(Supplementary Note 1). The higher the probability of success in guessing the cause–effect dependence, the higher the probability of error in identifying the causal structure. A consequence of the bound (5) is that the minimum error probability in identifying the causal intermediary is (2dN)−1, and is attained when the success probability \(p_{{\mathrm{guess}}}^{\mathrm{U}}\) is equal to the random guess probability 1/|U|.
Exponential reduction of the error probability
The bound (5) shows that the discrimination rate of simple parallel strategies cannot exceed the classical discrimination rate log d. We now show that that the rate can be doubled by entangling the N probes with an additional reference system.
The working principle of our strategy is to build a quantum superposition of equivalent experimental setups. If no reference system is used, we know that the optimal strategy is to divide the N probes into N/d groups (assuming for simplicity that N is a multiple of d), and to entangle the probes within each group. Clearly, different ways of dividing the N inputs into groups of d are equally optimal: it does not matter which particle is entangled with which, as long as all each particle is part of a singlet state. Still, we can imagine a machine that partitions the particles according to a certain configuration i if a control system is in the state |i〉. When the control system is in a superposition, the machine will probe the unknown process in a superposition of configurations, as pictorially illustrated in Fig. 4. Explicitly, the optimal input state is
where i labels the different ways to partition N identical objects into groups of d elements, GN,d is the number of such ways, \(\left( {\left| {S_d} \right\rangle ^{ \otimes N/d}} \right)_i\) is the product of N/d singlet states arranged according to the i-th configuration, and {|i〉, i = 1, …, GN,d} are orthogonal states of the reference system.
Coherent superposition of configurations. a shows the three different ways of dividing four quantum bits into groups of two. These three configurations are all equivalent for the identification of the causal intermediary. b pictorially illustrates a quantum superposition of configurations, with the choice of configuration correlated with the state of a control system
Classically, there would be no point in randomizing optimal configurations, because mixtures cannot reduce the error probability. But in the quantum case, the coherent superposition of equivalent configurations brings the error probability down to
where r is the number of linearly independent states of the form \(\left( {\left| {S_d} \right\rangle ^{ \otimes N/d}} \right)_i\) (Supplementary Note 2).
To determine how much the error probability can be reduced, we only need to evaluate the number of linearly independent states. It turns out that this number grows as dN, up to a polynomial factor (Supplementary Note 2 again). Taking the logarithm, we obtain the discrimination rate
which is twice the classical discrimination rate (2). In fact, the asymptotic regime is already reached with a small number of interrogations, of the order of a few tens. For example, the causal relation between two quantum bits can be determined with an error probability smaller than 10−6 using with 12 interrogations, whereas 20 interrogations are necessary for classical binary variables.
The above strategy is universal, in that it applies to causal relationships described by arbitrary unitary gates. In particular, it applies to gates that permute the classical states. Hence, the ability to maintain coherence across the classical states and to generate entanglement with a reference system offers an exponential speedup with respect to the best classical strategy. In passing, we note that the universal quantum strategy is insensitive to the presence of perfectly correlated noise, such as the noise due to the lack of a reference frame33, where each of the N input variables is subjected to the same unknown unitary gate.
The ultimate quantum limit
So far, we examined strategies where the unknown process is applied in parallel to a large entangled state. Could a general sequence of interventions achieve an even better rate?
Finding the optimal sequential strategy is generally a hard problem. To address this problem, we introduce the fidelity divergence of two quantum channels \({\cal C}_1\) and \({\cal C}_2\), defined as
where ρ1 and ρ2 are joint states of the channel’s input and of the reference system R. It is understood that the infimum in the right-hand side is taken over pairs of states (ρ1, ρ2) for which the fidelity F(ρ1, ρ2) is non-zero, so that the expression on the right-hand side of Eq. (9) is well-defined.
The fidelity divergence quantifies the ability of channels \({\cal C}_1\) and \({\cal C}_2\) to move two states apart from each other. In the Methods section, we show that the error probability in distinguishing between \({\cal C}_1\) and \({\cal C}_2\) with N queries is lower bounded as
In particular, suppose that the two channels \({\cal C}_1\) and \({\cal C}_2\) have the form \({\cal C}_1 = {\cal U} \otimes I/d\) and \({\cal C}_2 = I/d \otimes {\cal U}\), where \({\cal U}\) is a fixed unitary channel. In this case, we find that the fidelity divergence is 1/d2. Hence, the error probability satisfies the bound
In the causal intermediary problem, the unitary gate \({\cal U}\) is unknown, and therefore the error probability can only be larger than \(p_{{\mathrm{err}}}^{{\mathrm{seq}}}({\cal C}_1,{\cal C}_2;N)\). Hence, the identification of the causal intermediary cannot occur at a rate faster than 2 log d.
Equation (11) limits all sequential quantum strategies. But in fact quantum theory is also compatible with scenarios where physical processes take place in an indefinite order23,24. Could the rate be increased if the experimenter had access to exotic phenomena involving indefinite order?
The answer is negative. In the Methods section, we develop the concepts and methods needed to answer this question, and we show that the minimum error probability in distinguishing between the two channels \({\cal C}_1 = {\cal I} \otimes I/d\) and \({\cal C}_2 = I/d \otimes {\cal I}\) using arbitrary setups with indefinite order satisfies the bound
Clearly, this bound applies to the causal intermediary problem, which is harder than the discrimination of the two specific channels \({\cal C}_1 = {\cal I} \otimes I/d\) and \({\cal C}_2 = I/d \otimes {\cal I}\). Hence, the rate RQ = 2 log d represents the ultimate quantum limit to the identification of a causal intermediary.
Extension to arbitrary numbers of hypotheses
The quantum advantage demonstrated in the previous sections can be extended to the identification of the causal intermediary among an arbitrary number k of candidate variables. The best classical strategy still consists of initializing all variables to the same value. Errors arise when the values of two or more output variables are compatible with an invertible function. In the limit of many repetitions, the minimum error probability is \(p_{{\mathrm{err}},{\mathrm{k}}}^{\mathrm{C}} = (k - 1)/(2d^{N - 1}) + O\left( {d^{ - 2N}} \right)\). (Supplementary Note 3). For quantum strategies, the best option among simple parallel strategies is still to divide the input particles into N/d groups of d particles and to initialize each group in the singlet state. In Supplementary Note 4, we show that this strategy reduces the error probability to \(p_{{\mathrm{err}},{\mathrm{k}}}^{{\mathrm{coh}}} = (k - 1)/(2d^N) + O\left( {d^{ - 2N}} \right)\), for causal relations represented by arbitrary unitary gates.
An exponentially smaller error probability can be achieved using the input state (6). The evaluation of the error probability is more complex than in the two-hypothesis case, but the end result is the same: when the causal dependency is probed N times, the quantum error probability decays at the exponential rate RQ = 2 log d, twice the rate of the best classical strategy (see Supplementary Note 5 for the technical details).
Applications to other tests of causal hypotheses
The strategies developed in the previous sections can be applied to the identification of causal relations in a variety of scenarios. For example, they can be used to decide whether there is a causal link between two variables A and B. More specifically, they can be used to determine whether variable B is a causal intermediary for variable A or whether B fluctuates at random independently of A. Also in this case, the error probability of the best classical strategy is 1/(2dN−1), whereas preparing N/d copies of the singlet yields error probability 1/(2dN).
By superposing all possible partitions of the N inputs into groups of d, one can boost the discrimination rate from log d to 2 log d. One could speculate that, in the future, such a fast identification could be useful as a quantum version of the ping protocol, capable of establishing whether there exists a quantum communication link between two nodes of a quantum internet34.
Another application of our techniques is in the problem of identifying the cause of a given variable. Suppose that one of k variables A1, A2, …, Ak is the cause for a given variable B. An example of this situation arises in genetics, when trying to identify the gene responsible for a certain characteristic. Here, the interesting scenario is when the number of candidate causes is large.
Classically, the problem is to find the variable Ax such that B is a function of Ax. For simplicity, we first assume that all variables have the same d-dimensional alphabet, and that the function from Ax to B is the identity, namely b = ax. In this case, the cause can be identified without any error by probing the unknown process for \(\left\lceil {{\mathrm{log}}_dk} \right\rceil\) times. The identification is done by a simple search algorithm, where one divides the candidate variables in d groups and initializes the input variables in the i-th group to the value i. In this way, d − 1 groups can be ruled out, and one can iterate the search in the remaining group. Using a decision tree argument35, it is not hard to see that \(\left\lceil {{\mathrm{log}}_dk} \right\rceil\) is the minimum number of queries needed to identify the unknown process in the worst case scenario.
In the quantum version of the problem, we find that the number of queries can be cut down by approximately a half when the number of hypotheses is large. The trick is to prepare k maximally entangled states, and to apply the unknown process to the first system of each pair. Repeating this procedure for N times and using results on port-based teleportation36 we find that the error probability is perr = (k − 1)/(d2N + k − 1). Hence, \(N = \left\lceil {(1 + \epsilon)({\mathrm{log}}_dk)/2} \right\rceil\) queries are sufficient to identify the cause with vanishing error probability in the large k limit.
In Supplementary Note 6, we consider the more complex scenario where the functional dependence between the cause and effect is unknown, and the only assumption is that the effect is a causal intermediary of the cause. Despite the lack of information about the functional dependence, we show that the correct cause can be still identified with high probability using \(N = \left\lceil {(1{\mathrm{ }} + \epsilon)({\mathrm{log}}_dk)/2} \right\rceil\) calls to the unknown process. The fast identification of the cause is achieved by dividing the N copies of each input variable Ai into groups of d copies, preparing each group in the singlet state, and entangling the configuration of the groupings with an external reference system. Once again, the superposition of multiple equivalent setups leads to a quantum speedup over the best classical strategy.
Discussion
We showed that quantum mechanics enhances our ability to detect direct cause–effect links. This finding motivates the exploration of more complex networks of causal relations, including intermediate nodes and global causal dependences between groups of variables1,2,3. The development of new techniques for testing causal relations could find applications to future quantum communication networks, providing a fast way to test the presence of communication links. It could also assist the design of intelligent quantum machines, in a similar way as classical causal discovery algorithms have been useful in classical artificial intelligence. In view of such applications, it is important to go beyond the noiseless scenario considered in this paper, and to address scenarios where the cause–effect relationships are obfuscated by noise. The techniques developed in our work already provide some insights in this direction. Quite interestingly, one can show that the quantum advantage persists in the presence of depolarizing noise, provided that the noise level is not too high (see Supplementary Note 7). A complete study of the noisy scenario, however, remains an open direction of future research.
Another direction of future investigation is foundational. Given the advantage of quantum theory over classical theory, it is tempting to ask whether alternative physical theories could offer even larger advantages. Interesting candidates are theories that admit more powerful dense coding protocols than quantum theory37, as one might expect super-quantum advantages to arise from the presence of stronger correlations with the reference system. In a similar vein, one could explore physical theories with higher dimensional state spaces, such as Zyczkowski’s quartic theory38, or quantum theory on quaternionic Hilbert spaces39. Indeed, it is intriguing to observe that the classical rate RC = log d and the quantum rate RQ = 2 log d are equal to the logarithms of the dimensions of the classical and quantum state spaces, respectively. In general, one may expect a relationship between the dimension of the state space and the rate. Should super-quantum advantages emerge, it would be natural to ask which physical principle determines the causal identification power of quantum mechanics. An intriguing possibility is that one of the hidden physical principles of quantum theory could be a principle on the ability to distinguish alternative causal hypotheses.
Methods
Properties of the fidelity divergence
Here, we derive two properties of the fidelity divergence defined in Eq. (9). First, the fidelity divergence provides a lower bound on the probability of misidentifying a channel with another:
Proposition 1 The probability of error in distinguishing between two quantum channels \({\cal C}_1\) and \({\cal C}_2\) with N queries is lower bounded as \(p_{{\mathrm{err}}}^{{\mathrm{seq}}}({\cal C}_1,{\cal C}_2;N) \ge \partial F({\cal C}_1,{\cal C}_2)^N/4\).
The bound can be obtained in the following way. Let \(\rho _x^{(N)}\) be the output state of a circuit as in Fig. 1. Then, we have the bound
The first line follows from Helstrom’s theorem32, and the second line follows from the Fuchs–Van De Graaf Inequality40. The third line follows from the definition of the fidelity divergence (9), which implies that the fidelity between the states right after the (t + 1)-th use of the unknown channel \({\cal C}_x\), denoted by ρx,t+1, satisfies the bound
where \({\cal U}_{t + 1}\) is the (t + 1)-th operation in Fig. 1. The fourth line follows from the elementary inequality \(\sqrt {1 - t} \le 1 - t/2\).
Another important property is that the fidelity divergence can be evaluated on pure states. The proof is simple: let ρ1 and ρ2 be two arbitrary states of the composite system AR, where R is an arbitrary reference system. By Uhlmann’s theorem41, there exists a third system E and two purifications \(|\Psi _1\rangle ,|\Psi _2\rangle \in {\cal H}_A \otimes {\cal H}_R \otimes {\cal H}_E\), such that F(Ψ1, Ψ2) = F(ρ1, ρ2). On the other hand, the monotonicity of the fidelity under partial trace42, ensures that the fidelity between the output states \(({\cal C}_1 \otimes {\cal I}_{RE})({\mathrm{\Psi }}_1)\) and \(({\cal C}_2 \otimes {\cal I}_{RE})({\mathrm{\Psi }}_2)\) cannot be larger than the fidelity between the states \(({\cal C}_1 \otimes {\cal I}_R)(\rho _1)\) and \(({\cal C}_2 \otimes {\cal I}_R)(\rho _2)\). Hence, the minimization on the right-hand side of Eq. (9) can be restricted without loss of generality to pure states.
Fidelity divergence for the identification of the causal intermediary
Let us see how the fidelity divergence can be applied to our causal identification problem. The two channels are of the form \({\cal C}_{1,U}(\rho ) = U\rho U^\dagger \otimes I/d\) and \({\cal C}_{2,V} = I/d \otimes V\rho V^\dagger\), where U and V are two unknown unitary gates. Since we are interested in the worst case scenario, every choice of U and V will give an upper bound to the discrimination rate. In particular, we pick U = V.
Proposition 2 The fidelity divergence for the two channels \({\cal C}_{1,U}\) and \({\cal C}_{2,U}\) is \(\partial F({\cal C}_{1,U},{\cal C}_{2,U}) = 1/d^2\).
By the unitary invariance of the fidelity, \(\partial F({\cal C}_{1,U},{\cal C}_{2,U})\) is independent of U. Without loss of generality, let us pick U = I. For a generic reference system R and two generic pure states \(|{\mathrm{\Psi }}_1,\rangle |{\mathrm{\Psi }}_2\rangle \in {\cal H}_A \otimes {\cal H}_R\), the two output states are
up to reordering of the Hilbert spaces. The fidelity can be computed with the relation
where we omitted the identity operators for the sake of brevity. Let us expand the input states as
where {|n⟩} is an orthonormal basis for the reference system, and {|ϕxn⟩} is a set of unnormalized vectors. Inserting Eq. (17) into Eq. (16), we obtain the expression
with \(C = \mathop {\sum}\nolimits_n {\kern 1pt} |\phi _{1n}\rangle \langle \phi _{2n}|\). On the other hand, the fidelity between the input states is
Hence, the fidelity divergence satisfies the bound
having used the inequality |Tr[C]| ≤ Tr|C|, valid for every operator C. The inequality holds with the equality sign whenever C is positive. This condition is satisfied, e.g., when the input states |Ψ1〉 and |Ψ2〉 are identical.
Quantum strategies with indefinite causal order
In principle, quantum mechanics is compatible with situations where multiple processes are combined in indefinite order23,24. This suggests that an experimenter could devise new ways to probe quantum channels, allowing the relative order among different uses of the same channel to be indefinite. We call such strategies indefinite testers.
Consider the problem of identifying a channel \({\cal C}_x\) from N uses. The input resource is the channel \({\cal C}_x^{ \otimes N}\), representing N identical black boxes that can be arranged in any desired order. Besides the product of N independent channels, the most general class of channels with this property is the class of no-signaling channels with N pairs of input/output systems.
Mathematically, an indefinite tester is a linear map from the set of no-signaling channels to the set of probability distributions over a given set of outcomes. Equivalently, the tester can be described by a set of operators {Tx}, where each operator Tx acts on the Hilbert space \(\otimes _i({\cal H}_i^{{\mathrm{in}}} \otimes {\cal H}_i^{{\mathrm{out}}})\), where \({\cal H}_i^{{\mathrm{in}}}\) and \({\cal H}_i^{{\mathrm{out}}}\) are the Hilbert spaces of the input and output system in the i-th pair, respectively. When the test is performed on a no-signaling channel \({\cal C}\), the probability of the outcome x is given by the generalized Born rule px = Tr[TxC], where C is the Choi operator of the channel \({\cal C}\)43. The normalization of the probabilities
is required to hold for every no-signaling channel \({\cal C}\).
Consider the problem of distinguishing between a set of no-signaling channels \(\{ {\cal C}_x\}\) using an indefinite tester. For every probability distribution {πx}, the worst-case probability of error satisfies the bound
Now, suppose that there exists a constant λ and a no-signaling channel \({\cal C}\) such that
for every x. Substituting Eq. (23) into Eq. (22) one obtains the bound
having used the normalization condition (21). The bound (24) can be seen as a generalization of the classical Yuen–Kennedy–Lax bound for quantum state discrimination44.
We now apply the bound (24) to the task of distinguishing between the two channels \({\cal C}_{1,I} = ({\cal U} \otimes I/d)^{ \otimes N}\) and \({\cal C}_{2,I} = (I/d \otimes {\cal U})^{ \otimes N}\). To this purpose, we consider the universal cloning channel45
and the universal NOT channel46
with P± = (I ± SWAP)/2, and SWAP being the unitary operator that swaps between the even and odd output spaces. It is easy to verify that both channels are no-signaling. Moreover, we find that the convex combination \({\cal C} = p_ + {\cal C}_ + + p_ - {\cal C}_ -\) with \(p_ \pm = \sqrt {\frac{{d^N \pm 1}}{{2d^N}}} /\left( {\sqrt {\frac{{d^N + 1}}{{2d^N}}} + \sqrt {\frac{{d^N - 1}}{{2d^N}}} } \right)\) satisfies the condition (23) with \(\lambda = \frac{1}{2}\left( {\sqrt {\frac{{d^N + 1}}{{2d^N}}} + \sqrt {\frac{{d^N - 1}}{{2d^N}}} } \right)^2\) (see Supplementary Note 8 for technical details). Hence, the bound (24) becomes
The above bound implies that the discrimination rate of quantum strategies with indefinite order cannot exceed 2 log d.
Data availability
The authors declare that the data supporting the findings of this study are available within the paper and in the Supplementary Information files.
References
Spirtes, P., Glymour, C. N. & Scheines, R. Causation, Prediction, and Search (MIT Press, Cambridge, Massachusetts, United States 2000).
Pearl, J. Causality (Cambridge University Press, Cambridge, United Kingdom 2009).
Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Morgan Kaufmann, Burlington, Massachusetts, United States 2014).
Chaves, R. et al. Quantum violation of an instrumental test. Nat. Phys. 14, 291–296 (2018).
Van Himbeeck, T. et al. Quantum violations in the instrumental scenario and their relations to the Bell scenario. Preprint at: https://arxiv.org/abs/1804.04119 (2018).
Leifer, M. S. Quantum dynamics as an analog of conditional probability. Phys. Rev. A 74, 042310 (2006).
Chiribella, G., D’Ariano, G. M. & Perinotti, P. Theoretical framework for quantum networks. Phys. Rev. A 80, 022339 (2009).
Coecke, B. & Spekkens, R. W. Picturing classical and quantum Bayesian inference. Synthese 186, 651–696 (2012).
Leifer, M. S. & Spekkens, R. W. Towards a formulation of quantum theory as a causally neutral theory of Bayesian inference. Phys. Rev. A 88, 052130 (2013).
Henson, J., Lal, R. & Pusey, M. F. Theory-independent limits on correlations from generalized Bayesian networks. New J. Phys. 16, 113043 (2014).
Pienaar, J. & Brukner, Č. A graph-separation theorem for quantum causal models. New J. Phys. 17, 073020 (2015).
Costa, F. & Shrapnel, S. Quantum causal modelling. New J. Phys. 18, 063032 (2016).
Portmann, C., Matt, C., Maurer, U., Renner, R. & Tackmann, B. Causal boxes: quantum information-processing systems closed under composition. IEEE Trans. Inf. Theory 63, 3277–3305 (2017).
Allen, J.-M. A., Barrett, J., Horsman, D. C., Lee, C. M. & Spekkens, R. W. Quantum common causes and quantum causal models. Phys. Rev. X 7, 031021 (2017).
MacLean, J.-P. W., Ried, K., Spekkens, R. W. & Resch, K. J. Quantum-coherent mixtures of causal relations. Nat. Commun. 8, 15149 (2017).
Wood, C. J. & Spekkens, R. W. The lesson of causal discovery algorithms for quantum correlations: causal explanations of Bell-inequality violations require fine-tuning. New J. Phys. 17, 033002 (2015).
Fitzsimons, J. F., Jones, J. A. & Vedral, V. Quantum correlations which imply causation. Sci. Rep. 5, 18281 (2015).
Ried, K. et al. A quantum advantage for inferring causal structure. Nat. Phys. 11, 414–420 (2015).
Chaves, R., Majenz, C. & Gross, D. Information–theoretic implications of quantum causal structures. Nat. Commun. 6, 5766 (2015).
Giarmatzi, C. & Costa, F. A quantum causal discovery algorithm. npj Quantum Inf. 4, 17 (2018).
Schuld, M., Sinayskiy, I. & Petruccione, F. An introduction to quantum machine learning. Contemp. Phys. 56, 172–185 (2015).
Biamonte, J. et al. Quantum machine learning. Nature 549, 195 (2017).
Chiribella, G., D’Ariano, G. M., Perinotti, P. & Valiron, B. Quantum computations without definite causal structure. Phys. Rev. A 88, 022318 (2013).
Oreshkov, O., Costa, F. & Brukner, Č. Quantum correlations with no causal order. Nat. Commun. 3, 1092 (2012).
Hardy, L. Quantum theory from five reasonable axioms. Preprint at: https://arxiv.org/abs/quant-ph/0101012 (2001).
Barnum, H., Barrett, J., Leifer, M. & Wilce, A. Generalized no-broadcasting theorem. Phys. Rev. Lett. 99, 240501 (2007).
Barrett, J. Information processing in generalized probabilistic theories. Phys. Rev. A 75, 032304 (2007).
Chiribella, G., D’Ariano, G. & Perinotti, P. Probabilistic theories with purification. Phys. Rev. A 81, 062348 (2010).
Hardy, L. Foliable operational structures for general probabilistic theories. In Deep Beauty: Understanding the Quantum World through Mathematical Innovation (ed. Halvorson, H.) 409–442 (Cambridge University Press, Cambridge, United Kingdom 2011).
Chiribella, G. & Spekkens, R. W. Quantum Theory: Informational Foundations and Foils (Springer, Dordrecht, The Netherlands 2016).
Hayashi, M. Discrimination of two channels by adaptive methods and its application to quantum system. IEEE Trans. Inf. Theory 55, 3807–3820 (2009).
Helstrom, C. W. Quantum detection and estimation theory. J. Stat. Phys. 1, 231–252 (1969).
Bartlett, S. D., Rudolph, T. & Spekkens, R. W. Reference frames, superselection rules, and quantum information. Rev. Mod. Phys. 79, 555–609 (2007).
Kimble, H. J. The quantum internet. Nature 453, 1023–1030 (2008).
Cormen, T. H., Leiserson, C. E., Rivest, R. L. & Stein, C. Introduction to Algorithms (MIT Press, Cambridge, Massachusetts, United States 2009).
Mozrzymas, M., Studziński, M., Strelchuk, S. & Horodecki, M. Optimal port-based teleportation. New J. Phys. 20, 053006 (2018).
Massar, S., Pironio, S. & Pitalúa-Garca, D. Hyperdense coding and superadditivity of classical capacities in hypersphere theories. New J. Phys. 17, 113002 (2015).
Życzkowski, K. Quartic quantum theory: an extension of the standard quantum mechanics. J. Phys. A 41, 355302 (2008).
Barnum, H., Graydon, M. A. & Wilce, A. Some nearly quantum theories. Preprint at: https://arxiv.org/abs/1507.06278 (2015).
Fuchs, C. A. & Van De Graaf, J. Cryptographic distinguishability measures for quantum-mechanical states. IEEE Trans. Inf. Theory 45, 1216–1227 (1999).
Uhlmann, A. The transition probability in the state space of a*-algebra. Rep. Math. Phys. 9, 273–279 (1976).
Wilde, M. M. Quantum Information Theory (Cambridge University Press, 2013).
Choi, M.-D. Completely positive linear maps on complex matrices. Linear Algebra Appl. 10, 285–290 (1975).
Yuen, H., Kennedy, R. & Lax, M. Optimum testing of multiple hypotheses in quantum detection theory. IEEE Trans. Inf. Theory 21, 125–134 (1975).
Werner, R. F. Optimal cloning of pure states. Phys. Rev. A 58, 1827–1832 (1998).
Bužek, V., Hillery, M. & Werner, R. Optimal manipulations with qubits: universal-not gate. Phys. Rev. A 60, R2626–R2629 (1999).
Acknowledgements
The authors acknowledge Robert Spekkens, David Schmidt, Lucien Hardy, Sergii Strelchuk, Akihito Soeda, and Thomas Gonda for stimulating discussions. This work is supported by the National Natural Science Foundation of China through Grant 11675136, the Croucher Foundation, John Templeton Foundation, Project 60609, Quantum Causal Structures, the Canadian Institute for Advanced Research (CIFAR), the Hong Research Grant Council through Grants 17300317 and 17300918, and the Foundational Questions Institute through Grant FQXi-RFP3-1325. This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. This research was supported in part by Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.
Author information
Authors and Affiliations
Contributions
Both the authors contributed substantially to the research presented in this paper and to the preparation of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Journal peer review information: Nature Communications thanks Cyril Branciard, Jonatan Bohr Brask and the other anonymous reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chiribella, G., Ebler, D. Quantum speedup in the identification of cause–effect relations. Nat Commun 10, 1472 (2019). https://doi.org/10.1038/s41467-019-09383-8
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41467-019-09383-8
This article is cited by
-
Quantum causal inference with extremely light touch
npj Quantum Information (2025)
-
Experimental aspects of indefinite causal order in quantum mechanics
Nature Reviews Physics (2024)
-
RSNET: inferring gene regulatory networks by a redundancy silencing and network enhancement technique
BMC Bioinformatics (2022)
-
Quantum operations with indefinite time direction
Communications Physics (2022)
-
Quantum causal unravelling
npj Quantum Information (2022)






