Quantum equilibrium propagation for efficient training of quantum systems based on Onsager reciprocity

Wanjura, Clara C.; Marquardt, Florian

doi:10.1038/s41467-025-61665-6

Download PDF

Article
Open access
Published: 17 July 2025

Quantum equilibrium propagation for efficient training of quantum systems based on Onsager reciprocity

Nature Communications volume 16, Article number: 6595 (2025) Cite this article

3725 Accesses
3 Citations
Metrics details

Subjects

Abstract

The widespread adoption of machine learning and artificial intelligence in all branches of science and technology creates a need for energy-efficient, alternative hardware. While such neuromorphic systems have been demonstrated in a wide range of platforms, it remains an open challenge to find efficient and general physics-based training approaches. Equilibrium propagation (EP), the most widely studied approach, has been introduced for classical energy-based models relaxing to an equilibrium. Here, we show a direct connection between EP and Onsager reciprocity and exploit this to derive a quantum version of EP. For an arbitrary quantum system, this can now be used to extract training gradients with respect to all tuneable parameters via a single linear response experiment. We illustrate this new concept in examples in which the input or the task is of quantum-mechanical nature, e.g., the recognition of many-body ground states, phase discovery, sensing, and phase boundary exploration. Quantum EP may be used to solve challenges such as quantum phase discovery for Hamiltonians which are classically hard to simulate or even partially unknown. Our scheme is relevant for a variety of quantum simulation platforms such as ion chains, superconducting circuits, Rydberg atom tweezer arrays and ultracold atoms in optical lattices.

Hamiltonian engineering of collective XYZ spin models in an optical cavity

Article 15 April 2025

Single-atom exploration of optimized nonequilibrium quantum thermodynamics by reinforcement learning

Article Open access 07 October 2023

Quantum phases of matter on a 256-atom programmable quantum simulator

Article 07 July 2021

Introduction

As deep learning and artificial intelligence are adopted in all fields of science and technology, the increasing complexity of neural networks has led to an exponential increase in energy consumption and training costs. This has created a need for more efficient alternatives, sparking the rapidly developing field of neuromorphic computing¹, which explores a variety of different platforms^2,3 to design physical, analogue neural networks.

Existing training strategies for neuromorphic platforms include in-silico training, requiring a faithful digital model of the system, and physics-aware backpropagation⁴, combining physical inference with a simulated backward pass which relaxes these constraints. However, it is a central question whether not only inference but also training can exploit the physical dynamics⁵, making full use of the energy efficiency of neuromorphic systems. For example, feedback-based parameter shifting does not require any simulation but scales unfavourably with the network size⁶. Moving towards physical implementations of efficient backpropagation (the central technique for training artificial neural networks), strategies for specific types of non-linearities have been developed^7,8,9, as well as approaches performing backpropagation only on the linear components^10,11. Another novel recent approach enables 'forward-forward' type gradient calculation in systems which perform sequential information processing¹². Furthermore, efficient measurements of gradients via scattering experiments can be performed in optical systems that employ a framework recently developed to produce nonlinear computation with linear wave setups¹³— such nonlinear processing was also recently demonstrated in refs. ^12,14,15.

General approaches for physical backpropagation so far only exist in two classes of physical systems: Hamiltonian Echo Backpropagation¹⁶, which applies to essentially lossless systems in which a time-reversal operation can be implemented, and equilibrium propagation (EP)^17,18, which applies to energy-based, equilibrating systems.

EP is currently the most intensively studied physics-based training procedure for neuromorphic systems. It stands in the tradition of contrastive learning approaches, comparing measurements obtained from two different equilibria and using feedback to update parameters. Concretely, EP consists of two phases: the free and the nudged phase. In the free phase, the input is fixed and the system relaxes into its equilibrium state. In the nudged phase, the output is 'nudged' closer towards the target output and the system evolves to a new equilibrium. Comparing the system state in the free and the nudged phase, respectively, one obtains the necessary gradients which are then used to update parameters.

Since its introduction in 2017, EP has been investigated thoroughly^{19,20,21,22,23,24} and a variant, coupled learning, has been developed^25,26. In particular, EP was proposed for training nonlinear resistor circuits¹⁹, systems of coupled phase oscillators²⁴, was further adapted to spiking networks^20,27, to implement continual parameter updates²¹ and a dynamical version was developed²². Experimentally, EP has been applied to train electronic systems^28,29, elastic networks³⁰ and even a memristor crossbar array³¹. Furthermore, a classical Ising model has been trained using a quantum annealer to efficiently reach equilibrium³².

Given the elegance and wide-ranging impact of the EP approach, it is a natural question to ask whether it can be extended to quantum systems to train a fully quantum Hamiltonian via a nudging procedure similar to classical EP. In this work, we will show that there is a direct connection between EP and Onsager reciprocity and exploit it to derive a quantum version of EP (QEP). This new approach can be used to train efficiently arbitrary quantum systems, including specifically the highly tuneable quantum many-body systems realized nowadays in quantum simulators³³.

We illustrate this new concept with supervised and unsupervised learning examples of a quantum-mechanical nature. Specifically, we investigate the recognition of quantum many-body phases and introduce as new concepts the exploration of phase diagrams with quantum simulator platforms via efficient gradient descent optimization enabled by QEP as well as the optimisation of sensitivity (e.g. for sensing applications) and phase boundary exploration. QEP is applicable to systems which are hard or impossible to simulate classically and can be employed even in settings in which the Hamiltonian is only partially known or partially accessible.

In terms of the general question of using quantum devices for learning tasks, the area of quantum machine learning^34,35 by now has a long history. There have been some ideas of how to learn to reproduce quantum states by adapting tuneable parameters ('quantum Boltzmann machines', see ref. ³⁵). The major research efforts in this domain are, however, spent on variational quantum circuits, which require fully controllable digital quantum computing platforms for implementation (possibly even fault-tolerant), imposing resource demands that go significantly beyond what we are going to assume here. We provide an overview of digital and analogue neuromorphic computing approaches in the classical and quantum domain in Fig. 1c.

**Fig. 1: The concept of quantum equilibrium propagation.**

Results

Onsager reciprocity as the basis for quantum equilibrium propagation

Consider a parameterized Hamiltonian

$$\hat{H}(\lambda )=\sum_{j}{\lambda }_{j}{\hat{A}}_{j}$$

(1)

and its ground state $\left\vert \Psi (\lambda )\right\rangle$. A small static force coupling to ${\hat{A}}_{j}$ (entering as a term $\delta {\lambda }_{j}{\hat{A}}_{j}$ inside $\hat{H}$) will produce a linear response in the expectation value $\left\langle {\hat{A}}_{\ell }\right\rangle$, given by

$${\chi }_{\ell j}=\frac{\partial }{\partial {\lambda }_{j}}\left\langle \Psi (\lambda )\right\vert {\hat{A}}_{\ell }\left\vert \Psi (\lambda )\right\rangle .$$

(2)

Onsager reciprocity guarantees the symmetry of the susceptibility χ_jℓ = χ_ℓj, i.e., the same effect will be produced by a force acting on ${\hat{A}}_{\ell }$ influencing the expectation value $\left\langle {\hat{A}}_{j}\right\rangle$, see Fig. 1, i.e.,

$$\frac{\partial }{\partial {\lambda }_{j}}\langle {\hat{A}}_{\ell }\rangle=\frac{\partial }{\partial {\lambda }_{\ell }}\langle {\hat{A}}_{j}\rangle .$$

(3)

Onsager reciprocity³⁶ can be derived in many ways, also for the quantum case³⁷. However, the most elementary approach for static response situations such as the one considered here and applied to the ground state in particular uses first-order perturbation theory for the deformation of the ground state ${\partial }_{{\lambda }_{j}}| \Psi (\lambda )\left.\right\rangle={(E(\lambda )-\hat{H}(\lambda ))}^{-1}({\hat{A}}_{j}-\langle {\hat{A}}_{j}\rangle )| \Psi (\lambda )\left.\right\rangle$. Expression (3) also holds for thermal states for which the expectation values above are replaced by $\langle {\hat{A}}_{j}\rangle={{\rm{Tr}}}(\,\hat{\rho }{\hat{A}}_{j})$, so the following results apply for arbitrary-temperature quantum equilibrium states. In the case of ground-state degeneracy, expression (3) also holds with $\hat{\rho }={\sum }_{j}\vert {\Psi }_{j}(\lambda )\rangle \langle {\Psi }_{j}(\lambda )\rangle$ in which j sums over all degenerate states; this is approximately equivalent to a thermal state at small, but non-vanishing, temperature. Classical Onsager reciprocity is equivalent to what has been termed the 'Fundamental Lemma' for classical EP^17,18.

We will now show that this well-known result (3) gives us access to a general version of equilibrium propagation for quantum systems. To understand that, we first consider supervised learning. Let us assume that the set of operators ${\hat{A}}_{j}$ is split into degrees of freedom relating to the input $j\in {{{\mathcal{S}}}}_{{{\rm{in}}}}$, trainable variables $j\in {{{\mathcal{S}}}}_{{{\rm{train}}}}$, and the output $j\in {{{\mathcal{S}}}}_{{{\rm{out}}}}$. Accordingly, the set of parameters λ is split into the input x containing the parameters corresponding to the input λ_j = x_j, the set of training parameters θ with λ_j = θ_j and the set of couplings to the output observables ν with λ_j = ν_j. Hence, a general QEP Hamiltonian is of the form $\hat{H}(x,\theta,\nu )$ in which ν = 0 during inference. For any given training sample, the input x is fixed by applying a field to all input degrees of freedom (we may write λ_k = x_k for $k\in {{{\mathcal{S}}}}_{{{\rm{in}}}}$, with x representing the input vector). The output is then read off as the expectation values ${y}_{\ell }=\langle {\hat{A}}_{\ell }\rangle$ in the operators $\ell \in {{{\mathcal{S}}}}_{{{\rm{out}}}}$. Note that ${\hat{A}}_{\ell }$ can be chosen as a projector, in which case y_ℓ becomes the probability of obtaining a particular outcome in a measurement; this is useful for classification tasks.

In supervised learning, we are interested in adjusting the trainable parameters θ in order to 'nudge' the output closer to the desired target output y^target(x), for any given input x. More generally, we aim to reduce the loss function ${{\mathcal{L}}}(y,{y}^{{{\rm{target}}}})$, or rather its average $\bar{{{\mathcal{L}}}}$ over many training samples (x, y^target(x)), via gradient descent: $\delta \theta=-\eta \partial \bar{{{\mathcal{L}}}}/\partial {\theta }_{j}$, in which η is the learning rate. To do this, we need to obtain the influence of a change in θ_j on any of the outputs y_ℓ. For a given fixed training sample, this is just the susceptibility ${\chi }_{\ell j}(\lambda )=\partial \langle {\hat{A}}_{\ell }\rangle /\partial {\theta }_{j}$. Evaluating χ_ℓj for all possible trainable parameters j, see Fig. 1a, scales unfavourably, requiring a number of different experiments that scale linearly in the number N of these parameters. Accessing the training gradient in this way amounts to the parameter-shift method which is always applicable in any neuromorphic platform but should generally be avoided whenever possible due to this unfavourable scaling. However, Onsager reciprocity, Eq. (3), tells us that we can also access the susceptibility χ_ℓj by performing an alternative, much more efficient experiment that instead reveals χ_jℓ: apply a small force ν acting on the outputs and observe its influence on the expectation values of the degrees of freedom $\langle {\hat{A}}_{k}\rangle$ connected to the trainable parameters: ${\chi }_{j\ell }(\lambda )=\partial \langle {\hat{A}}_{j}\rangle /\partial {\nu }_{\ell }$, see Fig. 1b. The required measurements of $\langle {\hat{A}}_{j}\rangle$ (for any j) under application of a given force component ν_ℓ (at fixed ℓ) can be performed in parallel, in a single experiment. Thus, this approach would already seem to require a number of experiments that only scales with the number of outputs N_out (number of choices ℓ), which is typically much smaller than the number of trainable parameters. This already would offer a substantial speedup compared with the naive parameter shift method. However, by evaluating explicitly the desired gradient of the loss function, it becomes apparent that only a single experiment is in fact needed: in this experiment, a force vector $\varepsilon=\partial {{\mathcal{L}}}/\partial y$, the so-called error signal, is applied to the output degrees of freedom.

In this way, Onsager reciprocity teaches us how to translate the classical equilibrium propagation approach to quantum devices. Since Onsager reciprocity holds generally in equilibrium systems (SI), quantum Hamiltonians of arbitrary structure can be considered.

Quantum equilibrium propagation procedure

We now explicitly summarize the QEP procedure. For clarity, we will from now on denote the output observables by ${\hat{O}}_{\ell }$. For a gradient-descent parameter update, we need to compute the derivative of the loss function,

$$\frac{\partial }{\partial {\theta }_{j}}{{\mathcal{L}}}(y,{y}^{{{\rm{target}}}}(x))=\sum_{\ell }\frac{\partial {{\mathcal{L}}}}{\partial {y}_{\ell }}\frac{\partial {y}_{\ell }}{\partial {\theta }_{j}}=\varepsilon \frac{\partial y}{\partial {\theta }_{j}}$$

(4)

in which the error signal vector has components ${\varepsilon }_{\ell }=\partial {{\mathcal{L}}}/\partial {y}_{\ell }$. For a mean-square-error loss function, we would have ε = 2(y − y^target(x)).

The QEP procedure for supervised learning can be summarized as follows. (i) Free phase: The nudging forces are off (ν_j = 0 for all j) and the output expectation values ${y}_{\ell }=\langle {\hat{O}}_{\ell }\rangle$ as well as the expectation values of all operators associated with trainable parameters ${\hat{A}}_{j}$ are measured in the ground state of the Hamiltonian $\hat{H}(x,\theta,0)$. (ii) Nudged phase: We compute the error signal ε and use it to nudge the Hamiltonian $\hat{H}(x,\theta,\nu=\beta \varepsilon )$ by switching on the couplings to the output observables, adding a term ${\sum }_{\ell }{\nu }_{\ell }{\hat{A}}_{\ell }$ to the Hamiltonian. The couplings are given by the vector ν = βε, in which β is a small parameter (keeping with the notation for classical EP¹⁷; this is unrelated to the inverse temperature). We again measure the expectation values of all observables ${\hat{A}}_{j}$. (iii) Parameter update: Using Onsager reciprocity, Eq. (3), $\partial \langle {\hat{O}}_{\ell }\rangle /\partial {\theta }_{j}=\partial \langle {\hat{A}}_{j}\rangle /\partial {\nu }_{\ell }$, we can approximate the gradient $\partial \langle {\hat{O}}_{\ell }\rangle /\partial {\theta }_{j}$ and hence arrive at

$$\frac{\partial }{\partial {\theta }_{j}}{{\mathcal{L}}}(\,y,{y}^{{{\rm{target}}}}(x))\approx \frac{\langle {\hat{A}}_{j}\rangle {| }_{\nu=\beta \varepsilon }-\langle {\hat{A}}_{j}\rangle {| }_{\nu=0}}{\beta }.$$

(5)

In a similar spirit as for classical Equilibrium Propagation, one may consider variants which combine positive and negative nudging³⁸, i.e., approximate the gradient using $(\langle {\hat{A}}_{j}\rangle {| }_{\nu=\beta \varepsilon }-\langle {\hat{A}}_{j}\rangle ){| }_{\nu=-\beta \varepsilon }/(2\beta )$, which empirically performs better for finite nudging.

Experimental requirements

We now discuss the most important practical considerations for implementing quantum equilibrium propagation (QEP) in any experimental platform.

Above all, the platform needs to have tuneable couplings θ, whose number preferably should be easy to scale up with growing system size. Such tuneable couplings have been developed for many quantum simulators³³ and quantum computing platforms by now. Examples include: (i) ion chains, for which spin-spin couplings can be mediated and engineered via the vibrational modes of the chain, employing suitable Raman transitions., e.g.³⁹; (ii) superconducting-qubit arrays, as employed for quantum computing, with current-tuneable couplers between neighbouring qubits⁴⁰ and tuneable qubit energies; (iii) neutral-atom Rydberg atom tweezer arrays providing tuneable spin-spin couplings⁴¹; (iv) strongly interacting atoms in optical lattices with spatially engineered potential energies, hopping amplitudes and interactions, e.g. based on holographic potential shaping⁴². Other platforms, e.g. in optomechanical arrays, coupled microwave cavities, or coupled laser arrays, also demonstrate interesting tuneable coupling schemes, but they often operate out of equilibrium and are therefore not directly suitable for QEP, unless one can map them back to an equilibrium situation.

Beyond this primary requirement of a scalable number of tuneable couplings, QEP platforms also need ways to apply the output forces ν. This demands local fields, e.g. effective magnetic fields or qubit detunings, easily available in most platforms that are flexible enough to support tuneable couplings. In addition, the expectation values of both the output operators and of the coupling operators connected to trainable degrees of freedom should be measurable. Regarding the couplings, we note that the expectation values of interaction terms of the form ${\hat{X}}_{j}{\hat{X}}_{k}$ or similar can easily be measured even by observing the spin operators ${\hat{X}}_{j}$ individually (and multiplying outcomes). A Heisenberg-type coupling operator ${\sum }_{\alpha=x,y,z}{\hat{\alpha }}_{j}{\hat{\alpha }}_{k}$ would need three separate measurements, for the x,y,z components, performed in independent shots of the experiment, eventually obtaining the expectation value of the complete operator. Alternatively, one could carry out a collective (two-qubit) measurement. The latter is typically performed via an ancilla, as demonstrated for syndrome extraction in quantum error correction schemes, and therefore requires more experimental effort. We note that the statistical nature of quantum physics in any case requires many runs of the experiment to measure the expectation values of output variables (needed for inference) and of coupling operators (needed for training). Each of these runs involves an equilibration step.

Finally, QEP requires efficient experimental means to approach the equilibrium state, i.e., for the zero-temperature limit, the ground state $\left\vert \Psi (\lambda )\right\rangle$ of the Hamiltonian. Before discussing physical equilibration, we note that mathematically/ computationally the task of ground state search for arbitrary Hamiltonians can become hard both for the classical case (NP-hard for general local spin glass Hamiltonians) and for the quantum case (QMA-hard, i.e., hard even for quantum computers in some instances). At the same time, recently, it has been shown⁴³ that there exist local quantum Hamiltonians for which it would be classically hard to obtain the ground state that nevertheless can be reached via thermalization. Coming back to physical equilibration dynamics, the situation in quantum EP is analogous to classical EP, in which equilibration needs to be studied on a case-by-case basis. Fortunately, efficient experimental ground-state preparation of complex quantum many-body Hamiltonians is one of the most intensively researched questions in quantum simulation and quantum computing, mirroring analogous progress in classical equilibration⁴⁴. For the purposes of QEP, we distinguish between two options: (i) hybrid approaches, for which an external digital computer is employed during equilibration, and (ii) purely autonomous schemes. Hybrid approaches could rely on variational quantum eigensolvers⁴⁵, for which an ansatz quantum circuit with continuously parametrized unitary gates is performed, the expectation value of the Hamiltonian is measured, and a classical optimization is performed to adapt parameters of the circuit. At first glance, the use of a classical optimizer to find the quantum ground state in this way might seem to contradict the basic premise of QEP or neuromorphic computing in general, i.e., using a physical system to do information processing. However, if the problem setting makes efficient use of the resulting quantum many-body ground state, even such a hybrid approach may still yield an advantage over an entirely classical digital device, in the same way that variational quantum eigensolvers are thought to be beneficial under the right circumstances as compared to numerical ground state search by classical algorithms.

Purely autonomous equilibration schemes get rid of any feedback loop. In principle, coupling to a cold environment can be sufficient, but recently, there has been active research into speeding up equilibration. The techniques put forward often rely in one way or another on variations of quantum reservoir engineering⁴⁶, which introduce dissipation deliberately, and which have, e.g., been used to stabilize quantum many-body states in superconducting circuits⁴⁷. This can happen in the form of suitable continuous driving (inspired by laser cooling), e.g. ref. ⁴⁸, or else in the form of quantum circuits that introduce gates coupling the quantum many-body system to ancilla qubits which then may be periodically reset ('digital quantum cooling', e.g. ref. ⁴⁹) or other schemes to reduce entropy, such as via suitable measurements⁵⁰.

Supervised learning: phase recognition in a quantum many-body system

We now move to illustrative applications of QEP. The input considered above is classical, and in the simplest setting QEP could be used to train a quantum device to perform an essentially classical machine learning task. However, we now turn to an important class of applications for which a QEP-trained system can effectively receive input that is quantum instead of classical.

In general, this setting can be realized by starting from a Hamiltonian ${\hat{H}}_{0}(x)$, whose quantum ground state we want to analyze with the help of a QEP setup. By tuning the classical parameters x (typically a few), we are able to realize different phases with different ground states, e.g., sweeping through some phase diagram in the case of a quantum many-body system. One task could consist in predicting, for any given x, the distinct quantum phase that the system assumes, possibly after seeing a few labelled training examples at a few parameter locations x. Another task could consist of approximating the entanglement entropy between some subsystems of ${\hat{H}}_{0}$ or predicting any other quantity of interest that can be derived in principle by inspecting the ground state but may be hard to extract directly by simple measurements.

Any of these tasks can be addressed via QEP in the following way. We couple the system of interest, described by ${\hat{H}}_{0}(x)$, to a trainable physical sensor, described by ${\hat{H}}_{{{\rm{sens}}}}(\theta )$, Fig. 2a. Overall, coming back to our previous definitions, we thus have the full QEP Hamiltonian $\hat{H}(x,\theta )={\hat{H}}_{0}(x)+\hat{V}(\theta )+{\hat{H}}_{{{\rm{sens}}}}(\theta )$, in which we assume that the couplings between the two systems reside inside $\hat{V}$ (see Fig. 2).

**Fig. 2: Supervised learning: Learning recognition of quantum phases using QEP.**

Ideally, the couplings inside the system of interest, ${\hat{H}}_{0}$, should be stronger than the couplings to the sensor and within that sensor. This will ensure that the system of interest is only weakly perturbed, while the recognition model can still react strongly to the features of the ground state $\left\vert {\psi }_{0}\right\rangle$ of ${\hat{H}}_{0}$. To ensure that the system-sensor couplings remain small in our numerical experiment (below), we enforce a soft cutoff on the couplings during the training (SI).

Distinguishing quantum phases can serve as an important application, and it has been considered before in the context of quantum machine learning based on gate-based quantum computers, for which a unitary circuit acts on a given ground state encoded in a multi-qubit register^51,52,53. In contrast to that, QEP relies on equilibration and moreover is far more general in the choice of systems—e.g. it does not require qubits as degrees of freedom, nor the ability to perform gates, nor any detailed knowledge of all aspects of the Hamiltonian.

Both for this task as well as others analyzed below, we choose the cluster Ising Hamiltonian (see e.g. ref. ⁵³) as an illustrative quantum many-body system,

$${\hat{H}}_{0}={g}_{{{\rm{ZXZ}}}}\sum_{j}{\hat{Z}}_{j-1}{\hat{X}}_{j}{\hat{Z}}_{j+1}-{g}_{{{\rm{ZZ}}}}\sum_{j}{\hat{Z}}_{j}{\hat{Z}}_{j+1}-{g}_{{{\rm{X}}}}\sum_{j}{\hat{X}}_{j},$$

(6)

in which $\hat{X},\hat{Y},\hat{Z}$ are the Pauli matrices. This model has three phases, including a topologically nontrivial one. We will now regard x = (g_ZXZ, g_ZZ, g_X) as the input parameters.

To enable classification of the phases, we couple a sensor to the chain. This sensor is made of qubits that are coupled in all possible ways among each other (2-local, with terms like ${\hat{X}}_{\alpha }{\hat{Y}}_{\beta }$ or Z_α etc) and to a limited region in the chain (couplings ${\hat{Z}}_{\alpha }{\hat{X}}_{j}$ etc., where j is a spin in the chain). All of these couplings are tuneable. The sensor qubits are measured in the Z-basis, and the resulting configuration is supposed to announce the detected phase. Here we use a mean-squared-error loss function, although a categorical cross-entropy would be suitable as well. We find that even a small-scale sensor of only two qubits, coupled to two spins in the chain, has sufficient expressivity to properly learn the known phase diagram of the cluster Ising model (Fig. 2b, c), when trained in a supervised fashion using QEP. We also confirm that the quantum nature of the coupling is important. This can be ascertained by comparing to a sensor that is only allowed to couple to $\hat{Z}$ operators in the chain, which performs much more poorly (Fig. 2g). By contrast, allowing coupling to non-commuting obervables yields the observed good accuracy. We also investigate the influence of the coupling strength between system and sensor (SI). With the soft cutoff we enforced to keep the couplings small, the system learns to predict the correct phases even when it is only trained on patches within each phase. This ability to generalize is less pronounced without the cutoff on the couplings since, in this case, the system-sensor coupling generally become large during training which disturbs the system spins so that their state substantially deviates from the unperturbed state.

We investigate the influence of measurement shot noise ('Methods') which, in practice, will make the gradient estimates noisy with fluctuations $\sim 1/\sqrt{M}$ with M the number of measurement shots. It is possible to counteract this effect by increasing the nudging strength, although this leads to a deviation from the linear response and hence the ideal gradients. As a result, there is a trade-off between shot noise and finite nudging which we analyze by inspecting the scalar product between the true and the estimated gradient (Fig. 2d, e).

Next, we demonstrate our method’s generality by applying it to a more complex model. This is particularly relevant since QEP is a physics-based training technique which in actual future experimental implementations, would not be hampered by the exponential growth of the Hilbert space, unlike classical simulations. Specifically, we numerically simulate the training of a phase sensor coupled to a strongly correlated two-dimensional spin model, namely the honeycomb Kitaev model⁵⁴

$${\hat{H}}_{{{\rm{Kitaev}}}}=\sum_{\alpha }\sum_{{{\bf{r}}}}{J}_{\alpha }{\hat{\alpha }}_{A,{{\bf{r}}}}{\hat{\alpha }}_{B,{{\bf{r}}}+{{{\bf{r}}}}_{\alpha }}$$

(7)

with α ∈ {X, Y, Z } and r_α defined according to Fig. 3a. The model has three Abelian phases for J_X ≥ J_Y + J_Z, J_Y ≥ J_Z + J_X, or J_Z ≥ J_X + J_Y, respectively, with other parameter values belonging to a non-Abelian phase. We train sensor spins coupled to ${\hat{H}}_{{{\rm{Kitaev}}}}$ ('Methods') to distinguish the Abelian from the non-Abelian phases. Figure 3b shows the single-shot accuracy during the training. The best performing phase sensor achieved a single-shot accuracy of 89.4 % and a many-queries accuracy of 100 % on the evaluation set shown as inset of Fig. 3b ('Methods'). The corresponding phase diagram with the predicted phases is depicted in Fig. 3c.

**Fig. 3: Phase sensor for the honeycomb Kitaev model.**

Unsupervised learning: phase exploration and sensitivity optimization

We now discuss two applications of unsupervised learning for which the gradients are used for optimization tasks.

Phase exploration

In this first example, we would like to explore the phase diagram of a quantum many-body system. In practice, this could be an interesting task for characterizing the capabilities of a quantum simulator when we would like to explore the phase diagram of a system which is partially unknown (e.g. some Hamiltonian terms are not known or cannot be tuned) and computationally hard to simulate classically. We can nevertheless ask whether phases exist that maximize (or minimize) certain expectation values and use QEP to explore the phase diagram and find such regimes. In this scenario, the relevant QEP-Hamiltonian of this unsupervised learning task is $\hat{H}(\theta,\nu )$, not containing any inputs x.

Figure 4 a illustrates the procedure: starting from an initial set of parameters θ (a point in the phase diagram), we optimize θ by computing the gradients with QEP to find the (potentially local) maximum of the expectation value of interest. The only difference in the QEP procedure here compared to the previous case is that the derivative of the simple loss function ${{\mathcal{L}}}=-y$ w.r.t. the output variable yields a trivial error signal of −1 which can then be inserted in the gradient calculation above according to Eq. (5).

**Fig. 4: Unsupervised learning applications: phase exploration and sensitivity optimization for sensing.**

Here, we first exemplify the procedure by examining a slice through the phase diagram of the cluster Ising Hamiltonian (6) when we fix ${g}_{ZXZ}=-0.5\equiv {{\rm{const.}}}$ We optimize the Néel order parameter $y\equiv \langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle$, such that the loss function is simply ${{\mathcal{L}}}(y)\equiv -\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle$. We show two example trajectories in Fig. 4b. In both cases, the trajectories quickly converge to the g_ZZ = 0 line and then move along it, going into the paramagnetic phase. Accordingly, the loss function, Fig. 4c, rapidly decreases, with the decrease becoming slower as the trajectory moves along the g_ZZ = 0 line.

We also consider phase exploration for the honeycomb Kitaev model with an additional magnetic field in [1, 1, 1] direction breaking integrability

$$\hat{H}={\hat{H}}_{{{\rm{Kitaev}}}}+h \sum_{\beta }\sum_{{{\bf{r}}}}({\hat{X}}_{\beta,{{\bf{r}}}}+{\hat{Y}}_{\beta,{{\bf{r}}}}+{\hat{Z}}_{\beta,{{\bf{r}}}}).$$

(8)

We use QEP to explore how the expectation value of the flux operator $y\equiv \langle {\hat{W}}_{1}\rangle=\langle {\hat{X}}_{1}{\hat{Y}}_{2}{\hat{Z}}_{3}{\hat{X}}_{4}{\hat{Y}}_{5}{\hat{Z}}_{6} \rangle$, Fig. 4d, changes. While for h = 0, $\langle {\hat{W}}_{1}\rangle=1$ in the ground-state sector, this is no longer the case for h ≠ 0 (see Methods). One can apply QEP to find the J_α that maximize $\langle {\hat{W}}_{1}\rangle$ (Fig. 4e, f), and in an experiment this could be used for systems of any size, including arbitrary additional terms in the Hamiltonian.

We suggest that, in general, this approach could be an efficient technique for exploring higher dimensional phase diagrams which cannot simply be mapped out without considerable effort by sweeping all of the parameters.

Sensitivity optimization

In the second unsupervised learning application, the aim is to maximize the derivative of some expectation value of interest w.r.t. a certain parameter θ_j. As Fig. 4g illustrates, in contrast to the previous examples, we now start with two points, θ⁽¹⁾ and θ⁽²⁾, in the phase diagram at which we compute expectation values ${y}_{1,2}\equiv \langle {\hat{A}}_{\ell }\rangle {| }_{{\theta }^{(1,2)}}$ and maximize the slope calculated from the difference quotient of the output expectation values at these two points. Concretely, the corresponding loss function has the form

$${{\mathcal{L}}}(\,{y}_{1},{y}_{2})=-\left| \frac{\langle {\hat{A}}_{\ell }\rangle {| }_{{\theta }^{(1)}}-\langle {\hat{A}}_{\ell }\rangle {| }_{{\theta }^{(2)}}}{{\theta }_{j}^{(1)}-{\theta }_{j}^{(2)}}\right| .$$

(9)

An appealing application of this procedure could be to find the optimal working point of sensors, such as magnetic field sensors.

To illustrate this scheme, we again consider the cluster Ising Hamiltonian (6) and optimize the slope of $\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle$ w.r.t. the parameter g_X, which is proportional to the magnetic field. Figure 4h shows two sets of example trajectories and the corresponding loss functions, Fig. 4i. We observe that the trajectories converge to the phase boundary, where the slope is largest. For the first run, Fig. 4j, we see that the trajectory moves along the phase boundary, since the slope is larger for smaller g_ZZ. We suggest that, in the future, this feature could be exploited to more generally map out phase boundaries.

Discussion

We have exploited Onsager reciprocity to derive a quantum version of equilibrium propagation, which in its classical form is currently arguably the most widely studied general training technique for neuromorphic platforms. We have shown that this can be used successfully also for situations in which the input is effectively a quantum state (as in classifying quantum phases via supervised learning), as well as for unsupervised learning tasks (related to exploring the phase diagrams of quantum simulators or suited for optimizing sensing capabilities). In all of these cases, QEP can be applied even when the Hamiltonian is hard or impossible to simulate classically, since QEP extracts gradients via the physical response. In addition, it can be employed in cases in which only some aspects of the system are well characterized or accessible while other parts of the Hamiltonian are unknown. An important requirement is being able to reach the ground state experimentally, and we explained various options in the section on experimental requirements. Beyond the tasks analyzed in this manuscript, we can envisage further possibilities. One might train a tuneable quantum simulator to realize an arbitrary phase diagram, which is 'sketched', i.e., the phase diagram is prescribed in parts. Moreover, one might train it to approximately match the phase diagram and overall behaviour of another experimentally accessible quantum system, producing a quantum 'twin' and realizing the original promise of quantum simulations. A large variety of experimental platforms should be amenable to implementations of quantum equilibrium propagation, including highly tuneable systems based on trapped ions, Rydberg atoms, strongly interacting atoms in optical lattices, and superconducting qubit arrays. This will enable the transformation of many quantum simulators into learning machines, opening up a novel avenue for these intensively studied platforms.

During the final stage of completion of this manuscript and shortly before submission, two related preprints appeared on the arXiv^55,56, also introducing a quantum version of equilibrium propagation, with different use cases.

Methods

Supervised learning example

We considered a cluster Ising chain of N = 8 spins, and a sensor of 2 qubits, such that the total Hilbert space is 1024-dimensional. This allows efficient exact diagonalization using the Lanczos algorithm applied to sparse matrices, to find the ground state of the coupled system. The three considered output operators are projectors onto three states, each with a definite combination of the sensor operators ${\hat{Z}}_{{1}^{{\prime} }}$ and ${\hat{Z}}_{{2}^{{\prime} }}$, as shown in the figure; e.g., ${\hat{P}}_{1,-1}=(1+{\hat{Z}}_{{1}^{{\prime} }})(1-{\hat{Z}}_{{2}^{{\prime} }})/4$ projects onto the combination (+1, −1) and would be used to indicate the ferromagnetic phase, which is reached when g_ZZ dominates. The expectation value is correspondingly the probability to observe this particular combination in a projective measurement of these two operators. The training samples are drawn uniformly from the triangular phase diagram shown in the figure, in which (following the convention in the literature) we set the sum of all three coupling parameters to 4. The phase boundaries are known for this benchmark quantum many-body model⁵³, which allows us to provide the correct labels: in each phase, one of the projectors is 1, while the other two are 0 ('one-hot-encoding'), and the assignment of the three combinations of $({Z}_{{1}^{{\prime} }},{Z}_{{2}^{{\prime} }})$ to the three phases is defined in a fixed arbitrary way.

Gradients are obtained using the symmetric nudging procedure, by adding the nudged output operators ${\sum }_{j}{\nu }_{\ell }{\hat{A}}_{\ell }$ to the Hamiltonian, in which in our case ${\hat{A}}_{\ell }={\hat{P}}_{\ell }$ and ℓ ∈ {(+1, +1), (+1, −1), (−1, −1)}. The ground state for the nudged Hamiltonian (for both signs of nudging) is re-calculated using sparse Lanczos diagonalization. Interestingly, our experiments have shown that this is numerically more efficient than attempting to obtain the exact linear response using first-order perturbation theory, which involves solving a linear system of equations when applying ${(E-\hat{H})}^{-1}$ to the perturbed ground state. After the approximate gradient has been obtained using QEP, we use it inside an Adam adaptive gradient descent optimizer to update the parameters—which would be possible also for the QEP gradients obtained in real experiments. Just like for usual training of artificial neural networks, we group training samples into batches and actually employ the batch-averaged gradient. The learning rate employed in the numerical examples was set to 0.01.

Test accuracies are measured on a test set of 200 points that are also uniformly randomly distributed across the triangular space and which are fixed before the training run. We distinguish two measures of accuracy: In the 'many queries' version, we imagine that one would obtain the expectation values of the three projectors by simply measuring ${\hat{Z}}_{{1}^{{\prime} }}$ and ${\hat{Z}}_{{2}^{{\prime} }}$ simultaneously. After performing many measurement shots, this will yield the measurement probabilities for the three different combinations of outcomes {(+1, +1), (+1, −1), (−1, −1)}, corresponding to the three different projectors. We then assign as official outcome the phase whose associated projector has the largest expectation value (largest measurement probability). This is similar to how accuracy would be assessed for classical machine-learning classification models, by identifying the label of maximum probability and comparing it against the true label. In the 'single shot' version, we imagine to run inference only once (equilibrating to the ground state once) and performing a single measurement of the two sensor operators. The outcome will be declared correct if the combination matches the correct label of the true underlying phase for this parameter combination. Single-shot performance is more difficult, but even so the training results show that single-shot accuracy can also reach relatively high values. Repetition, e.g. using three shots and taking a majority vote when possible, will quickly boost the accuracy (until it reaches the 'many queries' result in the limit of many shots).

Influence of measurement shot noise

An important general aspect of QEP training is the unavoidable projection shot noise encountered in any quantum experiment, in which expectation values are obtained by repeated measurements with individually discrete outcomes—in contrast to the classical situation.

For assessing the influence of shot noise, we replace the exact expectation values by numerical values that are drawn from a Gaussian distribution centred around that value, with the correct variance ${{\rm{Var}}}{\hat{A}}_{\ell }/M$, in which M is the number of measurement shots (recall each of those will usually require a new equilibration, unless one adopts some of the strategies mentioned in the main text). Replacing the true distribution by a Gaussian is a reasonable approximation unless M is very small.

Measurement shot noise hampers training, since the estimate of the gradient is noisy, with fluctuations $\sim 1/\sqrt{M}$, in which each of the M shots requires a renewed equilibration. Fortunately, we find that this effect can be counteracted by employing an increased finite nudging strength β, effectively boosting the contrast in estimating the response of expectation values. Finite nudging, however, leads to a deviation from the linear response that would yield the ideal gradients. Therefore ultimately there is a sweet spot, balancing shot noise vs. nonlinearity of the response, to obtain optimized training convergence. This can be analyzed by inspecting the scalar product between the true gradient direction and the estimated gradient (Fig. 2d, e of the main text). In our numerical experiments, we find that the goal of minimizing the total number of experimental runs (i.e. number of training samples multiplied by number of shots per sample) is best achieved by keeping M small and simply taking more samples. Apparently, this leads to more variety in the observed training data and better training performance. In experiments, the number of shots M could also be reduced by coupling multiple sensors at different locations to the system, having them share their coupling values. Finally, in principle, there is an alternative to averaging over projective measurements, namely performing weak continuous measurements of the expectation values (a weakly coupled sensor is able to realize this). This could be performed without rethermalizing to the ground state.

Aspects of scalability

The numerical simulation results we show in the main text serve the purpose of illustrating the method QEP, but are naturally limited in the size of the systems we can treat, due to the exponential growth of the Hilbert space. This exponential growth of the numerical effort is also the reason why a directly physics-based training method such as QEP is attractive, since in the eventual application to experiments the method will precisely obviate the need for such challenging numerical simulations.

In the experiment, there are no constraints that would prevent the application of QEP to larger systems, as the number of measurements needed for QEP is independent of the system size. We note that in typical modern quantum simulator platforms care is taken to construct setups that require only a single experimental run to perform a measurement of all (commuting) observables simultaneously; a typical example being a single snapshot of ultracold atoms in an optical lattice revealing a projective measurement of the atom number in all lattice sites. Therefore, scaling up the system size does not in itself lead to an increase in the number of measurements. That being said, the total number of measurements is the product of the number of experiments that need to be done under varying conditions and the number of measurement shots per experiment. While straightforward parameter shift methods would require a number of experiments scaling linearly in the (potentially large) number of trainable parameters, the QEP method only requires a single linear-response experiment (or only a small fixed number of order 1) to physically extract the training gradient independent of system size.

Taking measurement shot-noise into account, one would expect that one has to perform around 100 measurement shots or more to get a reasonably accurate expectation. However, our numerical experiments reported in the main text show that the stochastic gradient descent still works very well even for a small number of shots. Regardless, the required number of shots does not increase with system size, and likewise the number of experiments remains the same (on order 1), independent of system size.

Finally, the required number of different training samples normally also does not increase with system size. In a supervised training scenario of the type envisaged in our manuscript, e.g., for recognizing the phase of the quantum many-body ground state, the number of training samples will determine how densely one can sample the phase diagram of the system. However, larger system sizes do not require a larger number of training samples, since the learning tasks remains essentially the same (distinguishing a finite number of potential phases).

Technical details of the phase sensor for the Kitaev honeycomb model

We simulate a lattice of 8 × 2 spins which is coupled to two additional sensor spins, Fig. 3a, i.e., 8 plaquettes, under periodic boundary conditions. Since we perform exact diagonalisation to compute expectation values in the ground state, we choose a system size that allows to perform the calculations within a reasonable time frame. We note that the envisioned implementation of the system with a quantum simulator would not have the same limitations. We choose this particular 8 × 2 lattice to keep the impact of finite size effects as small as possible. In particular, without the correct boundary conditions, contrary to what is expected from the exact solution in the thermodynamic limit $\langle {\hat{W}}_{1}\rangle \ne 1$ at finite system size. This is because the quantization of $\langle {\hat{W}}_{1}\rangle$ is based on Lieb’s theorem which only applies in the limit of an infinite system or with the correct, twisted periodic boundary conditions at finite size⁵⁷. Limiting the total number of system spins to 16, we chose to simulate a system of 8 × 2 spins for which the normal periodic boundary conditions correspond to the twisted boundary conditions needed so that Lieb’s theorem holds and $\langle {\hat{W}}_{1}\rangle$ is quantised at finite system size.

We consider couplings between the two sensor spins (all possible combinations of Paulis) as well as couplings to the system spins of the Kitaev model via the operators ${\hat{X}}_{A,{{\bf{r}}}}{\hat{X}}_{B,{{\bf{r}}}+{{{\bf{r}}}}_{X}}$, ${\hat{Y}}_{A,{{\bf{r}}}}{\hat{Y}}_{B,{{\bf{r}}}+{{{\bf{r}}}}_{Y}}$, ${\hat{Z}}_{A,{{\bf{r}}}}{\hat{Z}}_{B,{{\bf{r}}}+{{{\bf{r}}}}_{Z}}$ of one plaquette, Fig. 3a. See the Supplementary Material for further details about the numerical implementation.

Different approaches for nudging

In classical EP, the idea is to add the loss function ${{\mathcal{L}}}(y,{y}^{{{\rm{target}}}}(x))$ to the energy, multiplied by β, in which β is a small constant. Instead, in the main text we advocated simply adding ${\sum }_{\ell }{\nu }_{\ell }{\hat{A}}_{\ell }$ to the Hamiltonian, in which ${\hat{A}}_{\ell }$ are the output operators and ${\nu }_{\ell }=\beta \partial {{\mathcal{L}}}/\partial {y}_{\ell }$ is the nudge force, since for small β this produces the force needed to elicit the linear response required for the gradient. If we were instead to translate directly the classical EP prescription to the quantum Hamiltonian, we could add to the Hamiltonian a term $\beta {{\mathcal{L}}}(\,\hat{y},{y}^{{{\rm{target}}}}(x))$, with the operator version of the outputs, ${\hat{y}}_{\ell }={\hat{A}}_{\ell }$ for $\ell \in {{{\mathcal{S}}}}_{{{\rm{out}}}}$, replacing the expectation values y_ℓ. Expanding this term to linear order in ${\hat{A}}_{\ell }$, we would obtain the same result as our ansatz. The higher-order terms in $\beta {{\mathcal{L}}}$ would lead to further corrections to the Hamiltonian, and depending on the quantum fluctuations in ${\hat{A}}_{\ell }$ these might be as large as the linear-order term itself (e.g. for mean-squared-error loss functions, these could correspond to a stiffening of the potential acting on ${\hat{A}}_{\ell }$). It is difficult to assess a priori the effect of these fluctuations arising from such an ansatz $\beta {{\mathcal{L}}}(\,\hat{y},{y}^{{{\rm{target}}}}(x))$. Furthermore, these higher-order terms might also be more challenging to implement experimentally, depending on the shape of ${\hat{A}}_{\ell }$. Since they are not needed to evaluate the gradient, we chose the procedure explained in the main text, adding only linear terms to the Hamiltonian. It would be interesting in the future to compare the various approaches for finite nudging, when β is not small (which is actually the case for the numerical experiments).

Unsupervised learning examples

Phase exploration

We again consider a cluster Ising Hamiltonian of 10 spins, and search for the points in the phase diagram that maximize $\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle$, hence, ${\hat{X}}_{0}{\hat{X}}_{4}$ is the output operator. This is expected to be maximized in the paramagnetic phase. Initial parameters will typically be chosen randomly. For illustrative purposes, we manually select the initial parameters, i.e. the starting points of the trajectories in Fig. 4b. Concretely, the two starting points are g_X = −0.1, g_ZZ = 0.4 (run 1) and g_X = 0.9, g_ZZ = 0.9 (run 2). We obtain gradients according to the QEP procedure by switching on the coupling to the output operator $\nu {\hat{X}}_{0}{\hat{X}}_{4}$. As in the supervised learning example, the ground state is computed using sparse Lanczos diagonalization. For the results shown in Fig. 4b, both the learning rate and the nudge parameter ν were set to 0.1.

Sensitivity optimization

In the second unsupervised learning example, we search for the largest slope of $\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle$ w.r.t. g_X in the phase diagram of the cluster Ising Hamiltonian (6) with 10 spins by optimizing the loss function (9). Concretely, we consider two different values of ${g}_{X}^{(1,2)}$ while the other parameters are the same (g_ZZ is trained while ${g}_{ZXZ}=-\!0.5\equiv {{\rm{const.}}}$).

During one step of the training, we update ${g}_{X}^{(1)}$, ${g}_{X}^{(2)}$ and g_ZZ. To obtain the necessary gradients, we need to compute the following derivatives (for brevity we denote ∂/∂θ_j by ${\partial }_{{\theta }_{j}}$)

$${\partial }_{{g}_{X}^{(1)}}{{\mathcal{L}}}= \varepsilon \,\frac{{\partial }_{{g}_{X}^{(1)}}\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(1)}}}{| {g}_{X}^{(1)}-{g}_{X}^{(2)}| }\\ +{{\rm{sgn}}}\,({g}_{X}^{(1)}-{g}_{X}^{(2)})\frac{{{\mathcal{L}}}}{| {g}_{X}^{(1)}-{g}_{X}^{(2)}| }$$

(10)

$${\partial }_{{g}_{X}^{(2)}}{{\mathcal{L}}}= -\varepsilon \,\frac{{\partial }_{{g}_{X}^{(2)}}\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(2)}}}{| {g}_{X}^{(1)}-{g}_{X}^{(2)}| }\\ -{{\rm{sgn}}}\,({g}_{X}^{(1)}-{g}_{X}^{(2)})\frac{{{\mathcal{L}}}}{| {g}_{X}^{(1)}-{g}_{X}^{(2)}| }$$

(11)

$${\partial }_{{g}_{ZZ}}{{\mathcal{L}}}=\varepsilon \,{\partial }_{{g}_{ZZ}}\frac{\left(\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(1)}}-\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(2)}}\right)}{| {g}_{X}^{(1)}-{g}_{X}^{(2)}| }$$

(12)

with $\varepsilon \equiv {{\rm{sgn}}}\,[\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(1)}}-\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(2)}}]$. In all of the expressions, we use QEP to extract ${\partial }_{{\theta }_{j}}\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{{g}_{X}^{(1,2)}}$ with ${\theta }_{j}\in \{{g}_{X}^{(1)},{g}_{X}^{(2)},{g}_{ZZ}\}$.

To that end, as outlined in the main text, we approximate the derivative by comparing the nudged and the free expectation value

$$\frac{\partial }{\partial {\theta }_{j}}\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle \approx \frac{\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{\nu=\beta \varepsilon }-\langle {\hat{X}}_{0}{\hat{X}}_{4}\rangle {| }_{\nu=0}}{\beta }.$$

(13)

As before, to evaluate the expectation value in the nudged phase, we couple to the output operator by adding $\nu {\hat{X}}_{0}{\hat{X}}_{4}$ to the Hamiltonian. Ground states in the free and the nudged phase are again computed using sparse Lanczos diagonalization.

To illustrate the procedure, we consider two different starting points in the phase diagram: ${g}_{X}^{(1)}=-0.2$, ${g}_{X}^{(2)}=-1.5$, g_ZZ = −1.5 (run 1; each point starts in a different phase) and ${g}_{X}^{(1)}=-0.5$, ${g}_{X}^{(2)}=0.3$, g_ZZ = 1.0 (run 2; both points are in the same phase). Both the learning rate and the nudging parameter were set to 0.1.

Further details about phase exploration with the Kitaev honeycomb model

In the limit h = 0, the Kitaev honeycomb model, Eq. (8) can be mapped to a bi-linear Majorana fermion model and is exactly solvable⁵⁴. In that case, the flux operator defined on one plaquette, e.g. ${\hat{W}}_{1}={\hat{X}}_{1}{\hat{Y}}_{2}{\hat{Z}}_{3}{\hat{X}}_{4}{\hat{Y}}_{5}{\hat{Z}}_{6}$, has eigenvalues ±1 with the +1 eigenvalue corresponding to the ground state sector in which $\langle {\hat{W}}_{1}\rangle=1$. Switching on the magnetic field h ≠ 0, the model is not solvable and $\langle {\hat{W}}_{1}\rangle$ is no longer quantized, so we use QEP in the main text to explore how $\langle {\hat{W}}_{1}\rangle$ changes as a function of J_α; in particular, we fix $h=0.05\equiv {{\rm{const.}}}$ and are looking for the parameters J_α that maximize $y\equiv \langle {\hat{W}}_{1}\rangle$. Note that any other plaquette would yield the same behaviour due to translational symmetry.

Data availability

The numerical data produced in this work are available from the repository at https://github.com/ClaraWanjura/QuantumEP.

Code availability

The code to generate the results discussed in this work is available from the GitHub repository⁵⁸ at https://github.com/ClaraWanjura/QuantumEP and Zenodo⁵⁹ at https://doi.org/10.5281/zenodo.15741281.

References

Marković, D., Mizrahi, A., Querlioz, D. & Grollier, J. Physics for neuromorphic computing. Nat. Rev. Phys. 2, 499 (2020).
Article Google Scholar
Wetzstein, G. et al. Inference in artificial intelligence with deep optics and photonics. Nature 588, 39 (2020).
Article ADS CAS PubMed Google Scholar
Shastri, B. J. et al. Photonics for artificial intelligence and neuromorphic computing. Nat. Photonics 15, 102 (2021).
Article ADS CAS Google Scholar
Wright, L. G. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Momeni, A. et al. Training of physical neural networks, https://arxiv.org/abs/2406.03372 (2024).
Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures, in Advances in Neural Information Processing Systems, 31, edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N. and Garnett, R. https://proceedings.neurips.cc/paper_files/paper/2018/file/63c3ddcc7b23daa1e42dc41f9a44a873-Paper.pdf (Curran Associates, Inc., 2018).
Psaltis, D., Brady, D., Gu, X.-G. & Lin, S. Holography in artificial neural networks. Nature 343, 325 (1990).
Article ADS CAS PubMed Google Scholar
Guo, X., Barrett, T. D., Wang, Z. M. & Lvovsky, A. Backpropagation through nonlinear units for the all-optical training of neural networks. Photonics Res. 9, B71 (2021).
Article Google Scholar
Spall, J., Guo, X. & Lvovsky, A. I. Training neural networks with end-to-end optical backpropagation. Adv. Photonics 7, 016004 (2025).
Article CAS Google Scholar
Hughes, T. W., Minkov, M., Shi, Y. & Fan, S. Training of photonic neural networks through in situ backpropagation and gradient measurement. Optica 5, 864 (2018).
Article ADS Google Scholar
Pai, S. et al. Experimentally realized in situ backpropagation for deep learning in photonic neural networks. Science 380, 398 (2023).
Article ADS CAS PubMed Google Scholar
Momeni, A., Rahmani, B., Malléjac, M., del Hougne, P. & Fleury, R. Backpropagation-free training of deep physical neural networks. Science 382, 1297 (2023).
Article ADS MathSciNet CAS PubMed Google Scholar
Wanjura, C. C. & Marquardt, F. Fully nonlinear neuromorphic computing with linear wave scattering. Nat. Phys. 20, 1434 (2024).
Article CAS Google Scholar
Yildirim, M., Dinc, N. U., Oguz, I., Psaltis, D. & Moser, C. Nonlinear processing with linear optics. Nat. Photonics 18, 1076 (2024).
Article CAS PubMed PubMed Central Google Scholar
Xia, F. et al. Nonlinear optical encoding enabled by recurrent linear scattering. Nat. Photonics 18, 1067 (2024).
Article CAS PubMed PubMed Central Google Scholar
López-Pastor, V. & Marquardt, F. Self-learning machines based on Hamiltonian Echo Backpropagation. Phys. Rev. X 13, 031020 (2023).
Google Scholar
Scellier, B. & Bengio, Y. Equilibrium propagation: bridging the gap between energy-based models and backpropagation. Front. Comput. Neurosci. 11, 24 (2017).
Article PubMed PubMed Central Google Scholar
Scellier, B. A deep learning theory for neural networks grounded in physics, arXiv preprint arXiv:2103.09985 (2021).
Kendall, J., Pantone, R., Manickavasagam, K., Bengio, Y. and Scellier, B. Training end-to-end analog neural networks with equilibrium propagation, arXiv preprint arXiv:2006.01981 (2020).
Martin, E.et al. Eqspike: spike-driven equilibrium propagation for neuromorphic implementations. Iscience 24, 102222 (2021).
Ernoult, M., Grollier, J., Querlioz, D., Bengio, Y. and Scellier, B. Equilibrium propagation with continual weight updates, arXiv preprint arXiv:2005.04168 (2020).
Scellier, B., Mishra, S., Bengio, Y. and Ollivier, Y. Agnostic physics-driven deep learning, https://arxiv.org/abs/2205.15021 (2022).
Falk, M. J., Strupp, A. T., Scellier, B. & Murugan, A. Temporal contrastive learning through implicit non-equilibrium memory. Nat. Commun. 16, 2163 (2025).
Article CAS PubMed PubMed Central Google Scholar
Wang, Q., Wanjura, C. C. & Marquardt, F. Training coupled phase oscillators as a neuromorphic platform using equilibrium propagation. Neuromorphic Comput. Eng. 4, 034014 (2024).
Article Google Scholar
Stern, M., Hexner, D., Rocks, J. W. & Liu, A. J. Supervised learning in physical networks: From machine learning to learning machines. Phys. Rev. X 11, 021045 (2021).
CAS Google Scholar
Stern, M., Liu, A. J. & Balasubramanian, V. Physical effects of learning. Phys. Rev. E 109, 024311 (2024).
Article ADS CAS PubMed Google Scholar
O’Connor, P., Gavves, E. and Welling, M. Training a spiking neural network with equilibrium propagation, in Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, 89, edited by Chaudhuri, K. and Sugiyama, M. https://proceedings.mlr.press/v89/o-connor19a.html pp. 1516–1523 (PMLR, 2019).
Dillavou, S., Stern, M., Liu, A. J. & Durian, D. J. Demonstration of decentralized physics-driven learning. Phys. Rev. Appl. 18, 014040 (2022).
Article ADS CAS Google Scholar
Dillavou, S.et al. Machine learning without a processor: Emergent learning in a nonlinear electronic metamaterial, arXiv preprint arXiv:2311.00537 (2023).
Altman, L. E., Stern, M., Liu, A. J. & Durian, D. J. Experimental demonstration of coupled learning in elastic networks. Phys. Rev. Appl. 22, 024053 (2024).
Article CAS Google Scholar
Oh, S., An, J., Cho, S., Yoon, R. & Min, K.-S. Memristor crossbar circuits implementing equilibrium propagation for on-device learning. Micromachines 14, 1367 (2023).
Article PubMed PubMed Central Google Scholar
Laydevant, J., Marković, D. & Grollier, J. Training an Ising machine with equilibrium propagation. Nat. Commun. 15, 3671 (2024).
Article ADS CAS PubMed PubMed Central Google Scholar
Altman, E. et al. Quantum simulators: architectures and opportunities. PRX quantum 2, 017003 (2021).
Article Google Scholar
Schuld, M., Sinayskiy, I. & Petruccione, F. An introduction to quantum machine learning. Contemp. Phys. 56, 172 (2015).
Article ADS Google Scholar
Biamonte, J. et al. Quantum machine learning. Nature 549, 195 (2017).
Article ADS CAS PubMed Google Scholar
Onsager, L. Reciprocal relations in irreversible processes. I. Phys. Rev. 37, 405 (1931).
Article ADS CAS Google Scholar
Kubo, R. The fluctuation-dissipation theorem. Rep. Prog. Phys. 29, 255 (1966).
Article ADS CAS Google Scholar
Scellier, B., Ernoult, M., Kendall, J. and Kumar, S. Energy-based learning algorithms for analog computing: a comparative study, https://arxiv.org/abs/2312.15103 (2023).
Kim, K. et al. Entanglement and tunable spin-spin couplings between trapped ions using multiple transverse modes. Phys. Rev. Lett. 103, 120502 (2009).
Article ADS CAS PubMed Google Scholar
Chen, Y. et al. Qubit architecture with high coherence and fast tunable coupling. Phys. Rev. Lett. 113, 220502 (2014).
Article ADS PubMed Google Scholar
Steinert, L.-M. et al. Spatially tunable spin interactions in neutral atom arrays. Phys. Rev. Lett. 130, 243001 (2023).
Article ADS CAS PubMed Google Scholar
Bakr, W. S., Gillen, J. I., Peng, A., Fölling, S. & Greiner, M. A quantum gas microscope for detecting single atoms in a Hubbard-regime optical lattice. Nature 462, 74 (2009).
Article ADS CAS PubMed Google Scholar
Chen, C.-F., Huang, H.-Y., Preskill, J. and Zhou, L. Local minima in quantum systems, in Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC ’24). (Vancouver, 2024).
Kalinin, K. P. & Berloff, N. G. Global optimization of spin hamiltonians with gain-dissipative systems. Sci. Rep. 8, 17791 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Tilly, J. et al. The variational quantum eigensolver: a review of methods and best practices. Phys. Rep. 986, 1 (2022).
Article ADS MathSciNet Google Scholar
Poyatos, J., Cirac, J. I. & Zoller, P. Quantum reservoir engineering with laser cooled trapped ions. Phys. Rev. Lett. 77, 4728 (1996).
Article ADS CAS PubMed Google Scholar
Ma, R. et al. A dissipatively stabilized Mott insulator of photons. Nature 566, 51 (2019).
Article ADS CAS PubMed Google Scholar
Raghunandan, M., Wolf, F., Ospelkaus, C., Schmidt, P. O. & Weimer, H. Initialization of quantum simulators by sympathetic cooling. Sci. Adv. 6, eaaw9268 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Polla, S., Herasymenko, Y. & O’Brien, T. E. Quantum digital cooling. Phys. Rev. A 104, 012414 (2021).
Article ADS MathSciNet CAS Google Scholar
Cotler, J. et al. Quantum virtual cooling. Phys. Rev. X 9, 031013 (2019).
CAS Google Scholar
Cong, I., Choi, S. & Lukin, M. D. Quantum convolutional neural networks. Nat. Phys. 15, 1273 (2019).
Article CAS Google Scholar
Herrmann, J. et al. Realizing quantum convolutional neural networks on a superconducting quantum processor to recognize quantum phases. Nat. Commun. 13, 4144 (2022).
Article ADS PubMed PubMed Central Google Scholar
Liu, Y.-J., Smith, A., Knap, M. & Pollmann, F. Model-independent learning of quantum phases of matter with quantum convolutional neural networks. Phys. Rev. Lett. 130, 220603 (2023).
Article ADS MathSciNet CAS PubMed Google Scholar
Kitaev, A. Anyons in an exactly solved model and beyond. Ann. Phys. 321, 2 (2006).
Article ADS MathSciNet CAS Google Scholar
Massar, S. and Mognetti, B. M. Equilibrium propagation: the quantum and the thermal cases, https://arxiv.org/abs/2405.08467 (2024).
Scellier, B. Quantum equilibrium propagation: Gradient-descent training of quantum systems, https://arxiv.org/abs/2406.00879 (2024).
Zschocke, F. & Vojta, M. Physical states and finite-size effects in Kitaev’s honeycomb model: Bond disorder, spin excitations, and nmr line shape. Phys. Rev. B 92, 014403 (2015).
Article ADS Google Scholar
Wanjura, C. C. and Marquardt, F. QuantumEP [Computer software] https://github.com/ClaraWanjura/QuantumEP (2025).
Wanjura, C. C. and Marquardt, F. Code for the work presented in 'Quantum Equilibrium Propagation for efficient training of quantum systems based on Onsager reciprocity'. Zenodo. https://doi.org/10.5281/zenodo.15741281 (2025).
Marković, D. & Grollier, J. Quantum neuromorphic computing. Appl. Phys. Lett. 117, 150501 (2020).
Article ADS Google Scholar

Download references

Acknowledgements

We acknowledge funding via the German Research Foundation (DFG, Project-ID 429529648-TRR 306 QuCoLiMA, 'Quantum Cooperativity of Light and Matter'). This research is part of the Munich Quantum Valley, which is supported by the Bavarian state government with funds from the Hightech Agenda Bayern Plus.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Max Planck Institute for the Science of Light, Erlangen, Germany
Clara C. Wanjura & Florian Marquardt
Department of Physics, University of Erlangen-Nuremberg, Erlangen, Germany
Florian Marquardt

Authors

Clara C. Wanjura
View author publications
Search author on:PubMed Google Scholar
Florian Marquardt
View author publications
Search author on:PubMed Google Scholar

Contributions

Both authors, C.C.W. and F.M., developed the idea, ran the numerical experiments and wrote the manuscript.

Corresponding author

Correspondence to Clara C. Wanjura.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wanjura, C.C., Marquardt, F. Quantum equilibrium propagation for efficient training of quantum systems based on Onsager reciprocity. Nat Commun 16, 6595 (2025). https://doi.org/10.1038/s41467-025-61665-6

Download citation

Received: 28 June 2024
Accepted: 27 June 2025
Published: 17 July 2025
Version of record: 17 July 2025
DOI: https://doi.org/10.1038/s41467-025-61665-6

Subjects

Abstract

Similar content being viewed by others

Hamiltonian engineering of collective XYZ spin models in an optical cavity

Single-atom exploration of optimized nonequilibrium quantum thermodynamics by reinforcement learning

Quantum phases of matter on a 256-atom programmable quantum simulator

Introduction

Results

Onsager reciprocity as the basis for quantum equilibrium propagation

Quantum equilibrium propagation procedure

Experimental requirements

Supervised learning: phase recognition in a quantum many-body system

Unsupervised learning: phase exploration and sensitivity optimization

Phase exploration

Sensitivity optimization

Discussion

Methods

Supervised learning example

Influence of measurement shot noise

Aspects of scalability

Technical details of the phase sensor for the Kitaev honeycomb model

Different approaches for nudging

Unsupervised learning examples

Phase exploration

Sensitivity optimization

Further details about phase exploration with the Kitaev honeycomb model

Data availability

Code availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Supplementary Information

Transparent Peer Review file

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links