Introduction

Encoding quantum information within bosonic modes has found a number of applications in quantum computation, metrology, and communication. In particular, multimode multi-photon states have been extremely useful in the development of bosonic error-correcting codes to aid fault-tolerant quantum computation1,2,3,4,5,6. These error-correcting codes can be tailored to be hardware-efficient2,3,7, and have recently been shown to reach the break-even point for error-corrected quantum computation8,9,10. Multimode bosonic states have also been shown to achieve improved resolution and parameter sensitivity in the case of metrology and sensing11,12,13. Quantum communication protocols utilizing bosonic states are also found to have improved fidelity of quantum state transfer6,14,15 and entanglement distribution between nodes16.

A critical capability for harnessing the full potential of multimode multi-photon states is high-fidelity state preparation together with universal quantum control. Standard circuit constructions using single- and two-qubit gates are not well suited to manipulate arbitrary multimode multi-photon bosonic states because their code-space spans only a fraction of the entire Hilbert space. Error correction, particularly for photon loss, requires complex, error-prone circuitry with ancillary modes17,18. Alternatively, architectures that naturally operate on the photon-number basis have been proposed using weak second and third-order optical nonlinearities2,3,7,19,20. These nonlinearities generate significant temporal mode distortions, making them extremely challenging to use for controllable manipulation of multi-photon bosonic states21,22,23,24,25. Reliably utilizing these optical nonlinearities requires fast time-dependent control of cavity output coupling, which is challenging to implement in practice26,27,28,29. Moreover, these architectures are often limited by the symmetry of their Hamiltonians, while photon-photon interactions mediated by simple cavity quantum electrodynamic systems suffer from time-bandwidth constraints, reducing fidelity30,31,32,33,34,35. General methods for scalable, high-fidelity manipulation of multimode multi-photon states remain elusive.

In this paper, we demonstrate a framework to prepare arbitrary multimode multi-photon states, and implement universal and encoded quantum operations using a strong programmable optical nonlinearity. We utilize quantum photonic neural networks, parameterized by phase-shifts, enabling code-agnostic universal control of bosonic encodings. The central element of our architecture is the optical nonlinearity, which is implemented by strong light–matter interaction with a three-level Λ atomic system33,36,37. This nonlinearity is used as the elementwise activation function in our architecture, and acts as a photon-number selective phase gate. The dynamics of this light-matter interaction are asymptotically confined to the single-mode subspace, and do not introduce temporal mode distortions, which is a common limitation of other optical nonlinearities. Moreover, the strong light–matter interaction provided by this nonlinearity can be used as a means to perform photon-number resolving non-demolition measurements for error correction circuits.

We exemplify this approach by constructing an end-to-end architecture for logical quantum computation. First, we show that the optical nonlinearity we develop is able to construct deterministic entangling gates for physical qubits encoded in the dual-rail basis. Using this as the activation function within our neural network architecture, we demonstrate the preparation of essential resource states, and emphasize this by training the network to prepare a sample of Haar-random states to high fidelity. We analyze the impact of coherent errors caused by component imperfections on the fidelity of state preparation for a 4-photon N00N state. To showcase the universality of this circuit ansatz, we demonstrate code-agnostic control of bosonic error-correcting codes. In particular, we construct a universal gate set for the two-mode χ(2) binomial bosonic code, because it cannot be prepared using just linear optical elements and a χ(2) nonlinearity2. Additionally, by adapting components of our architecture, non-demolition measurements of the total photon number can be performed. This aids the construction of error-correcting circuits to protect bosonic codes from the dominant decoherence channel of photon loss. Finally, we evaluate the hardware requirements that must be addressed in order to construct the neural network processor, and show that the proposed architecture can be realized using present-day integrated photonic hardware.

In the next section, we detail the architecture of the quantum photonic neural network, including the nonlinear dynamics of the Λ atomic system for deterministic, high-fidelity gates. In the following sections, we present the numerical simulations for quantum state preparation, universal control of bosonic error-correcting codes, and construction of error-correcting circuits. Lastly, in section “Discussion”, we discuss the hardware parameters necessary for experimental realization on an integrated platform. We conclude by summarizing the key advantages of our architecture and its future prospects.

Results

The neural network architecture

Quantum photonic neural networks have emerged as a promising platform for optical quantum computation. Theoretical proposals have shown that these architectures are excellent at gate synthesis, black-box quantum simulation, and acting as one-way quantum repeaters7,19,20. These approaches, however, utilize weak optical nonlinearities and assume idealized models of the hardware, without accounting for the multimode dynamics of photon–photon interactions. Single-mode nonlinearities impose stringent hardware requirements, including high-speed active switching and precise cavity control, making experimental implementation challenging. To address these limitations, we present a hardware model for quantum photonic neural networks based on passive light-matter interactions using experimentally realized components and current photonic hardware.

Our proposed architecture, illustrated in Fig. 1a, consists of cascaded layers of linear and nonlinear transformations. The network operates on N-photon multimode quantum states distributed across M spatial modes. These photons are indistinguishable, with identical temporal wave-packets, and their state amplitudes are encoded as a complex-valued vector with unit magnitude. Linear transformation layers apply unitary operations U(i) using M-port interferometers38,39,40,41,42, while nonlinear activation functions are implemented via strong light-matter interactions at each interferometer output port.

Fig. 1: Schematic of the components of a quantum photonic neural network.

a Illustrative representation of a neural network represented as a sequence of N layers. Inputs to this network are multi-photon Fock states. Each gray block (U(i)) performs a linear-optical transformation, and the red blocks perform the element-wise nonlinear activation function. b Hardware implementation of the linear layer—a multiport interferometer in the CLEMENTS configuration. The insets show the constituent components of the mesh including Mach-Zehnder Interferometers and phase-shifters. c Illustrative representation of the hardware to implement the nonlinear activation function. The atom is a three-level Λ atomic system coupled to an optical cavity as shown in the inset. Transitions of the three-level atom are coupled to a single optical path, i.e., the \(\left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle \leftrightarrow \left\vert e\right\rangle\) transition is coupled to the red path and the \(\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle \leftrightarrow \left\vert e\right\rangle\) transition is coupled to the blue path. The input and output of the nonlinear element are along the modes \({\hat{h}}_{{\rm{in}}}\) and \({\hat{h}}_{{\rm{out}}}\), respectively.

The fundamental building block of a programmable multiport interferometer that implements linear optical operations is the 2-port Mach-Zehnder Interferometer (MZI). A schematic of this mesh is shown in Fig. 1b. A single MZI (upper inset) comprises two beam splitters and two tunable phase shifters, parameterized by (θ, ϕ) and implements any 2 × 2 unitary operation. Beam splitters are assumed to have ideal 50:50 splitting ratios, with deviations parameterized by (α, β). The transfer function between modes i, j is given by:

$${T}_{i,j}(\theta ,\phi )=i{e}^{i\theta /2}\left[\begin{array}{cc}{e}^{i\phi }\sin (\theta /2)&\cos (\theta /2)\\ {e}^{i\phi }\cos (\theta /2)&-\sin (\theta /2)\end{array}\right]$$
(1)

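Larger programmable unitary operations U can be implemented by decomposing them into a product of 2 × 2 unitary matrices \({T}_{i,j}(\theta ,\phi )\) and phase shifts on the output modes (lower inset), corresponding to a diagonal matrix D. An M × M unitary matrix is expressed as \(U=D{\prod }_{(i,j)}{T}_{i,j}(\theta ,\phi )\), where the product runs over the MZIs of the mesh, each with its own phase settings (θ, ϕ).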
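As a minimal sketch of Eq. (1) and the decomposition described above, the following pure-Python snippet builds the 2 × 2 transfer matrix, embeds it on neighbouring modes of an M-mode mesh, and checks that a product of such blocks followed by a diagonal phase matrix D is unitary. The helper names (`mzi`, `embed`, `mesh_unitary`) and the particular ordering of the three blocks are illustrative assumptions and do not reproduce the exact CLEMENTS layout.

```python
import cmath
import math

def mzi(theta, phi):
    """2x2 transfer matrix of Eq. (1)."""
    pre = 1j * cmath.exp(1j * theta / 2)
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    e = cmath.exp(1j * phi)
    return [[pre * e * s, pre * c],
            [pre * e * c, -pre * s]]

def identity(m):
    return [[1 + 0j if r == c else 0j for c in range(m)] for r in range(m)]

def matmul(a, b):
    inner = len(b)
    return [[sum(a[r][x] * b[x][c] for x in range(inner))
             for c in range(len(b[0]))] for r in range(len(a))]

def embed(t, i, m):
    """Place a 2x2 block on modes (i, i+1) of an m-mode identity."""
    u = identity(m)
    u[i][i], u[i][i + 1] = t[0][0], t[0][1]
    u[i + 1][i], u[i + 1][i + 1] = t[1][0], t[1][1]
    return u

def mesh_unitary(blocks, out_phases):
    """U = D * product of embedded T blocks; blocks = [(i, theta, phi), ...]."""
    m = len(out_phases)
    u = identity(m)
    for i, theta, phi in blocks:          # the first block in the list acts first
        u = matmul(embed(mzi(theta, phi), i, m), u)
    d = [[cmath.exp(1j * out_phases[r]) if r == c else 0j
          for c in range(m)] for r in range(m)]
    return matmul(d, u)

def is_unitary(u, tol=1e-12):
    m = len(u)
    udag = [[u[c][r].conjugate() for c in range(m)] for r in range(m)]
    p = matmul(udag, u)
    return all(abs(p[r][c] - (1 if r == c else 0)) < tol
               for r in range(m) for c in range(m))

# Hypothetical 3-mode example with arbitrary phase settings.
U = mesh_unitary([(0, 0.7, 1.3), (1, 2.1, 0.4), (0, 1.1, 2.9)], [0.3, 1.2, 2.5])
print(is_unitary(U))
```

Storing matrices as nested lists keeps the sketch dependency-free; an array library would be the natural choice for larger meshes.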

Central to the functioning of our neural network architecture is the element-wise nonlinear activation, which is realized through a three-level Λ atomic system. This nonlinearity provides a programmable phase conditioned on the total photon number. The schematic of the hardware that implements the nonlinear optical transformation is illustrated in Fig. 1c. This nonlinear gate uses only two optical channels that carry orthogonal degrees of freedom, such as distinct polarizations or spatial paths, and both channels are used bidirectionally. The output from each port of the linear interferometer serves as the input to the mode denoted by \({\hat{h}}_{{\rm{in}}}\). The input to \({\hat{v}}_{{\rm{in}}}\), which encodes the mode orthogonal to \({\hat{h}}_{{\rm{in}}}\), is always the vacuum state. These modes are used to excite a single three-level Λ atomic system, with two ground states \(\left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle ,\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle\) and one excited state \(\left\vert e\right\rangle\). Each transition of the atom couples to only one of the modes with equal cooperativity, i.e., pulses in mode \({\hat{h}}_{{\rm{in}}}\) can excite only the \(\left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle \leftrightarrow \left\vert e\right\rangle\) transition and pulses in mode \({\hat{v}}_{{\rm{in}}}\) can excite only the \(\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle \leftrightarrow \left\vert e\right\rangle\) transition. The output of the atomic system, which we denote by \({\hat{h}}^{{\prime} }\) and \({\hat{v}}^{{\prime} }\), uses the same optical channels to acquire independent phase shifts φ1 and φ2, and is time-reversed using a phase-conjugating mirror. The time-reversed state re-interacts with the atom and exits along the same incident path, but in the opposite direction, denoted by \({\hat{h}}_{{\rm{out}}}\), forming a programmable nonlinear phase gate \({\hat{U}}_{{\rm{NL}}}({\varphi }_{1},{\varphi }_{2})\). 
While we use the notation \({\hat{h}}_{{\rm{in}}},{\hat{h}}^{{\prime} }\) and \({\hat{h}}_{{\rm{out}}}\) to describe the optical state at various stages of the phase gate, the optical state populates only two optical channels determined by the encoding degree of freedom at all times.

During the first interaction of the optical state from mode \({\hat{h}}_{{\rm{in}}}\) with the atomic system, a photon is subtracted and placed in mode \({\hat{v}}^{{\prime} }\). This deterministic photon subtraction mediated by the Λ atom and its corresponding state flip has been extensively studied33,36,37,43,44,45. The Hamiltonian associated with the interaction of light interacting with a single three-level atomic system trapped in an optical cavity is given by ref. 46:

$$\hat{H}=-i\hslash g\sqrt{\frac{\kappa }{\pi }}\int\frac{1}{\kappa -i\omega }\left(\left\vert e\right\rangle \left\langle {g}_{{\rm{h}}}\right\vert {\hat{a}}_{\omega }+\left\vert e\right\rangle \left\langle {g}_{{\rm{v}}}\right\vert {\hat{b}}_{\omega }\right){e}^{-i(\omega +\delta )t}{\rm{d}}\omega +{\rm{H.C.}}$$
(2)

where \({\hat{a}}_{\omega }\) and \({\hat{b}}_{\omega }\) are the bosonic annihilation operators for the modes \(\hat{h}\) and \(\hat{v}\), respectively. The cavity is assumed to be perfectly resonant with the atomic transitions, i.e., the detuning δ = 0. The parameter g denotes the coupling strength between the cavity mode and the atomic transition, and κ denotes the cavity decay rate. In general, the input to the nonlinear phase gate is an N-photon Fock state with temporal mode profile ξ(t) in the \({\hat{h}}_{{\rm{in}}}\) mode. As we approach the limit where the spectral width of the incident photons, denoted by σ, is much narrower than both the line-width of the cavity (κ ≫ σ) and the cavity-enhanced decay rate (2g²/κ ≫ σ), the two-mode subtracted state is:

$$\begin{array}{l}\left\vert {\psi }_{{\rm{v}}}(t,{t}^{{\prime} },{t}^{{\prime\prime} })\right\rangle =-\sqrt{\frac{N}{(N-1)!}}\left(\mathop{\int}\nolimits_{-\infty }^{t}{\rm{d}}{t}^{{\prime} }\xi ({t}^{{\prime} }){\hat{b}}_{{t}^{{\prime} }}^{\dagger }\right)\\\qquad\qquad\qquad\quad\times\,{\left(\mathop{\int}\nolimits_{{t}^{{\prime} }}^{\infty }{\rm{d}}{t}^{{\prime\prime} }\xi ({t}^{{\prime\prime} }){\hat{a}}_{{t}^{{\prime\prime} }}^{\dagger }\right)}^{N-1}\left\vert 0\right\rangle\end{array}$$
(3)

In the limit that t → ∞, we recover the subtracted state derived in ref. 43 (see sec. 1 of the Supplementary Information). Equation (3) indicates that the subtracted state is temporally entangled, but occupies orthogonal modes: a single excitation in mode \({\hat{v}}^{{\prime} }\) and (N–1) excitations in mode \({\hat{h}}^{{\prime} }\). In this limit, each mode retains the original temporal profile ξ(t) without distortion, meaning that it is confined to the single-mode subspace.

Following the process of photon subtraction, each mode of the two-mode state in Eq. (3) is directed towards phase shifters that impart phases proportional to (φ1, φ2). These phases are independently programmable, giving rise to a nonlinear phase shift that depends on the photon number. The phase-shifters can be implemented using a birefringent material if the inputs \({\hat{h}}_{{\rm{in}}}\) and \({\hat{v}}_{{\rm{in}}}\) are encoded in different polarization states, or placed on different physical channels if the inputs are encoded in spatial modes. Subsequently, this two-mode optical state is time-reversed using a phase-conjugating mirror. Perfect time-reversal of an arbitrary pulse using time-dependent refractive-index modulation has been proposed and demonstrated in refs. 47,48,49,50, that preserves the temporal profile of the pulse. Alternatively, nonlinear optical processes such as four-wave mixing have also been demonstrated to perform time-reversal of an optical pulse51,52,53,54,55,56,57,58. This time-reversed state returns to interact with the three-level Λ system a second time.

Finally, the two-mode time-reversed state propagates through the phase-shifters a second time and interacts with the Λ atomic system again, and is added into a single mode \({\hat{h}}_{{\rm{out}}}\) with a global nonlinear phase. This interaction step emulates the time-reversal of the photon subtraction step, where the time-reversed state propagates along the same physical channel, but in the opposite direction to interact with the atom in the \(\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle\) state. The single photon in mode \({\hat{v}}^{{\prime} }\) excites the atom, and is deterministically added to the pulse in \({\hat{h}}^{{\prime} }\), which propagates into the output mode \({\hat{h}}_{{\rm{out}}}\) (see sec. 1 of the Supplementary Information). This reconstructs the N-photon Fock state with a global nonlinear phase. Therefore, the nonlinear phase gate described in Fig. 1c performs the following transformation on an input N-photon Fock state:

$$\left\vert N\right\rangle \mapsto {e}^{2i{\varphi }_{1}}{e}^{2i(N-1){\varphi }_{2}}\left\vert N\right\rangle$$
(4)

Since the nonlinear phase gate operates bidirectionally on only two optical modes, a non-reciprocal device is necessary to separate the counter-propagating pulses. Specifically, an optical circulator or a fast optical switch is required to route the subtracted state toward the phase-conjugating mirror, and subsequently direct the time-reversed output toward the next linear layer in the network.

This photon-number selectivity enables the construction of multi-qubit entangling gates, which are essential for completing the universal gate set. In the dual-rail basis, along with single-qubit gates programmed by linear optics, two-qubit controlled phase gates can be implemented using this optical nonlinearity. The function implemented by the nonlinear phase gate \({\hat{U}}_{{\rm{NL}}}({\varphi }_{1},{\varphi }_{2})\) transforms the superposition state \({c}_{0}\left\vert 0\right\rangle +{c}_{1}\left\vert 1\right\rangle +{c}_{2}\left\vert 2\right\rangle\) into the state \({c}_{0}\left\vert 0\right\rangle +{c}_{1}{e}^{2i{\varphi }_{1}}\left\vert 1\right\rangle +{c}_{2}{e}^{2i({\varphi }_{1}+{\varphi }_{2})}\left\vert 2\right\rangle\). When φ1 = 0, φ2 = π/2, the output state is \({c}_{0}\left\vert 0\right\rangle +{c}_{1}\left\vert 1\right\rangle -{c}_{2}\left\vert 2\right\rangle\), which is required to implement a controlled-phase gate32,59. The fidelity of this nonlinear phase gate is discussed in sec. 2 of the Supplementary Information. This optical nonlinearity can be extended to construct general photon-number-selective arbitrary phase gates for N > 2 incident photons. This model of the nonlinearity is discussed in sec. 3 of the Supplementary Information.
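The action of the nonlinear phase gate on a single-mode superposition can be illustrated with a short sketch. It applies the phase factor of Eq. (4) to each Fock component, leaving the vacuum unchanged as in the superposition example above. The function names `nl_phase` and `apply_nl` and the chosen input amplitudes are hypothetical, and the model is idealized (no loss, no temporal-mode effects).

```python
import cmath
import math

def nl_phase(n, phi1, phi2):
    """Phase factor of Eq. (4) for an n-photon Fock state; vacuum is unchanged."""
    if n == 0:
        return 1 + 0j
    return cmath.exp(2j * phi1) * cmath.exp(2j * (n - 1) * phi2)

def apply_nl(amplitudes, phi1, phi2):
    """amplitudes[n] is the coefficient of |n> in a single optical mode."""
    return [c * nl_phase(n, phi1, phi2) for n, c in enumerate(amplitudes)]

# c0|0> + c1|1> + c2|2> with illustrative amplitudes
state = [0.5, 0.5, 1 / math.sqrt(2)]
out = apply_nl(state, phi1=0.0, phi2=math.pi / 2)
print([complex(round(z.real, 12), round(z.imag, 12)) for z in out])
```

With φ1 = 0 and φ2 = π/2 only the two-photon amplitude changes sign, which is the conditional sign flip used for the controlled-phase gate.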

We benchmark the performance of the proposed architecture by numerically simulating tasks such as state preparation, encoding logical information, and realizing logical gates under the influence of realistic hardware errors. The linear layers of our architecture are implemented by a programmable multiport interferometer in the CLEMENTS configuration39, parameterized by \((\overrightarrow{\theta },\overrightarrow{\phi })\). The nonlinear activation is implemented by the programmable nonlinear phase gate described in sec. “The Neural Network Architecture”, and is parameterized by \(({\overrightarrow{\varphi }}_{1},{\overrightarrow{\varphi }}_{2})\). The phases for both the linear and nonlinear layers are initialized uniformly at random. The digital differentiable model was programmed using open-source automatic differentiation tools with the JAX library, and numerically simulated to optimize the phases of the network. The network was trained using the ADAM optimizer60, with the learning rate annealed from 0.025 to 0.001 on the NVIDIA A100 GPU.

State preparation

To demonstrate the generalizability and efficiency of the training routine, we consider the problem of preparing multimode multi-photon states. Here, we show that the network is capable of preparing Haar-random multimode multi-photon states to high fidelities, and analyze the impact of component imperfections on the preparation of a 4-photon N00N state. Given an initial state \(\left\vert {\psi }_{{\rm{in}}}\right\rangle\), the network is trained to find the unitary operation that maximizes the overlap between a selected target state \(\vert {\psi }_{{\rm{target}}}\rangle\) and the output of the network. The fidelity of the learned state is evaluated as:

$${{\mathcal{F}}}_{{\rm{state}}}=| \langle {\psi }_{{\rm{target}}}| {U}_{{\rm{NN}}}| {\psi }_{{\rm{in}}}\rangle |$$
(5)

where UNN is the transformation implemented by the neural network. To find the optimal parameter set, the loss function \({\mathcal{L}}={(1-{{\mathcal{F}}}_{{\rm{state}}})}^{2}\) is minimized using the gradient descent technique described above.
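For concreteness, Eq. (5) and the loss function can be evaluated as in the sketch below. It uses plain Python lists for state vectors and matrices; the 50:50 beam-splitter unitary and the two target states are illustrative choices and are not taken from the simulations reported here.

```python
import math

def apply_unitary(u, psi):
    """Matrix-vector product U|psi> for nested-list matrices."""
    return [sum(u[r][c] * psi[c] for c in range(len(psi))) for r in range(len(u))]

def state_fidelity(target, u, psi_in):
    """Eq. (5): |<target| U |in>|."""
    out = apply_unitary(u, psi_in)
    return abs(sum(complex(t).conjugate() * o for t, o in zip(target, out)))

def loss(target, u, psi_in):
    """L = (1 - F_state)^2, the quantity minimized during training."""
    return (1.0 - state_fidelity(target, u, psi_in)) ** 2

r = 1 / math.sqrt(2)
bs = [[r, 1j * r], [1j * r, r]]   # illustrative single-photon, two-mode unitary
psi_in = [1, 0]
good_target = [r, 1j * r]         # exactly what bs produces from psi_in
poor_target = [1, 0]

print(state_fidelity(good_target, bs, psi_in), loss(good_target, bs, psi_in))
print(state_fidelity(poor_target, bs, psi_in), loss(poor_target, bs, psi_in))
```

Because the loss is the square of the infidelity, its gradient vanishes smoothly as the fidelity approaches one.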

First, we consider the task of preparing a sample of Haar-random multi-photon states. These states are uniformly distributed over the space of multi-photon qudit states61. Successfully preparing a sufficiently large sample of these states would empirically indicate that the network can learn mappings to prepare the entire group of multi-photon states. A Haar-random target state is selected by sampling a unitary U from the Haar measure62,63 and then evolving a fiducial initial state \(\left\vert {\psi }_{0}\right\rangle\), such that \(\left\vert {\psi }_{{\rm{Haar}}}\right\rangle =U\left\vert {\psi }_{0}\right\rangle\).
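A Haar-random target state can also be drawn without constructing the full unitary: a vector of independent complex Gaussian amplitudes, normalized to unit length, has the same distribution as U|ψ0⟩ for Haar-random U. The sketch below uses this shortcut, which is a substitution for the unitary-sampling procedure described above rather than a reproduction of it; the function name `haar_random_state` and the seed are hypothetical.

```python
import math
import random

def haar_random_state(dim, rng):
    """Normalized complex Gaussian vector, distributed as U|psi_0> with U Haar-random."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

rng = random.Random(0)             # fixed seed for reproducibility
psi = haar_random_state(10, rng)   # e.g. 2 photons in 4 modes span 10 Fock states
print(sum(abs(z) ** 2 for z in psi))
```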

A network that is 4 layers deep is trained to prepare a set of 100 Haar-random N-photon states in a 4-mode network. The initial state input into the network consists of single photons populating the first N modes of the network. The state fidelity \({{\mathcal{F}}}_{{\rm{state}}}\) as a function of the number of iterations is shown in Fig. 2a. The blue, green and orange lines plot the average fidelity over the 100 samples for the 2, 3, and 4 photon states, respectively. The shaded regions indicate the bounds of the state fidelity over all the trained instances. The inset shows the distribution of the final fidelity for the 100 runs of each multi-photon state after 2000 iterations. The final fidelity decreases from >99.9% to ~97.8% as the number of photons increases, because training becomes more difficult as the size of the Hilbert space grows. Fine-tuning hyperparameters or annealing the learning rate could improve performance for larger photon-number states.
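The growth of the Hilbert space with photon number can be made explicit: N indistinguishable photons in M modes span a Fock space of dimension C(N + M − 1, N). A minimal sketch, with the hypothetical helper name `fock_dim`:

```python
from math import comb

def fock_dim(n_photons, n_modes):
    """Number of Fock states of n_photons indistinguishable photons in n_modes modes."""
    return comb(n_photons + n_modes - 1, n_photons)

for n in (2, 3, 4):
    print(n, fock_dim(n, 4))   # 10, 20 and 35 basis states in a 4-mode network
```

For the 4-mode network considered here, the number of complex amplitudes to be learned therefore more than triples between the 2-photon and 4-photon cases.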

Fig. 2: Performance of the network in learning one-to-one mapping of quantum states.

a State fidelity as a function of iteration number for a sample of 100 Haar-random multi-photon quantum states. The fidelity of the 2, 3, and 4-photon states approaches unity within 2000 iterations. The inset illustrates the distribution of learned state fidelities. Increasing the number of photons increases the difficulty in learning the target state, resulting in a decrease in the mean fidelity \(\bar{F}\). b State infidelity as a function of the depth of the network for the 4-photon N00N state under the influence of component imperfections. Increasing the depth of the network increases the fidelity, denoted by the black line. The green and blue distributions correspond to the distribution of state infidelity when the beam-splitter error σ is 0.001 and 0.01, respectively. c Distribution of state infidelity as a function of beam-splitter error σ in the Mach-Zehnder Interferometer (MZI). Increasing the error σ increases the mean infidelity, as well as the distribution of state fidelities.

Next, we consider the preparation of N00N states, which are many-body entangled states that are extremely sensitive to noise and loss. Component imperfections due to fabrication process variations introduce perturbations in the splitting ratio of beam-splitters in the Mach–Zehnder Interferometers. Imperfect splitting within beam-splitter meshes introduces errors into the programmed unitary matrix64,65,66, and thereby gives rise to infidelity in the preparation of any target state. Deviations of the constituent beam-splitters from the 50:50 splitting ratio are denoted by angles (α, β) and are assumed to be distributed as independent Gaussians \({\mathcal{N}}(0,\sigma )\).
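One way to model such splitter errors is sketched below. It assumes each beam splitter has a mixing angle π/4 + error and that the MZI is the sequence phase shifter ϕ, beam splitter (error α), phase shifter θ, beam splitter (error β); with α = β = 0 this sequence reproduces Eq. (1). The parameterization, the function names, and the reading of σ as a standard deviation are assumptions made for illustration and may differ from the error model used in the simulations.

```python
import cmath
import math
import random

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[r][0] * b[0][c] + a[r][1] * b[1][c] for c in range(2)] for r in range(2)]

def beam_splitter(err=0.0):
    """Beam splitter with mixing angle pi/4 + err (err = 0 is an ideal 50:50 splitter)."""
    angle = math.pi / 4 + err
    return [[math.cos(angle), 1j * math.sin(angle)],
            [1j * math.sin(angle), math.cos(angle)]]

def phase_shifter(x):
    """Phase x applied to the upper arm only."""
    return [[cmath.exp(1j * x), 0j], [0j, 1 + 0j]]

def faulty_mzi(theta, phi, alpha=0.0, beta=0.0):
    """BS(beta) . P(theta) . BS(alpha) . P(phi), acting right to left on the input."""
    t = matmul2(beam_splitter(alpha), phase_shifter(phi))
    t = matmul2(phase_shifter(theta), t)
    return matmul2(beam_splitter(beta), t)

def ideal_mzi(theta, phi):
    """Closed form of Eq. (1), for comparison."""
    pre = 1j * cmath.exp(1j * theta / 2)
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    e = cmath.exp(1j * phi)
    return [[pre * e * s, pre * c], [pre * e * c, -pre * s]]

def max_deviation(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))

rng = random.Random(1)
sigma = 0.01                                   # treated as a standard deviation
alpha, beta = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
print(max_deviation(faulty_mzi(0.7, 1.3), ideal_mzi(0.7, 1.3)))               # close to zero
print(max_deviation(faulty_mzi(0.7, 1.3, alpha, beta), ideal_mzi(0.7, 1.3)))  # small, nonzero
```

The faulty matrix remains unitary, so these errors are coherent: they change which unitary is programmed rather than introducing loss.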

The state infidelity in a 4-photon N00N state as a function of the number of layers for ideal and faulty circuits is plotted in Fig. 2b. In the case of ideal circuits, the fidelity improves with depth, as indicated by the black line. A network that is three layers deep achieves a fidelity >99.9%. Faulty circuits, sampled with 10000 splitter errors (α, β), reveal the impact of component imperfections. The green and blue histograms indicate distributions of state infidelities for small splitter errors (σ = 0.001) and larger, more practical splitter errors (σ = 0.01). At a depth of four layers, circuits with large errors already perform ~100× worse than near-perfect circuits. Increasing the depth beyond four layers further reduces the fidelity because the errors outweigh the fidelity gains from training.

The state infidelity as a function of the splitter error σ is plotted in Fig. 2c for a network that is three layers deep. The error bars illustrate the infidelity bounds for a given beam-splitter error σ. Increasing the splitter errors corresponds to an increase in the median state infidelity as well as the upper bound of the distribution of infidelities. At σ = 0.02, which is the typical tolerance for wafer-scale process variations67, the median fidelity is below 99%, with the worst case fidelity approaching ~10%. A number of reconfiguration techniques have been proposed to correct for coherent errors in the interferometer platform, including global optimization68,69,70,71,72,73, local correction64,74, and self-configuration75,76,77,78.

Universal operations on encoded bases

The network’s programmability allows us to operate directly on the logical subspace of bosonic error-correcting codes. More generally, this architecture is able to operate on any bosonic code whose basis states lie in the finite-dimensional Hilbert space, including cat, GKP, binomial or multimode multi-photon codes2,3,4,5,6,9,10,79,80,81. This ability is essential in providing robustness against the dominant photon loss channel as well as phase errors, while still permitting universal, code-agnostic control through non-transversal gates. The network is therefore trained to find a completely positive trace-preserving map \({\mathcal{E}}\) acting on the logical subspace. The parameters of the network are optimized to maximize the overlap between the transformation implemented by the network UNN and the target channel \({\mathcal{E}}\). The fidelity of the learned channel is evaluated as the average fidelity over the Haar measure82 as:

$$\begin{array}{rcl}{{\mathcal{F}}}_{{\rm{av}}}({U}_{{\rm{NN}}},{\mathcal{E}})&=&\int{\rm{d}}\psi \left\langle \psi \right\vert {U}_{{\rm{NN}}}^{\dagger }{\mathcal{E}}\left(\left\vert \psi \right\rangle \left\langle \psi \right\vert \right)\left\vert \psi \right\rangle \\ &=&\frac{d{\mathcal{F}}\left({\mathcal{E}},{U}_{{\rm{NN}}}\right)+1}{d+1}=\frac{{\rm{Tr}}\left({U}_{{\rm{NN}}}^{\dagger }{\mathcal{E}}\right)+1}{d+1}\end{array}$$
(6)

where d is the dimension of the multi-photon quantum system. The loss function, defined as \({\mathcal{L}}={(1-{{\mathcal{F}}}_{{\rm{av}}}({U}_{{\rm{NN}}},{\mathcal{E}}))}^{2}\), is minimized using the gradient descent technique described above, driving the average channel fidelity \({{\mathcal{F}}}_{{\rm{av}}}\) towards 1.
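When the target channel is itself a unitary V, the average fidelity has a well-known closed form, (d F_e + 1)/(d + 1) with entanglement fidelity F_e = |Tr(U†V)|²/d². The sketch below evaluates this standard expression for single-qubit gates; it assumes a unitary target and this particular normalization of F_e, which may differ from the convention used in Eq. (6), and the function names are hypothetical.

```python
import cmath
import math

def average_gate_fidelity(u, v):
    """(d*F_e + 1)/(d + 1) with F_e = |Tr(U^dagger V)|^2 / d^2, for unitary u and v."""
    d = len(u)
    trace = sum(u[k][r].conjugate() * v[k][r] for r in range(d) for k in range(d))
    f_e = abs(trace) ** 2 / d ** 2
    return (d * f_e + 1) / (d + 1)

def channel_loss(u, v):
    """Squared infidelity, mirroring the loss function defined above."""
    return (1.0 - average_gate_fidelity(u, v)) ** 2

r = 1 / math.sqrt(2)
H = [[r + 0j, r + 0j], [r + 0j, -r + 0j]]
S = [[1 + 0j, 0j], [0j, 1j]]
T = [[1 + 0j, 0j], [0j, cmath.exp(1j * math.pi / 4)]]

print(average_gate_fidelity(H, H))   # identical gates
print(average_gate_fidelity(S, T))   # S and T differ by a relative phase of pi/4
print(channel_loss(S, T))
```

The expression is insensitive to a global phase on either unitary, so only the relative phases within the logical subspace affect the result.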

Here, we demonstrate that the network is able to encode and perform universal logical operations on a symmetry-protected bosonic error-correcting code. In particular, we consider an encoding that cannot be prepared using linear optics and a χ(2) optical nonlinearity. This is the two-mode χ(2) binomial code proposed by Niu et al.2, whose logical basis states are:

$$\begin{array}{ll}\left\vert \tilde{0}\right\rangle =\displaystyle\frac{1}{{2}^{N-1}}\mathop{\sum}\limits_{j = 0}^{N-1}\sqrt{\left(\begin{array}{c}2N-1\\ 2j\end{array}\right)}\,\left\vert 2j,2N-1-2j\right\rangle \\ \left\vert \tilde{1}\right\rangle=\displaystyle\frac{1}{{2}^{N-1}}\mathop{\sum}\limits_{j = 0}^{N-1}\sqrt{\left(\begin{array}{c}2N-1\\ 2j+1\end{array}\right)}\,\left\vert 2j+1,2(N-1-j)\right\rangle \end{array}$$
(7)

where \(\left\vert {n}_{{\rm{a}}},{n}_{{\rm{b}}}\right\rangle\) denotes a two-mode code state with na and nb photons in modes \(\hat{a}\) and \(\hat{b}\), respectively. The basis states of the two-mode χ(2) binomial code in Eq. (7) do not lie in an irreducible subspace of the χ(2) Hamiltonian, and therefore cannot be prepared using just linear optics and χ(2) nonlinear processes. The network architecture is not constrained by any symmetry among its modes, and should therefore be able to generate the complete Lie algebra for all qudit subspaces3,83,84. This architecture is therefore ideally suited to encode and perform logical operations on the bosonic code.
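The logical basis states of Eq. (7) can be tabulated directly. The sketch below builds both code words as dictionaries mapping Fock labels (n_a, n_b) to amplitudes and checks that they are normalized, mutually orthogonal, and contain 2N − 1 photons in every component. The function names are hypothetical.

```python
from math import comb, sqrt

def binomial_code(n):
    """Logical basis states of Eq. (7) as {(n_a, n_b): amplitude} dictionaries."""
    norm = 2 ** (n - 1)
    zero = {(2 * j, 2 * n - 1 - 2 * j): sqrt(comb(2 * n - 1, 2 * j)) / norm
            for j in range(n)}
    one = {(2 * j + 1, 2 * (n - 1 - j)): sqrt(comb(2 * n - 1, 2 * j + 1)) / norm
           for j in range(n)}
    return zero, one

def inner(a, b):
    """Inner product of two states with real amplitudes."""
    return sum(amp * b.get(key, 0.0) for key, amp in a.items())

zero, one = binomial_code(3)   # N = 3 gives the 5-photon code
print(sorted(zero))            # Fock components of logical zero
print(sorted(one))             # Fock components of logical one
print(inner(zero, zero), inner(one, one), inner(zero, one))
```

Logical zero contains only even photon numbers in mode a and logical one only odd photon numbers, so the two code words have no Fock components in common.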

The (2N − 1)-photon two-mode χ(2) binomial code offers an attractive option for encoding logical states because of its ability to correct photon loss and gain errors of up to N photons, Nth-order dephasing, and amplitude damping errors. Furthermore, to protect the logical qubits against up to N photon losses, only (2N − 1) input photons are required, implying that it has a constant code rate2.

To verify that the network is indeed able to encode the two-mode χ(2) binomial code, we train the network to map orthogonal input states ((2N − 1) photons populating either one of the modes) into the logical basis states. Figure 3a plots the encoding infidelity as a function of the network depth for the 5-photon code, with a 4-layer deep network achieving average fidelity greater than 99.99%. The histograms indicate the encoding infidelity under the influence of coherent errors from splitter errors (σ = 0.001) and (σ = 0.01). In this case, the splitter errors severely degrade the channel fidelity, since the relative phase among the output modes is necessary to achieve a high fidelity. Splitter errors of σ = 0.01 drastically reduce fidelity, with worst cases dropping below 10%, highlighting sensitivity to component imperfections.

Fig. 3: Performance of the network in learning the encoding channel and single qubit gates for the two-mode 5-photon χ(2) binomial code.

a Encoding channel infidelity as a function of the depth of the network in a two-mode network. The black line indicates the infidelity of the ideally trained channel. The green and blue histograms illustrate the distribution of infidelities when splitter errors are σ = 0.001 and σ = 0.01, respectively. b Gate infidelities of single qubit logical gates (Hadamard, S and T phase gates) trained by a 4-layer network. The black bar indicates that an ideal gate can be trained to a fidelity >99.99%. The network performance is severely affected by component imperfections, with the worst-case fidelity approaching ~10%.

Universal quantum computation in this logical basis necessitates the ability to perform arbitrary encoded operations. We consider the generators of the logical Clifford group, namely the Hadamard gate, the S (π/4) phase gate and the controlled phase gate. Together with these gates, the T (π/8) phase gate completes a universal gate set in the encoded basis. Figure 3b shows the single-qubit logical gate fidelity achieved by a 4-layer network; the ideal network reaches fidelities >99.99%, as indicated by the black bar. Even under realistic splitter errors (σ = 0.01), many circuits achieve fidelities above 99%. These networks are, however, still very sensitive to splitter errors, since the relative phase among the output logical states is essential for high-fidelity operation.

Together with the single-qubit gates, a two-qubit logical entangling gate completes the universal gate set. A schematic of the circuit to implement the controlled-phase gate is shown in Fig. 4a. This circuit consists of encoding and decoding steps, implemented by independently trained two-mode networks. Between these networks, we use a circuit similar to the traditional controlled-phase gate. The input to this circuit into either set of rails is the N-photon two-mode logical code state. The encoding circuit rotates the logical state into the Fock basis via the transformation:

$$\begin{array}{l}\left\vert \tilde{0}\right\rangle \mapsto \left\vert N,0\right\rangle \\ \left\vert \tilde{1}\right\rangle \mapsto \left\vert N-1,1\right\rangle \end{array}$$
(8)

where the state \(\left\vert {n}_{{\rm{a}}},{n}_{{\rm{b}}}\right\rangle\) denotes a state where na photons populate the outer rails (uppermost and lowermost rails of the network) and nb photons populate the inner rails of the network (the middle two rails). The logical \(\left\vert \tilde{0}\right\rangle\) state is mapped into the N-photon state that populates only the outer rail, while the logical \(\left\vert \tilde{1}\right\rangle\) state is mapped into a state with only a single photon on the inner rail. From here, it is straightforward to see that only the \(\left\vert \tilde{1}\tilde{1}\right\rangle\) state exploits the Hong-Ou-Mandel effect and, following Eq. (4), receives a phase shift 2(φ1 + φ2) from the nonlinear element, while the states \(\left\vert \tilde{0}\tilde{1}\right\rangle\) and \(\left\vert \tilde{1}\tilde{0}\right\rangle\) receive a phase shift 2φ1. In the case where φ1 = 0 and φ2 = π/2, the \(\left\vert \tilde{1}\tilde{1}\right\rangle\) state acquires a phase of π and this circuit acts as a controlled-Z gate on the logical basis. The state is then transformed by the decoding step, which rotates the Fock basis states back into the two-mode encoded logical state. This construction is agnostic to the encoding being used, meaning the same circuit model with updated phases can be reused for any family of bosonic codes.

Fig. 4: Logical controlled phase gate schematic and performance.

a Circuit implementation of the logical controlled phase gate with an encoding and decoding step that rotates the logical states into the Fock basis and vice versa. b Gate fidelity of the logical controlled phase gate illustrated in (a) on the 5-photon two-mode binomial code. A network that is 10 layers deep (5 layers in the encoding and decoding steps each) is able to perform the transformation to a fidelity >99.99%. The green and blue histograms indicate the distribution of infidelities when splitter errors are σ = 0.001 and σ = 0.01, respectively, with worst-case fidelities on the order of 10%.

The performance of the logical controlled phase gate is shown in Fig. 4b. Both the encoding and decoding steps were trained independently on the reduced Hilbert space on 5 photons in two modes, as opposed to 10 photons in four modes. A 10-layer deep network (with 5 encoding layers and 5 decoding layers) achieves gate fidelity >99.99%. Component imperfections severely impact the performance of the gate, where networks with larger errors perform ~100× worse than networks with near-ideal components. The worst-case fidelities reach as low as ~10% when σ = 0.01, regardless of network depth.

Quantum error correction via non-demolition measurements

Error-corrected operation of the quantum photonic neural network is essential for efficiently scaling up such an architecture. Typically, information encoded in bosonic modes is protected by performing non-demolition measurements, followed by a unitary correction conditioned on the measurement1,2,3,4,5,6,7. In our architecture, errors caused by the dominant decoherence channel, photon loss, are detected by the intrinsic light-matter interaction of the optical nonlinearity. With the non-demolition measurement of the photon-number parity \({p}_{{\rm{BC}}}=\langle {\hat{n}}_{{\rm{a}}}+{\hat{n}}_{{\rm{b}}}\rangle {\rm{mod}}\left(2N-1\right)\), the bosonic error-correcting codes from ref. 2 are able to uniquely identify the type of error that has occurred. Photon loss due to the amplitude damping channel is also shown to satisfy the Knill-Laflamme condition of our encoding.

The non-demolition measurement of the total photon number is achieved by cascading three-level Λ atomic systems, shown in Fig. 5a. An erroneous code state sequentially interacts with the cascaded atomic systems. Each interaction moves a single photon from the code state into an orthogonal mode and deterministically flips the state of the atom. Therefore, information regarding whether an atom has interacted with a single photon is stored in the state of that atom. Measuring and resetting the state of the atoms after the photonic state has propagated through the atoms provides an error syndrome of the total photon number non-destructively. After the measurement, the original erroneous photonic code state is reconstructed via time-reversed interactions. Photon-number resolving non-demolition measurements can also be performed using second or third-order optical nonlinearities85,86,87,88,89,90 or atomic systems91,92,93.

As an example, we consider the result of syndrome measurements on a 2-photon code state. If this code state loses a single photon, the remaining photon interacts with only the first three-level Λ system, thereby flipping its state. Measuring the state of the atoms now gives us the error syndrome \(\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle \left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle\), indicating the loss of a single photon. In the case when there is no loss of a photon, both photons would interact with the atoms, flipping their states. This gives us the error syndrome \(\left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle \left\vert {{\rm{g}}}_{{\rm{v}}}\right\rangle\), indicating there was no loss of a photon. Similarly, if both photons are lost from this state, none of the atomic systems would flip their states. This gives us error syndrome \(\left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle \left\vert {{\rm{g}}}_{{\rm{h}}}\right\rangle\), indicating that both photons were lost.
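The syndrome assignment in this example can be written as a small lookup. This is an illustrative sketch under stated assumptions: each atom starts in the state labeled g_h, is flipped to g_v by exactly one photon, and the photons interact with the atoms in order along the cascade. The function names are hypothetical.

```python
def atom_syndrome(photons_remaining, n_atoms=2):
    """Atomic states after the photonic state has passed through the cascade.

    Assumption: every atom starts in 'g_h', and the k-th remaining photon
    flips the k-th atom to 'g_v'.
    """
    if not 0 <= photons_remaining <= n_atoms:
        raise ValueError("photon number must lie between 0 and n_atoms")
    flipped = ("g_v",) * photons_remaining
    unflipped = ("g_h",) * (n_atoms - photons_remaining)
    return flipped + unflipped


def photons_lost(syndrome):
    """Number of lost photons, inferred from the atoms that did not flip."""
    return sum(1 for state in syndrome if state == "g_h")
```

For the 2-photon code state, `atom_syndrome(1)` gives `("g_v", "g_h")`, which `photons_lost` reads as a single lost photon, matching the first case described above.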

Conditioned on the non-demolition measurement, the appropriate correction operations can be performed to correct the error. A schematic of this correction circuit is illustrated in Fig. 5b. The routing gate restores the total number of photons in the code space by moving a single photon from the ancillary mode based on feedback from the non-demolition measurement. The additional layers correspondingly rotate this state into the corrected state.

To provide a more concrete example, we consider the case of a single photon loss error in a two-mode three-photon χ(2) binomial code. In general, the code-space of the network can be described as \(\left\vert C\right\rangle =\alpha \left\vert \tilde{0}\right\rangle +\beta \left\vert \tilde{1}\right\rangle =\alpha \left(\left\vert 0,3\right\rangle +\left\vert 2,1\right\rangle \right)+\beta \left(\left\vert 3,0\right\rangle +\left\vert 1,2\right\rangle \right)\). The photon loss is equally likely to occur from either mode. If there is a loss from the first mode, the state of the system becomes \(\left\vert {E}_{1}\right\rangle =\alpha \left\vert 1,1\right\rangle +\beta \left(\left\vert 2,0\right\rangle +\left\vert 0,2\right\rangle \right)\). On the other hand, if this loss occurs from the second mode, the state of the system becomes \(\left\vert {E}_{2}\right\rangle =\alpha \left(\left\vert 0,2\right\rangle +\left\vert 2,0\right\rangle \right)+\beta \left\vert 1,1\right\rangle\). Upon detection of this loss, the routing gate moves a single photon from the ancillary mode into the first mode of the code space. This transforms the error states \(\left\vert {E}_{1}\right\rangle\) into \(\left\vert {C}_{1}^{{\prime} }\right\rangle =\alpha \left\vert 2,1\right\rangle +\beta \left(\left\vert 3,0\right\rangle +\left\vert 1,2\right\rangle \right)\) and \(\left\vert {E}_{2}\right\rangle\) into \(\left\vert {C}_{2}^{{\prime} }\right\rangle =\alpha \left(\left\vert 1,2\right\rangle +\left\vert 3,0\right\rangle \right)+\beta \left\vert 2,1\right\rangle\). Finally, the additional layers rotate the states \(\left\vert {C}_{1}^{{\prime} }\right\rangle\) and \(\left\vert {C}_{2}^{{\prime} }\right\rangle\) back into the original state \(\left\vert C\right\rangle\). To summarize, the error-correction procedure performs the following transformations over the ancilla-code space:

$$\begin{array}{l}\left\vert 1\right\rangle \otimes \left\vert {E}_{1}\right\rangle \mapsto \left\vert 0\right\rangle \otimes \left\vert {C}_{1}^{{\prime} }\right\rangle \mapsto \left\vert 0\right\rangle \otimes \left\vert C\right\rangle \\ \left\vert 1\right\rangle \otimes \left\vert {E}_{2}\right\rangle \mapsto \left\vert 0\right\rangle \otimes \left\vert {C}_{2}^{{\prime} }\right\rangle \mapsto \left\vert 0\right\rangle \otimes \left\vert C\right\rangle \end{array}$$
(9)

The routing gate adds a single photon into the code space conditioned on feedback from the non-demolition measurement. In other words, it transforms the two-mode state \(\left\vert 1,n\right\rangle \mapsto \left\vert 0,n+1\right\rangle\), where n is an unknown number of photons propagating in the computational mode. Photon addition can be performed using the approaches in refs. 43,94. These schemes are, however, not deterministic and require pulses with specific temporal profiles in conjunction with weak second-order nonlinearities. Therefore, we train an independent network tiled between the ancilla and a computational mode to transform an input state \(\left\vert 1,n\right\rangle\) into the state \(\left\vert 0,n+1\right\rangle\) for the 5-photon code state, where n ∈ [0, 4]. Figure 5c plots the infidelity of this gate as a function of the number of layers for all possible values of n. A network that is 5 layers deep reaches a fidelity greater than 99% for all the required transformations.
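The training targets of the routing gate and the figure of merit can be stated compactly. The sketch below is only an illustration of the bookkeeping: it lists the target map for each photon number and evaluates the infidelity of a given output state against a single Fock target. The state representation (a dictionary from Fock labels to amplitudes) and the function names are assumptions made here, not the authors' implementation.

```python
def routing_targets(n_max=4):
    """Target map |1, n> -> |0, n + 1> for n = 0..n_max, as Fock-label pairs."""
    return {(1, n): (0, n + 1) for n in range(n_max + 1)}


def infidelity(output, target_label):
    """Return 1 - |<target|output>|^2 for an output state {label: amplitude}."""
    norm = sum(abs(amp) ** 2 for amp in output.values())
    if norm == 0:
        raise ValueError("output state has zero norm")
    overlap = abs(output.get(target_label, 0)) ** 2
    return 1.0 - overlap / norm
```

For instance, `routing_targets()[(1, 4)]` is `(0, 5)`, and an output with all of its weight on the target label has zero infidelity.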

Fig. 5: Implementation of error-correction for photon loss.

a Schematic of non-demolition measurement using the three-level Λ atomic system. This construction allows for the detection of photon loss among two input photons from the computational modes. b Schematic of circuit to correct for single photon loss from the first mode using the non-demolition measurement, conditional routing from an ancillary mode, and unitary rotation using additional layers. c Infidelity of routing a single photon from the ancillary mode into the computational modes, when the routing gate is constructed using a separately trained network. Deeper networks result in higher fidelity for addition of the photon into all possible code states.

In practice, these operations performed by the neural network are not fault-tolerant. Therefore, these simulations give us a lower bound on the hardware resources required to implement these quantum operations on an encoded basis. Proper analysis of realistic error propagation through the optical neural network can help us mitigate these issues once combined with real-time error correction. The impact of the added complexity and its potential for fault-tolerant error correction will be the subject of our future work.

Discussion

Ensuring that the neural network architecture operates at high fidelity requires strong light–matter interactions under realistic hardware conditions. This necessitates that both transitions of the three-level Λ atomic system be coupled to the cavity modes with equal cooperativity, given by:

$$C=\frac{2{g}^{2}}{\left({\kappa }_{{\rm{i}}}+{\kappa }_{{\rm{ex}}}\right)\gamma }$$
(10)

where g is the coupling rate of the atomic transitions to the cavity modes, κi is the intrinsic cavity loss rate, κex is the extrinsic coupling rate, and γ is the atomic emission rate into free space. Rosenblum et al.36 show that using ⁸⁷Rb atoms coupled to a microsphere resonator, a single photon can be deterministically extracted from an optical pulse, demonstrating the feasibility of our nonlinearity. The parameters achieved in their experiment are (g, κex, κi, γ) = (24, 40, 6.6, 3) MHz, corresponding to a cooperativity C ≈ 8.2. In Sec. 4 of the Supplementary Information, we plot the coupling strengths g and κ of recent experiments involving the integration of quantum emitters with optical cavities in different material platforms. These experiments place the achieved coupling rate g/(2π) in the range of ~1–10 GHz, and the cavity decay rate κ/(2π) in the range of ~10–100 GHz. This indicates that experimental devices are already operating in the range where g/κ ≈ 0.1–1, and are approaching the strong coupling regime where g/κ > 1. This suggests that the nonlinear element can be realized with near-term photonic hardware.

Single-mode operation of the nonlinear element also imposes the condition that the pulses are spectrally narrow. Specifically, the spectral width σ of the pulses is required to be much smaller than both the cavity-enhanced decay rate (2g²/κ ≫ σ) and the linewidth of the cavity (κ ≫ σ). In Sec. 2 of the Supplementary Information, we evaluate the effect of temporal mode distortions on the fidelity of the nonlinear phase gate as a function of the parameters g and κ. In the regime where the spectral width of the pulse σ/g is on the order of 0.2κ/g, the fidelity of the nonlinear phase gate exceeds 99.9%, indicating that the nonlinear dynamics are strongly confined to the single-mode subspace. To estimate physical values of the pulse width required to maintain single-mode operation, we refer to the coupling strengths g and cavity decay rates κ achieved in recent experiments, plotted in Sec. 4 of the Supplementary Information. Assuming nominal values of g ≈ 2π × 10 GHz and κ ≈ 2π × 50 GHz, and operating in the regime where 2g²/κ ≈ 10σ, the full-width half-maximum width of the pulse can be estimated to be ≈ 1 ns. Spectrally narrow pulses with temporal widths on the order of 1 ns can be readily generated using quantum emitters in a number of material platforms, including quantum dots95,96,97,98 and defect centers in diamond99,100,101 and silicon102,103,104.
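The order of magnitude of this estimate can be reproduced with a short calculation. The sketch below is a back-of-the-envelope check under assumptions that are not stated in the text: the pulse is taken to have a Gaussian field envelope exp(−(σt)²/2), so that its intensity full width at half maximum is 2√(ln 2)/σ. A different pulse shape or width convention changes the prefactor but not the scale. The function name `pulse_scales` is hypothetical.

```python
import math


def pulse_scales(g_ghz=10.0, kappa_ghz=50.0, ratio=10.0):
    """Estimate pulse bandwidth and duration from g/(2*pi) and kappa/(2*pi) in GHz.

    ratio is the assumed value of (2 g^2 / kappa) / sigma.
    """
    g = 2 * math.pi * g_ghz * 1e9          # coupling rate, rad/s
    kappa = 2 * math.pi * kappa_ghz * 1e9  # cavity decay rate, rad/s
    enhanced_decay = 2 * g**2 / kappa      # cavity-enhanced decay rate, rad/s
    sigma = enhanced_decay / ratio         # spectral width of the pulse, rad/s
    # Assumption: Gaussian field envelope exp(-(sigma * t)^2 / 2), whose
    # intensity full width at half maximum is 2 * sqrt(ln 2) / sigma.
    fwhm = 2 * math.sqrt(math.log(2)) / sigma
    return enhanced_decay, sigma, fwhm
```

With the default values, the cavity-enhanced decay rate is 2π × 4 GHz and the resulting pulse duration falls between a few hundred picoseconds and a nanosecond, which is consistent with the nanosecond scale quoted above.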

Partial distinguishability among the photons either present in the input state or arising from imperfect interaction with the nonlinear phase gate can drastically degrade the performance of the network. Such interactions may give rise to undesired correlations, including the formation of photon-bound states and higher-order extended states. These effects lead to parasitic coupling between spatio-temporal modes, which effectively act as decoherence channels and lead to the pulses acquiring incorrect phases. Modeling and mitigating the impact of these multimode dynamics remains an important avenue for future work.

In summary, the architecture we propose is remarkably versatile in its ability to manipulate bosonic states and implement universal logical operations on encoded bases. We provide a blueprint for a programmable optical nonlinearity that also serves as a method to perform non-demolition measurements of the total photon number. Consequently, this enables the preparation and control of bosonic error-correcting codes through non-transversal gates, as well as correcting photon loss errors. We analyze the performance of the architecture (including encoding and logical operations) under coherent beam-splitter error, although we anticipate the dominant source of error will be photon loss.

The core element of this architecture is the optical nonlinearity, which is based on passive interactions of light with a three-level Λ system. With appropriately selected hardware parameters, this scheme overcomes a limitation commonly faced by traditional optical nonlinearities by achieving asymptotically perfect single-mode operation. This device can therefore be used to construct deterministic high-fidelity photonic gates. Moreover, the architecture proposed above is not constrained by any symmetries imposed by evolution under a given system Hamiltonian, such as a χ(2) or χ(3) process2,3,7,105. This suggests that the network is well suited to compute on encoded bases, particularly for applications involving bosonic error-correcting codes. This also opens up avenues for exploring alternative architectures for realizing non-traditional beyond-Clifford logical gates that are hardware efficient for photonic logical qubits.

Advances in integrated photonics now allow the monolithic integration of quantum photonic components106. While building a complete neural network processor on a single chip is experimentally challenging107, our estimates show that the individual components can be realized with present-day hardware. Further development of this approach could help enable large-scale, fault-tolerant photonic quantum computation, simulation, sensing, and communication.

Methods

Computational techniques

The quantum optics simulations in this manuscript were performed with custom code written in Python. We rely on gradient-based optimization techniques that minimize the cost functions discussed in previous sections of the manuscript. Our digital models are therefore automatically differentiable and are programmed using the open-source JAX and Optax libraries. We determined empirically that the optimization routine converges best with a learning rate of 0.025 that is annealed exponentially to 0.001. Simulations of multimode behavior to analyze the fidelity of the optical nonlinearity were performed using the Diffrax package.
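The annealing of the learning rate can be illustrated with a plain-Python schedule. This is a sketch under assumptions: the text specifies only the start and end values and the exponential form, so the total number of optimization steps (`total_steps`) and the function name are hypothetical. Optax provides schedule utilities such as `optax.exponential_decay` that could play the same role; the text does not state which schedule implementation was used.

```python
def annealed_learning_rate(step, total_steps, lr_start=0.025, lr_end=0.001):
    """Exponentially interpolate the learning rate from lr_start to lr_end.

    total_steps is a hypothetical training length; the text does not give it.
    """
    # Clamp the progress to [0, 1] so the rate stays at lr_end after training.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return lr_start * (lr_end / lr_start) ** frac
```

The schedule starts at 0.025, decreases by a constant factor per step, and reaches 0.001 at `step == total_steps`.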

Simulation parameters

All of the neural network training performed in this manuscript was unsupervised. The linear layer of the architecture was parameterized by the phases in the Clements configuration, while the nonlinear layer was parameterized by a vector of phases, determining the phases of the nonlinear sign gate. All of the parameters of the digital model were initialized uniformly at random. Errors in the splitting ratio of the beam-splitters in the linear layer of the network are assumed to be sampled from a Gaussian distribution \({\mathcal{N}}(0,\sigma )\). To obtain reliable estimates of the state and channel fidelities, circuits were sampled \(10^{4}\) times.
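The Monte Carlo treatment of splitter errors can be illustrated on a single beam splitter. The following is a toy sketch, not the network simulation used in the paper, and it rests on assumptions: the splitter is modeled as a 2×2 rotation by θ = π/4 + δ, σ is read as the standard deviation of the angle error δ, and the fidelity is taken as |Tr(U_ideal† U)|²/4. The sample count is a free parameter, and the function name is hypothetical.

```python
import math
import random


def mean_splitter_infidelity(sigma, n_samples=10_000, seed=0):
    """Average infidelity of a nominally 50:50 beam splitter with angle errors.

    Assumption: the splitter is the 2x2 rotation by theta = pi/4 + delta with
    delta ~ N(0, sigma), and fidelity is |Tr(U_ideal^dagger U)|^2 / 4.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(n_samples):
        delta = rng.gauss(0.0, sigma)
        # For 2x2 rotations, Tr(R(pi/4)^T R(pi/4 + delta)) = 2 cos(delta),
        # so the fidelity reduces to cos(delta)^2.
        fidelity = math.cos(delta) ** 2
        total += 1.0 - fidelity
    return total / n_samples
```

For small σ the mean infidelity of this toy model scales as σ², so increasing σ from 0.001 to 0.01 raises the single-splitter infidelity by roughly two orders of magnitude.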