Introduction

Harnessing usefulness from current noisy intermediate-scale quantum (NISQ)1 computers has emerged as the main objective of the quantum computing community. Variational quantum algorithms (VQAs) are the leading candidates to achieve this goal, making use of the limited quantum resources that existing NISQ computers offer2. These hybrid algorithms aim to solve computationally demanding tasks with enhanced efficiency by synergistically combining classical and quantum computational resources. In the past few years, VQAs have demonstrated significant potential in addressing numerous challenges in contemporary science, such as problems involving many-body quantum Hamiltonians3,4, quantum chemistry5,6,7, combinatorial optimization8,9, and others10,11.

Currently, most research efforts are focused on encoding discrete optimization problems using qubit-based approaches, which are well-suited for implementation in superconducting circuits and trapped ions. In contrast, photonic quantum computing (PQC) incorporates the continuous variable (CV) formalism, which allows quantum information to be encoded in the quadrature amplitudes of the electromagnetic field12, with the field modes known as qumodes. This encoding provides benefits in representing continuous optimization problems that are expensive to encode with qubits. Recently, there has been a growing interest in developing CV quantum algorithms for solving various problems in the PQC paradigm. For instance, a CV-based quantum approximate optimization algorithm (CV-QAOA) has recently been proposed and benchmarked with the minimization of the non-convex Styblinski-Tang function13. Likewise, a CV adiabatic quantum algorithm was also proposed that investigated mixed-integer programming problems using Fock encoding14. In addition, efforts have been made to encode graph problems, imaginary-time evolution for quantum field theories, Grover search, and instantaneous quantum polynomial-time circuits over continuous spaces15,16,17,18. Apart from optimization, the CV regime is also utilized as a tool for error-correcting codes19,20 and for quantum state learning21.

One of the challenges encountered while implementing adiabatic algorithms using the CV approach is the inherent bottleneck arising from the quadratic nature of Gaussian operations in phase space. Generally, complex optimization problems require non-Gaussian operations and intricate gate decompositions for effective time evolution. The experimental feasibility of these algorithms in solving high-degree problems is limited, as recent experiments have only been able to solve simple quadratic functions of x in the case of CV-QAOA22. Therefore, there is a pressing need for algorithms that rely only on low-degree operations while still enabling experimental exploration of complex high-degree problems using near-term photonic devices.

In this article, we propose the photonic counterdiabatic quantum optimization (PCQO) algorithm to address this challenge. PCQO is a hybrid quantum-classical algorithm designed to solve problems suitable for currently available photonic devices by utilizing a circuit ansatz and a classical optimization routine. This circuit ansatz is designed from a pool of Gaussian and non-Gaussian operations that are obtained by drawing inspiration from counterdiabatic (CD) protocols23. These CD protocols, known as shortcuts-to-adiabaticity methods24,25, accelerate the adiabatic process and circumvent the typically slow evolution mandated by the adiabatic theorem24. Previous applications of these methods have demonstrated substantial improvements in QAOA and digitized adiabatic evolution26,27,28,29,30. Moreover, PCQO falls under the category of problem-inspired ansätze, making it more favorable than random ansätze, which face trainability issues as the system size scales.

We investigate the performance of this algorithm for (a) phase-space encoding, which encodes information in the \(\hat{x}\) quadrature, and (b) Hilbert-space encoding, which represents information using Fock states \(\left\vert n\right\rangle\). The former includes classical non-convex continuous optimization problems31,32 and the latter includes integer programming problems33. In addition, we show that PCQO outperforms state-of-the-art quantum algorithms such as CV-QAOA in terms of performance and implementability. Lastly, we provide considerations for implementing PCQO with NISQ devices and show a proof-of-principle experiment that solves a simple problem in Xanadu’s eight-mode photonic quantum computer34.

Results

The algorithm

To design the algorithm, we start with an adiabatic quantum Hamiltonian Ha(t), given by

$${H}_{a}(t)=(1-\lambda (t)){H}_{m}+\lambda (t){H}_{p},$$
(1)

where λ(t) is a scheduling function such that λ(0) = 0 and λ(T) = 1, and T is the total evolution time. Hp is a problem Hamiltonian whose ground state we need to find and Hm is a mixer Hamiltonian, such that [Hp, Hm] ≠ 0. According to the adiabatic theorem, if the system is prepared in an eigenstate of Hm, it remains in the instantaneous eigenstate during the evolution, provided that the evolution is slow such that \(| \,\dot{\lambda }\,| \, \ll \, 1\). The difficulty with adiabatic algorithms is precisely this requirement of slow evolution: if the evolution is not sufficiently slow, diabatic transitions will reduce the probability of finding the system in the ground state of Hp. To achieve fast evolution, CD protocols are employed. Here, the task is to add velocity-dependent terms \({A}_{\lambda }^{(l)}\) to suppress the non-adiabatic transitions. This results in the Hamiltonian Hcd given by

$${H}_{{\rm{cd}}}(t)={H}_{a}(t)+\dot{\lambda }(t){A}_{\lambda }^{(l)}.$$
(2)

The calculation of the exact CD term requires full spectral information35,36. This information may not always be available hence approximate CD terms can be used instead37. One of the ways to obtain these terms is to utilize adiabatic gauge potentials that can be calculated using the nested commutator method38 given by

$$A_{\lambda}^{(l)} = i\mathop{\sum}\limits_{k=1}^{l}\alpha_{k}(t)\underbrace{[H_{a},[H_{a},\ldots,[H_{a},}_{2k-1}\partial_{\lambda}H_{a}]\ldots]],$$
(3)

where l is the order of expansion and αk(t) are the CD coefficients that need to be optimized. In principle, the optimization of the CD coefficients can be performed analytically38. Since this guarantees the existence of an optimal protocol, one can use variational circuits for numerical optimization as well39.
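As an illustration of Eq. (3), the following sketch evaluates the first nested commutator \(i[{H}_{a},{\partial }_{\lambda }{H}_{a}]\) numerically in a truncated Fock basis, for the toy choice \({H}_{m}={\hat{p}}^{2}\) and \({H}_{p}={\hat{x}}^{2}\) (an illustrative assumption, not one of the benchmark problems considered later) and the \(\hbar =2\) convention. For this pair, the commutator reproduces, up to a prefactor, the squeezing generator \(\hat{x}\hat{p}+\hat{p}\hat{x}\) that appears in the operator pools used later; the comparison is restricted to an interior block of the matrices to avoid truncation artifacts at the cutoff.

```python
import numpy as np

hbar = 2.0
D = 40                                    # Fock-space cutoff
a = np.diag(np.sqrt(np.arange(1, D)), 1)  # annihilation operator
X = a + a.T                               # x quadrature (hbar = 2 convention)
P = -1j * (a - a.T)                       # p quadrature

lam = 0.3                                 # any fixed value of lambda
H = (1 - lam) * P @ P + lam * X @ X       # H_a with H_m = p^2, H_p = x^2
dH = X @ X - P @ P                        # dH_a / dlambda

G = 1j * (H @ dH - dH @ H)                # first nested commutator i[H_a, dH_a]
expected = 2 * hbar * (X @ P + P @ X)     # analytically, i[H_a, dH_a] = 2*hbar*(xp + px)

B = slice(0, D - 8)                       # interior block, away from the cutoff
err = np.abs(G[B, B] - expected[B, B]).max()
```

The result is independent of λ for this harmonic pair, which is why the squeezing-type term survives as a low-order pool element.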

Now, let us assume that there exists a λ such that \(| \,\dot{\lambda }\,| \, \gg \, 1\) at the beginning of the evolution. Under this condition, the Ha term in Eq. (2) can be neglected, since the dominant contribution comes from the CD terms \({A}_{\lambda }^{(l)}\). By introducing a rapid change in the scheduling function at the onset of the effective evolution, we can approach an approximate ground state quickly due to the CD protocols. Consequently, there is no need for the ansatz to capture the evolution behavior near t = T. Thus, we can devise a parameterized Trotterized evolution of the form

$$U({\boldsymbol{\theta }})=\mathop{\prod }\limits_{k=1}^{p}{U}_{{\rm{cd}}}({{\boldsymbol{\theta }}}_{k}),$$
(4)

where \({U}_{{\rm{cd}}}({\boldsymbol{\theta }})={e}^{-i{\sum }_{\ell }{\theta }_{\ell }{{\mathcal{A}}}_{\ell }}\), which corresponds to the sum over the elements in \({\mathcal{A}}\) weighted by the parameters θ. \({\mathcal{A}}\) is a set of operators obtained by Eq. (3) with low order of expansion and θk is a set of tunable parameters for the k-th layer. This method has been shown to reduce the circuit depths drastically as compared to other adiabatic algorithms in qubit-based technologies40.

The main motivation for this work is to develop hybrid CD-inspired protocols for CV systems, specifically PQC. The expected advantages are two-fold. Firstly, these methods should reduce the required number of operations. Since photonic operations such as beam-splitters and interferometers are imperfect, this reduction should substantially improve performance. Secondly, at a finite order l, we obtain a pool of CD operators from which we can choose suitable operations that are natively available in the device. This makes the algorithm more flexible for NISQ devices.

Based on the gates available in qumode-based architectures, a bottleneck arises when implementing evolutions of degree d ≥ 3, as all the Gaussian operations are quadratic in the phase-space quadratures (see “Methods”). Even for d = 3, implementing a cubic-phase gate is required, which is a non-Gaussian operation. For d > 3, we have to decompose the higher-degree operations in terms of available gates41. This challenge applies to both phase-space and Hilbert-space encodings. Consequently, there is a need for an algorithm that can efficiently handle functions of any degree using a low number of gates, making it feasible for implementation on currently available photonic devices. As we will see later, this algorithm will also allow us to attempt complicated problems without the need for decompositions. This is crucial since decompositions are non-trivial and resource-intensive in the PQC regime.

Now, we devise a CD-inspired algorithm for PQC. To do so, we start by considering a problem Hamiltonian Hp. This can be a function of \({\bf{x}}=({x}_{1},\ldots ,{x}_{N})\in {{\mathbb{R}}}^{N}\) or a function of \({\bf{n}}=({n}_{1},\ldots ,{n}_{N})\in {{\mathbb{W}}}^{N}\), depending upon the encoding. Then we define a mixer Hamiltonian Hm whose ground state is easy to prepare and which satisfies the condition [Hp, Hm] ≠ 0. With Hm and Hp, we now have Ha given by Eq. (1). From this, we can compute \({A}_{\lambda }^{(l)}\) using Eq. (3) with a specific order l to get a pool of operators. We heuristically select operators from this pool and parameterize them. This selection is made based on the requirements of the problem and the hardware. The unitaries corresponding to these operators (Eq. (4)) can be implemented as gates on the photonic hardware. Some of the widely implementable gates are shown in Table 1. Starting from a specific initial state, these parameters can be optimized by classical optimization routines to minimize a suitable target cost function. A schematic diagram of the algorithm is shown in Fig. 1.
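The hybrid loop described above can be sketched end-to-end for a minimal single-mode, single-parameter example. The problem \(F(x)={(x-1)}^{2}\), the single X-gate (momentum-generated displacement) ansatz, and the use of SciPy's Nelder-Mead optimizer are illustrative assumptions rather than the authors' implementation; the state is simulated in a truncated Fock basis with \(\hbar =2\).

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

D = 25                                    # Fock-space cutoff
a = np.diag(np.sqrt(np.arange(1, D)), 1)  # annihilation operator
X = a + a.T                               # x quadrature (hbar = 2 convention)
P = -1j * (a - a.T)                       # p quadrature
vac = np.zeros(D, dtype=complex)
vac[0] = 1.0                              # vacuum initial state

F = lambda x: (x - 1.0) ** 2              # toy problem: minimum at <x> = 1

def cost(theta):
    # p = 1 ansatz: a single X gate exp(-i theta p / 2) shifts <x> by theta
    psi = expm(-1j * theta[0] * P / 2) @ vac
    mean_x = np.real(np.conj(psi) @ (X @ psi))
    return F(mean_x)                      # relaxed cost F(<x>)

res = minimize(cost, x0=[0.2], method="Nelder-Mead")
```

The classical optimizer drives \(\langle \hat{x}\rangle\) toward 1, mirroring the encode-optimize-decode cycle of Fig. 1 at a toy scale.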

Fig. 1: A schematic diagram illustrating the p = 1 photonic counterdiabatic quantum optimization (PCQO) algorithm.
figure 1

a Encoding phase: a logical problem, denoted as Hp, is encoded into F(n) or F(x) based on the problem type. The mixer Hm is introduced satisfying the non-commutativity relation, enabling the generation of the operator pool using the nested commutator method. b Processing phase: the operators from \({\mathcal{A}}=\{{A}_{\lambda }^{(2)}\}\) are exponentiated and employed as a circuit ansatz Ucd(θ) with adjustable parameters θ = {θ1, θ2, …, θQ}, where Q denotes the total number of parameters. The algorithm initiates with random parameter values and iteratively updates them through classical optimization, aiming to minimize F(〈n〉) or F(〈x〉) until convergence is achieved. c Decoding phase: performing measurements and extracting solutions from the minimum values, \({F}_{\min }(\langle {\bf{n}}\rangle )\) or \({F}_{\min }(\langle {\bf{x}}\rangle )\), enables the representation of solutions in the form of the mean photon number 〈n〉 or the mean quadrature values 〈x〉.

The target cost function that should be employed for phase-space encoding is 〈F(x)〉. However, this involves measuring the expectation values of highly non-local operators, which is a challenging task. Conversely, one can also use a relaxed version of these problems by defining the cost function as F(〈x〉). In this case, we would only need information about the expectation values \(\langle {\hat{x}}_{i}\rangle\) of local operators. This provides a useful approach for hybrid algorithms seeking approximate solutions. By considering the mean values along with the optimal state, the cost function takes into account the probability of obtaining near-optimal states. This choice is driven by the fact that the latter involves the measurement of expectation values for only local operators, which is more manageable than performing multi-qumode measurements or state tomography. Moreover, this cost function can exhibit robustness against noise since it relies on the mean of the expectation values of local operators rather than the specific state itself. This robustness can be advantageous in practical implementations where noise and imperfections are inevitable. For example, if we want to solve \(F(n)={(n-1)}^{2}\), the circuit ansatz will optimize the parameters such that \(\langle \hat{n}\rangle =1\). Hence, along with \(\left\vert n=1\right\rangle\), the states \(\left\vert n=0\right\rangle\) and \(\left\vert n=2\right\rangle\) will also have finite probabilities.
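The gap between the relaxed and exact cost functions can be seen in a small numerical check for the \(F(n)={(n-1)}^{2}\) example, using a hypothetical photon-number distribution peaked at n = 1:

```python
import numpy as np

n = np.array([0, 1, 2])
probs = np.array([0.25, 0.5, 0.25])  # hypothetical photon-number distribution
F = lambda n: (n - 1) ** 2

mean_n = probs @ n      # <n> = 1
relaxed = F(mean_n)     # relaxed cost F(<n>) = 0: the relaxation is satisfied
exact = probs @ F(n)    # exact cost <F(n)> = 0.5: penalizes the n = 0, 2 tails
```

The relaxed cost vanishes even though only half of the probability mass sits on the true solution, which is precisely the trade-off discussed above.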

Phase-space encoding

We examine the performance of the algorithm by applying it to two non-convex classical optimization problems F(x), both represented as polynomials of degree d. For both problems, the initial state preparation involves setting all the qumodes to vacuum states. The mixer Hamiltonian \({H}_{m}={\sum }_{i}{({\hat{p}}_{i}-{p}_{0})}^{2}\) was selected, where p0 is a constant. This resembles the kinetic part of a harmonic oscillator whose potential part would be F(x). This mixer satisfies the non-commutativity condition with F(x). The next step involves finding the operator pool \({\mathcal{A}}\) by evaluating \({A}_{\lambda }^{(l)}\), which enables us to choose suitable parameterized gates from \({\mathcal{A}}\) as a circuit ansatz. We utilize homodyne measurements to compute 〈x〉. Since this is a numerical analysis, we have to define a cutoff dimension of the Fock space, which was set to D = 15 for the toy function and D = 10 for the Rosenbrock function due to computational limitations; we also set \(\hbar =2\). We considered two polynomials, one with d = 4 and N = 4 variables and the other with d = 6 and N = 3 variables.

We start with the Rosenbrock function31, defined as

$$F({\bf{x}})=\mathop{\sum }\limits_{i=1}^{N-1}\left[100{({x}_{i+1}-{x}_{i}^{2})}^{2}+{(1-{x}_{i})}^{2}\right]$$
(5)

where \({\bf{x}}=({x}_{1},\ldots ,{x}_{N})\in {{\mathbb{R}}}^{N}\). Due to its non-convex nature and the difficulty of reaching its global optimum, this function is widely used as a benchmark for optimization algorithms. We consider the N = 4 case with global minimum \({F}_{\min }({{\bf{x}}}_{{\rm{opt}}})=0\) at xopt = (1, 1, 1, 1). We limit ourselves to \({A}_{\lambda }^{(2)}\) and get \({\mathcal{A}}=\{\hat{p},\,\hat{x}\hat{p}+\hat{p}\hat{x},\,{\hat{x}}^{3},\ldots \}\). This pool contains many higher-order terms, but we heuristically select the first three, which correspond to the X gate, S2 gate, and V gate (see “Methods”). This choice was made to include single-mode Gaussian, two-mode Gaussian, and non-Gaussian gates. Since we will perform Gaussian measurements, these non-Gaussian gates ensure that the ansatz cannot be efficiently simulated classically42. Since S2 is a two-mode gate, we have to decide the connectivity. For simplicity, we keep this connectivity to the nearest neighbors, but this can be further fine-tuned based on hardware constraints. Therefore, the circuit ansatz will have X gates applied to all qumodes, S2 gates applied to nearest-neighbor qumodes, and V gates applied to all qumodes, and thus the number of parameters required will be Q = 2N + (N − 1) = 3N − 1.
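As a classical sanity check of this benchmark (not part of the quantum algorithm), SciPy ships the Rosenbrock function; the snippet below verifies the global minimum for N = 4 and the parameter count Q = 3N − 1 of the ansatz described above. The starting point for the classical baseline is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

N = 4
x_opt = np.ones(N)   # known global minimizer (1, 1, 1, 1), with F = 0

# classical gradient-based baseline started near the global basin
res = minimize(rosen, x0=np.full(N, 0.9), jac=rosen_der, method="BFGS")

# ansatz parameter count: X and V gates on N modes, S2 on N - 1 pairs
Q = 2 * N + (N - 1)  # = 3N - 1 = 11 for N = 4
```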

The second benchmark function is a toy problem given by

$$F({\bf{x}}) = \, {({x}_{1}^{3}+{x}_{2}^{3}+{x}_{3}^{3}-{x}_{1}+2{x}_{2}-3{x}_{3})}^{2}\\ + {(-{x}_{1}+{x}_{2}+{x}_{3})}^{2}+0.01{x}_{1}+0.01{x}_{2}+0.01{x}_{3}.$$
(6)

For this problem, the solution is \({F}_{\min }({{\bf{x}}}_{{\rm{opt}}})= -0.028457\) at xopt = (−1.42212, −0.127017, −1.29723). This function is selected because its degree is much larger than the degree of the operations implementable in current technologies (see “Methods”). Furthermore, this function forces correlations between the variables, so the problem cannot be solved independently for each variable via a greedy algorithm. The linear terms with low coefficients ensure that the global minimum is reached only at a single point. This function gives the same lower-order operators in the \({\mathcal{A}}\) pool as the Rosenbrock function, hence we implement the same circuit ansatz as before. In both cases, the mean values of F(〈x〉) (termed energy) over 100 random initializations across several iterations for p = 1 layer of the ansatz are computed, and the results are shown in Fig. 2 along with the standard error. The energy variation corresponding to the best instance is also plotted. For parameter optimization, we have used the Adam optimizer43.
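The quoted optimum of the toy function can be checked directly; the following is a plain numerical evaluation of Eq. (6) at the reported minimizer.

```python
def F(x1, x2, x3):
    # toy benchmark of Eq. (6): degree-6 polynomial with coupled variables
    return ((x1**3 + x2**3 + x3**3 - x1 + 2 * x2 - 3 * x3) ** 2
            + (-x1 + x2 + x3) ** 2
            + 0.01 * (x1 + x2 + x3))

x_opt = (-1.42212, -0.127017, -1.29723)
f_min = F(*x_opt)   # close to the reported value -0.028457
```

At the minimizer the two squared terms are nearly zero, so the optimal value is dominated by the small linear contribution, as described above.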

Fig. 2: Energy (values of F(〈x〉)) profiles as a function of iterations.
figure 2

a For the Rosenbrock function and b for the toy function (Eq. (6)) with a p = 1 ansatz for 100 different initial parameters. Solid lines show the mean energies, shaded regions depict standard error and the best instance energy profile is shown by starred and dotted lines, respectively. The inset depicts a histogram of the minimum energies achieved at the end of the iterations, with the best five instances.

Figure 2a illustrates the convergence of the mean energy over 200 iterations for the Rosenbrock function with some iterations skipped to disregard initial fluctuations. It can be observed that the mean energy is slightly higher than the exact energy required to solve the problem. This indicates that the algorithm’s performance is affected by the initial parameters chosen. In the best-instance run, the algorithm achieves the exact solution. The inset shows the minimum energy achieved during the optimization with the best five instances. We can see that, for these instances, the algorithm results in solutions very close to the exact solution. This demonstrates that the algorithm performs well if the objective is to obtain approximate solutions.

Figure 2b displays the convergence of the mean energy over 200 iterations for the toy function in Eq. (6). It can be observed that the mean energy converges close to the exact energy within the first 200 iterations. Initially, there is a high standard error, but in later iterations, the algorithm successfully finds approximate solutions. Similar to the previous case, the inset plot shows the lowest energy at the last iteration. We can observe that for the best initial parameters, the energy gets very close to the exact energy. The best instances also give information about the ansatz's performance, such as reachability and trainability. For instance, we can observe that with a good initialization, this ansatz is efficient in reaching the solution. This further validates the performance of this algorithm for finding approximate solutions. In conclusion, the algorithm can perform well even with random initialization for obtaining approximate solutions. It is important to note that the resource efficiency of our designed ansatz is independent of the chosen cost function. Even if a global cost function is selected for phase-space encoding, the advantages described above will still apply.

The results demonstrate that the proposed PCQO algorithm can obtain good approximate solutions with just p = 1 layer of the ansatz. This algorithm can be extended to handle polynomials of any degree, and it would be intriguing to explore its performance on higher-degree polynomials. In addition, investigating problems with a larger number of variables would provide further insights into the algorithm’s capabilities and potential applications.

Despite the advantages, some challenges need to be addressed. For instance, clever optimization techniques need to be developed to find the optimal parameters faster. Strategies to efficiently choose the circuit ansatz from \({\mathcal{A}}\) ensuring trainability and expressibility need to be developed since there is no performance guarantee of the heuristic choice of ansatz. Nevertheless, PCQO works extremely well for phase-space encoding.

To tackle these problems with qubit-based algorithms, the system size depends upon the bit resolution that needs to be achieved. This leads to large resource requirements, and the cost functions become many-body Ising Hamiltonians44. Since the requirement to discretize the solution is lifted here, arbitrary precision can be reached using the PCQO algorithm with an encoding that is linear in the number of variables. However, this precision will be limited by the accuracy of the measurement apparatus. For instance, homodyne measurements can achieve an efficiency of 69% in general setups45, while for photon-number-resolving measurements, efficiencies up to 92.5% can be obtained46. As we are interested in the expected value, the variance and the measurement errors can be mitigated by increasing the number of experiment repetitions. These advantages make PCQO far more suitable for tackling continuous optimization problems.

Comparison with CV-QAOA

In QAOA47, two non-commuting unitaries called the mixer term Ub(β) and the Hamiltonian term Uc(γ) are applied iteratively for p layers to an initial state \(\left\vert {\psi }_{0}\right\rangle\), which is the ground state of Hm. γ and β are optimizable parameters, and the parameterized unitary looks like

$$U(\overrightarrow{\gamma },\overrightarrow{\beta })=\mathop{\prod }\limits_{k=1}^{p}{U}_{b}({\beta }_{k}){U}_{c}({\gamma }_{k}).$$
(7)

where \({U}_{b}(\beta )={e}^{-i\beta {H}_{m}}\) and \({U}_{c}(\gamma )={e}^{-i\gamma {H}_{p}}\). The mixer Hamiltonian Hm can take various forms depending on the specific requirements of the problem being solved48. In CV-QAOA, we define \({H}_{m}={\sum }_{i}{p}_{i}^{2}\) and prepare the initial state as a squeezed state for all qumodes with squeezing parameter r = 1, following ref. 13. The corresponding mixer term is given by \({U}_{b}(\beta )=\exp (-i\beta {\sum }_{i}{p}_{i}^{2})\), which can be implemented using a custom gate Pz defined as

$${P}_{z}(s)=R\left(-\frac{\pi }{2}\right)P(s)R\left(\frac{\pi }{2}\right),$$
(8)

where the R gates act as a Fourier transform, rotating the position quadrature into the momentum quadrature.
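The gate identity in Eq. (8) can be verified numerically in a truncated Fock basis. The quadratic-phase convention \(P(s)={e}^{is{\hat{x}}^{2}/2}\), rotation \(R(\theta )={e}^{i\theta \hat{n}}\), and \(\hbar =2\) scaling below are our assumptions (normalizations differ between platforms); under them, conjugation by the rotations maps \(\hat{x}\to -\hat{p}\) exactly, so \({P}_{z}(s)={e}^{is{\hat{p}}^{2}/2}\) holds even at finite cutoff.

```python
import numpy as np
from scipy.linalg import expm

D = 12                                    # Fock-space cutoff
a = np.diag(np.sqrt(np.arange(1, D)), 1)  # annihilation operator
X = a + a.T                               # x quadrature (hbar = 2 convention)
P = -1j * (a - a.T)                       # p quadrature
N = np.diag(np.arange(D)).astype(complex) # number operator

def R(theta):      # rotation gate
    return expm(1j * theta * N)

def Pgate(s):      # quadratic phase gate exp(i s x^2 / 2)
    return expm(1j * s * X @ X / 2)

s = 0.7
Pz = R(-np.pi / 2) @ Pgate(s) @ R(np.pi / 2)  # Eq. (8)
target = expm(1j * s * P @ P / 2)             # momentum-quadrature phase gate
err = np.abs(Pz - target).max()
```

Because the rotations are diagonal in the Fock basis, the conjugation identity is exact at any cutoff, and `err` is limited only by floating-point precision.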

One main motivation for the proposed algorithm is the efficient utilization of resources available in current photonic devices. The resources required to implement CV-QAOA on a gate-based photonic platform, assuming access to the gates mentioned in Table 1, are rather large. For instance, to decompose the Rosenbrock function as given by Eq. (5), one needs to implement \({U}_{c}({\gamma }_{k})={e}^{-i{\gamma }_{k}{x}^{4}}\). The exact decomposition of this gate requires 29 gates, of which 15 are non-Gaussian cubic phase gates, for each mode41. For N = 4, it will require 116 gates to implement this unitary. Similarly, for the toy problem we considered in Eq. (6), we have to decompose \({U}_{c}({\gamma }_{k})={e}^{-i{\gamma }_{k}{x}^{6}}\) using the identity derived in ref. 41,

$${e}^{-i\gamma {x}_{k}^{6}}={e}^{2i{p}_{j}{x}_{k}^{3}}{e}^{-i\gamma {x}_{j}^{2}}{e}^{-2i{p}_{j}{x}_{k}^{3}}{e}^{i\gamma {x}_{j}^{2}}{e}^{2i\gamma {x}_{j}{x}_{k}^{3}},$$
(9)

where the last term \({e}^{2i\gamma {x}_{j}{x}_{k}^{3}}\) will require 135 cubic phase gates for the exact decomposition. The term \({e}^{2i{p}_{j}{x}_{k}^{3}}\) needs to be further expanded as,

$${e}^{2i{\gamma }^{2}{p}_{k}{x}_{j}^{3}}={e}^{2i\gamma {x}_{j}{x}_{k}}{e}^{-i\gamma {x}_{j}^{2}{p}_{k}^{2}}{e}^{-2i\gamma {x}_{j}{x}_{k}}{e}^{i\gamma {x}_{j}^{2}{p}_{k}^{2}}{e}^{-2i{\gamma }^{3}{x}_{j}^{4}}.$$
(10)

Note that the last term of this expansion is \({e}^{-2i{\gamma }^{3}{x}_{j}^{4}}\), a quartic phase that will require 29 gates as mentioned before.

In conclusion, CV-QAOA requires a huge overhead of resources when tackling CV optimization problems, which makes a direct comparison with the PCQO algorithm under phase-space encoding impractical. Instead, we compare it with CV-QAOA for problems with Hilbert-space encoding, where we expect the number of layers required to achieve approximate solutions to be lower for PCQO than for QAOA, which requires large circuits to achieve convergence.

In Hilbert-space encoding, the variables are represented by Fock states \(\left\vert n\right\rangle\). Similar to the phase-space encoding, the initial state of all the qumodes is prepared in the vacuum state. The mixer we selected is \({H}_{m}={\sum }_{i}\,{({x}_{i}-{x}_{0})}^{2}+{({p}_{i}-{p}_{0})}^{2}\), where x0 and p0 are constants. This mixer resembles a shifted harmonic oscillator in both quadratures. Instead of homodyne measurements, photon-number-resolving measurements are used to determine the values of 〈n〉. These measurements provide information about the number of photons in each mode, which is essential in the Hilbert-space encoding.

We compare both algorithms using a small instance of the unbounded knapsack problem (UKP)49 as a testbed. In UKP, we consider a set of different types of items i, each with a value vi and weight wi. The objective is to maximize the total value while ensuring that the total weight of the selected items does not exceed the capacity C of the knapsack. Unlike the bounded knapsack problem, the UKP allows for an unlimited number of items of the same type to be included in the knapsack. If the number of items of type i that can be included in the knapsack is given by ni, then the optimization problem looks like

$$\mathop{\min }\limits_{{n}_{i}}\,-\mathop{\sum }\limits_{i=1}^{N}{v}_{i}{n}_{i}\quad \,{\text{subject}} \, {\text{to}}\,\quad \mathop{\sum }\limits_{i=1}^{N}{w}_{i}{n}_{i}\le C,\,{n}_{i}\ge 0\,\forall i,$$
(11)

where N denotes the total number of item types. This problem is classified as an integer linear programming problem, where both the cost function and the constraints are linear functions of ni. It can be converted into the minimization problem

$$F({\bf{n}})=-\mathop{\sum }\limits_{i=1}^{N}{v}_{i}{n}_{i}+\delta \mathop{\sum }\limits_{i=1}^{N}{({w}_{i}{n}_{i}-C)}^{2}.$$
(12)

Here, δ represents the penalty term. Usually, an auxiliary variable is added to account for the inequality. However, as this algorithm is aimed at finding approximate solutions, we add the inequality as a ‘soft’ constraint in the problem Hamiltonian.
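For a concrete picture of Eq. (12), a tiny UKP instance can be brute-forced classically; the values, weights, capacity, and δ below are hypothetical and not taken from the paper's benchmarks.

```python
import itertools

v = [2, 3, 4]   # item values (hypothetical instance)
w = [1, 2, 3]   # item weights
C = 5           # knapsack capacity
delta = 4       # penalty strength

def F(n):
    # penalized cost of Eq. (12): -total value + delta * (weight - C)^2
    value = sum(vi * ni for vi, ni in zip(v, n))
    weight = sum(wi * ni for wi, ni in zip(w, n))
    return -value + delta * (weight - C) ** 2

# exhaustive search over feasible item counts 0 <= n_i <= floor(C / w_i)
ranges = [range(C // wi + 1) for wi in w]
best = min(itertools.product(*ranges), key=F)
```

Here the penalty form recovers the constrained optimum n = (5, 0, 0), which fills the knapsack exactly; a too-small δ could instead favor constraint-violating configurations.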

As for the cost function for the tests, we have employed both the exact cost function and its relaxed version. We remark here that when we use the relaxed version F(〈n〉) as a cost function, we lose the integrality of the problem. This is popularly known as integer relaxation. In some cases, as for integer linear programming problems, this relaxation makes the relaxed problem solvable in polynomial time50. Hence, using this cost function might not be particularly useful for tackling integer problems like the UKP. Nevertheless, to show the advantages of ansatz design using PCQO, we employ these problems as a testbed for our method. Furthermore, this relaxation allows us to extract the information needed using fewer shots than would be required to obtain the tomographic information needed to compute 〈F(n)〉. This makes the relaxation a viable strategy for systems with limited quantum resources. To compare the performance of the F(〈n〉) and 〈F(n)〉 cost functions, we run simulations of both for manageable system sizes. However, in future works, the exact approach should be favored to obtain more competitive results.

To benchmark against the best-known CV quantum algorithm, we compare the performance of the PCQO algorithm with two variants of CV-QAOA for an N = 3 qumode UKP case. In PCQO, we considered the l = 2 order nested commutator, which results in \({\mathcal{A}}=\{\hat{x},\,\hat{p},\,\hat{x}{\hat{p}}^{2},\,{\hat{x}}_{i}{\hat{x}}_{j},\ldots \}\). From this, we choose \(\hat{x}\) and \({\hat{x}}_{i}{\hat{x}}_{j}\), whose exponentiation corresponds to X gates and CZ gates, respectively. Thus the circuit ansatz consists of X gates applied to all qumodes and CZ gates applied to nearest-neighbor connections. Hence, the number of parameters required will be Q = N + (N − 1) = 2N − 1. The cutoff dimension was chosen as D = 10, the penalty term was set to δ = 4 for the local cost function and δ = 1 for the global cost function, and we kept \(\hbar =2\).

Regarding parameterization, conventional QAOA has one parameter per unitary, so for the p = 1 layer, we have Q = 2. However, in PCQO, we set one parameter per gate, resulting in Q = 5 parameters for the p = 1 layer because of the nearest-neighbor two-mode gates. To ensure a fair comparison, we also consider a variant of QAOA called MA-QAOA, where each gate has its own free parameter51. For the p = 1 layer, this leads to Q = 10 parameters due to the all-to-all connected two-mode gates. For classical optimization, we have implemented the Adam optimizer in all cases.

In Fig. 3a, we show the energy as a function of iteration steps using the relaxed cost function F(〈n〉) for the p = 1 layer in three different algorithms: PCQO and QAOA with Q = 2 and Q = 10. The energy values shown correspond to the best outcome out of five randomly initialized instances. The best instances are depicted separately for each algorithm, as the energy landscape depends upon the choice of ansatz; therefore, the best initial parameters for the PCQO algorithm might not be the same for QAOA. We observe that PCQO outperforms both variants of QAOA, achieving the exact energy within 200 iterations. This implies that the operator pool calculated by the nested commutator method serves as a better ansatz for low-layered algorithms compared to the QAOA ansatz. In addition, the performance of QAOA with Q = 10 surpasses that of QAOA with Q = 2 due to the increased degrees of freedom given by the optimizable parameters. In Fig. 3b, we show the mean of the energies with standard error as a function of iteration steps. We can see that, on average, PCQO also outperforms both variants of QAOA. The standard error for QAOA (Q = 2) is initially high but quickly converges to zero. This can be attributed to the fact that with only two parameters, the ansatz is not expressible enough. However, as the parameterization increases, QAOA converges to lower energy. On the other hand, the PCQO ansatz shows high errors due to its sensitivity to the initial parameters chosen. These findings are further consolidated by investigating the PCQO algorithm for N = 4 UKP and max-clique problems, which show similar convergence behavior of the energy (see Supplementary Note 1).

Fig. 3: Energy as a function of 200 iteration steps comparing QAOA (Q = 2, Q = 10) with PCQO (Q = 5) for a N = 3 UKP case.
figure 3

a The best convergence with the local cost function, b the mean of 5 randomly initialized parameters with the local cost function, and c mean energies of 20 random initializations for the global cost function are shown. Different markers depict the energy convergence for different algorithms and the green solid line shows the exact energy for the solution. Shaded regions show standard error.

In Fig. 3c, we make the same comparison with the global cost function 〈F(n)〉. For this comparison, we implemented COBYLA as a classical optimizer instead of Adam due to its gradient-free nature, which reduces the computational resources required. Figure 3c shows that the energy scales are significantly higher for all three algorithms due to the cost function chosen. Nevertheless, PCQO still outperforms both algorithms. This is simply because QAOA, inspired by adiabatic evolution, may require a high-depth circuit for optimal solutions. In contrast, CD protocols offer a significant performance enhancement for a p = 1 ansatz. Also, for 200 iterations, MA-QAOA performs similarly to QAOA, but its energy has not yet converged, implying that more iterations are required to outperform QAOA. The sensitivity to initial parameters can be estimated from the errors shown in the plot. It should be noted that the error for PCQO is not zero here; it merely appears negligible due to the energy scale shown. In short, for the global cost function, PCQO outperforms QAOA by a large factor. However, further developments are still required before PCQO can establish itself as a global solver. We also compared four more instances of the UKP, and the results are shown in Supplementary Note 1. For further benchmarking, it will be interesting to compare the performance with classical solvers like greedy search, gradient-based methods, Ising machines, and others.

It is worth mentioning that the implementation of non-Gaussian gates is approximate at finite cutoff dimensions, and increasing the cutoff could potentially enhance the performance of QAOA in our simulations. Nevertheless, the PCQO algorithm demonstrates superior performance and is particularly suitable for current near-term devices, as it utilizes native Gaussian operations, which are easier to implement experimentally. As mentioned earlier, for polynomial functions with degrees higher than three, decomposing them into lower-order gates would be necessary for optimization using the available gates. Such decomposition would require substantial resources that are often unavailable, reinforcing the preference for PCQO, as it allows truncating the operator pool to match the available gates. This reduction is also advantageous in the Hilbert-space encoding: in CV-QAOA, decomposing Eq. (12) would require the Kerr gate \({U}_{c}(\gamma )={e}^{-i\gamma {\hat{n}}^{2}}\) and the cross-Kerr gate \({U}_{ck}(\gamma )={e}^{-i\gamma {\hat{n}}_{i}{\hat{n}}_{j}}\), both of which are non-Gaussian in nature, whereas PCQO requires only Gaussian gates.

For encoding all possible solutions to this problem with qubits, we need a number of qubits \({N}_{{\rm{qubits}}}=\mathop{\sum }\nolimits_{i = 1}^{N}\left\lceil{\log }_{2}\left(\left\lfloor \frac{C}{{w}_{i}}\right\rfloor +1\right)\right\rceil\ge N\). Here, the term \(\left\lfloor \frac{C}{{w}_{i}}\right\rfloor\) corresponds to the maximum number of copies of item i we can fit in the knapsack without breaking the constraint. Since this number is at least 1, we would need a minimum of one qubit to encode the trivial solution containing only item i. Repeating this argument for each of the N items shows that this count scales at least linearly with the number of items. \({N}_{{\rm{qubits}}}\) corresponds to the number of qubits necessary to encode the trivial solutions, giving an idea of the order of the resources required. On the other hand, when employing our approach, the number of qumodes trivially scales linearly with the number of items.
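As an illustration of this counting argument, the sketch below evaluates \({N}_{{\rm{qubits}}}\) for a hypothetical three-item UKP instance; the weights and capacity are purely illustrative and do not correspond to the instances studied above.

```python
import math

def n_qubits(weights, capacity):
    """Qubits needed to encode every per-item count 0..floor(C / w_i)."""
    return sum(math.ceil(math.log2(capacity // w + 1)) for w in weights)

def n_qumodes(weights, capacity):
    """In the CV encoding, one qumode per item suffices."""
    return len(weights)

# Hypothetical 3-item UKP instance (illustrative values only).
weights, capacity = [3, 5, 7], 20
print(n_qubits(weights, capacity))   # per-item counts go up to 6, 4, and 2
print(n_qumodes(weights, capacity))
```

For this instance the qubit count is 8 against 3 qumodes, and the gap widens as the capacity grows relative to the item weights.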

Experiment

We conducted a proof-of-principle experimental demonstration utilizing Xanadu’s state-of-the-art eight-qumode fully-programmable nanophotonic chip34 (see Supplementary Note 2). A schematic diagram of the hardware is shown in Fig. 4a. We considered a two-mode toy problem in the Hilbert-space encoding defined as

$$F({\bf{n}})={({n}_{0}+{n}_{2}-0.75)}^{2}.$$
(13)

For the initial state, the two-mode squeezing parameter was set to r = 1 for the (0, 4) and (2, 6) qumode pairs, and the other two pairs were kept at zero. This was done because F(n) is a function of n0 and n2, so the squeezing helps in obtaining a larger mean photon number in these modes. The mixer was chosen to be \({H}_{m}={\sum }_{i}\,{({x}_{i}-{x}_{0})}^{2}+{({p}_{i}-{p}_{0})}^{2}\). For simplicity, we considered a four-qumode circuit, but it is important to note that the same circuit structure was applied to the other four qumodes as well. To construct the ansatz for our experiment, we considered the operator pool \({\mathcal{A}}\) obtained from \({A}_{\lambda }^{(l)}\) with l = 2. Among other operators, this pool included the R gate and the BS gate, which are native to the hardware. Thus, we selected these two gates as the building blocks of our circuit ansatz. To enhance the expressive power of the circuit, we incorporated two ancillary qumodes in addition to the two qumodes required for the problem. This allowed us to exploit the full SU(4) transformation. Therefore, the circuit ansatz consisted of a series of R gates applied to all four qumodes, followed by BS gates applied to nearest-neighbor qumodes. A graphical representation of the ansatz is shown in Fig. 4b.
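Because this ansatz contains only R and BS gates, it is a passive linear-optical circuit, and its effect on the mean photon numbers can be sketched directly at the level of the 4 × 4 mode unitary. The parameters below are random placeholders rather than the optimized values of the experiment, and the mean-photon update assumes the input correlations between the four modes are diagonal, which holds here since each qumode is only correlated with its twin in the other half of the chip.

```python
import numpy as np

def rgate(phi, mode, m=4):
    """Phase rotation R(phi): multiplies mode operator a_mode by e^{i phi}."""
    u = np.eye(m, dtype=complex)
    u[mode, mode] = np.exp(1j * phi)
    return u

def bsgate(theta, i, j, m=4):
    """Beamsplitter BS(theta, 0) mixing modes i and j."""
    u = np.eye(m, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    u[i, i], u[i, j] = c, -s
    u[j, i], u[j, j] = s, c
    return u

# One ansatz layer: R gates on all qumodes, then nearest-neighbor BS gates.
rng = np.random.default_rng(1)
phis = rng.uniform(0, 2 * np.pi, 4)      # placeholder parameters
thetas = rng.uniform(0, np.pi / 2, 3)
U = np.eye(4, dtype=complex)
for mode, phi in enumerate(phis):
    U = rgate(phi, mode) @ U
for k, theta in enumerate(thetas):
    U = bsgate(theta, k, k + 1) @ U

# Input mean photon numbers: r = 1 two-mode squeezing puts sinh(1)^2
# photons into qumodes 0 and 2 (their twins sit in the other chip half).
n_in = np.array([np.sinh(1.0) ** 2, 0.0, np.sinh(1.0) ** 2, 0.0])
n_out = np.abs(U) ** 2 @ n_in  # valid for diagonal input mode correlations
print(n_out)
```

A passive interferometer conserves the total photon number, so the circuit can only redistribute the squeezing-generated photons among the four qumodes; it is the classical optimizer's job to steer that redistribution toward the cost-function minimum.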

Fig. 4: Experimental demonstration of PCQO algorithm.
figure 4

a Schematic diagram of the eight-qumode nanophotonic chip from Xanadu. The chip is divided into two sets of identical qumodes, (0, 1, 2, 3) and (4, 5, 6, 7), paired utilizing the S2(r) gate. This gate can be decomposed into an S(r) gate (S(r) with ϕ = 0 in Table 1), with r = 0 or r = 1, and a \(BS\left(\frac{\pi }{4},0\right)\) gate as shown. Then, an arbitrary U4 unitary is applied to each set, followed by photon-number-resolving measurements. b The PCQO ansatz considered for the experiment. This includes R(ϕ) gates applied to all the qumodes and BS(θ, 0) gates applied to nearest-neighbor connections. θ = {θ1, θ2, …} are optimizable parameters. c The mean photon number for all the qumodes obtained with the optimal circuit solving \(F({\bf{n}})={({n}_{0}+{n}_{2}-0.75)}^{2}\). The results shown are from a numerical simulation with a D = 3 cutoff and from the experiment with 1000 shots. The inset plot shows the mean photon number obtained by averaging over the identical qumodes for both the numerical simulation and the experiment.

Figure 4c illustrates the mean photon numbers obtained from both the simulator and the experimental setup for all qumodes. Even with a moderate cutoff of D = 3, we successfully obtained the exact solution to the problem. In the numerical simulation, the mean photon numbers for qumodes (0, 1, 2, 3) are exactly the same as those for qumodes (4, 5, 6, 7), since the operations are identical. However, in the experimental setting, the distribution deviates due to inherent limitations such as the finite number of shots, noise, and losses in the chip. In addition, the utilization of a low cutoff in the numerical simulation may have resulted in suboptimal parameters for the actual chip. Despite these factors, a notable resemblance is observed between the experimental results and the numerical simulations. To further analyze the agreement, we computed the average of the mean photon numbers for the identical qumode pairs (0, 4), (1, 5), (2, 6), and (3, 7); the results are shown in the inset of Fig. 4c. Encouragingly, this analysis demonstrates a high level of concurrence between the experimental outcomes and the numerical simulations. Therefore, the PCQO algorithm provides a promising circuit ansatz that can be readily implemented using currently available hardware.

It is important to note that the implemented problem is relatively simple, as the chip incorporates only fixed squeezing and lacks displacement operations. When encountering problems with large integer solutions, achieving the desired mean photon number becomes challenging without variable squeezing and displacement. Our current experimental results focus on demonstrating the feasibility of the PCQO algorithm through simple experiments; however, future advances enabling variable squeezing or displacement operations will facilitate tackling more complex problem instances.

Discussion

We proposed a hybrid quantum-classical optimization algorithm for photonic quantum computing to tackle continuous-variable problems with currently available technologies. The circuit ansatz for this algorithm is a problem-inspired ansatz computed by utilizing shortcuts-to-adiabaticity techniques, specifically counterdiabatic protocols. We investigated the performance of the algorithm by considering two non-convex continuous-variable optimization problems with up to four variables and a degree of six. Furthermore, we compared our algorithm with CV-QAOA for a simple unbounded knapsack problem on three-qumode systems, where we observed that the PCQO algorithm successfully finds good approximate solutions using only a few gates. To showcase the practical feasibility of PCQO, we conducted an experimental demonstration on an eight-qumode nanophotonic chip. These experiments substantiated that PCQO can be implemented on near-term photonic chips, thereby providing a promising avenue for utilizing photonic quantum computing to solve optimization problems.

Although we have focused solely on optimization problems here, this algorithm can also be extended to study physical problems. While PCQO is a hybrid algorithm, purely quantum counterdiabatic algorithms could also be developed in the future to evaluate performance from the perspective of shortcuts to adiabaticity. The backend for this algorithm is a photonic system, but it can be extended to other bosonic systems as well, and a performance analysis would be valuable in this regard. Advanced machine learning techniques like reinforcement learning52 and adaptive techniques6 are other aspects that could be incorporated into PCQO to select the circuit ansatz more effectively. Finding effective initialization strategies would also be interesting for future work. We believe that this work will serve as a starting point for designing more advanced hybrid qubit-bosonic optimization algorithms53. Integrating recent methods in which numerical simulations can be performed without truncating the Hilbert space will also be interesting for understanding the dependence of PCQO on truncation54. In summary, this work introduces the PCQO algorithm as a compelling approach for addressing hard optimization problems using photonic quantum computing. The successful application of PCQO to various problem domains, combined with its potential for further advancements and extensions, positions photonic quantum computing as a competitive candidate alongside qubit-based technologies for tackling optimization tasks.

Methods

Photonic variational quantum algorithms

In the qubit-based regime, the encoding is generally done by considering a physical system such as a molecule or a spin chain. A problem Hamiltonian Hp corresponding to this system is found such that its ground state encodes the information about the solution. The processing phase employs a circuit ansatz comprising various parameterized gates. The design of this ansatz is critical to the performance of the VQA, as it directly affects the energy landscape and, consequently, the convergence and success rate of the optimization process.

Circuit ansatzes can be broadly divided into two categories: problem-inspired and hardware-efficient. In problem-inspired ansatzes, the parameterized unitary is constructed by taking information from Hp, usually in the form of its time evolution. On the other hand, hardware-efficient ansatzes are specifically designed to take into account the connectivity constraints of the underlying quantum hardware7. Both problem-inspired and hardware-efficient ansatzes have distinct roles in VQA design, providing varying trade-offs between performance and feasibility in practical implementations. The choice of ansatz depends on factors like the problem characteristics, available resources, and desired accuracy.

Ideally, problem-inspired ansatzes should be prioritized over hardware-efficient ansatzes because they narrow down the search space, enhancing trainability. However, problem-inspired ansatzes often result in increased circuit depths and unfavorable connections. Consequently, there is significant interest in developing algorithms containing problem-inspired ansatzes that can be implemented on near-term devices55.

Lastly, the decoding step involves measurements, typically in the computational basis, to evaluate a cost function. This cost function maps the optimizable parameters to real numbers. In many cases, the cost function corresponds to the expectation value of the Hp. However, alternative metrics such as fidelity or conditional value at risk56 can also be considered.

In the qumode-based regime, the components required for designing a VQA differ intrinsically from those in the qubit-based regime. In the encoding step, we have the flexibility to choose between two formulations: the phase space formulation and the Hilbert space formulation. These correspond to the wave-like and particle-like nature of light, respectively. In the phase space picture, the state of a single qumode is represented using \((\hat{x},\hat{p})\), which are the position and momentum operators, respectively. The problem Hamiltonian can be expressed as \({H}_{p}=F(\hat{x},\hat{p})\). On the other hand, qumode states can also be represented in a countably infinite Hilbert space spanned by Fock states \(\left\vert {n}_{i = 0,1,\ldots }\right\rangle\). Consequently, the problem Hamiltonian can be written as \({H}_{p}=F(\hat{n})\).
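A minimal sketch of the Hilbert-space formulation: since \({H}_{p}=F(\hat{n})\) is diagonal in the Fock basis, at a finite cutoff D it reduces to a diagonal matrix of cost values, and its ground state reads off the integer minimizer. The two-qumode cost function and the cutoff D = 3 below are illustrative choices, not taken from the problems studied above.

```python
import numpy as np
from itertools import product

D = 3  # Fock-space cutoff per qumode (illustrative)

# Hypothetical two-qumode cost of the form H_p = F(n_hat), diagonal
# in the Fock basis |n_0, n_1>.
def F(n):
    return (n[0] + n[1] - 0.75) ** 2

basis = list(product(range(D), repeat=2))   # Fock states |n0, n1>
H_p = np.diag([F(n) for n in basis])        # 9 x 9 diagonal matrix

# The ground state of H_p encodes the integer minimizer of F,
# i.e. the integer pair closest to n0 + n1 = 0.75.
ground = basis[int(np.argmin(np.diag(H_p)))]
print(ground)
```

The same construction extends to more qumodes at cost \({D}^{N}\), which is why low cutoffs matter for classical simulation of this encoding.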

In the processing stage, the overall recipe remains the same, but PQC involves different operations designed for qumodes. These operations can be categorized into Gaussian and non-Gaussian operations. As the name suggests, Gaussian operations map Gaussian states to themselves and are generated by Hamiltonians quadratic in \(\hat{x}\) and \(\hat{p}\). Single-mode Gaussian gates include phase shifts, displacement, and squeezing; two-mode gates include beamsplitters. Combinations of these gates can be used to implement operations such as quadratic phase gates, controlled-phase gates, and Mach-Zehnder interferometers. On the other hand, non-Gaussian operations do not preserve the Gaussian nature of the states. These include the cubic phase gate, the Kerr gate, and the cross-Kerr gate. Expressions for all these gates are given in Table 1. It has been shown that all the Gaussian transformations combined with any single non-Gaussian transformation form a universal gate set for PQC57.

Table 1 Currently available qumode operations as gates
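As a minimal sketch of how the single-mode Gaussian gates listed above act, recall that a Gaussian operation transforms the quadrature vector \((\hat{x},\hat{p})\) by a symplectic matrix S satisfying SΩSᵀ = Ω. The snippet below builds the phase-shift and squeezing gates in this picture (sign and ordering conventions vary between references) and checks the symplectic condition:

```python
import numpy as np

# Single-mode Gaussian gates as 2x2 symplectic matrices acting on (x, p).
def rotation(phi):
    """Phase-shift gate R(phi): rotates phase space by phi."""
    return np.array([[np.cos(phi), np.sin(phi)],
                     [-np.sin(phi), np.cos(phi)]])

def squeezing(r):
    """Squeezing gate S(r): scales x by e^{-r} and p by e^{r}
    (convention-dependent)."""
    return np.diag([np.exp(-r), np.exp(r)])

Omega = np.array([[0.0, 1.0], [-1.0, 0.0]])  # single-mode symplectic form

def is_symplectic(S):
    """Check the defining condition S Omega S^T = Omega."""
    return np.allclose(S @ Omega @ S.T, Omega)

print(is_symplectic(rotation(0.3)), is_symplectic(squeezing(1.0)))
```

Products of symplectic matrices are symplectic, which mirrors the fact that compositions of Gaussian gates remain Gaussian; non-Gaussian gates such as the Kerr gate have no such finite-dimensional phase-space representation.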

In the decoding stage, the measurement scheme can be homodyne, heterodyne, or photon number-resolving measurements. The choice of measurement scheme is determined by the nature of the encoding and the specific variables of interest, either position or number operators.