Introduction

Inverse Kinematics (IK) maps task-space targets to joint-space configurations for robotic manipulators. While closed-form solutions exist for some geometries, general IK is typically non-convex and solved numerically. With good initial guesses, local methods (e.g., gradient-based solvers, SQP, or related techniques) perform well1,2,3; without such guesses, global search can be desirable. Mixed-integer convex programming (MICP) offers a principled way to address local minima and to integrate IK with other constraints4, but worst-case runtimes can be prohibitive as the number of integer variables grows, limiting practicality for larger, time-sensitive problems.

Many robotics problems exhibit NP–hard characteristics, motivating approximation and heuristic approaches when exact solutions are computationally prohibitive5,6,7,8. In parallel, quantum computing has emerged as an alternative computational paradigm with potential advantages for certain classes of optimization problems9,10,11. Quantum annealing (QA), implemented in devices such as D-Wave’s quantum annealers17, provides hardware–assisted sampling of low–energy configurations in QUBO/Ising models.

This paper studies a proof-of-concept workflow that reformulates planar IK as a Quadratic Unconstrained Binary Optimization (QUBO) problem: we discretize joint angles via a linear binary approximation and enforce one-hot selections with quadratic penalties. The resulting QUBO is solved using QA, and the decoded joint angles are evaluated back in the original IK space. Our goals are to (i) characterize runtime with a formal time-to-solution (TTS) metric, (ii) quantify solution quality in IK space (end-effector error and one-hot feasibility), and (iii) study embeddings on Pegasus/Zephyr hardware. The pipeline follows the stages illustrated in Fig. 1.

Scope and claims. We position this work as a hardware-validated, reproducible baseline for IK \(\rightarrow\) QUBO on current annealers. Throughout this paper, all time-to-solution (TTS) comparisons are performed within the QUBO-solver family (QA, SA, PROTES, hybrid). We intentionally do not compare against continuous IK solvers (e.g., TRAC-IK, SQP-based methods), and we therefore do not claim any runtime advantage over state-of-the-art continuous IK algorithms. Our goal is instead to quantify the behavior of different QUBO solvers on the same IK\(\rightarrow\)QUBO formulation.

Motivation

Casting IK into a binary optimization model provides access to mature algorithms and hardware ecosystems tailored to QUBO/Ising forms, including QA. Theoretically, QA can exhibit advantages over classical simulated annealing (SA) for certain energy landscapes13, and empirical reports show instance-dependent benefits on practical problems14,20. Devices like D-Wave’s annealers17 additionally enable hybrid quantum–classical workflows that interleave QPU sampling with classical improvement steps.

However, binary reformulation introduces design choices that directly affect fidelity to the original IK objective: (i) discretization (choice of angle grids) and (ii) penalty design (enforcing one–hot feasibility). Our study makes these trade-offs explicit and measurable: we give a safe penalty-setting recipe (Section 3.5), evaluate decoded solutions in IK space (Section "IK-Space solution quality"), and use a consistent TTS definition for all solvers (Section "Time-to-Solution (TTS) Metric").

Novelty

The novelty is an end-to-end, accuracy-aware, hardware-validated IK \(\rightarrow\) QUBO pipeline. Concretely, we:

  • implement a linear binary discretization with one–hot penalties for planar IK and run it on real QA hardware17;

  • adopt a penalty design that preserves one–hot feasibility while limiting objective distortion (Section 3.5);

  • evaluate with formal TTS and IK–space accuracy rather than energy alone (Sections "IK-Space solution quality"–"Time-to-Solution (TTS) Metric");

  • study embeddings/topologies (Pegasus, Zephyr) and discuss solver behavior, including non-monotonic SA TTS on these instances (Section 4.6).

This reframing responds to reviewer requests for rigor and scope, positioning the work as a reproducible baseline rather than a claim of end-to-end dominance over continuous IK.

Contributions

  1. IK \(\rightarrow\) QUBO with rigorous metrics. A complete pipeline for planar IK with formal TTS and IK–space accuracy (end-effector error, one-hot feasibility), executed on D-Wave hardware17.

  2. Penalty-design guideline. A practical, norm-based recipe for setting the one-hot penalty to ensure feasibility without excessive objective warping (Section 3.5).

  3. Embedding study. An empirical comparison of embeddings/topologies (Global vs. Clique; Pegasus vs. Zephyr) showing when Global/Zephyr is most qubit-efficient for our dense (but non-clique) IK QUBOs.

  4. Runtime behavior analysis. A discussion of when hybrid solvers reduce QUBO-level TTS and why SA can show non-monotonic (“zig-zag”) TTS with problem size on these instances (Section 4.6), while noting ICE effects when relevant.

Paper structure

The paper is organized as follows. Section "Quantum annealing" gives an overview of quantum annealing on the D-Wave quantum processing unit (QPU) and of the principles underlying the annealing process. Section "Inverse kinematics as a QUBO" introduces the QUBO form and formulates the IK problem as a Quadratic Unconstrained Binary Optimization problem. Section "Empirical results and analysis for solving IK using quantum annealing" presents the empirical results and their analysis. Section "Limitations and shortcomings" discusses the limitations of the approach. Finally, Section "Conclusions" summarizes the key contributions and outlines future research directions.

Fig. 1

Workflow for solving inverse kinematics using quantum annealing: Translating joint angle equations into QUBO/Ising models, embedding them on a D-Wave quantum processor17, and decoding the quantum solution into actionable results.

Fig. 2

Schematic representation of the adiabatic quantum annealing process, illustrating how a system transitions from an initial, relatively simple potential landscape to a more complex one while quantum fluctuations are gradually reduced. The red arrows denote quantum tunneling events, through which the system can escape local minima and ultimately converge on the global minimum energy configuration.

Quantum annealing

Let us consider a quadratic unconstrained binary optimization (QUBO) problem:

$$\begin{aligned}&\underset{x}{\text {minimize}} \ \sum _i a_i x_i + \sum _{i>j} b_{ij} x_i x_j, \end{aligned}$$
(1)

where \(x_i \in \{0, \ 1 \}\) are binary optimization variables, \(a_i\) are biases and \(b_{ij}\) are coupling weights of the optimization problem. Problems of this type can be submitted directly to a quantum annealer, which searches heuristically for low-energy assignments by quantum annealing (QA).
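As a concrete reading of Eq. (1), the following minimal sketch evaluates the QUBO energy and finds the minimizer by exhaustive enumeration. The three-variable instance and the helper names are invented for illustration; enumeration is only viable for very small problems and stands in for any QUBO solver.

```python
from itertools import product

def qubo_energy(x, a, b):
    """Energy of Eq. (1): sum_i a_i x_i + sum_{i>j} b_ij x_i x_j."""
    n = len(x)
    linear = sum(a[i] * x[i] for i in range(n))
    quadratic = sum(b[i][j] * x[i] * x[j] for i in range(n) for j in range(i))
    return linear + quadratic

def brute_force_minimum(a, b):
    """Enumerate all 2^n assignments (illustration only, exponential cost)."""
    return min(product((0, 1), repeat=len(a)), key=lambda x: qubo_energy(x, a, b))

# Hypothetical instance: each bit is rewarded, each pair of active bits is punished.
a = [-1.0, -1.0, -1.0]
b = [[0.0, 0.0, 0.0],
     [2.0, 0.0, 0.0],
     [2.0, 2.0, 0.0]]  # only entries with i > j are read
best = brute_force_minimum(a, b)
best_energy = qubo_energy(best, a, b)
```

In this toy instance the pairwise couplings outweigh the biases, so the minimizer activates exactly one bit, which is the same mechanism the one-hot penalties use later in the paper.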

Quantum annealing is a heuristic method for solving combinatorial optimization problems, particularly binary optimization, by controlling a quantum system. It evolved from classical simulated annealing (SA)16 but replaces thermal fluctuations with quantum fluctuations to drive state transitions, as shown in Fig. 2. Kadowaki and Nishimori pioneered this approach13, implementing it using a transverse Ising model where a time-varying transverse field acts analogously to temperature changes in SA, aiming for faster convergence to optimal states. This requires special hardware that supports quantum annealing, such as D-Wave’s Advantage quantum processing unit (QPU). It also restricts the optimization problems to forms that can be embedded in the given QPU; for example, D-Wave’s QPUs are limited to problems that can be cast as a QUBO18.

One can see quantum annealing as a process with two stages, followed by readout. The first stage involves embedding the problem and initializing the QPU: logical binary variables are mapped to physical qubits (potentially requiring multiple linked qubits to represent one variable), and couplings are configured to match the QUBO structure. The values of the qubits correspond to the values of the binary variables in the original problem, the couplings between qubits correspond to the coupling weights \(b_{ij}\), and the external flux biases correspond to the biases \(a_i\). A transverse magnetic field initializes the system in a superposition state in which all configurations are equally likely. In the second stage, the system undergoes controlled quantum evolution: the transverse field is gradually reduced while the problem Hamiltonian (encoding the QUBO) is increased, allowing the system to explore low-energy states quantum-mechanically and settle into an optimal or near-optimal configuration. Lastly, we read the state of the QPU and recover the solution to the original problem. The rest of the section describes these stages in more detail. Fig. 3 gives an illustration of the described process.

Fig. 3

Overview of the quantum annealing workflow for solving Quadratic Unconstrained Binary Optimization (QUBO) problems: (1) QUBO formulation \(\text {argmin}\, x^T Q x\) with binary variables \(x_i \in \{0,1\}\) and conversion of the QUBO into a graph (Ising model); (2) hardware components (qubits, couplers) and minor graph embedding; (3) programming and initialization of annealing parameters (A(t): transverse field, B(t): longitudinal field, J(t): coupling strength); (4) quantum annealing dynamics evolving spins from quantum superposition to classical states; (5) readout and mapping of optimal spin configurations to the binary solution x. This overview expands the quantum annealing step of Fig. 1, showing the mapping from the combinatorial optimization problem to its physical quantum implementation. Figure adapted from15 and edited by us.

Embedding QUBO

The QUBO problem must be mapped to the physical hardware graph of the quantum processor. Since D-Wave’s Pegasus architecture has limited connectivity (each qubit connects to 15 others at most), complex QUBO interactions require minor embedding. This involves: (1) Identifying the problem’s logical graph structure, (2) Finding subgraphs (chains) in the hardware graph that can represent single logical variables, and (3) Setting strong ferromagnetic couplings (\(J_{\text {chain}} < 0\)) between chained qubits to ensure they remain correlated.

Annealing

The Ising Hamiltonian that represents the embedded optimization problem during quantum annealing can be written as19:

$$\begin{aligned}&H(s) = -\frac{A(s)}{2} \sum _i \sigma _x^{(i)} + \frac{B(s)}{2} \left( \sum _i h_i \sigma _z^{(i)} + \sum _{i>j} J_{ij} \sigma _z^{(i)} \sigma _z^{(j)} \right) , \end{aligned}$$

where:

  • \(J_{ij}\) is the coupling strength between qubits i and j (related to \(b_{ij}\) in the QUBO formulation);

  • \(h_i\) is the external flux bias for qubit i (related to \(a_i\) in the QUBO formulation);

  • \(\sigma _x^{(i)}, \sigma _z^{(i)}\) are Pauli matrices operating on the qubit i in the x and z directions, respectively;

  • \(s \in [0, 1]\) is the annealing parameter representing the evolution of the system from the initial state \(s = 0\) to the final state at \(s = 1\).

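Here \(H_{\text {initial}}\) denotes the transverse-field term (the first sum) and \(H_p\) the problem term (the expression in parentheses). At \(s=0\), \(A(0) \gg B(0)\), so the Hamiltonian is dominated by \(H_{\text {initial}}\) and the system is prepared in its quantum ground state. As s progresses, A(s) decreases while B(s) increases, transitioning the system into a classical regime. At \(s=1\), the final Hamiltonian describes a classical Ising spin model.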

To solve the IK problem, it must first be converted into a QUBO formulation, which is then transformed into an Ising model. Afterward, by applying minor embedding, we embed the IK problem onto the Pegasus and Zephyr topologies. The quantum annealing process in the D-Wave QPU aims to find the ground state of \(H_p\), which corresponds to the optimal solution of the optimization problem. Starting from the known ground state of \(H_{\text {initial}}\) and evolving the system adiabatically through H(s), the process ideally ends in the ground state of \(H_p\); on real hardware, the outcome is a set of low-energy samples. The entire pipeline, as depicted in Fig. 1, is followed to decode the results into the required joint angles.
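The QUBO-to-Ising step mentioned above can be sketched as follows, assuming the common spin convention \(x_i = (1 + s_i)/2\) with \(s_i \in \{-1, +1\}\). The function names and the numerical instance are illustrative; the sketch ignores the hardware-specific rescaling of \(h\) and \(J\).

```python
from itertools import product

def qubo_energy(x, a, b):
    """Eq. (1): sum_i a_i x_i + sum_{i>j} b_ij x_i x_j."""
    n = len(x)
    return (sum(a[i] * x[i] for i in range(n))
            + sum(b[i][j] * x[i] * x[j] for i in range(n) for j in range(i)))

def qubo_to_ising(a, b):
    """Substitute x_i = (1 + s_i) / 2 into Eq. (1); returns (h, J, offset)."""
    n = len(a)
    h = [a[i] / 2.0 for i in range(n)]
    J = {}
    offset = sum(a) / 2.0
    for i in range(n):
        for j in range(i):
            if b[i][j] != 0:
                J[(i, j)] = b[i][j] / 4.0   # coupling between spins i and j
                h[i] += b[i][j] / 4.0        # each coupling also shifts both biases
                h[j] += b[i][j] / 4.0
                offset += b[i][j] / 4.0      # constant term, irrelevant for the argmin
    return h, J, offset

def ising_energy(s, h, J, offset):
    return (offset + sum(h[i] * s[i] for i in range(len(s)))
            + sum(w * s[i] * s[j] for (i, j), w in J.items()))

# Hypothetical 3-variable instance used only to check the conversion.
a = [-1.0, 0.5, 2.0]
b = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-1.5, 1.0, 0.0]]
h, J, offset = qubo_to_ising(a, b)
max_gap = max(
    abs(qubo_energy(x, a, b) - ising_energy([2 * xi - 1 for xi in x], h, J, offset))
    for x in product((0, 1), repeat=len(a))
)
```

The two energies agree on every assignment, so minimizing the Ising form and mapping spins back through \(x_i = (1 + s_i)/2\) recovers the QUBO minimizer.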

Inverse kinematics as a QUBO

Linear binary approximation

We introduce a binary selector vector \(q \in \mathbb {B}^m\), for which one and only one of its elements \(q_i \in \{0, 1\}\) is equal to 1:

$$\begin{aligned}&\sum _{i = 1}^m q_i = 1. \end{aligned}$$
(2)

Given a sequence of angles \(0 \le \varphi _1< \varphi _2< \dots < \varphi _m \le 2\pi\) we can find associated values of the trigonometric functions \(\cos (\varphi )\) and \(\sin (\varphi )\) and arrange them as elements of a vector:

$$\begin{aligned}&\varvec{t}_1 = \begin{bmatrix} c_1&...&c_m \end{bmatrix}^T, \ \ \ c_i = \cos (\varphi _i), \end{aligned}$$
(3)
$$\begin{aligned}&\varvec{t}_2 = \begin{bmatrix} s_1&...&s_m \end{bmatrix}^T, \ \ \ s_i = \sin (\varphi _i); \end{aligned}$$
(4)

we call \(\textbf{t}_1\), \(\textbf{t}_2\) value vectors. With that, we can introduce a linear binary approximation (LBA) of the cosine and sine functions:

$$\begin{aligned}&\cos (\varphi ) \approx \textbf{t}_1^T\textbf{q}, \quad \sin (\varphi ) \approx \textbf{t}_2^T\textbf{q}. \end{aligned}$$
(5)

LBA is exact for the angles \(\varphi _i\) that were used to compute the elements of the value vectors \(\textbf{t}_1\), \(\textbf{t}_2\). For any selector vector \(\textbf{q}\) satisfying the one-hot constraint (2), the fundamental identity holds:

$$\begin{aligned}&(\textbf{t}_1^T\textbf{q})^2 + (\textbf{t}_2^T\textbf{q})^2 = 1. \end{aligned}$$
(6)

For the two-link case, the full selector vector consists of two disjoint one-hot blocks of size m, each enforcing selection of a single candidate angle per joint.
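The linear binary approximation can be illustrated with a short sketch. It assumes the equidistant grid \(\varphi _i = 2\pi (i-1)/m\) used later in the experiments; the helper names are illustrative. It also shows that identity (6) fails once the one-hot constraint (2) is violated, which is why the penalty term is needed.

```python
import math

def value_vectors(m):
    """Angle grid and value vectors t_1 (cosines), t_2 (sines) of Eqs. (3)-(4)."""
    angles = [2.0 * math.pi * i / m for i in range(m)]
    t1 = [math.cos(p) for p in angles]
    t2 = [math.sin(p) for p in angles]
    return angles, t1, t2

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def identity_lhs(t1, t2, q):
    """Left-hand side of Eq. (6)."""
    return dot(t1, q) ** 2 + dot(t2, q) ** 2

m = 8
angles, t1, t2 = value_vectors(m)
q_valid = [0, 0, 0, 1, 0, 0, 0, 0]     # one-hot: selects the fourth grid angle
q_invalid = [1, 1, 0, 0, 0, 0, 0, 0]   # two active bits: violates Eq. (2)

cos_approx = dot(t1, q_valid)            # equals cos of the selected angle
lhs_valid = identity_lhs(t1, t2, q_valid)
lhs_invalid = identity_lhs(t1, t2, q_invalid)
```

For the one-hot vector the left-hand side of (6) equals 1, while the vector with two active bits gives \(2 + \sqrt{2}\), i.e., the sum of two unit direction vectors no longer has unit length.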

Inverse kinematics of a planar serial linkage

Consider a planar kinematic chain with n links connected via rotary joints. Let the length of the j-th link be \(l_j\) and its orientation in the global frame be defined by the angle \(\varphi _j\). Then the Cartesian coordinates \((r_1, \ r_2)\) of the end effector placed at the end of the n-th link are given by the following expression:

$$\begin{aligned}&r_1 = \sum _{j = 1}^n l_j \cos (\varphi _j), \ \ \quad r_2 = \sum _{j = 1}^n l_j \sin (\varphi _j). \end{aligned}$$
(7)

To find angles \(\varphi _j^*\) corresponding to the desired position \(r_1^*\), \(r_2^*\) of the end-effector we solve the following optimization problem:

$$\begin{aligned}&(\varphi _1^*, \ldots , \varphi _n^*) = \underset{\varphi _1, \ldots , \varphi _n}{\text {argmin}} \ \left( r_1^* - \sum _{j = 1}^n l_j \cos (\varphi _j) \right) ^2 + \left( r_2^* - \sum _{j = 1}^n l_j \sin (\varphi _j) \right) ^2. \end{aligned}$$
(8)

This represents an inverse kinematics (IK) problem cast as a nonconvex optimization with continuous variables. In the next subsection, we approximate it as a quadratic binary optimization problem.
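A minimal sketch of the forward kinematics (7) and the cost in (8) follows, with angles interpreted as link orientations in the global frame as defined above. The two-link example and the function names are illustrative.

```python
import math

def forward_kinematics(lengths, angles):
    """End-effector position (r_1, r_2) of Eq. (7)."""
    r1 = sum(l * math.cos(phi) for l, phi in zip(lengths, angles))
    r2 = sum(l * math.sin(phi) for l, phi in zip(lengths, angles))
    return r1, r2

def ik_cost(lengths, angles, target):
    """Squared end-effector error minimized in Eq. (8)."""
    r1, r2 = forward_kinematics(lengths, angles)
    return (target[0] - r1) ** 2 + (target[1] - r2) ** 2

lengths = [1.0, 1.0]
target = (1.0, 1.0)

# Two distinct configurations reach the same target with zero cost.
cost_a = ik_cost(lengths, [0.0, math.pi / 2], target)
cost_b = ik_cost(lengths, [math.pi / 2, 0.0], target)
# A configuration between them is worse, so the cost is not convex in the angles.
cost_mid = ik_cost(lengths, [math.pi / 4, math.pi / 4], target)
```

The midpoint of two optimal configurations has strictly positive cost, which is a small instance of the non-convexity noted in the text.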

Inverse kinematics as a QUBO

To approximate the IK problem described in the last subsection, we use n linear binary approximations \(\cos (\varphi _j) \approx \textbf{t}_1^T\textbf{q}_j\) and \(\sin (\varphi _j) \approx \textbf{t}_2^T\textbf{q}_j\), each associated with its selector vector \(\textbf{q}_j \in \mathbb {B}^m\). The concatenation of all selector vectors is denoted as \(\bar{\textbf{q}} = [ \textbf{q}_1^T, \ldots , \ \textbf{q}_n^T]^T\).

For convenience, we define the vector \(\textbf{d} = [ l_1, \ldots , \ l_n ]^T\). Substituting the approximations into (8), expanding the squares, and dropping the constant terms \((r_k^*)^2\) gives the optimization problem:

$$\begin{aligned}&\underset{\bar{\textbf{q}}}{\text {minimize}} \ \sum _{k = 1}^2 \left( \bar{\textbf{q}}^T\left( (\textbf{d}\textbf{d}^T) \otimes (\textbf{t}_k\textbf{t}_k^T) \right) \bar{\textbf{q}} - 2 r_k^* \left( \textbf{d} \otimes \textbf{t}_k\right) ^T\bar{\textbf{q}} \right) \end{aligned}$$
(9)

where \(\otimes\) is a Kronecker product.

To ensure that every vector \(\textbf{q}_j\) abides by the selector constraints (2) we introduce a big-M penalty function \(p(\textbf{q}_j)\):

$$\begin{aligned}&p(\textbf{q}_j) = M (\textbf{1}_m^T\textbf{q}_j - 1)^2 \end{aligned}$$
(10)

where \(\textbf{1}_m \in \mathbb {R}^m\) is a vector of ones and \(M \gg 0\) is a sufficiently big number. Summing the big-M penalty over all n selector vectors, with \(\textbf{I}_n\) denoting the \(n \times n\) identity matrix, the optimization takes the form:

$$\begin{aligned} \begin{aligned} \underset{\bar{\textbf{q}}}{\text {minimize}}&\sum _{k = 1}^2 \left( \bar{\textbf{q}}^T\left( (\textbf{d}\textbf{d}^T) \otimes (\textbf{t}_k\textbf{t}_k^T) \right) \bar{\textbf{q}} - 2 r_k^* \left( \textbf{d} \otimes \textbf{t}_k\right) ^T\bar{\textbf{q}} \right) \\&+ M \left( \bar{\textbf{q}}^T\left( \textbf{I}_n \otimes (\textbf{1}_m\textbf{1}_m^T) \right) \bar{\textbf{q}} - 2 (\textbf{1}_n \otimes \textbf{1}_m)^T\bar{\textbf{q}} \right) \end{aligned} \end{aligned}$$
(11)

Note that the constant components of the cost function are omitted, as they do not influence the optimal choice of the decision variable \(\bar{\textbf{q}}\). The resulting problem is a Quadratic Unconstrained Binary Optimization (QUBO) problem, which is compatible with quantum annealer hardware such as the D-Wave Advantage. The problem is converted to an Ising model and embedded into a specific hardware topology, where the QUBO variables are mapped (i.e., minor-embedded31) to physical qubits, as illustrated in Fig. 4.
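The construction of the penalized QUBO (11) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the grid is equidistant, the penalty weight M = 10 is an arbitrary illustrative choice, linear terms are folded onto the diagonal (valid because \(q^2 = q\) for binary variables), and exhaustive enumeration stands in for the annealer.

```python
import math
from itertools import product

def build_ik_qubo(lengths, m, target, M):
    """Matrix Q such that q^T Q q equals the objective of Eq. (11)."""
    n = len(lengths)
    angles = [2.0 * math.pi * i / m for i in range(m)]
    t = [[math.cos(p) for p in angles], [math.sin(p) for p in angles]]
    size = n * m
    Q = [[0.0] * size for _ in range(size)]
    for j in range(n):
        for i in range(m):
            row = j * m + i
            for j2 in range(n):
                for i2 in range(m):
                    col = j2 * m + i2
                    # (d d^T) kron (t_k t_k^T), summed over k = 1, 2
                    Q[row][col] = sum(lengths[j] * lengths[j2] * t[k][i] * t[k][i2]
                                      for k in range(2))
                    if j == j2:
                        Q[row][col] += M  # I_n kron (1_m 1_m^T)
            # linear terms -2 r_k^* (d kron t_k) and -2 M, placed on the diagonal
            Q[row][row] += -2.0 * sum(target[k] * lengths[j] * t[k][i]
                                      for k in range(2)) - 2.0 * M
    return Q, angles

def energy(Q, q):
    size = len(q)
    return sum(Q[r][c] * q[r] * q[c] for r in range(size) for c in range(size))

def decode(q, angles, n):
    """Read one angle per one-hot block (assumes q is feasible)."""
    m = len(angles)
    return [angles[q[j * m:(j + 1) * m].index(1)] for j in range(n)]

lengths, m, target, M = [1.0, 1.0], 4, (1.0, 1.0), 10.0
Q, angles = build_ik_qubo(lengths, m, target, M)
best = min(product((0, 1), repeat=len(Q)), key=lambda q: energy(Q, q))
best_angles = decode(best, angles, len(lengths))
# Constant dropped from Eq. (11): |r^*|^2 + n * M
dropped_constant = target[0] ** 2 + target[1] ** 2 + len(lengths) * M
```

Adding the dropped constant back to the minimum energy gives zero here, because the target is reachable exactly on the four-point grid and the minimizer satisfies both one-hot constraints.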

Fig. 4

Mapping a two-link IK problem graph onto a quantum annealer’s hardware architecture, highlighting the embedding of a densely connected problem graph (left) onto the physical qubit connectivity of a D-Wave quantum processing unit (right). The yellow arrows illustrate the correspondence between logical nodes and their mapped qubits, while the red and green edges represent couplings between qubits in the hardware graph. The white nodes on the left figure and the right figure represent zeroes in the QUBO and Ising formulations, respectively, while the yellow and blue nodes represent ones in the QUBO and Ising formulations, respectively.

In our IK formulation, the big-M penalty is used exclusively to enforce two independent one-hot constraints: selecting exactly one angle from the m candidate angles of joint 1 and exactly one angle from the m candidate angles of joint 2. Since all angle candidates lie within fixed joint limits, the maximum possible difference between any candidates is bounded by \(2\pi\). Therefore, the required M is a small constant that does not grow with problem size. This ensures that the penalty does not distort the objective landscape and preserves the fidelity of the IK\(\rightarrow\)QUBO mapping, unlike general big-M constructions that may require exponentially large constants.

Higher-order inverse kinematics problems

Only a subset of inverse kinematics problems resolves to a QUBO form under the proposed linear binary approximation. In other cases, the resulting problem takes the form of a higher-order unconstrained binary optimization (HUBO) problem, unless a different approximation is used. HUBO problems cannot be solved by the quantum annealer hardware directly, but they can be reduced to larger (in terms of the number of variables) QUBO problems sharing the same global minimum25,26,27,28,42.
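To make the HUBO-to-QUBO reduction concrete, the sketch below applies one standard technique, substitution of a product by an auxiliary variable with a Rosenberg-style penalty. It is shown only as an illustration and is not necessarily the reduction used in the cited works; the cubic objective and the penalty weight P = 5 are invented for the example.

```python
from itertools import product

def hubo(x1, x2, x3):
    """Toy cubic objective (hypothetical, not derived from an IK instance)."""
    return 3 * x1 * x2 * x3 - 2 * x1 * x2 - x3

def quadratized(x1, x2, x3, y, P=5):
    """Quadratic surrogate: y replaces x1*x2 inside the cubic term.

    The penalty is zero when y == x1*x2 and at least P otherwise.
    """
    penalty = x1 * x2 - 2 * x1 * y - 2 * x2 * y + 3 * y
    return 3 * y * x3 - 2 * x1 * x2 - x3 + P * penalty

# For every x, minimizing the surrogate over y reproduces the cubic value.
matches = all(
    min(quadratized(x1, x2, x3, y) for y in (0, 1)) == hubo(x1, x2, x3)
    for x1, x2, x3 in product((0, 1), repeat=3)
)
best_hubo = min(product((0, 1), repeat=3), key=lambda v: hubo(*v))
best_qubo = min(product((0, 1), repeat=4), key=lambda v: quadratized(*v))
```

The surrogate uses one extra variable and has the same minimizer in the original variables, which is the sense in which the larger QUBO shares the global minimum of the HUBO.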

Empirical results and analysis for solving IK using quantum annealing

All experiments are based on the same family of planar two-link IK instances and the same linear binary encoding of joint angles. Each link has unit length (\(L_1 = L_2 = 1\)), and the desired end-effector position \((g_x, g_y)\) is chosen inside the reachable annulus of the manipulator.

For a given problem size \(m \in \{5,\dots ,10\}\), each joint angle is discretized into \(m\) equidistant samples \(\varphi _i = 2\pi (i-1)/m\), \(i = 1,\dots ,m\), and represented by a one-hot binary vector \(x^{(k)} \in \{0,1\}^m\) for joint \(k \in \{1,2\}\). The corresponding unit direction vectors are precomputed as \(h_x = (\sin \varphi _i)_i\) and \(h_y = (\cos \varphi _i)_i\), and used to construct the quadratic cost in the workspace coordinates.

The continuous objective \(\Vert p(\theta _1,\theta _2) - g\Vert _2^2\) is then approximated by a quadratic form in the binary variables, \(E(x) = x^\top Q x + \text {const}\), where the block-structured matrix \(Q \in \mathbb {R}^{2m \times 2m}\) is obtained from \(H = h_x h_x^\top + h_y h_y^\top\) together with additional diagonal and off-diagonal penalties that (i) enforce the one-hot constraint for each joint and (ii) encode the linear terms induced by the target coordinates \((g_x,g_y)\). This construction yields a dense, but not fully connected, QUBO of logical size \(2m\), matching the implementation used to generate the matrices submitted to the D-Wave solvers.

To evaluate the performance of quantum annealing on the IK problem reformulated as a QUBO, two embedding strategies, Global Embedding and Clique Embedding32, were systematically analyzed for their effectiveness in mapping the problem onto quantum annealing hardware. The analysis focused on physical qubit utilization and quantum processing unit access times on two state-of-the-art hardware topologies: Pegasus29 and Zephyr30. The IK problem addressed was the two-link inverse kinematics problem. Problem sizes ranged from \(m = 5\) to \(m = 10\), corresponding to 10 to 20 logical qubits: each QUBO instance has a matrix of size \(2m \times 2m\) and therefore requires \(2m\) logical qubits.

  • Global Embedding: A general-purpose embedding tool which models all constraints collectively and maps the aggregate onto the QPU graph. This approach typically uses fewer qubits and shorter chains, making it more efficient for smaller problems.

  • Clique Embedding: A specialized embedding technique optimized for fully connected QUBOs.

The IK problem is not fully connected but exhibits dense connectivity, with the number of connections approximated as \(\frac{3}{2}m\).

Two quantum annealing hardware topologies were considered:

  • Pegasus Topology (Advantage System 4.1): Capable of embedding fully connected QUBOs with up to 177 logical qubits.

  • Zephyr Topology (Advantage Prototype 2.6): Supports embedding fully connected QUBOs with up to 80 logical qubits, featuring higher local connectivity compared to Pegasus.

Qubit usage analysis

Table 1 Average physical qubits for different embedding techniques in Pegasus and Zephyr.

The analysis began with the number of physical qubits required to represent logical variables under different embedding techniques and topologies. Table 1 presents the average physical qubit usage for various problem sizes and embedding strategies. The columns represent:

  • \(N_{GP}\): Global Embedding on Pegasus

  • \(N_{CP}\): Clique Embedding on Pegasus

  • \(N_{GZ}\): Global Embedding on Zephyr

  • \(N_{CZ}\): Clique Embedding on Zephyr

We observe that \(N_{GZ}\) is the most efficient in terms of physical qubit usage. The results are the average of multiple tests conducted for each embedding technique and topology.

QPU access time analysis

In Table 2, we present the average QPU access times for each embedding strategy. The columns represent:

  • \(t_{GP}\): Global Embedding on Pegasus

  • \(t_{CP}\): Clique Embedding on Pegasus

  • \(t_{GZ}\): Global Embedding on Zephyr

  • \(t_{CZ}\): Clique Embedding on Zephyr

Table 2 Average QPU access time for different embedding techniques in Pegasus and Zephyr (ms).

The results show that \(t_{GZ}\) was the best-performing configuration, aligning with the findings from the qubit usage analysis. Global Embedding is better suited for this problem because the QUBO is not fully connected: it required fewer physical qubits than Clique Embedding, particularly for larger problem sizes. For smaller problem sizes, the dense local connectivity of the Zephyr topology provided a slight advantage in embedding efficiency. Clique Embedding exhibited longer QPU access times, consistent with its design for fully connected problems, whereas Global Embedding achieved faster programming but required manual optimization.

Large-scale QUBOs

For large-scale problem instances that exceed the limitations of purely quantum solvers, we utilize the D-Wave Hybrid Solver Service (HSS). This hybrid solver integrates quantum annealing with classical optimization techniques to efficiently process QUBO problems that are too large to be embedded directly on the QPU. The overall workflow is illustrated in Fig. 5.

Fig. 5

Structure of a hybrid solver in D-Wave’s hybrid solver service. Adapted from38.

Mathematical Structure. Let \(Q \in \mathbb {R}^{n \times n}\) denote the QUBO matrix. The hybrid solver extends the principles of the open-source qbsolv algorithm by decomposing Q into smaller subproblems that fit on the QPU37. For a subset of variable indices \(S \subseteq \{1,\dots ,n\}\), let \(Q_S\) denote the restriction of Q to rows and columns in S, and let \(x_S\) be the corresponding partial assignment.

Given a current full assignment \(x \in \{0,1\}^n\), the subproblem optimized at iteration t is the conditional sub-QUBO

$$\begin{aligned}&E_{S_t}(x_{S_t} \,;\, x_{\bar{S}_t}) \;=\; x_{S_t}^\top Q_{S_t} x_{S_t} \;+\; 2 x_{S_t}^\top Q_{S_t,\bar{S}_t} x_{\bar{S}_t}, \end{aligned}$$
(12)

where \(\bar{S}_t\) is the complement of \(S_t\). This formulation fixes variables outside \(S_t\), enabling the subproblem to remain small enough to be embedded onto the QPU.
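The conditional sub-QUBO (12) can be checked numerically with a short sketch. It assumes a symmetric Q and the energy \(E(x) = x^\top Q x\); the matrix, the assignment, and the function names are illustrative. The full energy and the sub-QUBO energy should differ only by a constant that depends on the fixed variables.

```python
from itertools import product

def full_energy(Q, x):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def sub_qubo_energy(Q, x, S, x_S):
    """Eq. (12): energy of the variables in S with all others fixed to x."""
    S_bar = [i for i in range(len(x)) if i not in S]
    quad = sum(Q[i][j] * x_S[p] * x_S[r]
               for p, i in enumerate(S) for r, j in enumerate(S))
    cross = 2 * sum(Q[i][j] * x_S[p] * x[j]
                    for p, i in enumerate(S) for j in S_bar)
    return quad + cross

# Hypothetical symmetric 4x4 instance with integer entries.
Q = [[-1, 2, 0, 1],
     [2, -2, 1, 0],
     [0, 1, -1, 3],
     [1, 0, 3, -2]]
x = [1, 0, 1, 0]   # current full assignment
S = [0, 1]         # variables released for optimization

offsets = set()
for x_S in product((0, 1), repeat=len(S)):
    merged = list(x)
    for idx, bit in zip(S, x_S):
        merged[idx] = bit
    offsets.add(full_energy(Q, merged) - sub_qubo_energy(Q, x, S, x_S))
```

The set `offsets` contains a single value, the energy of the fixed variables alone, so minimizing (12) over \(x_{S_t}\) is equivalent to minimizing the full energy over the same variables.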

Hybrid Optimization Loop. The hybrid solver iteratively refines the global solution using a combination of classical neighborhood selection, quantum subproblem optimization, and classical postprocessing:

  1. Neighborhood selection. A subset \(S_t\) is chosen using heuristics such as energy-based gradients, constraint violations, or tabu memories. This selects a promising region of variables to optimize jointly.

  2. Quantum refinement. The restricted QUBO \(Q_{S_t}\) is embedded onto the QPU, where quantum annealing generates K low-energy samples \(\{ x_{S_t}^{(k)} \}_{k=1}^K\).

  3. Classical postprocessing. Each quantum sample is merged with the fixed variables:

    $$x^{(k)} = (x_{S_t}^{(k)}, x_{\bar{S}_t}).$$

    Classical local search (hill climbing, steepest descent, and tabu search) is then applied to refine \(x^{(k)}\) and reduce the global energy \(E(x^{(k)})\).

  4. Update step. The next iterate is chosen as

    $$x_{t+1} = x^{(k^*)}, \quad k^* = \arg \min _k E(x^{(k)}),$$

    and neighborhood-selection heuristics are updated accordingly.

This scheme can be seen as a large-neighborhood search where quantum annealing acts as a specialized optimizer for selected subregions, while classical heuristics manage global coordination, decomposition, and refinement.

Algorithm 1

Hybrid QA–classical optimization loop
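The loop described above can be sketched as follows. This is a hedged illustration of the large-neighborhood-search structure only, not D-Wave's hybrid solver: neighborhood selection is uniformly random, exhaustive search over the selected variables stands in for the QPU call, and postprocessing is a single-bit-flip descent. All function names and the example matrix are hypothetical.

```python
import random
from itertools import product

def energy(Q, x):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def solve_subproblem(Q, x, S):
    """Stand-in for quantum refinement: exhaustive search over variables in S."""
    best = list(x)
    for bits in product((0, 1), repeat=len(S)):
        cand = list(x)
        for idx, bit in zip(S, bits):
            cand[idx] = bit
        if energy(Q, cand) < energy(Q, best):
            best = cand
    return best

def greedy_descent(Q, x):
    """Classical postprocessing: accept single-bit flips while they lower E."""
    x = list(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            flipped = x[:i] + [1 - x[i]] + x[i + 1:]
            if energy(Q, flipped) < energy(Q, x):
                x, improved = flipped, True
    return x

def hybrid_loop(Q, subproblem_size=3, iterations=20, seed=0):
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(iterations):
        S = rng.sample(range(n), subproblem_size)   # neighborhood selection
        candidate = solve_subproblem(Q, x, S)       # refinement of the sub-QUBO
        candidate = greedy_descent(Q, candidate)    # classical postprocessing
        if energy(Q, candidate) <= energy(Q, x):    # update step
            x = candidate
    return x

def example_qubo(n=6):
    """Small symmetric integer QUBO, invented for illustration only."""
    return [[((i + 1) * (j + 1)) % 5 - 2 for j in range(n)] for i in range(n)]

Q = example_qubo()
x_final = hybrid_loop(Q, seed=0)
```

By construction the energy never increases between iterations, and the returned assignment is a local minimum with respect to single-bit flips; a global optimum is not guaranteed.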

This hybrid strategy enables the solver to handle QUBO instances with thousands of variables, well beyond the direct embedding capacity of the QPU, while still leveraging the quantum annealer’s ability to efficiently explore low-energy regions of complex energy landscapes.

Fig. 6 compares the time-to-solution (TTS) of Quantum Annealing (QA), Simulated Annealing (SA)39,40, and the Probabilistic Tensor Train Sampler (PROTES), which samples from a probability density function represented in a low-parametric tensor train format41. In these experiments, SA and QA each executed 5000 samples, while PROTES was allowed 2000 requests to the objective function. For a QUBO instance of size \(900 \times 900\) (a search space of \(2^{900}\) binary assignments), QA achieves a TTS more than 30 times lower than the classical methods. SA attains the exact optimal QUBO energy for all tested sizes and always produces IK-correct solutions. QA applied to the \(500 \times 500\) QUBO typically achieves the second-lowest energy, yet still returns IK-correct joint configurations. In contrast, PROTES/TT applied to the \(300 \times 300\) QUBO does not produce any IK-correct solution: its lowest-energy configurations violate one-hot consistency or result in significantly larger workspace errors. Accordingly, all TTS values in Fig. 6 measure the time required to obtain an IK-correct solution.

Fig. 6

Mean time-to-solution (TTS) for Quantum Annealing (QA), Simulated Annealing (SA), and PROTES/TT on large-scale two-link IK QUBO instances. A result is counted as successful only if it is IK-correct (satisfies both one-hot constraints and yields end-effector accuracy within the discretization tolerance). SA always finds the exact optimal QUBO energy. QA, when applied to the \(500 \times 500\) QUBO, typically finds the second-best energy but still yields IK-correct angles. PROTES/TT fails to produce an IK-correct solution for the \(300 \times 300\) QUBO.

The results in Fig. 6 emphasize the significant advantage of the Hybrid D-Wave Solver in reducing time-to-solution for practical, large-scale optimization problems. The hybrid approach effectively combines the strengths of quantum and classical techniques, making it a robust solution for tackling computationally intensive QUBO instances. However, its performance is limited by the overhead and variability introduced by integrating these two components, which means that for certain problems it may not achieve the optimal solution. Reliance on classical processing can restrict the overall speedup and may prevent consistently achieving the global optimum.

IK-space solution quality

While Fig. 6 compares solver performance in terms of time-to-solution (TTS), it is crucial to verify that the returned binary vectors correspond to valid joint configurations in the original IK space. For each solver, we therefore evaluate the following:

  1. One-hot feasibility. Each joint angle is represented by a one-hot block. We validate that each block contains exactly one active element. Any violation indicates that the big-M penalty failed to enforce angle selection.

  2. End-effector accuracy. After decoding the joint angles \((\varphi _1,\varphi _2)\), we evaluate the forward-kinematics position \(p(\varphi _1,\varphi _2)\) and compute the workspace error

    $$\Vert p(\varphi _1,\varphi _2) - g \Vert _2.$$

    A solution is classified as IK-correct if both one-hot constraints are satisfied and the error is within the discretization tolerance implied by the angular grid.
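The two checks above can be sketched as a single validation routine. This is a minimal illustration: the grid is assumed equidistant, and the tolerance is passed explicitly because the text does not give a formula for the discretization tolerance; the value used in the example is an arbitrary assumption.

```python
import math

def check_ik_solution(q, m, lengths, target, tol):
    """Return (is_ik_correct, workspace_error); error is None if infeasible."""
    n = len(lengths)
    blocks = [q[j * m:(j + 1) * m] for j in range(n)]
    # One-hot feasibility: exactly one active element per joint block.
    if not all(sum(block) == 1 for block in blocks):
        return False, None
    # Decode the selected grid angle of each joint.
    angles = [2.0 * math.pi * block.index(1) / m for block in blocks]
    # End-effector accuracy via forward kinematics.
    px = sum(l * math.cos(phi) for l, phi in zip(lengths, angles))
    py = sum(l * math.sin(phi) for l, phi in zip(lengths, angles))
    error = math.hypot(px - target[0], py - target[1])
    return error <= tol, error

lengths, m, target = [1.0, 1.0], 4, (1.0, 1.0)
feasible = [1, 0, 0, 0, 0, 1, 0, 0]     # angles (0, pi/2)
infeasible = [1, 1, 0, 0, 0, 1, 0, 0]   # two active bits in the first block
ok, err = check_ik_solution(feasible, m, lengths, target, tol=1e-6)
bad = check_ik_solution(infeasible, m, lengths, target, tol=1e-6)
```

Evaluating feasibility before decoding keeps the two failure modes separate: a one-hot violation is reported without a workspace error, because no unique joint angle can be decoded from such a block.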

Across all QUBO sizes tested in Fig. 6 (from \(100 \times 100\) up to \(900 \times 900\)), SA consistently attains the optimal QUBO energy (as verified using an exact classical solver), and its decoded joint angles are always IK-correct.

QA produces IK-correct solutions across the entire range of QUBO sizes; however, for QUBO sizes of approximately 500 \(\times\) 500 and above, QA typically returns the second-best QUBO energy due to integrated control errors (ICE), which limit the effective dynamic range of the \(h\) and \(J\) values. Instead of finding low-energy states of the Ising problem defined by \(h\) and \(J\), the quantum processing unit solves a slightly altered problem that can be modeled as:

$$\begin{aligned}&E_{\delta \text {ising}}(s) = \sum _{i=1}^{N}(h_i + \delta h_i)s_i + \sum _{i=1}^{N} \sum _{j=i+1}^{N}(J_{i,j} + \delta J_{i,j})s_i s_j, \end{aligned}$$
(13)

where \(\delta h_i\) and \(\delta J_{i,j}\) characterize the errors in the parameters \(h_i\) and \(J_{i,j}\), respectively33. Because \(\delta h\) and \(\delta J\) are summed over \(N\), fidelity limitations tend to have a greater effect on performance for full-QPU-sized problems, for a given dynamic range and distribution of \(h\) and \(J\). This can result in slightly different solutions compared to the ideal case. Despite this slight energy deviation, the decoded angles remain fully IK-correct and yield the same end-effector position as SA within discretization accuracy.

In contrast, PROTES begins to lose IK correctness once the QUBO size exceeds approximately 300 \(\times\) 300. For QUBO dimensions larger than this threshold, its lowest-energy configurations either violate one-hot feasibility or produce noticeably larger workspace errors. Consequently, PROTES does not yield IK-correct solutions for the larger QUBO sizes reported in Fig. 6.

Time-to-Solution (TTS) metric

To compare stochastic solvers on a common runtime scale, we use the standard time-to-solution (TTS) metric. For each solver, a single run has a measured wall-clock time \(t_{\textrm{run}}\) (including all overheads such as QPU programming, annealing, and readout for QA, and CPU overhead for classical solvers). A run is deemed successful if it returns an IK-correct solution (Sec. 4.4) whose QUBO energy is at most a reference value \(E_{\textrm{ref}}\), chosen as the SA optimum, verified where possible using an exact classical solver. Let \(p_{\textrm{succ}}\) denote the empirical success probability.

Assuming independence between runs, the number of repetitions needed to obtain at least one success with probability \(1 - \delta\) is

$$\begin{aligned}&N_{\textrm{TTS}}(\delta ) = \frac{\log \delta }{\log (1 - p_{\textrm{succ}})}. \end{aligned}$$
(14)

We fix \(\delta = 0.01\) (99% confidence), and define

$$\begin{aligned}&\textrm{TTS} = N_{\textrm{TTS}}(0.01)\, t_{\textrm{run}}. \end{aligned}$$
(15)

The curves in Fig. 6 report the mean TTS across all two-link IK instances of a given QUBO size. This metric accounts for both the wall-clock cost and the probability of obtaining an IK-correct solution.

The TTS curve for simulated annealing (SA) is not strictly increasing; in particular, the TTS at QUBO size \(400\times 400\) is slightly lower than at \(300\times 300\). Since \(t_{\textrm{run}}\) is essentially constant across these sizes, Eq. 15 implies that the empirical success probability \(p_{\textrm{succ}}\) of SA is slightly higher at 400 than at 300. In other words, for our IK-derived instances SA finds the ground state more frequently at size 400, so the TTS is lower even though the problem is nominally larger. This is not contradictory: the difficulty of SA is governed by the detailed energy landscape (barrier structure, number and width of attraction basins), which changes non-monotonically with the discretization parameter that defines each QUBO. Small local reversals of TTS between neighboring sizes are therefore expected and are not interpreted as a fundamental complexity effect44,45,46.
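
Eqs. 14 and 15 translate directly into code. The following is a minimal sketch rather than the evaluation script used for Fig. 6; the function names are hypothetical, and the handling of the edge cases \(p_{\textrm{succ}} = 0\) and \(p_{\textrm{succ}} = 1\), as well as the lower bound of one repetition, are conventions assumed here and not stated in the text.

```python
import math

def repetitions_for_confidence(p_succ, delta=0.01):
    """Number of independent runs needed for at least one success
    with probability 1 - delta (Eq. 14)."""
    if not 0.0 <= p_succ <= 1.0:
        raise ValueError("p_succ must lie in [0, 1]")
    if p_succ == 0.0:
        return math.inf   # assumed convention: the solver never succeeds
    if p_succ == 1.0:
        return 1.0        # assumed convention: a single run suffices
    n = math.log(delta) / math.log(1.0 - p_succ)
    return max(1.0, n)    # assumed convention: at least one run is always performed

def time_to_solution(p_succ, t_run, delta=0.01):
    """TTS = N_TTS(delta) * t_run (Eq. 15); t_run is the wall-clock time per run."""
    return repetitions_for_confidence(p_succ, delta) * t_run
```

For example, a solver with \(p_{\textrm{succ}} = 0.5\) needs about 6.64 repetitions at \(\delta = 0.01\). With \(t_{\textrm{run}}\) held fixed, any increase in \(p_{\textrm{succ}}\) lowers the TTS, which is the mechanism behind the local reversal in the SA curve discussed above.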

Limitations and shortcomings

Unlike general mixed-integer IK reformulations, our use of big-M for one-hot angle selection does not require large constants and does not contribute to objective warping. Additionally, while quantum annealing offers compelling advantages for IK optimization, this work faces several constraints inherent to current quantum hardware and methodological choices:

  1. Hardware Limitations:

    • Qubit scarcity: The D-Wave QPU’s limited qubit count (\(\sim\)5,000 qubits in Advantage systems) and sparse connectivity restrict problem size. Already \(m=20\) requires 60 physical qubits, and scaling to \(m>150\) becomes infeasible due to embedding overhead.

    • Noise and errors: Integrated control errors (ICE) alter the effective Ising Hamiltonian (Eq. 12), causing deviations from optimal solutions.

  2. Approximation Trade-offs: The linear binary approximation (LBA) introduces discretization errors for non-sampled angles \(\varphi \notin \{\varphi _i\}\). High-resolution approximations increase QUBO size quadratically.

  3. Scalability of Embedding: Global Embedding exhibits quadratic growth in physical qubits (\(N_{GZ} \propto N^2\)). For \(N=60\), \(N_{GZ}\) exceeds 150 qubits, straining current QPUs.

  4. Problem Applicability: The QUBO reformulation applies only to planar serial linkages. Complex kinematics yield higher-order optimizations (HUBO) requiring non-trivial reductions.

  5. Hybrid Solver Overhead: The 30\(\times\) speedup relies on classical heuristics, introducing latency without global optimality guarantees.

Conclusions

The application of quantum annealing to the IK problem, reformulated as a QUBO, demonstrates the potential of quantum computing for complex optimization in robotics. Among embedding strategies, Global Embedding on Zephyr proved most efficient, offering lower physical qubit usage and faster QPU access times. Hybrid quantum-classical approaches achieved over 30-fold speedups. As discussed in Section "Limitations and shortcomings", future work will address hardware-aware embedding optimizations, error mitigation for ICE, and efficient HUBO-to-QUBO conversions.