Introduction

Quantum computing is emerging as a powerful technology that promises to significantly enhance scientific computing and simulation. Quantum computers, operating with quantum bits (qubits), have the potential to execute complex calculations at speeds and levels of precision that traditional supercomputers cannot achieve1,2,3. Drug discovery, with its need for meticulous molecular modeling and predictive analytics4,5,6,7, is an ideal candidate to benefit from this quantum leap. Recent efforts have begun to integrate quantum computing into drug design research, marking a step forward in the application of advanced computational technologies to drug discovery8,9,10,11. In drug design, existing classical computational chemistry methods cannot compute exact solutions in practice, since the required computational cost grows exponentially with the size of the system. Quantum algorithms, exemplified by the Variational Quantum Eigensolver (VQE)12, hold the potential to improve on classical methods like Hartree-Fock (HF)13 and reach more accurate solutions within the quantum computing paradigm. As the scale of quantum computers expands, quantum computing approaches are expected to significantly outperform existing solutions, such as Density Functional Theory (DFT)14, in terms of both accuracy and efficiency in scenarios involving quantum chemical calculations. In addition to the quantum chemistry approach, a variety of drug-design problems can be cast as optimization problems11,15. The quantum approximate optimization algorithm16 or quantum annealing algorithms17,18 can then be employed to solve these optimization problems.

However, in the current landscape, the involvement of quantum computing in drug discovery is primarily restricted to conceptual validation, with minimal integration into real-world drug design19,20,21,22,23,24. Our hybrid quantum computing pipeline (see Fig. 1) is oriented toward real-world drug discovery problems and addresses this gap by investigating two pertinent case studies rooted in actual clinical and pre-clinical contexts. The key step in the quantum computation of molecular properties is to prepare the molecular wave function on a quantum device. To this end, the VQE framework is suitable for near-term quantum computers. As shown in Fig. 1b, the core of VQE is to employ parameterized quantum circuits to measure the energy of the target molecular system. A classical optimizer then minimizes the energy expectation until convergence. By the variational principle, the state of the optimized quantum circuit becomes a good approximation to the wave function of the target molecule, and the measured energy is the variational ground state energy. Additional measurements can then be performed on the optimized quantum circuit to obtain other physical properties of interest.

Our first case study focuses on a carbon-carbon bond cleavage prodrug strategy25, an innovative prodrug activation approach that has been applied to \(\beta \)-lapachone for cancer-specific targeting and validated through animal experiments. This prodrug design primarily aims to address the pharmacokinetic and pharmacodynamic limitations of active drugs, offering a valuable supplement to existing prodrug strategies26,27,28,29,30,31,32. Simulating the prodrug activation process requires precise modeling of the solvation effect in the human body. To achieve this, we implement a general pipeline that enables quantum computing of the solvation energy based on the polarizable continuum model (PCM). Our findings demonstrate the viability of quantum computations in simulating covalent bond cleavage for prodrug activation calculations, which are important steps in real-world drug design tasks.

We then turn to the covalent inhibition of KRAS (Kirsten rat sarcoma viral oncogene), a protein target prevalent in numerous types of cancers. KRAS plays a crucial role in the RAS/MAPK (Mitogen-Activated Protein Kinase) signaling pathway, significantly influencing cell growth, differentiation, and survival33. Mutations of this protein, particularly the G12C variant, are common in various cancers, including lung and pancreatic cancers, and are associated with uncontrolled cell proliferation and cancer progression34,35,36,37. Sotorasib (development code name AMG 510), a covalent inhibitor targeting this mutation, has demonstrated potential in providing a more prolonged and specific interaction with the KRAS protein, a crucial approach in cancer therapy38,39. Since the introduction of AMG 510, a flurry of new inhibitors targeting G12C has been developed, with some efforts extending to other KRAS mutations40,41,42,43,44, and several candidates for broad-spectrum inhibition have also been proposed45. However, the other mutations usually do not have a potential site for covalent binding, so the efficacy of these inhibitors remains to be rigorously tested.

Quantum computing can enhance our understanding of such drug-target interactions through QM/MM (Quantum Mechanics/Molecular Mechanics) simulations, which are vital in the post-drug-design computational validation phase. To realize this, we implemented a hybrid quantum computing workflow for molecular forces during QM/MM simulation. This development not only facilitates a detailed examination of covalent inhibitors like Sotorasib, but also propels the field of computational drug development forward.

Through these two real-world drug design examples, we present a hybrid quantum computing pipeline for drug design. Our workflow has advantages in its flexibility and has been carefully constructed to accommodate various applications in the area of drug discovery. The universality of our pipeline highlights its potential as a foundational tool, empowering researchers with a ready-to-use computational resource.

In the course of our computational investigations, we also established a number of benchmarks, which not only exemplify the robustness of our approach but also serve as a valuable reference for the field of quantum computing-enhanced drug discovery. By making this pipeline openly accessible, we lay the groundwork for expanded collaborative endeavors within the scientific community, thereby accelerating the translation of quantum computing power into tangible therapeutic outcomes.

Results

Gibbs free energy profiles in prodrug activation strategy

Carbon–carbon bond cleavage in prodrug activation strategy

In modern drug research, prodrug activation is a very important strategy46,47. It helps turn inactive ingredients into active drugs inside the body. This strategy helps make drugs work better by making sure they only activate at certain places in the body, which lowers the risk of side effects and leads to safer and more effective treatments.

Among various prodrug activation strategies25,26,27,28,29,30,31,32, that based on the cleavage of carbon–carbon (C–C) bonds is particularly innovative. It is a novel strategy with applicability to drugs without traditional modifiable groups. The C–C bond, a quintessential linkage in organic chemistry, imparts robustness to molecular frameworks, and its selective scission demands conditions of exquisite precision. Synthesizing prodrugs that are primed for C–C bond cleavage under pathophysiological conditions confronts us with the dual challenges of sophisticated synthetic chemistry and intricate mechanistic elucidation.

In this C–C bond cleavage based prodrug activation strategy, the calculation of the energy barrier is crucial because it determines whether the chemical reaction can proceed spontaneously under physiological conditions. It also plays a significant role in determining stable molecular structures, guiding molecular design, and evaluating molecular dynamic properties. To simplify the computations, in the subsystem where quantum computing is employed, we selected five key molecules involved in the cleavage of the C–C bond as simulation subjects, performing single-point energy calculations and the essential solvent model calculations after conformational optimization (see Fig. 2 for details). Considering the practical value and significance of prodrug activation strategies in current drug design, especially for drug delivery, our calculations can be extended to similar scenarios.

Gibbs free energy profiles of covalent bond cleavage for prodrug activation

Gibbs free energy profiling of covalent bond cleavage is a critical task for drug design, especially prodrug activation. It is of great importance for the selectivity and efficacy of therapeutic agents, guiding synthetic routes, and even achieving accurate molecular models for complex chemical reactions.

In this work, we study the prodrug design for \(\beta \)-lapachone, a natural product with extensive anticancer activity. In the original study25, the authors used DFT with the M06-2X functional to calculate the energy barrier. The results showed that the energy barrier for C–C bond cleavage is small enough for the chemical reaction to proceed spontaneously under physiological temperature conditions. It is worth mentioning that in the original study, this novel prodrug design strategy was validated through wet laboratory experiments. In this study, we employed two classical computational methods, namely HF and Complete Active Space Configuration Interaction (CASCI), to compute reference values for quantum computation. While DFT is typically the preferred method in conventional pharmacochemical reaction calculations due to its efficiency and accuracy, the HF and CASCI methods chosen in this study yield a reaction barrier that is consistent with the wet lab results.

Although quantum devices with more than 100 qubits are becoming available, simulating large chemical systems would require very deep circuits, which inevitably lead to inaccurate outcomes due to intrinsic quantum noise. Additionally, the \(N^4\) Hamiltonian terms that must be measured to calculate the molecular energy are another bottleneck for quantum computation, because the measurement shot budget is limited. Thus, it is often desirable to reduce the effective problem size of chemical systems so that they can be processed on available quantum devices. Quantum embedding methods and downfolding methods have gathered significant attention recently48,49. In this work, we employed the active space approximation due to its popularity and versatility, which simplifies the QM region into a more manageable two electron/two orbital system. The CASCI energy can be considered the exact solution under the active space approximation, and the results from quantum computers are expected to be consistent with the CASCI energy. The fermionic Hamiltonian is then converted into a qubit Hamiltonian using the parity transformation. The wave function of the active space can then be represented on a 2-qubit superconducting quantum device. We utilized a hardware-efficient \(R_y\) ansatz with a single layer as the parameterized quantum circuit for VQE, as depicted in Fig. 3. We applied standard readout error mitigation to enhance the accuracy of the measurement results. For more detailed technical information, please refer to the Methods section. We implemented the entire workflow in the TenCirChem package50, allowing users to utilize these functions with just a few lines of code.
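To make the ansatz concrete, the following is a minimal statevector sketch of a single-layer hardware-efficient \(R_y\) circuit on two qubits. The gate ordering used here (an \(R_y\) rotation on each qubit, one CNOT, then a second \(R_y\) rotation on each qubit) is an assumption for illustration; the circuit actually used is the one depicted in Fig. 3. The function names are hypothetical and the code does not use the TenCirChem API.

```python
import math

def apply_ry(state, theta, qubit):
    """Apply Ry(theta) to `qubit` (0 or 1) of a real 2-qubit statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        bits = [(idx >> 1) & 1, idx & 1]  # bits[0] is qubit 0
        b = bits[qubit]
        # Ry|0> = c|0> + s|1>,  Ry|1> = -s|0> + c|1>
        for out_bit, coeff in ((0, c if b == 0 else -s), (1, s if b == 0 else c)):
            out = bits.copy()
            out[qubit] = out_bit
            new[2 * out[0] + out[1]] += coeff * amp
    return new

def apply_cnot(state, control, target):
    """Flip `target` whenever `control` is 1."""
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        bits = [(idx >> 1) & 1, idx & 1]
        if bits[control] == 1:
            bits[target] ^= 1
        new[2 * bits[0] + bits[1]] += amp
    return new

def ry_ansatz(params):
    """Assumed layout: Ry, Ry, CNOT(0 -> 1), Ry, Ry acting on |00>."""
    state = [1.0, 0.0, 0.0, 0.0]
    state = apply_ry(state, params[0], 0)
    state = apply_ry(state, params[1], 1)
    state = apply_cnot(state, 0, 1)
    state = apply_ry(state, params[2], 0)
    state = apply_ry(state, params[3], 1)
    return state

# Probabilities of the four computational-basis outcomes for sample angles.
probabilities = [amp ** 2 for amp in ry_ansatz([0.3, 1.1, -0.7, 0.2])]
```

Because every gate is real and orthogonal, the amplitudes stay real and the probabilities sum to one, which is one reason an \(R_y\)-only ansatz is convenient for real molecular Hamiltonians.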

By calculating the energy barrier for C–C bond cleavage, we compare our quantum computing results with those from the original study. Our computation involves single-point energy calculations with the influence of water solvation effects. For both classical and quantum computations, we selected the 6-311G(d,p) basis set and chose the ddCOSMO model as the solvation model. The thermal Gibbs corrections were calculated at the HF level. Additionally, we included the results from HF and CASCI, which are based on classical computational chemistry, for comparison. In Table 1 we list the reaction barrier \(\Delta G^{\ddag }\) and the reaction Gibbs free energy change \(\Delta G\) for the prodrug-activation reaction. The Gibbs free energy for relevant molecules is listed in Table 2 in the Methods section. The results of the reaction energy barrier \(\Delta G^{\ddag }\) obtained from both classical quantum chemistry calculation methods and quantum computing methods are in good agreement. They also align closely with the calculation results of the original paper, which employed the M06-2X functional and Gaussian as the computational tool. From the activation barrier results from quantum computers in Table 1, we observe that the activation barrier is less than 20 kcal/mol. In the field of drug design, this indicates that the reaction could spontaneously occur within a biological organism. Therefore, the results from quantum computers in our pipeline can be used for the assessment of prodrug activation processes. On the other hand, we obtained significantly lower energy values \(\Delta G\) compared to the DFT method in the original study. It is worth noting that without considering the solvation effect, both HF and CASCI calculations yield much lower reaction barriers \(\Delta G^{\ddag }\). In fact, the VQE method even produces a non-physical negative reaction barrier. This observation emphasizes the importance of considering the solvation effect in the drug-design pipeline.
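One way to see why a barrier of about 20 kcal/mol is compatible with a reaction that proceeds on its own at body temperature is to convert the barrier into a rate constant. The sketch below uses the Eyring equation from transition state theory, which is not part of the pipeline described here and is introduced only as an illustration; it assumes a transmission coefficient of 1 and a temperature of 310.15 K.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34 # Planck constant, J s
R_KCAL = 1.987204e-3      # gas constant, kcal/(mol K)

def eyring_rate(barrier_kcal_per_mol, temperature_k=310.15):
    """First-order rate constant (1/s) from the Eyring equation, kappa = 1."""
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))

rate = eyring_rate(20.0)          # rate constant for a 20 kcal/mol barrier
half_life_s = math.log(2) / rate  # corresponding first-order half-life
```

Under these assumptions a 20 kcal/mol barrier corresponds to a rate constant on the order of \(10^{-2}\) s\(^{-1}\), and lower barriers are exponentially faster.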

The similarity between the results obtained from HF, CASCI, and VQE can be attributed to the relatively small active space considered in this study. There are studies indicating that quantum computational methods like VQE can achieve near-exact solutions for medium-sized chemical systems51,52. As the scale of quantum computing continues to grow, we may be able to alleviate the active space approximation employed in this work and make significant improvements to the HF method. Our results demonstrate the effectiveness of quantum computing in scenarios involving Gibbs free energy profile calculations of covalent bond cleavage, as well as the versatility and plug-and-play advantages of our pipeline.

Next, we discuss the computational wall time required for quantum computation. In the (2e, 2o) active space, the bottleneck for both classical and quantum computation is obtaining the HF solution with the solvation effect. Thus, the total wall times are comparable for all molecules computed in this study, ranging from several minutes to approximately one hour, depending on the size of the molecule. The time cost for solving the (2e, 2o) active space does vary between CPU and QPU, as illustrated in Table 3. Taking molecule \({\textbf {5}}\) as an example, classical computers require 3 s to complete the computation. On the other hand, quantum computers take 63 s to perform the computation, and the majority of the time is dedicated to measuring the active space energy and the one-body reduced density matrix for the solvation effect. Since active resetting is not implemented yet, for each measurement shot, the quantum computational bottleneck is to wait several times the decay (T1) time so that the energy stored in qubits is relaxed into the environment. This results in an approximate duration of 1 ms for each measurement shot. To determine the expectation value for each Pauli operator, 8192 measurement shots are performed, corresponding to a duration of 8 s. For energy evaluation, there are eight Pauli strings to be measured, which are grouped into five measurement groups based on commutation relations. As a result, calculating the energy expectation takes approximately 40 s. The calculation of one-body reduced density matrices in the active space involves measuring three additional expectation values. Thus, the total time cost for the quantum computing kernel is approximately 60 s, consistent with our experimental findings. Although the active space size remains the same for different molecules, the time cost for classical and quantum computation does vary. For example, calculating 4 and TS is significantly more time-consuming than computing 5. This discrepancy arises from the differing time required for the active space integral transformation across molecules. Nevertheless, for all molecules, quantum computation takes approximately one minute longer than classical computation.
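The timing estimate above can be reproduced with a few lines of arithmetic. This sketch assumes that each of the three additional expectation values for the one-body reduced density matrix costs one full 8192-shot measurement; the function and parameter names are hypothetical.

```python
def qpu_kernel_time(shots_per_group=8192, time_per_shot_s=1e-3,
                    energy_groups=5, rdm_extra_groups=3):
    """Return (per-group, energy, RDM, total) wall times in seconds."""
    per_group = shots_per_group * time_per_shot_s  # about 8 s
    energy = energy_groups * per_group             # about 40 s
    rdm = rdm_extra_groups * per_group             # assumed: one group each
    return per_group, energy, rdm, energy + rdm

per_group_s, energy_s, rdm_s, total_s = qpu_kernel_time()
```

The total is within a few seconds of the 63 s measured for molecule 5, which supports the view that the per-shot relaxation wait, rather than gate execution, dominates the quantum kernel.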

In this study, we have limited the utilization of quantum computers to a few qubits by employing the active space approximation, due to the limited size and gate noise of currently available quantum computers. Herein, we estimate what kind of quantum computer is required for a fully correlated computation of the systems studied in this work without invoking the active space approximation. Taking molecule 4 as an example, with the 6-311G(d,p) basis set, the system corresponds to \(N=630\) orbitals and \(N_{\text {elec}}=196\) electrons. To reduce the qubit requirement, the paired unitary coupled-cluster ansatz can be employed, which requires only 1 qubit for each orbital due to the restriction of electron pairing53,54. Other advantages of the ansatz include that evaluating the energy requires only a constant number of measurement groups and linear circuit depth due to the efficient Givens-SWAP network. Additionally, since there are \(N_{\text {elec}}=196\) electrons, the number of all possible double excitations is \(\frac{N_{\text {elec}}}{2}\times (N-\frac{N_{\text {elec}}}{2})=52136\). Thus, a fully correlated computation at the PUCC level with double excitations (PUCCD) involves a quantum circuit with approximately \(10^3\) qubits and \(10^5\) Givens-SWAP gates. The PUCCD ansatz has been successfully implemented on both superconducting and trapped-ion quantum computers53,54. The number of qubits employed in these studies is around 10, sufficient to describe 1 to 2 heavy atoms if active space or embedding techniques are not used55. While digital quantum computers with over 100 qubits are accessible3, their application in quantum chemistry has been limited, primarily due to the restricted fidelity of two-qubit gates. However, with improved two-qubit gate fidelity, these quantum computers could handle complex molecules comprising dozens of atoms, such as molecule 4.
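The qubit and excitation counts quoted above follow directly from the orbital and electron numbers. The short sketch below reproduces them; the function name is hypothetical, and it assumes one qubit per spatial orbital and one paired double excitation per occupied-virtual orbital pair, as described in the text.

```python
def puccd_resources(n_orbitals, n_electrons):
    """Qubits and paired double excitations for a paired-electron ansatz."""
    n_pairs = n_electrons // 2                    # occupied spatial orbitals
    n_qubits = n_orbitals                         # one qubit per orbital
    n_doubles = n_pairs * (n_orbitals - n_pairs)  # occupied x virtual pairs
    return n_qubits, n_doubles

qubits, doubles = puccd_resources(n_orbitals=630, n_electrons=196)
```

For molecule 4 this gives 630 qubits and 52136 excitations, which the text rounds to roughly \(10^3\) qubits and \(10^5\) Givens-SWAP gates.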

Empirically, there are \(0.7\times N^2\) Pauli strings in the PUCC Hamiltonian, which leads to approximately \(M=3\times 10^5\) terms when \(N=630\). These terms can be divided into three measurement groups. Assuming that for each group K repeated circuits are executed for measurement, the variance of the estimated expectation value \(\epsilon ^2\) is approximately \(\frac{1}{K}\sum _j^M |\alpha _j|^2\). Thus, if we wish to achieve a measurement precision of \(\epsilon =0.01\) Hartree and \(|\alpha _j|\) is assumed to be 0.1 Hartree, the number of measurement shots K is on the order of \(10^7\) and the total number of shots for the three measurement groups is approximately \(10^8\). On superconducting quantum computing platforms, the reset time is the bottleneck for circuit execution, which can be estimated as \(10^{-3}\) s. Thus, it takes \(10^5\) s to measure the molecular energy, the key step for VQE. Since computing the solvation energy requires only the one-body reduced density matrix, the additional measurement cost can be neglected. The multiplicative factor for parameter optimization is not considered. If a set of accurate initial circuit parameters can be computed through classical preprocessing, such as quantum chemistry computation or machine learning56,57, we may conclude that it takes \(10^5\) s to compute the solvation energy using a single quantum processor. The 3K repeated circuits can be easily parallelized. In the optimal situation where 3K quantum processors are available, the time cost for the QM calculation can be reduced to \(10^{-3}\) s.
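The order-of-magnitude estimate above can be checked numerically. The sketch below plugs in the values stated in the text; the function and parameter names are hypothetical, and all coefficients are treated as having the same assumed magnitude.

```python
def shot_budget(n_orbitals, eps=0.01, alpha=0.1, n_groups=3,
                time_per_shot_s=1e-3, pauli_prefactor=0.7):
    """Estimate Pauli terms, shots per group, total shots and wall time."""
    n_terms = pauli_prefactor * n_orbitals ** 2  # M, empirical 0.7 * N^2
    sum_alpha_sq = n_terms * alpha ** 2          # sum_j |alpha_j|^2
    shots_per_group = sum_alpha_sq / eps ** 2    # K from eps^2 = sum / K
    total_shots = n_groups * shots_per_group
    return n_terms, shots_per_group, total_shots, total_shots * time_per_shot_s

n_terms, k_shots, total_shots, wall_time_s = shot_budget(630)
```

With these inputs K comes out at a few times \(10^7\) and the wall time slightly below \(10^5\) s, consistent with the rounded figures quoted in the text.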

Covalent bond simulation

KRAS is a prominent target in cancer therapy due to its significant role in various cancers, and G12C has been its most frequent and consequential mutation. Sotorasib, an innovative covalent inhibitor targeting this mutation, represents a paradigm shift in KRAS-related cancer treatment. We set up a QM/MM simulation framework for the target-inhibitor interaction and chose the QM region carefully to cover the key atoms involved in the covalent bond formation (see Fig. 4 for a schematic illustration). We first ran the QM/MM simulation on classical computers to obtain baseline statistics, then moved the QM energy computation to quantum computers and verified that the results are comparable. As in the case study for prodrug activation, a (2e, 2o) active space approximation is employed to reduce the measurement cost, and the active space wave function is processed using 2 qubits.

KRAS and covalent inhibition

To establish a robust baseline for the later adaptation to quantum computers, the energy evolution of the QM region was closely monitored throughout the simulation, as shown in Fig. 5b. Complementarily, the MM region encompassed the larger protein environment, including water molecules and other cellular components, offering a realistic context for the interaction. The evolution of the potential energy, the kinetic energy, and the total energy of the system was recorded, as shown in Fig. 5a.

Inhibiting KRAS has long been difficult, and a key reason that the inhibition of the KRAS G12C mutation has been so significant is the possibility of designing small-molecule inhibitors that specifically target the G12C mutation by forming a covalent bond between the target and the inhibitor. For this reason, it is imperative that Sotorasib forms a stable bonded complex with the target through covalent bonding. The bond length, bond angles, and dihedral angles around the covalent bond were also closely monitored during the simulation, as shown in Fig. 5c, d.

We observed a specific and strong bond between Sotorasib and the target mutation, offering critical insights into the drug’s potential efficacy. This understanding is pivotal for the rational design of future inhibitors targeting similar mutations.

QM energy update using quantum computers

After establishing the QM/MM baseline, we then moved the QM computation first to a quantum emulator using TenCirChem, and then to a quantum computer. The kernel of our calculation is again the VQE algorithm. The MM region, represented as point charges, contributes a background potential to the Hamiltonian. The calculation of molecular forces is a common routine in classical computational chemistry. Recent attempts have been made to transfer the algorithm to quantum computing platforms58,59,60,61,62. In our work, the calculation is more complicated compared with previous studies, due to the active space approximation employed. In addition, our work is the first example of integrating quantum computed forces into a full-scale QM/MM simulation workflow. The details of the procedure are shown in the Methods section.

To keep the computational load manageable while checking the soundness of the computation, we ran the first 1600 steps of the simulation on a quantum computer as a sanity check; the results closely follow the baseline QM/MM simulation, as can be observed in Fig. 6. We then moved some key steps of the QM/MM simulation to the quantum computer to establish a QM/MM-QC hybrid simulation system. In Fig. 7a, the simulation is started on the quantum computer and continued on a classical computer; in Fig. 7b, the simulation is started on a classical computer, continued on a quantum computer, and subsequently moved back to the classical computer. Compared with the previous QM/MM simulation, the hybrid simulations closely follow the baseline trajectory, which gives us confidence that such hybrid simulations are a feasible use of the limited computing power of current quantum computers.

The computational time cost comparison can be seen in Table 4. For the classical QM/MM simulation, we utilized a high-performance system with dual Intel(R) Xeon(R) Gold 5220 CPUs (72 cores, 144 threads total, 2.20 GHz base frequency), augmented by six NVIDIA A100-PCIe GPUs with 40,960 MiB memory each. The system is supported by 385 GB of RAM, facilitating the handling of extensive computational workloads. Similar to the case study of prodrug activation, the time cost for quantum computers is larger than that for classical computers. To compute molecular forces, the two-body reduced density matrices need to be measured, so at each step, the time cost is approximately twice that of the single-point energy calculation in Table 3.

The insights gained from these QM/MM simulations are not just confined to the molecular interaction between Sotorasib and the KRAS(G12C) protein. They lay the groundwork for future computations on a quantum computer, promising to enhance the accuracy and speed of our drug discovery processes. This step towards quantum computing implementation represents a transformative progression in our research methodology, aligning with our ongoing efforts to integrate advanced computational techniques in drug discovery.

Similar to the prodrug activation case, here for the covalent bond simulation case, we provide an estimation of the quantum resource required for a fully correlated treatment of the QM region using the pUCCD circuit. The QM region is composed of 5 heavy atoms, which are translated to \(N=49\) orbitals with 6-31G basis set. Thus a correlated computation without active space approximation requires a quantum circuit with approximately 50 qubits and 588 Givens-SWAP gates. The total number of measurement shots is \(10^6\) and \(10^3\) seconds are required for an energy evaluation. Since all elements of one and two-body reduced density matrices are also available from the three groups of measurement, the wall time cost for the additional computation of molecular forces can be neglected.
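As a rough check of these figures, the arithmetic can be written out under the assumption that the parameters of the prodrug estimate carry over unchanged (\(0.7\times N^2\) Pauli strings, \(|\alpha _j|\approx 0.1\) Hartree, \(\epsilon =0.01\) Hartree, three measurement groups, and \(10^{-3}\) s per shot):

$$\begin{aligned} M&\approx 0.7\times 49^2 \approx 1.7\times 10^3, \\ K&\approx \frac{M|\alpha _j|^2}{\epsilon ^2} \approx 1.7\times 10^5, \\ 3K&\approx 5\times 10^5 \sim 10^6, \\ t&\approx 3K\times 10^{-3}\ \text {s} \approx 5\times 10^2\ \text {s} \sim 10^3\ \text {s}. \end{aligned}$$

The quoted gate count is likewise consistent with the double-excitation formula: writing \(n\) for the number of electron pairs (a symbol introduced here for illustration only), \(n\times (N-n)=588\) with \(N=49\) is satisfied by \(n=21\) or \(n=28\).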

Discussions

In this study, we have established a model pipeline that enables quantum computers to tackle real-world drug discovery problems. Specifically, we have addressed two crucial challenges of computer-aided drug design: computing reaction barriers and molecular dynamics simulation. Our pipeline combines quantum and classical computing platforms, leveraging the VQE framework on the quantum computing side to efficiently store and manipulate molecular wave functions. On the classical computing side, we employ the ddCOSMO solvation model to compute the solvation energy and the analytical CASCI force formula to compute molecular forces for QM/MM simulation. The interface between the quantum and classical computing sides relies on the one- and two-body reduced density matrices.

To demonstrate the potential of our pipeline, we conducted two case studies using a superconducting quantum device. In the first case, we studied the Gibbs free energy profile for prodrug activation involving carbon-carbon bond cleavage under solvent conditions. The obtained reaction barrier and Gibbs energy change align well with previous experimental and theoretical studies. In the second case, we investigated a covalent inhibitor for KRAS(G12C) using QM/MM simulation. We closely monitored the evolution of energy and compared the time cost based on classical computers and quantum computers.

Based on the two cases, we provide evidence that our hybrid quantum computing pipeline has the potential to solve real-world drug design problems. However, it is important to note that the accuracy of VQE calculations and the resources consumed require further improvement. On the quantum hardware side, continuous efforts should be made to enhance gate fidelity and strive toward achieving error correction. In terms of quantum algorithms, advanced VQE additions such as neural networks or Clifford circuits63,64 can be explored to enhance the accuracy of the VQE circuit. While our study employed classical pre-optimization instead of parameter optimization on quantum computers due to associated overhead, the next step in the development of the VQE pipeline should involve better circuit parameter initialization and more efficient parameter optimization algorithms. This will enable the complete transfer of the pipeline onto quantum computers, further leveraging their computational power.

While there is a substantial body of work on leveraging quantum computing for drug discovery19,20,21,22,23,24, our pipeline focuses on tackling specific real-world drug design problems. We emphasize the use of a convenient, modular, and hybrid quantum pipeline for drug discovery, which makes it more accessible to drug design experts without a quantum computing background. Additionally, judged against established criteria in the drug design domain, our computational results fall within reasonable bounds. Furthermore, the quantum computing methodologies developed in this study have the potential to extend beyond the presented case studies of Sotorasib and \(\beta \)-lapachone. The integration of quantum computing into QM/MM simulations offers a versatile platform that can be adapted and scaled to address a wide range of molecular targets and complex biological interactions.

Methods

Quantum computing for molecular systems

The VQE algorithm uses a parameterized quantum circuit (PQC) \({|{\psi (\varvec{\theta })}\rangle }\) to construct a quantum state that approximates the ground state of the system. The parameters of the quantum circuit \(\varvec{\theta }\) are optimized to their optimal values \(\varvec{\theta }^*\) using a classical optimization algorithm, such as gradient descent or Newton’s method, to minimize the energy of the quantum state \(E(\varvec{\theta })= {\langle {\psi (\varvec{\theta })|\hat{H}|\psi (\varvec{\theta })}\rangle }\). According to the Rayleigh-Ritz variational principle, \(E(\varvec{\theta }^{*}) \ge E_{\text {ground}}\), and the equality is reached when \({|{\psi (\varvec{\theta }^*)}\rangle }\) is the ground state wave function. Thus, given an expressive PQC, \({|{\psi (\varvec{\theta }^*)}\rangle }\) is a good estimate of the ground state wave function.
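The optimization loop can be illustrated on a toy problem that is small enough to verify by hand. The sketch below is not the molecular calculation from this work: it assumes a made-up single-qubit Hamiltonian \(\hat{H}=a\hat{Z}+b\hat{X}\), a one-parameter \(R_y\) circuit, and plain gradient descent with the gradient obtained from the parameter-shift rule. The exact ground energy of this toy Hamiltonian is \(-\sqrt{a^2+b^2}\), so the variational bound can be checked directly.

```python
import math

A, B = 0.5, 0.3  # made-up coefficients of the toy Hamiltonian H = A*Z + B*X

def energy(theta):
    """E(theta) = <psi|H|psi> for |psi> = Ry(theta)|0>."""
    amp0, amp1 = math.cos(theta / 2), math.sin(theta / 2)
    exp_z = amp0 ** 2 - amp1 ** 2  # <Z>
    exp_x = 2 * amp0 * amp1        # <X>
    return A * exp_z + B * exp_x

def parameter_shift_gradient(theta):
    """Exact dE/dtheta for a single Ry rotation, from two energy evaluations."""
    return 0.5 * (energy(theta + math.pi / 2) - energy(theta - math.pi / 2))

def run_vqe(theta=0.1, learning_rate=0.4, steps=200):
    """Minimize the energy by gradient descent on the circuit parameter."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta, energy(theta)

theta_opt, e_opt = run_vqe()
e_ground = -math.sqrt(A ** 2 + B ** 2)  # exact ground energy of the toy model
```

On hardware, each call to `energy` would be replaced by a measured expectation value, so the optimized energy would carry statistical and device noise rather than matching the exact value.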

For molecular systems, the ab initio Hamiltonian is written as

$$\begin{aligned} \hat{H} = \sum _{pq}h_{pq} \hat{a}^\dagger _p \hat{a}_q + \frac{1}{2}\sum _{pqrs}h_{pqrs}\hat{a}^\dagger _p \hat{a}^\dagger _q \hat{a}_r \hat{a}_s \ , \end{aligned}$$
(1)

where \(h_{pq}\) and \(h_{pqrs} = [ps|qr]\) are one-electron and two-electron integrals, and \(\hat{a}^\dagger _p, \hat{a}_p\) are fermionic creation and annihilation operators, respectively. For chemical systems, the VQE algorithm is composed of several steps. The first step is to calculate the integrals in the Hamiltonian under the molecular orbital basis. Then, the second-quantized fermion Hamiltonian is mapped to a spin Hamiltonian using fermion-qubit mapping, since quantum computers are built based on the spin model. In this work, we employ the parity transformation for saving two qubits

$$\begin{aligned} \hat{a}_j = (\hat{c}_j \otimes {|{0}\rangle } {\langle {0}|}_{j-1} - \hat{c}^\dagger _j \otimes {|{1}\rangle } {\langle {1}|}_{j-1} ) \otimes \bigotimes _{l={j+1}}^{N-1} \hat{X}_l \end{aligned}$$
(2)

Here \(\hat{c}\) is the qubit annihilation operator \(\frac{1}{2} (\hat{X}+i\hat{Y})\), and \(\hat{X}\), \(\hat{Y}\) and \(\hat{Z}\) are Pauli operators. The transformation ensures the preservation of the commutation and anti-commutation properties of fermion operators. After the fermion-qubit mapping, the Hamiltonian in Eq. (1) is transformed to a summation of the products of Pauli operators. More formally, the Hamiltonian can be written as \(\hat{H} = \sum _j^M \alpha _j \hat{P}_j\) where \(\alpha _j\) is the coefficient and \(\hat{P}_j\) is the product of Pauli operators. M is the total number of terms. Each \(P_j\) can be measured on a quantum computer and subsequently, the overall energy is obtained by taking the weighted summation.
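The weighted summation can be sketched as follows. The code estimates each \(\langle \hat{P}_j\rangle \) from bitstring counts and sums the terms with their coefficients. It assumes that the counts for each term were recorded after rotating the qubits into the eigenbasis of that term, so that only the parity of the measured bits matters. The Hamiltonian coefficients and the counts are made-up illustrative numbers, not data from this work.

```python
def pauli_expectation(counts, pauli):
    """Estimate <P> from counts measured in the eigenbasis of P."""
    shots = sum(counts.values())
    total = 0
    for bits, n in counts.items():
        # Only qubits on which P acts non-trivially contribute to the parity.
        parity = sum(int(b) for b, p in zip(bits, pauli) if p != "I") % 2
        total += (-1) ** parity * n
    return total / shots

def estimate_energy(hamiltonian, counts_by_group):
    """hamiltonian: list of (alpha_j, Pauli string, measurement group label)."""
    return sum(alpha * pauli_expectation(counts_by_group[group], pauli)
               for alpha, pauli, group in hamiltonian)

# Idealized counts for the state (|00> + |11>)/sqrt(2), two measurement groups.
counts_by_group = {
    "Z": {"00": 4096, "11": 4096},  # qubits measured directly
    "X": {"00": 4096, "11": 4096},  # measured after a basis rotation to X
}
hamiltonian = [(-1.0, "II", "Z"), (0.4, "ZZ", "Z"), (0.2, "ZI", "Z"), (0.3, "XX", "X")]
toy_energy = estimate_energy(hamiltonian, counts_by_group)
```

Terms that share a measurement group, such as the three strings built from identity and \(\hat{Z}\) here, reuse the same counts, which is why grouping reduces the number of circuits that must be executed.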

The active-space approximation is employed to reduce computational cost and enhance accuracy. The approximation adopts the Hartree-Fock state as the baseline state and chooses an “active space” that is treated with a high-accuracy computational method. In classical computation, the high-accuracy method is usually full configuration interaction (FCI) or density matrix renormalization group (DMRG)65. In our case, quantum computers are employed to solve the problem with the VQE algorithm. The active space is usually constructed in the molecular orbital space. Most commonly, the orbitals closest in energy to the HOMO and LUMO are included in the active space. Meanwhile, the inner shell orbitals are treated at the mean-field level. Thus, this approximation is sometimes also called the frozen core approximation. Denote the set of frozen occupied spin-orbitals by \(\Lambda \). The frozen core provides an effective repulsion potential \(V^{{\rm eff}}\) to the remaining electrons

$$\begin{aligned} V^{{\rm eff}}_{pq} = \sum _{m \in \Lambda } \left( [mm|pq] - [mp|qm] \right) . \ \end{aligned}$$
(3)

The frozen core also bears the mean-field core energy

$$\begin{aligned} E_{{\rm core}} = \sum _{m \in \Lambda } h_{mm} + \frac{1}{2}\sum _{m, n \in \Lambda } \left( [mm|nn] - [mn|nm] \right) . \ \end{aligned}$$
(4)

The ab initio Hamiltonian with the active space approximation is rewritten as

$$\begin{aligned} \hat{H} = \sum _{pq}(h_{pq} + V^{{\rm eff}}_{pq}) \hat{a}^\dagger _p \hat{a}_q + \sum _{pqrs}h_{pqrs} \hat{a}^\dagger _p \hat{a}^\dagger _q \hat{a}_r \hat{a}_s + E_{{\rm core}}, \ \end{aligned}$$
(5)

where the indices p, q, r and s refer to spin-orbitals in the active space.
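Eqs. (3) and (4) are plain tensor contractions over the frozen index set \(\Lambda \). The sketch below evaluates them with NumPy under the assumption that the integrals are stored as dense spin-orbital arrays `h[p, q]` and `eri[p, q, r, s]` \(= [pq|rs]\). The random integrals are placeholders rather than data for any molecule, and the function name is illustrative.

```python
import numpy as np


def frozen_core_terms(h, eri, frozen, active):
    """Return (V_eff, E_core) of Eqs. (3) and (4).

    h[p, q]         : one-electron integrals h_pq over spin-orbitals
    eri[p, q, r, s] : two-electron integrals [pq|rs] over spin-orbitals
    frozen, active  : index lists for the frozen set Lambda and the active space
    """
    f, a = np.asarray(frozen), np.asarray(active)
    # Eq. (3): V_eff_pq = sum_m ([mm|pq] - [mp|qm]), with p, q in the active space
    v_eff = (np.einsum("mmpq->pq", eri[np.ix_(f, f, a, a)])
             - np.einsum("mpqm->pq", eri[np.ix_(f, a, a, f)]))
    # Eq. (4): E_core = sum_m h_mm + 1/2 sum_mn ([mm|nn] - [mn|nm])
    eri_ff = eri[np.ix_(f, f, f, f)]
    e_core = (h[f, f].sum()
              + 0.5 * (np.einsum("mmnn->", eri_ff) - np.einsum("mnnm->", eri_ff)))
    return v_eff, e_core


# Placeholder integrals (random numbers, NOT a real molecule).
rng = np.random.default_rng(0)
n_spin_orbitals = 6
h = rng.normal(size=(n_spin_orbitals, n_spin_orbitals))
eri = rng.normal(size=(n_spin_orbitals,) * 4)
frozen, active = [0, 1], [2, 3, 4, 5]

v_eff, e_core = frozen_core_terms(h, eri, frozen, active)
h_active = h[np.ix_(active, active)] + v_eff   # one-body coefficients of Eq. (5)
print(h_active.shape, e_core)
```

Only `h_active`, the active block of the two-electron integrals, and the constant `e_core` enter the Hamiltonian of Eq. (5) that is passed on to the fermion-qubit mapping.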

Quantum computation of solvation effect

The solvation effect is an important topic in classical computational chemistry66. The polarizable continuum model (PCM) is one of the most popular methods to treat the solvation effect67, and its combination with VQE has been demonstrated on a classical emulator68. PCM treats the solvent molecules as a continuous homogeneous medium with relative permittivity \(\varepsilon _s>1\). The solvent continuum is polarized by the solute molecule and, in turn, modifies the charge distribution of the solute molecule. More specifically, the molecule is considered to reside in a van der Waals molecular cavity defined as a union of spheres centered at the atoms

$$\begin{aligned} \Omega = \bigcup _{j=1}\Omega _j \ . \end{aligned}$$
(6)

The relative permittivity \(\varepsilon (\textbf{r})=1\) for \(\textbf{r} \in \Omega \) and \(\varepsilon (\textbf{r})=\varepsilon _s\) for \(\textbf{r} \notin \Omega \). The additional energy contribution of the electrostatic interaction is

$$\begin{aligned} E_s = \frac{1}{2}\int _{\mathbb {R}^3} \rho (\textbf{r}) V_r(\textbf{r}) d \textbf{r} \ , \end{aligned}$$
(7)

where \(\rho \) is the charge distribution of the solute molecule and \(V_r\) is the reaction-field potential generated by the dielectric continuum. The reaction field \(V_r\) is modeled by a single layer of charges \(\sigma (\textbf{s})\) on the cavity surface \(\Gamma = \partial \Omega \)

$$\begin{aligned} V_r(\textbf{r}) = \int _\Gamma \frac{\sigma (\textbf{s})}{|\textbf{r} - \textbf{s}|} d\textbf{s} \end{aligned}$$
(8)

In this work, we use quantum computers to model the solute molecule and use classical computers to calculate the solvent potential \(V_r\) that is added to the Hamiltonian of the solute molecule.
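To make Eqs. (7) and (8) concrete, the sketch below discretizes the surface charge into point charges and evaluates the reaction-field energy for a solute made of point charges. It is an illustration under simplifying assumptions, not the ddCOSMO solver used in this work. The test case is a single charge \(q\) at the center of a spherical cavity of radius \(R\) in a perfect conductor, for which the induced surface charge is uniform with total charge \(-q\) and the energy is \(-q^2/(2R)\) in atomic units. All function names are hypothetical.

```python
import numpy as np


def fibonacci_sphere(n_points, radius):
    """Roughly uniform points on a sphere centered at the origin."""
    k = np.arange(n_points) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n_points)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * k
    return radius * np.column_stack((np.cos(azimuth) * np.sin(polar),
                                     np.sin(azimuth) * np.sin(polar),
                                     np.cos(polar)))


def reaction_potential(r, surface_points, surface_charges):
    """Discretized Eq. (8): V_r(r) = sum_k q_k / |r - s_k|."""
    distances = np.linalg.norm(surface_points - r, axis=1)
    return np.sum(surface_charges / distances)


def solvation_energy(solute_positions, solute_charges, surface_points, surface_charges):
    """Eq. (7) for point charges: E_s = 1/2 sum_i Q_i V_r(r_i)."""
    potentials = [reaction_potential(r, surface_points, surface_charges)
                  for r in solute_positions]
    return 0.5 * np.dot(solute_charges, potentials)


# Test case: charge q at the center of a spherical cavity in a perfect conductor.
q, R, n = 1.0, 2.0, 200
points = fibonacci_sphere(n, R)
charges = np.full(n, -q / n)      # uniform induced charge, total -q
e_s = solvation_energy(np.zeros((1, 3)), np.array([q]), points, charges)
print(e_s, -q ** 2 / (2 * R))     # numerical value vs. analytic conductor limit
```

For a real molecule the surface charges are not known in advance and must be obtained from the boundary condition on \(\Gamma \).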

In our work, we employ one of the variants of PCM, namely the conductor-like screening model (COSMO)69, to model the solvation effect on real quantum devices. COSMO has become very popular due to its ease of implementation, numerical stability, and insensitivity to outlying charge errors. The COSMO method treats the solvent continuum as a conductor, which simplifies the calculation of \(V_r\), and scales \(E_s\) by a constant factor \(f(\varepsilon _s)\) to take into account the non-conductor nature of the solvents. In the large \(\varepsilon _s\) limit \(f(\varepsilon _s)\) should converge to 1. Based on the conductor model, the surface charge \(\sigma (\textbf{s})\) is obtained by solving the integro-differential equation numerically

$$\begin{aligned} \begin{aligned} -\nabla ^2 V_r(\textbf{r})&= 0 \quad&\text {for} \ \textbf{r} \in \Omega \ , \\ V_r(\textbf{s})&= - \Phi (\textbf{s}) \quad&\text {for} \ \textbf{s} \in \Gamma \ . \end{aligned} \end{aligned}$$
(9)

Here \(\Phi (\textbf{r})=\int _{\mathbb {R}^3} \frac{\rho (\textbf{s})}{|\textbf{r} - \textbf{s}|} d\textbf{s}\) is the potential generated by \(\rho \) in vacuo. The domain decomposition algorithm is one of the most popular methods to solve Eq. (9), offering both high accuracy and high efficiency70,71. The resulting method is dubbed ddCOSMO.

The input for solving Eq. (9) is the solute charge distribution \(\rho \), which can be computed from the one-body reduced density matrix of the solute molecule. As in the case of computing molecular forces, the one-body reduced density matrix can be measured on a quantum computer after the main VQE iteration. After \(\sigma (\textbf{s})\) is determined, the generated potential \(V_r\) is added to the Hamiltonian of the solute molecule and effectively modifies \(h_{pq}\). Then, VQE is performed on the updated Hamiltonian after active-space reduction and yields an updated one-body reduced density matrix, so the procedure can in principle be iterated to self-consistency. In our quantum computer simulations, we observed that the effect of this iteration is smaller than the measurement uncertainty. Therefore, in our quantum computations, we forgo the iteration and perform only a single calculation.

Quantum computation of molecular forces

Most straightforwardly, the molecular forces can be calculated by numerical finite differences over the nuclear coordinates. Analytical computation, if available, is preferred over such an approach, since it is both more efficient and more accurate. In our approach, the HF molecular orbital coefficients \(\textbf{C}\) are determined before the VQE calculation. As a result, the energy is not stationary with respect to variations of the orbital coefficients, \(\frac{\partial E}{\partial \textbf{C}} \ne 0\), and this term contributes to the force expression72.
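The trade-off between the two approaches can be illustrated on a one-dimensional model. The sketch below compares a central finite difference with the analytical derivative for a Morse curve. The curve and its parameters are hypothetical stand-ins for a VQE energy surface and are not taken from this work.

```python
import numpy as np


def morse_energy(x, depth=0.17, width=1.0, x_eq=1.4):
    """Hypothetical Morse curve standing in for the energy E(x)."""
    return depth * (1.0 - np.exp(-width * (x - x_eq))) ** 2


def morse_gradient(x, depth=0.17, width=1.0, x_eq=1.4):
    """Analytical dE/dx of the Morse curve."""
    decay = np.exp(-width * (x - x_eq))
    return 2.0 * depth * width * (1.0 - decay) * decay


def central_difference(energy, x, step=1e-3):
    """Numerical dE/dx from two extra energy evaluations."""
    return (energy(x + step) - energy(x - step)) / (2.0 * step)


x = 1.8
numerical = central_difference(morse_energy, x)
analytical = morse_gradient(x)
print(numerical, analytical, abs(numerical - analytical))
```

A central difference needs two energy evaluations per nuclear coordinate, that is, \(6N_{\text {atom}}\) evaluations for a full gradient, and on a quantum device each evaluation carries its own sampling noise, which the division by the small step amplifies.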

In the general form, the force expression can be obtained by chain-rule differentiation as73

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}E}{\textrm{d}x}&= -f_\text {nuc} - f_{\text {elec}} + \sum _{\mu \nu } (\mu '|\nu ) R_{\mu \nu } + 2\sum _{\mu \nu } (\mu '|h|\nu ) \left( D^I_{\mu \nu } + D^A_{\mu \nu } \right) \\&\quad + 4\sum _{\mu \nu \lambda \sigma } [\mu '\nu |\lambda \sigma ] \times \Big ( 2D^I_{\mu \nu } D^I_{\lambda \sigma } - \frac{1}{2} D^I_{\mu \lambda } D^I_{\nu \sigma } - \frac{1}{2} D^I_{\mu \sigma } D^I_{\nu \lambda } \\&\quad + 2D^I_{\mu \nu } D^A_{\lambda \sigma } - \frac{1}{2} D^I_{\mu \lambda } D^A_{\nu \sigma } - \frac{1}{2} D^I_{\mu \sigma } D^A_{\nu \lambda } \\&\quad + 2D^A_{\mu \nu } D^I_{\lambda \sigma } - \frac{1}{2} D^A_{\mu \lambda } D^I_{\nu \sigma } - \frac{1}{2} D^A_{\mu \sigma } D^I_{\nu \lambda } + P^A_{\mu \nu \lambda \sigma } \Big ) \ . \end{aligned} \end{aligned}$$
(10)

Here, we have switched from the molecular orbital basis to the atomic orbital basis, and we use \(\mu , \nu , \lambda , \sigma \) instead of p, q, r, s as orbital indices to indicate the different basis. In Eq. (10), \( -f_\text {nuc}\) and \( -f_{\text {elec}}\) are the nuclear and electronic Hellmann-Feynman forces given by \({\langle {\psi |\frac{\partial \hat{H}}{\partial x}|\psi }\rangle }\). Since the electron repulsion is invariant to x, \(\frac{\partial \hat{H}}{\partial x}\) is a one-body operator and \({\langle {\psi |\frac{\partial \hat{H}}{\partial x}|\psi }\rangle }\) can be computed from the one-body reduced density matrix. \(\sum _{\mu \nu } (\mu '|\nu ) R_{\mu \nu }\) represents the “density force” contribution74, which stems from the variation of the orbital coefficients \(\textbf{C}\). \(R_{\mu \nu }\) are the matrix elements of \(\textbf{R}=\textbf{C}\varvec{\epsilon }\textbf{C}^\dagger \), where \(\varvec{\epsilon }\) are the HF molecular orbital energies. The remainder of Eq. (10) represents the “integral force” contribution74, which stems from the variation of the basis functions. A primed atomic orbital index in an integral denotes the derivative of that atomic orbital with respect to x. \(\textbf{D}^I\) and \(\textbf{D}^A\) are the one-body reduced density matrices for the inactive and active spaces, respectively, and \(\textbf{P}^A\) is the two-body reduced density matrix for the active space. Thus, in order to compute the molecular forces on quantum computers, the key is to measure the one- and two-body reduced density matrices of the active space.

We note that it is possible to rewrite Eq. (10) as the expectation of a “force operator” that is formally similar to the ab initio Hamiltonian Eq. (1)60. As a result, measurement grouping methods developed for energy measurement can be employed directly to reduce the measurement cost75. In this study, we do not apply this optimization, in order to keep the implementation simple.
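For illustration, one simple grouping strategy is to collect Pauli strings that commute qubit by qubit, so that each group can be measured with a single circuit setting. The greedy sketch below is a generic example of this idea. It is not necessarily the method of ref. 75, it is not used in this study, and the Pauli strings are placeholders.

```python
def qubitwise_commute(p, q):
    """True if two Pauli strings agree on every qubit where neither is the identity."""
    return all(a == b or a == "I" or b == "I" for a, b in zip(p, q))


def group_pauli_strings(strings):
    """Greedily place each string into the first group it is compatible with."""
    groups = []
    for s in strings:
        for group in groups:
            if all(qubitwise_commute(s, t) for t in group):
                group.append(s)
                break
        else:
            groups.append([s])      # no compatible group found: open a new one
    return groups


# Placeholder 2-qubit Pauli strings (not the terms of any Hamiltonian in this work).
terms = ["ZI", "IZ", "ZZ", "XX", "YY", "XI"]
print(group_pauli_strings(terms))
```

The greedy result depends on the order of the input strings and is not guaranteed to use the minimal number of groups.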

Quantum computation details

We employ the hardware-efficient \(R_y\) ansatz as the parameterized quantum circuit for both the covalent bond simulation and the prodrug activation optimization. The \(R_y\) ansatz is suitable for the simulation of chemical systems since it enforces real amplitudes76,77,78,79. Compared with the unitary coupled-cluster family of ansätze80, the hardware-efficient ansatz requires shorter circuits, which helps ensure that quantum gate noise does not significantly degrade our results. The \(R_y\) ansatz is composed of interleaved layers of single-qubit \(R_y\) rotation gates and two-qubit CNOT gates

$$\begin{aligned} {|{\psi (\varvec{\theta })}\rangle }_{R_y} := \prod _{l=k}^1 \left[ L_{R_y}^{(l)}(\varvec{\theta }) L_{{\rm CNOT}}^{(l)} \right] L_{R_y}^{(0)}(\varvec{\theta }) {|{\phi }\rangle }, \end{aligned}$$
(11)

where k is the total number of layers. In this study, we employ \(k=1\) to reduce the negative impact of the quantum gate noise. The layers are defined as

$$\begin{aligned} \begin{aligned} L_{{\rm CNOT}}^{(l)}&= \prod _{j=N-1}^{1} \text {CNOT}[j, j+1], \ \\ L_{R_y}^{(l)}(\varvec{\theta })&= \prod _{j=N}^{1} R_y[{j}](\theta _{lj}). \ \end{aligned} \end{aligned}$$
(12)

Here, \(\text {CNOT}[j, j+1]\) represents a CNOT gate acting on the jth and \((j+1)\)th qubits, and \(R_y[{j}]\) is an \(R_y\) rotation gate acting on the jth qubit. N is the total number of qubits. In our superconducting platform, the \(R_y\) gates are compiled into native \(R_z\) gates.
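Eqs. (11) and (12) can be simulated directly with a small state-vector code. The sketch below is a minimal NumPy illustration, not the TenCirChem implementation. It assumes the reference state \({|{\phi }\rangle } = {|{0\ldots 0}\rangle }\), zero-based qubit indices with qubit 0 as the leftmost tensor factor, the first qubit of each CNOT as the control, and a placeholder 2-qubit Hamiltonian whose coefficients are not taken from this work.

```python
import numpy as np

PAULI = {
    "I": np.eye(2),
    "X": np.array([[0.0, 1.0], [1.0, 0.0]]),
    "Y": np.array([[0.0, -1j], [1j, 0.0]]),
    "Z": np.diag([1.0, -1.0]),
}
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)


def ry(theta):
    """Single-qubit rotation R_y(theta) = exp(-i theta Y / 2); a real matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def embed(gate, first_qubit, n_qubits):
    """Act with `gate` on consecutive qubits starting at `first_qubit` (0-based)."""
    width = gate.shape[0].bit_length() - 1
    left = np.eye(2 ** first_qubit)
    right = np.eye(2 ** (n_qubits - first_qubit - width))
    return np.kron(np.kron(left, gate), right)


def ry_ansatz_state(theta, n_qubits, n_layers=1):
    """State of Eq. (11), starting from |0...0> (assumed reference state)."""
    theta = np.reshape(theta, (n_layers + 1, n_qubits))
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0

    def ry_layer(state, angles):
        for j, angle in enumerate(angles):
            state = embed(ry(angle), j, n_qubits) @ state
        return state

    state = ry_layer(state, theta[0])              # L_Ry^(0)
    for layer in range(1, n_layers + 1):
        for j in range(n_qubits - 1):              # CNOT[1, 2] acts first, Eq. (12)
            state = embed(CNOT, j, n_qubits) @ state
        state = ry_layer(state, theta[layer])      # L_Ry^(l)
    return state


def pauli_matrix(label):
    mat = np.array([[1.0]])
    for symbol in label:
        mat = np.kron(mat, PAULI[symbol])
    return mat


def energy(theta, hamiltonian, n_qubits):
    """Weighted sum of Pauli expectation values, sum_j alpha_j <P_j>."""
    psi = ry_ansatz_state(theta, n_qubits)
    return sum(alpha * np.real(psi.conj() @ pauli_matrix(label) @ psi)
               for label, alpha in hamiltonian.items())


# Placeholder Hamiltonian coefficients (illustrative only).
hamiltonian = {"ZI": 0.4, "IZ": 0.4, "ZZ": 0.1, "XX": 0.2}
theta = [0.3, -1.1, 0.7, 2.0]      # 4 parameters for N = 2 and k = 1
print(ry_ansatz_state(theta, 2), energy(theta, hamiltonian, 2))
```

Because every gate matrix is real, the resulting amplitudes are real for any choice of parameters.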

The classical emulation of quantum computers is performed using the TenCirChem50 and TensorCircuit81 packages. The circuit parameters are pre-optimized on a classical simulator employing the L-BFGS-B optimizer in SciPy82 and the parameter-shift rule for gradients83,84. Due to its efficient architecture, in TenCirChem it takes only a few lines of code to transfer the calculation workflow from classical emulators to real quantum devices. The solvation energy and molecular forces are calculated classically, after obtaining reduced density matrices on quantum computers, via PySCF85.
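For an \(R_y\) gate, the parameter-shift rule gives the exact derivative from two shifted energy evaluations, \(\partial E/\partial \theta _i = \frac{1}{2}\left[ E(\theta _i + \pi /2) - E(\theta _i - \pi /2)\right] \). The sketch below applies the rule to a 2-qubit, single-layer \(R_y\) circuit written out explicitly in NumPy. It is an illustration under assumptions (reference state \({|{00}\rangle }\) and a placeholder Hamiltonian matrix) and not the TenCirChem code path.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

# Placeholder 2-qubit Hamiltonian (coefficients are illustrative only).
H = (0.4 * np.kron(Z, I2) + 0.4 * np.kron(I2, Z)
     + 0.1 * np.kron(Z, Z) + 0.2 * np.kron(X, X))


def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def energy(theta):
    """Energy of the 2-qubit, k = 1 R_y circuit, starting from |00> (assumed)."""
    psi = np.kron(ry(theta[0]), ry(theta[1]))[:, 0]    # first R_y layer on |00>
    psi = CNOT @ psi                                   # entangling layer
    psi = np.kron(ry(theta[2]), ry(theta[3])) @ psi    # second R_y layer
    return psi @ H @ psi


def parameter_shift_gradient(energy_fn, theta):
    """Exact gradient from energies evaluated at theta_i +/- pi/2."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        shift = np.zeros_like(theta)
        shift[i] = np.pi / 2
        grad[i] = 0.5 * (energy_fn(theta + shift) - energy_fn(theta - shift))
    return grad


theta = np.array([0.3, -1.1, 0.7, 2.0])
print(parameter_shift_gradient(energy, theta))
```

A gradient function of this form can be supplied through the `jac` argument of `scipy.optimize.minimize` with `method="L-BFGS-B"`. Unlike a finite difference, the shift is large, so the rule remains well conditioned when the energies are estimated from a finite number of measurement shots.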

Classical computation details

Methods for obtaining the optimal geometry configuration

In our C–C bond cleavage based prodrug activation strategy, we should first obtain the optimal geometric configurations of the corresponding molecules to compute the Gibbs free energy profiles. DFT calculations were performed using Gaussian 16. Specifically, we employed the B3LYP functional within DFT, chose the 6-31+G(d) basis set for the molecular orbitals and used Solvation Model Based on Density (SMD) as the solvation model. Throughout the optimization process, we maintained constant connectivity between atoms and applied Grimme’s D3 dispersion correction.

System preparation for covalent bond simulation

Our simulation started with system preparation using the Amber software suite86, in particular the pdb4amber, antechamber, parmchk2 and tleap packages, among others, to define the fundamental molecular and environmental parameters. This initial setup was crucial for modeling the drug-target complex accurately, since our simulation involves a modified non-standard amino acid in which the Sotorasib molecule is covalently attached to the mutated cysteine on KRAS. The general process includes preprocessing the protein structure, splitting it into parts, converting file formats, generating force field parameters, and finally assembling the parts into a complete system ready for simulation.

In our simulations, the QM region was carefully chosen to include the critical reactive atoms of the KRAS(G12C) mutation (see Fig. 4b). Five atoms that are key to the stability of the covalent bond (SG on the cysteine side, and C18, C17, O16 and C15 on the Sotorasib side) are included in the QM region. A covalent bond is formed between the C18 atom on Sotorasib and the SG atom on cysteine. Two other atoms, C17, which is covalently bonded to C18, and C15, which is in turn covalently bonded to C17, are also included. Another atom, O16, which is sterically positioned close to SG and might consequently affect its position and bonding, is also included. This selection allowed for a detailed analysis of the electronic and structural changes occurring upon drug binding. We also took some inspiration from ref. 87 on how to set up the system for covalent bond simulation.

We employed OpenMM88 for the molecular dynamics part of our study, while PySCF provided the quantum mechanical calculations needed to simulate the covalent interactions with high precision. This combination also ensures a smooth transition to the later quantum computer implementation, since our quantum emulation and real-machine adaptation are based on TenCirChem and PySCF.

A crucial aspect of our simulation was the calibration of parameters such as temperature and pressure to replicate physiological conditions accurately. Considerable care was also taken to formulate a customized Langevin integrator that handles the energy exchange between the QM region and the QC region of the system. This calibration, along with the integration of a custom force field, enabled us to capture the quantum mechanical energies and forces at play during the formation of the covalent bond between Sotorasib and the KRAS(G12C) mutation.

Figure 1

Schematic demonstration for the generalizable quantum computing pipeline for drug discovery. (a) The standard workflow of computer-aided drug design (CADD). (b) The module detailing the quantum computing process involved.

Figure 2

Schematic illustration of the components (4, 5, 6 and TS) involved in the process of the C–C bond cleavage-based activated drug release. For ease of comparison, we have adopted the molecular numbering from the original work on the C–C bond cleavage based prodrug activation strategy.

Figure 3

The 2-qubit quantum circuit used in this study. The state of the quantum circuit is adjusted by 4 parameterized \(R_y\) gates.

Figure 4

Left: the KRAS-Sotorasib bonded structure. The cysteine-Sotorasib part is shown as sticks and the rest of the system as ribbons. Right: choosing the QM region. The atoms labeled SG, C18, C17, O16, and C15 are chosen.

Figure 5

(a) Energy evolution during the MD simulation. The energy stabilizes after an initial equilibration phase. (b) Energy evolution of the QM region. (c) The covalent bond is remarkably stable during the whole simulation. The bond length fluctuates around 1.86 Angstrom with a standard deviation of less than 0.1 Angstrom. The bond length is consistent with values reported in the literature, and the small deviation is a good indication of the bonding stability. (d) Bond angle variations during the simulation (CB-SG-C18 and SG-C18-C17).

Figure 6

Energy transition of the classical simulation (in blue), the noiseless quantum emulation (in orange), and the quantum computer simulation (in green). The fluctuations fall within the permissible deviations of the molecular dynamics simulation.

Figure 7

Moving key computation to the quantum computer. Left: the simulation is started on a quantum computer, and then moved to a classical machine. Right: the simulation is started on a classical machine, moved to a quantum computer halfway, and then moved back to a classical computer.

Table 1 Comparison of the energy barrier \(\Delta G^{\ddag }\) and Gibbs free energy change \(\Delta G\), measured in kcal/mol, for the C–C bond cleavage reaction using classical and quantum computational methods.
Table 2 Gibbs free energy for the molecules studied in this work by classical and quantum computational methods.
Table 3 Comparison of computational wall times for classical computing (CASCI) and quantum computing (VQE) on solving the active space of molecules 4, 5, 6, and TS.
Table 4 Comparison of simulation times for three different experiments: a classic QM/MM simulation, a noiseless quantum emulation, and a quantum computer simulation.