Integrative residue-intuitive machine learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR

Chen, Xin; Wang, Kexin; Chen, Jianfang; Wu, Chao; Mao, Jun; Song, Yuanpeng; Liu, Yijing; Shao, Zhenhua; Pu, Xuemei

doi:10.1038/s41467-024-52399-y


Article
Open access
Published: 16 September 2024


Nature Communications volume 15, Article number: 8130 (2024)


Abstract

Allosteric drugs offer a new avenue for modern drug design. However, the identification of cryptic allosteric sites presents a formidable challenge. Following the allostery nature of residue-driven conformation transition, we propose a state-of-the-art computational pipeline by developing a residue-intuitive hybrid machine learning (RHML) model coupled with molecular dynamics (MD) simulation, through which we can efficiently identify the allosteric site and allosteric modulator as well as reveal their regulation mechanism. For the clinical target β2-adrenoceptor (β2AR), we discover an additional allosteric site located around residues D79^2.50, F282^6.44, N318^7.45 and S319^7.46 and one putative allosteric modulator ZINC5042. Using Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and protein structure network (PSN), the allosteric potency and regulation mechanism are probed to further improve identification accuracy. Benefiting from sufficient computational evidence, the experimental assays then validate our predicted allosteric site, negative allosteric potency and regulation pathway, showcasing the effectiveness of the identification pipeline in practice. We expect that it will be applicable to other target proteins.


Introduction

Allostery represents a critical biological mechanism wherein distant sites within a biomolecule undergo fine-tuned structural and dynamic alterations in response to specific perturbations. Allosteric regulation plays a vital role in diverse biological processes¹. Allosteric drugs can modulate the protein activity by means of non-competitive binding in the allosteric site², thus yielding higher selectivity, specificity, and lower off-target toxicity. Allosteric drugs have been approved for the treatment of various diseases, including cancers, neuropsychiatric disorders, and immune diseases, which offer a new paradigm for modern drug development^3,4. Despite the fascinating advantages of allosteric drugs, their development remains a great challenge, in particular allosteric site identification.

Allostery is an intrinsic property of the protein conformational landscape while the allosteric sites are often cryptic, generally only opening in specific conformational ensembles that may not have an associated resolved 3D structure⁵. Complicated conformational changes often lead to difficulties in discovering the allosteric site experimentally⁶. MD simulation can provide target conformational changes over time with high resolution in full atom detail, thus being considered one of the best approaches to identify and characterize cryptic binding sites⁶. There has been some exploration in using the MD technique to successfully identify the allosteric site^7,8,9. A crucial step for the success of MD-based methods is to mine specific conformational states with an open allosteric pocket from the massive MD conformational space, as it is a prerequisite for subsequent detection of the allosteric site.

The opening of the allosteric site generally occurs on a long time scale. With advancements in computing power and enhanced sampling techniques, MD simulation can more thoroughly sample conformational changes involving the open cryptic pocket, yet this simultaneously leads to a data explosion. Manual analysis of such data is very difficult, risks overlooking subtle but important conformational changes, and is generally subject to human bias. Thus, we face another difficult challenge: how to efficiently capture the conformational state involving the allosteric site? In existing MD works, free energy analysis serves to identify low-energy states along predefined coordinates. In addition, the Markov state model (MSM) is used to find important intermediates in processes of interest such as activation, ligand binding, or dissociation. These low-energy states and intermediate states are taken as target conformations to identify the allosteric sites. Despite some successes, the opening of the allosteric site does not necessarily correspond to the low-energy states or the MSM macro-states, given the nature of allostery. Furthermore, predefined coordinates inferred from domain knowledge can confine the discovery of allosteric sites, because the highly intricate mechanism of allostery has not been fully elucidated. Thus, there remains an unmet need for unbiased and general methods that efficiently identify the conformational states with the open allosteric site from the vast conformational space.

Given that the nature of allostery is a residue-driven conformational transition^5,10, it is reasonable to hypothesize that the residues, which play a key role in conformational changes and coincide with or couple to structural elements in the functional site, are likely to form viable cryptic sites. Theoretically, if we can develop a computational method that can identify important residues determining the signature fluctuations and detect whether they can communicate with functional domains of the protein, the identification efficacy should be substantially improved. Inspired by this, we aim to couple machine learning (ML) into MD to develop an effective identification framework, as ML possesses a powerful capacity in mining causality underlying massive and complex data^11,12. Although ML has been successfully applied in MD fields^13,14 to generate force fields, reduce dimensionality, and estimate free energy surfaces, its application in conformational analysis is very limited. Several works attempted to use traditional ML models with relatively simple model architecture and complicated feature engineering to distinguish the conformations between the ligand-bound state and the apo one for the MD trajectory^15,16, yet the category labels need to be known as an essential prerequisite. In addition, these ML models lack interpretability, thus residue information involving the allostery cannot be obtained. Consequently, it is inaccessible to use the existing ML classification models to identify the conformation state with the opening of a cryptic allosteric site.

GPCRs are the largest and the most successful drug targets, with approximately 35% of FDA-approved drugs targeting them. However, the highly conserved orthosteric site of GPCRs poses challenges in developing subtype-selective orthosteric ligands, while the allosteric modulators with higher selectivity and specificity offer an attractive avenue for GPCR drug development¹⁷. β2AR plays a vital role in cardiovascular and respiratory physiology and thus is a clinically crucial target for widely prescribed drugs like beta-blockers and beta-agonists. Drugs targeting the β2AR orthosteric site often cause cross-reactivity and lead to various therapeutic side effects, which have garnered increasing attention in clinical arenas^18,19. Thus, developing new drugs targeting the β2AR allosteric site holds great significance. Positive allosteric modulators (PAM) of β2AR have therapeutic value for diseases like asthma and chronic obstructive pulmonary disease⁸, while negative allosteric modulators (NAM) have the potential to treat hypertension, arrhythmia, and heart failure²⁰. Unfortunately, only six β2AR allosteric sites have been reported so far^{8,20,21,22,23,24}, and no allosteric drugs for β2AR have been approved. Therefore, the identification of additional β2AR allosteric sites and allosteric ligands is highly desired.

In this work, to circumvent the technical obstacles above, we explore a residue-intuitive hybrid machine learning (RHML) framework that combines unsupervised clustering with an interpretable deep learning multi-classification model. With the framework, we can address the absence of category labels and achieve accurate classification with residue-level interpretability, thus identifying important residues involving the allosteric site. After we identify the putative allosteric site and screen for related modulators, we further probe their communication with functional domains such as the orthosteric site and the active region. Our objective is to pre-evaluate their potential as the allosteric site/modulator and to reveal their regulation mechanism, which is key for ensuring the prediction success rate and rationally engineering allostery in proteins, yet is often overlooked in previous methods of allosteric drug design. In order to validate the efficiency of the identification strategy, we select the β2 adrenergic receptor (β2AR) of the G protein-coupled receptor (GPCR) family as a case study. We discover an allosteric site and a negative allosteric modulator (ZINC5042) for β2AR, which we validate by cell-based functional experiments.

Results

The proposed identification pipeline is illustrated by Fig. 1. Here, extensive Gaussian accelerated molecular dynamics (GaMD) simulations are first performed to enhance sampling in order to construct a sufficient conformation space (Fig. 1a). With the conformation space, a residue-intuitive hybrid machine learning (RHML) framework is constructed, which is composed of an unsupervised clustering and an interpretable convolutional neural network (CNN) based multi-classifier (Fig. 1b). By using RHML, we can determine the optimal number of clusters (labels) and the conformation state with the opening of the allosteric site (Fig. 1c). Then, the allosteric site is identified by FTMap coupled with the LIME interpreter of RHML (Fig. 1c). The potential allosteric modulators are screened from two compound datasets, based on the identified allosteric site (vide Fig. 1d). As illustrated by Fig. 1e, the regulation effect of the allosteric site/drug and its regulation pathway are further probed by conventional MD (cMD), binding energy analysis, structural analysis, and regulation pathway analysis. Finally, experimental validation is performed by cAMP accumulation assays, β-arrestin recruitment assays and site-directed mutagenesis experiments (vide Fig. 1f). In total, this work involves six systems, 15-μs GaMD simulations and 22.5-μs cMD simulations. Supplementary Table 1 lists detailed MD information for each system.
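The simulation totals quoted above can be cross-checked against the run lengths given later in the text (five 3-μs GaMD runs; three 1.5-μs cMD replicas for each of five complexes). The short sketch below does that arithmetic. Reading "six systems" as one GaMD system plus five cMD complexes is an assumption made here, and the short 120-ns screening runs are assumed not to be counted in the total.

```python
# Simulation lengths as stated in the text (all times in microseconds).
GAMD_REPLICAS = 5        # independent GaMD runs of NE-bound inactive β2AR
GAMD_LENGTH_US = 3.0     # length of each GaMD run

CMD_SYSTEMS = [
    "β2AR-ZINC5042", "β2AR-NE", "β2AR-ALE",
    "β2AR-NE-ZINC5042", "β2AR-ALE-ZINC5042",
]
CMD_REPLICAS = 3         # independent cMD runs per complex
CMD_LENGTH_US = 1.5      # length of each cMD run

gamd_total_us = GAMD_REPLICAS * GAMD_LENGTH_US
cmd_total_us = len(CMD_SYSTEMS) * CMD_REPLICAS * CMD_LENGTH_US

# Assumption: "six systems" = the GaMD system plus the five cMD complexes.
n_systems = 1 + len(CMD_SYSTEMS)

print(gamd_total_us, cmd_total_us, n_systems)
```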

**Fig. 1: Overview of investigation framework.**

Construction of the residue-intuitive hybrid machine learning (RHML)

In order to capture the conformation state with a cryptic allosteric site from the vast unknown conformation space, we need to perform two tasks. One is to first classify the conformations according to structure differences between conformational classes. The other is to determine which class has important residue fluctuations associated with the functional region, as allostery is a functional mechanism driven by the residue conformation transition. To this end, an unsupervised classification task was first utilized to label the conformation categories in the unknown trajectory space generated by GaMD. Herein, we selected unsupervised clustering, which has widely served as an auto-labeling strategy^25,26. With the labels obtained, we further trained a supervised classification model to identify the conformation state with the opening of the allosteric site. In the two tasks, we need to address two main technical challenges. One is how to determine the optimal number of clusters in the unsupervised clustering, and the other is how to obtain residue-level interpretability in the supervised classification model. Thus, a residue-intuitive hybrid machine learning framework (RHML in Fig. 1b) was constructed by combining an unsupervised clustering and a supervised classification model. Herein, the k-means algorithm was adopted, as it is considered a popular and effective method for MD trajectory clustering²⁷. For the supervised model, an interpretable CNN-based multi-classification model was exploited to achieve accurate classification with the capacity to identify important residues deciding the classification result (vide Fig. 2). In the deep learning model, a pixel map representation was proposed to avoid hand-crafted feature engineering, which risks information loss in the conformation representation. 
Accordingly, a convolutional neural network with powerful learning capacity on images was utilized to realize accurate classification in terms of the category labels inferred from the unsupervised clustering (Fig. 2a). More importantly, we explored an interpreter based on the locally linear approximation paradigm (named the LIME interpreter) to address the black-box limitation of deep learning in interpretability. Based on the interpreter, we could further identify key residues deciding the classification result, through which the conformation state with the putative allosteric site can be captured (Fig. 2b). Technical details regarding the interpretable CNN-based multi-classification model are described in Methods.
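The locally linear approximation idea behind a LIME-style interpreter can be illustrated with a toy sketch: regions of the input are switched on or off, the black-box classifier is queried on each perturbed input, and a proximity-weighted linear model is fitted whose coefficients rank the regions. This is not the authors' implementation (that is described in Methods). The region names, the `toy_classifier` stand-in for the CNN, the kernel width, and the exhaustive enumeration of masks (real LIME samples perturbations randomly and usually fits a sparse model) are all assumptions made for illustration.

```python
import math
from itertools import product

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def lime_weights(predict, n_features, kernel_width=0.75):
    """Fit a proximity-weighted linear surrogate around the unperturbed input."""
    masks = list(product([0, 1], repeat=n_features))           # 1 = region kept
    rows = [[1.0] + [float(v) for v in mask] for mask in masks]  # intercept + mask
    ys = [predict(mask) for mask in masks]
    # Masks that hide fewer regions are closer to the original input.
    ws = [math.exp(-((1 - sum(mask) / n_features) ** 2) / kernel_width ** 2)
          for mask in masks]
    p = n_features + 1
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(ws, rows)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(ws, rows, ys)) for i in range(p)]
    return solve_linear(xtwx, xtwy)[1:]  # drop the intercept

# Hypothetical black box: class probability driven by regions 1 and 3 only.
def toy_classifier(mask):
    return 0.1 + 0.6 * mask[1] + 0.3 * mask[3]

REGIONS = ["TM2", "TM6", "ECL2", "TM7"]  # hypothetical region labels
weights = lime_weights(toy_classifier, len(REGIONS))
ranking = sorted(zip(REGIONS, weights), key=lambda t: -t[1])
print(ranking)
```

In this toy case the surrogate recovers the two regions that actually drive the black box, which is the property that lets the interpreter point to residues deciding a classification.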

**Fig. 2: Architecture of the interpretable CNN-based multi-classification model proposed by the work.**

To ensure a reliable classification result and a rational interpretation, a high prediction accuracy is required for the supervised classifier. Thus, the accuracy of the classification model can serve as a feedback metric to determine the optimal number of categories for the unsupervised clustering (Fig. 1b). The optimal number will act as the final labels of the classification model to identify the key residues with the aid of the LIME interpreter.

A conformation ensemble with putative allosteric site is identified by RHML for β2AR

To cover the opening of cryptic pockets, five independent 3-μs GaMD simulations were carried out for the inactive β2AR bound by the agonist norepinephrine (NE) to generate an extensive ensemble of receptor conformations, through which 150,000 conformations from the five trajectories were extracted to construct data sets for machine learning (see “Methods” for more details). The conformations of every trajectory were first clustered based on the root mean square deviation (RMSD) of the receptor backbone atoms excluding the highly flexible ICL3 region (residue numbers: 231–262). In the text, we only present the result from one trajectory (labeled as traj1) with a putative allosteric site. The other four trajectory results and related discussion are placed in Supplementary Figs. 1, 2 which do not involve new allosteric sites. To determine the optimal number of clusters k, which is also an open and challenging problem for unsupervised clustering, we considered three clustering evaluation indices (SSR/SST, pSF, DBI) to initially estimate k values from 2 to 7 (Fig. 3a). SSR/SST represents the explained variance, and the value closer to 1 indicates better clustering. As reflected by the green line in Fig. 3a, SSR/SST increases gradually with increasing k, but the rise becomes weak after a critical value. Using the elbow method²⁸ to identify the point of maximum curvature in the curve, the optimal number of clusters can be determined to be k = 3. pSF, a metric measuring separation between all the clusters, suggests that larger values correspond to better clustering. The red line in Fig. 3a shows the rise of pSF with increasing k values, favoring k = 7 or higher as the best choice. DBI measures similarity within and between clusters. Smaller DBI values imply better clustering. It turns out that k = 2 has the smallest DBI (DBI = 0.522), while k = 3 (DBI = 0.526) is very close to k = 2. 
In other words, both k = 2 and k = 3 can be taken as reasonable choices (yellow line in Fig. 3a). The inconsistency of the optimal cluster number between different clustering metrics is a common phenomenon, as there may not be a definite optimal k value for complex data. Thus, the choice of the k value depends on balancing different validity indices and considering specific research purpose²⁹. Herein, we referenced the classification accuracy of the CNN-based model. Figure 3b shows the CNN-based classification accuracy for different numbers of clusters. We first excluded k = 6 and k = 7 due to low classification accuracy (<0.8), which would drop the reliability of our LIME interpreter. Similarly, k = 5 was not considered due to its poor performance in the DBI index. Compared with k = 2 and k = 4, both SSR/SST and DBI indices favor k = 3. Furthermore, our DL-based classification model also achieves a prediction accuracy of 0.903 ± 0.004 at k = 3. Taken together, we used the labels of the three categories to train the interpretable CNN-based classifier, in which the LIME interpreter identified important residues deciding the classification result, as shown in Fig. 3c–h.
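For readers unfamiliar with the three validity indices, the sketch below computes them for a given labeling using their common textbook definitions: SSR/SST as the fraction of total variance explained by the clusters, pSF as the pseudo-F ratio (SSR/(k−1))/(SSE/(n−k)), and DBI as the Davies-Bouldin index. It is assumed here that the authors used these standard forms. The toy 2D points stand in for conformations; the paper clusters on backbone RMSD instead.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def validity_indices(points, labels):
    """Return (SSR/SST, pSF, DBI) for one clustering of `points`."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k, n = len(clusters), len(points)
    grand = centroid(points)
    cents = {lab: centroid(pts) for lab, pts in clusters.items()}

    sst = sum(math.dist(p, grand) ** 2 for p in points)            # total
    sse = sum(math.dist(p, cents[lab]) ** 2
              for p, lab in zip(points, labels))                   # within clusters
    ssr = sst - sse                                                # between clusters
    psf = (ssr / (k - 1)) / (sse / (n - k))

    # Davies-Bouldin: mean over clusters of the worst (scatter_i + scatter_j) / d_ij.
    scatter = {lab: sum(math.dist(p, cents[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    dbi = sum(
        max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters if j != i)
        for i in clusters
    ) / k
    return ssr / sst, psf, dbi

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
good = validity_indices(points, [0, 0, 0, 0, 1, 1, 1, 1])
bad = validity_indices(points, [0, 1, 0, 1, 0, 1, 0, 1])
print(good, bad)
```

The sensible labeling scores higher on SSR/SST and pSF and lower on DBI than the arbitrary one, matching the direction of "better" described in the text for each index.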

Fig. 3: Performance of the RHML model and important residues identified by the LIME interpreter for *traj1.*

It can be seen from Fig. 3f, h that the important residues deciding the cluster0 and cluster2 mainly distribute at the extracellular end of TM6 and TM7 as well as the extracellular loops (ECL2 and ECL3). These regions were already revealed to involve an allosteric site of β2AR reported²³, demonstrating the effectiveness of our RHML for identifying the allosteric site. Since our objective is to discover additional allosteric sites, the reported site was not considered for further investigation. Interestingly, for cluster1, important residues identified distributed in the middle and near the intracellular end of TM6 and TM7, including key residues such as F282^6.44, N318^7.45, N322^7.49, and P323^7.50, which were revealed to be molecular switches in the receptor activation (vide Fig. 3d, g). The observation suggests that the cluster1 undergoes specific conformational changes in these important regions associated with the activity of β2AR, implying its functional potential. Given that allostery is a functional mechanism involving the GPCR activity, we selected representative conformations from cluster1 to further identify allosteric sites by using FTMap.

One additional allosteric site identified by FTMap coupled with the RHML interpreter

FTMap is an energy-based method for identifying binding sites, which has been accepted as an effective tool to predict potential allosteric sites within the helical regions of GPCRs³⁰. We selected two representative conformations (labeled as Conf1 and Conf2) from cluster1, which can account for 70% of conformations, to perform the site mapping by means of FTMap. In the two conformations, FTMap detects more than ten consensus sites (CSs), as shown in Supplementary Fig. 3. In fact, how to effectively identify the sites with potential allosteric function has been a difficult task for various pocket identification tools. Since the cryptic allosteric sites are exposed to conformational changes, it is reasonable to assume that the pockets near the important residues, which decide the conformational states and associate with function regions, will have a high possibility of acting as the allosteric site. Thus, the result from the LIME interpreter of our RHML framework was utilized to facilitate the allosteric site identification. It was found that one allosteric site in Conf1 identified by FTMap is comprised of CS0, CS1, CS2, and CS6, while the identified site in Conf2 consists of CS0, CS3, and CS4. These probe clusters are in proximity to some important residues revealed by our LIME interpreter (vide Supplementary Fig.3). Table 1 lists the allosteric site residues identified for Conf1 and Conf2. It can be seen that the majority of residues are the same for the two conformations and only several residues are different due to the flexibility of the site. Thus, they represent the same binding site despite some differences in shape and size, as reflected in Fig. 4a, b. The binding site is located in the middle of the protein helical bundle and near to the sodium binding site (see Supplementary Fig. 4 for details). In addition, the pocket includes I121^3.40, F282^6.44, and S319^7.46, which play important roles in regulating the activity of β2AR²³. 
Taken together, it can be expected that drugs targeting the binding site most probably modulate the β2AR signaling and function, implying high potential as an allosteric site. It is noted that the allosteric site does not open in active and inactive crystal structures of β2AR, as evidenced by Supplementary Fig. 5. More interestingly, the cryptic allosteric site has not been reported for other GPCRs^{31,32,33,34,35}.
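The idea of using the interpreter output to narrow down FTMap consensus sites can be sketched as a simple proximity filter: keep only the consensus sites that lie close to residues flagged as important. The coordinates, the site and residue entries, and the 6 Å cutoff below are hypothetical placeholders chosen for illustration; the text does not state the distance criterion the authors applied.

```python
import math

def sites_near_residues(site_centers, residue_coords, cutoff=6.0):
    """Names of consensus sites whose center lies within `cutoff` Å of any
    important residue. The cutoff value is an assumption for illustration."""
    kept = []
    for name, center in site_centers.items():
        if any(math.dist(center, xyz) <= cutoff for xyz in residue_coords.values()):
            kept.append(name)
    return kept

# Hypothetical probe-cluster centers (Å) and important-residue positions (Å).
site_centers = {
    "CS0": (1.0, 0.0, 0.0),
    "CS1": (0.0, 4.0, 3.0),
    "CS5": (20.0, 0.0, 0.0),
}
important_residues = {
    "F282": (0.0, 0.0, 0.0),
    "N318": (0.0, 0.0, 6.0),
}

candidate_sites = sites_near_residues(site_centers, important_residues)
print(candidate_sites)
```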

Table 1 The allosteric site residues predicted for the two representative conformations (Conf1 and Conf2) of the cluster1 category


**Fig. 4: Allosteric site and drug screening strategy.**

Screening potential allosteric modulator by virtual screening and MM/GBSA

Virtual screening has been successfully employed to identify allosteric modulators, including those targeting GPCRs^36,37. Protein flexibility is widely accepted to be crucial for structure-based drug design, and it was reported that multi-conformational virtual screening with two or three conformations of the target could improve the final enrichment and chemical diversity of the hit compounds³⁸. As outlined above, the putative allosteric sites identified exhibit some differences in shape and size between Conf1 and Conf2 due to pocket flexibility. Therefore, we conducted the multi-conformational virtual screening based on the two conformations, as depicted in Fig. 4c. The ligand set is composed of two datasets (Diverse-lib and Drugs-lib) obtained from MTiOpenScreen³⁹. Diverse-lib consists of 99,288 chemically diverse molecules suitable for screening novel drug scaffolds. Drugs-lib contains 4574 purchasable approved drug molecules, which facilitates drug repurposing with the advantages of reduced time and costs in drug development. In total, the ligand set contains 103,862 ligand molecules after a series of operations, for example, removing redundancy, evaluating drug-likeness, filtering toxic groups, and analyzing chemical diversity. Then, a preliminary screening was performed using the MTiOpenScreen platform. The top 3000 molecules from each conformation underwent further docking evaluations using Autodock 4.2. The docking score (AD4.2 energy) was used for ranking, along with visual inspection (see Supplementary Information for details) to exclude potential high-ranked false positives. Finally, four hit compounds (Fig. 4c) were selected from the top 20 compounds. Figure 4d shows predicted binding modes for the four ligands with β2AR. Interestingly, the compound ZINC5042 from Drugs-lib scores well in both conformations, as evidenced by Supplementary Table 2. 
As the MD simulation combined with the MM/GBSA calculation can improve the binding affinity prediction of poses obtained from docking protocols⁴⁰, we selected the conformation with the best docking score for each of the four ligands to perform a short 120-ns cMD simulation and used the last 10 ns of each MD trajectory to conduct MM/GBSA binding free energy calculation. As shown in Fig. 4c, the MM/GBSA binding energies indicate that the four hit compounds all can stably bind to β2AR. Practically, ZINC11681534, with the weakest binding energy (− 33.35 kcal/mol), was already reported to be a β2AR antagonist²³, implying that the other three compounds with better binding affinity may have potential efficacies. Herein, we selected ZINC5042 with the highest affinity as a representation of the promising compounds to verify our design strategy.
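One way to picture the multi-conformational ranking step is to pool the top-ranked ligands from each receptor conformation, keep each ligand's best (most negative) docking score, and flag ligands that rank highly in every conformation, as the text notes for ZINC5042. The sketch below is a simplified illustration, not the authors' protocol: the ligand names and scores are hypothetical, and the real workflow also involved visual inspection and MM/GBSA rescoring.

```python
def merge_top_hits(scores_by_conf, top_n):
    """Pool the top_n ligands of each conformation.

    Returns (ligand, best_score, in_every_top_list) tuples sorted by best score;
    lower (more negative) docking energies are better.
    """
    tops = [set(sorted(s, key=s.get)[:top_n]) for s in scores_by_conf]
    pooled = set().union(*tops)
    rows = []
    for lig in pooled:
        best = min(s[lig] for s in scores_by_conf if lig in s)
        rows.append((lig, best, all(lig in t for t in tops)))
    return sorted(rows, key=lambda r: r[1])

# Hypothetical docking energies (kcal/mol) against two conformations.
conf1 = {"lig_A": -11.2, "lig_B": -9.8, "lig_C": -7.1, "lig_D": -6.0}
conf2 = {"lig_A": -10.5, "lig_C": -9.9, "lig_B": -6.5, "lig_D": -8.0}

hits = merge_top_hits([conf1, conf2], top_n=2)
for ligand, score, in_both in hits:
    print(ligand, score, "top-ranked in both" if in_both else "one conformation only")

# Library size quoted in the text: Diverse-lib plus Drugs-lib.
library_size = 99_288 + 4_574
```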

Interaction mode between the allosteric ligand ZINC5042 and the receptor

To more reliably estimate the interaction of ZINC5042 with the receptor, we extended the simulation time of the β2AR-ZINC5042 complex to three independent 1.5-μs cMD simulations (see Supplementary Fig. 6 for RMSD). Based on the last 100 ns of the equilibrated trajectories, we calculated the binding free energy by using MM/GBSA and decomposed it into per-residue contributions. As reflected by Fig. 5a, the hotspot residues for ZINC5042 binding are distributed in TM2, TM3, TM6, and TM7 of β2AR. Figure 5b illustrates detailed interactions between ZINC5042 and the hotspot residues. It can be seen that ZINC5042 is bound deep within the transmembrane helix bundle mainly through polar and van der Waals interactions. Notably, the residue D79^2.50 contributes significantly to the ligand binding by forming an essential salt bridge with the polar head of ZINC5042. Another hotspot residue, S319^7.46, forms a hydrogen bond with the alcohol hydroxyl group of the ligand’s polar head, further stabilizing the ligand-receptor complex. The chrysene moiety of ZINC5042 occupies a hydrophobic pocket composed of several hotspot residues (L75^2.46, L124^3.43, I278^6.40, F282^6.44, N322^7.49, and P323^7.50), forming extensive van der Waals interactions that contribute to the overall stability of the complex. It is noted that some functional residues of the allosteric pocket revealed above, such as D79^2.50, S319^7.46, F282^6.44, N322^7.49, and P323^7.50, contribute substantially to binding ZINC5042, implying a functional potential of ZINC5042.

**Fig. 5: Hotspot residues for the interaction of β2AR with the ZINC5042 ligand in the β2AR-ZINC5042 system and binding energies of β2AR with three ligands in five systems.**

The allosteric modulator weakens the binding of orthosteric agonist

In order to estimate the allosteric potency of ZINC5042, we first examined its impact on the orthosteric ligand and the receptor. To this end, two endogenous agonists of β2AR that differ somewhat in signaling, i.e., NE and L-epinephrine (ALE)⁴¹, were considered in order to provide more sufficient evidence. Three independent 1.5-μs cMD simulations (see Supplementary methods for simulation details) were carried out for each of the five complex systems: β2AR-ZINC5042, β2AR-NE, β2AR-ALE, β2AR-NE-ZINC5042, and β2AR-ALE-ZINC5042. RMSD values show that the five systems reach equilibrium, as reflected by Supplementary Fig. 6. MM/GBSA was used to calculate their ligand-receptor binding free energies based on the last 100 ns of each trajectory. As shown in Fig. 5c, without the allosteric modulator ZINC5042 bound, the binding free energies between the orthosteric ligands and the receptor are −21.34 ± 2.19 kcal/mol for NE and −31.73 ± 2.77 kcal/mol for ALE. The agonist ALE exhibits a higher affinity than NE, consistent with the experimental results of the inhibitory constant (Ki)⁴¹. However, after the allosteric modulator ZINC5042 is bound, the binding energies between the two agonists and the receptor are weakened to −14.58 ± 3.65 kcal/mol for NE and −13.07 ± 0.85 kcal/mol for ALE. These results indicate that the allosteric modulator ZINC5042 significantly reduces the affinity of the orthosteric agonists for the receptor. In addition, the two agonists also weaken the binding of ZINC5042 to the receptor, as evidenced by a comparison between β2AR-ZINC5042 (−56.46 ± 1.68 kcal/mol), β2AR-NE-ZINC5042 (−48.17 ± 4.02 kcal/mol) and β2AR-ALE-ZINC5042 (−50.18 ± 3.06 kcal/mol) in Fig. 5c. The observation reveals negative cooperativity of the binding energy between the orthosteric agonist and the allosteric modulator, implying a negative allosteric potency of ZINC5042.
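The negative cooperativity can be made explicit by taking differences of the mean MM/GBSA values quoted above: a positive shift means binding became weaker when the second ligand was present. The sketch below only does this subtraction on the reported means. It ignores the reported standard deviations, so it is a reading aid rather than a statistical test.

```python
# Mean MM/GBSA binding free energies (kcal/mol) quoted in the text for Fig. 5c.
agonist_alone = {"NE": -21.34, "ALE": -31.73}
agonist_with_modulator = {"NE": -14.58, "ALE": -13.07}
modulator_alone = -56.46
modulator_with_agonist = {"NE": -48.17, "ALE": -50.18}

def shift(bound, reference):
    """Change in binding energy; positive values indicate weaker binding."""
    return round(bound - reference, 2)

# Effect of ZINC5042 on each agonist's binding energy.
agonist_shift = {a: shift(agonist_with_modulator[a], agonist_alone[a])
                 for a in agonist_alone}
# Effect of each agonist on ZINC5042's binding energy.
modulator_shift = {a: shift(modulator_with_agonist[a], modulator_alone)
                   for a in modulator_with_agonist}

print(agonist_shift)
print(modulator_shift)
```

All four shifts are positive, which is the pattern the text describes as mutual weakening between the orthosteric agonist and the allosteric modulator.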

The allosteric modulator drives the receptor to the inactive conformation

To further estimate the effect of ZINC5042 on the activity of β2AR, we compared the structural differences upon binding ZINC5042 by superposing β2AR-ALE-ZINC5042 and β2AR-NE-ZINC5042 with the inactive and active crystal structures of β2AR. Since the structures are similar between the two systems (β2AR-ALE-ZINC5042 and β2AR-NE-ZINC5042), we only presented the superposition result of β2AR-ALE-ZINC5042 with the two crystal structures in the text, while the structural superposition of β2AR-NE-ZINC5042 was provided in Supplementary Fig. 7. As reflected by Fig. 6a, most regions of the receptor in the β2AR-ALE-ZINC5042 complex system resemble those of the inactive receptor, in particular for the activation region at the intracellular side. In the active state of β2AR, the intracellular ends of TM5 and TM6 typically exhibit outward movement, creating an open intracellular cavity for downstream protein binding. However, in the β2AR-ALE-ZINC5042 complex, the intracellular ends of TM5 and TM6 remain closed, exhibiting an inactive-like conformation that occludes downstream protein coupling (Supplementary Fig. 8).

**Fig. 6: The β2AR structural comparison between β2AR-ALE-ZINC5042 (blue), the crystal structures of inactive (PDB ID: 2RH1, light yellow) and active states (PDB ID: 3SN6, salmon).**

As is widely accepted, GPCR activation is an allosteric process initiated by perturbations in the extracellular binding pocket and transmitted to the intracellular region for downstream protein binding through the activation of molecular switches. These molecular switches include residues of the PIF motif (P211^5.50-I121^3.40-F282^6.44) and three residues (D79^2.50-S120^3.39-S319^7.46) of the sodium ion binding site⁴², which are located in the middle of the transmembrane helix bundle and close to the allosteric pocket. As shown in Fig. 6b, the PIF motif in the active crystal structure undergoes rearrangement with respect to the inactive crystal structure. The rearrangement facilitates the outward movement of the cytoplasmic end of TM6, which has been considered to be necessary for GPCR activation⁴³. In addition, D79^2.50, S120^3.39 and S319^7.46 in the active crystal structure move closer to each other than in the inactive crystal structure, leading to the disruption of the sodium ion pocket. Consequently, the inward movement of TM7 is promoted, which is another characteristic of GPCR activation⁴³.

In contrast to the activation features above, the β2AR-ALE-ZINC5042 structure exhibits outward movement of the PIF motif residues (P211^5.50, I121^3.40, F282^6.44) and two sodium ion binding pocket residues (S120^3.39, S319^7.46), as evidenced by Fig. 6b. The outward movement mainly results from the steric hindrance between ZINC5042 and the four residues of the allosteric site (D79^2.50, S120^3.39, F282^6.44 and S319^7.46), as reflected by Fig. 6c. To further observe conformational change in the sodium ion binding pocket induced by the ZINC5042 binding, we compared the distance between S120^3.39 and S319^7.46 (labeled as d1) of the sodium ion binding pocket with those of the active and inactive crystal structures, as shown in Fig. 6d. It can be seen that the collapse of the sodium ion pocket upon activation causes d1 to decrease from 6.3 Å in the inactive state to 4.5 Å in the active state. In the β2AR-ALE-ZINC5042 system, d1 is always greater than 4.5 Å, indicating that the ZINC5042 binding inhibits the collapse of the sodium ion pocket. Similarly, the distance between P211^5.50 and F282^6.44 (labeled as d2) of the PIF motif is used to characterize the conformation of the PIF motif (Fig. 6e). Upon activation, the inward movement of P211^5.50 causes d2 to decrease from 11.1 Å in the inactive crystal structure to 9.8 Å in the active crystal. In the β2AR-ALE-ZINC5042 system, d2 is always greater than 9.8 Å, indicating that ZINC5042 binding inhibits the conformational rearrangement of the PIF motif induced by the agonist, thereby limiting the outward movement of the TM6 intracellular segment. Collectively, the binding of ZINC5042 to the receptor inhibits the transition of the receptor’s conformation to the active state, further suggesting the potential of ZINC5042 as a negative allosteric modulator (NAM).
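The distance criteria above lend themselves to a simple per-frame check: compute d1 (or d2) in every frame and ask what fraction of frames stays above the active-state value. The sketch below uses the crystal-structure reference distances quoted in the text; the per-frame coordinates are hypothetical placeholders, and which atoms define each residue-residue distance is not stated in this section, so one point per residue is assumed.

```python
import math

# Reference distances from the text (Å): active vs inactive crystal structures.
D1_ACTIVE, D1_INACTIVE = 4.5, 6.3    # S120^3.39 to S319^7.46 (sodium pocket)
D2_ACTIVE, D2_INACTIVE = 9.8, 11.1   # P211^5.50 to F282^6.44 (PIF motif)

def fraction_above(distances, threshold):
    """Fraction of frames whose distance exceeds `threshold`."""
    return sum(d > threshold for d in distances) / len(distances)

# Hypothetical per-frame coordinates (Å) of one point per residue of the d1 pair.
d1_frames = [
    ((0.0, 0.0, 0.0), (5.8, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 6.2, 0.0)),
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
]
d1_series = [math.dist(a, b) for a, b in d1_frames]

frac_not_collapsed = fraction_above(d1_series, D1_ACTIVE)
print(d1_series, frac_not_collapsed)
```

A fraction of 1.0 corresponds to the situation described in the text, where d1 stays above the active-state value throughout the trajectory; the same function applies to d2 with `D2_ACTIVE`.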

Allosteric regulation mechanism revealed by protein structure network (PSN)

To gain insights into how the allosteric modulator regulates the orthosteric agonists, we employed PSN to calculate the shortest pathway with the highest frequency between the allosteric site and the orthosteric site for the β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042 systems. The shortest pathway is usually considered to be the most likely or biologically relevant pathway⁴⁴. The residues of the allosteric site and the orthosteric site are shown in Supplementary Table 3. As reflected by Fig. 7a, the shortest pathways are F282^6.44-W286^6.48-F290^6.52 for β2AR-NE-ZINC5042 and F282^6.44-L212^5.51-W286^6.48-F290^6.52 for β2AR-ALE-ZINC5042, suggesting the importance of these residues in the allosteric regulation. Both pathways include W286^6.48 and F282^6.44, which belong to the CWxP and PIF motifs, respectively. The two motifs are conserved in class A GPCRs⁴⁵ and have been reported to play important roles in GPCR activation. Their presence in the shortest pathway implies that the allosteric regulation should influence the receptor activation.
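The shortest-pathway search on a residue network can be illustrated with a small example. This is a sketch under stated assumptions, not the Wordom implementation used in this work: the node labels, edges and weights below are a toy network, and the function name `shortest_path` is hypothetical.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative edge weights."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None  # the two nodes are not connected
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]

# Toy undirected network; edges and weights are illustrative only.
edges = [("A", "B", 1.0), ("B", "D", 1.0), ("A", "C", 1.0),
         ("C", "E", 1.0), ("E", "D", 1.0)]
toy_graph = {}
for u, v, w in edges:
    toy_graph.setdefault(u, {})[v] = w
    toy_graph.setdefault(v, {})[u] = w

print(shortest_path(toy_graph, "A", "D"))
```

In a trajectory-based analysis, such a search would be repeated over frames and the most frequently recurring path reported.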

**Fig. 7: Allosteric regulation pathways derived from PSN.**

To further understand how the allosteric modulator influences the receptor activation induced by the orthosteric agonists, we analyzed the shortest pathways between the orthosteric site and the intracellular activation region for the four systems, β2AR-NE, β2AR-ALE, β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042 (Fig. 7b), in which we selected residues of the orthosteric site and intracellular activation region as starting and ending nodes (see Supplementary Table 3 for details), respectively. Without binding the allosteric modulator, the shortest pathways are D113^3.32-V86^2.57-M82^2.53-S319^7.46-D79^2.50-N322^7.49-Y326^7.53 for β2AR-NE and F290^6.52-W286^6.48-F282^6.44-I121^3.40-M215^5.54-M279^6.41-Y219^5.58 for β2AR-ALE. The two pathways include residues of the sodium ion binding site (S319^7.46, D79^2.50) and the PIF motif (F282^6.44, I121^3.40), respectively, through which the agonist regulates the receptor activation.

However, after binding ZINC5042, the two ternary complex (β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042) systems exhibit the same shortest pathway (F290^6.52-W286^6.48-F282^6.44-M279^6.41-Y219^5.58), which differs significantly from those of the systems without ZINC5042. As reflected by the residues in the shortest pathway, the regulation from the orthosteric site to the activation region in the two ternary complex systems mainly relies on intra-helical structural communication within TM6 and contains only one inter-helical structural communication (M279^6.41-Y219^5.58) at the end of the pathway. In contrast, for the systems binding only the orthosteric agonists, the shortest pathways exhibit extensive inter-helical structural communication between TM2, TM3 and TM7 in β2AR-NE and between TM3, TM5 and TM6 in β2AR-ALE, as evidenced by Fig. 7b. Inter-helical communication that favors conformational changes is considered crucial for receptor activation⁴⁶, while intra-helical interactions often stabilize the existing conformations⁴².

Taken together, it can be assumed that the allosteric ligand binding would decrease the inter-helical structure communication, thus disfavoring the activation signaling stimulated by the agonist. Moreover, it is noted that the important residue F282^6.44 of the allosteric pocket participates in both the regulation pathway from the allosteric site to the orthosteric site and that from the extracellular orthosteric site to the intracellular activation region, highlighting the importance of F282^6.44 for the allosteric signaling of ZINC5042.

Experimental validation on the pharmacological property of ZINC5042 and the allosteric site

To confirm the pharmacological property of ZINC5042, we first measured the efficacy of ZINC5042 alone for β2AR activation by using a cell-based functional assay. Our result indicates that ZINC5042 fails to activate β2AR (Fig. 8a). Next, we investigated the allosteric effect of ZINC5042 on the G-protein signaling of β2AR induced by the two orthosteric agonists, using the Glosensor-based cAMP assay. It is observed that ZINC5042 antagonizes both NE- and ALE-induced cAMP accumulation, as evidenced by a dose-dependent decrease in NE- and ALE-induced receptor activation with increasing ZINC5042 concentration (Fig. 8b, c). These observations clearly confirm that ZINC5042 displays negative cooperativity with NE (log αβ = −0.82; αβ = 0.15) as well as ALE (log αβ = −1.15; αβ = 0.07) in a cell-based assay, strongly supporting our computational results. β2AR was reported to be engaged in the Gs signaling pathway and recruitment of β-arrestin. Given that biased allosteric modulators that exert pathway-specific effects have given rise to new frontiers in GPCR drug discovery⁴⁷, we further conducted the NanoBiT β-arrestin recruitment assay to test whether ZINC5042 would influence the β-arrestin recruitment. First, our results show that ZINC5042 alone cannot induce β2AR-mediated β-arrestin recruitment, in contrast to NE (Fig. 8d). Interestingly, ZINC5042 inhibits NE-induced β-arrestin2 recruitment in a dose-dependent manner (Fig. 8e), behaving as a negative allosteric modulator (NAM) of β2AR on β-arrestin signaling. To evaluate the biased character of ZINC5042 on the β2AR signaling, we further explored the preferred signaling pathway of ZINC5042 by NanoBiT β-arrestin recruitment and Glosensor cAMP accumulation assays. Compared with NE alone, the β-arrestin2 recruitment ability of β2AR is reduced to approximately 54% by the addition of 80 μM ZINC5042 (Fig. 8f). In contrast, G protein activation is reduced to approximately 10% (Fig. 8f). These findings suggest that the efficacy of Gs activation is more significantly attenuated than the β-arrestin recruitment in the presence of NE and ZINC5042, indicating that ZINC5042 acts as a G protein-biased NAM of β2AR. Similar to other β2AR negative allosteric modulators reported^20,48,49, ZINC5042 also presents allosteric activity at micromolar concentrations. However, the target selectivity test (see Supplementary Fig. 9 for details) indicates that ZINC5042 is highly selective for β2AR in its pharmacological function. In addition, we also performed cell-based functional assays for the other three hit compounds screened (ZINC11681543, ZINC4213962, and ZINC252008995). As evidenced by Supplementary Fig. 10, the three hit compounds all exhibit negative allosteric modulator (NAM) effects, also supporting the potential of our screening strategy in practical application.

**Fig. 8: Experimental validation for the ZINC5042’s potency and the allosteric site.**

To validate the allosteric site of ZINC5042 in β2AR predicted by our computational framework, we performed site-directed mutagenesis studies and cell-based functional assays in the presence of the orthosteric agonist NE, with and without ZINC5042. The result indicates that most of the mutations on the residues of the predicted allosteric binding site reduce the ability of ZINC5042 to inhibit NE-induced responses. Specifically, the Glosensor-based cAMP assay results show that D79^2.50A, F282^6.44A, N318^7.45A, and S319^7.46A significantly reduce the antagonistic effect of ZINC5042 (Fig. 8g), and nearly all of these residues were identified as key residues for binding ZINC5042 by the binding energy analysis above. For example, D79^2.50 is revealed to contribute significantly to the ZINC5042 binding through an essential salt bridge (Fig. 8h). S319^7.46, as a hotspot residue, forms hydrogen bonding with the alcohol hydroxyl group of the polar head of ZINC5042. It can be seen from Supplementary Fig. 11 that S319^7.46 forms hydrogen bonding with ZINC5042 more frequently than N318^7.45 in the three independent simulations, which rationalizes the observation that S319^7.46, rather than N318^7.45, is identified as the hotspot residue for ZINC5042 binding in the MM/GBSA analysis above. This is likely due to the rotation of the hydroxyl group of ZINC5042, which leads ZINC5042 to alternately form hydrogen bonding with S319^7.46 or the adjacent N318^7.45 (Fig. 8h), indicating that N318^7.45 also plays a role in stabilizing ZINC5042 despite a shorter duration of hydrogen bonding than S319^7.46. The computational result rationalizes the experimental observation that the mutations on S319^7.46 and N318^7.45 both give rise to a drop in the antagonistic effect of ZINC5042, with a greater decrease for the S319^7.46A mutation than for N318^7.45A (Fig. 8g).

For F282^6.44, the binding mode analysis above already reveals that it interacts with the chrysene moiety via π–π stacking interactions (see also Fig. 8h). Moreover, F282^6.44 is revealed above to serve as a regulatory residue in the two allosteric pathways (i.e., one from the allosteric site to the orthosteric site and one from the orthosteric site to the intracellular activation domain). Thus, the mutation F282^6.44A exerts the most pronounced antagonistic effect on the downstream signaling, which also supports the allosteric regulation mechanism revealed above. Collectively, the pharmacological and site-directed mutagenesis experiments are in line with our computations, strongly validating the reliability of our computational framework for the allosteric site, allosteric effect, and allosteric mechanism.

Discussion

In this work, following the allosteric nature of the residue-driven conformational transition, we developed a general and state-of-the-art computational framework by coupling the residue-intuitive hybrid machine learning (RHML) model with MD simulations, in order to efficiently identify allosteric sites and discover potential allosteric drugs. The RHML model was developed by combining unsupervised clustering with an interpretable CNN-based multi-classification model, which addresses the limitations of existing ML models in MD conformational analysis, including determining the optimal number of categories, the information loss in conformation representation and the residue-based interpretation of prediction results. Consequently, RHML enables accurate conformation classification and identification of the important residues that determine the different conformational classes for any MD trajectory. Benefiting from these technical advantages, RHML unveils a previously unreported allosteric site in β2AR and other GPCRs.

The additional allosteric site is located around residues D79^2.50, F282^6.44, N318^7.45, and S319^7.46, and we used virtual screening against this site to discover a putative allosteric modulator, ZINC5042. Assisted by extensive cMD simulations, MM/GBSA, and PSN, we further probed the communication of the allosteric site/modulator with the orthosteric site/agonist, which is very important for estimating the allosteric potential and thereby improving the success rate of allosteric site/drug identification. MM/GBSA shows that ZINC5042 weakens the binding of the orthosteric agonists to β2AR in a negatively cooperative manner. The structural analysis indicates that ZINC5042 hinders the collapse of the sodium ion binding pocket and the conformational transition of the PIF motif to the active state, thus driving the receptor conformation to the inactive state. PSN indicates that binding of the allosteric modulator ZINC5042 would decrease the inter-helical structure communication, thus disfavoring the activation signaling stimulated by the agonist. In addition, some important allosteric regulation residues are identified. Building on this computational evidence, the Glosensor-based cAMP assay and site-directed mutagenesis experiments strongly validate the computational prediction of the allosteric site and the negative allosteric effect, clearly confirming that the key residues identified (D79^2.50, F282^6.44, N318^7.45 and S319^7.46), and F282^6.44 in particular, indeed play important roles in binding the allosteric modulator and inhibiting the activation signaling induced by the orthosteric agonists.

It is noted that six allosteric sites of β2AR were reported^8,20,21,22,23,24. However, three^20,21,24 of these sites belong to protein-protein interaction (PPI) binding sites. The other three reported sites^8,22,23 and our identified site present preformed cavities, which facilitate drug binding compared with the PPI binding sites⁵⁰. Similar to the exosite reported²³, the allosteric site identified by us implies extra potential as a target for novel bitopic ligands compared to the other two sites^8,22, since it is located closer to the sodium binding pocket. Collectively, the allosteric site identified by us offers another avenue to develop allosteric modulators for β2AR. Also, the important residues identified above are beneficial to rationally engineering allostery in β2AR.

As a major component of the interface between the sympathetic nervous system and the cardiovascular system, the β-adrenergic receptor signaling pathway plays a key role in the progression of heart failure. β-adrenergic receptor antagonists (β-blockers or βAR antagonists) are widely used in the treatment of congestive heart failure (CHF) due to their antagonistic effect on β-adrenergic receptors. It has been suggested that β-arrestin-biased agonists that selectively target β2AR may be more beneficial to the treatment of CHF⁵¹. ZINC5042, as a G protein-biased allosteric modulator of β2AR, retains some β-arrestin activity while significantly reducing endogenous ligand-activated G protein activity. Compared to other reported NAMs of β2ARs^20,48,49, the negative allosteric modulator ZINC5042 exhibits comparable effects and unique pathway specificity with the G protein bias. To the best of our knowledge, it is the first reported G-protein biased NAM for β2AR, promising a new generation of β-blockers and a novel pharmacological tool compound. Furthermore, ZINC5042 is an experimental anticancer agent investigated in Phase I clinical trials⁵². Its previously acquired data on the drug’s safety and toxicity could be instrumental in its future development, thus offering an advantage in accelerating the progress toward practical applications by drug repurposing. Besides, ZINC5042 also provides a blueprint for lead optimization to develop more potent NAMs.

Overall, the identification pipeline offers a promising strategy to discover allosteric sites/drugs and reveal their regulation mechanisms for other target proteins. Thus, we have made user-friendly code for the residue-intuitive hybrid machine learning framework available at https://github.com/chyannn06/RHML. The code offers customizable input options and automatically generates readable output files that include the cluster categories and the important residues that determine the classification. We expect that it will serve as a valuable tool in the MD field for aiding allosteric site identification and other MD tasks associated with conformational analyses.

Methods

System setup for MD simulations

The crystal structures of the inactive (PDB ID: 2RH1)⁵³ and active state (PDB ID: 4LDO)⁵⁴ of β2AR were obtained from the Protein Data Bank. Other components, except the protein, were removed from the crystal structure, and the missing intracellular loop 3 (ICL3) region (residue numbers: 231–262) was reconstructed using MODELER V9.2⁵⁵. The 3D structure of ALE was obtained from the co-crystallized structure (PDB ID: 4LDO). The 3D structure of NE was downloaded from the PubChem database⁵⁶ and optimized at the DFT/B3LYP/6-31G** level using the Gaussian 09 program⁵⁷ before docking. All ligand dockings were performed with AutoDock 4.2⁵⁸, and the rational docking pose with the top score was selected for subsequent MD simulations.

To prepare the system for MD simulation, hydrogen atoms were added under pH = 7 conditions by H++⁵⁹. The receptor structure was aligned using the Orientation of Protein in Membrane (OPM) database and inserted into a lipid bilayer comprised of 80% phosphatidylcholine (POPC) and 20% cholesterol. The system was solvated and neutralized with 0.15 mol/L NaCl in the aqueous phase. The CHARMM36 force field was used for the receptor, lipids, and salt ions, while the CHARMM TIP3P model was chosen for water⁶⁰. Ligand parameters were generated using the CHARMM General Force Field (CGenFF)⁶¹. These settings were successfully used in MD simulations of GPCRs^62,63. All these steps were carried out using the CHARMM-GUI server⁶⁴. After that, the systems were minimized and equilibrated (see Supplementary Methods for more details).

GaMD Molecular dynamics simulations

To sufficiently sample conformational changes associated with the opening of the allosteric site, we utilized GaMD⁶⁵ to enhance sampling (see Supplementary Methods for details). Before performing GaMD simulations, a 210-ns cMD production run was performed, from which the acceleration parameters were calculated. The final structure of the 210-ns cMD simulation was selected as the starting structure for subsequent GaMD simulations with random initial velocities. For the inactive β2AR bound by NE, we carried out five independent 3-μs GaMD simulations to ensure sufficient sampling, labeled as traj1, traj2, traj3, traj4, and traj5. All the simulations reached convergence (Supplementary Fig. 12). MD simulations were performed using Amber16 software⁶⁶. Detailed simulation parameters are described in Supplementary Methods.

Construction of the interpretable CNN-based multi-classification model

The foundational paradigm of the deep learning-based classification model mainly followed our previous binary classification method⁶⁷. However, different from the previous work, the labels of the conformational categories in this work are not known in advance, and the conformations encompass multiple categories rather than simple binary classes⁶⁷. Consequently, our previous classification strategy needed to be modified to address these differences and to handle more extensive and complex MD conformational analysis. Thus, we introduced the k-means algorithm to obtain the initial category labels. With the category labels, the interpretable CNN-based multi-classification model was constructed and trained. Specifically, the XYZ coordinate of each conformation was transformed to the RGB coordinate ${C}_{{RGB}}$ by using a matrix transformation (vide Eq. (1)).

$${C}_{{XYZ}}=M \cdot {C}_{{RGB}}$$

(1)

Consequently, each conformation was represented by a pixel map, where each pixel corresponds to an atom. These pixel maps and their category labels inferred from the clustering analysis were utilized to train the CNN-based classification model (Fig. 2a). The CNN model is composed of four convolutional layers, two max-pooling layers, and two fully connected layers. Rectified linear units (ReLU) were used as the activation function to increase the model’s nonlinearity. The fully connected layers of the model include two dense layers, with the first dense layer containing 512 neurons. The number of neurons in the final dense layer is dependent on the number of classes inferred from the clustering result. Softmax activation was used for the multi-classification. To prevent overfitting, dropout techniques were employed after the first and second max-pooling layers, as well as the first dense layer, with dropout rates of 0.25, 0.25, and 0.5, respectively. Model training utilized the Adam optimizer and categorical cross-entropy loss function, with prediction accuracy as the performance metric.
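The coordinate-to-pixel mapping of Eq. (1) can be sketched as solving $C_{XYZ}=M\cdot C_{RGB}$ for $C_{RGB}$, one atom at a time. The matrix $M$ is not specified in this section, so the example below uses the standard linear sRGB to CIE XYZ (D65) matrix purely as a placeholder; the input coordinates and the helper names `det3`, `solve3` and `matvec` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, b):
    """Solve m @ x = b for x by Cramer's rule."""
    d = det3(m)
    x = []
    for col in range(3):
        replaced = [[b[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        x.append(det3(replaced) / d)
    return x

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

# Placeholder for M in Eq. (1): linear sRGB -> CIE XYZ (D65). An assumption,
# not necessarily the matrix used in this work.
M_PLACEHOLDER = [
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
]

# Hypothetical, already rescaled atomic coordinates (one row per atom).
atoms_xyz = [[0.30, 0.40, 0.50], [0.10, 0.20, 0.15]]

# One RGB triplet per atom, i.e. one pixel per atom.
pixel_map = [solve3(M_PLACEHOLDER, xyz) for xyz in atoms_xyz]
print(pixel_map[0])
```

Multiplying each pixel by the same matrix recovers the original coordinates, so the mapping itself loses no information.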

To address the black-box problem, we further established an interpreter for the CNN-based classification result based on the Local Interpretable Model-Agnostic Explanation (LIME) paradigm⁶⁸. LIME utilizes linear models to approximate the local decision boundary, which can provide an approximate explanation for the classification result of the CNN-based model. To identify important residues deciding each category, the LIME interpreter generated distinct sets of LIME matrices for each class. Figure 2b illustrates how the LIME interpreter works for the multi-classifier. To obtain predictions for the model being explained f, we generated a perturbed dataset A with small perturbations based on instance a being explained (vide red star in Fig. 2b) and weighted them by ${\pi }_{x}\left(a\right)$ that characterizes the proximity measure between the instances x and a. ${\pi }_{x}\left(a\right)$ was determined by an exponential kernel defined on a distance function D (Euclidean distance used in the work) with width σ, and expressed as Eq. (2).

$${\pi }_{x}\left(a\right)=\,{e}^{\frac{-D{(x,a)}^{2}}{{\sigma }^{2}}}$$

(2)

Next, we trained a local linear model l (vide gray dotted line in Fig. 2b) on the perturbed dataset to interpret the black-box model locally. To assess the fidelity of the linear model l in approximating the original model f for explanation, we calculated the error using Eq. (3):

$$L\left(f, l, \pi_{x}\right)=\sum_{a, a^{\prime} \in A} \pi_{x}\left(a\right)\left(f\left(a\right)-l\left(a^{\prime}\right)\right)^{2}$$

(3)

where $f\left(a\right)$ and $l({a}^{{\prime} })$ are the probability belonging to a certain class and ${a}^{{\prime} }$ is the interpretable version of a. The explanation produced by LIME is the optimal result that minimizes the loss function $L\left(f,{l,\pi }_{x}\right)$ and the complexity measure $\varOmega (l)\,$, which was calculated by Eq. (4)

$$\mathrm{explanation}(x)=\mathop{\mathrm{argmin}}\limits_{l}\, L\left(f, l, \pi_{x}\right)+\varOmega (l)$$

(4)

The complexity measures $\varOmega (l)$ penalize the model that has too many features or coefficients to ensure its interpretability.

For each conformation, a LIME matrix was generated to evaluate the importance of each pixel in the classification of the specific class, where the values can be either 0 (insignificant) or 1 (significant). The LIME matrices from all conformational states were summed and averaged to calculate a score ranging from 0 to 1, which can reflect the importance of the atom in distinguishing the class from the others. Then, the average importance scores for all atoms within a residue were calculated to present the importance of the residue. The higher score represents the greater importance of distinguishing different conformational states.
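The two averaging steps described above (over conformations, then over the atoms of each residue) can be sketched as follows. The binary masks and the atom-to-residue assignment are toy data, the masks are flattened to one entry per atom for simplicity, and the function name `residue_importance` is hypothetical.

```python
from collections import defaultdict

def residue_importance(lime_matrices, atom_to_residue):
    """Average binary LIME masks over conformations, then over atoms per residue."""
    n_conf = len(lime_matrices)
    n_atoms = len(atom_to_residue)
    # Per-atom score in [0, 1]: fraction of conformations marking the atom significant.
    atom_scores = [sum(mask[i] for mask in lime_matrices) / n_conf
                   for i in range(n_atoms)]
    per_residue = defaultdict(list)
    for score, residue in zip(atom_scores, atom_to_residue):
        per_residue[residue].append(score)
    # Per-residue score: mean of the scores of its atoms.
    return {res: sum(vals) / len(vals) for res, vals in per_residue.items()}

# Toy input: four conformations, four atoms, two residues.
masks = [[1, 0, 1, 0],
         [1, 1, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 1, 0]]
atom_to_residue = ["R1", "R1", "R2", "R2"]

print(residue_importance(masks, atom_to_residue))
```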

In order to train RHML, five independent 3-μs GaMD trajectories (traj1 to traj5) were used to construct the conformation dataset, in which 30,000 conformations of each trajectory were divided into ten groups based on the time order. Each group was randomly split into training and validation sets (8:2 ratio) to conduct five-fold cross-validation training. The results from the five trajectories were analyzed.
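One way to read the splitting scheme above is sketched below: the frames of a trajectory are cut into ten time-ordered groups, each group is shuffled, and in fold k the k-th fifth of every group forms the validation set, which gives an 8:2 ratio in each of the five folds. This is an interpretation rather than the exact procedure of this work; the function name `time_blocked_folds` and the random seed are assumptions.

```python
import random

def time_blocked_folds(n_frames=30000, n_groups=10, n_folds=5, seed=0):
    """Return a list of (train_indices, validation_indices), one pair per fold."""
    rng = random.Random(seed)
    group_size = n_frames // n_groups
    groups = []
    for g in range(n_groups):
        # Contiguous block of frames, i.e. a group in time order.
        idx = list(range(g * group_size, (g + 1) * group_size))
        rng.shuffle(idx)  # random split within the group
        groups.append(idx)
    folds = []
    for k in range(n_folds):
        train, val = [], []
        for idx in groups:
            chunk = len(idx) // n_folds
            val.extend(idx[k * chunk:(k + 1) * chunk])
            train.extend(idx[:k * chunk] + idx[(k + 1) * chunk:])
        folds.append((train, val))
    return folds

folds = time_blocked_folds()
for train, val in folds:
    print(len(train), len(val))
```

Splitting within time-ordered groups keeps every stage of the trajectory represented in both the training and the validation sets.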

Mapping algorithm

We employed the FTMap site mapping online server (http://ftmap.bu.edu) to identify binding sites on the important conformations identified by the RHML model above. FTMap utilizes 16 small molecule probes with diverse properties to search for hot spots on the conformation. The optimal binding positions of the probes are calculated and then clustered based on free energy to yield consensus clusters. The regions that bind different probe clusters are called consensus sites (CS). CSs are ranked by the number of bound probes, starting with consensus site 0 (CS0) with the largest number of probe clusters. If the distances between the bound probe clusters of any consensus sites are within 4 Å, they are considered to form a single binding site. In this work, the residues of a binding site were defined as those within 4 Å of these bound probe clusters. The mapping results were visually inspected using PyMOL (https://pymol.org), and the most promising allosteric sites were determined by combining the results of the LIME interpreter.
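The 4 Å merging rule for consensus sites can be sketched as single-linkage grouping. This is a simplified illustration, not FTMap's own code: each probe cluster is reduced to a single made-up center coordinate, and the function name `merge_sites` is hypothetical.

```python
import math

def merge_sites(centres, cutoff=4.0):
    """Group cluster centres so that any two within `cutoff` Å share a site."""
    parent = list(range(len(centres)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            if math.dist(centres[i], centres[j]) <= cutoff:
                parent[find(i)] = find(j)

    sites = {}
    for i in range(len(centres)):
        sites.setdefault(find(i), []).append(i)
    return sorted(sites.values())

# Hypothetical cluster centres (Å); indices stand for CS0, CS1, CS2, CS3.
centres = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(merge_sites(centres))
```

Here the first three clusters are chained into one site even though the first and third are 6 Å apart, because each is within 4 Å of the second.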

Virtual screening

Structure-based drug design (SBDD) generally includes structural-based virtual screening (VS) and structural-based de novo drug design (DNDD). VS docks molecules of the virtual library into the receptor structure and predicts their binding scores, while DNDD creates novel chemical entities based on the receptor structures^69,70. Compared to DNDD, VS possesses the advantage of mitigating the problem of drug synthesis, as it uses large libraries of pre-synthesized compounds. Thus, it has become mainstream at the early hit identification stage³⁷.

In this work, VS was conducted using a ligand set comprising a total of 103,862 molecules, obtained from the diverse-lib and drugs-lib databases provided by MTiOpenScreen on 6 August 2022³⁹. In total, 6000 molecules (including stereoisomers) were obtained through preliminary screening with MTiOpenScreen and were then docked using AutoDock 4.2⁵⁸. All docking input files were prepared using the AutoDockTools 1.5.6 package, and the grid map files of the active site were generated using AutoGrid 4.2. Gasteiger charges were added to atoms. The docking box was positioned to cover the predicted allosteric site from FTMap, with a grid spacing of 0.375 Å. Semi-flexible docking was performed with the flexible ligand and the rigid receptor. To ensure accuracy, each ligand underwent 100 separate docking calculations. Each docking calculation included a total of 1,750,000 energy evaluations using the Lamarckian genetic algorithm. The docking pose with the lowest binding energy was selected as the optimal binding mode for subsequent analysis.

Binding free energy analysis

Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) method has been considered as a reliable tool to estimate the binding free energy for protein-ligand interactions, which can be calculated in terms of Eq. (5),

$$\varDelta {G}_{{binding}}={G}_{{complex}}-({G}_{{receptor}}+{G}_{{ligand}})$$

(5)

where ${G}_{{complex}}$, ${G}_{{receptor}}$ and ${G}_{{ligand}}$ represent the free energy of the receptor-ligand complex, receptor and ligand, respectively. Each free energy term in Eq. (5) was estimated by Eqs. (6)–(8).

$$G={E}_{{gas}}+{G}_{{sol}}-{TS}$$

(6)

$${E}_{{gas}}=\,{E}_{\mathrm{int}}+{E}_{{vdw}}+{E}_{{ele}}$$

(7)

$${G}_{{sol}}={G}_{{psolv}}+{G}_{{npsolv}}$$

(8)

The gas phase energy (${E}_{{gas}}$) is the sum of the internal energy (${E}_{\mathrm{int}}$), van der Waals energy (${E}_{{vdw}}$), and electrostatic interaction energy (${E}_{{ele}}$). The solvation energy (${G}_{{sol}}$) comprises contributions from polar solvation (${G}_{{psolv}}$) and non-polar solvation (${G}_{{npsolv}}$) energies. T represents temperature and $S$ denotes the total conformational entropy. Following some high-quality computational works^71,72, the entropy contribution was not considered in this study due to the high computational cost and the potential errors from the entropy calculations⁷³. All binding free energy calculations were performed using the SANDER program in AMBER16.
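The bookkeeping of Eqs. (5)–(8) can be made explicit with a short example. All energy values below are invented for illustration (nominally in kcal/mol) and do not come from this work; the entropy term is set to zero to mirror the choice described above, and the function names are hypothetical.

```python
def free_energy(e_int, e_vdw, e_ele, g_psolv, g_npsolv, ts=0.0):
    """Free energy of one species from its MM/GBSA components."""
    e_gas = e_int + e_vdw + e_ele   # Eq. (7)
    g_sol = g_psolv + g_npsolv      # Eq. (8)
    return e_gas + g_sol - ts       # Eq. (6); ts = 0 when entropy is neglected

def binding_free_energy(complex_terms, receptor_terms, ligand_terms):
    """Eq. (5): G_complex - (G_receptor + G_ligand)."""
    return free_energy(**complex_terms) - (
        free_energy(**receptor_terms) + free_energy(**ligand_terms))

# Made-up component energies for a single snapshot.
complex_terms = dict(e_int=100.0, e_vdw=-250.0, e_ele=-400.0,
                     g_psolv=-300.0, g_npsolv=-20.0)
receptor_terms = dict(e_int=90.0, e_vdw=-200.0, e_ele=-350.0,
                      g_psolv=-320.0, g_npsolv=-15.0)
ligand_terms = dict(e_int=10.0, e_vdw=-5.0, e_ele=-20.0,
                    g_psolv=-25.0, g_npsolv=-2.0)

print(binding_free_energy(complex_terms, receptor_terms, ligand_terms))
```

In practice such a value would be averaged over many snapshots extracted from the MD trajectory.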

Protein structure network (PSN)

Herein, the PSN method, which has proved successful in computational studies⁷⁴, was employed to investigate allosteric communication in the receptor. In PSN, residues are represented as nodes, and interactions between two nodes are represented as edges in a network. An edge is formed between two nodes only if the non-covalent interaction strength between the two nodes equals or exceeds a given cutoff, where the interaction strength is defined by Eq. (9):

$${I}_{{ij}}=\frac{{n}_{{ij}}}{\sqrt{{N}_{i}{N}_{j}}}\times 100$$

(9)

where ${I}_{{ij}}$ represents the percentage interaction between nodes i and j. The term ${n}_{{ij}}$ denotes the number of heavy atom-atom pairs between the side chains of residues i and j within a distance cutoff (4.5 Å). ${N}_{i}$ and ${N}_{j}$ are normalization factors for residues i and j. After PSN is constructed, the shortest pathways between pairs of nodes can be searched using Dijkstra’s algorithm. Then, the correlation matrix is utilized to filter these shortest pathways. Herein, the dynamic cross-correlation (DCC) algorithm was used to estimate the motion correlation between residues by Eq. (10):

$${C}_{ij}=\frac{\overline{\left({r}_{i}(t)-\overline{{r}_{i}}\right)\cdot \left({r}_{j}(t)-\overline{{r}_{j}}\right)}}{\sqrt{\left(\overline{{r}_{i}{(t)}^{2}}-{\overline{{r}_{i}}}^{2}\right)\left(\overline{{r}_{j}{(t)}^{2}}-{\overline{{r}_{j}}}^{2}\right)}}$$

(10)

where i and j denote residues, and ${r}_{i}\left(t\right)$ and ${r}_{j}\left(t\right)$ are the corresponding position vectors at time t. $\bar{r}$ means the ensemble average over a period of time. DCC characterizes the extent of residue-residue movement correlations within a range from 1.0 to −1.0, where 1.0 indicates completely correlated motion and −1.0 denotes completely anti-correlated motion. Cross-correlation analysis and PSN were performed using Wordom software⁷⁵.
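The interaction strength of Eq. (9) and a dynamic cross-correlation coefficient can be computed as in the sketch below. It is not the Wordom implementation: the correlation is written in the standard normalized form, which stays between −1.0 and 1.0, and the trajectories, atom-pair count and normalization factors are made-up inputs.

```python
import math

def interaction_strength(n_ij, norm_i, norm_j):
    """Percentage interaction between two residues, Eq. (9)."""
    return n_ij / math.sqrt(norm_i * norm_j) * 100.0

def dcc(traj_i, traj_j):
    """Normalized cross-correlation of two position trajectories.

    Each trajectory is a list of (x, y, z) positions, one per frame.
    """
    n = len(traj_i)
    mean_i = [sum(p[k] for p in traj_i) / n for k in range(3)]
    mean_j = [sum(q[k] for q in traj_j) / n for k in range(3)]
    cov = var_i = var_j = 0.0
    for p, q in zip(traj_i, traj_j):
        dp = [p[k] - mean_i[k] for k in range(3)]  # displacement of residue i
        dq = [q[k] - mean_j[k] for k in range(3)]  # displacement of residue j
        cov += sum(a * b for a, b in zip(dp, dq))
        var_i += sum(a * a for a in dp)
        var_j += sum(b * b for b in dq)
    return cov / math.sqrt(var_i * var_j)

# Toy trajectories: residue j moves with residue i, residue k moves against it.
traj_i = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
traj_j = [(5.0, 1.0, 0.0), (6.0, 1.0, 0.0), (7.0, 1.0, 0.0)]
traj_k = [(2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

print(dcc(traj_i, traj_j), dcc(traj_i, traj_k))
print(interaction_strength(6, 36.0, 64.0))
```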

cAMP Accumulation assay

The intracellular cAMP levels of HEK293 cells overexpressing β2AR were examined in response to the two agonists under study (NE (TargetMol, T7044) and ALE (Med Chem Express (MCE), HY-B0447B)) and the allosteric ligands screened (ZINC5042 (MCE, HY-108999A), ZINC252008995 (MCE, HY-15337), ZINC4213962 (MCE, HY-100572), ZINC11681534 (MCE, HY-B0203A)). The GloSensor-based cAMP accumulation assay was performed as described previously^76,77. Briefly, HEK293 cells were transfected with β2AR plasmids and GloSensor plasmids in 6-well plates using Polyethylenimine Linear (PEI) MW40000 (Yeasen, Cat# 40816ES02). After 24 h incubation, cells were seeded into 96-well plates and incubated for another 24 h at 37 °C. The next day, the culture medium was discarded, and the cells were washed twice with PBS buffer and then incubated for 1 h at room temperature in 90 μL assay buffer (Hank’s Balanced Salt Solution buffer containing 10 mM HEPES, pH 7.4) containing a 3% v/v dilution of the D-luciferin-potassium salt (Yeasen, Cat# 40902ES03). After that, ligands diluted in the same buffer were added to the cells. After 30 min of stimulation at room temperature, luminescence was measured by a Synergy H1 microplate reader (BioTek). Data were processed by the nonlinear regression (curve fit) dose-response function in GraphPad Prism 8. All data are the mean ± SEM from three independent experiments. An operational model was used to describe the interaction between ZINC5042 and the orthosteric ligand NE or ALE in the Glosensor-based cAMP assay, as shown below⁷⁸.

$${\rm{E}}{\rm{ffect}}=\frac{{E}_{\max }{({\tau }_{A}\left[A\right]\left({K}_{B}+\alpha \beta \left[B\right]\right)+{\tau }_{B}[B]{K}_{A})}^{n}}{{(\left[A\right]{K}_{B}+{K}_{A}{K}_{B}+{K}_{A}\left[B\right]+\alpha [A][B])}^{n}+{({\tau }_{A}\left[A\right]({K}_{B}+\alpha \beta \left[B\right])+{\tau }_{B}\left[B\right]{K}_{A})}^{n}}$$

(11)

${E}_{\max }$ is the maximal response of the system; $\left[A\right]$ and $[B]$ are the concentrations of orthosteric ligand NE (or ALE) and allosteric modulator ZINC5042, respectively; ${K}_{A}$ and ${K}_{B}$ denote the equilibrium dissociation constants of an orthosteric ligand (A) and an allosteric modulator (B), respectively; α is the binding cooperativity parameter between the NE (or ALE) and ZINC5042; β denotes the allosteric effect of the ZINC5042 on NE (or ALE) efficacy; ${\tau }_{A}$ and ${\tau }_{B}$ denote the capacity of NE (or ALE) and ZINC5042, respectively. n is the slope of the transducer function that links receptor occupancy to the response.
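Eq. (11) can be evaluated numerically to see how a cooperativity product αβ below 1 lowers the agonist response. In the sketch below, only the log αβ values (−0.82 for NE, −1.15 for ALE) and the 80 μM modulator concentration are taken from the text; every other parameter is a hypothetical placeholder, and setting α = 1 with β = αβ is an assumption made only because the text reports the product.

```python
def operational_effect(a, b, e_max, k_a, k_b, tau_a, tau_b, alpha, beta, n=1.0):
    """Evaluate Eq. (11) for agonist concentration a and modulator concentration b."""
    stimulus = tau_a * a * (k_b + alpha * beta * b) + tau_b * b * k_a
    occupancy = a * k_b + k_a * k_b + k_a * b + alpha * a * b
    return e_max * stimulus ** n / (occupancy ** n + stimulus ** n)

# alpha*beta recovered from the reported log(alpha*beta) values.
ab_ne = 10 ** -0.82
ab_ale = 10 ** -1.15

# Hypothetical parameters (concentrations and constants in mol/L).
params = dict(e_max=100.0, k_a=1e-6, k_b=1e-5, tau_a=10.0, tau_b=0.0,
              alpha=1.0, beta=ab_ne)

without_modulator = operational_effect(1e-6, 0.0, **params)
with_modulator = operational_effect(1e-6, 80e-6, **params)

print(round(ab_ne, 2), round(ab_ale, 2))
print(without_modulator > with_modulator)
```

With these placeholder values the response at a fixed agonist concentration drops when the modulator is added, which is the qualitative signature of negative cooperativity.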

Site-directed mutagenesis

The cDNA of human β2AR isoform1 (NM_000024.6) was obtained from Changsha Youze Biotechnology Co., Ltd and subcloned into the pcDNA3.1 vector, tagged with a hemagglutinin (HA) signal sequence at the N terminus followed by a Flag tag. Forward and reverse primers for each mutation (D79^2.50A, F282^6.44A, N318^7.45A, S319^7.46A) were synthesized by Tsingke Biotechnology Co., Ltd (Beijing, China). Mutation-specific primers and high-fidelity PrimeSTAR Max DNA Polymerase (Takara, Cat# R045A) were used to amplify the coding region with mutations from the pcDNA3.1-HA-Flag-β2AR vector. The linearized PCR products were ligated through homologous recombination using the NovoRec plus One-step PCR Cloning Kit (Novoprotein Scientific Inc, China, Cat# NR005). All recombinant plasmids were extracted using the TIANprep Rapid Mini Plasmid Kit (TianGen, Cat# DP103) following the manufacturer’s instructions and verified by DNA sequencing.

NanoBiT β-arrestin recruitment assay

β2AR-mediated β-arrestin recruitment was measured by the NanoBiT β-arrestin recruitment assay as described previously⁷⁹. NanoLuc was split into a large fragment (LgBiT) and a small fragment (SmBiT). LgBiT was fused via a flexible linker to the C terminus of β2AR, and SmBiT was fused to the N terminus of β-arrestin. HEK293 cells were transfected with the β2AR-LgBiT and SmBiT-β-arrestin fusion vectors at a 1:1 ratio using PEI. After transfection, cells were washed with PBS, seeded into 96-well plates, and incubated for 12 h. Subsequently, the medium was removed and replaced with 5 µM coelenterazine h diluted in HBSS containing 20 mM HEPES. After 30 min incubation at room temperature, the ligand was added, and luminescence was measured with the Synergy H1 microplate reader (BioTek). Data analysis was conducted using GraphPad Prism 8.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The MD trajectories generated in this study and the source data underlying Figs. 3a, b, c–e, 5a, c, 6d, e, 8a–g and Supplementary Figs. 1, 2, 6, 7d, e, 9–12 have been deposited in figshare [https://doi.org/10.6084/m9.figshare.26129632]⁸⁰. Source data are provided with this paper.

Code availability

The RHML tool developed by this study is open source and publicly available from Zenodo [https://zenodo.org/doi/10.5281/zenodo.13325067]⁸¹.

References

Cheng, X. & Jiang, H. Allostery in Drug Development. Adv. Exp. Med. Biol. 1163, 1–23 (2019).
Wootten, D., Christopoulos, A. & Sexton, P. M. Emerging paradigms in GPCR allostery: implications for drug discovery. Nat. Rev. Drug Discov. 12, 630–644 (2013).
Möhler, H., Fritschy, J. M. & Rudolph, U. A new benzodiazepine pharmacology. J. Pharm. Exp. Ther. 300, 2–8 (2002).
Guarnera, E. & Berezovsky, I. N. Allosteric drugs and mutations: chances, challenges, and necessity. Curr. Opin. Struct. Biol. 62, 149–157 (2020).
Chatzigoulas, A. & Cournia, Z. Rational design of allosteric modulators: Challenges and successes. WIREs Comput. Mol. Sci. 11, e1529 (2021).
Kuzmanic, A., Bowman, G. R., Juarez-Jimenez, J., Michel, J. & Gervasio, F. L. Investigating cryptic binding sites by molecular dynamics simulations. Acc. Chem. Res. 53, 654–661 (2020).
Hollingsworth, S. A. et al. Cryptic pocket formation underlies allosteric modulator selectivity at muscarinic GPCRs. Nat. Commun. 10, 3289 (2019).
Shah, S. D. et al. In silico identification of a β2-adrenoceptor allosteric site that selectively augments canonical β2AR-Gs signaling and function. Proc. Natl. Acad. Sci. USA 119, e2214024119 (2022).
Zhang, Q. et al. Targeting a cryptic allosteric site of SIRT6 with small-molecule inhibitors that inhibit the migration of pancreatic cancer cells. Acta Pharm. Sin. B 12, 876–889 (2022).
Beglov, D. et al. Exploring the structural origins of cryptic sites on proteins. Proc. Natl. Acad. Sci. USA 115, E3416–E3425 (2018).
Jiang, Y. et al. Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials. Nat. Commun. 12, 1–14 (2021).
Wallach, I. et al. AI is a viable alternative to high throughput screening: a 318-target study. Sci. Rep. 14, 7526 (2024).
Noé, F., Tkatchenko, A., Müller, K.-R. & Clementi, C. Machine learning for molecular simulation. Annu. Rev. Phys. Chem. 71, 361–390 (2020).
Zhu, J., Wang, J., Han, W. & Xu, D. Neural relational inference to learn long-range allosteric interactions in proteins from molecular dynamics simulations. Nat. Commun. 13, 1661 (2022).
Zhou, H., Dong, Z. & Tao, P. Recognition of protein allosteric states and residues: Machine learning approaches. J. Comput. Chem. 39, 1481–1490 (2018).
Hayatshahi, H. S., Ahuactzin, E., Tao, P., Wang, S. & Liu, J. Probing protein allostery as a residue-specific concept via residue response maps. J. Chem. Inf. Model. 59, 4691–4705 (2019).
Wold, E. A., Chen, J., Cunningham, K. A. & Zhou, J. Allosteric modulation of class A GPCRs: Targets, agents, and emerging concepts. J. Med. Chem. 62, 88–127 (2019).
Baker, J. G. The selectivity of beta-adrenoceptor antagonists at the human beta1, beta2 and beta3 adrenoceptors. Br. J. Pharm. 144, 317–322 (2005).
Karoli, N. A. & Rebrov, A. P. [Possibilities and limitations of the use of beta-blockers in patients with cardiovascular disease and chronic obstructive pulmonary disease]. Kardiologiia 61, 89–98 (2021).
Liu, X. et al. An allosteric modulator binds to a conformational hub in the β2 adrenergic receptor. Nat. Chem. Biol. 16, 749–755 (2020).
Liu, X. et al. Mechanism of β2AR regulation by an intracellular positive allosteric modulator. Science 364, 1283–1287 (2019).
Liu, X. et al. Mechanism of intracellular allosteric β2AR antagonist revealed by X-ray crystal structure. Nature 548, 480–484 (2017).
Masureel, M. et al. Structural insights into binding specificity, efficacy and bias of a β2AR partial agonist. Nat. Chem. Biol. 14, 1059–1066 (2018).
Swaminath, G., Lee, T. W. & Kobilka, B. Identification of an allosteric binding site for Zn2+ on the beta2 adrenergic receptor. J. Biol. Chem. 278, 352–356 (2003).
Shi, C. et al. Auto-dialabel: labeling dialogue data with unsupervised learning. 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), 684–689 (2018).
Dhamija, A., Pandoi, D., Singh, K. & Malhotra, S. An improved K-means clustering with convolutional neural network for financial crisis prediction. In Advances in Computational Intelligence and Communication Technology (Springer Singapore, Singapore, 2022).
Glielmo, A. et al. Unsupervised learning methods for molecular simulation data. Chem. Rev. 121, 9722–9758 (2021).
Nainggolan, R., Perangin-angin, R., Simarmata, E. & Tarigan, A. F. Improved the performance of the K-means cluster using the sum of squared error (SSE) optimized by using the Elbow Method. J. Phys. Conf. Ser. 1361, 012015 (2019).
Akhanli, S. E. & Hennig, C. Comparing clusterings and numbers of clusters by aggregation of calibrated clustering validity indexes. Stat. Comput. 30, 1523–1544 (2020).
Wakefield, A. E., Mason, J. S., Vajda, S. & Keserű, G. M. Analysis of tractable allosteric sites in G protein-coupled receptors. Sci. Rep. 9, 6180 (2019).
Srivastava, A. et al. High-resolution structure of the human GPR40 receptor bound to allosteric agonist TAK-875. Nature 513, 124–127 (2014).
Oswald, C. et al. Intracellular allosteric antagonism of the CCR9 receptor. Nature 540, 462–465 (2016).
Jaeger, K. et al. Structural basis for allosteric ligand recognition in the human CC chemokine receptor 7. Cell 178, 1222–1230.e10 (2019).
Shao, Z. et al. Structure of an allosteric modulator bound to the CB1 cannabinoid receptor. Nat. Chem. Biol. 15, 1199–1205 (2019).
Shen, S. et al. Allosteric modulation of G protein-coupled receptor signaling. Front. Endocrinol. 14, https://doi.org/10.3389/fendo.2023.1137604 (2023).
Yuan, J. et al. In silico prediction and validation of CB2 allosteric binding sites to aid the design of allosteric modulators. Molecules 27, 453 (2022).
Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).
De Vivo, M., Masetti, M., Bottegoni, G. & Cavalli, A. Role of molecular dynamics and related methods in drug discovery. J. Med. Chem. 59, 4035–4061 (2016).
Labbé, C. M. et al. MTiOpenScreen: a web server for structure-based virtual screening. Nucleic Acids Res. 43, W448–W454 (2015).
Maffucci, I., Hu, X., Fumagalli, V. & Contini, A. An efficient implementation of the Nwat-MMGBSA method to rescore docking results in medium-throughput virtual screenings. Front. Chem. 6, 43 (2018).
Xu, X. et al. Binding pathway determines norepinephrine selectivity for the human β1AR over β2AR. Cell Res. 31, 569–579 (2021).
Zhou, Q. et al. Common activation mechanism of class A GPCRs. eLife 8, e50279 (2019).
Latorraca, N. R., Venkatakrishnan, A. J. & Dror, R. O. GPCR Dynamics: Structures in motion. Chem. Rev. 117, 139–155 (2017).
VanWart, A. T., Eargle, J., Luthey-Schulten, Z. & Amaro, R. E. Exploring residue component contributions to dynamical network models of allostery. J. Chem. Theory Comput. 8, 2949–2961 (2012).
Filipek, S. Molecular switches in GPCRs. Curr. Opin. Struct. Biol. 55, 114–120 (2019).
Hauser, A. S. et al. GPCR activation mechanisms across classes and macro/microscales. Nat. Struct. Mol. Biol. 28, 879–888 (2021).
Slosky, L. M., Caron, M. G. & Barak, L. S. Biased allosteric modulators: New frontiers in GPCR drug discovery. Trends Pharmacol. Sci. 42, 283–299 (2021).
Ahn, S. et al. Allosteric “beta-blocker” isolated from a DNA-encoded small molecule library. Proc. Natl. Acad. Sci. USA 114, 1708 (2017).
Ippolito, M. et al. Identification of a β-arrestin-biased negative allosteric modulator for the β2-adrenergic receptor. Proc. Natl. Acad. Sci. USA 120, e2302668120 (2023).
Whitty, A. & Kumaravel, G. Between a rock and a hard place? Nat. Chem. Biol. 2, 112–118 (2006).
Wisler, J. W. et al. A unique mechanism of beta-blocker action: carvedilol stimulates beta-arrestin signaling. Proc. Natl. Acad. Sci. USA 104, 16657–16662 (2007).
Villalona-Calero, M. A. et al. A phase I and pharmacological study of protracted infusions of crisnatol mesylate in patients with solid malignancies. Clin. Cancer Res. 5, 3369–3378 (1999).
Cherezov, V. et al. High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318, 1258–1265 (2007).
Ring, A. M. et al. Adrenaline-activated structure of β2-adrenoceptor stabilized by an engineered nanobody. Nature 502, 575–579 (2013).
Webb, B. & Sali, A. Comparative protein structure modeling using MODELLER. Curr. Protoc. Bioinforma. 54, 5.6.1–5.6.37 (2016).
Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E. & Fox, D. J. Gaussian 09. Revision A.01. (Gaussian Inc, Wallingford, 2009).
Morris, G. M. et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791 (2009).
Anandakrishnan, R., Aguilar, B. & Onufriev, A. V. H++ 3.0: automating pK prediction and the preparation of biomolecular structures for atomistic molecular modeling and simulations. Nucleic Acids Res. 40, W537–W541 (2012).
Huang, J. & MacKerell, A. D. Jr CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. J. Comput. Chem. 34, 2135–2145 (2013).
Vanommeslaeghe, K. et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 31, 671–690 (2010).
Thakur, N. et al. Anionic phospholipids control mechanisms of GPCR-G protein recognition. Nat. Commun. 14, 794 (2023).
Chan, H. C. S. et al. Exploring a new ligand binding site of G protein-coupled receptors. Chem. Sci. 9, 6480–6489 (2018).
Lee, J. et al. CHARMM-GUI Input generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM simulations using the CHARMM36 additive force field. J. Chem. Theory Comput. 12, 405–413 (2016).
Miao, Y., Feher, V. A. & McCammon, J. A. Gaussian accelerated molecular dynamics: Unconstrained enhanced sampling and free energy calculation. J. Chem. Theory Comput. 11, 3584–3595 (2015).
Case, D. A., Betz, R. M., Cerutti, D. S., Cheatham, T. & Kollman, P. A. Amber 16. (University of California, San Francisco, 2016).
Li, C. et al. An interpretable convolutional neural network framework for analyzing molecular dynamics trajectories: A case study on functional states for G-protein-coupled receptors. J. Chem. Inf. Model 62, 1399–1410 (2022).
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (Association for Computing Machinery, New York, NY, USA, 2016). https://doi.org/10.1145/2939672.2939778.
Huang, L. et al. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat. Commun. 15, 2657 (2024).
Lionta, E., Spyrou, G., Vassilatis, D. K. & Cournia, Z. Structure-based virtual screening for drug discovery: principles, applications and recent advances. Curr. Top. Med. Chem. 14, 1923–1938 (2014).
Wichapong, K. et al. Structure-based design of peptidic inhibitors of the interaction between CC chemokine ligand 5 (CCL5) and human neutrophil peptides 1 (HNP1). J. Med. Chem. 59, 4289–4301 (2016).
Lei, T. et al. Exploring the activation mechanism of a metabotropic glutamate receptor homodimer via molecular dynamics simulation. ACS Chem. Neurosci. 11, 133–145 (2020).
Hou, T., Wang, J., Li, Y. & Wang, W. Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations. J. Chem. Inf. Model 51, 69–82 (2011).
Zhang, F. et al. Molecular insights into the allosteric coupling mechanism between an agonist and two different transducers for μ-opioid receptors. Phys. Chem. Chem. Phys. 24, 5282–5293 (2022).
Seeber, M., Cecchini, M., Rao, F., Settanni, G. & Caflisch, A. Wordom: a program for efficient analysis of molecular dynamics simulations. Bioinformatics 23, 2625–2627 (2007).
Feng, Y. et al. Mechanism of activation and biased signaling in complement receptor C5aR1. Cell Res. 33, 312–324 (2023).
Zhao, J. et al. Prospect of acromegaly therapy: molecular mechanism of clinical drugs octreotide and paltusotine. Nat. Commun. 14, 962 (2023).
Leach, K., Sexton, P. M. & Christopoulos, A. Allosteric GPCR modulators: taking advantage of permissive receptor pharmacology. Trends Pharm. Sci. 28, 382–389 (2007).
Zhao, C. et al. Biased allosteric activation of ketone body receptor HCAR2 suppresses inflammation. Mol. Cell 83, 3171–3187 (2023).
Chen, X. et al. Integrative Residue-intuitive Machine Learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. figshare, https://doi.org/10.6084/m9.figshare.26129632 (2024).
Chen, X. et al. Integrative Residue-intuitive Machine Learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. Zenodo, https://doi.org/10.5281/zenodo.13325067 (2024).

Acknowledgements

This project was supported by the Sichuan International Science and Technology Innovation Cooperation Project (Grant No. 24GJHZ0431 to X.P.), the National Natural Science Foundation of China (62475177 to X.P. and T2221004 to Z.S.), the Frontiers Medical Center, Tianfu Jincheng Laboratory Foundation (TFJC2023010010 to Z.S.), and the 1.3.5 project for disciplines of excellence, West China Hospital, Sichuan University (ZYYC23022 to Z.S.).

Author information

These authors contributed equally: Xin Chen, Kexin Wang, Jianfang Chen.

Authors and Affiliations

College of Chemistry, Sichuan University, Chengdu, China
Xin Chen, Jianfang Chen, Jun Mao, Yuanpeng Song & Xuemei Pu
Division of Nephrology and Kidney Research Institute, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
Kexin Wang, Chao Wu & Zhenhua Shao
College of Computer Science, Sichuan University, Chengdu, China
Yijing Liu
Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, China
Zhenhua Shao

Contributions

X.C., J.C., and X.P. conceived and designed the project; X.C., J.C., J.M., Y.S., and Y.L. contributed to computational methodology and analyses; K.W., C.W., and Z.S. contributed to experimental validation and analyses; Z.S. and X.P. supervised the project; X.C., K.W., J.C., C.W., Z.S., and X.P. discussed the results, and contributed to the manuscript preparation and revision. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Zhenhua Shao or Xuemei Pu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Piia Bartos, Yinglong Miao, Jia Zhou, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, X., Wang, K., Chen, J. et al. Integrative residue-intuitive machine learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. Nat Commun 15, 8130 (2024). https://doi.org/10.1038/s41467-024-52399-y

Received: 07 April 2024
Accepted: 03 September 2024
Published: 16 September 2024
DOI: https://doi.org/10.1038/s41467-024-52399-y
