Abstract
Allosteric drugs offer a new avenue for modern drug design. However, the identification of cryptic allosteric sites presents a formidable challenge. Because allostery is by nature a residue-driven conformational transition, we propose a state-of-the-art computational pipeline by developing a residue-intuitive hybrid machine learning (RHML) model coupled with molecular dynamics (MD) simulation, through which we can efficiently identify the allosteric site and allosteric modulator as well as reveal their regulation mechanism. For the clinical target β2-adrenoceptor (β2AR), we discover an additional allosteric site located around residues D792.50, F2826.44, N3187.45 and S3197.46 and one putative allosteric modulator, ZINC5042. We probe the allosteric potency and regulation mechanism using Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and protein structure network (PSN) analysis to further improve identification accuracy. Building on this computational evidence, experimental assays then validate our predicted allosteric site, negative allosteric potency and regulation pathway, showcasing the effectiveness of the identification pipeline in practice. We expect that it will be applicable to other target proteins.
Introduction
Allostery represents a critical biological mechanism wherein distant sites within a biomolecule undergo fine-tuned structural and dynamic alterations in response to specific perturbations. Allosteric regulation plays a vital role in diverse biological processes1. Allosteric drugs can modulate the protein activity by means of non-competitive binding in the allosteric site2, thus yielding higher selectivity, specificity, and lower off-target toxicity. Allosteric drugs have been approved for the treatment of various diseases, including cancers, neuropsychiatric disorders, and immune diseases, which offer a new paradigm for modern drug development3,4. Despite the fascinating advantages of allosteric drugs, their development remains a great challenge, in particular allosteric site identification.
Allostery is an intrinsic property of the protein conformational landscape while the allosteric sites are often cryptic, generally only opening in specific conformational ensembles that may not have an associated resolved 3D structure5. Complicated conformational changes often lead to difficulties in discovering the allosteric site experimentally6. MD simulation can provide target conformational changes over time with high resolution in full atom detail, thus being considered one of the best approaches to identify and characterize cryptic binding sites6. There has been some exploration in using the MD technique to successfully identify the allosteric site7,8,9. A crucial step for the success of MD-based methods is to mine specific conformational states with an open allosteric pocket from the massive MD conformational space, as it is a prerequisite for subsequent detection of the allosteric site.
The opening of the allosteric site generally occurs on a long time scale. With advancements in computing power and enhanced sampling techniques, MD simulation can more sufficiently sample conformational changes involving the open cryptic pocket, yet it simultaneously leads to data explosion. In this case, manual analysis is very difficult, carries a risk of overlooking subtle but important conformational changes, and is generally subject to human bias. Thus, we face another difficult challenge: how to efficiently capture the conformational state involving the allosteric site? In existing MD works, free energy analysis serves to identify low-energy states along pre-defined coordinates. In addition, the Markov state model (MSM) is used to find important intermediates in certain processes of interest like activation, ligand binding, or dissociation. These low-energy states and intermediate states are taken as target conformations to identify the allosteric sites. Despite some successes, the opening of the allosteric site does not necessarily correspond to the low-energy states or the MSM macro-states, given the nature of allostery. Furthermore, predefined coordinates inferred from domain knowledge can confine the discovery of allosteric sites, because the highly intricate mechanism of allostery has not been fully elucidated. Thus, there remains an unmet need to develop unbiased and general methods to efficiently identify the conformational states with the open allosteric site from the vast conformational space.
Given that the nature of allostery is a residue-driven conformational transition5,10, it is reasonable to hypothesize that residues which play a key role in conformational changes and coincide with or couple to structural elements in the functional site are likely to form viable cryptic sites. Theoretically, if we can develop a computational method that identifies the important residues determining the signature fluctuations and detects whether they can communicate with functional domains of the protein, the identification efficacy should be substantially improved. Inspired by this, we aim to couple machine learning (ML) with MD to develop an effective identification framework, as ML possesses a powerful capacity for mining the causality underlying massive and complex data11,12. Although ML has been successfully applied in MD fields13,14 to generate force fields, reduce dimensionality, and estimate free energy surfaces, its application in conformational analysis is very limited. Several works attempted to use traditional ML models with relatively simple model architecture and complicated feature engineering to distinguish ligand-bound conformations from apo conformations in MD trajectories15,16, yet the category labels must be known in advance. In addition, these ML models lack interpretability, so residue-level information on the allostery cannot be obtained. Consequently, existing ML classification models cannot be used to identify the conformational state in which a cryptic allosteric site opens.
GPCRs are the largest and the most successful drug targets, with approximately 35% of FDA-approved drugs targeting them. However, the highly conserved orthosteric site of GPCRs poses challenges in developing subtype-selective orthosteric ligands, while the allosteric modulators with higher selectivity and specificity offer an attractive avenue for GPCR drug development17. β2AR plays a vital role in cardiovascular and respiratory physiology and thus is a clinically crucial target for widely prescribed drugs like beta-blockers and beta-agonists. Drugs targeting the β2AR orthosteric site often cause cross-reactivity and lead to various therapeutic side effects, which have garnered increasing attention in clinical arenas18,19. Thus, developing new drugs targeting the β2AR allosteric site holds great significance. Positive allosteric modulators (PAM) of β2AR have therapeutic value for diseases like asthma and chronic obstructive pulmonary disease8, while negative allosteric modulators (NAM) have the potential to treat hypertension, arrhythmia, and heart failure20. Unfortunately, only six β2AR allosteric sites have been reported so far8,20,21,22,23,24, and no allosteric drugs for β2AR have been approved. Therefore, the identification of additional β2AR allosteric sites and allosteric ligands is highly desired.
In this work, to circumvent the technical obstacles above, we explore a residue-intuitive hybrid machine learning (named RHML) framework by combining unsupervised clustering and an interpretable deep learning multi-classification model. With the framework, we can address the absence of category labels and achieve accurate classification with residue-level interpretability, thus identifying important residues involving the allosteric site. After we identify the putative allosteric site and screen-related modulators, we further probe their communication with functional domains like the orthosteric site and the active region. Our objective is to further pre-evaluate the potential as the allosteric site/modulator and to reveal their regulation mechanism, which is key for ensuring the prediction success rate and rationally engineering allostery in protein, yet often being overlooked in previous methods of allosteric drug design. In order to validate the efficiency of the identification strategy, we select the β2 adrenergic receptor (β2AR) of the G protein-coupled receptor (GPCR) family as a case study. We discovered an allosteric site and a negative allosteric modulator (ZINC5042) for β2AR, which we validate by cell-based function experiments.
Results
The proposed identification pipeline is illustrated in Fig. 1. Extensive Gaussian accelerated molecular dynamics (GaMD) simulations are first performed to enhance sampling and construct a sufficient conformation space (Fig. 1a). With the conformation space, a residue-intuitive hybrid machine learning (RHML) framework is constructed, which is composed of an unsupervised clustering step and an interpretable convolutional neural network (CNN) based multi-classifier (Fig. 1b). Using RHML, we determine the optimal number of clusters (labels) and the conformation state with the opening of the allosteric site (Fig. 1c). Then, the allosteric site is identified by FTMap coupled with the LIME interpreter of RHML (Fig. 1c). Potential allosteric modulators are screened from two compound datasets based on the identified allosteric site (Fig. 1d). As illustrated in Fig. 1e, the regulation effect of the allosteric site/drug and its regulation pathway are further probed by conventional MD (cMD), binding energy analysis, structural analysis, and regulation pathway analysis. Finally, experimental validation is performed by cAMP accumulation assays, a β-arrestin recruitment assay, and site-directed mutagenesis experiments (Fig. 1f). In total, this work involves six systems, 15 μs of GaMD simulations, and 22.5 μs of cMD simulations. Supplementary Table 1 lists detailed MD information for each system.
a Conformation space generated by GaMD. b Residue-intuitive hybrid machine learning (RHML). RHML consists of an unsupervised clustering step and an interpretable CNN-based multi-classification model. c Identification of the key conformation and allosteric site with the aid of the LIME interpreter of RHML and FTMap. d Virtual screening for the allosteric modulator. e Assessment of the allosteric effect and mechanism for the allosteric site/modulator. f Experimental validation.
Construction of the residue-intuitive hybrid machine learning (RHML)
In order to capture the conformation state with a cryptic allosteric site from the vast unknown conformation space, we need to perform two tasks. One is to classify the conformations according to the structural differences between conformational classes. The other is to determine which class has important residue fluctuations associated with the functional region, as allostery is a functional mechanism driven by residue conformational transitions. To this end, an unsupervised classification task was first used to label the conformation categories in the unknown trajectory space generated by GaMD. Herein, we selected unsupervised clustering, which has widely served as an auto-labeling strategy25,26. With the labels obtained, we further trained a supervised classification model to identify the conformation state with the opening of the allosteric site. In the two tasks, we need to address two main technical challenges. One is how to determine the optimal number of clusters in the unsupervised clustering, and the other is how to obtain residue-level interpretability in the supervised classification model. Thus, a residue-intuitive hybrid machine learning framework (RHML in Fig. 1b) was constructed by combining an unsupervised clustering step and a supervised classification model. Herein, the k-means algorithm was adopted, as it is considered a popular and effective method for MD trajectory clustering27. For the supervised model, an interpretable CNN-based multi-classification model was exploited to achieve accurate classification with the capacity to identify the important residues deciding the classification result (Fig. 2). In the deep learning model, a pixel map representation was proposed to avoid hand-crafted feature engineering, which risks information loss in the conformation representation.
Accordingly, a convolutional neural network, with its powerful learning capacity on images, was utilized to realize accurate classification in terms of the category labels inferred from the unsupervised clustering (Fig. 2a). More importantly, we explored an interpreter based on the locally linear approximation paradigm (named the LIME interpreter) to address the black-box limitation of deep learning in interpretability. Based on the interpreter, we could further identify key residues deciding the classification result, through which the conformation state with the putative allosteric site can be captured (Fig. 2b). Technical details regarding the interpretable CNN-based multi-classification model are described in Methods.
a A CNN-based multi-classifier. Each conformation is represented by a pixel map through a matrix transformation, and the CNN-based classification model is trained on the pixel maps and their category labels inferred from the clustering analysis. b LIME interpreter for the CNN model. Based on a locally linear approximation paradigm, a LIME interpreter is developed to identify important residues deciding the CNN classification result. In the picture of “local linear approximation,” salmon, blue, and yellow backgrounds represent three classification classes, while salmon crosses, blue triangles, and yellow stars represent the conformation samples of the three different classes, respectively. For example, the star outlined in red represents a conformation sample being explained, around which the perturbed dataset represented by yellow stars is generated by adding perturbations. The star sizes denote the proximity measure between the perturbed sample and the sample being explained. The gray dotted line represents the local linear model that is trained on the perturbed dataset. LIME matrices are then generated for each class to evaluate the importance of each pixel in deciding the specific class. By projecting the important pixels onto the corresponding atoms, important residues can be identified.
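The locally linear approximation described in the caption above can be illustrated with a minimal sketch. It is not the authors' implementation (that is described in Methods); it assumes a generic black-box scoring function in place of the CNN class probability, binary on/off masking of input features as the perturbation, an exponential proximity kernel, and a lightly ridge-regularized weighted linear fit. The names `lime_explain`, `solve_linear_system`, `toy_model`, and all parameter values are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lime_explain(predict, sample, n_perturb=500, kernel_width=0.75,
                 ridge=1e-6, seed=0):
    """Return one local importance coefficient per input feature.

    predict: black-box function mapping a feature list to a score
             (a stand-in for the CNN probability of one class).
    sample:  the feature list being explained.
    """
    rng = random.Random(seed)
    d = len(sample)
    rows, targets, weights = [], [], []
    for _ in range(n_perturb):
        # Perturb the sample by switching a random subset of features off.
        mask = [rng.randint(0, 1) for _ in range(d)]
        perturbed = [v * z for v, z in zip(sample, mask)]
        # Proximity weight: perturbed samples closer to the original count more.
        dist = math.sqrt(sum(1 - z for z in mask) / d)
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(z) for z in mask])  # leading 1.0 = intercept
        targets.append(predict(perturbed))
    p = d + 1
    # Weighted normal equations: (X^T W X + ridge * I) beta = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(p)] for i in range(p)]
    for i in range(1, p):  # the intercept is left unpenalized
        xtwx[i][i] += ridge
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
            for i in range(p)]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[1:]  # drop the intercept


def toy_model(x):
    """Hypothetical black box: depends on features 0 and 2 only."""
    return 3.0 * x[0] - 2.0 * x[2]


coefficients = lime_explain(toy_model, [1.0, 1.0, 1.0])
ranking = sorted(range(len(coefficients)), key=lambda i: -abs(coefficients[i]))
print("features ranked by local importance:", ranking)
```

In the setting of the paper, the features would be pixels of the conformation pixel map, one such explanation would be produced per class, and the highest-ranked pixels would be projected back onto atoms and residues.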
To ensure a reliable classification result and a rational interpretation, a high prediction accuracy is required for the supervised classifier. Thus, the accuracy of the classification model can serve as a feedback metric to determine the optimal number of categories for the unsupervised clustering (Fig. 1b). The optimal number will act as the final labels of the classification model to identify the key residues with the aid of the LIME interpreter.
A conformation ensemble with putative allosteric site is identified by RHML for β2AR
To cover the opening of cryptic pockets, five independent 3-μs GaMD simulations were carried out for the inactive β2AR bound by the agonist norepinephrine (NE) to generate an extensive ensemble of receptor conformations, from which 150,000 conformations from the five trajectories were extracted to construct data sets for machine learning (see “Methods” for more details). The conformations of every trajectory were first clustered based on the root mean square deviation (RMSD) of the receptor backbone atoms excluding the highly flexible ICL3 region (residue numbers: 231–262). In the text, we only present the result from one trajectory (labeled as traj1) with a putative allosteric site. Results and discussion for the other four trajectories, which do not involve new allosteric sites, are given in Supplementary Figs. 1 and 2. To determine the optimal number of clusters k, which is itself an open and challenging problem for unsupervised clustering, we considered three clustering evaluation indices (SSR/SST, pSF, DBI) to initially evaluate k values from 2 to 7 (Fig. 3a). SSR/SST represents the explained variance, and values closer to 1 indicate better clustering. As reflected by the green line in Fig. 3a, SSR/SST increases gradually with increasing k, but the rise becomes weak after a critical value. Using the elbow method28 to identify the point of maximum curvature in the curve, the optimal number of clusters can be determined to be k = 3. For pSF, a metric measuring the separation between all the clusters, larger values correspond to better clustering. The red line in Fig. 3a shows the rise of pSF with increasing k values, favoring k = 7 or higher as the best choice. DBI measures similarity within and between clusters, and smaller DBI values imply better clustering. It turns out that k = 2 has the smallest DBI (DBI = 0.522), while k = 3 (DBI = 0.526) is very close to k = 2.
In other words, both k = 2 and k = 3 can be taken as reasonable choices (yellow line in Fig. 3a). The inconsistency of the optimal cluster number between different clustering metrics is a common phenomenon, as there may not be a definite optimal k value for complex data. Thus, the choice of the k value depends on balancing different validity indices and considering the specific research purpose29. Herein, we referred to the classification accuracy of the CNN-based model. Figure 3b shows the CNN-based classification accuracy for different numbers of clusters. We first excluded k = 6 and k = 7 due to low classification accuracy (<0.8), which would reduce the reliability of our LIME interpreter. Similarly, k = 5 was not considered due to its poor performance in the DBI index. Compared with k = 2 and k = 4, both SSR/SST and DBI indices favor k = 3. Furthermore, our DL-based classification model also achieves a prediction accuracy of 0.903 ± 0.004 at k = 3. Taken together, we used the labels of the three categories to train the interpretable CNN-based classifier, in which the LIME interpreter identified the important residues deciding the classification result, as shown in Fig. 3c–h.
a Three evaluation metrics of clustering for different numbers of clusters (k = 2–7). SSR/SST represents the explained variance wherein values closer to 1 indicate better clustering. Pseudo-F statistic (pSF) measures the cluster separation, and Davies-Bouldin Index (DBI) assesses the cluster similarity. Higher pSF values and lower DBI values indicate better results. b Prediction performance of the interpretable CNN-based multi-classifier for different numbers of clusters (k values). Data are presented as the mean (points) ± standard deviation (error bars) derived from the five-fold cross-validation. c–e Scores of the top 20 important residues identified by the LIME interpreter in deciding the three-classification result such as cluster0 (c), cluster1 (d), and cluster2 (e). f–h Distribution of the top 20 important residues in the 3D structure of β2AR for cluster0 (f), cluster1 (g), and cluster2 (h). The important residues are highlighted in spheres.
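The three clustering indices in the caption above can be made concrete with a short sketch. It assumes the standard textbook definitions (between-cluster over total sum of squares for SSR/SST, the Calinski-Harabasz form of the pseudo-F statistic, and the usual Davies-Bouldin ratio of within-cluster scatter to centroid separation), which may differ in detail from the authors' implementation. The function name `clustering_indices` and the toy 2D points are hypothetical and unrelated to the β2AR data.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def clustering_indices(points, labels):
    """Return (SSR/SST, pSF, DBI) for a hard clustering with k >= 2."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    grand = centroid(points)
    cents = {lab: centroid(pts) for lab, pts in clusters.items()}

    sst = sum(sq_dist(p, grand) for p in points)                  # total
    sse = sum(sq_dist(p, cents[lab]) for p, lab in zip(points, labels))  # within
    ssr = sst - sse                                               # between

    explained = ssr / sst                        # closer to 1 is better
    psf = (ssr / (k - 1)) / (sse / (n - k))      # higher is better

    # Mean distance of each cluster's members to its own centroid.
    scatter = {lab: sum(math.sqrt(sq_dist(p, cents[lab])) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    dbi = 0.0
    for i in clusters:
        dbi += max((scatter[i] + scatter[j]) / math.sqrt(sq_dist(cents[i], cents[j]))
                   for j in clusters if j != i)
    dbi /= k                                     # lower is better
    return explained, psf, dbi


# Toy data: two compact, well-separated groups of four points each.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]   # matches the true grouping
poor_labels = [0, 1, 0, 1, 0, 1, 0, 1]   # mixes the two groups

print("good:", clustering_indices(toy_points, good_labels))
print("poor:", clustering_indices(toy_points, poor_labels))
```

On such toy data the labeling that matches the true grouping gives a higher SSR/SST, a higher pSF, and a lower DBI than the mixed labeling. On real trajectories the indices can disagree, as the text notes, which is why the classification accuracy is used as an additional criterion.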
It can be seen from Fig. 3f, h that the important residues deciding cluster0 and cluster2 are mainly distributed at the extracellular ends of TM6 and TM7 as well as in the extracellular loops (ECL2 and ECL3). These regions have already been reported to form an allosteric site of β2AR23, demonstrating the effectiveness of our RHML for identifying the allosteric site. Since our objective is to discover additional allosteric sites, the reported site was not considered for further investigation. Interestingly, for cluster1, the important residues identified are distributed in the middle of and near the intracellular ends of TM6 and TM7, including key residues such as F2826.44, N3187.45, N3227.49, and P3237.50, which were revealed to be molecular switches in receptor activation (Fig. 3d, g). The observation suggests that cluster1 undergoes specific conformational changes in these important regions associated with the activity of β2AR, implying its functional potential. Given that allostery is a functional mechanism involving GPCR activity, we selected representative conformations from cluster1 to further identify allosteric sites by using FTMap.
One additional allosteric site identified by FTMap coupled with the RHML interpreter
FTMap is an energy-based method for identifying binding sites, which has been accepted as an effective tool to predict potential allosteric sites within the helical regions of GPCRs30. We selected two representative conformations (labeled as Conf1 and Conf2) from cluster1, which can account for 70% of conformations, to perform the site mapping by means of FTMap. In the two conformations, FTMap detects more than ten consensus sites (CSs), as shown in Supplementary Fig. 3. In fact, effectively identifying the sites with potential allosteric function has been a difficult task for various pocket identification tools. Since cryptic allosteric sites are exposed by conformational changes, it is reasonable to assume that the pockets near the important residues, which decide the conformational states and associate with functional regions, will have a high possibility of acting as the allosteric site. Thus, the result from the LIME interpreter of our RHML framework was utilized to facilitate the allosteric site identification. It was found that one allosteric site in Conf1 identified by FTMap is composed of CS0, CS1, CS2, and CS6, while the identified site in Conf2 consists of CS0, CS3, and CS4. These probe clusters are in proximity to some important residues revealed by our LIME interpreter (Supplementary Fig. 3). Table 1 lists the allosteric site residues identified for Conf1 and Conf2. It can be seen that the majority of residues are the same for the two conformations and only several residues are different due to the flexibility of the site. Thus, they represent the same binding site despite some differences in shape and size, as reflected in Fig. 4a, b. The binding site is located in the middle of the protein helical bundle and near the sodium binding site (see Supplementary Fig. 4 for details). In addition, the pocket includes I1213.40, F2826.44, and S3197.46, which play important roles in regulating the activity of β2AR23.
Taken together, it can be expected that drugs targeting the binding site most probably modulate the β2AR signaling and function, implying high potential as an allosteric site. It is noted that the allosteric site does not open in active and inactive crystal structures of β2AR, as evidenced by Supplementary Fig. 5. More interestingly, the cryptic allosteric site has not been reported for other GPCRs31,32,33,34,35.
a, b Allosteric site identified by a combination of FTMap and the LIME interpreter of RHML for two representative conformations (Conf1 (a) and Conf2 (b)). The predicted allosteric sites are shown as surface and important residues identified by the LIME interpreter are highlighted in blue. c Virtual screening workflow employed in the work and structures of four hit compounds screened. The binding energies between the receptor and the four ligands screened are highlighted in red, which is derived from the MM/GBSA calculations based on the last 10-ns trajectories of 120-ns short cMD simulations. d Binding modes of the four hit compounds with β2AR. Ligands are represented by stick (ZINC5042, salmon; ZINC252008995, cyan; ZINC4213962, wheat; ZINC11681534, skyblue). The receptor is represented as cartoon (yellow for Conf1 and green for Conf2). The polar interactions are shown as black dashed lines.
Screening potential allosteric modulator by virtual screening and MM/GBSA
Virtual screening has been successfully employed to identify allosteric modulators, including those targeting GPCRs36,37. Protein flexibility is widely accepted to be crucial for structure-based drug design, and it was reported that multi-conformational virtual screening with two or three conformations of the target could improve the final enrichment and chemical diversity of the hit compounds38. As outlined above, the putative allosteric sites identified exhibit some differences in shape and size between Conf1 and Conf2 due to pocket flexibility. Therefore, we conducted multi-conformational virtual screening based on the two conformations, as depicted in Fig. 4c. The ligand set is composed of two datasets (Diverse-lib and Drugs-lib) obtained from MTiOpenScreen39. Diverse-lib consists of 99,288 chemically diverse molecules suitable for screening novel drug scaffolds. Drugs-lib contains 4574 purchasable approved drug molecules, which facilitates drug repurposing with the advantages of reduced time and costs in drug development. In total, the ligand set contains 103,862 ligand molecules after a series of operations, for example, removing redundancy, evaluating drug-likeness, filtering toxic groups, and analyzing chemical diversity. Then, a preliminary screening was performed using the MTiOpenScreen platform. The top 3000 molecules from each conformation underwent further docking evaluations using AutoDock 4.2. The docking score (AD4.2 energy) was used for ranking, along with visual inspection (see Supplementary Information for details) to exclude potential high-ranked false positives. Finally, four hit compounds (Fig. 4c) were selected from the top 20 compounds. Figure 4d shows the predicted binding modes of the four ligands with β2AR. Interestingly, the compound ZINC5042 from Drugs-lib exhibits good scores in both conformations, as evidenced by Supplementary Table 2.
As MD simulation combined with MM/GBSA calculation can improve the binding affinity prediction of poses obtained from docking protocols40, we selected the conformation with the best docking score for each of the four ligands to perform a short 120-ns cMD simulation and used the last 10 ns of each MD trajectory to calculate the MM/GBSA binding free energy. As shown in Fig. 4c, the MM/GBSA binding energies indicate that all four hit compounds can bind stably to β2AR. Notably, ZINC11681534, with the weakest binding energy (−33.35 kcal/mol), was already reported to be a β2AR antagonist23, implying that the other three compounds with better binding affinity may have potential efficacies. Herein, we selected ZINC5042, which has the highest affinity, as a representative of the promising compounds to verify our design strategy.
Interaction mode between the allosteric ligand ZINC5042 and the receptor
To more reliably estimate the interaction of ZINC5042 with the receptor, we extended the simulation of the β2AR-ZINC5042 complex to three independent 1.5-μs cMD simulations (see Supplementary Fig. 6 for RMSD). Based on the last 100 ns of the equilibrated trajectories, we calculated the binding free energies by using MM/GBSA and decomposed the energy into per-residue contributions. As reflected by Fig. 5a, the hotspot residues for ZINC5042 binding are distributed in TM2, TM3, TM6, and TM7 of β2AR. Figure 5b illustrates detailed interactions between ZINC5042 and the hotspot residues. It can be seen that ZINC5042 is bound deep within the transmembrane helix bundle mainly through polar and van der Waals interactions. Notably, the residue D792.50 contributes significantly to the ligand binding by forming an essential salt bridge with the polar head of ZINC5042. Another hotspot residue, S3197.46, forms a hydrogen bond with the alcohol hydroxyl group of the ligand’s polar head, further stabilizing the ligand-receptor complex. The chrysene moiety of ZINC5042 occupies a hydrophobic pocket composed of several hotspot residues (L752.46, L1243.43, I2786.40, F2826.44, N3227.49, and P3237.50), forming extensive van der Waals interactions that in turn contribute to the overall stability of the complex. It is noted that some functional residues of the allosteric pocket revealed above, such as D792.50, S3197.46, F2826.44, N3227.49, and P3237.50, contribute substantially to binding ZINC5042, implying a functional potential of ZINC5042.
The binding free energies are calculated by the MM/GBSA method based on the last 100-ns trajectory of three independent 1.5-μs cMD simulations, which are presented by mean ± standard error of the mean (SEM). a Important residues contributing to the ZINC5042 binding with the absolute value of binding energy greater than 0.5 kcal/mol in all three independent trajectories. Error bars represent SEM. b Overall structure of β2AR (blue) bound by ZINC5042 (green). The inset illustrates the detailed interactions between ZINC5042 (green stick) and the hotspot residues (blue sticks). Polar interactions are highlighted as black dotted lines. c Binding free energies of the two orthosteric agonists (NE and ALE) and the allosteric modulator ZINC5042 to β2AR in different systems.
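The hotspot criterion stated in panel a (absolute per-residue energy above 0.5 kcal/mol in all three trajectories, reported as mean ± SEM) can be sketched as follows. The residue names and energy values are placeholders invented purely for illustration and are not results from the paper; the use of the sample standard deviation for the SEM and the function names `mean_sem` and `select_hotspots` are likewise assumptions.

```python
import math
import statistics


def mean_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def select_hotspots(per_residue, cutoff=0.5):
    """Keep residues whose |energy| exceeds the cutoff in every trajectory."""
    hotspots = {}
    for residue, energies in per_residue.items():
        if all(abs(e) > cutoff for e in energies):
            hotspots[residue] = mean_sem(energies)
    return hotspots


# Placeholder per-residue energies (kcal/mol), one value per independent trajectory.
example = {
    "RES_A": [-3.0, -2.0, -4.0],   # above the cutoff in all three runs
    "RES_B": [-0.8, -0.3, -0.9],   # fails in the second run, so it is excluded
    "RES_C": [-1.0, -1.5, -2.0],
}

for residue, (mean, sem) in select_hotspots(example).items():
    print(f"{residue}: {mean:.2f} ± {sem:.2f} kcal/mol")
```

Requiring the cutoff to hold in every trajectory, rather than only on average, guards against a residue being flagged because of a single outlying run.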
The allosteric modulator weakens the binding of orthosteric agonist
In order to estimate the allosteric potency of ZINC5042, we first examined its impact on the orthosteric ligand and the receptor. To this end, two endogenous agonists of β2AR that differ to some extent in signaling, i.e., NE and L-epinephrine (ALE)41, were considered in order to provide more sufficient evidence. Three independent 1.5-μs cMD simulations (see Supplementary methods for simulation details) were carried out for each of the five complex systems, including β2AR-ZINC5042, β2AR-NE, β2AR-ALE, β2AR-NE-ZINC5042, and β2AR-ALE-ZINC5042. RMSD values show that the five systems reach equilibrium, as reflected by Supplementary Fig. 6. MM/GBSA was used to calculate their ligand-receptor binding free energies based on the last 100 ns of each trajectory. As shown in Fig. 5c, without the allosteric modulator ZINC5042 bound, the binding free energies between the orthosteric ligands and the receptor are −21.34 ± 2.19 kcal/mol for NE and −31.73 ± 2.77 kcal/mol for ALE. The agonist ALE exhibits a higher affinity than NE, consistent with the experimental results for the inhibitory constant (Ki)41. However, after the allosteric modulator ZINC5042 is bound, the binding energies between the two agonists and the receptor are weakened to −14.58 ± 3.65 kcal/mol for NE and −13.07 ± 0.85 kcal/mol for ALE. These results indicate that the allosteric modulator ZINC5042 significantly reduces the affinity of the orthosteric agonists for the receptor. In addition, the two agonists also weaken the binding of ZINC5042 to the receptor, as evidenced by a comparison between β2AR-ZINC5042 (−56.46 ± 1.68 kcal/mol), β2AR-NE-ZINC5042 (−48.17 ± 4.02 kcal/mol) and β2AR-ALE-ZINC5042 (−50.18 ± 3.06 kcal/mol) in Fig. 5c. The observation reveals negative cooperativity in binding energy between the orthosteric agonist and the allosteric modulator, implying a negative allosteric potency of ZINC5042.
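The negative cooperativity can be expressed as a shift in binding free energy, ΔΔG = ΔG(with partner bound) − ΔG(alone), where a positive value means weaker binding. The sketch below applies this to the mean values quoted above: the agonist shifts are +6.76 kcal/mol for NE and +18.66 kcal/mol for ALE, and the ZINC5042 shifts are +8.29 kcal/mol with NE and +6.28 kcal/mol with ALE. Combining the two SEMs in quadrature is an added assumption (it treats the two estimates as independent) and is not stated in the text; the names `binding_shift`, `agonist_shift`, and `modulator_shift` are hypothetical.

```python
import math

# Mean and SEM of MM/GBSA binding free energies (kcal/mol), as quoted in the text.
AGONIST_ALONE = {"NE": (-21.34, 2.19), "ALE": (-31.73, 2.77)}
AGONIST_WITH_ZINC5042 = {"NE": (-14.58, 3.65), "ALE": (-13.07, 0.85)}
ZINC5042_ALONE = (-56.46, 1.68)
ZINC5042_WITH_AGONIST = {"NE": (-48.17, 4.02), "ALE": (-50.18, 3.06)}


def binding_shift(with_partner, alone):
    """Return (ddG, combined error); positive ddG means weaker binding.

    The error combines the two SEMs in quadrature, assuming the two
    estimates are independent (an assumption of this sketch).
    """
    ddg = with_partner[0] - alone[0]
    err = math.hypot(with_partner[1], alone[1])
    return round(ddg, 2), round(err, 2)


# Effect of ZINC5042 on each orthosteric agonist.
agonist_shift = {
    name: binding_shift(AGONIST_WITH_ZINC5042[name], AGONIST_ALONE[name])
    for name in AGONIST_ALONE
}
# Effect of each agonist on ZINC5042.
modulator_shift = {
    name: binding_shift(ZINC5042_WITH_AGONIST[name], ZINC5042_ALONE)
    for name in ZINC5042_WITH_AGONIST
}

for name, (ddg, err) in agonist_shift.items():
    print(f"{name} binding with ZINC5042 present: ddG = {ddg:+.2f} (err {err:.2f}) kcal/mol")
for name, (ddg, err) in modulator_shift.items():
    print(f"ZINC5042 binding with {name} present: ddG = {ddg:+.2f} (err {err:.2f}) kcal/mol")
```

All four shifts are positive, which is the pattern expected for mutual weakening of binding between the orthosteric and allosteric ligands.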
The allosteric modulator drives the receptor to the inactive conformation
To further estimate the effect of ZINC5042 on the activity of β2AR, we compared the structural differences upon binding ZINC5042 by superposing β2AR-ALE-ZINC5042 and β2AR-NE-ZINC5042 with the inactive and active crystal structures of β2AR. Since the structures are similar between the two systems (β2AR-ALE-ZINC5042 and β2AR-NE-ZINC5042), we only presented the superposition result of β2AR-ALE-ZINC5042 with the two crystal structures in the text, while the structural superposition of β2AR-NE-ZINC5042 was provided in Supplementary Fig. 7. As reflected by Fig. 6a, most regions of the receptor in the β2AR-ALE-ZINC5042 complex system resemble those of the inactive receptor, in particular for the activation region at the intracellular side. In the active state of β2AR, the intracellular ends of TM5 and TM6 typically exhibit outward movement, creating an open intracellular cavity for downstream protein binding. However, in the β2AR-ALE-ZINC5042 complex, the intracellular ends of TM5 and TM6 remain closed, exhibiting an inactive-like conformation that occludes downstream protein coupling (Supplementary Fig. 8).
The blue arrows display the movement direction of the β2AR-ALE-ZINC5042 system. a The comparison of the overall structure for the three systems. For clarity, components except receptors are omitted. b Structural comparison of the microswitches in the sodium ion binding site (D2.50-S3.39-S7.46) and the PIF motif (P5.50-I3.40-F6.44) in the three different systems. c Comparison of four important residues of the allosteric site identified in the three different systems, in which ZINC5042 is represented by green spheres. d Distributions (histograms) of the distance (d1) of the OG atoms between S3.39 and S7.46 during the three 1.5-μs simulations for the β2AR-ALE-ZINC5042 system (blue). The yellow and salmon dashed lines represent the distance (d1) in the inactive and active crystal structures, respectively. e Distributions (histograms) of the distance (d2) of the Cα atoms between P5.50 and F6.44 during the three 1.5-μs simulations for the β2AR-ALE-ZINC5042 system (green). The yellow and salmon dashed lines denote the distance (d2) in the inactive (PDB ID: 2RH1) and active (PDB ID: 3SN6) crystal structures, respectively.
It is generally accepted that GPCR activation is an allosteric process initiated by perturbations in the extracellular binding pocket and transmitted, through the activation of molecular switches, to the intracellular region where downstream proteins bind. These molecular switches include the residues of the PIF motif (P2115.50-I1213.40-F2826.44) and three residues (D792.50-S1203.39-S3197.46) of the sodium ion binding site42, which are located in the middle of the transmembrane helix bundle and close to the allosteric pocket. As shown in Fig. 6b, the PIF motif in the active crystal structure undergoes a rearrangement with respect to the inactive crystal structure. The rearrangement facilitates the outward movement of the cytoplasmic end of TM6, which is considered necessary for GPCR activation43. In addition, D792.50, S1203.39 and S3197.46 lie closer to each other in the active crystal structure than in the inactive one, leading to the disruption of the sodium ion pocket. Consequently, the inward movement of TM7 is promoted, which is another characteristic of GPCR activation43.
In contrast to the activation features above, the β2AR-ALE-ZINC5042 structure exhibits outward movement of the PIF motif residues (P2115.50, I1213.40, F2826.44) and two sodium ion binding pocket residues (S1203.39, S3197.46), as shown in Fig. 6b. The outward movement mainly results from the steric hindrance between ZINC5042 and the four residues of the allosteric site (D792.50, S1203.39, F2826.44 and S3197.46), as shown in Fig. 6c. To further observe the conformational change in the sodium ion binding pocket induced by ZINC5042 binding, we compared the distance between S1203.39 and S3197.46 (labeled d1) with the corresponding distances in the active and inactive crystal structures (Fig. 6d). The collapse of the sodium ion pocket upon activation causes d1 to decrease from 6.3 Å in the inactive state to 4.5 Å in the active state. In the β2AR-ALE-ZINC5042 system, d1 is always greater than 4.5 Å, indicating that ZINC5042 binding inhibits the collapse of the sodium ion pocket. Similarly, the distance between P2115.50 and F2826.44 (labeled d2) is used to characterize the conformation of the PIF motif (Fig. 6e). Upon activation, the inward movement of P2115.50 causes d2 to decrease from 11.1 Å in the inactive crystal structure to 9.8 Å in the active crystal structure. In the β2AR-ALE-ZINC5042 system, d2 is always greater than 9.8 Å, indicating that ZINC5042 binding inhibits the agonist-induced conformational rearrangement of the PIF motif, thereby limiting the outward movement of the TM6 intracellular segment. Collectively, the binding of ZINC5042 inhibits the transition of the receptor conformation to the active state, further suggesting the potential of ZINC5042 as a negative allosteric modulator (NAM).
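The distance criterion used for d1 and d2 can be illustrated with a minimal sketch that computes an inter-atomic distance per frame and the fraction of frames above the active-state reference value. The coordinates below are invented for illustration and are not taken from the simulations; the function and variable names are likewise hypothetical.

```python
import math


def frame_distances(coords_a, coords_b):
    """Distance (in Å) between two atoms for every frame of a trajectory."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def fraction_above(values, threshold):
    """Fraction of frames whose distance exceeds the threshold."""
    return sum(v > threshold for v in values) / len(values)


# Hypothetical OG-atom coordinates (Å) of S3.39 and S7.46 over four frames.
og_s339 = [(0.0, 0.0, 0.0)] * 4
og_s746 = [(6.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.5, 0.0), (3.0, 4.0, 0.0)]

ACTIVE_D1 = 4.5  # d1 in the active crystal structure, as quoted in the text

d1 = frame_distances(og_s339, og_s746)
open_fraction = fraction_above(d1, ACTIVE_D1)
print(d1, open_fraction)
```

A fraction of 1.0 corresponds to the statement that d1 stays above the active-state value throughout the trajectory; the same function applies to d2 with the 9.8 Å reference.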
Allosteric regulation mechanism revealed by protein structure network (PSN)
To gain insight into how the allosteric modulator regulates the orthosteric agonists, we employed PSN to calculate the shortest pathway with the highest frequency between the allosteric site and the orthosteric site for the β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042 systems. The shortest pathway is usually considered to be the most likely or biologically relevant pathway44. The residues of the allosteric site and the orthosteric site are listed in Supplementary Table 3. As shown in Fig. 7a, the shortest pathways are F2826.44-W2866.48-F2906.52 for β2AR-NE-ZINC5042 and F2826.44-L2125.51-W2866.48-F2906.52 for β2AR-ALE-ZINC5042, suggesting the importance of these residues in the allosteric regulation. Both pathways include W2866.48 and F2826.44, which belong to the CWxP and PIF motifs, respectively. These two conserved motifs of class A GPCRs45 have been reported to play important roles in GPCR activation. Their presence in the shortest pathways implies that the allosteric regulation should influence receptor activation.
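The idea of a "shortest pathway with the highest frequency" can be sketched as follows: a shortest path is searched in the residue network of each frame with Dijkstra's algorithm, and the path that recurs most often is reported. This is a minimal illustration rather than the PSN implementation used in this work; the unit edge weights, the per-frame edge lists, and all function names are assumptions.

```python
import heapq
from collections import Counter


def shortest_path(edges, source, target):
    """Dijkstra's algorithm on an undirected graph with unit edge weights."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    queue = [(0, source, (source,))]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + 1, neighbor, path + (neighbor,)))
    return None  # no connection between source and target


def most_frequent_path(frames, source, target):
    """Return (path, count) for the shortest path seen in the most frames."""
    counts = Counter()
    for edges in frames:
        path = shortest_path(edges, source, target)
        if path is not None:
            counts[path] += 1
    return counts.most_common(1)[0] if counts else None


# Toy per-frame networks built from residues named in the text (illustrative).
frame_a = [("F282", "W286"), ("W286", "F290")]
frame_b = [("F282", "L212"), ("L212", "W286"), ("W286", "F290")]
frames = [frame_a, frame_a, frame_b]

result = most_frequent_path(frames, "F282", "F290")
print(result)
```

In this toy input, the three-residue path occurs in two of the three frames and is therefore reported as the most frequent shortest pathway.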
Blue-shaded area: the orthosteric site. Green-shaded area: the allosteric site. Pink-shaded area: the intracellular activation region. a The shortest pathways with the highest frequency from the allosteric site to the orthosteric site in the β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042 systems. b The shortest pathway with the highest frequency from the orthosteric site to the intracellular activation region. PSN was calculated based on the last 1000-ns trajectory of three independent 1.5-μs cMD simulations for each system.
To further understand how the allosteric modulator influences the receptor activation induced by the orthosteric agonists, we analyzed the shortest pathways between the orthosteric site and the intracellular activation region for the four systems, β2AR-NE, β2AR-ALE, β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042 (Fig. 7b), in which we selected residues of the orthosteric site and intracellular activation region as starting and ending nodes (see Supplementary Table 3 for details), respectively. Without binding the allosteric modulator, the shortest pathways are D1133.32-V862.57-M822.53-S3197.46-D792.50-N3227.49-Y3267.53 for β2AR-NE and F2906.52-W2866.48-F2826.44-I1213.40-M2155.54-M2796.41-Y2195.58 for β2AR-ALE. The two pathways include residues of the sodium ion binding site (S3197.46, D792.50) and the PIF motif (F2826.44, I1213.40), respectively, through which the agonist regulates the receptor activation.
However, after binding ZINC5042, the two ternary complex systems (β2AR-NE-ZINC5042 and β2AR-ALE-ZINC5042) exhibit the same shortest pathway (F2906.52-W2866.48-F2826.44-M2796.41-Y2195.58), which differs significantly from those of the systems without ZINC5042. As reflected by the residues in the shortest pathway, the regulation from the orthosteric site to the activation region in the two ternary complex systems mainly relies on intra-helical structural communication within TM6 and contains only one inter-helical structural communication (M2796.41-Y2195.58) at the end of the pathway. In contrast, for the systems binding only the orthosteric agonists, the shortest pathways exhibit extensive inter-helical structural communication between TM2, TM3 and TM7 in β2AR-NE and between TM3, TM5 and TM6 in β2AR-ALE, as shown in Fig. 7b. Inter-helical communication that favors conformational changes is considered crucial for receptor activation46, while intra-helical interactions often stabilize existing conformations42.
Taken together, it can be inferred that binding of the allosteric ligand decreases the inter-helical structural communication, thus disfavoring the activation signaling stimulated by the agonist. Moreover, the important residue F2826.44 of the allosteric pocket participates both in the regulation pathway from the allosteric site to the orthosteric site and in that from the extracellular orthosteric site to the intracellular activation region, highlighting the importance of F2826.44 for the allosteric signaling of ZINC5042.
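The contrast between inter- and intra-helical communication can be quantified directly from the pathways reported above by counting the steps at which the helix number changes. The sketch below does so; the "residue:Ballesteros-Weinstein number" label format is a convention chosen here for illustration only.

```python
def helix_of(label):
    """Helix number from a label such as 'F290:6.52' (residue:BW number)."""
    return int(label.split(":")[1].split(".")[0])


def count_inter_helical(path):
    """Number of consecutive residue pairs that sit on different helices."""
    helices = [helix_of(residue) for residue in path]
    return sum(a != b for a, b in zip(helices, helices[1:]))


# Shortest pathways from the orthosteric site to the activation region,
# transcribed from the text.
pathways = {
    "NE": ["D113:3.32", "V86:2.57", "M82:2.53", "S319:7.46",
           "D79:2.50", "N322:7.49", "Y326:7.53"],
    "ALE": ["F290:6.52", "W286:6.48", "F282:6.44", "I121:3.40",
            "M215:5.54", "M279:6.41", "Y219:5.58"],
    "agonist+ZINC5042": ["F290:6.52", "W286:6.48", "F282:6.44",
                         "M279:6.41", "Y219:5.58"],
}

inter_helical = {name: count_inter_helical(p) for name, p in pathways.items()}
print(inter_helical)
```

By this simple count, the agonist-only pathways each contain four inter-helical steps, whereas the shared ternary-complex pathway contains one, in line with the interpretation given above.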
Experimental validation on the pharmacological property of ZINC5042 and the allosteric site
To confirm the pharmacological properties of ZINC5042, we first measured the efficacy of ZINC5042 alone for β2AR activation using a cell-based functional assay. Our results indicate that ZINC5042 fails to activate β2AR (Fig. 8a). Next, we investigated the allosteric effect of ZINC5042 on the G-protein signaling of β2AR induced by the two orthosteric agonists using the Glosensor-based cAMP assay. ZINC5042 antagonizes both NE- and ALE-induced cAMP accumulation, as evidenced by a gradual, dose-dependent decrease in agonist-induced receptor activation with increasing ZINC5042 concentration (Fig. 8b, c). These observations confirm that ZINC5042 displays negative cooperativity with NE (log αβ = −0.82; αβ = 0.15) as well as ALE (log αβ = −1.15; αβ = 0.07) in a cell-based assay, strongly supporting our computational results. β2AR has been reported to engage both the Gs signaling pathway and β-arrestin recruitment. Given that biased allosteric modulators that exert pathway-specific effects have opened new frontiers in GPCR drug discovery47, we further conducted the NanoBiT β-arrestin recruitment assay to test whether ZINC5042 influences β-arrestin recruitment. First, our results show that, unlike NE, ZINC5042 alone cannot induce β2AR-mediated β-arrestin recruitment (Fig. 8d). Interestingly, ZINC5042 inhibits NE-induced β-arrestin2 recruitment in a dose-dependent manner (Fig. 8e), behaving as a negative allosteric modulator (NAM) of β2AR in β-arrestin signaling. To evaluate the biased character of ZINC5042 in β2AR signaling, we further compared its effects in the NanoBiT β-arrestin recruitment and Glosensor cAMP accumulation assays. Compared with NE alone, the β-arrestin2 recruitment ability of β2AR is reduced to approximately 54% by the addition of 80 μM ZINC5042 (Fig. 8f). In contrast, G protein activation is reduced to approximately 10% (Fig. 8f). These findings suggest that Gs activation is attenuated more strongly than β-arrestin recruitment in the presence of NE and ZINC5042, indicating that ZINC5042 acts as a G protein-biased NAM of β2AR. Similar to other reported β2AR negative allosteric modulators20,48,49, ZINC5042 presents allosteric activity at micromolar concentrations. However, the target selectivity results (see Supplementary Fig. 9 for details) indicate that ZINC5042 is highly selective for β2AR in its pharmacological function. In addition, we performed the cell-based functional assays for the other three screened hit compounds (ZINC11681543, ZINC4213962, and ZINC252008995). As shown in Supplementary Fig. 10, all three hit compounds exhibit negative allosteric modulator (NAM) effects, further supporting the potential of our screening strategy in practical applications.
a Representative curve for concentration-dependent activation of β2AR in response to NE or ZINC5042 stimulation examined by Glosensor-based cAMP assay. ZINC5042 fails to activate β2AR. Data are presented as the mean ± standard error of the mean (SEM) of three independent experiments performed in triplicate. Error bars represent SEM. b, c Negative allosteric effect of ZINC5042 on NE-induced cAMP accumulation (b) and on ALE-induced cAMP accumulation mediated by β2AR (c). Data are presented as mean ± SEM from three independent experiments (n = 3) performed in triplicate. d, e Dose-response curves of β2AR in response to stimulation with different ligands by β-arrestin2 recruitment assay. Values are mean ± SEM from three independent experiments (n = 3) performed in triplicate. f The activation efficacy of β2AR in response to NE in the presence of 80 μM ZINC5042. Data are presented as mean ± SEM from three independent experiments (n = 3) performed in triplicate. p-values were obtained by Dunnett’s multiple comparison test. Gs protein: Norepinephrine vs. Gs protein: Norepinephrine + 80 μM ZINC5042, ***p < 0.001; β-arr2: Norepinephrine vs. β-arr2: Norepinephrine + 80 μM ZINC5042, ***p < 0.001; Gs protein: Norepinephrine + 80 μM ZINC5042 vs. β-arr2: Norepinephrine + 80 μM ZINC5042, ###p < 0.001. g The allosteric effect of ZINC5042 on β2AR WT and mutants. Bars represent the difference of each mutant relative to WT β2AR after calculating the ratio of the maximum efficacy (Emax) of NE in the presence and absence of 80 μM ZINC5042. Data are presented as mean ± SEM from three independent experiments (n = 3) performed in triplicate. All data were analyzed by one-way analysis of variance with Dunnett’s multiple comparison test to determine significance (compared with WT, from left to right, ***p < 0.001, < 0.001, < 0.001, < 0.001).
h Schematic diagram illustrating the interactions between key allosteric residues (blue sticks) and ZINC5042 (green sticks) derived from the computational analysis of binding mode. ZINC5042 forms a salt bridge with D792.50, hydrogen bonding with the N3187.45 and S3197.46 side chains, and the π–π stacking interactions with F2826.44. Polar interactions are highlighted as black dotted lines, and π-π stacking is represented by magenta dotted lines.
To validate the allosteric site of ZINC5042 in β2AR predicted by our computational framework, we performed site-directed mutagenesis studies and cell-based functional assays in the presence of the orthosteric agonist NE, with and without ZINC5042. The results indicate that most mutations of residues in the predicted allosteric binding site reduce the ability of ZINC5042 to inhibit NE-induced responses. Specifically, the Glosensor-based cAMP assay shows that D792.50A, F2826.44A, N3187.45A, and S3197.46A significantly reduce the antagonistic effect of ZINC5042 (Fig. 8g), and almost all of these residues were identified as key residues for ZINC5042 binding by the binding energy analysis above. For example, D792.50 contributes significantly to ZINC5042 binding through an essential salt bridge (Fig. 8h). S3197.46, as a hotspot residue, forms a hydrogen bond with the alcohol hydroxyl group of the polar head of ZINC5042. Supplementary Fig. 11 shows that S3197.46 forms hydrogen bonds with ZINC5042 more frequently than N3187.45 in the three independent simulations, which explains why S3197.46, rather than N3187.45, was identified as the hotspot residue for ZINC5042 binding in the MM/GBSA analysis above. This is likely due to the rotation of the hydroxyl group of ZINC5042, which allows ZINC5042 to alternately form hydrogen bonds with S3197.46 or the adjacent N3187.45 (Fig. 8h), indicating that N3187.45 also plays a role in stabilizing ZINC5042 despite a shorter hydrogen-bond lifetime than S3197.46. The computational result rationalizes the experimental observation that mutations of both S3197.46 and N3187.45 reduce the antagonistic effect of ZINC5042, with a greater decrease for S3197.46A than for N3187.45A (Fig. 8g).
For F2826.44, the binding mode analysis above already reveals that it interacts with the chrysene moiety via π–π stacking interactions (see also Fig. 8h). Moreover, F2826.44 was shown above to serve as a regulatory residue in the two allosteric pathways (i.e., one from the allosteric site to the orthosteric site and one from the orthosteric site to the intracellular activation domain). Accordingly, the F2826.44A mutation has the most pronounced impact on the antagonistic effect of ZINC5042 on downstream signaling, which also supports the allosteric regulation mechanism revealed above. Collectively, the pharmacological and site-directed mutagenesis experiments are fully in line with our computations, strongly validating the reliability of our computational framework for the allosteric site, allosteric effect, and allosteric mechanism.
Discussion
In this work, following the residue-driven nature of allosteric conformational transitions, we developed a general, state-of-the-art computational framework by coupling the residue-intuitive hybrid machine learning (RHML) model with MD simulations, in order to efficiently identify allosteric sites and discover potential allosteric drugs. The RHML model was developed by combining unsupervised clustering with an interpretable CNN-based multi-classification model, which addresses the limitations of existing ML models in MD conformational analysis, including the choice of the optimal number of categories, the information loss in conformation representation, and the residue-based interpretation of prediction results. Consequently, RHML enables accurate conformation classification and identification of the important residues that determine the different conformational classes for any MD trajectory. Benefiting from these technical advantages, RHML unveils a previously unreported allosteric site in β2AR and other GPCRs.
The additional allosteric site is located around residues D792.50, F2826.44, N3187.45, and S3197.46, and we used it in virtual screening to discover a putative allosteric modulator, ZINC5042. Assisted by extensive cMD simulations, MM/GBSA, and PSN, we further probed the communication of the allosteric site/modulator with the orthosteric site/agonist, which is important for further estimating the allosteric potential and thus improving the success rate of allosteric site/drug identification. MM/GBSA shows that ZINC5042 weakens the binding of the orthosteric agonists to β2AR in a negatively cooperative manner. The structural analysis indicates that ZINC5042 hinders the collapse of the sodium ion binding pocket and the conformational transition of the PIF motif to the active state, thus driving the receptor conformation to the inactive state. PSN indicates that binding of the allosteric modulator ZINC5042 decreases the inter-helical structural communication, thus disfavoring the activation signaling stimulated by the agonist. In addition, some important allosteric regulation residues are identified. Building on this computational evidence, the Glosensor-based cAMP assay and site-directed mutagenesis experiments strongly validate the computational prediction of the allosteric site and the negative allosteric effect, confirming that the identified key residues D792.50, F2826.44, N3187.45 and S3197.46, in particular F2826.44, indeed play important roles in binding the allosteric modulator and inhibiting the activation signaling induced by the orthosteric agonists.
It is noted that six allosteric sites of β2AR were reported8,20,21,22,23,24. However, three20,21,24 of these sites belong to protein-protein interaction (PPI) binding sites. The other three reported sites8,22,23 and our identified site present preformed cavities, which facilitate drug binding with respect to the PPI binding sites50. Similar to the exosite reported23, the allosteric site identified by us implies extra potential as a target for novel bitopic ligands compared to the other two sites8,22, since it is located closer to the sodium binding pocket. Collectively, the allosteric site identified by us offers another avenue to develop allosteric modulators for β2AR. Also, the important residues identified above are beneficial to rationally engineering allostery in β2AR.
As a major component of the interface between the sympathetic nervous system and the cardiovascular system, the β-adrenergic receptor signaling pathway plays a key role in the progression of heart failure. β-adrenergic receptor antagonists (β-blockers or βAR antagonists) are widely used in the treatment of congestive heart failure (CHF) due to their antagonistic effect on β-adrenergic receptors. It has been suggested that β-arrestin-biased agonists that selectively target β2AR may be more beneficial to the treatment of CHF51. ZINC5042, as a G protein-biased allosteric modulator of β2AR, retains some β-arrestin activity while significantly reducing endogenous ligand-activated G protein activity. Compared to other reported NAMs of β2ARs20,48,49, the negative allosteric modulator ZINC5042 exhibits comparable effects and unique pathway specificity with the G protein bias. To the best of our knowledge, it is the first reported G-protein biased NAM for β2AR, promising a new generation of β-blockers and a novel pharmacological tool compound. Furthermore, ZINC5042 is an experimental anticancer agent investigated in Phase I clinical trials52. Its previously acquired data on the drug’s safety and toxicity could be instrumental in its future development, thus offering an advantage in accelerating the progress toward practical applications by drug repurposing. Besides, ZINC5042 also provides a blueprint for lead optimization to develop more potent NAMs.
Overall, the identification pipeline offers a promising strategy to discover allosteric sites/drugs and reveal their regulation mechanisms for other target proteins. We have therefore made user-friendly code for the residue-intuitive hybrid machine learning framework available at https://github.com/chyannn06/RHML. The code offers customizable input options and automatically generates readable output files that include the cluster categories and the important residues that determine the classification. We expect that it will serve as a valuable tool in the MD field for aiding allosteric site identification and other MD tasks associated with conformational analysis.
Methods
System setup for MD simulations
The crystal structures of the inactive (PDB ID: 2RH1)53 and active (PDB ID: 4LDO)54 states of β2AR were obtained from the Protein Data Bank. All components other than the protein were removed from the crystal structures, and the missing intracellular loop 3 (ICL3) region (residue numbers: 231–262) was reconstructed using MODELLER V9.255. The 3D structure of ALE was obtained from the co-crystallized structure (PDB ID: 4LDO). The 3D structure of NE was downloaded from the PubChem database56 and optimized at the DFT/B3LYP/6-31G** level using the Gaussian 09 program57 before docking. All ligand docking was performed with AutoDock 4.258, and the reasonable docking pose with the top score was selected for subsequent MD simulations.
To prepare the system for MD simulation, hydrogen atoms were added at pH = 7 using H++59. The receptor structure was aligned using the Orientations of Proteins in Membranes (OPM) database and inserted into a lipid bilayer composed of 80% phosphatidylcholine (POPC) and 20% cholesterol. The system was solvated and neutralized with 0.15 mol/L NaCl in the aqueous phase. The CHARMM36 force field was used for the receptor, lipids, and salt ions, while the CHARMM TIP3P model was chosen for water60. Ligand parameters were generated using the CHARMM General Force Field (CGenFF)61. These settings have been successfully used in MD simulations of GPCRs62,63. All these steps were carried out using the CHARMM-GUI server64. The systems were then minimized and equilibrated (see Supplementary Methods for more details).
GaMD Molecular dynamics simulations
To sufficiently sample the conformational changes associated with the opening of the allosteric site, we utilized GaMD65 to enhance sampling (see Supplementary Methods for details). Before performing the GaMD simulations, a 210-ns cMD production run was performed, from which the acceleration parameters were calculated. The final structure of the 210-ns cMD simulation was selected as the starting structure for the subsequent GaMD simulations with random initial velocities. For the inactive β2AR bound by NE, we carried out five independent 3-μs GaMD simulations to ensure sufficient sampling, labeled traj1, traj2, traj3, traj4, and traj5. All simulations reached convergence (Supplementary Fig. 12). MD simulations were performed using the Amber16 software66. Detailed simulation parameters are described in Supplementary Methods.
Construction of the interpretable CNN-based multi-classification model
The foundational paradigm of the deep learning-based classification model mainly followed our previous binary classification method67. However, unlike in the previous work, the labels of the conformational categories in this work are not known in advance, and the conformations encompass multiple categories rather than simple binary classes67. Consequently, our previous classification strategy had to be modified to handle more extensive and complex MD conformational analysis. We therefore introduced the k-means algorithm to obtain the initial category labels. With these category labels, the interpretable CNN-based multi-classification model was constructed and trained. Specifically, the XYZ coordinates of each conformation were converted to the RGB coordinates \({C}_{{RGB}}\) using a matrix transformation (see Eq. (1)).
Consequently, each conformation was represented by a pixel map, where each pixel corresponds to an atom. These pixel maps and their category labels inferred from the clustering analysis were utilized to train the CNN-based classification model (Fig. 2a). The CNN model is composed of four convolutional layers, two max-pooling layers, and two fully connected layers. Rectified linear units (ReLU) were used as the activation function to increase the model’s nonlinearity. The fully connected layers of the model include two dense layers, with the first dense layer containing 512 neurons. The number of neurons in the final dense layer is dependent on the number of classes inferred from the clustering result. Softmax activation was used for the multi-classification. To prevent overfitting, dropout techniques were employed after the first and second max-pooling layers, as well as the first dense layer, with dropout rates of 0.25, 0.25, and 0.5, respectively. Model training utilized the Adam optimizer and categorical cross-entropy loss function, with prediction accuracy as the performance metric.
To address the black-box problem, we further established an interpreter for the CNN-based classification result based on the Local Interpretable Model-Agnostic Explanation (LIME) paradigm68. LIME utilizes linear models to approximate the local decision boundary, which can provide an approximate explanation for the classification result of the CNN-based model. To identify the important residues that determine each category, the LIME interpreter generated distinct sets of LIME matrices for each class. Figure 2b illustrates how the LIME interpreter works for the multi-classifier. To obtain predictions for the model being explained, f, we generated a perturbed dataset A with small perturbations based on the instance a being explained (see the red star in Fig. 2b) and weighted its members by \({\pi }_{x}\left(a\right)\), which characterizes the proximity between the instances x and a. \({\pi }_{x}\left(a\right)\) was determined by an exponential kernel defined on a distance function D (the Euclidean distance in this work) with width σ, and is expressed as Eq. (2).
Next, we trained a local linear model l (see the gray dotted line in Fig. 2b) on the perturbed dataset to interpret the black-box model locally. To assess the fidelity of the linear model l in approximating the original model f for explanation, we calculated the error using Eq. (3):
where \(f\left(a\right)\) and \(l({a}^{{\prime} })\) are the predicted probabilities of belonging to a given class and \({a}^{{\prime} }\) is the interpretable version of a. The explanation produced by LIME is the optimal result that minimizes the sum of the loss function \(L\left(f,l,{\pi }_{x}\right)\) and the complexity measure \(\varOmega (l)\), as given by Eq. (4).
The complexity measure \(\varOmega (l)\) penalizes models that have too many features or coefficients, thereby ensuring interpretability.
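For reference, the three LIME quantities referred to as Eqs. (2)–(4) are commonly written as below. These are the standard forms from the LIME literature68 rewritten in the notation used here; their exact correspondence to Eqs. (2)–(4), and the symbol \(\xi\) for the resulting explanation, are assumptions.

\[
{\pi }_{x}\left(a\right)=\exp \left(-\frac{D{\left(x,a\right)}^{2}}{{\sigma }^{2}}\right)
\]

\[
L\left(f,l,{\pi }_{x}\right)=\sum _{a,{a}^{{\prime} }\in A}{\pi }_{x}\left(a\right){\left(f\left(a\right)-l\left({a}^{{\prime} }\right)\right)}^{2}
\]

\[
\xi =\mathop{\arg \min }\limits_{l}\;L\left(f,l,{\pi }_{x}\right)+\varOmega \left(l\right)
\]

The first expression gives perturbed samples close to the explained instance a weight near 1 and distant samples a weight near 0, so the weighted squared error in the second expression measures how faithfully l reproduces f in the neighborhood of interest.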
For each conformation, a LIME matrix was generated to evaluate the importance of each pixel for the classification into the specific class, where each value is either 0 (insignificant) or 1 (significant). The LIME matrices of all conformations were summed and averaged to obtain a score ranging from 0 to 1, which reflects the importance of each atom in distinguishing the class from the others. Then, the importance scores of all atoms within a residue were averaged to represent the importance of that residue. A higher score indicates greater importance in distinguishing different conformational states.
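The two averaging steps can be sketched in a few lines. The example assumes that each binary LIME matrix has already been flattened to one value per atom and that an atom-to-residue mapping is available; the toy masks, the mapping, and the function names are illustrative and not taken from the RHML code.

```python
from collections import defaultdict


def atom_scores(lime_masks):
    """Average binary LIME masks (one list per conformation) atom by atom."""
    n_conformations = len(lime_masks)
    return [sum(column) / n_conformations for column in zip(*lime_masks)]


def residue_scores(scores, atom_to_residue):
    """Average the per-atom scores over the atoms of each residue."""
    grouped = defaultdict(list)
    for score, residue in zip(scores, atom_to_residue):
        grouped[residue].append(score)
    return {res: sum(vals) / len(vals) for res, vals in grouped.items()}


# Toy input: four conformations, four atoms (1 = significant, 0 = insignificant).
masks = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
atom_to_residue = ["D79", "D79", "F282", "F282"]  # hypothetical mapping

per_atom = atom_scores(masks)
per_residue = residue_scores(per_atom, atom_to_residue)
print(per_atom, per_residue)
```

Here the first atom is marked significant in every conformation and receives a score of 1.0, and the residue score is simply the mean over its atoms.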
In order to train RHML, five independent 3-μs GaMD trajectories (traj1 to traj5) were used to construct the conformation dataset, in which 30,000 conformations of each trajectory were divided into ten groups based on the time order. Each group was randomly split into training and validation sets (8:2 ratio) to conduct five-fold cross-validation training. The results from the five trajectories were analyzed.
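The data split described above can be illustrated as follows: the frames of one trajectory are cut into ten consecutive groups, and each group is shuffled and split 8:2. This is a sketch of the splitting step only; the fixed random seed and the function names are assumptions, and the cross-validation loop itself is not shown.

```python
import random


def time_ordered_groups(frames, n_groups):
    """Cut a time-ordered list of frames into consecutive, equally sized groups.

    Frames left over when the length is not divisible by n_groups are dropped.
    """
    size = len(frames) // n_groups
    return [frames[i * size:(i + 1) * size] for i in range(n_groups)]


def split_group(group, train_fraction, rng):
    """Randomly split one group into training and validation frames."""
    shuffled = group[:]
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(0)          # fixed seed, chosen here for reproducibility
frames = list(range(30000))     # frame indices of one trajectory
groups = time_ordered_groups(frames, 10)

train, valid = [], []
for group in groups:
    group_train, group_valid = split_group(group, 0.8, rng)
    train.extend(group_train)
    valid.extend(group_valid)

print(len(groups), len(train), len(valid))
```

Splitting within each time-ordered group, rather than over the whole trajectory at once, keeps every stage of the simulation represented in both the training and the validation set.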
Mapping algorithm
We employed the FTMap site mapping online server (http://ftmap.bu.edu) to identify binding sites on the important conformations identified by the RHML model above. FTMap utilizes 16 small molecule probes with diverse properties to search for hot spots on the conformation. The optimal binding positions of the probes are calculated and then clustered based on free energy to yield consensus clusters. The regions that bind different probe clusters are called consensus sites (CSs). CSs are ranked by the number of bound probe clusters, starting with consensus site 0 (CS0), which has the largest number of probe clusters. If the bound probe clusters of any consensus sites are within 4 Å of each other, the sites are considered to form a single binding site. In this work, the residues of the binding site were defined as those within 4 Å of these bound probe clusters. The mapping results were visually inspected using PyMOL (https://pymol.org), and the most promising allosteric sites were determined by combining them with the results of the LIME interpreter.
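The 4 Å merging rule can be illustrated with a small single-linkage grouping. The sketch assumes that each consensus site is represented by a single centre coordinate and that sites are merged transitively; both are simplifications made here, the coordinates are invented, and the function name is hypothetical, so this does not reproduce FTMap's own procedure.

```python
import math


def merge_sites(centers, cutoff=4.0):
    """Group consensus sites whose centres lie within `cutoff` Å of each other.

    Grouping is transitive (single linkage): if A is near B and B is near C,
    all three end up in the same binding site.
    """
    parent = list(range(len(centers)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= cutoff:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical centres (Å) of consensus sites CS0 to CS3.
cs_centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
binding_sites = merge_sites(cs_centers)
print(binding_sites)
```

In this toy case CS0, CS1 and CS2 form one binding site through the chain of 3 Å separations, while CS3 remains on its own.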
Virtual screening
Structure-based drug design (SBDD) generally includes structure-based virtual screening (VS) and structure-based de novo drug design (DNDD). VS docks molecules of a virtual library into the receptor structure and predicts their binding scores, while DNDD creates novel chemical entities based on the receptor structures69,70. Compared to DNDD, VS has the advantage of mitigating the problem of drug synthesis, as it uses large libraries of pre-synthesized compounds. Thus, it has become mainstream at the early hit identification stage37.
In this work, VS was conducted using a ligand set comprising a total of 103,862 molecules, obtained from the diverse-lib and drugs-lib databases provided by MTiOpenScreen on 6th August 202239. In total, 6000 molecules (including stereoisomers) were obtained through preliminary screening with MTiOpenScreen and were then docked using AutoDock 4.258. All docking input files were prepared using the AutoDockTools 1.5.6 package, and the grid map files of the active site were generated using AutoGrid 4.2. Gasteiger charges were added to the atoms. The docking box was positioned to cover the allosteric site predicted by FTMap, with a grid spacing of 0.375 Å. Semi-flexible docking was performed with a flexible ligand and a rigid receptor. To ensure accuracy, each ligand underwent 100 separate docking runs. Each docking run included a total of 1,750,000 energy evaluations using the Lamarckian genetic algorithm. The docking pose with the lowest binding energy was selected as the optimal binding mode for subsequent analysis.
Binding free energy analysis
Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) method has been considered as a reliable tool to estimate the binding free energy for protein-ligand interactions, which can be calculated in terms of Eq. (5),
where \({G}_{{complex}}\), \({G}_{{receptor}}\) and \({G}_{{ligand}}\) represent the free energy of the receptor-ligand complex, receptor and ligand, respectively. Each free energy term in Eq. (5) was estimated by the following equations, Eqs. (6)–(8).
\[G={E}_{{gas}}+{G}_{{sol}}-TS\quad (6)\]
\[{E}_{{gas}}={E}_{\mathrm{int}}+{E}_{{vdw}}+{E}_{{ele}}\quad (7)\]
\[{G}_{{sol}}={G}_{{psolv}}+{G}_{{npsolv}}\quad (8)\]
The gas phase energy (\({E}_{{gas}}\)) is the sum of the internal energy (\({E}_{\mathrm{int}}\)), van der Waals energy (\({E}_{{vdw}}\)), and electrostatic interaction energy (\({E}_{{ele}}\)). The solvation energy (\({G}_{{sol}}\)) comprises contributions from the polar solvation (\({G}_{{psolv}}\)) and non-polar solvation (\({G}_{{npsolv}}\)) energies. \(T\) represents the temperature and \(S\) denotes the total conformational entropy. Following previous computational studies71,72, the entropy contribution was not considered in this study because of its high computational cost and the potential errors of entropy calculations73. All binding free energy calculations were performed using the SANDER program in AMBER16.
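The way the energy terms defined above combine can be made concrete with a short arithmetic sketch. It only adds up terms that are assumed to have been computed already (it does not run AMBER), the entropy term is omitted as in this study, and the class name, field names and numerical values are illustrative assumptions rather than results.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Average energy components of one species, in kcal/mol."""
    e_int: float     # internal energy
    e_vdw: float     # van der Waals energy
    e_ele: float     # electrostatic interaction energy
    g_psolv: float   # polar solvation energy
    g_npsolv: float  # non-polar solvation energy

    @property
    def e_gas(self):
        # Gas phase energy: internal + van der Waals + electrostatic.
        return self.e_int + self.e_vdw + self.e_ele

    @property
    def g_sol(self):
        # Solvation energy: polar + non-polar contributions.
        return self.g_psolv + self.g_npsolv

    @property
    def free_energy(self):
        # Entropy term (-TS) deliberately left out, as in the text.
        return self.e_gas + self.g_sol


def binding_free_energy(complex_terms, receptor_terms, ligand_terms):
    """Free energy of the complex minus that of receptor plus ligand."""
    return complex_terms.free_energy - (
        receptor_terms.free_energy + ligand_terms.free_energy
    )


# Toy numbers chosen only to make the arithmetic easy to follow.
toy_complex = EnergyTerms(100.0, -50.0, -200.0, 120.0, -10.0)
toy_receptor = EnergyTerms(90.0, -20.0, -150.0, 80.0, -5.0)
toy_ligand = EnergyTerms(10.0, -2.0, -20.0, 10.0, -1.0)
toy_dg = binding_free_energy(toy_complex, toy_receptor, toy_ligand)
```

For these toy inputs the complex, receptor and ligand free energies are −40, −5 and −3 kcal/mol, so the estimated binding free energy is −32 kcal/mol.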
Protein structure network (PSN)
Herein, the PSN method, which has been applied successfully in computational studies74, was employed to investigate allosteric communication in the receptor. In PSN, residues are represented as nodes, and interactions between two nodes are represented as edges in a network. An edge is formed between two nodes only if the non-covalent interaction strength between them equals or exceeds a given cutoff, as defined by Eq. (9):
\[{I}_{{ij}}=\frac{{n}_{{ij}}}{\sqrt{{N}_{i}{N}_{j}}}\times 100\quad (9)\]
where \({I}_{{ij}}\) represents the percentage interaction between nodes i and j. The term \({n}_{{ij}}\) denotes the number of heavy atom-atom pairs between the side chains of residues i and j within a distance cutoff (4.5 Å). \({N}_{i}\) and \({N}_{j}\) are normalization factors for residues i and j. After the PSN is constructed, the shortest pathways between pairs of nodes can be searched using Dijkstra’s algorithm. The correlation matrix is then used to filter these shortest pathways. Herein, the dynamic cross-correlation (DCC) algorithm was used to estimate the motion correlation between residues by Eq. (10):
\[{C}_{{ij}}=\frac{\overline{\left({r}_{i}\left(t\right)-{\bar{r}}_{i}\right)\cdot \left({r}_{j}\left(t\right)-{\bar{r}}_{j}\right)}}{\sqrt{\overline{{\left({r}_{i}\left(t\right)-{\bar{r}}_{i}\right)}^{2}}\;\overline{{\left({r}_{j}\left(t\right)-{\bar{r}}_{j}\right)}^{2}}}}\quad (10)\]
where i and j denote residues, and \({r}_{i}\left(t\right)\) and \({r}_{j}\left(t\right)\) are the corresponding position vectors at time t. The overbar, as in \(\bar{r}\), denotes the ensemble average over a period of time. DCC characterizes the extent of residue-residue motion correlation on a scale from −1.0 to 1.0, where 1.0 indicates completely correlated motion and −1.0 denotes completely anti-correlated motion. Cross-correlation analysis and PSN were performed using Wordom software75.
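The three ingredients described above (the percentage interaction strength, the shortest-path search with Dijkstra's algorithm, and the dynamic cross-correlation) can be sketched in a few lines. This is not the Wordom implementation: the function names, the unit edge weights, the input formats and any cutoff value passed in are assumptions made for illustration, and the filtering of paths by the correlation matrix is left out because its exact criterion is not specified here.

```python
import heapq
import math


def interaction_strength(n_ij, norm_i, norm_j):
    """Percentage interaction: n_ij / sqrt(N_i * N_j) * 100."""
    return n_ij / math.sqrt(norm_i * norm_j) * 100.0


def build_network(pair_counts, norms, cutoff):
    """Adjacency sets with an edge wherever the strength >= cutoff.

    `pair_counts` maps (i, j) to the number of side-chain heavy-atom pairs;
    `norms` maps each residue to its normalization factor.
    """
    graph = {res: set() for res in norms}
    for (i, j), n_ij in pair_counts.items():
        if interaction_strength(n_ij, norms[i], norms[j]) >= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm with unit edge weights (an assumption).

    Returns the list of nodes from source to target, or None if unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nxt in sorted(graph[node]):
            nd = d + 1
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def dcc(traj_i, traj_j):
    """Dynamic cross-correlation of two equally long position time series."""
    n = len(traj_i)
    mean_i = [sum(p[k] for p in traj_i) / n for k in range(3)]
    mean_j = [sum(p[k] for p in traj_j) / n for k in range(3)]
    dev_i = [[p[k] - mean_i[k] for k in range(3)] for p in traj_i]
    dev_j = [[p[k] - mean_j[k] for k in range(3)] for p in traj_j]
    cov = sum(sum(a * b for a, b in zip(u, v)) for u, v in zip(dev_i, dev_j)) / n
    var_i = sum(sum(a * a for a in u) for u in dev_i) / n
    var_j = sum(sum(b * b for b in v) for v in dev_j) / n
    return cov / math.sqrt(var_i * var_j)
```

In practice each candidate path returned by `shortest_path` would then be kept or discarded according to the correlation values of its residues, using whatever threshold the analysis adopts.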
cAMP Accumulation assay
To examine the intracellular cAMP levels of HEK293 cells overexpressing β2AR in response to the two agonists under study (NE (TargetMol, T7044) and ALE (Med Chem Express (MCE), HY-B0447B)) and the screened allosteric ligands (ZINC5042 (MCE, HY-108999A), ZINC252008995 (MCE, HY-15337), ZINC4213962 (MCE, HY-100572) and ZINC11681534 (MCE, HY-B0203A)), the GloSensor-based cAMP accumulation assay was performed as described previously76,77. Briefly, HEK293 cells were transfected with β2AR plasmids and GloSensor plasmids in 6-well plates using Polyethylenimine Linear (PEI) MW40000 (Yeasen, Cat# 40816ES02). After 24 h incubation, cells were seeded into 96-well plates and incubated for another 24 h at 37 °C. The next day, the culture medium was discarded, and the cells were washed twice with PBS buffer and then incubated for 1 h at room temperature in 90 μL assay buffer (Hank’s Balanced Salt Solution buffer containing 10 mM HEPES, pH 7.4) containing a 3% v/v dilution of D-luciferin potassium salt (Yeasen, Cat# 40902ES03). After that, ligands diluted in the same buffer were added to the cells. After 30 min of stimulation at room temperature, luminescence was measured with a Synergy H1 microplate reader (BioTek). Data were processed using the nonlinear regression (curve fit) dose-response function in GraphPad Prism 8. All data are the mean ± SEM from three independent experiments. Operational models were used here to help understand the interaction between ZINC5042 and the orthosteric ligand NE or ALE in the GloSensor-based cAMP assay. Operational models are shown below78.
\[E=\frac{{E}_{\max }{\left({\tau }_{A}\left[A\right]\left({K}_{B}+\alpha \beta \left[B\right]\right)+{\tau }_{B}\left[B\right]{K}_{A}\right)}^{n}}{{\left(\left[A\right]{K}_{B}+{K}_{A}{K}_{B}+{K}_{A}\left[B\right]+\alpha \left[A\right]\left[B\right]\right)}^{n}+{\left({\tau }_{A}\left[A\right]\left({K}_{B}+\alpha \beta \left[B\right]\right)+{\tau }_{B}\left[B\right]{K}_{A}\right)}^{n}}\]
Here, \({E}_{\max }\) is the maximal response of the system; \(\left[A\right]\) and \([B]\) are the concentrations of the orthosteric ligand NE (or ALE) and the allosteric modulator ZINC5042, respectively; \({K}_{A}\) and \({K}_{B}\) denote the equilibrium dissociation constants of the orthosteric ligand (A) and the allosteric modulator (B), respectively; α is the binding cooperativity parameter between NE (or ALE) and ZINC5042; β denotes the allosteric effect of ZINC5042 on the efficacy of NE (or ALE); \({\tau }_{A}\) and \({\tau }_{B}\) denote the capacity of NE (or ALE) and ZINC5042, respectively, to exhibit agonism; and n is the slope of the transducer function that links receptor occupancy to the response.
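How the parameters defined above shape a dose-response curve can be explored with a small numerical sketch. It assumes the commonly used form of the operational model of allosterism from the pharmacology literature (as in ref. 78); the function name, argument names and all numerical values are illustrative, and the fitting itself was done in GraphPad Prism rather than with this code.

```python
def allosteric_response(a, b, e_max, k_a, k_b, alpha, beta, tau_a, tau_b, n=1.0):
    """Response predicted by the operational model of allosterism.

    a, b       : concentrations of orthosteric ligand and modulator (same units
                 as k_a and k_b)
    e_max      : maximal response of the system
    k_a, k_b   : equilibrium dissociation constants of ligand and modulator
    alpha      : binding cooperativity
    beta       : effect of the modulator on orthosteric ligand efficacy
    tau_a/b    : capacity of ligand / modulator to exhibit agonism
    n          : slope of the transducer function
    """
    # Stimulus generated by occupied receptors.
    stimulus = tau_a * a * (k_b + alpha * beta * b) + tau_b * b * k_a
    # Sum over all receptor species (free, A-bound, B-bound, A- and B-bound).
    occupancy = a * k_b + k_a * k_b + k_a * b + alpha * a * b
    return e_max * stimulus ** n / (occupancy ** n + stimulus ** n)


# Toy parameters: agonist alone versus agonist plus a negative modulator
# (beta < 1, no intrinsic agonism of the modulator).
common = dict(e_max=100.0, k_a=1e-6, k_b=1e-6, alpha=1.0, tau_a=1.0, tau_b=0.0)
agonist_only = allosteric_response(1e-6, 0.0, beta=0.1, **common)
with_modulator = allosteric_response(1e-6, 1e-5, beta=0.1, **common)
```

With these toy values the modulator lowers the response at a fixed agonist concentration, which is the qualitative signature of negative allosteric modulation of efficacy.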
Site-directed mutagenesis
The cDNA of human β2AR isoform 1 (NM_000024.6) was obtained from Changsha Youze Biotechnology Co., Ltd and subcloned into the pcDNA3.1 vector, tagged with a hemagglutinin (HA) signal sequence at the N terminus followed by a Flag tag. Forward and reverse primers for each mutation (D792.50A, F2826.44A, N3187.45A, S3197.46A) were synthesized by Tsingke Biotechnology Co., Ltd (Beijing, China). Mutation-specific primers and high-fidelity PrimeSTAR Max DNA Polymerase (Takara, Cat# R045A) were used to amplify the coding region carrying the mutations from the pcDNA3.1-HA-Flag-β2AR vector. The linearized PCR products were ligated through homologous recombination using the NovoRec plus One-step PCR Cloning Kit (Novoprotein Scientific Inc, China, Cat# NR005). All recombinant plasmids were extracted using the TIANprep Rapid Mini Plasmid Kit (TianGen, Cat# DP103) following the manufacturer’s instructions and verified by DNA sequencing.
NanoBiT β-arrestin recruitment assay
β2AR-mediated β-arrestin recruitment was measured by the NanoBiT β-arrestin recruitment assay as described previously79. NanoLuc was split to create a large fragment (LgBiT) and a small fragment (SmBiT). LgBiT was fused via a flexible linker to the C terminus of β2AR, and SmBiT was fused to the N terminus of β-arrestin. HEK293 cells were transfected with the β2AR-LgBiT and SmBiT-β-arrestin fusion vectors at a 1:1 ratio using PEI. After transfection, cells were washed with PBS, seeded into 96-well plates, and incubated for 12 h. Subsequently, the medium was removed and replaced with 5 µM coelenterazine h diluted in HBSS containing 20 mM HEPES. After 30 min incubation at room temperature, the ligand was added, and luminescence was measured with the Synergy H1 microplate reader (BioTek). Data analysis was conducted using GraphPad Prism 8.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The MD trajectories generated in this study and the source data underlying Figs. 3a, b, c–e, 5a, c, 6d, e, 8a–g and Supplementary Figs. 1, 2, 6, 7d, e, 9–12 have been deposited in figshare [https://doi.org/10.6084/m9.figshare.26129632]80. Source data are provided with this paper.
Code availability
The RHML tool developed by this study is open source and publicly available from Zenodo [https://zenodo.org/doi/10.5281/zenodo.13325067]81.
References
Cheng, X. & Jiang, H. Allostery in Drug Development. Adv. Exp. Med. Biol. 1163, 1–23 (2019).
Wootten, D., Christopoulos, A. & Sexton, P. M. Emerging paradigms in GPCR allostery: implications for drug discovery. Nat. Rev. Drug Discov. 12, 630–644 (2013).
Möhler, H., Fritschy, J. M. & Rudolph, U. A new benzodiazepine pharmacology. J. Pharm. Exp. Ther. 300, 2–8 (2002).
Guarnera, E. & Berezovsky, I. N. Allosteric drugs and mutations: chances, challenges, and necessity. Curr. Opin. Struct. Biol. 62, 149–157 (2020).
Chatzigoulas, A. & Cournia, Z. Rational design of allosteric modulators: Challenges and successes. WIREs Comput. Mol. Sci. 11, e1529 (2021).
Kuzmanic, A., Bowman, G. R., Juarez-Jimenez, J., Michel, J. & Gervasio, F. L. Investigating cryptic binding sites by molecular dynamics simulations. Acc. Chem. Res. 53, 654–661 (2020).
Hollingsworth, S. A. et al. Cryptic pocket formation underlies allosteric modulator selectivity at muscarinic GPCRs. Nat. Commun. 10, 3289 (2019).
Shah, S. D. et al. In silico identification of a β2-adrenoceptor allosteric site that selectively augments canonical β2AR-Gs signaling and function. Proc. Natl. Acad. Sci. USA 119, e2214024119 (2022).
Zhang, Q. et al. Targeting a cryptic allosteric site of SIRT6 with small-molecule inhibitors that inhibit the migration of pancreatic cancer cells. Acta Pharm. Sin. B 12, 876–889 (2022).
Beglov, D. et al. Exploring the structural origins of cryptic sites on proteins. Proc. Natl. Acad. Sci. USA 115, E3416–E3425 (2018).
Jiang, Y. et al. Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials. Nat. Commun. 12, 1–14 (2021).
Wallach, I. et al. AI is a viable alternative to high throughput screening: a 318-target study. Sci. Rep. 14, 7526 (2024).
Noé, F., Tkatchenko, A., Müller, K.-R. & Clementi, C. Machine learning for molecular simulation. Annu. Rev. Phys. Chem. 71, 361–390 (2020).
Zhu, J., Wang, J., Han, W. & Xu, D. Neural relational inference to learn long-range allosteric interactions in proteins from molecular dynamics simulations. Nat. Commun. 13, 1661 (2022).
Zhou, H., Dong, Z. & Tao, P. Recognition of protein allosteric states and residues: Machine learning approaches. J. Comput. Chem. 39, 1481–1490 (2018).
Hayatshahi, H. S., Ahuactzin, E., Tao, P., Wang, S. & Liu, J. Probing protein allostery as a residue-specific concept via residue response maps. J. Chem. Inf. Model. 59, 4691–4705 (2019).
Wold, E. A., Chen, J., Cunningham, K. A. & Zhou, J. Allosteric modulation of class A GPCRs: Targets, agents, and emerging concepts. J. Med. Chem. 62, 88–127 (2019).
Baker, J. G. The selectivity of beta-adrenoceptor antagonists at the human beta1, beta2 and beta3 adrenoceptors. Br. J. Pharm. 144, 317–322 (2005).
Karoli, N. A. & Rebrov, A. P. [Possibilities and limitations of the use of beta-blockers in patients with cardiovascular disease and chronic obstructive pulmonary disease]. Kardiologiia 61, 89–98 (2021).
Liu, X. et al. An allosteric modulator binds to a conformational hub in the β2 adrenergic receptor. Nat. Chem. Biol. 16, 749–755 (2020).
Liu, X. et al. Mechanism of β2AR regulation by an intracellular positive allosteric modulator. Science 364, 1283–1287 (2019).
Liu, X. et al. Mechanism of intracellular allosteric β2AR antagonist revealed by X-ray crystal structure. Nature 548, 480–484 (2017).
Masureel, M. et al. Structural insights into binding specificity, efficacy and bias of a β2AR partial agonist. Nat. Chem. Biol. 14, 1059–1066 (2018).
Swaminath, G., Lee, T. W. & Kobilka, B. Identification of an allosteric binding site for Zn2+ on the beta2 adrenergic receptor. J. Biol. Chem. 278, 352–356 (2003).
Shi, C. et al. Auto-dialabel: labeling dialogue data with unsupervised learning. 2018 Conference on Empirical Methods in Natural Language Processing (Emnlp 2018), 684–689 (2018).
Dhamija, A., Pandoi, D., Singh, K. & Malhotra, S. An improved K-means clustering with convolutional neural network for financial crisis prediction. In Advances in Computational Intelligence and Communication Technology (Springer Singapore, Singapore, 2022).
Glielmo, A. et al. Unsupervised learning methods for molecular simulation data. Chem. Rev. 121, 9722–9758 (2021).
Nainggolan, R., Perangin-angin, R., Simarmata, E. & Tarigan, A. F. Improved the performance of the K-means cluster using the sum of squared error (SSE) optimized by using the Elbow Method. J. Phys. Conf. Ser. 1361, 012015 (2019).
Akhanli, S. E. & Hennig, C. Comparing clusterings and numbers of clusters by aggregation of calibrated clustering validity indexes. Stat. Comput. 30, 1523–1544 (2020).
Wakefield, A. E., Mason, J. S., Vajda, S. & Keserű, G. M. Analysis of tractable allosteric sites in G protein-coupled receptors. Sci. Rep. 9, 6180 (2019).
Srivastava, A. et al. High-resolution structure of the human GPR40 receptor bound to allosteric agonist TAK-875. Nature 513, 124–127 (2014).
Oswald, C. et al. Intracellular allosteric antagonism of the CCR9 receptor. Nature 540, 462–465 (2016).
Jaeger, K. et al. Structural basis for allosteric ligand recognition in the human CC chemokine receptor 7. Cell 178, 1222–1230.e10 (2019).
Shao, Z. et al. Structure of an allosteric modulator bound to the CB1 cannabinoid receptor. Nat. Chem. Biol. 15, 1199–1205 (2019).
Shen, S. et al. Allosteric modulation of G protein-coupled receptor signaling. Front. Endocrinol. 14, https://doi.org/10.3389/fendo.2023.1137604 (2023).
Yuan, J. et al. In silico prediction and validation of CB2 allosteric binding sites to aid the design of allosteric modulators. Molecules 27, 453 (2022).
Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673–685 (2023).
De Vivo, M., Masetti, M., Bottegoni, G. & Cavalli, A. Role of molecular dynamics and related methods in drug discovery. J. Med. Chem. 59, 4035–4061 (2016).
Labbé, C. M. et al. MTiOpenScreen: a web server for structure-based virtual screening. Nucleic Acids Res. 43, W448–W454 (2015).
Maffucci, I., Hu, X., Fumagalli, V. & Contini, A. An efficient implementation of the Nwat-MMGBSA method to rescore docking results in medium-throughput virtual screenings. Front. Chem. 6, 43 (2018).
Xu, X. et al. Binding pathway determines norepinephrine selectivity for the human β1AR over β2AR. Cell Res. 31, 569–579 (2021).
Zhou, Q. et al. Common activation mechanism of class A GPCRs. eLife 8, e50279 (2019).
Latorraca, N. R., Venkatakrishnan, A. J. & Dror, R. O. GPCR Dynamics: Structures in motion. Chem. Rev. 117, 139–155 (2017).
VanWart, A. T., Eargle, J., Luthey-Schulten, Z. & Amaro, R. E. Exploring residue component contributions to dynamical network models of allostery. J. Chem. Theory Comput. 8, 2949–2961 (2012).
Filipek, S. Molecular switches in GPCRs. Curr. Opin. Struct. Biol. 55, 114–120 (2019).
Hauser, A. S. et al. GPCR activation mechanisms across classes and macro/microscales. Nat. Struct. Mol. Biol. 28, 879–888 (2021).
Slosky, L. M., Caron, M. G. & Barak, L. S. Biased allosteric modulators: New frontiers in GPCR drug discovery. Trends Pharmacol. Sci. 42, 283–299 (2021).
Ahn, S. et al. Allosteric “beta-blocker” isolated from a DNA-encoded small molecule library. Proc. Natl. Acad. Sci. USA 114, 1708 (2017).
Ippolito, M. et al. Identification of a β-arrestin-biased negative allosteric modulator for the β2-adrenergic receptor. Proc. Natl. Acad. Sci. USA 120, e2302668120 (2023).
Whitty, A. & Kumaravel, G. Between a rock and a hard place? Nat. Chem. Biol. 2, 112–118 (2006).
Wisler, J. W. et al. A unique mechanism of beta-blocker action: carvedilol stimulates beta-arrestin signaling. Proc. Natl. Acad. Sci. USA 104, 16657–16662 (2007).
Villalona-Calero, M. A. et al. A phase I and pharmacological study of protracted infusions of crisnatol mesylate in patients with solid malignancies. Clin. Cancer Res. 5, 3369–3378 (1999).
Cherezov, V. et al. High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318, 1258–1265 (2007).
Ring, A. M. et al. Adrenaline-activated structure of β2-adrenoceptor stabilized by an engineered nanobody. Nature 502, 575–579 (2013).
Webb, B. & Sali, A. Comparative protein structure modeling using MODELLER. Curr. Protoc. Bioinforma. 54, 5.6.1–5.6.37 (2016).
Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E. & Fox, D. J. Gaussian 09. Revision A.01. (Gaussian Inc, Wallingford, 2009).
Morris, G. M. et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791 (2009).
Anandakrishnan, R., Aguilar, B. & Onufriev, A. V. H++ 3.0: automating pK prediction and the preparation of biomolecular structures for atomistic molecular modeling and simulations. Nucleic Acids Res. 40, W537–W541 (2012).
Huang, J. & MacKerell, A. D. Jr CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. J. Comput. Chem. 34, 2135–2145 (2013).
Vanommeslaeghe, K. et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 31, 671–690 (2010).
Thakur, N. et al. Anionic phospholipids control mechanisms of GPCR-G protein recognition. Nat. Commun. 14, 794 (2023).
Chan, H. C. S. et al. Exploring a new ligand binding site of G protein-coupled receptors. Chem. Sci. 9, 6480–6489 (2018).
Lee, J. et al. CHARMM-GUI Input generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM simulations using the CHARMM36 additive force field. J. Chem. Theory Comput. 12, 405–413 (2016).
Miao, Y., Feher, V. A. & McCammon, J. A. Gaussian accelerated molecular dynamics: Unconstrained enhanced sampling and free energy calculation. J. Chem. Theory Comput. 11, 3584–3595 (2015).
Case, D. A., Betz, R. M., Cerutti, D. S., Cheatham, T. & Kollman, P. A. Amber 16. (University of California, San Francisco, 2016).
Li, C. et al. An interpretable convolutional neural network framework for analyzing molecular dynamics trajectories: A case study on functional states for G-protein-coupled receptors. J. Chem. Inf. Model. 62, 1399–1410 (2022).
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (Association for Computing Machinery, New York, NY, USA, 2016). https://doi.org/10.1145/2939672.2939778.
Huang, L. et al. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat. Commun. 15, 2657 (2024).
Lionta, E., Spyrou, G., Vassilatis, D. K. & Cournia, Z. Structure-based virtual screening for drug discovery: principles, applications and recent advances. Curr. Top. Med. Chem. 14, 1923–1938 (2014).
Wichapong, K. et al. Structure-based design of peptidic inhibitors of the interaction between CC chemokine ligand 5 (CCL5) and human neutrophil peptides 1 (HNP1). J. Med. Chem. 59, 4289–4301 (2016).
Lei, T. et al. Exploring the activation mechanism of a metabotropic glutamate receptor homodimer via molecular dynamics simulation. ACS Chem. Neurosci. 11, 133–145 (2020).
Hou, T., Wang, J., Li, Y. & Wang, W. Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations. J. Chem. Inf. Model. 51, 69–82 (2011).
Zhang, F. et al. Molecular insights into the allosteric coupling mechanism between an agonist and two different transducers for μ-opioid receptors. Phys. Chem. Chem. Phys. 24, 5282–5293 (2022).
Seeber, M., Cecchini, M., Rao, F., Settanni, G. & Caflisch, A. Wordom: a program for efficient analysis of molecular dynamics simulations. Bioinformatics 23, 2625–2627 (2007).
Feng, Y. et al. Mechanism of activation and biased signaling in complement receptor C5aR1. Cell Res. 33, 312–324 (2023).
Zhao, J. et al. Prospect of acromegaly therapy: molecular mechanism of clinical drugs octreotide and paltusotine. Nat. Commun. 14, 962 (2023).
Leach, K., Sexton, P. M. & Christopoulos, A. Allosteric GPCR modulators: taking advantage of permissive receptor pharmacology. Trends Pharm. Sci. 28, 382–389 (2007).
Zhao, C. et al. Biased allosteric activation of ketone body receptor HCAR2 suppresses inflammation. Mol. Cell 83, 3171–3187 (2023).
Chen, X. et al. Integrative Residue-intuitive Machine Learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. figshare, https://doi.org/10.6084/m9.figshare.26129632 (2024).
Chen, X. et al. Integrative Residue-intuitive Machine Learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. Zenodo, https://doi.org/10.5281/zenodo.13325067 (2024).
Acknowledgements
This project was supported by the Sichuan International Science and Technology Innovation Cooperation Project (Grant No. 24GJHZ0431 to X.P.). This work was supported by the National Natural Science Foundation of China (62475177 to X.P. and T2221004 to Z.S.), Frontiers Medical Center, Tianfu Jincheng Laboratory Foundation (TFJC2023010010 to Z.S.), 1.3.5 project for disciplines of excellence, West China Hospital, Sichuan University (ZYYC23022 to Z.S.).
Author information
Contributions
X.C., J.C., and X.P. conceived and designed the project; X.C., J.C., J.M., Y.S., and Y.L. contributed to computational methodology and analyses; K.W., C.W., and Z.S. contributed to experimental validation and analyses; Z.S. and X.P. supervised the project; X.C., K.W., J.C., C.W., Z.S., and X.P. discussed the results, and contributed to the manuscript preparation and revision. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Piia Bartos, Yinglong Miao, Jia Zhou, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, X., Wang, K., Chen, J. et al. Integrative residue-intuitive machine learning and MD Approach to Unveil Allosteric Site and Mechanism for β2AR. Nat Commun 15, 8130 (2024). https://doi.org/10.1038/s41467-024-52399-y
DOI: https://doi.org/10.1038/s41467-024-52399-y