Introduction

Exhibiting functional diversity of proteins usually requires a large number of sequence alterations through multiple evolutionary steps1,2. However, in the case of the guanylate kinase (GK) enzyme3,4,5, functional novelty is observed in a single (S→P) mutation, leading to the conversion of the protein from a phosphoryl transfer kinase into a phosphorprotein interaction domain6,7.

The GK enzyme is a broadly distributed nucleotide kinase and essential for cellular GMP recycling and nucleotide equilibration5. The protein is involved in a crucial intermediate step in RNA or DNA synthesis by catalyzing the phosphoryl group transfer from ATP to GMP3. According to the available X-ray crystallography structures of GK enzyme6,7,8,9, it can be structurally divided into three featured domains: the N-terminal GMP-binding domain (GBD), the C-terminal ATP -binding LID domain and the innermost CORE domain. In the apo form, the protein resembles a U-shape.Upon the substrate GMP binding to the binding site between the GBD and LID domain, large conformational changes would be induced, leading to the formation of a fully closed state, which is analogous to the closing motion of the fingers and the thumb7,10,11,12.

Recently, Johnston et.al7 demonstrated that a single mutation (S35P) of GK enzyme would fundamentally alter the protein function and switch it from a phosphoryl transfer kinase into a phosphoprotein interaction domain to regulate spindle orientation. In addition, the phylogenetic analysis suggested that the GK enzyme could evolve into a protein interaction domain, namely the guanylate kinase (GK) domain, with significantly high structural similarity and sequence identity13,14. The GK domain is one of the core modules of the scaffold protein Membrane Associated Guanylate Kinase (MAGUK), which is responsible for modulating cell-cell communication, spindle orientation and cellular signal transduction15,16,17,18,19,20,21,22. Functionally, the GK domain is able to interact with a variety of phospho-peptide ligands with high affinity but its binding ability to GMP is surprisingly low18. Previously, Olsen et.al10 reported that mutating the Ser35 of GK enzyme to Pro, a conversed proline residue at this position among all GK domains, drastically impairs the GMP binding affinity and significantly reduces the guanylate kinase activity. By comparing the difference between the crystal structures of GK enzyme6,7,8,9 and GK domain22, it can be observed that eight residues from the GK enzyme directly contact with the ligand and three residues (Ser35,Glu70 and Asp101) more closely coordinate to the guanine ring of GMP, while the other five residues remain invariant from the GK enzyme to the GK domain.

Thus, the experimental evidences mentioned above clearly highlight the critical functional role of Ser35 in the GK enzyme. In this work, in order to elucidate the underlying mechanism of its functional transition after introducing the proline at position 35 of GK enzyme, we carried out molecular dynamic (MD) simulations to investigate their dynamic behaviors. In consistence with experimental observations, our simulations demonstrate that the S→P mutation fundamentally changes the dynamic motion of the protein, leading to the dynamic behavior of the mutant more similar to the GK domain.

We further investigated the relationship between the conformational transitions and GMP binding in both WT and S35P GK by employing the bias-exchange metadynamics23. Our calculated free energy profiles (FEPs) suggest that the closed-open transition of WT GK is positive correlated with the process of GMP binding, indicating the GMP-induced closing motion of GK. In contrast, the S → P mutation raises a barrier for the closing motion and leads to the protein energetically favoring the open conformation even with the presence of GMP. Moreover, the FEPs further demonstrate that mutating Ser35 to Pro also results in the mis-binding of GMP in the binding site, explaining the vanishing of catalytic activity of the mutant.

Methods

System Preparations

The structure of the GMP-binding form of WT GK enzyme (PDBid:1ex7;186 amino acids)8 was used as the initial conformation for theoretical modeling. The apo form of WT GK enzyme was prepared by removing the GMP ligand from its GMP-binding form. The Mutator Plugin in the VMD program24 was utilized to mutate Ser to Pro at position 35 of GK enzyme for both the apo- and the GMP-binding forms. Accordingly, as shown in Figure 1, all of the starting structures have the identical backbone conformations except for the GK domain which is extracted from the MAGUK scaffold protein (PDBid:3uat)22.

Figure 1
figure 1

The superimpose of initial conformations of the GK enzymes (green:apoWT; cyan:apo S35P mutant; magenta:GMP-bound WT;red:WT-bound S35P mutant) and the GK domain (orange).

Standard MD Simulations

All MD simulations were performed using the Gromacs25,26 MD simulation engine with amber03 force field27. The MD protocol for all systems is given as follow. For each system, the GK enzyme was centered into a ~80 × 80 × 80 Å cube and then was dissolved in ~18000 TIP3 waters. Then, 0.1 M NaCl ions were added to neutralize the net charge of the whole system. The steepest descent algorithm was employed to minimize the whole system before it was gradually heated to 300 K with a increment of 20 K within 50000 steps. The leap-frog integrator was used to produce a 300 ns unbiased MD trajectory for each system with an integration time-step of 2 fs under NPT condition. The Parrinello-Rahman barostat28 was used to control the pressure at 1 bar with a coupling constant of 2ps. The modified Berendsen (V-rescale) thermostat29 was employed to control the temperature of the systems at 300 K with a time constant of 1 ps. The Particle Mesh Ewald method30 was used to compute the electrostatic interactions with a real-space cut-off distance of 1 nm. The same cutoff value was chosen for treating the van der Waals interactions. The SETTLE algorithm31 was used to constrain water molecules and all non-water bonds were constrained using the LINCS algorithm32.

Free Energy Profile

Metadynamics33,34,35,36,37,38,39,40 is a recently developed enhanced sampling technique used to explore the free energy profile (FEP) of a system along coarse-grained reaction coordinates, usually a few collective variables (CVs). In this work, according to the analysis of the RMSF results for essential dynamic, a collective variable (CV1) was defined as the distance between the Cα atoms of Gly43 and Thr137 and another collective variable (CV2) was considered as the distance between the carbonyl oxygen of GMP and the side chain of residue 35 in WT GK (the hydroxyl group of Ser35) and in S35P GK (the side chain of Pro35), respectively. The method was depicted as “filling the free energy wells with computational sand” which exert a positive user-defined Gaussian potential intermittently to the real-energy landscape of the system in a chronological order. Thus during the evolution of the simulation, more and more Gaussians were summed up and not updated afterwards until the system explored the full energy landscape. In the long time limit, the bias potential converges to minus the free energy.

In this work, we carried out the bias-exchange (BE) matadynamics34,35 in Gromacs/PLUMED41,42 to obtain the 2D free energy profiles. In BE matadynamics, several replicas run in parallel and the configurations between pairs of replicas are allowed to swap at a fixed time interval. We chose the CV1 and CV2 in replica 1 and allow the exchange between configurations with the unbiased system (replica 2) using an exchange rate of 120 ps. On average, the exchange acceptance ratio was ~27% between the two replicas. The MD protocol for the BE matadynamics were applied similarly as the standard MD simulations mentioned above. The Gaussian widths σ were set to 0.1 for CV1 and 0.01for CV2 with the height of W = 0.1 kJ/mol added by every 20ps. Well-tempered metadynamics43 were also employed to avoid overfilling with BIASFACTOR = 10 at TEMP = 300 K. The convergence of FEP in each basin is estimated as a function of simulation time.

Results and Discussion

Computational investigations of protein conformational changes and ligand binding events have become indispensable tools to provide the atomic description of protein functions for the experimental observations44,45,46,47. However, sampling large-scale conformational space of the protein dynamic and acquiring the energetic profile of a ligand binding process in standard molecular dynamic (MD) simulations are still challenging tasks. In this work, by employing standard MD simulations and metadynamics, we provide a detailed description of the conformational transitions and ligand binding process for the WT GK enzyme and the S35P mutant to illustrate the mechanism of its functional transitions. To compare the dynamical motions between the WT and the S35P GK, we first conduct standard MD simulations for both of their ligand-free and ligand-bound forms. An additional simulation of the GK domain is also performed as a comparison. Then we utilize the principal component analysis (PCA) to obtain the slowest motions for each individual system from MD trajectories, which are further used to determine the reaction coordinate of conformational transition in the free energy calculations. Finally, we use metadynamics to compare the free energy profiles (FEPs) of conformational transitions and ligand binding process after replacing Ser35 by Pro in GK enzyme.

The analysis of MD simulation trajectories

After performing the standard MD simulations of the five systems, we first analyzed each 300 ns MD trajectory using the principal component analysis. We find that it is sufficient to use the first two modes to describe the conformational changes of each system. In Figure 2, we depict the C-alpha RMSF contributed by the first two eigenvectors. As can be seen, the largest fluctuations during the transition between the closed and open conformations are observed in the hinge region between the beta3 and beta4 of the GBD domain (residues 40–46) with a peak at Gly43 and the loop region between the alpha5 and alpha6 of the LID domain (residues 135–141) with a peak at Thr137, respectively. Therefore, it is reasonable to define the distance between the Cα atoms of Gly43 and Thr137 as a collective variable (CV1) to describe the conformational changes between the GBD and LID domain of GK in our following metadynamics simulations. Note that the two positions determined by PCA were also used as labels to monitor the conformational transitions of GK in the fluorescence quenching the measurements performed by Johnston et.al7. Thus, our MD simulations can provide a direct comparison to the experimental data.

Figure 2
figure 2

The root mean square fluctuations (RMSF) contributed by first two eigenvectors of the MD trajectories.

In addition, we have analyzed the distance profiles between the Cα atoms of Gly43 and Thr137 as well as their distance distribution. The results are shown in Figure 3 and discussed in the following section.

Figure 3
figure 3

The distance profiles between the Cα atoms of Gly43 and Thr137 along the MD trajectory as well as the distributions.

(A)the apo WT; (B)the GMP-bound WT; (C)the apo S35P mutant; (D) the GMP-bound S35P; (E)the GK domain and (F)the backbone RMSD.

Dynamic behaviors obtained by standard MD simulations

The apo GK exhibits the partially closed conformation

Since the initial conformation of the apo GK is prepared by removing the ligand from its GMP-bound state, we find that the apo GK enzyme experiences large fluctuations in the first 250 ns of our simulation. As displayed in Figure 3, the distance distribution between the GBD and LID domain of apo GK is the broadest compared to other systems, indicating a large sampled conformational space, which fluctuates from 1.5 nm to 4.5 nm. After 250 ns, the protein reaches a stable state with a distance of around 2.5 nm between the GBD and LID domain, corresponding to the formation of the partially closed conformation. After closely inspecting these structures, we find that a hydrogen bond, which is essential to maintain the partially closed conformation, is formed between Ser35 and Asp99, implicating a critical role of Ser35 (Figure 4).

Figure 4
figure 4

Snapshots of the stable conformations randomly selected from the last 50 ns MD trajectory.

(A) The partially closed apo WT GK. The insert indicates the key interaction between Ser35 and Asp99 in maintain the partially closed conformation; (B) The open GMP-bound S35P GK. The insert indicates no hydrogen-bond formed between Pro35 and GMP; (C) the fully closed GMP-bound WT GK. The insert shows the hydrogen-bond interactions between GMP and the WT GK.

The GMP-bound GK maintains the fully closed state

In the presence of GMP, the inter-domain motions of GK enzyme are significantly restricted. The protein experiences only small fluctuations with backbone atoms RMSD around 0.2 nm during the whole simulation (Figure 5). The distribution of the distance between the GBD and LID domain also spans a small range from 0.8 nm to 2.5 nm (Figure 3). After 250 ns, the protein is stabilized in a fully closed state. Notably, the stable conformations obtained by our simulation share several key characteristics with the structures obtained by X-ray crystallography6,7,8,9 including: the phosphate group of GMP is stabilized by Tyr51, Tyr79, Arg39 and Arg42;the sugar hydroxyl group interacts with Lys15 and Asp99; the guanine carbonyl oxygen is hydrogen bonded by Ser35; the guanine amino group is interacted by Glu70. To have a better illustration of these interactions, we depict the detailed hydrogen-bond pattern in Figure 4. As can be seen from that figure, the presence of GMP interrupts the hydrogen-bond interaction between the Ser35 and Asp99 in the apo form and, in turn, the two residues change their roles in modulating the ligand specificity and affinity.

Figure 5
figure 5

(A) The two-dimensional free energy profile (FEP) of the closed-open conformational transitions vs the GMP disassociation of WT GK. The separation between the contours is 1 kcal/mol. The four basins (A–D) represent the lowest four free energy minima of the FEP; (B–E)The conformations representative of the interactions between the residues and GMP of each basin; (F)The convergence of the four basins along the simulation. The free energy of each basin is converged after 300 ns.

Open conformations are more favorable in both apo and holo forms of S35P GK

Experimental data suggest that the Ser35→Pro mutation alters the GK function by changing its dynamic behavior. Therefore, in order to monitor the conformational transitions, we performed MD simulations of S35P GK by using the identical closed conformations in both ligand-free and ligand-bound forms (Figure 1). As expected, our simulations show that the dynamic behavior of the S35P GK is quite different from that of the WT GK. In the case of the GMP-bound S35P GK, we observe that the closed conformation is only maintained in the first 50 ns. However, due to the lack of interaction between the proline and the guanine carbonyl oxygen of GMP, the ligand is extremely unstable in the binding site. After 60 ns, the open conformations are achieved in both apo and holo forms of S35P GK with a large cleft between the GBD and LID domain. In the open state of holo form, the phosphate group of GMP may still interacts with the Arg39 and Arg42 located in the GBD domain for a long period of MD simulation time, which is in agreement with NMR observation that the GMP-induced chemical shift in the S35P GK could only be identified in the region of GBD domain5. During the rest of our simulations, the S35P mutant exhibits a high stability in the open conformation. At end of our simulation, we find that S35P GK loses the ability of ligand binding and the GMP escapes into the bulk solvent eventually.

The GK domain demonstrates an open state like the S35P GK

As we mentioned above, mutating Ser35 to Pro in GK would intrinsically alter the dynamic behavior of the protein and lead to the open conformation more favorable regardless of the presence or absence of GMP. Then, we conduct an additional simulation of GK domain to identify whether the S35P mutant and the GK domain share similar conformational dynamics. Indeed, we find that the protein dynamics of the S35P GK and the GK domain are quite similar in their open conformations. A large cleft between the GBD and LID domain is maintained during the whole simulation in the GK domain, with the distance around 4.0 nm. In addition, the protein demonstrates a high stability with the backbone RMSD fluctuating only around 0.2 nm (Figure 3). All these dynamic features of GK domain are in line with its functional role that 1) the conformational rigidity of GK domain is essential to function as a module of scaffold protein, which provides the physical constraints for efficient signal transduction and 2) the stable open conformation is required to provide a large protein-binding surface for the binding of the peptide ligands.

Thus, our MD results can provide an explanation in support of the experimental evidence that the S → P mutation prevents the closing motions of GK and then allows the protein to function as a protein interaction domain5. Furthermore, the S → P mutation diminishes the GMP binding ability of GK Consequently, the enzymatic activity of the mutant is significantly impaired6.

Reconstructing the free energy profiles using metadynamics

To further investigate the relationship between the conformational transition and the ligand binding properties in both WT and S35P GK, we employ metadynamics to reconstruct the two-dimensional free energy profiles (FEPs) along the reaction coordinates of CV1 and CV2. Our FEP results clearly reveal that, in the WT GK, the ligand binding process occurs concurrently with the conformational transitions, which is in line with the experimental observation that the closing motions of WT GK are induced by the binding of GMP7. As shown in Figure 5,the lowest free energy well is identified in a narrow region with CV1 ranging from 2.0 nm to 3.1 nm and CV2 less than 0.3 nm. In addition, the well can be further divided into two parts, corresponding to the fully closed (CV1 ≈ 2.2 nm) and the partially closed (CV1 ≈ 2.8 nm) states, separated by a small energy barrier less than 1 kcal/mol. The second lowest free energy well is identified in a broad region on the free energy surface ranging from 3.2 to 4.0 nm of CV1 and 0.48 to 0.80 nm of CV2, respectively, which corresponds to the process of conformational transitions between the partially closed to the open state and the ligand binding event. It is interesting to note that, as the conformation changes from the partially closed to the open state, there is a strong positive correlation between CV1 and CV2. This result provides a clear picture for the GMP-induced closing motions of WT GK.

After performing more analysis of the MD trajectories, we find that the second lowest minimum reflects the hydrogen-bond interactions between the phosphate group of GMP and the residue Arg136 located in the LID domain in the open conformations (Figure 5), suggesting the role of Arg136 in phosphate recognition. However, the FEP shows that the open state (basin D) is less energetically favorable than the closed state (basin A) in GMP binding with ΔG = ~4 kcal/mol (Figure 5).

Accordingly, our FEP calculations suggest that the binding process of GMP in WT GK occurs in two successive steps: 1) in the open state, the attraction of GMP is made by the interactions between the phosphate group of GMP and the Arg136 located in the LID domain which induces a trap for the ligand in the cleft between the GBD and LID domain; 2) the hydroxyl group of Ser35 tends to interact with the carbonyl oxygen of GMP via hydrogen-bond. Once the connection is formed between the two groups, the closing motion of GK occurs and the protein equilibrates between the partially closed and fully closed states. Meanwhile, the phosphate group of GMP translocates into a stronger binding site formed by Arg39 and Arg42 located in the GBD domain (Figure 5).

On the other hand, in the case of S35P GK, we find that the conformational transitions are not coupled with the process of GMP binding. As shown in Figure 6, the pathways of the two events are perpendicular to each other, indicating the two processes occur independently. In the conformational transition pathway, the protein undergoes no significant energy barriers (<1 kcal/mol) from 2.5 nm to 4.3 nm of CV1 even in the presence of GMP at the binding site (CV2 ≈ 0.4 nm). Compared with the WT GK, we observe that the lowest free energy well shifts as much as 10Å toward the open conformation in the S35P GK, suggesting that the closing motions of S35P GK are energetically unfavorable. The ligand dissociation from the binding site only occurs in the open state of S35P GK at CV1 ≈ 3.8 nm. After crossing an energy barrier of ~5 kcal/mol, the mutant reaches a metastable free energy minimum at CV1 = 3.8 nm and CV2 = 0.8 nm (Figure 6). In this open conformation, the phosphate group of GMP may still interact with two arginine residues Arg39 and Arg42 located in the GBD domain. This observation is in consistent with the results of our standard MD simulation and the experimental data7 mentioned above. In addition, the FEP shows that there is another deeper free energy well at CV1 ≈ 2.5 nm and CV2 ≈ 0.85 nm in the S35P GK, corresponding to the basin D in Figure 6. Tracing back to the MD trajectories, we find that this free energy minimum represents a mis-binding pattern in the closed conformation. In this state, the S35P GK is not able to recognize the ligand correctly because of the lack of the hydrogen-bond interaction between Pro35 and GMP. Instead, a hydrogen-bond is formed between the phenolic hydroxy group of Tyr51 and the guanine carbonyl oxygen of GMP, which results in a flip of the guanine group of GMP in the binding site. Furthermore, the binding of the phosphate group of GMP is also affected that the phosphate group binds to the two arginine residues Arg136 and Arg147 located in the LID domain instead of the arginines in the GBD domain. Taken together, the calculated FEPs further emphasize the critical role of Ser35 in ligand recognition and provide an explanation to the loss of the enzyme activity in the S35P mutant.

Figure 6
figure 6

(A) The two-dimensional free energy profile (FEP) of the closed-open conformational transitions vs the GMP disassociation of S35P GK. The separation between the contours is 1 kcal/mol; The four basins (A–D) represent the lowest four free energy minima of the FEP; (B–E)The conformations representative of the interactions between the residues and GMP of each basin; (F)The convergence of the four basins along the simulation. The free energy of each basin is converged after 350 ns.

Conclusion

In this work, we carried out molecular dynamic simulations to investigate the underlying mechanism of conformational transitions after replacing serine35 with proline in GK enzyme. Our MD simulations demonstrated that the single mutation intrinsically alters the dynamic behavior of the protein. In the apo GK, the Ser35 plays a key role in maintaining the partially closed conformation via the hydrogen-bond interactions with Asp99. In the presence of GMP, the Ser35 is critical for modulating the ligand specificity and affinity.

On the contrary, the S35P GK exhibits stable open conformations regardless of the presence or absence of GMP. The cleft between the GBD and the LID domain in S35P GK is as large as that exists in the GK domain, explaining the functional change of S35P GK to the GK domain.

To investigate the relationship between conformational transitions and ligand binding process in both WT and S35P GK, we further use metadynamics to calculate the free energy profiles along the two corresponding reaction coordinates. Our FEPs demonstrate that the closed-open conformational transition of WT GK is positively correlated with the GMP binding process, which strongly supports the experimental observation that the closing motion of GK is induced by the binding of GMP. In contrast, the S35P GK energetically favors the open conformation even with the presence of GMP. In addition, we also identify the mis-binding pattern in the S35P GK because of the lack of the hydrogen-bond interaction between proline 35 and GMP, implying the essential role of Ser35 in ligand recognition.