Abstract
The stereoselectivity of enzymes plays a central role in asymmetric biocatalytic reactions, but there remains a dearth of evolution-driven biochemistry studies investigating the evolutionary trajectory of this vital property. Imine reductases (IREDs) are one such enzyme family with excellent stereoselectivity, and stereocomplementary members are pervasive in the family. However, the regulatory mechanism behind stereocomplementarity remains cryptic. Herein, we reconstruct a panel of active ancestral IREDs and trace the evolution of stereoselectivity from ancestors to extant IREDs. Combining this with coevolution analysis, we reveal six historical mutations capable of recapitulating stereoselectivity evolution. An investigation of the mechanism with X-ray crystallography shows that they collectively reshape the substrate-binding pocket to regulate stereoselectivity inversion. In addition, we construct an empirical fitness landscape and discover that epistasis is prevalent in stereoselectivity evolution. Our findings emphasize the power of ancestral sequence reconstruction (ASR) in circumventing the time-consuming screening of large-scale mutagenesis libraries for identifying mutations that change functions, and support, from a molecular perspective, the Darwinian premise that the evolution of biological functions is a stepwise process.
Introduction
Stereoselectivity of enzymes, the ability to determine which enantiomer dominates in the achiral-to-chiral transformation, is critical for life and asymmetric biotransformation1,2. For example, L-type amino acids comprise almost all proteins on earth, and D-saccharides constitute a wide variety of sugar compounds. Enzymatic catalysis is essential to endow molecules with chirality during their de novo synthesis in vivo. Hence, stereoselectivity is a crucial enzymatic property for ensuring enantiomerically pure biomolecules. Furthermore, the stereoselectivity of enzymes plays a significant role in asymmetric organic reactions3. Unfortunately, naturally occurring enzymes are not always able to achieve excellent enantiomeric excess (ee) or desired stereoselectivity in asymmetric catalysis, but directed evolution can be employed to improve or even reverse stereoselectivity4,5. However, undefined molecular mechanisms often require labor-intensive experiments to identify key mutations that manipulate stereoselectivity. Therefore, illuminating the mechanism behind the stereoselective evolution of enzymes is highly desirable.
A variety of strategies have been proposed to reverse enzyme stereoselectivity based on horizontal biochemical analysis, such as the combinatorial active-site saturation test (CAST) and the mirror-image strategy5,6. More recently, a completely mechanism-guided method was successfully applied to invert the stereoselectivity of cytochrome P411 (ref. 7). Although stereoselectivity inversion has been widely studied in the protein engineering field, the evolutionary trajectory of stereoselectivity is not well explored from an evolutionary biochemistry perspective. Ancestral sequence reconstruction (ASR) is a specific and highly efficient approach for interrogating the evolution of protein functions in evolutionary biochemistry8,9,10,11. However, only one example of ASR has been reported to date for studying stereoselectivity evolution12. By experimentally characterizing ancestral enzymes located along the evolutionary trajectory of functional changes, detailed insights into biophysical mechanisms may be gained that are often inaccessible using biochemistry alone13,14,15. Compared with the above-mentioned strategies, ASR is better able to identify the crucial residues that switch enzyme function. It can directly pinpoint the specific amino acid types in key sites without performing saturation mutagenesis16,17. Additionally, epistasis, the phenomenon in which the effect of a mutation on function substantially depends on other mutations, often complicates the study of sequence-structure-function relationships8,9. Empirical evidence has shown that ASR can help reduce the effect of epistasis by reintroducing historical substitutions into a genetic background that is identical or very similar to the one in which they originally occurred9,11. Therefore, key residues can be swapped between ancestral sequences with different stereoselectivity to investigate the corresponding regulatory mechanism of stereochemistry.
Imine reductases (IREDs) are a class of NAD(P)H-dependent oxidoreductases that catalyze the reduction of imines and the reductive amination of carbonyl compounds and amines to their corresponding amines18. Numerous stereocomplementary IREDs have been identified and used to synthesize a variety of amines19,20,21. In addition, a range of studies on expanding substrate scope, exploring new reactions, and engineering enzymes for industrial applications have been performed22,23,24,25,26. However, there have been no explicit experimental insights regarding the mechanism underlying stereocomplementarity. Based on previous empirical biochemical data and horizontal multiple sequence alignment, a highly conserved aspartate or tyrosine residue located at standard position 187 (residue numbering refers to PDB ID 3ZHB) appears to be responsible for opposite stereoselectivity in the IRED family27. However, this hypothesis has not been validated within extant enzymes by direct residue swapping experiments. This is in part because epistasis generally does not permit residue interchange in extant enzyme sequences. This weakness of horizontal research can be circumvented by conducting residue swapping experiments on an ancestral sequence. More importantly, much empirical evidence has demonstrated that ASR is able to reveal the historical mutations that led to functional divergence between two protein subfamilies with different functions28,29,30,31.
In the present work, we sought to identify the historical substitutions regulating the stereoselectivity of IREDs using ASR and unveil the evolutionary mechanism of stereoselectivity. Six historical mutations leading to the stereoselectivity reversion were identified via coevolution analysis. We then used X-ray crystallography to determine protein structures and gain structural insights into the mechanisms by which the six historical mutations reverse stereoselectivity. We also constructed an empirical fitness landscape consisting of the six mutations to uncover the role of epistasis in the evolutionary process that reverses IRED stereoselectivity. Through this work, we demonstrate that ASR is an efficient strategy for identifying mutations regulating stereoselectivity, providing a method for the directed evolution of stereoselectivity in enzymes.
Results
Reconstruction and characterization of ancestral IREDs
Initially, we performed multiple sequence alignment and phylogenetic analyses on 1408 IREDs from a public IRED sequence database (Imine Reductase Engineering Database version 3; https://www.ired.biocatnet.de) and 18 outgroup sequences, and all maximum likelihood ancestral IRED amino acid sequences in this study were inferred based on the phylogenetic tree (Supplementary Fig. 1). Interestingly, we found that all Y-type sequences (those with Tyr at standard position 187, corresponding to position 170 in ancestral sequences) form a relatively late-branching clade in the phylogenetic tree (Supplementary Fig. 1). By contrast, D-type sequences (those with Asp at standard position 187) are dispersed throughout the whole phylogenetic tree, and many are close to the last common ancestor (LCA, N1). We therefore speculated that the highly conserved Asp located at standard position 187 is probably an ancestral trait in the IRED family, while Tyr is its derived genotype.
To verify this hypothesis, we first inferred the maximum likelihood amino acid sequence of the LCA. Sequence analysis indicated that residue 170 of the LCA is Asp, placing it among the D-type sequences of the IRED family and confirming the above hypothesis (Supplementary Fig. 2). Subsequently, we assessed the stereoselectivity of N1 toward model substrate 1a using chiral gas chromatography. The results showed that N1 displayed excellent (S)-stereoselectivity toward 1a with >99.9% ee (Fig. 1b and Supplementary Fig. 3).
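For reference, enantiomeric excess is conventionally computed from the integrated peak areas of the two enantiomers in a chiral chromatogram as ee = |R − S| / (R + S) × 100%. The following Python sketch illustrates that arithmetic only; the function name and the peak areas are hypothetical and are not taken from the chromatograms of this study.

```python
def enantiomeric_excess(area_r: float, area_s: float) -> tuple[float, str]:
    """Return (ee in percent, major enantiomer) from two integrated peak areas."""
    total = area_r + area_s
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    ee = abs(area_r - area_s) / total * 100.0
    if area_r == area_s:
        major = "racemic"
    else:
        major = "R" if area_r > area_s else "S"
    return ee, major


# Hypothetical peak areas (arbitrary units), for illustration only.
ee, major = enantiomeric_excess(area_r=70.5, area_s=29.5)
print(f"{ee:.1f}% ee ({major})")
```

With these illustrative areas the R-enantiomer accounts for 70.5% of the product, which corresponds to an ee of about 41% in favor of R.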
Fig. 1. a Enzymatic asymmetric reduction of 1a used to investigate stereoselectivity evolution of IREDs. b Gas chromatogram profiles displaying the stereoselectivity of the last common ancestor N1 toward 1a. c Simplified phylogenetic tree showing the target evolution trajectory from N1 to N560. NX (X = 1–5 and 557–560) denotes the ancestral enzyme at node X. Light purple and yellow solid circles represent the percentage of S- and R-enantiomers produced by IREDs, respectively. Outgroup denotes ornithine cyclodeaminase sequences. d Enantiomeric excess (ee) of (R)-2a or (S)-2a generated by the ancestors within the focused trajectory and two extant imine reductases, SIR from Streptomyces sp. GF3546 and ScIR from Streptomyces clavuligerus. The ee of N1–N560 was calculated from the average of triplicate experiments (n = 3). The data are presented as mean values ± SEM. The ee values of SIR and ScIR were taken from our previous study20. Light purple and yellow bars represent the percentage of S- and R-enantiomers produced by IREDs, respectively.
Evolvability of stereoselectivity in IREDs
Our group previously described eight IREDs that catalyze the reduction of 1a with excellent stereoselectivity, including two enzymes with R-stereoselectivity that belong to the Y-type clade20. Based on these previous results and phylogenetic analysis, we speculated that N560 (the most recent common ancestor of Y-type IREDs) would asymmetrically reduce 1a to generate (R)-2a. To validate this speculation, we experimentally resurrected N560 and characterized its stereoselectivity toward 1a. As expected, N560 displayed R-stereoselectivity, yielding (R)-2a with 41% ee (Fig. 1d and Supplementary Fig. 5). To investigate how stereoselectivity evolves from an ancestor with S-stereoselectivity to descendants with R-stereoselectivity, we narrowed our analysis to the specific trajectory from N1 to N560 (Fig. 1c). We then experimentally resurrected the intermediate ancestors (N2–N559) in this evolutionary pathway and biochemically characterized their stereoselectivity toward 1a. The results demonstrated that the ancestors N1–N559, which belong to the D-type sequences, exhibit excellent S-stereoselectivity (>90% ee, Fig. 1d and Supplementary Figs. 2–5). Importantly, we observed that stereoselectivity inversion occurs in the evolutionary interval between N559 and N560. Moreover, the substitution of residue 170 from D to Y also occurs in this interval. We also investigated the evolution pattern of stereoselectivity in alternative ancestral sequences (AltN1–AltN5 and AltN557–AltN560) to support the robustness of the reconstruction (Supplementary Fig. 7). Consistent with our results for the maximum likelihood ancestors, stereoselectivity inversion occurred in the evolutionary interval between AltN559 and AltN560 (Supplementary Fig. 7). Finally, we measured the specific activity of the ancestral enzymes and their corresponding alternative versions (Supplementary Table 2).
All ancestral enzymes were expressed in soluble form, which enabled the above-mentioned experiments to be successful (Supplementary Fig. 8). These results demonstrated that the stereoselectivity of IREDs is a highly evolvable trait within the targeted trajectory. Therefore, the historical mutations responsible for the evolution of stereoselectivity can be determined.
Identification of historical mutations through coevolution analysis
To identify the historical substitutions reversing stereoselectivity in the evolutionary interval between N559 and N560, we chose N559 as the target object for subsequent protein engineering. We first investigated whether the single mutation from Asp to Tyr at position 170 of N559 inverts its stereoselectivity. The results showed that the D170Y mutation could not reverse the stereoselectivity preference of N559, but it significantly decreased the ee of (S)-2a from 93% to 25% (Supplementary Fig. 9). This indicates that position 170 plays a vital role in regulating stereoselectivity. Furthermore, this also implied that additional historical mutations are necessary to invert stereoselectivity.
Besides in silico protein structure prediction, coevolution analysis is also used to identify important mutations impacting protein function32,33. To unveil the additional historical mutations, we performed coevolution analysis on ancestral sequences (N1–N560) and eight extant enzyme sequences using ibis2analyzer34. Based on this analysis, mutations co-evolving with the important substitution D170Y were determined. The results showed that inversion from S- to R-stereoselectivity is associated with the six mutations L19M, C67S, T94S, A119G, I120V, and D170Y (Fig. 2a, right). Subsequently, we introduced the six substitutions into N559 to generate mutant N559-M6. As expected, the stereoselectivity of N559-M6 was successfully inverted to the R-configuration from the S-configuration, generating (R)-2a with 67% ee (Fig. 2b and Supplementary Fig. 6). Furthermore, we also conducted back-mutations of the same six sites in N560 to generate mutant N560-M6, and stereoselectivity was also inverted from 41% eeR to 73% eeS (Fig. 2b and Supplementary Fig. 6). These protein engineering results demonstrated that these coevolution mutations played a crucial regulatory role in the evolution of IRED stereoselectivity.
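The substitution labels used here follow the usual "wild-type residue, position, new residue" notation, so the back-mutations introduced into N560 are simply the forward labels with the two residues swapped. The Python sketch below illustrates that bookkeeping; the helper names and the placeholder sequence are hypothetical, and the toy sequence is not the N559 or N560 sequence.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str
    position: int  # 1-based residue number
    mutant: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> Mutation:
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))


def reverse_mutation(label: str) -> str:
    """Swap the wild-type and mutant residues, e.g. D170Y -> Y170D."""
    mut = parse_mutation(label)
    return f"{mut.mutant}{mut.position}{mut.wild_type}"


def apply_mutations(sequence: str, labels: list[str]) -> str:
    """Apply substitutions, checking that each expected residue is present."""
    residues = list(sequence)
    for label in labels:
        mut = parse_mutation(label)
        if residues[mut.position - 1] != mut.wild_type:
            raise ValueError(f"{label}: residue {mut.position} is not {mut.wild_type}")
        residues[mut.position - 1] = mut.mutant
    return "".join(residues)


FORWARD = ["L19M", "C67S", "T94S", "A119G", "I120V", "D170Y"]
BACKWARD = [reverse_mutation(label) for label in FORWARD]
print(BACKWARD)

# Placeholder sequence: "X" everywhere except the six sites of interest.
toy = ["X"] * 170
for label in FORWARD:
    mut = parse_mutation(label)
    toy[mut.position - 1] = mut.wild_type
toy_parent = "".join(toy)
toy_m6 = apply_mutations(toy_parent, FORWARD)
```

Checking the expected residue before substituting guards against off-by-one numbering errors, which are easy to make when ancestral and standard position numbering differ (170 versus 187 in this family).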
Fig. 2. a Simplified phylogenetic tree showing the distribution of ancestral and eight extant sequences (left), and six coevolution mutations identified via ibis2Analyzer (right). Light purple and yellow represent the S- and R-stereoselectivity, respectively. The extant enzymes originate from Amycolatopsis azurea (AaIR), Amycolatopsis decaplanina (AdIR), Streptomyces viridochromogenes (SvIR), Saccharothrix espanaensis (SeIR), Streptomyces turgidiscabies (StIR), Streptomyces sp. GF3587 (RIR). The ee values of all extant IREDs were taken from our previous study20. b Enantiomeric excess (ee) of 2a produced by the imine reduction using N559, N559-M6, N560 and N560-M6. N559-M6, a variant of N559 with mutations L19M, C67S, T94S, A119G, I120V, and D170Y. N560-M6, a variant of N560 with mutations M19L, S67C, S94T, G119A, V120I, and Y170D. c Comparison of the stereoselectivity of N559 and its variant N559-M6 toward other substrates 1b−1f. Numbers above histograms represent ee of the product. The ee of product from each enzyme was calculated from the average of triplicate experiments (n = 3). The data are presented as mean values ± SEM. Light purple and yellow bars represent the percentage of S- and R-enantiomers produced by IREDs, respectively.
To further verify the role of the six coevolution mutations in stereoselectivity regulation, more imine substrates (1b–1f) were used for biotransformation. We found that N559 possesses opposite stereoselectivity to N559-M6 toward the five substrates (Fig. 2c, Supplementary Fig. 10 and Supplementary Table 3). The results indicated that the stereoselectivity inversion caused by the six historical substitutions was an intrinsic property of enzymes during evolution.
Structural basis for the stereoselectivity evolution of IREDs
To examine the role of the six historical mutations that reversed stereoselectivity from a structural perspective, we determined the crystal structures of N559 (PDB: 8JKU) and N559-M6 (PDB: 8HWY) complexed with NADP+ at resolutions of 2.57 Å and 2.32 Å, respectively (Supplementary Fig. 11a–d). Similar to the extant enzymes, ancestral IREDs were homodimers with an unusual reciprocal domain-sharing arrangement, with the active site located at the interface formed by the two monomers (Fig. 3a, left). In the structure of N559-M6, two different binding conformations of NADP+ were observed. As shown in Supplementary Fig. 11d, e, the first conformation was consistent with the canonical NADP+ conformation in the structure of N559, whereas the second adopted an upturned conformation with almost no space left for binding substrate next to the nicotinamide ring of the cofactor. In addition, a significant conformational change was observed between N559 and N559-M6; the loop containing the T94S mutation in N559-M6 moved by 3.8 Å compared to N559 (Supplementary Fig. 12), and the resulting open space allowed the nicotinamide ring of NADP+ in N559-M6 to flip. Due to the limited space for substrate binding in the second conformation, we predicted that this upturned conformation is catalytically inactive. Therefore, the first conformation of NADP+ in N559-M6 was used for subsequent structural analysis.
Fig. 3. a Differences in the overall protein conformation caused by the six historical mutations. The protein structure of N559 is shown on the left with the two monomers colored green and deep teal. Structural superposition of N559 and N559-M6 is shown on the right with N559-M6 colored pink. The shift in the Rossmann fold region is indicated by a black arrow. b Distribution of the six historical substitutions around the NADP+-binding site of N559-M6. Mutated residues in N559-M6 are shown in stick representation. c Sectional view of the substrate-binding pockets of N559 (left) and N559-M6 (right) accommodating NADP+.
During the reaction catalyzed by IREDs, a hydride is transferred from NADPH to the imine substrate. Accordingly, NADP+ is located in the center of the catalytic pocket of IREDs. To facilitate a precise comparison of catalytic site structures between N559 and N559-M6, we superposed the NADP+ molecules from the two crystal structures and observed a significant conformational change in the Rossmann fold region (Fig. 3a, right). This conformational change further reduced the space in the catalytic pocket of N559-M6 (Fig. 3c). As shown in Fig. 3b, five out of six mutations were located in three regions around the NADP+ binding pocket, whereas only one mutation, D170Y, was located in close proximity to the predicted substrate binding site next to the nicotinamide ring of NADP+. Since Tyr has a larger side chain than Asp, D170Y further reduced the volume of the catalytic pocket (Fig. 3c).
To understand how the reduced space in the catalytic pocket reversed the stereoselectivity of N559-M6, we conducted molecular docking analysis. As shown in Fig. 4, the locations of the two ring structures of the substrate were exchanged between the binding models of N559 and N559-M6. The D170Y mutation in N559-M6 caused steric hindrance with the substrate-binding conformation of N559, which led to the adoption of a new substrate-binding conformation in N559-M6 (Fig. 4b). To quantify the change in volume, we used the DoGSiteScorer tool on the ProteinPlus server (https://proteins.plus/) to measure the corresponding volume of the NADP+ binding pocket35. The volume decreased by 149.5 Å³ in N559-M6 (N559: 979.64 Å³, N559-M6: 830.14 Å³). Moreover, the substrate can form a hydrogen bond with Y170 in N559-M6, which stabilizes the substrate in this binding mode (Fig. 4b). As a result, these changes contribute to the inversion of the substrate-binding conformation in N559-M6 from pro-S to pro-R (Fig. 4).
Fig. 4. Stereoview of the molecular docking results for substrate 1a in the active pocket of N559 (a) and N559-M6 (b). The active pocket is displayed in surface representation. The docked 1a, cofactor NADP+, and residue 170 (D170 in N559 and Y170 in N559-M6) are shown as sticks. The red dashed line denotes the formation of a hydrogen bond between substrate 1a and Y170. The black arrows indicate the transfer of hydride from C4 of the nicotinamide ring of NADP+ to the carbon atom of the imine bond in 1a.
Empirical fitness landscape of stereoselectivity
Our investigation of the six historical substitutions led to two additional pertinent questions. Do the six mutations represent the minimum number required for stereoselectivity inversion? Does epistasis play a role in the formation of novel stereoselectivity preferences, and if so, how does it work? To answer these questions, we constructed an empirical fitness landscape with 64 (2⁶) genotypes and 720 (6!) potential evolutionary paths using stereoselectivity as a proxy for fitness (Fig. 5). We quantitatively defined the fitness as the difference in ee between N559 and each intermediate. Specifically, S- and R-stereoselectivity were denoted by positive and negative signs, respectively (Supplementary Table 5); a larger value indicates a higher level of fitness in our landscape, according to our definition.
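The size of this landscape follows directly from the combinatorics: six binary sites give 2⁶ genotypes, and the six mutations can be accumulated in 6! different orders. The Python sketch below enumerates the genotypes and illustrates one reading of the fitness definition. The sign convention in `fitness` (reference minus variant, so that a more R-selective variant scores higher) is an assumption about how the difference is taken, and the function names are hypothetical; the ee values are those reported above for N559 and N559-M6.

```python
from itertools import product
from math import factorial

MUTATIONS = ("L19M", "C67S", "T94S", "A119G", "I120V", "D170Y")

# Each genotype is a tuple of 0/1 flags, one per mutation (1 = mutation present).
GENOTYPES = list(product((0, 1), repeat=len(MUTATIONS)))
# Number of orders in which all six mutations can be accumulated one at a time.
N_PATHS = factorial(len(MUTATIONS))


def genotype_name(genotype):
    """Readable label, e.g. 'L19M/A119G'; the all-zero genotype is N559."""
    present = [m for m, flag in zip(MUTATIONS, genotype) if flag]
    return "/".join(present) if present else "N559"


def signed_ee(ee_percent, major):
    """S-selective outcomes are positive, R-selective outcomes negative."""
    if major not in ("S", "R"):
        raise ValueError("major enantiomer must be 'S' or 'R'")
    return ee_percent if major == "S" else -ee_percent


def fitness(variant_signed_ee, reference_signed_ee):
    """Assumed convention: reference minus variant, so more R scores higher."""
    return reference_signed_ee - variant_signed_ee


print(len(GENOTYPES), N_PATHS)
n559 = signed_ee(93, "S")      # N559: 93% ee (S)
n559_m6 = signed_ee(67, "R")   # N559-M6: 67% ee (R)
print(genotype_name(GENOTYPES[-1]), fitness(n559_m6, n559))
```

Under this convention N559 itself has a fitness of 0 and the sextuple mutant a fitness of 160, so any variant that shifts the product mixture toward the R-enantiomer moves upward in the landscape.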
Fig. 5. The empirical fitness landscape reflects the transition from N559 to N559-M6 for R-stereoselectivity. Each circle represents a unique variant, and the color represents the level of stereoselectivity, according to the scale on the top right of the figure. The numerical value in each circle is the difference in enantiomeric excess (ee) between N559 and each mutant. Solid lines represent positive trajectories, orange lines denote neutral trajectories, and dashed lines indicate negative trajectories. A positive arrow denotes that the next node yields a higher proportion of the R-enantiomer in the product than the previous node. A neutral arrow denotes that the next node yields the same proportion of the R-enantiomer as the previous node. A negative arrow denotes that the next node yields a lower proportion of the R-enantiomer than the previous node. Raw experimental data are provided in Supplementary Table 5. Data in each circle are shown as the mean calculated from three measurements (n = 3).
We found that 15 mutants changed the stereoselectivity toward model substrate 1a from S to R. These mutants include one double mutant, three triple mutants, six quadruple mutants, four quintuple mutants, and one sextuple mutant (Fig. 5). The results indicated that a minimum of two mutations was sufficient for stereoselectivity inversion. Except for the L19M/A119G mutant, all the other 14 mutants contain the D170Y mutation. Additionally, we found a significant positive correlation between the number of mutations and the number of mutants altering stereoselectivity (r = 0.90, Supplementary Fig. 13 and Supplementary Table 6). This correlation indicates that the effects of mutations depend on the sequence background, a phenomenon known as epistasis. These results demonstrate that the order of mutation introduction is vital for stereoselectivity evolution. In other words, epistasis is pervasive in this empirical fitness landscape.
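To put these counts in context, the number of possible variants carrying k of the six mutations is the binomial coefficient C(6, k). The Python sketch below tabulates, for each k, how many variants exist and how many were reported above as R-selective. It is illustrative bookkeeping only: the zero for single mutants is inferred from the statement that two mutations are the minimum, and the fractions printed here are not necessarily the quantity behind the reported correlation coefficient.

```python
from math import comb

N_SITES = 6

# R-selective variants per number of mutations, as listed in the text
# (no single mutant inverts stereoselectivity).
INVERTED = {1: 0, 2: 1, 3: 3, 4: 6, 5: 4, 6: 1}

rows = []
for k in range(1, N_SITES + 1):
    n_variants = comb(N_SITES, k)  # variants carrying exactly k mutations
    rows.append((k, INVERTED[k], n_variants, INVERTED[k] / n_variants))

for k, n_inverted, n_variants, fraction in rows:
    print(f"{k} mutations: {n_inverted}/{n_variants} R-selective ({fraction:.0%})")

total_inverted = sum(INVERTED.values())
total_variants = sum(n for _, _, n, _ in rows)
print(total_inverted, total_variants)
```

The totals recover the 15 R-selective variants among the 63 genotypes that differ from N559.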
We then examined the epistasis pattern in this landscape by analyzing mutation-induced fitness changes and performing statistical analyses. A closer look at this empirical fitness landscape showed three types of epistasis (Supplementary Fig. 14a–c). A further analysis of the contribution of the three types of epistasis to the whole landscape was performed using the MAGELLAN tool36,37. According to these results, magnitude epistasis explained 47.9% of fitness variations, while sign epistasis and reciprocal sign epistasis explained 35.4% and 13.3%, respectively (Supplementary Fig. 15). Additionally, we used a Python package to detect epistasis and quantify the fraction of variation accounted for by first- to sixth-order (the interaction among six mutations) epistatic coefficients38. Based on these results, 32% of variation could be explained without epistasis, 24% by second-order epistasis, and 44% by higher-order epistasis (interactions between three or more mutations, Supplementary Fig. 14d). The results from statistical analyses highlight the importance of considering epistasis when predicting mutation effects.
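The three types of pairwise epistasis can be distinguished from the four fitness values of a double-mutant cycle (background, each single mutant, and the double mutant). The Python sketch below applies the standard textbook definitions; it is not the MAGELLAN implementation or the Python package used in this study, the function name is hypothetical, and the example values are invented for illustration.

```python
def classify_epistasis(f00, f10, f01, f11, tol=1e-9):
    """Classify the interaction between mutations A and B.

    f00: background, f10: A only, f01: B only, f11: A and B together.
    """
    a_without_b = f10 - f00   # effect of A in the background
    a_with_b = f11 - f01      # effect of A when B is already present
    b_without_a = f01 - f00   # effect of B in the background
    b_with_a = f11 - f10      # effect of B when A is already present

    # Additive case: the effect of A does not depend on B.
    if abs(a_with_b - a_without_b) <= tol:
        return "no epistasis"

    # A sign flip means a mutation is beneficial in one background and
    # deleterious in the other (a zero effect is not counted as a flip).
    a_flips = a_without_b * a_with_b < 0
    b_flips = b_without_a * b_with_a < 0
    if a_flips and b_flips:
        return "reciprocal sign epistasis"
    if a_flips or b_flips:
        return "sign epistasis"
    return "magnitude epistasis"


# Invented fitness values, for illustration only.
print(classify_epistasis(0, 1, 1, 2))     # effects add up
print(classify_epistasis(0, 1, 1, 3))     # same direction, different size
print(classify_epistasis(0, 1, -1, 3))    # B changes sign depending on A
print(classify_epistasis(0, -1, -1, 2))   # both change sign
```

Because the additive test compares the effect of A in the two backgrounds, it is equivalent to checking whether f11 − f10 − f01 + f00 is zero.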
Discussion
In the present study, we demonstrated the evolutionary pattern of IRED stereoselectivity using ASR and artificially reproduced it via several crucial historical substitutions. Specifically, a set of six historical substitutions responsible for the evolution of stereoselectivity from ancestral S-selectivity to derived R-selectivity was identified using coevolution analysis. To identify key historical substitutions causing a change in function in the focused evolution trajectory, previous studies have mostly relied on structural information or relevant prior knowledge about sites affecting function in extant enzymes12,15,16. In addition, it is often necessary to establish a large library of mutants. To identify single target mutations or combinations of mutations that change protein functions, vast screening experiments are also needed39,40. To overcome these challenges, we present an alternative approach that performs coevolution analysis on ancestral and extant sequences to determine key historical substitutions. Our results showed that this approach is efficient for uncovering historical mutations responsible for stereoselectivity evolution. As an added benefit, it also reduces the effort needed to screen large libraries of mutants.
Furthermore, we investigated the structural basis of the six coevolution mutations recapitulating stereoselectivity evolution by determining crystal structures. Our analyses of the mechanism underlying stereoselectivity evolution are in accordance with several previous studies on stereoselectivity inversion. Specifically, a handful of sequence mutations reversed stereoselectivity by reshaping the substrate-binding pocket41,42. Our structural analysis showed that five out of six substitutions did not directly interact with the substrate, indicating that ASR has significant potential to capture remote mutations critical for protein functions43. This aspect differs from several previous studies in which inverting stereoselectivity was achieved by mutating residues that interact directly with substrate7,44. Compared with ASR, the traditional biochemistry strategies are inefficient in detecting crucial residues distant from the substrate43. Here, our study demonstrates that reshaping the substrate-binding pocket by mutation of residues around NAD(P)H is an effective strategy for the inversion of stereoselectivity of NADPH-dependent oxidoreductases. Specifically, we suggest that these residues located at the loop surrounding the NADPH-binding pocket should be a hotspot for manipulating the stereoselectivity of the IREDs.
The empirical fitness landscape allows us to investigate the evolution of protein function by exploring all evolutionary intermediates45,46,47,48. Using this approach, we can analyze the relationships between sequences and functions in detail to understand how mutations mediate functional changes. Numerous studies have demonstrated that creating empirical fitness landscapes is a powerful way to reveal accessible evolutionary paths, explore the potential for evolutionary predictability, and investigate epistasis49,50. Our findings on the empirical fitness landscape are in agreement with those of previous studies, in which enzyme activity was often selected as a proxy for fitness51,52,53,54. Sign epistasis limits the number of available evolutionary pathways along which the proxy increases monotonically55. According to our statistical analysis, sign epistasis contributes 35.4% to fitness variations. As expected, only about 2% (16/720) of trajectories are monotonically increasing in our empirical fitness landscape (Fig. 5). Although a minimum of two mutations is sufficient for stereoselectivity inversion, we found that the probability of gaining a new stereoselectivity preference increases as the number of accumulated mutations increases. Furthermore, the order in which mutations are introduced determines whether a mutation results in a gain or loss of function. Epistasis is the main cause of the above-mentioned scenarios56,57,58. These findings demonstrate that early mutations play a permissive role, through epistatic interactions, in generating and improving the gain-of-function effect of later mutations59,60. More than half of the fitness variations could be explained by epistasis, according to our statistical analyses (Supplementary Fig. 14d). Therefore, our results suggest that incorporating epistasis into predictive models improves the accuracy of predictions of the effects of mutations on protein functions. A plethora of epistasis-based predictive models, such as EVmutation and Innov’SAR, have been developed in recent years, and these models outperform additive models without epistasis61,62.
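The share of monotonically increasing trajectories quoted in this paragraph can be obtained by walking every ordering of the mutations and checking that fitness rises at each step. The Python sketch below shows one way to do this on a three-site toy landscape with invented fitness values; the function name and the toy data are hypothetical, and the measured landscape in Supplementary Table 5 is not reproduced here.

```python
from itertools import permutations, product


def count_monotonic_paths(fitness, n_sites):
    """Count orderings of n_sites mutations along which fitness strictly increases.

    fitness maps each genotype (tuple of 0/1 flags) to its fitness value.
    Returns (number of monotonic paths, total number of paths).
    """
    start = (0,) * n_sites
    n_monotonic = 0
    n_total = 0
    for order in permutations(range(n_sites)):
        n_total += 1
        genotype = list(start)
        previous = fitness[start]
        monotonic = True
        for site in order:
            genotype[site] = 1  # introduce the next mutation in this ordering
            current = fitness[tuple(genotype)]
            if current <= previous:
                monotonic = False
                break
            previous = current
        if monotonic:
            n_monotonic += 1
    return n_monotonic, n_total


# Toy additive landscape: every mutation adds one unit of fitness.
additive = {g: float(sum(g)) for g in product((0, 1), repeat=3)}
print(count_monotonic_paths(additive, 3))

# Same landscape, but the first mutation is deleterious on its own
# (sign epistasis), which blocks every path that starts with it.
epistatic = dict(additive)
epistatic[(1, 0, 0)] = -1.0
print(count_monotonic_paths(epistatic, 3))

# Share of monotonic trajectories reported for the six-site landscape.
print(f"{16 / 720:.1%}")
```

In the toy example a single sign-epistatic genotype removes two of the six paths, which illustrates how sign epistasis can reduce the accessible trajectories to a small fraction of the 720 possible orderings.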
A number of ASR studies use the maximum likelihood (ML) approach to infer ancestral sequences, and one study on simulated data revealed that the ML method is the most reliable method to estimate ancestral sequences63. It is possible, however, that the ML approach incurs statistical uncertainty at some sites, meaning that other residues are plausible alternatives to the reconstructed residues at such sites. Eick et al. conducted a systematic study to investigate the robustness of reconstructed ancestral protein functions to statistical uncertainty in the ML method64. According to their results, the functions of alternative ancestral proteins are qualitatively unchanged from those of their maximum likelihood counterparts. However, they demonstrated that precise quantitative measurements of function, such as enzyme kinetic parameters, vary among alternative ancestors. Although our reconstruction has a high average posterior probability for each ancestral sequence (Supplementary Fig. 16), we observed the same phenomenon in the present study. The specific activity of N560 is significantly higher than that of AltN560 (Supplementary Table 2), which is a consequence of reconstruction uncertainty. Additionally, we found that the catalytic efficiency of ancestral enzymes is lower than that of extant enzymes (Supplementary Table 7), which is in accord with Jensen’s hypothesis65. These results may also indicate that the ancestral IREDs possess high catalytic promiscuity, and our study in progress supports this aspect.
Furthermore, we observed differences in specific activity between reconstructed ancestral enzymes (Supplementary Table 2). Lessons from directed evolution campaigns are instructive here: for example, two mutations (V122C and F177W) in the active pocket of ScIR improve its specific activity 10-fold (ref. 21). This result indicates that positions 122 and 177 play an important role in regulating activity in IREDs. We then examined the residues in the active pocket of the ancestral enzymes and found clear differences at these locations, especially for N559 and N560 (Supplementary Fig. 17). Specifically, we observed that the two key positions (122 and 177) have mutated compared with other ancestral enzymes with different magnitudes of activity (Supplementary Fig. 17). Therefore, we speculated that differences in the active pocket residues caused a significant variation in activity among the reconstructed ancestral enzymes. An empirical study on the directed evolution of enantioselectivity revealed a trade-off between enantioselectivity and activity, resulting in improved enantioselectivity at the expense of activity66. Another study on alcohol dehydrogenase also showed that enantioselectivity inversion decreases kcat/Km (ref. 67). This empirical evidence indicates that stereoselectivity and activity are in a trade-off relationship in protein engineering66,67. Our results on N559 and N559-M6 are consistent with this type of trade-off (Supplementary Table 5). However, one empirical study also showed that further directed evolution can break the trade-off6. According to Yu et al., early stereoselective inversion reduces enzyme activity compared to the wild type, but further directed evolution simultaneously increases both stereoselectivity and enzyme activity6. Additionally, we examined the activities of the 64 mutants located in the empirical fitness landscape and observed 18 intermediates with higher activity than N559 (Supplementary Table 5). Moreover, the mutants with inverted stereoselectivity exhibit decreased activity compared to the starting point (N559).
In summary, our results demonstrate that ancestral sequence reconstruction has unique advantages for exploring the molecular mechanism underlying stereoselectivity evolution. Through experimentally characterizing the sequence space containing a complete combination of all historical mutations, the effect of epistasis on function evolution could be scrutinized by analyzing the specific change in fitness and conducting statistical analysis. Moreover, ancestral sequences proved to be suitable starting templates for protein engineering campaigns aimed at stereoselectivity reversion12. With knowledge of the key historical mutations underpinning the evolution of a specific enzyme function, ancestral proteins could be engineered through directed evolution experiments with less expense, providing efficient avenues for the development of new biocatalysts with desirable functions.
Methods
Phylogenetic analysis and ancestral sequence reconstruction
A total of 1409 extant IRED sequences were downloaded from the Imine Reductase Engineering Database version 3 (https://ired.biocatnet.de/)27. These sequences mainly originate from three bacterial phyla, except for ten sequences of eukaryotic origin (Ascomycota)27. More than 80% of the sequences are from Actinobacteria, followed by Proteobacteria and Firmicutes. One sequence with X as the initial amino acid was excluded from subsequent analyses. The remaining 1408 IRED sequences and 18 outgroup sequences were aligned by structure-based multiple sequence alignment (MSA) using the DASH option of the MAFFT online server (https://mafft.cbrc.jp/alignment/server/)68. All other alignment parameters were left at their default values. We then manually trimmed gapped regions of the MSA to reduce noise in the subsequent phylogeny reconstruction. Two principles were applied to major indels in the MSA: (1) We removed major indels caused by outgroup sequences, which result from structural differences between ingroup and outgroup sequences. (2) We discarded positions where more than 50% of sequences contain gaps in the MSA39. An excess of indels at a position suggests that the position is not sufficiently conserved during evolution, given that protein structure tends to be conserved. PhyML 3.0 was then used to construct a phylogenetic tree of IRED sequences (http://www.atgc-montpellier.fr/phyml/)69. The WAG amino acid substitution matrix with parameters +G+I+F was selected by Smart Model Selection (SMS) as the best-fitting evolutionary model according to the Akaike Information Criterion (AIC)70. Branch supports of the phylogenetic tree were calculated using BOOSTER71, an alternative to traditional bootstrap values that is more suitable for large datasets and deep branches. To determine the last common ancestor of IREDs, ornithine cyclodeaminase (a type of oxidoreductase distantly related to IREDs) sequences were chosen as the outgroup.
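The second trimming principle (discarding positions where more than 50% of sequences contain gaps) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which was carried out manually; the function name `drop_gappy_columns`, the gap characters, and the toy alignment are assumptions introduced only for illustration.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap_chars="-."):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    A column is kept when its gap fraction is <= max_gap_fraction, i.e. it is
    discarded only when MORE than 50% of sequences carry a gap (default).
    """
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(names)
    keep = []
    for col in range(length):
        n_gaps = sum(alignment[name][col] in gap_chars for name in names)
        if n_gaps / n_seqs <= max_gap_fraction:
            keep.append(col)
    return {name: "".join(alignment[name][c] for c in keep) for name in names}


# Toy alignment (hypothetical sequences, not IRED data).
toy = {
    "seqA": "MK-LV",
    "seqB": "M--LV",
    "seqC": "MR-L-",
    "seqD": "M-QLV",
}
filtered = drop_gappy_columns(toy)
for name, seq in filtered.items():
    print(name, seq)
```

In the toy alignment, the third column is gapped in three of four sequences and is dropped, while the second column (gapped in exactly half) is retained because the rule removes only positions with more than 50% gaps.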
Ancestral sequences at each node of interest in the phylogenetic tree were inferred using Graphical Representation of Ancestral Sequence Predictions (GRASP; http://grasp.scmb.uq.edu.au/)72, a recently developed method that also reconstructs insertion-deletion (indel) events. WAG was selected as the evolutionary model in GRASP, and marginal reconstruction was used for ancestral sequence reconstruction. All sequences used to reconstruct the phylogenetic tree were supplied to GRASP for inferring ancestral sequences. The second most-likely sequence for each ancestral sequence (alternative ancestor) was also reconstructed to verify the robustness of ASR63. Additionally, we used the joint reconstruction method to reconstruct the ancestral sequences, and the sequences from joint reconstruction are almost identical to those from marginal reconstruction (Supplementary Figs. 18–20). Furthermore, we reconstructed a second phylogenetic tree of IREDs using β-hydroxyacid dehydrogenases as outgroup sequences to support the study conclusions (Supplementary Discussion).
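The relationship between a maximum-likelihood ancestor and its alternative ancestor can be sketched in Python as follows. The sketch assumes per-site posterior probability distributions are available as plain dictionaries; it does not model GRASP's actual output format, and the 0.2 threshold for treating a site as ambiguous is an illustrative assumption rather than a value stated in this study.

```python
def ml_and_alt_ancestor(site_posteriors, alt_threshold=0.2):
    """Build the ML ancestor and an alternative ancestor from site posteriors.

    site_posteriors: list of dicts, one per site, mapping residue -> posterior.
    At each site the ML ancestor takes the most probable residue. The
    alternative ancestor takes the second most probable residue wherever its
    posterior is >= alt_threshold (an assumed cutoff), else the ML residue.
    Returns (ml_sequence, alt_sequence, ambiguous_site_indices).
    """
    ml, alt, ambiguous = [], [], []
    for idx, dist in enumerate(site_posteriors):
        ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
        best_residue = ranked[0][0]
        ml.append(best_residue)
        if len(ranked) > 1 and ranked[1][1] >= alt_threshold:
            alt.append(ranked[1][0])
            ambiguous.append(idx)
        else:
            alt.append(best_residue)
    return "".join(ml), "".join(alt), ambiguous


def mean_ml_posterior(site_posteriors):
    """Average posterior probability of the ML residue across all sites."""
    return sum(max(dist.values()) for dist in site_posteriors) / len(site_posteriors)


# Hypothetical four-site posterior table (not taken from the study).
sites = [
    {"M": 0.99, "L": 0.01},
    {"A": 0.55, "G": 0.40, "S": 0.05},
    {"I": 0.90, "V": 0.10},
    {"S": 0.60, "T": 0.40},
]
ml_seq, alt_seq, ambiguous_sites = ml_and_alt_ancestor(sites)
print(ml_seq, alt_seq, ambiguous_sites, round(mean_ml_posterior(sites), 2))
```

Comparing the measured properties of the two resulting sequences is one way to test whether a functional conclusion is sensitive to reconstruction uncertainty at the ambiguous sites.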
Cloning, expression, and purification of ancestral enzymes
NdeI and XhoI restriction sites were incorporated at the N- and C-termini of ancestral genes. These genes were synthesized (Genscript Corporation, Nanjing, China), amplified and cloned into a pET28a(+) vector containing a six-His-tag at the N-terminus. All vectors containing the relevant codon-optimized genes were transformed into chemically competent Escherichia coli cells. Transformations were carried out aseptically by adding 1 μL plasmids to 20 μL of E. coli chemically competent cells and incubating on ice for 30 min. Cells were heat-shocked for 90 s at 42 °C, then placed back on ice for 90 s before the addition of 500 μL of sterile Luria-Bertani (LB) media. Cells were then incubated for 40 min at 37 °C, 500 μL of recovered cells was plated onto a 2% (w/v) LB agar plate containing kanamycin (50 μg/mL), and plates were incubated overnight at 37 °C. Due to the addition of the six-His-tag, several extra residues are also added in front of the N-terminus of target proteins. For crystallization of N559 and N559-M6, these extra amino acids need to be removed. We used XhoI and NcoI restriction enzymes to remove the extra amino acids at the N-terminus. The N-terminus changes from MGSSHHHHHHSSGLVPRGSHMSNN to MHHHHHHHMSNN. All sequences were confirmed by DNA sequencing.
E. coli BL21(DE3) cells containing ancestral enzyme genes were grown in 4 mL LB medium with 50 μg/mL kanamycin for 12 h. A 1 mL sample of culture was then transferred into 100 mL Terrific Broth with 50 μg/mL kanamycin and cultured at 37 °C with shaking at 200 rpm for 3 h. The temperature was decreased to 16 °C, cells were equilibrated at the new temperature for 10 min, and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.2 mM to induce protein expression. Cultures with IPTG were incubated at 16 °C for 24 h. Cells were harvested by centrifugation at 7826 × g for 5 min at 4 °C and the supernatant was discarded. The cell pellet was resuspended in purification buffer A (20 mM NaPi, 500 mM NaCl, 10 mM imidazole, and 5 mM beta-mercaptoethanol), and cells were lysed by ultrasonication. The cell lysate was centrifuged at 7826 × g for 40 min at 4 °C, and the supernatant containing protein expressed in soluble form was loaded onto a Ni2+-affinity column to obtain the purified target protein. The bound protein was washed sequentially with 10 mL each of three buffers containing increasing imidazole concentrations (10 mM, 108 mM, and 250 mM imidazole). Fractions (10 mL) containing sufficiently pure protein were transferred into a 10 kDa molecular weight cut-off centrifugal concentrator (Millipore, Darmstadt, Germany) and centrifuged at 704 × g for 30 min at 4 °C. The protein was then exchanged into a buffer comprising 20 mM NaPi and 0.97 mM dithiothreitol (DTT) without imidazole. Protein concentration was measured using a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, USA) based on the absorbance at 280 nm. The purified enzyme was subjected to biotransformation to examine stereoselectivity.
N-terminally truncated proteins were cultured and purified under the same conditions as normal proteins for crystallization of N559 and N559-M6. A further purification step was carried out using an ÄKTA pure instrument (Cytiva, Marlborough, USA). The protein solution collected from the Ni2+ affinity column was further purified by size-exclusion chromatography on a HiLoad 16/600 Superdex 75 pg column (Cytiva, Marlborough, USA). The final protein solution was concentrated to 20 mg/mL for crystallization experiments.
Site-directed mutagenesis
All point mutants of N559 and N560 were constructed via site-directed mutagenesis using the primers listed in Supplementary Table 1 (Tsingke Biotechnology Company Limited, Shanghai, China). PCR mixtures (20 μL) contained 0.5 μL each of forward and reverse primer, 0.5 μL template plasmid, 0.5 μL dimethylsulfoxide (DMSO), 10 μL PrimeSTAR HS Premix (Takara Bio, Dalian, China), and 8 μL ddH2O. Thermal cycling comprised initial denaturation at 98 °C for 3 min, followed by 15 cycles of 98 °C for 10 s, 55 °C for 5 s, and 72 °C for 7 min, and a final extension step at 72 °C for 10 min. PCR products were digested using 1 μL DpnI (New England Biolabs, Ipswich, USA) at 37 °C for 3 h. Digestion mixtures were transformed into chemically competent E. coli BL21 (DE3) cells.
Analytical-scale enzymatic reactions
Purified enzymes were used to conduct analytical-scale enzymatic reactions to quantitatively determine stereoselectivity. Each 500 μL reaction mixture contained 5 mM substrate, 1 mM NADP+, 30 mM D-glucose, 2 mg/mL glucose dehydrogenase (GDH) enzyme powder, 1 mg/mL purified enzyme, and 100 mM NaPi (pH 7.0). Reactions were carried out at 30 °C with shaking at 800 rpm for 20 h, and 50 μL of 10 M NaOH was used to quench the reaction. Reaction mixtures were then extracted with 500 μL methyl tert-butyl ether (MTBE) and dried over anhydrous magnesium sulfate prior to gas chromatography analysis. For substrates 1b, 1c, 1d, 1e, and 1f, normal phase liquid chromatography was used, and the corresponding products were derivatized to determine stereoselectivity. The acetylation reaction mixture contained 500 μL extract liquor, 0.6 μL pyridine, and 18 μL acetic anhydride, and the reaction was performed at 40 °C with shaking at 400 rpm for 4 h. The analytical methods are listed in Supplementary Information.
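Stereoselectivity from chromatographic data is conventionally expressed as enantiomeric excess (ee), computed from the integrated peak areas of the two product enantiomers. The Python sketch below shows that standard calculation; the function name, the R/S labels, and the example peak areas are hypothetical, and the study's actual integration and peak assignment are described in its Supplementary Information.

```python
def enantiomeric_excess(area_r, area_s):
    """Return (ee in percent, major enantiomer label) from two peak areas.

    ee (%) = 100 * |A_R - A_S| / (A_R + A_S)
    The labels "R" and "S" are generic placeholders for the two enantiomers.
    """
    if area_r < 0 or area_s < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_r + area_s
    if total == 0:
        raise ValueError("no product detected: both peak areas are zero")
    ee = 100.0 * abs(area_r - area_s) / total
    if area_r > area_s:
        major = "R"
    elif area_s > area_r:
        major = "S"
    else:
        major = "racemic"
    return ee, major


# Hypothetical peak areas (arbitrary units) for two variants.
ee_a, major_a = enantiomeric_excess(area_r=99.5, area_s=0.5)
ee_b, major_b = enantiomeric_excess(area_r=20.0, area_s=80.0)
print(f"variant A: {ee_a:.1f}% ee ({major_a})")
print(f"variant B: {ee_b:.1f}% ee ({major_b})")
```

Reporting the major enantiomer alongside the magnitude makes an inversion of stereopreference between two variants explicit, which a bare ee magnitude would hide.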
Construction of empirical fitness landscape
Each intermediate mutant was purified to measure stereoselectivity toward model substrate 1a. To calculate each intermediate’s specific activity via conversion, the reaction times of 15 mutants were set to 3 h (N559-A119G, N559-L19M/C67S, N559-T94S/A119G, N559-L19M/T94S, N559-T94S/I120V, N559-A119G/I120V, N559-L19M/T94S/A119G, N559-L19M/C67S/T94S/ N559-L19M/T94S/I120V, N559-L19M/C67S/T94S/I120V, N559-L19M/C67S/T94S/A119G, N559-C67S/T94S/A119G/I120V, N559-L19M/C67S/A119G/I120V, N559-T94S/A119G/I120V, and N559-T94S/A119G/I120V), and the rest of the mutants were set to 10 h. The other reaction conditions were the same as in the above-mentioned analytical-scale enzymatic reactions. Data were obtained from three replicate experiments.
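A landscape built from six mutations on the N559 background comprises 2^6 = 64 genotypes, matching the 64 mutants examined in this study. The Python sketch below enumerates such a combinatorial set and the single-mutation steps connecting its members. Five mutation names are taken from the list above; the sixth is not named in this section, so the label "X6" is a placeholder, and the function names are likewise illustrative.

```python
from itertools import combinations


def enumerate_genotypes(background, mutations):
    """List every combination of the given mutations on one background.

    Names follow the pattern used in the text, e.g. "N559-L19M/C67S".
    The unmutated background is returned under its own name.
    """
    genotypes = []
    for order in range(len(mutations) + 1):
        for combo in combinations(mutations, order):
            if combo:
                genotypes.append(f"{background}-" + "/".join(combo))
            else:
                genotypes.append(background)
    return genotypes


def single_step_edges(mutations):
    """Return (from_set, to_set) pairs that differ by adding one mutation."""
    edges = []
    for order in range(len(mutations)):
        for combo in combinations(mutations, order):
            current = frozenset(combo)
            for mutation in mutations:
                if mutation not in current:
                    edges.append((current, current | {mutation}))
    return edges


# "X6" is a placeholder for the sixth mutation, which this section does not name.
six_mutations = ["L19M", "C67S", "T94S", "A119G", "I120V", "X6"]
all_genotypes = enumerate_genotypes("N559", six_mutations)
all_edges = single_step_edges(six_mutations)
print(len(all_genotypes), "genotypes;", len(all_edges), "single-mutation steps")
print(all_genotypes[:3], "...", all_genotypes[-1])
```

Each edge corresponds to one mutational step whose effect on stereoselectivity or activity can be compared across backgrounds, which is the basis for detecting epistasis in an empirical fitness landscape.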
Protein crystallization of N559 and N559-M6
Purified N559 and N559-M6 were subjected to crystallization trials using commercially available screen kits in 96-well sitting-drop format, with each drop comprising 1 μL of protein solution containing 2.5 mM NADP+ and 1 μL of reservoir solution. Crystallization and X-ray diffraction experiments yielded two structures: N559 in complex with NADP+ and N559-M6 in complex with NADP+. For further details see Supplementary Information. X-ray diffraction data were collected at beamlines BL19U1 and BL02U1 of the Shanghai Synchrotron Radiation Facility (SSRF).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Biochemical and X-ray crystallographic data generated in this study are provided in the Supplementary Information. The ancestral IRED sequences are available under PP329371 (N1), PP329372 (N2), PP329373 (N3), PP329374 (N4), PP329375 (N5), PP329376 (N557), PP329377 (N558), PP329378 (N559), PP329379 (N560), PP329380 (AltN1), PP329381 (AltN2), PP329382 (AltN3), PP329383 (AltN4), PP329384 (AltN5), PP329385 (AltN557), PP329386 (AltN558), PP329387 (AltN559), and PP329388 (AltN560). Crystallographic data are deposited in the Protein Data Bank. N559 and N559-M6 are available under 8JKU and 8HWY, respectively. The eight extant sequences in the study are provided in the Source Data file. The two β-hydroxyacid dehydrogenase sequences are available under 4GBJ and 3PEF. All raw data generated in this study are provided in the Supplementary Data file and Source Data file. Source data are provided with this paper.
References
Plankensteiner, K., Reiner, H. & Rode, B. M. Stereoselective differentiation in the salt-induced peptide formation reaction and its relevance for the origin of life. Peptides 26, 535–541 (2005).
Bornscheuer, U. T. The fourth wave of biocatalysis is approaching. Phil. Trans. R. Soc. A. 376, 20170063 (2018).
Strohmeier, G. A., Pichler, H., May, O. & Gruber-Khadjawi, O. Application of designed enzymes in organic synthesis. Chem. Rev. 111, 4141–4164 (2011).
Dong, Y. J. et al. Manipulating the stereoselectivity of a thermostable alcohol dehydrogenase by directed evolution for efficient asymmetric synthesis of arylpropanols. Biol. Chem. 400, 313–321 (2019).
Reetz, M. T., Wang, L. W. & Bocola, M. Directed evolution of enantioselective enzymes: iterative cycles of CASTing for probing protein-sequence space. Angew. Chem. Int. Ed. 45, 1236–1241 (2006).
Yu, S. S. et al. Inverting the enantiopreference of nitrilase‐catalyzed desymmetric hydrolysis of prochiral dinitriles by reshaping the binding pocket with a mirror‐image strategy. Angew. Chem. Int. Ed. 60, 3679–3684 (2021).
Calvó-Tusell, C., Liu, Z., Chen, K., Arnold, F. H. & Garcia-Borràs, M. Reversing the enantioselectivity of enzymatic carbene N–H insertion through mechanism-guided protein engineering. Angew. Chem. Int. Ed. 62, e202303879 (2023).
Harms, M. J. & Thornton, J. W. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat. Rev. Genet. 14, 559–571 (2013).
Hochberg, G. K. A. & Thornton, J. W. Reconstructing ancient proteins to understand the causes of structure and function. Annu. Rev. Biophys. 46, 247–269 (2017).
Dean, A. M. & Thornton, J. W. Mechanistic approaches to the study of evolution: the functional synthesis. Nat. Rev. Genet. 8, 675–688 (2007).
Gumulya, Y. & Gillam, E. M. J. Exploring the past and the future of protein evolution with ancestral sequence reconstruction: the ‘retro’ approach to protein engineering. Biochem. J. 474, 1–19 (2017).
Chiang, C. H. et al. Deciphering the evolution of flavin-dependent monooxygenase stereoselectivity using ancestral sequence reconstruction. Proc. Natl. Acad. Sci. USA 120, e2218248120 (2023).
Harms, M. J. & Thornton, J. W. Analyzing protein structure and function using ancestral gene reconstruction. Curr. Opin. Struct. Biol. 20, 360–366 (2010).
Bridgham, J. T., Carroll, S. M. & Thornton, J. W. Evolution of hormone-receptor complexity by molecular exploitation. Science 312, 97–101 (2006).
Dishman, A. F. et al. Evolution of fold switching in a metamorphic protein. Science 371, 86–90 (2021).
Starr, T. N. et al. ACE2 binding is an ancestral and evolvable trait of sarbecoviruses. Nature 603, 913–918 (2022).
Su, W. et al. Ancestral sequence reconstruction pinpoints adaptations that enable avian influenza virus transmission in pigs. Nat. Microbiol. 6, 1455–1465 (2021).
Mangas-Sanchez, J. et al. Imine reductases (IREDs). Curr. Opin. Chem. Biol. 37, 19–25 (2017).
Lenz, M., Borlinghaus, N., Weinmann, L. & Nestl, B. M. Recent advances in imine reductase-catalyzed reactions. World J. Microbiol. Biotechnol. 33, 199 (2017).
Zhang, Y. H. et al. Stereocomplementary synthesis of pharmaceutically relevant chiral 2-aryl-substituted pyrrolidines using imine reductases. Org. Lett. 22, 3367–3372 (2020).
Chen, Q. et al. Engineered imine reductase for larotrectinib intermediate manufacture. ACS Catal. 12, 14795–14803 (2022).
Chen, F. F. et al. Discovery of an imine reductase for reductive amination of carbonyl compounds with sterically challenging amines. J. Am. Chem. Soc. 145, 4015–4025 (2023).
Thorpe, T. W. et al. Multifunctional biocatalyst for conjugate reduction and reductive amination. Nature 604, 86–91 (2022).
Marshall, J. R. et al. Screening and characterization of a diverse panel of metagenomic imine reductases for biocatalytic reductive amination. Nat. Chem. 13, 140–148 (2021).
Kumar, R. et al. Biocatalytic reductive amination from discovery to commercial manufacturing applied to abrocitinib JAK1 inhibitor. Nat. Catal. 4, 775–782 (2021).
Schober, M. et al. Chiral synthesis of LSD1 inhibitor GSK2879552 enabled by directed evolution of an imine reductase. Nat. Catal. 2, 909–915 (2019).
Fademrecht, S., Scheller, P. N., Nestl, B. M., Hauer, B. & Pleiss, J. Identification of imine reductase-specific sequence motifs: imine reductase-specific sequence motifs. Proteins 84, 600–610 (2016).
Pillai, A. S. et al. Origin of complexity in haemoglobin evolution. Nature 581, 480–485 (2020).
Wilson, C. et al. Using ancient protein kinases to unravel a modern cancer drug’s mechanism. Science 347, 882–886 (2015).
Kaltenbach, M. et al. Evolution of chalcone isomerase from a noncatalytic ancestor. Nat. Chem. Biol. 14, 548–555 (2018).
Clifton, B. E. et al. Evolution of cyclohexadienyl dehydratase from an ancestral solute-binding protein. Nat. Chem. Biol. 14, 542–547 (2018).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Liu, Y., Yan, Z. H., Lu, X. Y., Xiao, D. G. & Jiang, H. F. Improving the catalytic activity of isopentenyl phosphate kinase through protein coevolution analysis. Sci. Rep. 6, 24117 (2016).
Oteri, F., Nadalin, F., Champeimont, R. & Carbone, A. BIS2Analyzer: a server for co-evolution analysis of conserved protein families. Nucleic Acids Res. 45, W307–W314 (2017).
Graef, J., Ehrt, C. & Rarey, M. Binding site detection remastered: enabling fast, robust, and reliable binding site detection and descriptor calculation with DoGSite3. J. Chem. Inf. Model. 63, 3128–3137 (2023).
Brouillet, S., Annoni, H., Ferretti, L. & Achaz, G. MAGELLAN: a tool to explore small fitness landscapes. Preprint at https://www.biorxiv.org/content/10.1101/031583v1 (2015).
Ferretti, L. et al. Measuring epistasis in fitness landscapes: the correlation of fitness effects of mutations. J. Theor. Biol. 396, 132–143 (2016).
Sailer, Z. R. & Harms, M. J. Detecting high-order epistasis in nonlinear genotype-phenotype maps. Genetics 205, 1079–1088 (2017).
Ding, Q. et al. The evolutionary origin of naturally occurring intermolecular Diels-Alderases from Morus alba. Nat. Commun. 15, 2492 (2024).
DeMars, M. D. II & O’Connor, S. E. Evolution and diversification of carboxylesterase-like [4+2] cyclases in aspidosperma and iboga alkaloid biosynthesis. Proc. Natl. Acad. Sci. USA 121, e2318586121 (2024).
Cheng, F. et al. Controlling stereopreferences of carbonyl reductases for enantioselective synthesis of atorvastatin precursor. ACS Catal. 11, 2572–2582 (2021).
Noey, E. L. et al. Origins of stereoselectivity in evolved ketoreductases. Proc. Natl. Acad. Sci. USA 112, E7065–E7072 (2015).
Wilding, M., Hong, N., Spence, M., Buckle, A. M. & Jackson, C. J. Protein engineering: the potential of remote mutations. Biochem. Soc. Trans. 47, 701–711 (2019).
Aleku, G. A. et al. Stereoselectivity and structural characterization of an imine reductase (IRED) from Amycolatopsis orientalis. ACS Catal. 6, 3880–3889 (2016).
Weinreich, D. M., Delaney, N. F., Depristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006).
Starr, T. N. & Thornton, J. W. Exploring protein sequence-function landscapes. Nat. Biotechnol. 35, 125–126 (2017).
Yi, X. & Dean, A. M. Adaptive landscapes in the age of synthetic biology. Mol. Biol. Evol. 36, 890–907 (2019).
Meini, M. R., Tomatis, P. E., Weinreich, D. M. & Vila, A. J. Quantitative description of a protein fitness landscape based on molecular features. Mol. Biol. Evol. 32, 1774–1787 (2015).
Poelwijk, F. J., Kiviet, D. J., Weinreich, D. M. & Tans, S. J. Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445, 383–386 (2007).
de Visser, J. A. G. M. & Krug, J. Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 15, 480–490 (2014).
Nishikawa, K. K., Hoppe, N., Smith, R., Bingman, C. & Raman, S. Epistasis shapes the fitness landscape of an allosteric specificity switch. Nat. Commun. 12, 5562 (2021).
Tokuriki, N. et al. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat. Commun. 3, 1257 (2012).
Sailer, Z. R. & Harms, M. J. High-order epistasis shapes evolutionary trajectories. PLoS Comput. Biol. 13, e1005541 (2017).
Yang, G. et al. Higher-order epistasis shapes the fitness landscape of a xenobiotic-degrading enzyme. Nat. Chem. Biol. 15, 1120–1128 (2019).
Weinreich, D. M., Watson, R. A. & Chao, L. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59, 1165–1174 (2005).
Kondrashov, D. A. & Kondrashov, F. A. Topological features of rugged fitness landscapes in sequence space. Trends Genet. 31, 24–33 (2015).
Breen, M. S., Kemena, C., Vlasov, P. K., Notredame, C. & Kondrashov, F. A. Epistasis as the primary factor in molecular evolution. Nature 490, 535–538 (2012).
Lunzer, M., Golding, G. B. & Dean, A. M. Pervasive cryptic epistasis in molecular evolution. PLoS Genet. 6, e1001162 (2010).
Bridgham, J. T., Ortlund, E. A. & Thornton, J. W. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461, 515–519 (2009).
Starr, T. N., Picton, L. K. & Thornton, J. W. Alternative evolutionary histories in the sequence space of an ancient protein. Nature 549, 409–413 (2017).
Hopf, T. A. et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 35, 128–135 (2017).
Cadet, F. et al. A machine learning approach for reliable prediction of amino acid interactions and its application in the directed evolution of enantioselective enzymes. Sci. Rep. 8, 16757 (2018).
Hanson-Smith, V., Kolaczkowski, B. & Thornton, J. W. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty. Mol. Biol. Evol. 27, 1988–1999 (2010).
Eick, G. N., Bridgham, J. T., Anderson, D. P., Harms, M. J. & Thornton, J. W. Robustness of reconstructed ancestral protein functions to statistical uncertainty. Mol. Biol. Evol. 34, 247–261 (2017).
Jensen, R. A. Enzyme recruitment in evolution of new function. Annu. Rev. Microbiol. 30, 409–425 (1976).
Guo, F., Xu, H. M., Xu, H. N. & Yu, H. W. Compensation of the enantioselectivity-activity trade-off in the directed evolution of an esterase from Rhodobacter sphaeroides by site-directed saturation mutagenesis. Appl. Microbiol. Biotechnol. 97, 3355–3362 (2013).
Zhou, J. Y. et al. Structural insight into enantioselectivity inversion of an alcohol dehydrogenase reveals a “polar gate” in stereorecognition of diaryl ketones. J. Am. Chem. Soc. 140, 12645–12654 (2018).
Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 20, 1160–1166 (2019).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Lefort, V., Longueville, J. E. & Gascuel, O. SMS: smart model selection in PhyML. Mol. Biol. Evol. 34, 2422–2424 (2017).
Lemoine, F. et al. Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556, 452–456 (2018).
Foley, G. et al. Engineering indel and substitution variants of diverse and ancient enzymes using graphical representation of ancestral sequence predictions (GRASP). PLoS Comput. Biol. 18, e1010633 (2022).
Acknowledgements
We thank J. Zheng (Westlake University, Hangzhou City, Zhejiang province) and G. N. Wen (Institute of Zoology, Chinese Academy of Sciences, Beijing) for critical comments and discussions; Y. Han (East China University of Science and Technology) for help during X-ray data collection; Q. J. Xiao (Shanghai Institute for Advanced Study, Chinese Academy of Sciences) and R. Liang (Huazhong Agricultural University, Wuhan city, Hubei province) for determining protein crystal structures; and three reviewers for constructive feedback on the manuscript. We are grateful to Shanghai Synchrotron Radiation Facility (SSRF) beamlines BL19U1 and BL02U1 for protein structure determination support. This work was supported by the National Key Research and Development Program of China (2019YFA09005000 to G.W.Z. and 2021YFA0911400 to F.F.C.), and the National Natural Science Foundation of China (32371547 to G.W.Z. and 22008068 to F.F.C.).
Author information
Contributions
All listed authors performed experiments and/or analyzed data. X.X.Z., X.D.K., and G.W.Z. designed the research and wrote the paper; X.X.Z., W.Q.Z., Z.W.X., X.R.C., T.J., X.W.D., and F.F.C. performed the research. X.X.Z. and X.D.K. solved the protein structure. Q.C. and J.H.X. participated in research.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, XX., Zheng, WQ., Xia, ZW. et al. Evolutionary insights into the stereoselectivity of imine reductases based on ancestral sequence reconstruction. Nat Commun 15, 10330 (2024). https://doi.org/10.1038/s41467-024-54613-3
DOI: https://doi.org/10.1038/s41467-024-54613-3