Evolutionary insights into the stereoselectivity of imine reductases based on ancestral sequence reconstruction

Zhu, Xin-Xin; Zheng, Wen-Qing; Xia, Zi-Wei; Chen, Xin-Ru; Jin, Tian; Ding, Xu-Wei; Chen, Fei-Fei; Chen, Qi; Xu, Jian-He; Kong, Xu-Dong; Zheng, Gao-Wei

doi:10.1038/s41467-024-54613-3

Article
Open access
Published: 28 November 2024

Nature Communications volume 15, Article number: 10330 (2024)

Abstract

The stereoselectivity of enzymes plays a central role in asymmetric biocatalytic reactions, but there remains a dearth of evolution-driven biochemistry studies investigating the evolutionary trajectory of this vital property. Imine reductases (IREDs) are one such enzyme that possesses excellent stereoselectivity, and stereocomplementary members are pervasive in the family. However, the regulatory mechanism behind stereocomplementarity remains cryptic. Herein, we reconstruct a panel of active ancestral IREDs and trace the evolution of stereoselectivity from ancestors to extant IREDs. Combined with coevolution analysis, we reveal six historical mutations capable of recapitulating stereoselectivity evolution. An investigation of the mechanism with X-ray crystallography shows that they collectively reshape the substrate-binding pocket to regulate stereoselectivity inversion. In addition, we construct an empirical fitness landscape and discover that epistasis is prevalent in stereoselectivity evolution. Our findings emphasize the power of ASR in circumventing the time-consuming large-scale mutagenesis library screening for identifying mutations that change functions and support a Darwinian premise from a molecular perspective that the evolution of biological functions is a stepwise process.

Introduction

Stereoselectivity of enzymes, the ability to determine which enantiomer dominates in the achiral-to-chiral transformation, is critical for life and asymmetric biotransformation^1,2. For example, L-type amino acids comprise almost all proteins on earth, and D-saccharides constitute a wide variety of sugar compounds. Enzymatic catalysis is essential to endow molecules with chirality during their de novo synthesis in vivo. Hence, stereoselectivity is a crucial enzymatic property for ensuring enantiomerically pure biomolecules. Furthermore, the stereoselectivity of enzymes plays a significant role in asymmetric organic reactions³. Unfortunately, naturally occurring enzymes are not always able to achieve excellent enantiomeric excess (ee) or desired stereoselectivity in asymmetric catalysis, but directed evolution can be employed to improve or even reverse stereoselectivity^4,5. However, undefined molecular mechanisms often require labor-intensive experiments to identify key mutations that manipulate stereoselectivity. Therefore, illuminating the mechanism behind the stereoselective evolution of enzymes is highly desirable.

A variety of strategies have been proposed to reverse enzyme stereoselectivity based on horizontal biochemical analysis, such as the combinatorial active-site saturation test (CAST) and the mirror-image strategy^5,6. More recently, a completely mechanism-guided method was successfully applied to invert the stereoselectivity of cytochrome P411⁷. Although stereoselectivity inversion has been widely studied in the protein engineering field, the evolutionary trajectory of stereoselectivity is not well explored from an evolutionary biochemistry perspective. Ancestral sequence reconstruction (ASR) is a specific and highly efficient approach for interrogating the evolution of protein functions in evolutionary biochemistry^8,9,10,11. However, only one example of ASR has been reported to date for studying stereoselectivity evolution¹². By experimentally characterizing ancestral enzymes located in the evolution trajectory of functional changes, detailed insights into biophysical mechanisms may be gained that are often inaccessible using biochemistry alone^13,14,15. Compared with the above-mentioned strategies, ASR is better able to identify the crucial residues that switch enzyme function. It can directly pinpoint the specific amino acid types in key sites without performing saturation mutagenesis^16,17. Additionally, epistasis, the phenomenon in which the effect of mutations on functions substantially depends on other mutations, often complicates the study of sequence-structure-function relationships^8,9. Empirical evidence has shown that ASR can help reduce the effect of epistasis by reintroducing historical substitutions into a genetic background that is identical or very similar to the one in which they originally occurred^9,11. Therefore, key residues can be swapped between ancestral sequences with different stereoselectivity to investigate the corresponding regulatory mechanism of stereochemistry.

Imine reductases (IREDs) are a class of NAD(P)H-dependent oxidoreductases that catalyze the reduction of imines and the reductive amination of carbonyl compounds with amines to give the corresponding amines¹⁸. Numerous stereocomplementary IREDs have been identified and used to synthesize a variety of amines^19,20,21. In addition, a range of studies on expanding substrate scope, exploring new reactions, and engineering enzymes for industrial applications have been performed^22,23,24,25,26. However, there have been no explicit experimental insights regarding the mechanism underlying stereocomplementarity. Based on previous empirical biochemical data and horizontal multiple sequence alignment, a highly conserved aspartate or tyrosine residue located at standard position 187 (residue numbering refers to PDB ID 3ZHB) appears to be responsible for the opposite stereoselectivities in the IRED family²⁷. However, this hypothesis has not been validated within extant enzymes by direct residue swapping experiments. This is in part because epistasis generally hinders residue interchange between extant enzyme sequences. This weakness of horizontal research can be circumvented by conducting residue swapping experiments on an ancestral sequence. More importantly, much empirical evidence has demonstrated that ASR is able to reveal historical mutations leading to function divergence in two protein subfamilies with different functions^28,29,30,31.

In the present work, we sought to identify the historical substitutions regulating the stereoselectivity of IREDs using ASR and unveil the evolutionary mechanism of stereoselectivity. Six historical mutations leading to the stereoselectivity reversion were identified via coevolution analysis. We then used X-ray crystallography to determine protein structures and gain structural insights into the mechanisms by which the six historical mutations reverse stereoselectivity. We also constructed an empirical fitness landscape consisting of the six mutations to uncover the role of epistasis in the evolutionary process that reverses IRED stereoselectivity. Through this work, we demonstrate that ASR is an efficient strategy for identifying mutations regulating stereoselectivity, providing a method for the directed evolution of stereoselectivity in enzymes.

Results

Reconstruction and characterization of ancestral IREDs

Initially, we performed multiple sequence alignment and phylogenetic analyses on 1408 IREDs from a public IRED sequence database (Imine Reductase Engineering Database version 3; https://www.ired.biocatnet.de) and 18 outgroup sequences, and all maximum likelihood ancestral IRED amino acid sequences in this study were inferred based on the resulting phylogenetic tree (Supplementary Fig. 1). Interestingly, we found that the Y-type sequences (those with Tyr at standard position 187, corresponding to position 170 in ancestral sequences) all form a relatively late-branching clade in the phylogenetic tree (Supplementary Fig. 1). By contrast, D-type sequences (those with Asp at standard position 187) are dispersed throughout the whole phylogenetic tree, and many are close to the last common ancestor (LCA, N1). We therefore speculated that the highly conserved Asp located at standard position 187 is probably an ancestral trait in the IRED family, while Tyr is its derived genotype.

To verify this hypothesis, we first inferred the maximum likelihood amino acid sequence of the LCA. Sequence analysis indicated that residue 170 of the LCA is Asp, placing it among the D-type sequences of the IRED family and supporting the above hypothesis (Supplementary Fig. 2). Subsequently, we assessed the stereoselectivity of N1 toward model substrate 1a using chiral gas chromatography. The results showed that N1 displayed excellent (S)-stereoselectivity toward 1a with >99.9% ee (Fig. 1b and Supplementary Fig. 3).
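
For readers less familiar with the ee values quoted throughout, the following minimal sketch shows how enantiomeric excess is conventionally computed from the two enantiomer peak areas of a chiral chromatogram. The function name and the peak areas are hypothetical and are not taken from the study.

```python
def enantiomeric_excess(area_s: float, area_r: float) -> tuple[float, str]:
    """Return (ee in percent, major enantiomer) from two chiral-GC peak areas."""
    total = area_s + area_r
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    # ee = |S - R| / (S + R) * 100
    ee = 100.0 * abs(area_s - area_r) / total
    if area_s == area_r:
        major = "racemic"
    else:
        major = "S" if area_s > area_r else "R"
    return ee, major


# Hypothetical peak areas (arbitrary units), not measured data.
ee, major = enantiomeric_excess(96.5, 3.5)
print(f"{ee:.1f}% ee ({major})")
```

An ee of 0% corresponds to a racemic product and 100% to a single enantiomer.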

**Fig. 1: Characterization of the stereoselectivity of ancestral IREDs.**

Evolvability of stereoselectivity in IREDs

Our group previously described eight IREDs that catalyze 1a with excellent stereoselectivity, including two enzymes with R-stereoselectivity that belong to the Y-type clade²⁰. Based on these previous results and phylogenetical analysis, we speculated that N560 (the recent common ancestor of Y-type IREDs) would asymmetrically reduce 1a to generate (R)-2a. To validate this speculation, we experimentally resurrected N560 and characterized its stereoselectivity toward 1a. As expected, N560 displayed R-stereoselectivity, yielding (R)-2a with 41% ee (Fig. 1d and Supplementary Fig. 5). To investigate how stereoselectivity evolves from an ancestor with S-stereoselectivity to descendants with R-stereoselectivity, we narrowed our analysis to the specific trajectory from N1 to N560 (Fig. 1c). We then experimentally resurrected these intermediate ancestors (N2–N559) in this evolution pathway and biochemically characterized their stereoselectivity toward 1a. The results demonstrated that the ancestors N1–N559 belonging to D-type sequences exhibit excellent S-stereoselectivity (>90% ee, Fig. 1d and Supplementary Figs. 2–5). Importantly, we observed that stereoselectivity inversion occurs at the evolutionary interval between N559 and N560. Moreover, it should be noted that the substitution of residue 170 from D to Y also occurs at this interval. In addition, we also investigated the evolution pattern of stereoselectivity in alternative ancestral sequences (AltN1–AltN5 and AltN557–AltN560) to support the robustness of reconstruction (Supplementary Fig. 7). Consistent with our results for the maximum likelihood ancestors, stereoselectivity inversion occurred at the evolution interval between AltN559 and AltN560 (Supplementary Fig. 7). Moreover, we also measured the specific activity of ancestral enzymes and their corresponding alternative versions (Supplementary Table 2). 
All ancestral enzymes were expressed in soluble form, which enabled the above-mentioned experiments to be successful (Supplementary Fig. 8). These results demonstrated that the stereoselectivity of IREDs is a highly evolvable trait within the targeted trajectory. Therefore, the historical mutations responsible for the evolution of stereoselectivity can be determined.

Identification of historical mutations through coevolution analysis

To identify the historical substitutions reversing stereoselectivity in the evolutionary interval between N559 and N560, we chose N559 as the target object for subsequent protein engineering. We first investigated whether the single mutation from Asp to Tyr at position 170 of N559 inverts its stereoselectivity. The results showed that the D170Y mutation could not reverse the stereoselectivity preference of N559, but it significantly decreased the ee of (S)-2a from 93% to 25% (Supplementary Fig. 9). This indicates that position 170 plays a vital role in regulating stereoselectivity. Furthermore, this also implied that additional historical mutations are necessary to invert stereoselectivity.

Besides in silico protein structure prediction, coevolution analysis is also used to identify important mutations impacting protein function^32,33. To unveil the additional historical mutations, we performed coevolution analysis on ancestral sequences (N1–N560) and eight extant enzyme sequences using ibis2analyzer³⁴. Based on this analysis, mutations co-evolving with the important substitution D170Y were determined. The results showed that inversion from S- to R-stereoselectivity is associated with the six mutations L19M, C67S, T94S, A119G, I120V, and D170Y (Fig. 2a, right). Subsequently, we introduced the six substitutions into N559 to generate mutant N559-M6. As expected, the stereoselectivity of N559-M6 was successfully inverted to the R-configuration from the S-configuration, generating (R)-2a with 67% ee (Fig. 2b and Supplementary Fig. 6). Furthermore, we also conducted back-mutations of the same six sites in N560 to generate mutant N560-M6, and stereoselectivity was also inverted from 41% ee_R to 73% ee_S (Fig. 2b and Supplementary Fig. 6). These protein engineering results demonstrated that these coevolution mutations played a crucial regulatory role in the evolution of IRED stereoselectivity.
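
To illustrate the general idea of detecting co-varying alignment columns, the sketch below computes the mutual information between two columns of a toy alignment. This is a deliberately simple stand-in technique: it is not the algorithm implemented in ibis2analyzer, and the sequences and function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa: list[str], index: int) -> str:
    """Extract one alignment column as a string."""
    return "".join(seq[index] for seq in msa)


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


# Toy alignment (hypothetical): columns 0 and 1 change together, column 2 does not.
toy_msa = ["DLA", "DLG", "YMA", "YMG"]
print(mutual_information(column(toy_msa, 0), column(toy_msa, 1)))
print(mutual_information(column(toy_msa, 0), column(toy_msa, 2)))
```

Columns that switch state together across the tree score high, which is the kind of signal used to nominate substitutions that accompany D170Y.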

**Fig. 2: Identification of historical mutations responsible for the evolution of stereoselectivity.**

To further verify the role of the six coevolution mutations in stereoselectivity regulation, more imine substrates (1b–1f) were used for biotransformation. We found that N559 possesses opposite stereoselectivity to N559-M6 toward the five substrates (Fig. 2c, Supplementary Fig. 10 and Supplementary Table 3). The results indicated that the stereoselectivity inversion caused by the six historical substitutions was an intrinsic property of enzymes during evolution.

Structural basis for the stereoselectivity evolution of IREDs

To examine the role of the six historical mutations that reversed stereoselectivity from a structural perspective, we determined the crystal structures of N559 (PDB: 8JKU) and N559-M6 (PDB: 8HWY) complexed with NADP⁺ at resolutions of 2.57 Å and 2.32 Å, respectively (Supplementary Fig. 11a–d). Similar to the extant enzymes, ancestral IREDs were homodimers with an unusual reciprocal domain sharing arrangement, with the active site located at the interface formed by the two monomers (Fig. 3a, left). In the structure of N559-M6, two different binding conformations of NADP⁺ were observed. As shown in Supplementary Fig. 11d, e, the first conformation was consistent with the canonical NADP⁺ conformation in the structure of N559, whereas the second adopted an upturned conformation with almost no space left for binding substrate next to the nicotinamide ring of the cofactor. In addition, a significant conformational change was observed between N559 and N559-M6; the loop containing the T94S mutation in N559-M6 moved by 3.8 Å compared to N559 (Supplementary Fig. 12), and the resulting open space allowed the nicotinamide ring of NADP⁺ in N559-M6 to flip. Due to the limited space for substrate binding in the second conformation, we predicted that this upturned conformation is catalytically inactive. Therefore, the first conformation of NADP⁺ in N559-M6 was used for subsequent structural analysis.

**Fig. 3: Structural differences between N559 and N559-M6.**

During the reaction catalyzed by IREDs, a hydrogen atom and an electron are transferred from NADPH to imine substrates. Therefore, NADP⁺ was located in the center of the catalytic pocket of IREDs. To facilitate a precise comparison of catalytic site structures between N559 and N559-M6, we superposed the NADP⁺ molecule from the crystal structures so that we could observe a significant conformation change in the Rossmann fold region (Fig. 3a, right). This conformational change further reduced the space in the catalytic pocket of N559-M6 (Fig. 3c). As shown in Fig. 3b, five out of six mutations were located in three regions around the NADP⁺ binding pocket, whereas only one mutation, D170Y, was located in close proximity to the predicted substrate binding site next to the nicotinamide ring of NADP⁺. Since Tyr has a larger side chain than Asp, D170Y further reduced the volume of the catalytic pocket (Fig. 3c).

To understand how the reduced space in the catalytic pocket reversed the stereoselectivity of N559-M6, we conducted molecular docking analysis. As shown in Fig. 4, the locations of the two ring structures of the substrate were exchanged between the binding models of N559 and N559-M6. The D170Y mutation in N559-M6 caused steric hindrance with the substrate-binding conformation of N559, which led to the adoption of a new substrate-binding conformation in N559-M6 (Fig. 4b). To quantify the change in volume, we used the DoGSiteScorer tool on the ProteinPlus server (https://proteins.plus/) to measure the corresponding volume of the NADP⁺ binding pocket³⁵. The volume decreased by 149.5 Å³ in N559-M6 (N559: 979.64 Å³, N559-M6: 830.14 Å³). Moreover, the substrate can form a hydrogen bond with Y170 in N559-M6, which stabilizes the substrate in this binding mode (Fig. 4b). Together, these changes contribute to the inversion of the substrate-binding conformation in N559-M6 from Pro-S to Pro-R (Fig. 4).

**Fig. 4: Insights into the mechanism of stereoselectivity inversion.**

Empirical fitness landscape of stereoselectivity

Our investigation of the six historical substitutions led to two additional pertinent questions. Do the six mutations represent the minimum number required for stereoselectivity inversion? Does epistasis play a role in the formation of novel stereoselectivity preferences, and if so, how does it work? To answer these questions, we constructed an empirical fitness landscape with 64 (2⁶) genotypes and 720 (6!) potential evolutionary paths using stereoselectivity as a proxy for fitness (Fig. 5). We quantitatively defined the fitness as the difference in ee between N559 and each intermediate. Specifically, S- and R-stereoselectivity were denoted by positive and negative signs, respectively (Supplementary Table 5); under this definition, a larger value indicates a higher level of fitness in our landscape.
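
The size of this landscape and the fitness proxy can be sketched as follows. The six mutation labels and the ee values for N559, N559 D170Y, and N559-M6 come from the text; the function names are hypothetical, and the sign convention (S positive, R negative, fitness = signed ee of N559 minus signed ee of the variant) is our reading of the definition above rather than a confirmed implementation.

```python
from itertools import product
from math import factorial

MUTATIONS = ("L19M", "C67S", "T94S", "A119G", "I120V", "D170Y")


def all_genotypes(mutations):
    """Yield every combination of the mutations as a frozenset (2**n genotypes)."""
    for bits in product((0, 1), repeat=len(mutations)):
        yield frozenset(m for m, bit in zip(mutations, bits) if bit)


def signed_ee(ee_percent, major):
    """Signed ee: S counted as positive, R as negative (assumed convention)."""
    if major not in ("S", "R"):
        raise ValueError("major enantiomer must be 'S' or 'R'")
    return ee_percent if major == "S" else -ee_percent


def fitness(reference_signed_ee, variant_signed_ee):
    """Fitness proxy: shift away from the reference toward R-selectivity."""
    return reference_signed_ee - variant_signed_ee


genotypes = list(all_genotypes(MUTATIONS))
n_paths = factorial(len(MUTATIONS))  # orderings of six single-step mutations
n559 = signed_ee(93, "S")

print(len(genotypes), n_paths)
print("N559 D170Y:", fitness(n559, signed_ee(25, "S")))
print("N559-M6:", fitness(n559, signed_ee(67, "R")))
```

Under this reading, the starting genotype N559 has fitness 0 and any variant that moves toward (R)-2a scores higher.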

**Fig. 5: Empirical fitness landscape encompassing the six historical substitutions.**

We found that 15 mutants could change the stereoselectivity toward model substrate 1a from S- to R-stereoselectivity. These mutants include one double mutant, three triple mutants, six quadruple mutants, four quintuple mutants, and one sextuple mutant (Fig. 5). The results indicated that a minimum of two mutations was sufficient for stereoselectivity inversion. Except for the L19M/A119G mutant, all the other 14 mutants contain the D170Y mutation. Additionally, we found a significant positive correlation between the number of mutations and the number of mutants altering stereoselectivity (r = 0.90, Supplementary Fig. 13 and Supplementary Table 6). This correlation indicates that the effects of mutations depend on the sequence background, a phenomenon known as epistasis. These results demonstrate that the order of mutation introduction is vital for stereoselectivity evolution. In other words, epistasis is pervasive in this empirical fitness landscape.
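
One way to look at the counts reported above is to normalize them by the number of genotypes that exist at each mutation order. The sketch below does this with the counts from the text (zero single mutants is implied by the two-mutation minimum). It is only an illustration and does not attempt to reproduce the correlation analysis in Supplementary Table 6; the variable and function names are hypothetical.

```python
from math import comb

N_SITES = 6
# Number of stereoselectivity-inverting mutants per mutation count, from the text.
INVERTING = {1: 0, 2: 1, 3: 3, 4: 6, 5: 4, 6: 1}


def inverting_fraction(n_sites, inverting):
    """Fraction of genotypes with k mutations that invert stereoselectivity."""
    return {
        k: inverting.get(k, 0) / comb(n_sites, k)
        for k in range(1, n_sites + 1)
    }


fractions = inverting_fraction(N_SITES, INVERTING)
for k, frac in fractions.items():
    print(f"{k} mutations: {INVERTING[k]}/{comb(N_SITES, k)} = {frac:.2f}")
```

Seen this way, the share of inverting genotypes rises steadily with the number of accumulated mutations.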

We then examined the epistasis pattern in this landscape by analyzing mutation-induced fitness changes and performing statistical analyses. A closer look at this empirical fitness landscape showed three types of epistasis (Supplementary Fig. 14a–c). A further analysis of the contribution of the three types of epistasis to the whole landscape was performed using the MAGELLAN tool^36,37. According to these results, magnitude epistasis explained 47.9% of fitness variations, while sign epistasis and reciprocal sign epistasis explained 35.4% and 13.3%, respectively (Supplementary Fig. 15). Additionally, we used a Python package to detect epistasis and quantify the fraction of variation accounted for by first- to sixth-order (the interaction among six mutations) epistatic coefficients³⁸. Based on these results, 32% of variation could be explained without epistasis, 24% by second-order epistasis, and 44% by higher-order epistasis (interactions between three or more mutations, Supplementary Fig. 14d). The results from statistical analyses highlight the importance of considering epistasis when predicting mutation effects.
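
The three epistasis types can be made concrete for a single pair of mutations. The sketch below applies the standard textbook definitions to four fitness values (background, each single mutant, and the double mutant). It is not the MAGELLAN implementation, and the function name and numbers are hypothetical.

```python
def classify_pairwise_epistasis(f00, f10, f01, f11, tol=1e-9):
    """Classify epistasis between mutations A and B.

    f00: background, f10: A only, f01: B only, f11: A and B together.
    Returns (type, epsilon), where epsilon = f11 - f10 - f01 + f00 is the
    deviation of the double mutant from additivity.
    """
    epsilon = f11 - f10 - f01 + f00
    if abs(epsilon) <= tol:
        return "none", epsilon
    # Does the effect of A change sign depending on whether B is present?
    a_flips = (f10 - f00) * (f11 - f01) < 0
    # Does the effect of B change sign depending on whether A is present?
    b_flips = (f01 - f00) * (f11 - f10) < 0
    if a_flips and b_flips:
        kind = "reciprocal sign"
    elif a_flips or b_flips:
        kind = "sign"
    else:
        kind = "magnitude"
    return kind, epsilon


# Hypothetical fitness values, chosen only to show each category.
print(classify_pairwise_epistasis(0, 1, 1, 2))
print(classify_pairwise_epistasis(0, 1, 1, 4))
print(classify_pairwise_epistasis(0, 1, -1, 2))
print(classify_pairwise_epistasis(0, -1, -1, 2))
```

Higher-order terms extend the same idea: each coefficient measures what remains after all lower-order contributions are subtracted.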

Discussion

In the present study, we demonstrated the evolutionary pattern of IRED stereoselectivity using ASR and artificially reproduced it via several crucial historical substitutions. Specifically, a set of six historical substitutions responsible for the evolution of stereoselectivity from ancestral S-selectivity to derived R-selectivity was identified using coevolution analysis. To identify key historical substitutions causing a change in function in a focused evolutionary trajectory, previous studies have mostly relied on structural information or relevant prior knowledge about sites affecting function in extant enzymes^12,15,16. In addition, it is often necessary to establish a large library of mutants. To identify single target mutations or combinations of mutations that change protein functions, extensive screening experiments are also needed^39,40. To overcome these challenges, we present an alternative approach that performs coevolution analysis on ancestral and extant sequences to determine key historical substitutions. Our results showed that this approach is efficient for uncovering historical mutations responsible for stereoselectivity evolution. As an added benefit, it also reduces the effort needed to screen large libraries of mutants.

Furthermore, we investigated the structural basis of the six coevolution mutations recapitulating stereoselectivity evolution by determining crystal structures. Our analyses of the mechanism underlying stereoselectivity evolution are in accordance with several previous studies on stereoselectivity inversion. Specifically, a handful of sequence mutations reversed stereoselectivity by reshaping the substrate-binding pocket^41,42. Our structural analysis showed that five out of six substitutions did not directly interact with the substrate, indicating that ASR has significant potential to capture remote mutations critical for protein functions⁴³. This aspect differs from several previous studies in which inverting stereoselectivity was achieved by mutating residues that interact directly with substrate^7,44. Compared with ASR, the traditional biochemistry strategies are inefficient in detecting crucial residues distant from the substrate⁴³. Here, our study demonstrates that reshaping the substrate-binding pocket by mutation of residues around NAD(P)H is an effective strategy for the inversion of stereoselectivity of NADPH-dependent oxidoreductases. Specifically, we suggest that these residues located at the loop surrounding the NADPH-binding pocket should be a hotspot for manipulating the stereoselectivity of the IREDs.

The empirical fitness landscape allows us to investigate the evolution of protein function by exploring all evolutionary intermediates^45,46,47,48. Using this approach, we can analyze the relationships between sequences and functions in detail to understand how mutations mediate functional changes. Numerous studies have demonstrated that creating empirical fitness landscapes is a powerful way to reveal accessible evolutionary paths, explore the potential for evolutionary predictability, and investigate epistasis^49,50. Our findings on the empirical fitness landscape are in agreement with those of previous studies, in which enzyme activity was often selected as a proxy for fitness^51,52,53,54. Sign epistasis limits the number of available evolutionary pathways along which the proxy increases monotonically⁵⁵. According to our statistical analysis, sign epistasis contributes 35.4% to fitness variations. As expected, only 2% (16/720) of trajectories are monotonically increasing in our empirical fitness landscape (Fig. 5). Although a minimum of two mutations is sufficient for stereoselectivity inversion, we found that the probability of gaining a new stereoselectivity preference increases as the number of accumulated mutations increases. Furthermore, the order in which mutations are introduced determines whether a mutation results in a gain or loss of function. Epistasis is the main cause of the above-mentioned scenarios^56,57,58. These findings demonstrate that early mutations play a permissive role through epistatic interactions, generating and improving the gain-of-function effect of later mutations^59,60. More than half of the fitness variations could be explained by epistasis, according to our statistical analyses (Supplementary Fig. 14d). Therefore, our results suggest that incorporating epistasis into predictive models improves the accuracy of predictions of the effect of mutations on protein functions. 
A plethora of epistasis-based predictive models, such as EVmutation and Innov’SAR, have been developed in recent years, and these models outperform additive models that do not account for epistasis^61,62.
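
The notion of a monotonically increasing trajectory can be sketched as a walk over every ordering of the mutations, checking that fitness rises at each step. The landscapes below are toy examples with hypothetical mutation labels and fitness values, not the measured data in Supplementary Table 5.

```python
from itertools import combinations, permutations


def additive_landscape(mutations):
    """Toy landscape: fitness equals the number of mutations carried."""
    return {
        frozenset(combo): float(len(combo))
        for k in range(len(mutations) + 1)
        for combo in combinations(mutations, k)
    }


def count_monotonic_paths(mutations, fitness_of):
    """Count orderings along which fitness strictly increases at every step.

    fitness_of maps each genotype (a frozenset of mutation labels) to fitness.
    Returns (number of monotonic paths, total number of paths).
    """
    total = 0
    monotonic = 0
    for order in permutations(mutations):
        total += 1
        genotype = frozenset()
        previous = fitness_of[genotype]
        increasing = True
        for mutation in order:
            genotype = genotype | {mutation}
            current = fitness_of[genotype]
            if current <= previous:
                increasing = False
                break
            previous = current
        monotonic += increasing
    return monotonic, total


toy_sites = ("a", "b", "c")  # hypothetical labels
landscape = additive_landscape(toy_sites)
print(count_monotonic_paths(toy_sites, landscape))

# Introduce sign epistasis: mutation "a" alone is now deleterious.
landscape[frozenset({"a"})] = -1.0
print(count_monotonic_paths(toy_sites, landscape))
```

With six mutations the same loop visits all 720 orderings, and a single deleterious intermediate is enough to close every path that passes through it.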

A number of ASR studies use the maximum likelihood (ML) approach to infer ancestral sequences, and one study on simulated data revealed that the ML method is the most reliable method to estimate ancestral sequences⁶³. It is possible, however, that the ML approach incurs statistical uncertainty at some sites, which means that other residues are plausible at such sites. Eick et al. conducted a systematic study to investigate the robustness of reconstructed ancestral protein functions to statistical uncertainty in the ML method⁶⁴. According to their results, the functions of alternative ancestors are qualitatively unchanged from those of their maximum-likelihood ancestors. However, they demonstrated that precise quantitative measurements of function, such as enzyme kinetic parameters, vary among alternative ancestors. Although our reconstruction has a high average posterior probability for each ancestral sequence (Supplementary Fig. 16), we observed the same phenomenon in the present study. The specific activity of N560 is significantly higher than that of AltN560 (Supplementary Table 2), which is a consequence of reconstruction uncertainty. Additionally, we found that the catalytic efficiency of ancestral enzymes is lower than that of extant enzymes (Supplementary Table 7), which is in accord with Jensen’s hypothesis⁶⁵. Meanwhile, the results may indicate that the ancestral IREDs possess high catalytic promiscuity, and our study in progress supports this aspect.

Furthermore, we observed differences in specific activity between reconstructed ancestral enzymes (Supplementary Table 2). Lessons from directed evolution campaigns are instructive here: for example, two mutations (V122C and F177W) in the active pocket of ScIR improve its specific activity 10-fold²¹. These results indicate that positions 122 and 177 play an important role in regulating activity in IREDs. We then examined the residues in the active pocket of the ancestral enzymes and found clear substitutions there, especially for N559 and N560 (Supplementary Fig. 17). Specifically, we observed that the two key positions (122 and 177) have mutated compared with other ancestral enzymes with different magnitudes of activity (Supplementary Fig. 17). Therefore, we speculated that differences in the active pocket residues caused a significant variation in activity among the reconstructed ancestral enzymes. An empirical study on the directed evolution of enantioselectivity revealed a trade-off between enantioselectivity and activity, resulting in improved enantioselectivity at the expense of activity⁶⁶. Another study on alcohol dehydrogenase also showed that enantioselectivity inversion decreased the k_cat/K_m⁶⁷. This empirical evidence indicates that stereoselectivity and activity are in a trade-off relationship in protein engineering^66,67. Our results on N559 and N559-M6 are consistent with this type of trade-off phenomenon (Supplementary Table 5). However, one empirical study also showed that further directed evolution can break the trade-off⁶. According to Yu et al., early stereoselective inversion reduces enzyme activity compared to the wild type, but further directed evolution simultaneously increases both stereoselectivity and enzyme activity⁶. Additionally, we examined the activities of the 64 mutants located in the empirical fitness landscape and observed 18 intermediates with higher activity than N559 (Supplementary Table 5). 
Moreover, the mutants with inverted stereoselectivity exhibit decreased activity compared to the start point (N559).

In summary, our results demonstrate that ancestral sequence reconstruction has unique advantages for exploring the molecular mechanism underlying stereoselectivity evolution. Through experimentally characterizing the sequence space containing a complete combination of all historical mutations, the effect of epistasis on function evolution could be scrutinized by analyzing the specific change in fitness and conducting statistical analysis. Moreover, ancestral sequences proved to be suitable starting templates for protein engineering campaigns aimed at stereoselectivity reversion¹². With knowledge of the key historical mutations underpinning the evolution of a specific enzyme function, ancestral proteins could be engineered through directed evolution experiments with less expense, providing efficient avenues for the development of new biocatalysts with desirable functions.

Methods

Phylogenetic analysis and ancestral sequence reconstruction

A total of 1409 extant IRED sequences were downloaded from the Imine Reductase Engineering Database version 3 (https://ired.biocatnet.de/)²⁷. These sequences mainly originate from three bacterial phyla, except for ten sequences of eukaryotic origin (Ascomycota)²⁷. More than 80% of sequences are from Actinobacteria, followed by Proteobacteria and Firmicutes. One sequence with X as the initial amino acid was excluded from subsequent analyses. Using the DASH option of the MAFFT online server, the remaining 1408 IRED sequences and 18 outgroup sequences were aligned by structure-based multiple sequence alignment (MSA; https://mafft.cbrc.jp/alignment/server/)⁶⁸. The remaining alignment parameters were left at their default values. We then manually deleted gaps in the MSA results to decrease subsequent phylogeny reconstruction noise. Two principles were employed to modify major indels in the MSA: (1) We removed major indels caused by outgroup sequences. These indels result from differences in structure between ingroup and outgroup sequences. (2) We discarded positions where more than 50% of sequences contain gaps in the MSA³⁹. An excess of indels at a location suggests that the location is not sufficiently conserved in the evolutionary process, since protein structure is more conservative in evolution. PhyML 3.0 was then used to construct a phylogenetic tree of IRED sequences (http://www.atgc-montpellier.fr/phyml/)⁶⁹. The WAG amino acid substitution matrix with parameters +G+I+F was identified as the best-fitting evolutionary model by Smart Model Selection (SMS) according to the Akaike Information Criterion (AIC)⁷⁰. The branch supports of the phylogenetic tree were calculated using BOOSTER⁷¹, an alternative to traditional bootstrap values that is more suitable for large datasets and deep branches. To determine the last common ancestor of IREDs, ornithine cyclodeaminase (a type of oxidoreductase distantly related to IREDs) sequences were chosen as the outgroup.
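
The second trimming principle (discarding positions where more than 50% of sequences contain gaps) can be sketched as a simple column filter. The trimming in this study was done manually, so the function below is only a hypothetical programmatic equivalent, shown on a toy alignment.

```python
def drop_gappy_columns(msa, max_gap_fraction=0.5, gap="-"):
    """Remove alignment columns in which more than max_gap_fraction are gaps."""
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(msa)
    keep = [
        i
        for i in range(length)
        if sum(seq[i] == gap for seq in msa) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[i] for i in keep) for seq in msa]


# Toy alignment (hypothetical): the third column is 75% gaps and is dropped.
toy_msa = ["MK-A", "M--A", "MR-A", "MKLA"]
print(drop_gappy_columns(toy_msa))
```

A column with exactly 50% gaps is kept here, matching the "more than 50%" wording.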

Ancestral sequences at each node of interest in the phylogenetic tree were inferred using Graphical Representation of Ancestral Sequence Predictions (GRASP; http://grasp.scmb.uq.edu.au/)⁷², a method that can also infer insertion-deletion (indel) events. WAG was selected as the evolutionary model in GRASP, and marginal reconstruction was selected for ancestral sequence reconstruction. All sequences used to reconstruct the phylogenetic tree were used as input to GRASP for inferring ancestral sequences. The second most-likely sequences for each ancestral sequence (alternative ancestors) were also reconstructed to verify the robustness of ASR⁶³. Additionally, we used the joint reconstruction method to reconstruct the ancestral sequences, and the sequences from joint reconstruction are almost identical to those from marginal reconstruction (Supplementary Figs. 18–20). Furthermore, we reconstructed another phylogenetic tree of IREDs using β-hydroxyacid dehydrogenases as outgroup sequences to support the study conclusions (Supplementary Discussion).
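As a rough illustration of how a most-likely ancestor and an alternative ancestor can be derived from per-site marginal posterior probabilities, consider the sketch below. It is not GRASP's API or output format: the data structure, the function name `ml_and_alt_sequences`, and the `alt_threshold` of 0.2 are all assumptions chosen for illustration, and the text above does not state which rule was used to build the alternative ancestors.

```python
from typing import Dict, List, Tuple

# One dictionary per alignment site: residue -> marginal posterior probability.
SitePosterior = Dict[str, float]


def ml_and_alt_sequences(
    posteriors: List[SitePosterior], alt_threshold: float = 0.2
) -> Tuple[str, str]:
    """Return (most-likely sequence, alternative sequence).

    The alternative sequence takes the second most probable residue at every
    site where that residue's posterior is at least alt_threshold (an assumed
    cutoff), and the most probable residue elsewhere.
    """
    ml_residues, alt_residues = [], []
    for site in posteriors:
        ranked = sorted(site.items(), key=lambda item: item[1], reverse=True)
        best_residue = ranked[0][0]
        ml_residues.append(best_residue)
        if len(ranked) > 1 and ranked[1][1] >= alt_threshold:
            alt_residues.append(ranked[1][0])
        else:
            alt_residues.append(best_residue)
    return "".join(ml_residues), "".join(alt_residues)


# Hypothetical posteriors for a four-residue stretch of an ancestral node.
toy_posteriors = [
    {"M": 0.99, "L": 0.01},
    {"A": 0.55, "G": 0.40, "S": 0.05},
    {"I": 0.70, "V": 0.30},
    {"T": 0.95, "S": 0.05},
]
ml_seq, alt_seq = ml_and_alt_sequences(toy_posteriors)
print(ml_seq, alt_seq)
```

Comparing the function of the two sequences is one way to check whether conclusions depend on ambiguously reconstructed sites.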

Cloning, expression, and purification of ancestral enzymes

NdeI and XhoI restriction sites were incorporated at the N- and C-termini of ancestral genes. These genes were synthesized (Genscript Corporation, Nanjing, China), amplified and cloned into a pET28a(+) vector containing a six-His-tag at the N-terminus. All vectors containing the relevant codon-optimized genes were transformed into chemically competent Escherichia coli cells. Transformations were carried out aseptically by adding 1 μL plasmids to 20 μL of E. coli chemically competent cells and incubating on ice for 30 min. Cells were heat-shocked for 90 s at 42 °C, then placed back on ice for 90 s before the addition of 500 μL of sterile Luria-Bertani (LB) media. Cells were then incubated for 40 min at 37 °C, 500 μL of recovered cells was plated onto a 2% (w/v) LB agar plate containing kanamycin (50 μg/mL), and plates were incubated overnight at 37 °C. Due to the addition of the six-His-tag, several extra residues are also added in front of the N-terminus of target proteins. For crystallization of N559 and N559-M6, these extra amino acids need to be removed. We used XhoI and NcoI restriction enzymes to remove the extra amino acids at the N-terminus. The N-terminus changes from MGSSHHHHHHSSGLVPRGSHMSNN to MHHHHHHHMSNN. All sequences were confirmed by DNA sequencing.

E. coli BL21(DE3) cells containing ancestral enzyme genes were grown in 4 mL LB medium with 50 μg/mL kanamycin for 12 h. A 1 mL sample of the culture was then transferred into 100 mL Terrific Broth with 50 μg/mL kanamycin and cultured at 37 °C with shaking at 200 rpm for 3 h. The temperature was decreased to 16 °C, the cultures were allowed to equilibrate at the new temperature for 10 min, and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.2 mM to induce protein expression. Induced cultures were incubated at 16 °C for 24 h. Cells were harvested by centrifugation at 7826 × g for 5 min at 4 °C, and the supernatant was discarded. The cell pellet was resuspended in purification buffer A (20 mM NaPi, 500 mM NaCl, 10 mM imidazole, and 5 mM β-mercaptoethanol), and cells were lysed by ultrasonication. The cell lysate was centrifuged at 7826 × g for 40 min at 4 °C, and the supernatant containing the soluble protein was loaded onto a Ni²⁺-affinity column to obtain the purified target protein. The bound protein was washed sequentially with 10 mL of three buffers containing increasing imidazole concentrations (10 mM, 108 mM, and 250 mM imidazole). Fractions (10 mL) containing sufficiently pure protein were transferred into a 10 kDa molecular-weight cut-off centrifugal concentrator (Millipore, Darmstadt, Germany) and centrifuged at 704 × g for 30 min at 4 °C. The protein was then exchanged into a buffer comprising 20 mM NaPi and 0.97 mM dithiothreitol (DTT) without imidazole. Protein concentration was measured using a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, USA) based on the absorbance at 280 nm. The purified enzyme was then used in biotransformations to examine stereoselectivity.
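Converting an absorbance at 280 nm into a protein concentration follows the Beer-Lambert law. The sketch below shows the arithmetic only; the function name, the extinction coefficient, and the molecular weight are hypothetical placeholders, not values reported for any enzyme in this study.

```python
def protein_conc_mg_per_ml(
    a280: float,
    ext_coeff_m1_cm1: float,
    mol_weight_da: float,
    path_cm: float = 1.0,
    dilution: float = 1.0,
) -> float:
    """Beer-Lambert estimate: c = A / (epsilon * l), converted from mol/L to mg/mL."""
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (ext_coeff_m1_cm1 * path_cm)  # mol/L
    return molar * mol_weight_da  # g/L, numerically equal to mg/mL


# Hypothetical example: epsilon and molecular weight are placeholders only.
conc = protein_conc_mg_per_ml(a280=1.0, ext_coeff_m1_cm1=30000.0, mol_weight_da=30000.0)
print(f"{conc:.2f} mg/mL")
```

In practice the extinction coefficient would be computed from each construct's sequence, since the ancestral variants differ in their aromatic residue content.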

For crystallization of N559 and N559-M6, the N-terminally truncated proteins were expressed and purified under the same conditions as the other proteins. A further purification step was carried out using an ÄKTA pure instrument (Cytiva, Marlborough, USA). The protein solution collected from the Ni²⁺-affinity column was further purified by size-exclusion chromatography on a HiLoad 16/600 Superdex 75 pg column (Cytiva, Marlborough, USA). The final protein solution was concentrated to 20 mg/mL for crystallization experiments.

Site-directed mutagenesis

All point mutants of N559 and N560 were constructed via site-directed mutagenesis using the primers listed in Supplementary Table 1 (Tsingke Biotechnology Company Limited, Shanghai, China). PCR mixtures (20 μL) contained 0.5 μL each of forward and reverse primers, 0.5 μL template plasmid, 0.5 μL dimethylsulfoxide (DMSO), 10 μL PrimeSTAR HS Premix (Takara Bio, Dalian, China), and 8 μL ddH₂O. Thermal cycling comprised an initial denaturation at 98 °C for 3 min, followed by 15 cycles of 98 °C for 10 s, 55 °C for 5 s, and 72 °C for 7 min, and a final extension step at 72 °C for 10 min. PCR products were digested using 1 μL DpnI (New England Biolabs, Ipswich, USA) at 37 °C for 3 h. Digestion mixtures were transformed into chemically competent E. coli BL21(DE3) cells.
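The PCR recipe and cycling program can be checked with simple arithmetic. The sketch below assumes 0.5 μL of each primer, which is the reading under which the listed components sum to the stated 20 μL; the variable names are illustrative, and the computed program time covers hold times only, ignoring temperature ramps.

```python
# Component volumes in microlitres, assuming 0.5 uL of each primer.
pcr_mix_ul = {
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "template plasmid": 0.5,
    "DMSO": 0.5,
    "PrimeSTAR HS Premix": 10.0,
    "ddH2O": 8.0,
}
total_volume_ul = sum(pcr_mix_ul.values())

# Thermal program hold times in seconds.
initial_denaturation_s = 3 * 60
cycle_steps_s = [10, 5, 7 * 60]  # 98 C, 55 C, 72 C
n_cycles = 15
final_extension_s = 10 * 60

hold_time_s = initial_denaturation_s + n_cycles * sum(cycle_steps_s) + final_extension_s

print(f"total volume: {total_volume_ul} uL")
print(f"program hold time: {hold_time_s} s ({hold_time_s / 60:.2f} min)")
```

The volumes add up to 20 μL, and the hold times add up to 7305 s, or roughly two hours before ramping is taken into account.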

Analytical-scale enzymatic reactions

Purified enzymes were used to conduct analytical-scale enzymatic reactions to quantitatively determine stereoselectivity. Each 500 μL reaction mixture contained 5 mM substrate, 1 mM NADP⁺, 30 mM D-glucose, 2 mg/mL glucose dehydrogenase (GDH) enzyme powder, 1 mg/mL purified enzyme, and 100 mM NaPi (pH 7.0). Reactions were carried out at 30 °C with shaking at 800 rpm for 20 h, and 50 μL of 10 M NaOH was used to quench the reaction. Reaction mixtures were then extracted with 500 μL methyl tert-butyl ether (MTBE) and dried over anhydrous magnesium sulfate prior to gas chromatography analysis. For substrates 1b, 1c, 1d, 1e, and 1f, normal-phase liquid chromatography was used, and the corresponding products were derivatized to determine stereoselectivity. The acetylation reaction mixture contained 500 μL extract, 0.6 μL pyridine, and 18 μL acetic anhydride, and the reaction was performed at 40 °C with shaking at 400 rpm for 4 h. The analytical methods are listed in Supplementary Information.
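Stereoselectivity from chiral chromatography is commonly expressed as enantiomeric excess (ee) computed from the peak areas of the two product enantiomers. The sketch below shows that calculation together with a simple area-based conversion estimate. The function names and peak areas are hypothetical, and equal detector response for all species is assumed; the actual analytical methods are those described in the Supplementary Information.

```python
def enantiomeric_excess(area_r: float, area_s: float) -> float:
    """Signed ee in percent: positive favours (R), negative favours (S)."""
    total = area_r + area_s
    if total <= 0:
        raise ValueError("no product peaks detected")
    return 100.0 * (area_r - area_s) / total


def conversion_percent(product_area: float, substrate_area: float) -> float:
    """Area-based conversion, assuming equal response factors (an assumption)."""
    total = product_area + substrate_area
    if total <= 0:
        raise ValueError("no peaks detected")
    return 100.0 * product_area / total


# Hypothetical peak areas for one reaction.
ee = enantiomeric_excess(area_r=95.0, area_s=5.0)
conv = conversion_percent(product_area=80.0, substrate_area=20.0)
print(f"ee = {ee:.1f}% (R), conversion = {conv:.1f}%")
```

A signed ee is convenient when comparing stereocomplementary variants, because inversion of selectivity then appears as a change of sign.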

Construction of empirical fitness landscape

Each intermediate mutant was purified to measure stereoselectivity toward model substrate 1a. To calculate each intermediate’s specific activity via conversion, the reaction times of 15 mutants were set to 3 h (N559-A119G, N559-L19M/C67S, N559-T94S/A119G, N559-L19M/T94S, N559-T94S/I120V, N559-A119G/I120V, N559-L19M/T94S/A119G, N559-L19M/C67S/T94S/ N559-L19M/T94S/I120V, N559-L19M/C67S/T94S/I120V, N559-L19M/C67S/T94S/A119G, N559-C67S/T94S/A119G/I120V, N559-L19M/C67S/A119G/I120V, N559-T94S/A119G/I120V, and N559-T94S/A119G/I120V), and the rest of the mutants were set to 10 h. The other reaction conditions were the same as in the above-mentioned analytical-scale enzymatic reactions. Data were obtained from three replicate experiments.
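The combinatorial structure of such a landscape can be sketched in a few lines of Python. The example enumerates every combination of the five substitutions named in this paragraph (a sixth is not listed here, so it is left out) and computes a simple additive pairwise epistasis term. The function names and all phenotype values are hypothetical, and the additive measure is only one of several possible definitions; it is not necessarily the one used in the paper's own epistasis analysis.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def enumerate_genotypes(mutations: List[str]) -> List[FrozenSet[str]]:
    """All subsets of the mutation list, from the ancestor (empty set) to the full mutant."""
    return [
        frozenset(combo)
        for k in range(len(mutations) + 1)
        for combo in combinations(mutations, k)
    ]


def pairwise_epistasis(
    phenotype: Dict[FrozenSet[str], float],
    background: Iterable[str],
    mut_a: str,
    mut_b: str,
) -> float:
    """Additive epistasis: f(AB) - f(A) - f(B) + f(background)."""
    bg = frozenset(background)
    return (
        phenotype[bg | {mut_a, mut_b}]
        - phenotype[bg | {mut_a}]
        - phenotype[bg | {mut_b}]
        + phenotype[bg]
    )


named_mutations = ["L19M", "C67S", "T94S", "A119G", "I120V"]
genotypes = enumerate_genotypes(named_mutations)
print(len(genotypes), "genotypes")

# Hypothetical signed ee values (%) for one pair of mutations.
toy_phenotype = {
    frozenset(): -90.0,
    frozenset({"A119G"}): -60.0,
    frozenset({"I120V"}): -80.0,
    frozenset({"A119G", "I120V"}): 10.0,
}
eps = pairwise_epistasis(toy_phenotype, [], "A119G", "I120V")
print("epistasis:", eps)
```

A non-zero value indicates that the combined effect of the two substitutions differs from the sum of their individual effects in the chosen background.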

Protein crystallization of N559 and N559-M6

Purified N559 and N559-M6 were subjected to crystallization trials using commercially available screen kits in 96-well sitting-drop format, with each drop comprising 1 μL of protein solution containing 2.5 mM NADP⁺ and 1 μL of reservoir solution. Crystallization and X-ray diffraction experiments yielded two structures: N559 in complex with NADP⁺ and N559-M6 in complex with NADP⁺. For further details see Supplementary Information. X-ray diffraction data were collected at beamlines BL19U1 and BL02U1 of the Shanghai Synchrotron Radiation Facility (SSRF).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Biochemical and X-ray crystallographic data generated in this study are provided in the Supplementary Information. The ancestral IRED sequences are available under PP329371 (N1), PP329372 (N2), PP329373 (N3), PP329374 (N4), PP329375 (N5), PP329376 (N557), PP329377 (N558), PP329378 (N559), PP329379 (N560), PP329380 (AltN1), PP329381 (AltN2), PP329382 (AltN3), PP329383 (AltN4), PP329384 (AltN5), PP329385 (AltN557), PP329386 (AltN558), PP329387 (AltN559), and PP329388 (AltN560). Crystallographic data are deposited in the Protein Data Bank. N559 and N559-M6 are available under 8JKU and 8HWY, respectively. The eight extant sequences in the study are provided in the Source Data file. The two β-hydroxyacid dehydrogenase sequences are available under 4GBJ and 3PEF. All raw data generated in this study are provided in the Supplementary Data file and Source Data file. Source data are provided with this paper.

References

Plankensteiner, K., Reiner, H. & Rode, B. M. Stereoselective differentiation in the salt-induced peptide formation reaction and its relevance for the origin of life. Peptides 26, 535–541 (2005).
Bornscheuer, U. T. The fourth wave of biocatalysis is approaching. Phil. Trans. R. Soc. A. 376, 20170063 (2018).
Strohmeier, G. A., Pichler, H., May, O. & Gruber-Khadjawi, O. Application of designed enzymes in organic synthesis. Chem. Rev. 111, 4141–4164 (2011).
Dong, Y. J. et al. Manipulating the stereoselectivity of a thermostable alcohol dehydrogenase by directed evolution for efficient asymmetric synthesis of arylpropanols. Biol. Chem. 400, 313–321 (2019).
Reetz, M. T., Wang, L. W. & Bocola, M. Directed evolution of enantioselective enzymes: iterative cycles of CASTing for probing protein-sequence space. Angew. Chem. Int. Ed. 45, 1236–1241 (2006).
Yu, S. S. et al. Inverting the enantiopreference of nitrilase‐catalyzed desymmetric hydrolysis of prochiral dinitriles by reshaping the binding pocket with a mirror‐image strategy. Angew. Chem. Int. Ed. 60, 3679–3684 (2021).
Calvó-Tusell, C., Liu, Z., Chen, K., Arnold, F. H. & Garcia-Borràs, M. Reversing the enantioselectivity of enzymatic carbene N–H insertion through mechanism-guided protein engineering. Angew. Chem. Int. Ed. 62, e202303879 (2023).
Harms, M. J. & Thornton, J. W. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat. Rev. Genet. 14, 559–571 (2013).
Hochberg, G. K. A. & Thornton, J. W. Reconstructing ancient proteins to understand the causes of structure and function. Annu. Rev. Biophys. 46, 247–269 (2017).
Dean, A. M. & Thornton, J. W. Mechanistic approaches to the study of evolution: the functional synthesis. Nat. Rev. Genet. 8, 675–688 (2007).
Gumulya, Y. & Gillam, E. M. J. Exploring the past and the future of protein evolution with ancestral sequence reconstruction: the ‘retro’ approach to protein engineering. Biochem. J. 474, 1–19 (2017).
Chiang, C. H. et al. Deciphering the evolution of flavin-dependent monooxygenase stereoselectivity using ancestral sequence reconstruction. Proc. Natl. Acad. Sci. USA 120, e2218248120 (2023).
Harms, M. J. & Thornton, J. W. Analyzing protein structure and function using ancestral gene reconstruction. Curr. Opin. Struct. Biol. 20, 360–366 (2010).
Bridgham, J. T., Carroll, S. M. & Thornton, J. W. Evolution of hormone-receptor complexity by molecular exploitation. Science 312, 97–101 (2006).
Dishman, A. F. et al. Evolution of fold switching in a metamorphic protein. Science 371, 86–90 (2021).
Starr, T. N. et al. ACE2 binding is an ancestral and evolvable trait of sarbecoviruses. Nature 603, 913–918 (2022).
Su, W. et al. Ancestral sequence reconstruction pinpoints adaptations that enable avian influenza virus transmission in pigs. Nat. Microbiol. 6, 1455–1465 (2021).
Mangas-Sanchez, J. et al. Imine reductases (IREDs). Curr. Opin. Chem. Biol. 37, 19–25 (2017).
Lenz, M., Borlinghaus, N., Weinmann, L. & Nestl, B. M. Recent advances in imine reductase-catalyzed reactions. World J. Microbiol. Biotechnol. 33, 199 (2017).
Zhang, Y. H. et al. Stereocomplementary synthesis of pharmaceutically relevant chiral 2-aryl-substituted pyrrolidines using imine reductases. Org. Lett. 22, 3367–3372 (2020).
Chen, Q. et al. Engineered imine reductase for larotrectinib intermediate manufacture. ACS Catal. 12, 14795–14803 (2022).
Chen, F. F. et al. Discovery of an imine reductase for reductive amination of carbonyl compounds with sterically challenging amines. J. Am. Chem. Soc. 145, 4015–4025 (2023).
Thorpe, T. W. et al. Multifunctional biocatalyst for conjugate reduction and reductive amination. Nature 604, 86–91 (2022).
Marshall, J. R. et al. Screening and characterization of a diverse panel of metagenomic imine reductases for biocatalytic reductive amination. Nat. Chem. 13, 140–148 (2021).
Kumar, R. et al. Biocatalytic reductive amination from discovery to commercial manufacturing applied to abrocitinib JAK1 inhibitor. Nat. Catal. 4, 775–782 (2021).
Schober, M. et al. Chiral synthesis of LSD1 inhibitor GSK2879552 enabled by directed evolution of an imine reductase. Nat. Catal. 2, 909–915 (2019).
Fademrecht, S., Scheller, P. N., Nestl, B. M., Hauer, B. & Pleiss, J. Identification of imine reductase-specific sequence motifs. Proteins 84, 600–610 (2016).
Pillai, A. S. et al. Origin of complexity in haemoglobin evolution. Nature 581, 480–485 (2020).
Wilson, C. et al. Using ancient protein kinases to unravel a modern cancer drug’s mechanism. Science 347, 882–886 (2015).
Kaltenbach, M. et al. Evolution of chalcone isomerase from a noncatalytic ancestor. Nat. Chem. Biol. 14, 548–555 (2018).
Clifton, B. E. et al. Evolution of cyclohexadienyl dehydratase from an ancestral solute-binding protein. Nat. Chem. Biol. 14, 542–547 (2018).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Liu, Y., Yan, Z. H., Lu, X. Y., Xiao, D. G. & Jiang, H. F. Improving the catalytic activity of isopentenyl phosphate kinase through protein coevolution analysis. Sci. Rep. 6, 24117 (2016).
Oteri, F., Nadalin, F., Champeimont, R. & Carbone, A. BIS2Analyzer: a server for co-evolution analysis of conserved protein families. Nucleic Acids Res. 45, W307–W314 (2017).
Graef, J., Ehrt, C. & Rarey, M. Binding site detection remastered: enabling fast, robust, and reliable binding site detection and descriptor calculation with DoGSite3. J. Chem. Inf. Model. 63, 3128–3137 (2023).
Brouillet, S., Annoni, H., Ferretti, L. & Achaz, G. MAGELLAN: a tool to explore small fitness landscapes. Preprint at https://www.biorxiv.org/content/10.1101/031583v1 (2015).
Ferretti, L. et al. Measuring epistasis in fitness landscapes: the correlation of fitness effects of mutations. J. Theor. Biol. 396, 132–143 (2016).
Sailer, Z. R. & Harms, M. J. Detecting high-order epistasis in nonlinear genotype-phenotype maps. Genetics 205, 1079–1088 (2017).
Ding, Q. et al. The evolutionary origin of naturally occurring intermolecular Diels-Alderases from Morus alba. Nat. Commun. 15, 2492 (2024).
DeMars, M. D. II & O’Connor, S. E. Evolution and diversification of carboxylesterase-like [4+2] cyclases in aspidosperma and iboga alkaloid biosynthesis. Proc. Natl. Acad. Sci. USA 121, e2318586121 (2024).
Cheng, F. et al. Controlling stereopreferences of carbonyl reductases for enantioselective synthesis of atorvastatin precursor. ACS Catal. 11, 2572–2582 (2021).
Noey, E. L. et al. Origins of stereoselectivity in evolved ketoreductases. Proc. Natl. Acad. Sci. USA 112, E7065–E7072 (2015).
Wilding, M., Hong, N., Spence, M., Buckle, A. M. & Jackson, C. J. Protein engineering: the potential of remote mutations. Biochem. Soc. Trans. 47, 701–711 (2019).
Aleku, G. A. et al. Stereoselectivity and structural characterization of an imine reductase (IRED) from Amycolatopsis orientalis. ACS Catal. 6, 3880–3889 (2016).
Weinreich, D. M., Delaney, N. F., Depristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006).
Starr, T. N. & Thornton, J. W. Exploring protein sequence-function landscapes. Nat. Biotechnol. 35, 125–126 (2017).
Yi, X. & Dean, A. M. Adaptive landscapes in the age of synthetic biology. Mol. Biol. Evol. 36, 890–907 (2019).
Meini, M. R., Tomatis, P. E., Weinreich, D. M. & Vila, A. J. Quantitative description of a protein fitness landscape based on molecular features. Mol. Biol. Evol. 32, 1774–1787 (2015).
Poelwijk, F. J., Kiviet, D. J., Weinreich, D. M. & Tans, S. J. Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445, 383–386 (2007).
de Visser, J. A. G. M. & Krug, J. Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 15, 480–490 (2014).
Nishikawa, K. K., Hoppe, N., Smith, R., Bingman, C. & Raman, S. Epistasis shapes the fitness landscape of an allosteric specificity switch. Nat. Commun. 12, 5562 (2021).
Tokuriki, N. et al. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat. Commun. 3, 1257 (2012).
Sailer, Z. R. & Harms, M. J. High-order epistasis shapes evolutionary trajectories. PLoS Comput. Biol. 13, e1005541 (2017).
Yang, G. et al. Higher-order epistasis shapes the fitness landscape of a xenobiotic-degrading enzyme. Nat. Chem. Biol. 15, 1120–1128 (2019).
Weinreich, D. M., Watson, R. A. & Chao, L. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59, 1165–1174 (2005).
Kondrashov, D. A. & Kondrashov, F. A. Topological features of rugged fitness landscapes in sequence space. Trends Genet 31, 24–33 (2015).
Breen, M. S., Kemena, C., Vlasov, P. K., Notredame, C. & Kondrashov, F. A. Epistasis as the primary factor in molecular evolution. Nature 490, 535–538 (2012).
Lunzer, M., Golding, G. B. & Dean, A. M. Pervasive cryptic epistasis in molecular evolution. PLoS Genet 6, e1001162 (2010).
Bridgham, J. T., Ortlund, E. A. & Thornton, J. W. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461, 515–519 (2009).
Starr, T. N., Picton, L. K. & Thornton, J. W. Alternative evolutionary histories in the sequence space of an ancient protein. Nature 549, 409–413 (2017).
Hopf, T. A. et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 35, 128–135 (2017).
Cadet, F. et al. A machine learning approach for reliable prediction of amino acid interactions and its application in the directed evolution of enantioselective enzymes. Sci. Rep. 8, 16757 (2018).
Hanson-Smith, V., Kolaczkowski, B. & Thornton, J. W. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty. Mol. Biol. Evol. 27, 1988–1999 (2010).
Eick, G. N., Bridgham, J. T., Anderson, D. P., Harms, M. J. & Thornton, J. W. Robustness of reconstructed ancestral protein functions to statistical uncertainty. Mol. Biol. Evol. 34, 247–261 (2017).
Jensen, R. A. Enzyme recruitment in evolution of new function. Annu. Rev. Microbiol. 30, 409–425 (1976).
Guo, F., Xu, H. M., Xu, H. N. & Yu, H. W. Compensation of the enantioselectivity-activity trade-off in the directed evolution of an esterase from Rhodobacter sphaeroides by site-directed saturation mutagenesis. Appl. Microbiol. Biotechnol. 97, 3355–3362 (2013).
Zhou, J. Y. et al. Structural insight into enantioselectivity inversion of an alcohol dehydrogenase reveals a “polar Gate” in stereorecognition of diaryl ketones. J. Am. Chem. Soc. 140, 12645–12654 (2018).
Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 20, 1160–1166 (2019).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Lefort, V., Longueville, J. E. & Gascuel, O. SMS: smart model selection in PhyML. Mol. Biol. Evol. 34, 2422–2424 (2017).
Lemoine, F. et al. Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556, 452–456 (2018).
Foley, G. et al. Engineering indel and substitution variants of diverse and ancient enzymes using graphical representation of ancestral sequence predictions (GRASP). PLoS Comput. Biol. 18, e1010633 (2022).

Acknowledgements

We thank J. Zheng (Westlake University, Hangzhou City, Zhejiang province) and G. N. Wen (Institute of Zoology, Chinese Academy of Sciences, Beijing) for critical comments and discussions; Y. Han (East China University of Science and Technology) for help during X-ray data collection; Q. J. Xiao (Shanghai Institute for Advanced Study, Chinese Academy of Sciences) and R. Liang (Huazhong Agricultural University, Wuhan city, Hubei province) for determining protein crystal structures; and three reviewers for constructive feedback on the manuscript. We are grateful to Shanghai Synchrotron Radiation Facility (SSRF) beamlines BL19U1 and BL02U1 for protein structure determination support. This work was supported by the National Key Research and Development Program of China (2019YFA09005000 to G.W.Z. and 2021YFA0911400 to F.F.C.), and the National Natural Science Foundation of China (32371547 to G.W.Z. and 22008068 to F.F.C.).

Author information

Authors and Affiliations

State Key Laboratory of Bioreactor Engineering, Shanghai Collaborative Innovation Center for Biomanufacturing, East China University of Science and Technology, Shanghai, China
Xin-Xin Zhu, Wen-Qing Zheng, Zi-Wei Xia, Xin-Ru Chen, Tian Jin, Xu-Wei Ding, Fei-Fei Chen, Qi Chen, Jian-He Xu & Gao-Wei Zheng
State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, Shanghai, China
Xu-Dong Kong

Contributions

All listed authors performed experiments and/or analyzed data. X.X.Z., X.D.K., and G.W.Z. designed the research and wrote the paper; X.X.Z., W.Q.Z., Z.W.X., X.R.C., T.J., X.W.D., and F.F.C. performed the research. X.X.Z. and X.D.K. solved the protein structure. Q.C. and J.H.X. participated in research.

Corresponding authors

Correspondence to Xu-Dong Kong or Gao-Wei Zheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhu, XX., Zheng, WQ., Xia, ZW. et al. Evolutionary insights into the stereoselectivity of imine reductases based on ancestral sequence reconstruction. Nat Commun 15, 10330 (2024). https://doi.org/10.1038/s41467-024-54613-3

Received: 30 January 2024
Accepted: 14 November 2024
Published: 28 November 2024
Version of record: 28 November 2024
DOI: https://doi.org/10.1038/s41467-024-54613-3