Engineered enzymes for enantioselective nucleophilic aromatic substitutions

Lister, Thomas M.; Roberts, George W.; Hossack, Euan J.; Zhao, Fei; Burke, Ashleigh J.; Johannissen, Linus O.; Hardy, Florence J.; Millman, Alexander A. V.; Leys, David; Larrosa, Igor; Green, Anthony P.

doi:10.1038/s41586-025-08611-0

Article
Open access
Published: 15 January 2025

Nature volume 639, pages 375–381 (2025)

Abstract

Nucleophilic aromatic substitutions (S_NAr) are among the most widely used processes in the pharmaceutical and agrochemical industries^1,2,3,4, allowing convergent assembly of complex molecules through C–C and C–X (X = O, N, S) bond formation. S_NAr reactions are typically carried out using forcing conditions, involving polar aprotic solvents, stoichiometric bases and elevated temperatures, which do not allow for control over reaction selectivity. Despite the importance of S_NAr chemistry, there are only a handful of selective catalytic methods reported that rely on small organic hydrogen-bonding or phase-transfer catalysts^{5,6,7,8,9,10,11}. Here we establish a biocatalytic approach to stereoselective S_NAr chemistry by uncovering promiscuous S_NAr activity in a designed enzyme featuring an activated arginine¹². This activity was optimized over successive rounds of directed evolution to afford an engineered biocatalyst, S_NAr1.3, that is 160-fold more efficient than the parent and promotes the coupling of electron-deficient arenes with carbon nucleophiles with near-perfect stereocontrol (>99% enantiomeric excess (e.e.)). S_NAr1.3 can operate at a rate of 0.15 s⁻¹, perform more than 4,000 turnovers and can accept a broad range of electrophilic and nucleophilic coupling partners, including those that allow construction of challenging 1,1-diaryl quaternary stereocentres. Biochemical, structural and computational studies provide insights into the catalytic mechanism of S_NAr1.3, including the emergence of a halide binding pocket shaped by key catalytic residues Arg124 and Asp125. This study brings a landmark synthetic reaction into the realm of biocatalysis to provide an efficient and versatile platform for catalytic S_NAr chemistry.

Main

Nucleophilic aromatic substitutions (S_NAr) are fundamental transformations in organic chemistry used to functionalize (hetero)aromatic rings during the synthesis of valuable molecules, including pharmaceuticals and agrochemicals^1,2. These transformations involve the coupling of electron-deficient (hetero)aryl halide electrophiles with carbon, oxygen, nitrogen or sulfur nucleophiles^3,4,13 (Fig. 1a). The modularity and operational simplicity of S_NAr reactions have led to their widespread use in the synthesis of valuable organic molecules from discovery to manufacturing scales¹⁴. However, despite their prevalence, these processes still suffer from important limitations that can be attributed to a lack of efficient and general catalysts for mediating S_NAr chemistry^{15,16,17,18,19}. As a result, established methods of performing S_NAr chemistry are incompatible with stereoselective and/or regioselective processes that are highly desirable when constructing complex molecules. To address these limitations, a small number of enantioselective S_NAr reactions have recently been developed that make use of small organic hydrogen-bonding or phase-transfer catalysts^{5,6,7,8,9,10,11}. Although impressive, the efficiency of these organocatalysts is limited and they cannot be easily adapted to operate on new classes of substrates.

**Fig. 1: S_NAr reactions and directed evolution of an enantioselective S_NAr enzyme.**

We therefore considered alternative catalytic strategies for mediating selective S_NAr chemistry that could offer enhanced efficiency and greater flexibility. To this end, our thoughts turned to biocatalysis given the impressive rate accelerations, exacting selectivities and high degree of engineerability associated with enzymes^20,21,22. Unfortunately, there are no natural enzymes known that mediate selective and convergent S_NAr chemistry. Although the hydrolytic enzymes 4-chlorobenzoyl-CoA dehalogenase^23,24, 5-nitroanthranilic acid aminohydrolase²⁵ and atrazine chlorohydrolase²⁶ are thought to operate through S_NAr-type pathways, their mechanisms involve metal–hydroxide intermediates or the hydrolysis of covalent aryl esters, meaning that these enzymes cannot be readily adapted to use nucleophiles other than water. Similarly, the promiscuous glutathione arylation activity observed with selected glutathione S-transferases probably arises from activation of the glutathione nucleophile²⁷ and is not readily adaptable to more valuable substrate classes. In the absence of suitable natural enzymes, here we adopt a ‘bottom-up’ approach to engineer efficient and enantioselective S_NAr biocatalysts.

Engineering an enantioselective S_NAr enzyme

To identify a suitable starting template for engineering S_NAr enzymes, we considered a family of Morita–Baylis–Hillman (MBH) enzymes recently engineered in our lab that harbour active-site features that could be repurposed to promote the target chemistry^12,28. These enzymes contain a flexible Arg124 residue as a hydrogen-bond donor that sits adjacent to a binding site for electron-deficient aromatic substrates. Given that hydrogen-bonding catalysts have previously been shown to accelerate S_NAr reactions⁷, we evaluated a selection of our in-house MBH enzymes for promiscuous S_NAr activity, using a small panel of activated aryl halides and carbon nucleophiles as coupling partners. From this screening, we identified the variant BH32.8 (subsequently referred to as S_NAr1.0), which promotes the coupling of ethyl 2-cyanopropionate (1) and 2,4-dinitrochlorobenzene (2) with modest conversion and stereocontrol (approximately 5% e.e.) (Fig. 1b and Supplementary Fig. 1), as a promising candidate for S_NArase engineering. This reaction leads to the generation of product 3 containing an acyclic quaternary carbon stereocentre, a common functional motif in complex organic molecules that is challenging to construct in a stereocontrolled fashion²⁹. Furthermore, α-cyano esters serve as precursors to useful chiral motifs, including β-amino acids³⁰, β-lactams³¹ and oxindoles⁵.

To improve activity and selectivity, S_NAr1.0 was subjected to successive rounds of laboratory evolution (Fig. 1c and Extended Data Fig. 1). In total, 41 residues, located within the putative active site and secondary coordination sphere, were individually randomized using NNK degenerate codons. Individual library variants were arrayed in 96-well plates and evaluated as clarified cell lysate using an ultra performance liquid chromatography (UPLC) assay monitoring the conversion of 1 and 2 to 3 (Supplementary Fig. 2). The most active (about 1%) clones from each round were selected and rescreened as purified proteins. Beneficial mutations identified in each round were subsequently combined by DNA shuffling.

Following the evaluation of approximately 4,000 clones, an S_NAr1.3 variant emerged containing six mutations (Fig. 1c,e and Extended Data Fig. 1). Notably, during evolution, His23, which is a key catalytic nucleophile in MBH catalysis, was mutated, excluding the possibility of this residue promoting S_NAr chemistry through nucleophilic catalysis³². Under assay conditions used during evolution, S_NAr1.3 affords 3 as the sole product with 93% conversion, compared with 3% conversion using S_NAr1.0. This improvement in catalytic performance also correlates with improvements in enantioselectivity, with S_NAr1.3 delivering the (R)-enantiomer of 3 in 96% e.e. compared with the modest 5% e.e. observed with the parent template. The absolute configuration of 3 was assigned by X-ray diffraction of optically enriched 3 obtained from a preparative-scale biotransformation (51 mg of 2, 90% conversion, 70% isolated yield, 98% e.e. after recrystallization; Supplementary Fig. 3 and Supplementary Table 11). To further quantify the improvements in catalytic performance following evolution, we performed more detailed kinetic analysis. Assays performed at a fixed concentration of 2 (2.5 mM) and variable concentrations of 1 reveal a substantial 160-fold improvement in k_obs (0.0040 ± 0.0002 and 0.65 ± 0.01 min⁻¹ for S_NAr1.0 and S_NAr1.3, respectively) with minimal changes in K_M1 (6.8 ± 0.2 and 7.7 ± 0.2 mM for S_NAr1.3 and S_NAr1.0, respectively; Extended Data Fig. 2). Subsequent assays performed under saturating concentrations of 1 also reveal a 160-fold enhancement in k_cat/K_M2 (0.0030 ± 0.0002 and 0.48 ± 0.01 min⁻¹ mM⁻¹ for S_NAr1.0 and S_NAr1.3, respectively; Extended Data Fig. 2). Notably, switching from phosphate-buffered saline (PBS) buffer (10 mM Na₂HPO₄, 1.8 mM KH₂PO₄, 137 mM NaCl, 2.7 mM KCl) to sodium phosphate (46.4 mM Na₂HPO₄, 3.6 mM NaH₂PO₄) leads to a further threefold increase in S_NAr1.3 activity (Extended Data Fig. 3), which can be attributed to enzyme inhibition at elevated concentrations of chloride (IC₅₀ = 147 ± 32 mM; Extended Data Fig. 4a). Notably, iodide was found to be a more potent inhibitor of S_NAr1.3, with an IC₅₀ value of 1.20 ± 0.05 mM (Extended Data Fig. 4b).
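
The fold improvements quoted for the evolved enzyme follow from simple ratios of the reported kinetic parameters. As a minimal sketch (the helper name `fold_change` is an illustrative assumption, not part of the original analysis), the arithmetic can be checked as follows:

```python
def fold_change(evolved: float, parent: float) -> float:
    """Ratio of an evolved-variant kinetic parameter to the parent value."""
    if parent <= 0:
        raise ValueError("parent value must be positive")
    return evolved / parent


# Mean values quoted in the text for S_NAr1.0 (parent) and S_NAr1.3 (evolved).
k_obs_fold = fold_change(0.65, 0.0040)      # k_obs, min^-1
kcat_km2_fold = fold_change(0.48, 0.0030)   # k_cat/K_M2, min^-1 mM^-1

print(f"k_obs fold improvement: {k_obs_fold:.1f}")
print(f"k_cat/K_M2 fold improvement: {kcat_km2_fold:.1f}")
```

Both ratios come out close to 160, in line with the rounded figure given in the text. The sketch uses the mean values only and ignores the reported uncertainties.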

We next explored the effect of varying the halide leaving group on S_NAr1.3 activity and selectivity. Notably, despite performing evolution with aryl chloride 2, enzyme activity is improved by 3.8-fold and 8.6-fold using the bromide-containing (4) and iodide-containing (5) analogues of 2, respectively (3.67 ± 0.04 and 8.34 ± 0.11 mM⁻¹ min⁻¹) (Fig. 2b). This trend differs from that observed in the analogous uncatalysed background reactions, in which reactions with 5 are markedly slower than with 2 or 4. Using its preferred aryl iodide substrate 5, S_NAr1.3 can operate at a rate of 8.81 ± 0.11 min⁻¹ (using 1 mM of 5; Fig. 2b) and affords product 3 in greater than 99% e.e. (Supplementary Fig. 4b). Furthermore, the enzyme is able to achieve more than 4,000 turnovers (Fig. 2c and Supplementary Fig. 4a). To demonstrate synthetic utility, we performed a preparative-scale biotransformation to produce 150 mg of (R)-3 (>99% conversion, 91% isolated yield, 99% e.e.) using only 0.5 mol% of S_NAr1.3 (Supplementary Fig. 5). We also explored the potential of S_NAr1.3 to discriminate between regioisomeric aryl halide substrates. As expected, the enzyme shows no observable activity towards 3,5-dinitrobromobenzene, which is typically poorly reactive as an S_NAr substrate. By contrast, S_NAr1.3 promotes the coupling of 1 with 2,6-dinitrochlorobenzene (6) with high levels of stereocontrol (99% e.e.; Supplementary Fig. 6b). However, activity towards this substrate is approximately 170-fold lower than with the 2,4-dinitrochlorobenzene regioisomer 2 used for enzyme engineering (Supplementary Fig. 6c). The regioselective nature of this S_NAr process is further demonstrated through the reaction of 1 with an equimolar mixture of 2 and 6 as substrates, affording product 3 with high yield (97%) and regioselectivity (r.r. 71:1; Supplementary Fig. 6d).
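
Two of the figures in this paragraph can be related by simple arithmetic. The sketch below assumes that the 8.81 figure is a turnover frequency in min⁻¹ (consistent with the 8.8 min⁻¹ quoted in the mechanistic section and the 0.15 s⁻¹ in the abstract). It also assumes that turnovers in the preparative reaction can be estimated as conversion divided by catalyst loading. The function names are illustrative only.

```python
SECONDS_PER_MINUTE = 60.0


def per_min_to_per_s(rate_per_min: float) -> float:
    """Convert a first-order rate from min^-1 to s^-1."""
    return rate_per_min / SECONDS_PER_MINUTE


def turnovers(conversion: float, loading_mol_percent: float) -> float:
    """Estimate turnovers per enzyme as (fraction converted) / (mol enzyme per mol substrate)."""
    if loading_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return conversion / (loading_mol_percent / 100.0)


# Rate with aryl iodide 5 (assumed unit: min^-1).
rate_per_s = per_min_to_per_s(8.81)

# Preparative run: >99% conversion at 0.5 mol% enzyme; 0.99 is used as a lower bound.
prep_turnovers = turnovers(0.99, 0.5)

print(f"rate: {rate_per_s:.3f} s^-1")
print(f"preparative-scale turnovers (lower bound): {prep_turnovers:.0f}")
```

The preparative-scale estimate is distinct from the total turnover number of more than 4,000, which was measured in a separate experiment (Fig. 2c).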

**Fig. 2: Impact of the halide leaving group on S_NAr1.3 activity.**

To evaluate the range of transformations accessible with S_NAr1.3, we explored the scope towards diverse electrophile and nucleophile coupling partners (Fig. 3a and Supplementary Fig. 7; see Supplementary Table 1 for further details). Notably, the enzyme tolerates a wide range of nitroarene substrates, including those containing nitrile, trifluoromethyl, ester, ketone and sulfone substituents, as well as pyridine rings (Fig. 3a) to afford S_NAr products with good to excellent e.e. In all cases, the desired S_NAr adducts were formed exclusively, with no side products observed. For selected transformations (those leading to products 8, 9 or 15), we assessed S_NAr variants from across the evolutionary trajectory. In all cases, S_NAr1.3 proved to be the most active and selective biocatalyst, suggesting that the mutations installed during evolution have led to general improvements in S_NArase performance on aromatic substrates with different substituent patterns and halide leaving groups (Supplementary Fig. 8). For the synthesis of 8, we compared the performance of S_NAr1.3 with aromatic precursors containing different halide leaving groups and observed a reactivity order of F > I > Br > Cl with minimal changes in reaction selectivity (Supplementary Fig. 9). These observations suggest that the intrinsically higher reactivity of aryl fluoride electrophiles overrides the preference of S_NAr1.3 for larger halide leaving groups. As well as its broad electrophile substrate scope, S_NAr1.3 also tolerates a variety of carbon nucleophiles. Analogues of 1 containing larger ester groups (17, 18, 19), amide motifs (20) and 2-alkyl substituents (21, 22) are well tolerated. The enzyme also accepts a cyclic β-ketoester as a substrate to afford the C-arylated species 24 as the sole product with high conversion and selectivity (Supplementary Fig. 10). This is in contrast with the mixtures of O-arylated and C-arylated products generated in analogous chemical transformations using stoichiometric base or small organic catalysts⁵. Beyond synthesis of all-carbon quaternary stereocentres, S_NAr1.3 also promotes the formation of optically enriched nitrogen-containing quaternary stereocentres (23) using ethyl 2-nitropropionate as a nucleophile. The resulting α-nitro ester products can be elaborated into valuable chiral motifs, including α-amino acids³³ and α-tertiary amines³⁴. Furthermore, S_NAr1.3 can be used for C–O bond construction using phenols or activated alcohols as nucleophiles (pK_a < 12.4) to generate biaryl ethers (26) or aryl alkyl ethers (27 and 28), respectively.

**Fig. 3: Substrate scope of S_NAr1.3.**

Finally, we recognized the potential to apply our S_NAr biocatalysts for the construction of 1,1-diaryl quaternary motifs, a common structural feature in bioactive molecules^35,36 that is challenging to synthesize in a stereocontrolled manner³⁷. To explore this possibility, we evaluated a selection of our S_NAr variants as biocatalysts for the conversion of 2 and ethyl 2-cyano-2-phenylacetate (29) to product 30 (Fig. 3b and Supplementary Fig. 11). The engineered S_NAr1.2 variant was able to promote this transformation with modest conversion (27%) and stereocontrol (46% e.e.). To enhance activity and selectivity, we subjected S_NAr1.2 to an extra round of directed evolution (Extended Data Fig. 5) to afford a double mutant (S_NAr_Ph1.0), which is threefold more active than the parent template and produces 30 in 84% e.e. Reaction conversion and selectivity can be further improved using aryl iodide 5 as a substrate in place of aryl chloride 2, with product 30 formed in 96% conversion and 87% e.e. using 5 (Fig. 3b). This enzyme is also able to produce the 1,1-di(hetero)arylated products 31 and 32 (Fig. 3b and Supplementary Table 2 for further details), albeit with reduced selectivity, suggesting that S_NAr_Ph1.0 will serve as a valuable template for engineering biocatalysts for the stereocontrolled synthesis of diverse 1,1-diarylated products (Supplementary Fig. 12).

S_NAr1.3 structure and mechanism

To gain insights into the S_NAr1.3 catalytic mechanism and the origins of enhanced performance across evolution, a series of biochemical, structural and computational studies were undertaken. We first considered the possibility that S_NAr catalysis could proceed through the formation of enzyme–substrate covalent intermediates, as proposed for 4-chlorobenzoyl-CoA dehalogenase²⁴. To explore this hypothesis, we incubated S_NAr1.3 and selected variants with aryl halide 5 in the absence of a nucleophilic coupling partner and monitored both halide release³⁸ and changes in protein mass over time (Supplementary Fig. 13). We note that no phenolic product arising from aryl-halide hydrolysis is observed under the assay conditions, meaning that we would expect any covalent intermediates to accumulate. Using either assay, there is no evidence for the formation of covalent adducts over catalytically relevant time frames (rate of 8.8 min⁻¹), suggesting that aryl-enzyme intermediates are unlikely to be involved in the S_NAr1.3 catalytic mechanism. We note that the Cys96 residue, which was introduced in the final round of engineering and gave a modest 2.4-fold activity increase, undergoes slow arylation on incubation of S_NAr1.3 with electrophile 5, with approximately 60% of the protein modified after 10 min (Supplementary Fig. 13). A Cys96Gln mutation in S_NAr1.3 leads to a modest 2.7-fold activity reduction, showing that Cys96 is beneficial but not critical to S_NAr catalysis (Supplementary Fig. 16).

To further investigate the mechanism, an X-ray crystal structure of S_NAr1.3 was solved to a resolution of 1.8 Å (Supplementary Table 10). We facilitated structural analysis by a K39A mutation at the protein surface, which has negligible impact on activity (Supplementary Fig. 14) and improves data resolution. The structure superimposes well with the S_NAr1.0 starting template used for directed evolution, with minimal changes to the overall protein fold (root mean square deviation of 0.89 Å). Given the importance of halide binding cavities in natural dehalogenases³⁹, S_NAr1.3 crystals were soaked with 100 mM KI before freezing. The resulting structure reveals two internal iodide binding sites (Extended Data Fig. 6), with the most occupied site (about 85%) shaped by Met64, Arg65, Arg124, Asp125 and Pro128 (Fig. 4a). Notably, Arg and Asp residues are a common feature of halide binding sites in natural proteins, despite the latter being negatively charged⁴⁰. To explore the potential importance of the halide binding site, S_NAr1.3 crystals were soaked with substrate 5. The corresponding structure revealed two distinct anomalous signals in close proximity, associated with the iodide substituent, suggesting that the substrate binds with considerable conformational heterogeneity. Notably, the major 5 pose places the halide substituent directly adjacent to the aforementioned halide binding cavity (Fig. 4b and Extended Data Fig. 7). In this pose, there is a vacant cavity below the aromatic substrate that can likely accommodate nucleophilic coupling partners (Extended Data Fig. 7 and Supplementary Fig. 15).

**Fig. 4: Structural and mechanistic studies of S_NAr1.3.**

Guided by the structural analysis, we performed site-directed mutagenesis of residues lining the halide binding site (Extended Data Fig. 8 and Supplementary Fig. 16). R124A and D125N/A mutations led to substantial 180-fold and 68-fold/22-fold reductions in rate, respectively, with more modest 5.4-fold and 10-fold rate reductions observed with M64A and R65A (Extended Data Fig. 8 and Supplementary Fig. 16). These assays further underscore the importance of the halide binding motif to efficient catalysis, although the large contribution made by Arg124 could also be ascribed to its role in nucleophile activation (vide infra). Comparison of the S_NAr1.0 and S_NAr1.3 structures shows how the halide binding pocket has been modulated through directed evolution. In particular, the W88R mutation results in repositioning of Arg65 to optimize its electrostatic interactions with Asp125 (Fig. 4c). Substitution of Arg88 by alanine leads to a 24-fold reduction in rate (Extended Data Fig. 8 and Supplementary Fig. 16), highlighting the importance of this extended polar network. Notably, although high levels of selectivity were preserved with the M64A, R65A and D125A halide cavity mutations, the R124A mutation led to a substantial loss of enantioselectivity (Supplementary Fig. 17), suggesting that Arg124 may also play a role in positioning and/or activating the nucleophile 1 for selective catalysis (Supplementary Fig. 18). This hypothesis is supported by molecular dynamics simulations that reveal productive conformations with the enolate of 1 forming hydrogen bonding interactions with Arg124 (Supplementary Fig. 19). In these simulations, a bridging water between the enolate and Arg124 is present in 48% of the structures and a direct Arg124-enolate hydrogen bond is observed in 28% of the frames.

Conclusion

In summary, we have established a biocatalytic solution to S_NAr chemistry, one of the most important classes of transformations in the chemical industry. Here we have focused on the development of enzymes to enable stereocontrolled construction of carbon quaternary centres. Notably, we have shown that our methods can be extended to the synthesis of nitrogen-containing quaternary stereocentres, the construction of C–O bonds and to regioselective S_NAr processes, highlighting the broad synthetic utility of our engineered biocatalysts.

The S_NAr enzymes developed in this study already show impressive activities, selectivities and substrate scope, despite only sampling about 4,000 variants across the evolutionary trajectory. Deeper exploration of protein sequence space will undoubtedly deliver more potent S_NAr biocatalysts in the future, including those that operate on poorly activated electrophile and nucleophile coupling partners. Crucially, the structural and mechanistic studies described herein provide insights into the active-site features of S_NAr1.3 responsible for efficient and selective catalysis. This analysis provides an important blueprint for the de novo design of customized S_NAr biocatalysts with active-site geometries and arrangements of functional components required for a target transformation⁴¹. By combining modern protein design methods^42,43,44 with high-throughput laboratory evolution, we are optimistic about the prospects of developing biocatalysts for a wide variety of valuable S_NAr processes, including those that are beyond the reach of existing methodologies.

Methods

Materials

All chemicals and biological materials were obtained from commercial suppliers. Lysozyme, DNase I and kanamycin were purchased from Sigma-Aldrich; polymyxin B sulfate from Apollo Scientific; LB agar, 2×YT media and l-arabinose from Formedium; Escherichia coli 5α, Q5 DNA polymerase, T4 DNA ligase and restriction enzymes from New England Biolabs; and oligonucleotides were synthesized by Integrated DNA Technologies.

pBbE8k_S_NAr constructs

The construction of the vector pBbE8k_S_NAr1.0 (pBbE8k_BH32.8) is described elsewhere¹². The gene was subcloned using NdeI and XhoI restriction sites into a pBbE8k vector, modified to include a 6×His tag following the XhoI restriction site.

Protein expression and purification

For expression of S_NAr1.0 and variants, chemically competent E. coli 5α cells were transformed with the requisite pBbE8k_S_NAr construct. Single colonies of freshly transformed cells were cultured (18 h at 37 °C, 200 r.p.m.) in 2×YT medium (5 ml) containing kanamycin sulfate (25 µg ml⁻¹). Starter cultures (500 µl) were used to inoculate 2×YT medium (50 ml) supplemented with kanamycin sulfate (25 µg ml⁻¹). Cultures were grown (37 °C, 200 r.p.m.) to an optical density at 600 nm (OD₆₀₀) of about 0.6. Protein expression was induced with the addition of l-arabinose (10 mM final concentration). Induced cultures were incubated (20 h at 25 °C) and the cells were subsequently collected by centrifugation (3,220 × g for 10 min). Pelleted cells were resuspended in lysis buffer (50 mM HEPES, 300 mM NaCl, 20 mM imidazole, pH 7.5) and lysed by sonication. Cell lysates were cleared by centrifugation (27,216 × g for 30 min) and supernatants were subjected to affinity chromatography using Ni-NTA agarose (QIAGEN). Purified protein was eluted using elution buffer (50 mM HEPES, 300 mM NaCl, 250 mM imidazole, pH 7.5). Proteins were desalted using 10DG desalting columns (Bio-Rad) with the requisite storage buffer and analysed by sodium dodecyl sulfate–polyacrylamide gel electrophoresis. Proteins were aliquoted, flash-frozen in liquid nitrogen and stored at −80 °C. Protein concentrations were determined by measuring the absorbance at 280 nm using calculated extinction coefficients (ExPASy ProtParam); extinction coefficient of 27,390 M⁻¹ cm⁻¹ for S_NAr1.0 and 21,890 M⁻¹ cm⁻¹ for S_NAr1.1 to 1.3 and S_NAr_Ph1.0.
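
Protein concentration from absorbance at 280 nm follows the Beer–Lambert law, c = A / (ε·l). The sketch below uses the extinction coefficients quoted above. The absorbance reading, the 1 cm path length and the function name `concentration_uM` are hypothetical values chosen for illustration.

```python
# Extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
EXTINCTION_COEFFICIENTS = {
    "S_NAr1.0": 27390.0,
    "S_NAr1.3": 21890.0,
}


def concentration_uM(a280: float, variant: str,
                     path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Beer-Lambert estimate of protein concentration in micromolar.

    a280     -- measured absorbance of the (possibly diluted) sample
    path_cm  -- optical path length in cm (assumed 1 cm here)
    dilution -- dilution factor applied before the measurement
    """
    epsilon = EXTINCTION_COEFFICIENTS[variant]
    molar = a280 * dilution / (epsilon * path_cm)
    return molar * 1e6


# Hypothetical reading: A280 = 0.50 for undiluted S_NAr1.3 in a 1 cm cuvette.
example = concentration_uM(0.50, "S_NAr1.3")
print(f"{example:.1f} uM")
```

The difference of 5,500 M⁻¹ cm⁻¹ between the two coefficients matches the loss of one tryptophan residue, which is consistent with the W88R mutation described in the main text.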

Protein mass spectrometry

Purified protein samples were buffer-exchanged into 0.1% acetic acid using a 10 k MWCO Vivaspin unit (Sartorius) and diluted to a final concentration of 0.5 mg ml⁻¹. Mass spectrometry was performed using a 1200 series Agilent LC system, with a 5 µl injection into 5% acetonitrile (with 0.1% formic acid) and desalted inline for 1 min. Protein was eluted over 1 min using 95% MeCN with 5% H₂O. The resulting multiply charged spectrum was analysed using an Agilent 6510 Q-TOF instrument and deconvoluted using Agilent MassHunter software.

To prepare protein with the Cys96 residue arylated (Supplementary Fig. 13c), S_NAr1.3 (200 µM final concentration) was incubated with 5 (1 mM) for 1 h at 30 °C. As a control, an S_NAr1.3 C96A variant was incubated under the same conditions. Samples were then characterized by mass spectrometry as described above.

Library construction

Rounds 1, 2 and 3: saturation mutagenesis. Positions were individually randomized using degenerate NNK codons. DNA libraries were constructed by overlap extension polymerase chain reaction (PCR). Primers for library generation are given in Supplementary Table 8. Assembled genes and pBbE8k vector were digested using NdeI and XhoI endonucleases, gel-purified and subsequently ligated using T4 DNA ligase in a 4:1 ratio, respectively. Ligations were transformed into E. coli 5α cells, the resulting colonies were pooled and plasmid DNA was extracted using a Miniprep Kit (QIAGEN) to yield plasmid DNA for each library. Sequencing was performed by Source BioScience.
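
For readers unfamiliar with degenerate codons, the sketch below expands NNK (N = A/C/G/T, K = G/T) against the standard genetic code. It shows how many codons, amino acids and stop codons a single randomized position can sample. This is an illustration of the codon scheme only and not a reconstruction of the primer design in Supplementary Table 8; the variable and function names are assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC ambiguity symbols needed for NNK.
IUPAC = {"N": "ACGT", "K": "GT"}


def expand(degenerate: str) -> list[str]:
    """Return every concrete codon encoded by a degenerate codon string."""
    pools = [IUPAC.get(symbol, symbol) for symbol in degenerate]
    return ["".join(codon) for codon in product(*pools)]


nnk_codons = expand("NNK")
amino_acids = sorted({CODON_TABLE[c] for c in nnk_codons} - {"*"})
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons), "codons")
print(len(amino_acids), "amino acids")
print("stop codons:", stop_codons)
```

NNK therefore covers all 20 canonical amino acids with 32 codons while retaining only one stop codon (TAG), which is why it is a common choice for single-site saturation libraries.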

Shuffling by overlap extension PCR

After each round of screening, beneficial mutations were combined by DNA shuffling of fragments generated by overlap extension PCR. Primers were designed that encoded either the parent amino acid or the identified mutation. These primers were used to generate short fragments that were gel-purified and mixed for assembly of the full-length gene by overlap extension PCR. Final full-length genes contain all possible combinations of mutations at specified positions. Genes were cloned as described above.
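
The size of a shuffled library grows as the product of the number of options at each position. The sketch below enumerates the combinations for three positions that each carry either the parent residue or one beneficial mutation. The positions and residues are placeholders for illustration and are not the mutations identified in this study.

```python
from itertools import product

# Hypothetical positions, each with (parent residue, beneficial mutation).
options = {
    10: ("A", "G"),
    20: ("L", "F"),
    30: ("S", "T"),
}


def enumerate_variants(site_options: dict[int, tuple[str, ...]]) -> list[dict[int, str]]:
    """List every combination of residues across the specified positions."""
    positions = sorted(site_options)
    pools = (site_options[p] for p in positions)
    return [dict(zip(positions, combo)) for combo in product(*pools)]


variants = enumerate_variants(options)
print(len(variants), "combinations")
for variant in variants[:3]:
    print(variant)
```

With two options at each of n positions the library contains 2ⁿ members, so the number of clones needed to sample it grows quickly as more mutations are shuffled together.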

Library screening

For protein expression and screening, all transfer and aliquoting steps were performed using a Hamilton liquid-handling robot. Chemically competent E. coli 5α cells were transformed with the appropriate library plasmids. Freshly transformed colonies were used to inoculate 2×YT medium (150 μl) supplemented with kanamycin sulfate (25 μg ml⁻¹) in Corning Costar 96-well microtitre round-bottom plates. Each plate also contained six freshly transformed clones of the parent template and two clones of pBbE8k_RFP as an internal reference. Plates were incubated overnight (30 °C, 80% humidity, 850 r.p.m.), then an aliquot of overnight culture (20 µl) was used to inoculate 2×YT medium (480 μl) supplemented with kanamycin sulfate (25 μg ml⁻¹). The cultures were incubated (30 °C, 80% humidity, 850 r.p.m.) until an OD₆₀₀ of about 0.6 was reached and l-arabinose was added (10 mM final concentration). Induced plates were incubated (20 h, 30 °C, 80% humidity, 850 r.p.m.). Cells were collected by centrifugation (2,900 × g for 10 min). The supernatant was discarded and the pelleted cells were resuspended in lysis buffer (400 μl: PBS buffer at requisite pH supplemented with lysozyme (1.0 mg ml⁻¹), polymyxin B (0.5 mg ml⁻¹) and DNase I (10 μg ml⁻¹)) and incubated (2 h, 30 °C, 80% humidity) with shaking (850 r.p.m.). Cell debris was removed by centrifugation (2,900 × g, 10 min).
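
The plate layout above leaves 88 wells for library clones once the six parent and two RFP controls are accounted for. As a rough guide to sampling depth, the sketch below estimates the chance that a given NNK codon appears at least once among n clones. It assumes that all 32 codons are equally represented and that one plate is screened per randomized position; neither assumption is stated in the source.

```python
WELLS_PER_PLATE = 96
PARENT_CONTROLS = 6
RFP_CONTROLS = 2


def codon_coverage(n_clones: int, n_codons: int = 32) -> float:
    """Probability that one specific codon is sampled at least once.

    Assumes each clone carries one of n_codons codons with equal probability.
    """
    return 1.0 - (1.0 - 1.0 / n_codons) ** n_clones


library_wells = WELLS_PER_PLATE - PARENT_CONTROLS - RFP_CONTROLS
coverage = codon_coverage(library_wells)

print(f"library wells per plate: {library_wells}")
print(f"chance of sampling a given codon: {coverage:.1%}")
```

Under these assumptions a single plate samples any given codon with a probability of roughly 94%. Real libraries deviate from this figure because of primer bias and uneven transformation.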

Rounds 1 and 2

Clarified lysate (50 μl) was added to a 96-well microtitre plate and then the reaction initiated by the addition of assay mix (50 μl) containing 2,4-dinitrochlorobenzene (2.5 mM final concentration), ethyl-2-cyanopropionate (25 mM final concentration) in PBS pH 8.0 with dimethyl sulfoxide (DMSO) (10% v/v final concentration). Reactions were heat-sealed and incubated overnight (30 °C, 850 r.p.m., 80% humidity) and then quenched with the addition of MeCN (100 μl), heat-sealed and incubated (1 h, 30 °C, 80% humidity, 850 r.p.m.). Precipitated proteins were removed by centrifugation (2,900 × g, 10 min). An aliquot of the clarified, quenched reaction mixture (100 µl) was transferred to a Greiner 96-well polypropylene microtitre plate and heat-sealed with pierceable foil. Reactions were evaluated by high-performance liquid chromatography (HPLC) analysis as described below.

Round 3

Clarified lysate (50 μl) was added to a 96-well microtitre plate and then the reaction initiated by the addition of assay mix (50 μl) containing 2,4-dinitrochlorobenzene (2.5 mM final concentration), ethyl-2-cyanopropionate (5 mM final concentration) in PBS pH 8.0 with DMSO (10% v/v final concentration). Reactions were heat-sealed and incubated overnight (30 °C, 850 r.p.m., 80% humidity) and then quenched with the addition of MeCN (100 μl), heat-sealed and incubated (1 h, 30 °C, 80% humidity, 850 r.p.m.). Precipitated proteins were removed by centrifugation (2,900 × g, 10 min). An aliquot of the clarified, quenched reaction mixture (100 µl) was transferred to a Greiner 96-well polypropylene microtitre plate and heat-sealed with pierceable foil. Reactions were evaluated by HPLC analysis as described below.

S_NAr_Ph1.0 evolution

Clarified lysate (50 μl) was added to a 96-well microtitre plate and then the reaction initiated by the addition of assay mix (50 μl) containing 2,4-dinitrochlorobenzene (1.0 mM final concentration), ethyl 2-cyano-2-phenylacetate (2.0 mM final concentration) in PBS pH 6.0 with DMSO (20% v/v final concentration). Reactions were heat-sealed and incubated overnight (30 °C, 850 r.p.m., 80% humidity) and then quenched with the addition of MeCN (100 μl), heat-sealed and incubated (1 h, 30 °C, 80% humidity, 850 r.p.m.). Precipitated proteins were removed by centrifugation (2,900 × g, 10 min). An aliquot of the clarified, quenched reaction mixture (100 µl) was transferred to a Greiner 96-well polypropylene microtitre plate and heat-sealed with pierceable foil. Reactions were evaluated by HPLC analysis as described below.

Following each round, the most active variants (about 1%) were rescreened as purified proteins using the HPLC assay. Proteins were produced and purified as described above. However, starter cultures were inoculated from glycerol stocks prepared from the original overnight cultures.

General procedure for analytical-scale biotransformations

Analytical-scale biotransformations (typically 100 μl) were performed in 96-well microtitre plates or microcentrifuge tubes (1.5 ml) using 1 (10, 15 or 25 mM), 2, 4 or 5 (1.0, 1.5 or 2.5 mM) and the relevant S_NAr variant at the specified concentration in PBS or NaP_i (pH 8.0) with 10% v/v DMSO co-solvent at 30 °C for the specified time period. For analysis of conversion by reverse-phase UPLC, reactions were quenched with MeCN (1 volume). Quenched reactions were shaken (850 r.p.m.) for 30 min. Precipitated protein was removed by centrifugation (2,900 × g or 14,000 × g for 15 min, for reactions in 96-well microtitre plates or microcentrifuge tubes, respectively) and supernatants were transferred to a fresh plate for UPLC analysis (see ‘Chromatographic analysis’ section). For normal-phase chiral HPLC analysis, the substrates and products were extracted with methyl tert-butyl ether (MTBE; 2 volumes). Precipitated protein was removed by centrifugation (14,000 × g for 10 min), the organic phase was obtained and directly injected onto the normal-phase HPLC.

General procedure for substrate scope biotransformations

Analytical-scale biotransformations (typically 100 μl) for the substrate profile (Fig. 4) were performed in 96-well microtitre plates or microcentrifuge tubes (1.5 ml) using the specified electrophile and nucleophile with either S_NAr1.3 (in NaP_i pH 8.0 with 10% v/v DMSO as co-solvent) or S_NAr_Ph1.0 (in PBS pH 6.0 with 20% v/v DMSO as co-solvent) at 30 °C (Supplementary Tables 1 and 2). For analysis of conversion by reverse-phase UPLC, reactions were quenched with MeCN (1 volume). Quenched reactions were shaken (850 r.p.m.) for 30 min. Precipitated protein was removed by centrifugation (2,900 × g or 14,000 × g for 15 min, for reactions in 96-well microtitre plates or microcentrifuge tubes, respectively) and supernatants were transferred to a fresh plate for UPLC analysis (see ‘Chromatographic analysis’ section). For normal-phase chiral HPLC analysis, the substrates and products were extracted with MTBE (2 volumes). Precipitated protein was removed by centrifugation (14,000 × g for 10 min), the organic phase was separated and directly injected onto the normal-phase HPLC.

Competition experiment between regioisomers 2 and 6

The biotransformations (500 μl) were performed in microcentrifuge tubes (1.5 ml) using 1 (10 mM), 2 and 6 (1.0 mM) and S_NAr1.3 (50 μM) in NaP_i (pH 8.0) with 10% v/v DMSO co-solvent at 30 °C for 3 h. For analysis of conversion by reverse-phase UPLC, the reactions were quenched with MeCN (1 volume). Quenched reactions were shaken (850 r.p.m.) for 30 min. Precipitated protein was removed by centrifugation (14,000 × g) for 15 min and the supernatant was transferred to a fresh plate for UPLC analysis (see ‘Chromatographic analysis’ section).

Chromatographic analysis

We performed UPLC analysis on a 1290 Infinity II LC system (Agilent) with a Kinetex 5 µm XB-C18 100 Å LC Column, 50 × 2.1 mm (Phenomenex). The separation methods for all substrates/products and extinction coefficients used to calculate the conversion are reported in Supplementary Tables 3 and 4. Analytical-scale biotransformations, substrate standards and chemically synthesized product standards (3 to 32) were characterized using an injection volume of 3 μl and eluted over 15 min using a gradient of 5–95% MeCN in MQ H₂O with 0.1% TFA at 1 ml min⁻¹. Peaks were assigned by comparison with chemically synthesized standards and the peak areas were integrated using Agilent’s OpenLab software. We performed chiral analysis using an HPLC 1260 system (Agilent). For all products, the major stereoisomer formed in the biotransformations was assigned by analogy to S_NAr1.3-derived (R)-3. Peaks were assigned by comparison with chemically synthesized standards and peak areas were integrated using Agilent’s OpenLab software. Separation methods for all product enantiomers used to determine e.e. are reported in Supplementary Table 5.
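As an illustration of how integrated peak areas are commonly turned into conversion and e.e. values, the Python sketch below corrects substrate and product areas by their extinction coefficients and applies the standard enantiomeric excess formula. The exact calculation used in this work is defined by the Supplementary Tables, so the formula for conversion here is an assumption, and all numerical inputs are made up for the example.

```python
def conversion_percent(area_substrate, eps_substrate, area_product, eps_product):
    """Conversion from peak areas, each corrected by its extinction coefficient.

    Assumes substrate and product are the only species that need to be counted.
    """
    substrate = area_substrate / eps_substrate
    product = area_product / eps_product
    return 100.0 * product / (substrate + product)


def ee_percent(area_major, area_minor):
    """Enantiomeric excess from the two enantiomer peak areas on a chiral column."""
    return 100.0 * (area_major - area_minor) / (area_major + area_minor)


# Hypothetical peak areas and extinction coefficients, for illustration only
print(conversion_percent(area_substrate=120.0, eps_substrate=1.0,
                         area_product=360.0, eps_product=1.5))
print(ee_percent(area_major=99.5, area_minor=0.5))
```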

Kinetic characterization

Initial velocity (v₀) versus [ethyl 2-cyanopropionate] kinetic data were measured using His₆-tagged purified S_NAr1.0 (125 μM) and S_NAr1.3 (10 μΜ), a fixed concentration of 2 (2.5 mM) and varying concentrations of 1 (3.5–75.0 mM). Reactions were performed in PBS or NaP_i pH 8.0 with 10% v/v DMSO co-solvent and were incubated at 30 °C with shaking (850 r.p.m.). S_NAr1.0 was sampled at 60-min intervals from 1 to 6 h and S_NAr1.3 was sampled every 10 min from 10 to 60 min. Samples were quenched with MeCN (2 volumes) and analysed by UPLC as described above (see ‘Chromatographic analysis’ section). v₀ versus [2,4-dinitrochlorobenzene] kinetic data in PBS pH 8.0 were measured using a fixed concentration of 1 (75 mM) and varying concentrations of 2 (0.05–2.50 mM) as described above. v₀ versus [2,4-dinitrochlorobenzene], [2,4-dinitrobromobenzene] and [2,4-dinitroiodobenzene] kinetic data in NaP_i pH 8.0 were measured using a fixed concentration of 1 (75 mM) and varying concentrations of electrophile (0.05–2.50 mM) and S_NAr1.3 (1 μM) as described above. The plots of the averaged initial rates were fitted to the Michaelis–Menten equation using Origin software (equation (1)):

$$Y = V_{\max} \times X/(K_{\mathrm{M}} + X) \qquad (1)$$
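To make the fitting step concrete, the following Python sketch fits equation (1) to initial-rate data using only the standard library. It is a stand-in for the Origin fit described above, not the authors' procedure: for each trial K_M the best V_max has a closed-form least-squares solution, and K_M is then located by a coarse scan followed by golden-section refinement. The data, enzyme concentration and parameter values are hypothetical.

```python
import math


def michaelis_menten(x, vmax, km):
    """Equation (1): Y = Vmax * X / (KM + X)."""
    return vmax * x / (km + x)


def best_vmax(xs, vs, km):
    """For a fixed KM the model is linear in Vmax, so least squares is closed-form."""
    g = [x / (km + x) for x in xs]
    return sum(gi * vi for gi, vi in zip(g, vs)) / sum(gi * gi for gi in g)


def sse(xs, vs, km):
    """Sum of squared residuals at the best Vmax for this KM."""
    vmax = best_vmax(xs, vs, km)
    return sum((v - michaelis_menten(x, vmax, km)) ** 2 for x, v in zip(xs, vs))


def fit_michaelis_menten(xs, vs, km_lo, km_hi, grid=200, iters=100):
    """Coarse log-spaced scan over KM (km_lo > 0), then golden-section refinement."""
    step = (math.log(km_hi) - math.log(km_lo)) / (grid - 1)
    kms = [math.exp(math.log(km_lo) + i * step) for i in range(grid)]
    i = min(range(grid), key=lambda j: sse(xs, vs, kms[j]))
    a, b = kms[max(i - 1, 0)], kms[min(i + 1, grid - 1)]
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(xs, vs, c) < sse(xs, vs, d):
            b = d
        else:
            a = c
    km = (a + b) / 2
    return best_vmax(xs, vs, km), km


# Hypothetical, noise-free data (NOT values from this study)
true_vmax, true_km = 6.0, 0.8                  # µM min^-1 and mM
conc = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5]        # electrophile concentrations, mM
rates = [michaelis_menten(x, true_vmax, true_km) for x in conc]

vmax_fit, km_fit = fit_michaelis_menten(conc, rates, 0.01, 100.0)
enzyme_uM = 1.0                                # hypothetical enzyme concentration
kcat_per_s = vmax_fit / enzyme_uM / 60.0       # kcat = Vmax / [E], converted to s^-1
print(vmax_fit, km_fit, kcat_per_s)
```

With real, noisy triplicate data the same routine would be applied to the averaged initial rates; a dedicated nonlinear least-squares package would additionally report parameter uncertainties.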

Halide inhibition characterization

Initial velocity (v₀) versus [halide] kinetic data were measured using His₆-tagged purified S_NAr1.3 (1 μΜ) and a fixed concentration of 5 (1 mM) and 1 (75 mM), with varying concentrations of either KI or KCl (0.1–150.0 mM). Reactions were performed in 50 mM NaP_i pH 8.0 with 10% v/v DMSO co-solvent and were incubated at 30 °C with shaking (850 r.p.m.). Reactions were sampled at 10-min intervals for 40 min. Samples were quenched with MeCN (2 volumes) and analysed by UPLC as described above (see ‘Chromatographic analysis’ section). Linear fits of conversion versus time allowed determination of v₀, and the v₀ versus halide concentration steady-state kinetic data were fitted using the ‘[inhibitor] versus response’ equation using GraphPad Prism (equation (2)) allowing for the calculation of IC₅₀ values:

$$Y = \mathrm{Bottom} + (\mathrm{Top} - \mathrm{Bottom})/(1 + (X/\mathrm{IC}_{50})) \qquad (2)$$
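The two numerical steps in this analysis can be sketched in Python as follows: a least-squares slope of conversion versus time gives v₀, and equation (2) describes how v₀ falls with halide concentration. This is a minimal illustration rather than the GraphPad Prism workflow itself, and the time course and parameter values are invented for the example.

```python
def initial_rate(times_min, conversions):
    """Least-squares slope of conversion versus time, used as v0."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_c = sum(conversions) / n
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, conversions))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return sxy / sxx


def inhibitor_response(x, top, bottom, ic50):
    """Equation (2): Y = Bottom + (Top - Bottom) / (1 + X / IC50)."""
    return bottom + (top - bottom) / (1 + x / ic50)


# Hypothetical time course sampled at 10-min intervals for 40 min
v0 = initial_rate([10, 20, 30, 40], [2.0, 4.0, 6.0, 8.0])
print(v0)

# At X = IC50 the response sits halfway between Top and Bottom
print(inhibitor_response(x=12.0, top=1.0, bottom=0.0, ic50=12.0))
```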

Total turnover numbers

S_NAr1.3-catalysed (0.001 mol%) biotransformations were performed in microcentrifuge tubes (1.5 ml) using 1 (10 equiv) and 2, 4 or 5 (2.5, 1.5 or 1.0 mM) in NaP_i pH 8.0 with 10% v/v DMSO and 0.1% w/v Pluronic F-127 (Fig. 3b and Supplementary Fig. 4b). Reactions were incubated at 30 °C with shaking (850 r.p.m.) and samples were taken at 13, 37, 60, 84, 111 and 140 h. For UPLC analysis, reactions were quenched at the stated time points with the addition of MeCN (1 volume). Samples were vortexed and precipitated proteins were removed by centrifugation (14,000 × g for 10 min), followed by UPLC analysis.
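A total turnover number follows from the measured conversion and the catalyst loading: it is the amount of product formed divided by the amount of enzyme. The Python sketch below shows this arithmetic for the stated loading. The conversion value is hypothetical and the helper names are illustrative.

```python
def enzyme_concentration_mM(substrate_mM, loading_mol_percent):
    """Enzyme concentration implied by a catalyst loading relative to the electrophile."""
    return substrate_mM * loading_mol_percent / 100.0


def total_turnover_number(conversion_fraction, substrate_mM, enzyme_mM):
    """Moles of product formed per mole of enzyme."""
    return conversion_fraction * substrate_mM / enzyme_mM


substrate_mM = 2.5                      # electrophile 2, as stated in the text
enzyme_mM = enzyme_concentration_mM(substrate_mM, 0.001)
ttn = total_turnover_number(0.05, substrate_mM, enzyme_mM)   # 5% conversion is hypothetical
print(enzyme_mM, ttn)
```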

Iodide quantification assay

The details of the enzymatic iodide sensing assay have been reported previously³⁸. A Curvularia inaequalis vanadium-dependent chloroperoxidase (CiVCPO) variant⁴⁵ was used to oxidize I⁻ to HOI, resulting in a sequential oxidation of the chromogen 3,3′,5,5′-tetramethylbenzidine (TMB) and an increase in absorbance at 570 nm. Procedures for CiVCPO expression, purification and activity determination were as described in the referenced study. S_NAr variants (200 µM) were incubated with 5 (1 mM) for 10 min and an aliquot (12 µl) of these reactions or KI standards was then added to iodide assay reagent (100 μl) in a transparent 96-well polystyrene microtitre plate. Absorbance at 570 nm was monitored spectrophotometrically over 10 min using a CLARIOstar plate reader and the initial rate was calculated. The iodide assay reagent contained CiVCPO (26 U ml⁻¹), TMB (2 mM), H₂O₂ (2 mM) and sodium orthovanadate (1 mM) in NaP_i (20 mM, pH 6.0) with 10% v/v DMSO as a co-solvent.

Preparative-scale biotransformation with 2

A 500-ml Erlenmeyer flask was charged sequentially with 2 (50.6 mg, 0.25 mmol, 1.00 equiv), DMSO (15 ml), 1 (153 μl, 1.25 mmol, 5 equiv) and PBS pH 8.0 (35 ml). The solution was swirled to reach homogeneity before the addition of a stock solution of 100 μM of S_NAr1.3 in PBS pH 8.0 (50 ml, 0.005 mmol, 2 mol%). The flask was sealed with a foam bung and foil and then incubated (30 °C, 120 r.p.m.) for 40 h. After this time, an aliquot (50 μl) of the reaction mixture was analysed by UPLC, showing 90% conversion. The reaction mixture was transferred to a separating funnel and extracted with EtOAc (3 × 100 ml), and the organic layers were washed with brine and dried over MgSO₄, followed by concentration in vacuo. The crude product was purified by flash column chromatography (SiO₂, celite dry loading, 10–35% EtOAc in hexane) to afford (R)-ethyl 2-cyano-2-(2,4-dinitrophenyl)propanoate (3) as a pale yellow solid (50 mg, 70% isolated yield, 94% e.e.). The product was recrystallized from EtOAc/hexane to give pale yellow needles (25 mg, 98% e.e.), which were used for X-ray diffraction to determine the absolute configuration of 3. Spectral data were consistent with the synthetic standard. $[\alpha]_{D}^{24.3} = +113.0^\circ$ (c = 0.46, CHCl₃). R_f 0.28 (35% EtOAc in hexane) [UV]. Melting point: 116 °C (hexane). ¹H NMR (500 MHz, CDCl₃): δ 9.02 (d, J = 2.5 Hz, 1H), 8.61 (dd, J = 8.7, 2.4 Hz, 1H), 8.07 (d, J = 8.7 Hz, 1H), 4.42–4.29 (m, 2H), 2.24 (s, 3H), 1.37 (t, J = 7.1 Hz, 3H). ¹³C{¹H} NMR (126 MHz, CDCl₃): δ 166.1, 148.3, 136.9, 131.25, 128.3, 121.9, 117.4, 64.4, 47.4, 24.8, 14.0. HRMS: (ESI⁺) calcd for C₁₂H₁₂N₃O₆ ([M + H]⁺): 294.0721, found: 294.0720.
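The calculated HRMS value can be checked from the elemental composition. The name ethyl 2-cyano-2-(2,4-dinitrophenyl)propanoate corresponds to a neutral formula of C₁₂H₁₁N₃O₆, so the [M + H]⁺ ion contains twelve hydrogens, three nitrogens and six oxygens. The Python sketch below sums standard monoisotopic masses and subtracts one electron mass; the isotope masses are literature constants entered by hand, not values from this article.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def protonated_mz(neutral_formula):
    """m/z of the singly charged [M + H]+ ion for a neutral elemental composition."""
    ion = dict(neutral_formula)
    ion["H"] = ion.get("H", 0) + 1          # add the proton's hydrogen atom
    mass = sum(MONOISOTOPIC[element] * count for element, count in ion.items())
    return mass - ELECTRON                   # one electron removed for the +1 charge


compound_3 = {"C": 12, "H": 11, "N": 3, "O": 6}
mz = protonated_mz(compound_3)
print(f"{mz:.4f}")   # 294.0721
```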

Preparative-scale biotransformation with 5

A 1-l Erlenmeyer flask was charged sequentially with 5 (161.7 mg, 0.55 mmol, 1.00 equiv) in DMSO (30 ml), 1 (699 μl, 5.5 mmol, 10 equiv) in DMSO (25 ml) and S_NAr1.3 in NaP_i pH 8.0 (495 ml at 5.5 μM, 5 μM final concentration). The flask was sealed with a foam bung and foil and then shaken in an incubator (30 °C, 200 r.p.m.) for 20 h. After this time, an aliquot (50 μl) of the reaction mixture was taken, quenched and analysed by UPLC, showing 100% conversion of 5. The reaction mixture was transferred to a separating funnel and extracted with EtOAc (3 × 500 ml), and the organic layers were combined and washed with water (3 × 200 ml) and brine and then dried over MgSO₄, followed by concentration in vacuo. The crude product was purified by flash column chromatography (SiO₂, loaded in CH₂Cl₂, 10–35% EtOAc in hexane) to afford (R)-ethyl 2-cyano-2-(2,4-dinitrophenyl)propanoate (3) as a pale yellow solid (148 mg, 91% yield, 99% e.e.). Spectral data were consistent with the synthetic standard. $[\alpha]_{D}^{24.3} = +117.5^\circ$ (c = 0.46, CHCl₃). R_f 0.28 (35% EtOAc in hexane) [UV]. Melting point: 116 °C (hexane). ¹H NMR (400 MHz, CDCl₃): δ 9.00 (d, J = 2.4 Hz, 1H), 8.59 (dd, J = 8.7, 2.5 Hz, 1H), 8.05 (d, J = 8.7 Hz, 1H), 4.33 (qq, J = 6.8, 3.6 Hz, 2H), 2.21 (s, 3H), 1.34 (t, J = 7.1 Hz, 3H). ¹³C{¹H} NMR (101 MHz, CDCl₃): δ 166.1, 148.3, 148.0, 136.8, 131.2, 128.3, 121.9, 117.4, 64.4, 47.4, 24.8, 14.0. HRMS: (ESI⁺) calcd for C₁₂H₁₂N₃O₆ ([M + H]⁺): 294.0721, found: 294.0723.
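The quantities in this procedure can be cross-checked with a few lines of Python. The sketch assumes that volumes are additive, neglects the small volume of neat nucleophile 1, and uses a molecular weight of 293.23 g mol⁻¹ for product 3 computed from the formula C₁₂H₁₁N₃O₆ rather than quoted from the article.

```python
MW_PRODUCT_3 = 293.23            # g/mol for C12H11N3O6 (computed, not from the article)

substrate_mmol = 0.55            # electrophile 5
enzyme_stock_uM = 5.5
enzyme_stock_ml = 495.0
dmso_ml = 30.0 + 25.0            # DMSO added with 5 and with 1
total_ml = enzyme_stock_ml + dmso_ml

final_enzyme_uM = enzyme_stock_uM * enzyme_stock_ml / total_ml
dmso_percent = 100.0 * dmso_ml / total_ml
theoretical_mg = substrate_mmol * MW_PRODUCT_3
yield_percent = 100.0 * 148.0 / theoretical_mg

print(round(final_enzyme_uM, 2), round(dmso_percent, 1))
print(round(theoretical_mg, 1), round(yield_percent, 1))
```

Under these assumptions the final enzyme concentration is close to the stated 5 μM, the co-solvent content is 10% v/v, and the isolated mass corresponds to a yield in line with the reported 91% once rounding of the weighed masses is taken into account.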

Protein crystallization, refinement and model building

As we had observed halide inhibition for S_NAr1.3, samples for crystallization were purified in the absence of Cl⁻ by removing NaCl from all buffers in the protein purification procedures described above. Two of the structures (ligand free and 5 bound) contained the extra K39A mutation, as this improved the resolution of the corresponding diffraction data. K39A had a minimal effect on S_NAr1.3 rate or e.e. values (Supplementary Fig. 14). Crystals of S_NAr1.3 or S_NAr1.3(K39A) were prepared by mixing 300 nl PACT condition E11 (0.2 M sodium citrate tribasic dihydrate and 20% w/v PEG 3350) with 20 nl S_NAr1.3 seed stock and 280 nl of 20 mg ml⁻¹ S_NAr1.3 in 50 mM NaP_i. All trials were conducted by sitting-drop vapour diffusion with incubation at 4 °C. Crystals were cryoprotected in reservoir solution supplemented with 10% w/v PEG 200 and flash-cooled in liquid N₂. For the I⁻ soak, KI was added to the drop at a final concentration of 100 mM before freezing. To obtain the complex with 5, solid 5 was added to the drop and incubated overnight before crystal harvesting and freezing. The data were collected from single crystals at Diamond Light Source and subsequently scaled and reduced with Xia2 (ref. ⁴⁶). Preliminary phasing was performed by molecular replacement in Phaser using BH32.7 (PDB: 7O1D) as a search model. Iterative cycles of rebuilding and refinement were performed in COOT⁴⁷ and Phenix.refine⁴⁸, respectively. The complete data collection and refinement statistics are provided in Supplementary Table 10. Coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under accession numbers 9FUG, 9FUL and 9FUO.

(R)-3 data collection

Crystals suitable for diffraction were isolated by recrystallization from EtOAc/hexane. Single-crystal X-ray diffraction data for compound 3 were collected at 100 K on a Rigaku XtaLAB AFC-11 four-circle goniometer equipped with a HyPix-6000HE detector and Oxford Cryosystems. Data were collected with a dual-source Rigaku FR-X rotating anode using Cu Kα (λ = 1.54184 Å) radiation. Data were collected using the CrysAlisPro program.

(R)-3 crystal structure determination and refinements

Data processing and reduction were performed with CrysAlisPro. Empirical absorption correction was applied using spherical harmonics, implemented with the SCALE3 ABSPACK algorithm. The crystal structure was solved and refined using the SHELX suite of programs in Olex2 (refs. ^49,50). All non-hydrogen atoms were refined anisotropically. Hydrogen atom positions were calculated and refined with fixed isotropic displacement parameters. The complete data collection and refinement statistics are provided in Supplementary Table 10. Absolute structure determination was based on anomalous dispersion. The absolute configuration of compound 3 was found to be (R).

Crystallographic data have been deposited with the CCDC number 2362363.

Molecular docking

Putative binding poses for substrate 5 were generated by molecular docking performed with AutoDock Vina⁵¹, using AutoDockTools⁵² to assign hydrogen atoms to the crystal structure and generate input files. Residues R124 and L68 were flexible during docking, whereas the rest of the enzyme was kept rigid. Twenty binding modes were generated with an exhaustiveness of 50, with a 20 Å × 20 Å × 20 Å search volume centred on the reacting carbon atom in the reactive pose of the crystal structure. There is very little difference in computed binding energy for the predicted binding poses, with the first 14 poses all within 1 kcal mol⁻¹. Pose 7 was selected as it places the iodide in a very similar position to that in the crystal structure (Fig. 4d), with a computed binding energy of −5.8 kcal mol⁻¹ (0.6 kcal mol⁻¹ higher than the first binding pose). Nucleophile 1 was then docked into the crystal structure with pose 7 of substrate 5 and the corresponding coordinates of Arg124 and Leu68 added, using the same parameters, keeping Arg124 flexible. There is only a 0.9 kcal mol⁻¹ difference between the 20 identified binding poses. Pose 15, with a computed binding energy of −4.0 kcal mol⁻¹ (0.7 kcal mol⁻¹ higher than the first binding pose), is consistent with nucleophilic attack with the observed stereochemistry and overlaps with one of the ethylene glycol molecules in the active site of the crystal structure. The docking poses are shown in Supplementary Fig. 15.
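The docking settings stated above map onto a standard AutoDock Vina configuration file. The Python sketch below writes such a file with the reported search volume, exhaustiveness and number of modes. The file names and the box centre are placeholders (the coordinates of the reacting carbon are not given in the text), and the flexible-residue file is assumed to have been prepared beforehand with AutoDockTools.

```python
def vina_config(receptor, flex, ligand, center, out):
    """Return AutoDock Vina config text for a 20 x 20 x 20 Å box, 20 modes, exhaustiveness 50."""
    cx, cy, cz = center
    lines = [
        f"receptor = {receptor}",      # rigid part of the enzyme (PDBQT)
        f"flex = {flex}",              # flexible side chains, e.g. R124 and L68 (PDBQT)
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        "size_x = 20",
        "size_y = 20",
        "size_z = 20",
        "exhaustiveness = 50",
        "num_modes = 20",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder file names and centre coordinates, for illustration only
config_text = vina_config(
    receptor="enzyme_rigid.pdbqt",
    flex="enzyme_flex.pdbqt",
    ligand="substrate_5.pdbqt",
    center=(0.0, 0.0, 0.0),
    out="substrate_5_poses.pdbqt",
)
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option.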

Molecular dynamics simulations

Bonding parameters for nucleophile 1 and substrate 5 were generated using the AmberTools antechamber module⁵³ with charges parameterized by RESP fitting to the HF/6-31G(d,p) electron density of a B3LYP/6-31+G(d,p) structure optimized in Gaussian 16 Revision C.01 (ref. ⁵⁴). Molecular dynamics simulations were then carried out using GROMACS 2018 (refs. ^55,56) with the AMBER14 force field⁵⁷ with a solvation box with a minimum 10 Å buffering distance around the protein and counterions generated using AmberTools, retaining crystallographic waters, for a total of 61,696 atoms.

A 3.0 Å harmonic restraint with force constant 4 kJ mol⁻¹ Å⁻² was applied to the donor–acceptor distance to hold the nucleophile and electrophile close to a reactive conformation during the molecular dynamics simulation. Simulations were performed using constant temperature (velocity-rescaling thermostat⁵⁸, 300 K) and pressure (Parrinello–Rahman barostat⁵⁹, 1 bar), 10 Å van der Waals and electrostatic cut-offs, particle mesh Ewald for long-range electrostatics, LINCS bond constraints⁶⁰, periodic boundary conditions and a 2-fs time step. The protocol was as follows: (1) energy minimization with (a) 10 kJ mol⁻¹ Å⁻² constraints on the protein heavy atoms (not hydrogens), (b) 1 kJ mol⁻¹ Å⁻² constraints on the protein heavy atoms, (c) no constraints; (2) 1 ns constant volume (NVT) equilibration of the solvent with 10 kJ mol⁻¹ Å⁻² constraints on the protein heavy atoms; (3) three 1 ns constant pressure (NPT) equilibration stages with the same decreasing position constraints as for energy minimization; (4) 500 ns of production run. Root mean square deviations for the production run as well as the distance between Arg124 and enolate O⁻ are shown in Supplementary Fig. 19. For the hydrogen bonding, direct hydrogen bonds were identified using the GROMACS hbond tool and bridging hydrogen bonds defined as having a water molecule with the oxygen atom within 2.4 Å of the R124 hydrogen at the same time as one hydrogen atom within 2.4 Å of the enolate O⁻. We carried out hydrogen bonding analysis with CPPTRAJ⁶¹ in AMBER18 (ref. ⁶²).
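The bridging hydrogen-bond criterion described above is a purely geometric test on each frame, and can be sketched in Python as follows. This is an illustration of the stated 2.4 Å criterion, not the analysis script used in the study; the data layout, function name and coordinates are invented for the example.

```python
import math

CUTOFF_ANGSTROM = 2.4


def is_bridging_water(water, arg_hydrogen, enolate_oxygen, cutoff=CUTOFF_ANGSTROM):
    """True if the water O is within cutoff of the Arg H and a water H is within cutoff of the enolate O.

    `water` maps the atom names "O", "H1" and "H2" to (x, y, z) coordinates in Å.
    """
    oxygen_near_arg = math.dist(water["O"], arg_hydrogen) <= cutoff
    hydrogen_near_enolate = any(
        math.dist(water[name], enolate_oxygen) <= cutoff for name in ("H1", "H2")
    )
    return oxygen_near_arg and hydrogen_near_enolate


# Toy single-frame coordinates in Å (hypothetical, not taken from the simulation)
arg_h = (0.0, 0.0, 0.0)
enolate_o = (4.8, 0.0, 0.0)
bridging = {"O": (2.0, 0.0, 0.0), "H1": (2.96, 0.0, 0.0), "H2": (1.76, 0.93, 0.0)}
distant = {"O": (10.0, 0.0, 0.0), "H1": (10.96, 0.0, 0.0), "H2": (9.76, 0.93, 0.0)}

print(is_bridging_water(bridging, arg_h, enolate_o))
print(is_bridging_water(distant, arg_h, enolate_o))
```

Applied over a trajectory, the fraction of frames in which any water satisfies this test gives the occupancy of the water-bridged interaction.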

Data availability

Coordinates and structure factors have been deposited in the Protein Data Bank under accession numbers 9FUG, 9FUL and 9FUO. Crystallographic data for the structure of 3 reported in this article have been deposited at the Cambridge Crystallographic Data Centre, under deposition number CCDC 2362363. Copies of the data can be obtained free of charge at https://www.ccdc.cam.ac.uk/structures. The data supporting the findings of this study are available in the paper and its Supplementary Information files.

References

Bunnett, J. F. & Zahler, R. E. Aromatic nucleophilic substitution reactions. Chem. Rev. 49, 273–412 (1951).
Terrier, F. Modern Nucleophilic Aromatic Substitution (Wiley, 2013).
Kwan, E. E., Zeng, Y., Besser, H. A. & Jacobsen, E. N. Concerted nucleophilic aromatic substitutions. Nat. Chem. 10, 917–923 (2018).
Neumann, C. N., Hooker, J. M. & Ritter, T. Concerted nucleophilic aromatic substitution with ¹⁹F⁻ and ¹⁸F⁻. Nature 534, 369–373 (2016).
Bella, M., Kobbelgaard, S. & Jørgensen, K. A. Organocatalytic regio- and asymmetric C-selective S_NAr reactions–stereoselective synthesis of optically active spiro-pyrrolidone-3,3′-oxoindoles. J. Am. Chem. Soc. 127, 3670–3671 (2005).
Guo, F., Fang, S., He, J., Su, Z. & Wang, T. Enantioselective organocatalytic synthesis of axially chiral aldehyde-containing styrenes via S_NAr reaction-guided dynamic kinetic resolution. Nat. Commun. 14, 5050 (2023).
Shirakawa, S., Koga, K., Tokuda, T., Yamamoto, K. & Maruoka, K. Catalytic asymmetric synthesis of 3,3′-diaryloxindoles as triarylmethanes with a chiral all-carbon quaternary center: phase-transfer-catalyzed S_NAr reaction. Angew. Chem. Int. Ed. 53, 6220–6223 (2014).
Armstrong, R. J. & Smith, M. D. Catalytic enantioselective synthesis of atropisomeric biaryls: a cation-directed nucleophilic aromatic substitution reaction. Angew. Chem. Int. Ed. 53, 12822–12826 (2014).
Li, Y., Pan, H., Li, W.-Y., Feng, X. & Liu, X. Enantioselective nucleophilic aromatic substitution reaction of azlactones to synthesize quaternary α-amino acid derivatives. Synlett 32, 587–592 (2020).
Cardenas, M. M. et al. Catalytic atroposelective dynamic kinetic resolutions and kinetic resolutions towards 3-arylquinolines via S_NAr. Chem. Commun. 57, 10087–10090 (2021).
Cardenas, M. M., Toenjes, S. T., Nalbandian, C. J. & Gustafson, J. L. Enantioselective synthesis of pyrrolopyrimidine scaffolds through cation-directed nucleophilic aromatic substitution. Org. Lett. 20, 2037–2041 (2018).
Crawshaw, R. et al. Engineering an efficient and enantioselective enzyme for the Morita–Baylis–Hillman reaction. Nat. Chem. 14, 313–320 (2021).
Rohrbach, S. et al. Concerted nucleophilic aromatic substitution reactions. Angew. Chem. Int. Ed. 58, 16368–16388 (2019).
Brown, D. G. & Boström, J. Analysis of past and present synthetic methodologies on medicinal chemistry: where have all the new reactions gone? J. Med. Chem. 59, 4443–4458 (2016).
Tay, N. E. S. & Nicewicz, D. A. Cation radical accelerated nucleophilic aromatic substitution via organic photoredox catalysis. J. Am. Chem. Soc. 139, 16100–16104 (2017).
Pistritto, V. A., Schutzbach-Horton, M. E. & Nicewicz, D. A. Nucleophilic aromatic substitution of unactivated fluoroarenes enabled by organic photoredox catalysis. J. Am. Chem. Soc. 142, 17187–17194 (2020).
Shin, N. Y. et al. Radicals as exceptional electron-withdrawing groups: nucleophilic aromatic substitution of halophenols via homolysis-enabled electronic activation. J. Am. Chem. Soc. 144, 21783–21790 (2022).
Otsuka, M., Endo, K. & Shibata, T. Catalytic S_NAr reaction of non-activated fluoroarenes with amines via Ru η⁶-arene complexes. Chem. Commun. 46, 336–338 (2010).
Kang, Q.-K., Lin, Y., Li, Y. & Shi, H. Ru(II)-catalyzed amination of aryl fluorides via η⁶-coordination. J. Am. Chem. Soc. 142, 3706–3711 (2020).
Buller, R. et al. From nature to industry: harnessing enzymes for biocatalysis. Science 382, eadh8615 (2023).
Chen, K. & Arnold, F. H. Engineering new catalytic activities in enzymes. Nat. Catal. 3, 203–213 (2020).
Bell, E. L. et al. Biocatalysis. Nat. Rev. Methods Primers 1, 46 (2021).
Scholten, J. D. et al. Novel enzymic hydrolytic dehalogenation of a chlorinated aromatic. Science 253, 182–185 (1991).
Crooks, G. P., Xu, L., Barkley, R. M. & Copley, S. D. Exploration of possible mechanisms for 4-chlorobenzoyl CoA dehalogenase: evidence for an aryl-enzyme intermediate. J. Am. Chem. Soc. 117, 10791–10798 (1995).
Kalyoncu, S. et al. Enzymatic hydrolysis by transition-metal-dependent nucleophilic aromatic substitution. Nat. Chem. Biol. 12, 1031–1036 (2016).
Seffernick, J. L. & Wackett, L. P. Rapid evolution of bacterial catabolic enzymes: a case study with atrazine chlorohydrolase. Biochem. 40, 12747–12753 (2001).
Chen, W. J., Graminski, G. F. & Armstrong, R. N. Dissection of the catalytic mechanism of isozyme 4-4 of glutathione S-transferase with alternative substrates. Biochem. 27, 647–654 (1988).
Hutton, A. E. et al. A non-canonical nucleophile unlocks a new mechanistic pathway in a designed enzyme. Nat. Commun. 15, 1956 (2024).
Quasdorf, K. W. & Overman, L. E. Catalytic enantioselective synthesis of quaternary carbon stereocentres. Nature 516, 181–191 (2014).
Reddy, M. D. & Watkins, E. B. Palladium-catalyzed direct arylation of C(sp³)–H bonds of α-cyano aliphatic amides. J. Org. Chem. 80, 11447–11459 (2015).
Nicewicz, D. A., Yates, C. M. & Johnson, J. S. Catalytic asymmetric acylation of (silyloxy)nitrile anions. Angew. Chem. Int. Ed. 43, 2652–2655 (2004).
Gellis, A. et al. A new DMAP-catalyzed and microwave-assisted approach for introducing heteroarylamino substituents at position-4 of the quinazoline ring. Tetrahedron 70, 8257–8266 (2014).
Hervin, V., Coutant, E., Gagnot, G. & Janin, Y. Synthesis of α-amino esters via α-nitro or α-oxime esters: a review. Synthesis 49, 4093–4110 (2017).
Hager, A., Vrielink, N., Hager, D., Lefranc, J. & Trauner, D. Synthetic approaches towards alkaloids bearing α-tertiary amines. Nat. Prod. Rep. 33, 491–522 (2016).
Ameen, D. & Snape, T. J. Chiral 1,1-diaryl compounds as important pharmacophores. MedChemComm 4, 893–907 (2013).
Mondal, S. & Panda, G. Synthetic methodologies of achiral diarylmethanols, diaryl and triarylmethanes (TRAMs) and medicinal properties of diaryl and triarylmethanes-an overview. RSC Adv. 4, 28317–28358 (2014).
Wei, J., Gandon, V. & Zhu, Y. Amino acid-derived ionic chiral catalysts enable desymmetrizing cross-coupling to remote acyclic quaternary stereocenters. J. Am. Chem. Soc. 145, 16796–16811 (2023).
Tang, Q. et al. Directed evolution of a halide methyltransferase enables biocatalytic synthesis of diverse SAM analogs. Angew. Chem. Int. Ed. 60, 1524–1527 (2021).
O’Hagan, D. & Schmidberger, J. W. Enzymes that catalyse S_N2 reaction mechanisms. Nat. Prod. Rep. 27, 900–918 (2010).
Skitchenko, R. K., Usoltsev, D., Uspenskaya, M., Kajava, A. V. & Guskov, A. Census of halide-binding sites in protein structures. Bioinformatics 36, 3064–3071 (2020).
Lovelock, S. L. et al. The road to fully programmable protein catalysis. Nature 606, 49–58 (2022).
Yeh, A. H.-W. et al. De novo design of luciferases using deep learning. Nature 614, 774–780 (2023).
Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
Dauparas, J. et al. Robust deep learning–based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).
Hasan, Z. et al. Laboratory-evolved vanadium chloroperoxidase exhibits 100-fold higher halogenating activity at alkaline pH: catalytic effects from first and second coordination sphere mutations. J. Biol. Chem. 281, 9738–9744 (2006).
Winter, G. et al. DIALS: implementation and evaluation of a new integration package. Acta Crystallogr. D 74, 85–97 (2018).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).
Afonine, P. V. et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr. D 68, 352–367 (2012).
Sheldrick, G. Crystal structure refinement with SHELXL. Acta Crystallogr. C 71, 3–8 (2015).
Dolomanov, O. V., Bourhis, L. J., Gildea, R. J., Howard, J. A. K. & Puschmann, H. OLEX2: a complete structure solution, refinement and analysis program. J. Appl. Crystallogr. 42, 339–341 (2009).
Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).
Morris, G. M. et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791 (2009).
Wang, J., Wang, W. & Kollman, P. Antechamber: an accessory software package for molecular mechanical calculations. Abstr. Pap. Am. Chem. Soc. 222, U403–U403 (2001).
Gaussian 16 (Gaussian, Inc., 2016).
Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19–25 (2015).
Markidis, S. & Laure, E. (eds) Solving Software Challenges for Exascale: International Conference on Exascale Applications and Software, EASC 2014, Stockholm, Sweden, April 2-3, 2014, Revised Selected Papers https://doi.org/10.1007/978-3-319-15976-8 (Springer, 2015).
Maier, J. A. et al. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theory Comput. 11, 3696–3713 (2015).
Bussi, G., Donadio, D. & Parrinello, M. Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101 (2007).
Nosé, S. & Klein, M. L. Constant pressure molecular dynamics for molecular systems. Mol. Phys. 50, 1055–1076 (1983).
Hess, B., Bekker, H., Berendsen, H. J. C. & Fraaije, J. G. E. M. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472 (1997).
Roe, D. R. & Cheatham, T. E. PTRAJ and CPPTRAJ: software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 9, 3084–3095 (2013).
Case, D. A. et al. AMBER 2018, University of California, San Francisco (2018).


Acknowledgements

We acknowledge the Human Frontier Science Program (RGP0004/2022), UK Research and Innovation (UKRI Frontier Research Guarantee, to A.P.G., EP/Y023722/1), the Engineering and Physical Sciences Research Council (EPSRC Centre-to-Centre Partnership, EP/Z531157/1; EPSRC Centre for Doctoral Training in Integrated Catalysis EP/S023755/1 studentships to T.M.L. and E.J.H.; EPSRC Doctoral Prize Fellowship EP/W524347/1 to F.J.H.), the Biotechnology and Biological Sciences Research Council (BB/W014483/1, G.W.R.), and the European Research Council (ERC Advanced Grant no. 833337 to I.L.). We are grateful to Diamond Light Source for beamtime (proposal mx31850-65) I03, to the Manchester SYNBIOCHEM Centre (BB/M017702/1), the Future Biomanufacturing Hub (EP/S01778X/1) and the Henry Royce Institute for Advanced Materials (financed through EPSRC grant nos. EP/R00661X/1, EP/S019367/1, EP/P025021/1 and EP/P025498/1) for access to their facilities, the assistance given by Research IT and the use of the Computational Shared Facility at The University of Manchester. We thank M. Dunstan (Manchester Institute of Biotechnology) for guidance on automating directed-evolution workflows and M. Trelore, R. Spiess and A. Andrews (Manchester Institute of Biotechnology) for acquiring protein mass spectra.

Author information

Authors and Affiliations

Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
Thomas M. Lister, George W. Roberts, Euan J. Hossack, Fei Zhao, Ashleigh J. Burke, Linus O. Johannissen, Florence J. Hardy, David Leys & Anthony P. Green
Department of Chemistry, The University of Manchester, Manchester, UK
Thomas M. Lister, George W. Roberts, Euan J. Hossack, Ashleigh J. Burke, Florence J. Hardy, Alexander A. V. Millman, David Leys, Igor Larrosa & Anthony P. Green

Contributions

T.M.L. carried out molecular biology, protein production, directed evolution, enzyme characterization, developed chromatographic methods and explored the substrate scope. G.W.R. carried out protein crystallization and the iodide release assay and interpreted, analysed and presented structural data, with assistance from F.J.H. and D.L. T.M.L. and G.W.R. carried out biochemical characterization. E.J.H. assisted with initial biochemical assays and identification of the starting template for directed evolution. T.M.L., F.Z. and G.W.R. synthesized product standards. A.J.B. assisted with initial assay development and directed evolution. L.O.J. performed and analysed the docking and molecular dynamics simulations. A.A.V.M. obtained X-ray crystal data and determined the single-crystal structure of (R)-3. T.M.L., G.W.R., I.L. and A.P.G. wrote the manuscript, with input from all authors. A.P.G. and I.L. initiated and directed the research.

Corresponding authors

Correspondence to Igor Larrosa or Anthony P. Green.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Yang Yang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Directed evolution of an efficient and enantioselective S_NAr enzyme.

Schematic showing the trajectory from S_NAr1.0 to S_NAr1.3. Mutations introduced are represented as CPK spheres at the C_β. Three rounds of evolution afforded S_NAr1.3, which contains six mutations. Library generation method, positions targeted, the number of clones evaluated, beneficial mutations and the most improved variant for each round are given in the table.

Extended Data Fig. 2 Kinetic characterization of S_NAr1.0 and S_NAr1.3.

Michaelis–Menten plots for the S_NAr reaction between 1 and 2 catalysed by either S_NAr1.0 (a) or S_NAr1.3 (b). Assays were performed at a fixed concentration of 1 (75 mM) and varying concentrations of 2, or a fixed concentration of 2 (2.5 mM) and varying concentrations of 1. The plots show the averaged initial rates that were fitted to the Michaelis–Menten equation using GraphPad Prism software. Error bars represent the standard deviation of measurements made in triplicate. See Supplementary Data for source data.
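As an illustration of the fitting step described in this caption, the following is a minimal sketch of a Michaelis–Menten least-squares fit in plain Python. It is not the GraphPad Prism procedure used by the authors. The function names, the search bounds for K_M and the example data are assumptions made for illustration, and the data are synthetic, not values from the paper.

```python
import math

def michaelis_menten(s, vmax, km):
    """Initial rate predicted by v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)

def best_vmax(km, conc, rates):
    """For a fixed Km the model is linear in Vmax, so Vmax has a closed form."""
    f = [s / (km + s) for s in conc]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)

def sse(km, conc, rates):
    """Sum of squared residuals at this Km, using the best Vmax for it."""
    vmax = best_vmax(km, conc, rates)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(conc, rates))

def fit_michaelis_menten(conc, rates, km_lo=1e-3, km_hi=1e3, iters=100):
    """Golden-section search over log10(Km); returns (vmax, km).

    Assumes the error has a single minimum between km_lo and km_hi
    (hypothetical bounds, in the same units as conc).
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_lo), math.log10(km_hi)
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(10 ** c, conc, rates) < sse(10 ** d, conc, rates):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = 10 ** ((a + b) / 2)
    return best_vmax(km, conc, rates), km

# Synthetic, noise-free example data (NOT from the paper).
example_conc = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]   # substrate, mM
example_rates = [michaelis_menten(s, 2.0, 5.0) for s in example_conc]
fitted_vmax, fitted_km = fit_michaelis_menten(example_conc, example_rates)
print(f"Vmax = {fitted_vmax:.3f}, Km = {fitted_km:.3f}")
```

Because the model is linear in V_max, only K_M needs a numerical search, which keeps the sketch free of external libraries. With real triplicate data, the averaged initial rates would take the place of the synthetic list.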

Extended Data Fig. 3 The effect of buffer composition on the Michaelis–Menten kinetic profile of S_NAr1.3.

Michaelis–Menten plots for the S_NAr reaction between 1 and 2 catalysed by S_NAr1.3, acquired with S_NAr1.3 in either PBS (10 mM Na₂HPO₄, 1.8 mM KH₂PO₄, 137 mM NaCl, 2.7 mM KCl) or sodium phosphate (NaP_i) (46.4 mM Na₂HPO₄, 3.6 mM NaH₂PO₄). Assays were performed at a fixed concentration of 1 (75 mM) and varying concentrations of 2 in PBS pH 8.0 (grey circle markers) or NaP_i pH 8.0 (blue square markers). The plots show the averaged initial rates, which were fitted to a second-order polynomial (quadratic) using GraphPad Prism software because saturation with 2 was not achieved. Error bars represent the standard deviation of measurements made in triplicate. See Supplementary Data for source data.

Extended Data Fig. 4 Relationship between S_NAr1.3 initial rate and concentration of chloride and iodide.

Assays were performed at a fixed concentration of 1 (75 mM) and 5 (1 mM) with varying concentrations of either KCl (a) or KI (b). Linear fits of conversion versus time allowed determination of v₀, and the v₀ versus halide concentration steady-state kinetic data were fitted to the ‘[inhibitor] versus response’ equation (Y = Bottom + (Top − Bottom)/(1 + (X/IC₅₀))) using GraphPad Prism, allowing for the calculation of IC₅₀ values. Data points shown are averages of triplicate measurements, with error bars representing standard deviation. See Supplementary Data for source data.
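The '[inhibitor] versus response' equation quoted in this caption can be written out as a short function. This is a minimal sketch of that formula only, not the GraphPad Prism fitting routine. The function name and the example parameter values are assumptions for illustration and are not taken from the paper.

```python
def inhibitor_response(x, top, bottom, ic50):
    """Y = Bottom + (Top - Bottom) / (1 + X / IC50).

    x      : inhibitor (halide) concentration
    top    : response with no inhibitor (x = 0)
    bottom : response at saturating inhibitor
    ic50   : concentration giving a response halfway between top and bottom
    """
    return bottom + (top - bottom) / (1 + x / ic50)

# Hypothetical parameters, chosen only to show the shape of the curve.
example_top, example_bottom, example_ic50 = 10.0, 2.0, 40.0
for halide in (0.0, 20.0, 40.0, 80.0, 400.0):
    rate = inhibitor_response(halide, example_top, example_bottom, example_ic50)
    print(halide, round(rate, 3))
```

Two properties follow directly from the equation: at zero halide the response equals Top, and at X = IC₅₀ the response lies exactly midway between Top and Bottom, which is what makes IC₅₀ a convenient single-number summary of halide inhibition.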

Extended Data Fig. 5 Directed evolution of S_NAr_Ph1.0, an S_NArase compatible with nucleophile 20 that generated products with optically active 1,1-diaryl quaternary carbon centres.

Schematic showing the trajectory from S_NAr1.2 to S_NAr_Ph1.0. Mutations introduced are represented as CPK spheres at the C_α. One round of evolution afforded S_NAr_Ph1.0, which contains two mutations. Library generation method, positions targeted, the number of clones evaluated, beneficial mutations and the most improved variant for the single round are given in the table.

Extended Data Fig. 6 Halide binding sites in S_NAr1.3.

Soaking S_NAr1.3 with iodide (pink spheres) reveals two internal halide binding sites (HBS), with two conformations of the iodide ion presented for HBS2. The most occupied site (HBS1) was further investigated with knockout studies. The two most important residues for catalysis (Arg124 and Asp125) are shown as blue sticks. Arg124 is shown as transparent, as it was not visible in the electron density owing to side chain conformational heterogeneity. The position of electrophile binding from the analogous S_NAr1.3 soak with 5 is shown in transparent sticks, with the major position in salmon and the minor in grey.

Extended Data Fig. 7 2,4-dinitroiodobenzene binding poses in S_NAr1.3.

a, 2,4-dinitroiodobenzene binds S_NAr1.3 in two poses, one major (salmon) and one minor (grey), inferred from anomalous density. The major pose is adjacent to the halide binding site. Anomalous map contoured at 9σ (orange mesh). 2F_o–F_c map contoured at 1σ (grey mesh). b, Notable residues forming the aryl halide binding pocket.

Extended Data Fig. 8 The effect of S_NAr1.3 point mutations on the reaction rate with substrate 5.

Bar chart comparison of v₀/[E] (min⁻¹) values for point mutants of S_NAr1.3 at residues that line the active site and halide binding cavity. Reaction conditions: 1 (75 mM), 5 (1.0 mM) and S_NAr1.3 variant in NaP_i pH 8.0 with 10% v/v DMSO at 30 °C. Enzyme concentrations: S_NAr1.3 (1 μM); M64A, R65A, R88A, C96A (1 μM); R124A (25 μM); D125A (5 μM); D125N (40 μM). Error bars represent the standard deviation of measurements made in triplicate. See Supplementary Data for source data.
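Because the variants in this comparison were assayed at different enzyme concentrations, the rates are normalized as v₀/[E]. The sketch below shows one way to do that calculation: a least-squares slope of product concentration against time gives v₀, which is then divided by the enzyme concentration. The function names and the time-course data are hypothetical and are not taken from the paper.

```python
def slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den

def rate_per_enzyme(times_min, product_uM, enzyme_uM):
    """v0/[E] in min^-1: initial rate (uM per min) divided by [E] (uM)."""
    return slope(times_min, product_uM) / enzyme_uM

# Hypothetical time courses (NOT from the paper): product in uM, time in min.
times = [0.0, 1.0, 2.0, 3.0]
fast_variant = rate_per_enzyme(times, [0.0, 4.0, 8.0, 12.0], enzyme_uM=1.0)
slow_variant = rate_per_enzyme(times, [0.0, 2.5, 5.0, 7.5], enzyme_uM=25.0)
print(fast_variant, slow_variant)
```

Dividing by [E] is what allows a weakly active variant assayed at 25 μM to be compared on the same axis as a variant assayed at 1 μM.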

Supplementary information

Supplementary Information

Supplementary Information, including Supplementary Figs. 1–19, Supplementary Tables 1–11 and further references.

Supplementary Data

Source data for Figs. 1–3, Extended Data Figs. 2–4 and 8, and Supplementary Figs. 1, 3, 4, 6, 8, 9, 11–14, 16 and 17.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lister, T.M., Roberts, G.W., Hossack, E.J. et al. Engineered enzymes for enantioselective nucleophilic aromatic substitutions. Nature 639, 375–381 (2025). https://doi.org/10.1038/s41586-025-08611-0

Received: 04 July 2024
Accepted: 08 January 2025
Published: 15 January 2025
Version of record: 05 March 2025
Issue date: 13 March 2025
DOI: https://doi.org/10.1038/s41586-025-08611-0

This article is cited by

Axially engineered single atoms in enzyme-mimic-binding pocket steering dehalogenation–polymerization pathways toward water pollutant upcycling
- Bin Wu
- Zhiling Li
- Aijie Wang
Nature Communications (2026)