Directed evolution of a TNA polymerase identifies independent paths to fidelity and catalysis

Hajjar, Mohammad; Maola, Victoria A.; Lee, Joy J.; Holguin, Manuel J.; Quijano, Riley N.; Nguyen, Kalvin K.; Ho, Katherine L.; Medina, Jenny V.; Botello-Cornejo, Elionel; Barpuzary, Bhawna; Chim, Nicholas; Chaput, John C.

doi:10.1038/s41467-025-67652-1


Article
Open access
Published: 19 December 2025


Nature Communications volume 17, Article number: 925 (2026)


Abstract

Directed evolution facilitates functional adaptations through stepwise changes in sequence that alter protein structure. While most campaigns yield solutions that maintain the framework of a rigid protein architecture, a few have produced enzymes with more notable structural differences. One example is a polymerase that was evolved to synthesize threose nucleic acid (TNA) with near-natural activity. Understanding how this enzyme arose provides a model for studying pathways that guide enzymes toward more productive regions of the fitness landscape. Here, we trace the evolutionary trajectory of an unnatural polymerase by solving crystal structures of key intermediates along the pathway and evaluating their biochemical activity. Contrary to the view that fidelity is a product of increased catalytic efficiency, we find that accuracy and catalysis are decoupled activities guided by separate ground-state and transition-state discrimination events. Together, these results offer a glimpse into the forces responsible for shaping the emergence of new enzyme functions.


Introduction

Directed evolution is a powerful approach for generating enzymes that are needed to drive future applications in biotechnology and medicine^1,2,3. Through recursive cycles of diversification and selection, populations of molecules adapt to novel biochemical functions that lie outside the sphere of activities found in nature⁴. Though effective, most protein evolution studies are unable to match the catalytic efficiency of natural enzymes⁵. The central challenge lies in understanding how to reshape the active site pocket to accommodate noncognate substrates while preserving the catalytic activity of the native enzyme fold. Although aspects of this problem have been studied by examining the products of directed evolution campaigns^6,7,8,9, understanding the molecular basis of functional adaptation requires structural information about key intermediates along the pathway, each representing a critical step toward the enzyme’s final, functional form¹⁰. By studying the evolutionary journey from a low-activity variant to a highly functional enzyme, it may be possible to learn basic design principles that could accelerate future selections or expand the predictive power of AI algorithms to include dynamic enzymes with complicated reaction mechanisms.

In recent work, we demonstrated the directed evolution of a polymerase, called 10–92, that can synthesize α-L-threofuranosyl nucleic acid (TNA) with near-natural activity (Fig. 1)¹¹. Biochemical studies reveal that 10–92 achieves a catalytic rate of ~1 nt s⁻¹ and >99% fidelity by sequentially adding TNA triphosphates (tNTPs) to the 2′-hydroxyl group of the growing primer, yielding an unnatural 3′ → 2′ linked TNA product (Fig. 1a). The enzyme was discovered using the massively parallel screening technique of droplet-based optical polymerase sorting (DrOPS)^12,13 to query a large combinatorial library prepared by recombining the sequences of four hyperthermophilic archaeal B-family DNA polymerases (DNAP) (Thermococcus kodakarensis, Kod; Thermococcus gorgonarius, Tgo; Pyrococcus Deep Vent, DV; and Thermococcus 9°N, 9°N) (Fig. 1b, c). Each polymerase carried the exonuclease silencing mutations (D141A and D143A) and four beneficial mutations (A485R, N491S, R606G, and T723A) found in Kod-RSGA, our previous best TNA polymerase (TNAP)¹⁴.

**Fig. 1: Evolution of the 10–92 TNA polymerase.**

A crystal structure of the 10–92 TNAP uncovered large movements in the protein architecture that improved substrate binding, including positional rearrangements of α-helical secondary structures (up to 7 Å) that bring the thumb subdomain closer to the paired region of the DNA duplex and other large movements that realign the catalytic triad in the palm subdomain (Supplementary Figs. 1 and 2)¹¹. Together, these structural changes enabled the transition from an enzyme capable of highly efficient DNA synthesis to one that is now specialized for TNA synthesis (Supplementary Fig. 3). Whether these structural changes arose early or late in the evolutionary process remains to be established, as do the underlying mechanisms responsible for achieving enhanced catalytic activity.

Here, we study the evolutionary trajectory of the 10–92 TNAP by solving crystal structures of key intermediates along the pathway and evaluating their biochemical activity. Crystal structures were solved as binary and closed ternary complexes, and biochemical data were collected for fidelity, catalytic efficiency, thermal stability, and substrate specificity. Among the more notable trends is the observation that fidelity and catalysis appear to occur as decoupled activities, possibly guided by separate ground- and transition-state discrimination events. Molecular insights into these processes were obtained from crystal structures that were used to map the process of active site refinement across successive generations of the enzyme. Together, these findings provide a detailed mechanistic framework for understanding how directed evolution reshapes enzyme function to achieve accurate and efficient catalysis.

Results

Biochemical analysis of the evolutionary intermediates

To follow the biochemical changes that occurred along the evolutionary path, we evaluated the polymerases that most influenced the evolution of the 10–92 TNAP for fidelity, activity, thermal stability, and substrate specificity. This analysis included Kod-RSGA as well as intermediates 5–270, 7–47, and 8–64 (Fig. 1c and Supplementary Fig. 4), the most active variants identified in rounds 5, 7, and 8, respectively, and the parent sequences used for further diversification¹¹.

Our analysis of fidelity and catalysis reveals a general trend in which steep gains in fidelity occur in earlier generations of the enzyme, whereas analogous improvements in activity are delayed until later stages of the evolutionary trajectory (Fig. 1d; Supplementary Figs. 5–7 and Supplementary Table 1). We do observe a concomitant improvement in fidelity as catalysis improves, presumably due to fine-tuning of the active site near the end of the evolutionary trajectory. Nevertheless, the overall trend of early improvements in fidelity and later improvements in catalysis caught our attention because it appears to counter the prevailing view that a selection for improved catalytic activity through the use of progressively shorter incubation times, as was the case for 10–92¹¹, should indirectly enrich for enzymes with higher fidelity^15,16. This is because less faithful polymerases are more likely to stall at misincorporation sites, thereby failing to complete the extension within the allotted time^17,18,19,20,21,22. Thus, variants that maintain rapid synthesis under stringent incubation times are expected to combine enhanced catalytic efficiency with increased fidelity. While this interpretation is consistent with the selection results overall, our biochemical analysis of the evolutionary intermediates indicates that variants can emerge that exhibit enhanced fidelity without a proportional gain in catalytic activity. This observation implies that fidelity and catalysis may have determinants that can be separately tuned during the early stages of evolution.

We also found that all of the polymerases tested retain their extreme thermostability, showing no detectable loss in TNA synthesis activity after prolonged exposure to 90 °C for 6 h (Fig. 1d and Supplementary Fig. 8). This result highlights the exceptional structural integrity of the variants and challenges the view that directed evolution inevitably compromises protein folding stability⁵. Such trade-offs are particularly relevant in enzyme engineering campaigns, where enhanced activity often comes at the expense of reduced thermostability²³. Remarkably, the 10–92 TNAP, with its 51 amino acid mutations relative to its closest natural homolog, Kod DNAP (Fig. 1c), exhibits no measurable loss of thermostability (Supplementary Fig. 8). This observation supports the view that protein stability promotes evolvability²⁴, which may be a result of the thermal lysis step of the DrOPS selection protocol²⁵.

Finally, the 10–92 TNAP also exhibits a striking shift in substrate specificity, defined here as the ability of the polymerase to distinguish TNA from DNA substrates based on their catalytic rates of incorporation. Kinetic analysis across generations of the polymerase reveals a consistent reluctance of the enzymes to relinquish their ancestral DNA synthesis activity until the final stages of evolution, at which point an inversion (13-fold) in substrate specificity occurs, favoring TNA synthesis over DNA synthesis (Supplementary Fig. 7). This inversion of substrate preference marks the transition of the enzyme from a broadly acting generalist to one that is increasingly specialized for TNA synthesis on DNA templates. Such a shift is a hallmark of functional divergence, arising from the accumulation of mutations that reconfigure the active site toward a new function.
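The rate-based definition of substrate specificity used above can be illustrated with a minimal Python sketch. It assumes specificity is expressed as the ratio of the TNA incorporation rate to the DNA incorporation rate; the function names and the example rates are hypothetical placeholders, not measured values from this study.

```python
def specificity_ratio(rate_tna: float, rate_dna: float) -> float:
    """Return the TNA/DNA rate ratio; values above 1 mean TNA is preferred."""
    if rate_tna <= 0 or rate_dna <= 0:
        raise ValueError("rates must be positive")
    return rate_tna / rate_dna


def preferred_substrate(rate_tna: float, rate_dna: float) -> str:
    """Name the substrate that is incorporated faster."""
    ratio = specificity_ratio(rate_tna, rate_dna)
    if ratio > 1:
        return "TNA"
    if ratio < 1:
        return "DNA"
    return "none"


# Hypothetical rates (nt/s), chosen only to illustrate an inversion.
early_ratio = specificity_ratio(rate_tna=0.1, rate_dna=1.0)  # DNA-preferring
late_ratio = specificity_ratio(rate_tna=1.0, rate_dna=0.5)   # TNA-preferring

# Fold change in the ratio between the two hypothetical variants.
fold_change = late_ratio / early_ratio
print(preferred_substrate(0.1, 1.0), preferred_substrate(1.0, 0.5))
```

Under this convention, an inversion corresponds to the ratio crossing 1 between successive variants.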

A binary complex capturing an unexpected preorganized state

As an initial step toward solving the puzzle of how 10–92 emerged as a highly active enzyme from a diverse pool of synthetic homologs, crystal structures of major intermediates identified along the evolutionary pathway were first solved as binary structures of the post-catalytic complex with a bound DNA primer-template (P/T) duplex that was enzymatically extended with two TNA residues (3′-tTtA-2′) at the 3′ terminus. Four structures (5–270, 7–47, 8–64, and 10–92) spanning a resolution limit of 2.17–2.56 Å were determined by molecular replacement (Supplementary Table 2). Consistent with all known archaeal B-family DNAPs, the enzymes adopt a disk-shaped architecture (Supplementary Figs. 9 and 10) defined by an N-terminal domain (NTD), an exonuclease domain, and a catalytic domain comprised of the finger, palm, and thumb subdomains²⁶. In all cases, the duplex is well resolved and tightly bound between the palm and thumb subdomains (Supplementary Fig. 9).

Surprisingly, comparative structural analysis reveals an unexpected, closed conformation for the finger subdomain in all four of the TNAP structures. In this conformation, helices α14 and α15 are shifted by ~22° relative to their position in the open binary structure of wild-type Kod DNAP (Fig. 2a–e)²⁷. This finding contrasts with all known binary structures of replicative DNAPs, which typically exhibit an open finger conformation ready to accept the incoming triphosphate²⁸. Until now, the closed state has only been observed in structures of ternary complexes trapped in a catalytically competent state of the reaction cycle (Fig. 2b, c)^29,30.

**Fig. 2: The finger subdomain of TNA polymerases.**

Although this observation could be considered a crystallization artifact, several lines of evidence argue against this interpretation. First, the TNAP variants (5–270, 7–47, 8–64, and 10–92) crystallized under two distinct precipitation conditions, suggesting that the conformation reflects a bona fide structural feature rather than a condition-specific artifact. Second, the closed binary conformation is absent in previous binary structures of Kod-RI³¹ and Kod-RSGA³² (Supplementary Fig. 11), further strengthening the view that the conformational difference arises from sequence changes rather than crystallization chemistry. Finally, the closed binary conformation emerged in variants that also exhibit changes in their biochemical activity, providing a functional correlation that supports its structural relevance.

To understand the molecular basis of this distinct conformation, we analyzed the interdomain interactions stabilizing the closed state. In the binary form of natural Kod DNAP, only limited contacts exist between the exonuclease domain and finger subdomain (Fig. 2f)²⁷, whereas noticeably more contacts are observed as Kod transitions to a closed ternary conformation, supporting a catalytically competent state with the dNTP poised for chemical bond formation (Fig. 2g)²⁶. In contrast to the natural system, the closed binary conformation observed in the TNAP intermediates 5–270, 7–47, 8–64, and 10–92 exhibits a robust network of electrostatic and hydrophobic interactions that preorganize the enzyme in a catalytically active conformation despite the absence of a bound tNTP (Fig. 2h).

The structural similarity between the binary TNAP complexes and the closed ternary structure of 10–92 previously solved by X-ray crystallography is striking, particularly in the geometry of the active site, which differs only by the absence of substrate-specific contacts (Supplementary Fig. 12)¹¹. Because the closed binary motif first appears in the 5–270 intermediate and is absent in the binary structure of Kod-RSGA³² (Supplementary Fig. 11), we infer that this conformational switch arose from recombination events that facilitated TNA synthesis by stabilizing the enzyme in a catalytically productive state. In this scenario, the finger subdomain exists in a dynamic equilibrium between the open and closed conformations, reminiscent of natural DNAPs²⁰. However, upon correct substrate binding and the ensuing conformational change¹⁹, the preorganized state would provide a selective advantage by biasing the enzyme toward the closed conformation, thus offering more time for catalysis to occur prior to substrate release.

The catalytically active state of the enzyme

To visualize how catalytic activity evolved within the TNAP lineage, we determined the crystal structures of key intermediates in a catalytically competent state with a closed finger conformation and a P/T duplex and tATP substrate bound in the enzyme active site. A key aspect of this effort was the chemical synthesis of a chain-terminating 2′-deoxy-α-L-threofuranosyl thymidine-3′-triphosphate (dtTTP) analog, prepared in 8 synthetic steps from a 3′-protected α-L-threofuranosyl thymidine (tT) precursor using a Barton deoxygenation strategy to remove the 2′ hydroxyl group (Supplementary Figs. 13–16)^33,34. This substrate enabled us to capture the reaction state immediately following tNTP binding, but before chemical bond formation. Using this approach, we successfully obtained crystal structures for 5–270 and 8–64, resolving to 3.03 and 2.38 Å, respectively (Supplementary Table 3).

By combining these data with previously solved structures for Kod-RI³¹ and 10–92¹¹, the least and most optimized versions of the enzyme, respectively, a structural framework was established to study the biochemical changes that occurred along the evolutionary path. Superposition of the four TNAP structures (Kod-RI, 5–270, 8–64, and 10–92) reveals that a global divergence in structure occurred following recombination (Fig. 3a, e and Supplementary Fig. 17). The backbone Cα root mean square deviation (RMSD) relative to wild-type Kod DNAP increases from ~0.5 Å for Kod-RI to ≥1.0 Å for 5–270, 8–64, and 10–92, with movements of secondary structural elements observed in the thumb and palm subdomains and the NTD. These changes correspond to a progressive remodeling of the active site pocket (Fig. 3b–f), which aligns with observed biochemical gains in TNA synthesis activity (Fig. 1d). Surprisingly, most of the 51 acquired mutations found in 10–92 are located >20 Å from the catalytic core (Fig. 3a and Supplementary Fig. 18), emphasizing the importance of long-range structural interactions in fine-tuning the active site geometry³⁵.
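For readers unfamiliar with the metric, the Cα RMSD quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this study: it assumes the two structures have already been superposed and that the Cα atoms are listed in matching residue order. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation (in Å) between matched, pre-superposed Cα atoms.

    Each argument is a sequence of (x, y, z) tuples; entry i in both
    sequences must refer to the same residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates (Å), not taken from any deposited structure.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# Every atom displaced by 1 Å along x, so the RMSD is 1 Å.
shifted = [(x + 1.0, y, z) for x, y, z in reference]

print(ca_rmsd(reference, shifted))
```

In practice the superposition step (for example, a least-squares fit) precedes this calculation, and the choice of aligned residues affects the reported value.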

**Fig. 3: Stepwise changes in global and local structural conformation.**

Importantly, deep learning models failed to predict these transitions. Structure predictions by AlphaFold3 yielded only canonical B-family DNAP structures, with open binary conformations and no evidence of the experimentally observed rearrangements (Supplementary Figs. 19–23). This observation underscores a key limitation of current AI-based modeling, which is the inability to accurately capture long-range mutations and evolution-driven conformational shifts that underlie new protein functions. Future improvements in predictive accuracy will require experimental benchmarks and better mechanistic understanding of conformational plasticity during enzyme evolution.

Reshaping the active site pocket

Our previous crystal structure of the 10–92 TNAP trapped in a closed ternary complex revealed how recombination and molecular evolution enabled substantial active site remodeling to support efficient TNA synthesis¹¹. The selected mutations produced local backbone shifts, including high-energy rearrangements, that restructured the active site pocket to promote optimal hydrogen bonding, electrostatic stabilization, and van der Waals complementarity with the bound tNTP. In this conformation, the catalytic aspartate triad (D405, D541, and D543) is precisely arranged to coordinate metal ions and facilitate robust nucleotide transfer chemistry.

By integrating structural data from intermediates along the evolutionary path, we now provide a high-resolution, atomic-level view of how these changes unfolded. A central theme to emerge is the stepwise refinement of the active site geometry, beginning with a disruptive expansion of the active site volume followed by precise, substrate-specific remodeling. For example, the active site volume increases ~250% from wild-type Kod DNAP to Kod-RI, before contracting through 5–270 and 8–64 to a more compact volume of 692 Å³ observed in 10–92, which is approximately 10% smaller than the starting enzyme (Fig. 3c, d and Supplementary Fig. 24). This process of active site refinement allows the enzyme to accommodate the smaller size of the tNTP substrate, noting that TNA derives from a four-carbon threose sugar, as compared to the five-carbon sugar found in DNA³⁶.

At the local level, analysis of the catalytic triad reveals a clear stepwise refinement across the lineage. Overlaying the active site from successive TNAP generations shows that the catalytic residues undergo a series of positional and rotational movements, supported by Polder maps, that ultimately align the aspartate residues into a catalytic configuration closely mimicking the arrangement found in wild-type Kod DNAP (Fig. 3e, f; Supplementary Figs. 25 and 26). These refinements enable TNA substrate recognition while preserving the structural features essential for metal ion coordination and catalysis.

Our structural analysis reveals that base pair recognition and reaction center geometry emerged as two distinct evolutionary pressures shaping the evolutionary trajectory. The former is reflected in the conformation of the nascent Watson–Crick base pair formed between the incoming tATP and the templating thymine base. Following the active site distortion observed in Kod-RI, base pair geometry was largely restored in 5–270, as measured for the parameters of opening, propeller, and buckle (Fig. 4 and Supplementary Table 4). By comparison, the geometry of the reaction center improved more gradually across generations of the enzyme, with the distance from the C2′ group of the primer to the α-phosphate of the bound tATP steadily decreasing from 5.8–6.7 Å in 5–270 (Supplementary Fig. 27) to eventually 3.9 Å in 10–92. This final value closely approximates the ideal distance of 3.7 Å observed in wild-type Kod DNAP (Fig. 4b). Together, these findings underscore the evolutionary mechanism underpinning TNAP optimization; namely, early gains in fidelity correspond to improved base pair recognition, while later gains in catalytic activity emerge from a fine-tuning of the active site geometry. This structural and functional convergence culminates in a highly specialized enzyme whose performance reflects both macro-scale architectural adaptation and micro-scale atomic-level precision.

**Fig. 4: Structural optimization of the nascent base pair and reaction center.**

Mutational landscape underlying the activity of 10–92

The mutational landscape underlying the evolution of the 10–92 TNAP reveals a clear shift from broadly distributed, largely conserved substitutions found in 5–270 to a set of later-arising, non-conserved mutations that cluster in the thumb subdomain and strongly impact TNA synthesis activity (Fig. 5 and Supplementary Table 5). This distribution is unsurprising as the mutations found in 5–270 are the result of homologous recombination, while those in 10–92 derive from focused mutagenesis¹¹. Structural analyses show that mutations occur across both surface-exposed and buried positions, with most mutations residing far from the catalytic center (Fig. 5), again underscoring the importance of long-range interactions in shaping new protein functions³⁵. Computational free-energy predictions indicate that while early substitutions are generally neutral with respect to stability, most of the function-driving mutations obtained from directed evolution are destabilizing (Fig. 5), highlighting the trade-off between structural integrity and catalytic enhancement²³. Limited reversion analysis of later-stage mutations demonstrates that at least one function-driving mutation is present in each of the key intermediates (Fig. 5)¹¹. Together, these data suggest that polymerase specialization for TNA synthesis emerged through a process involving the accumulation of large numbers of mostly neutral mutations via homologous recombination that change the protein architecture, followed by a smaller set of adaptive mutations that refine catalytic activity at the cost of local stability.

**Fig. 5: Integrated analysis of 51 mutations acquired during the evolution of 10–92.**

Identifying the precise set of mutations responsible for improved fidelity is challenging because 5–270 carries 35 mutations, all derived from natural diversity. These substitutions collectively remodel the enzyme architecture, produce major structural alterations that reshape the active site, improve the geometry of the nascent Watson–Crick base pair of the incoming tNTP, and facilitate formation of the preorganized conformation observed in the binary structures. While this extensive remodeling likely underlies the fidelity gains observed in 5–270, the large number of changes makes it difficult to determine which mutations are adaptive drivers versus neutral passengers that simply accumulate through the process of directed evolution. Nevertheless, some mutations do appear to contribute disproportionately to these global changes, including the 381L insertion and substitutions within the thumb subdomain (G602D and K672R). Future studies aimed at disentangling the functional role of these and other mutations will be critical for elucidating the mechanistic basis of the fidelity improvement observed early in the evolutionary path.

Interpreting the mutations responsible for improved catalytic activity is aided by a single-point reversion analysis performed on the ten amino acid mutations observed between 5–270 and 10–92¹¹. This analysis revealed that the four mutations contributing most to catalytic improvement (D615G, A741P, T548P, and S493G) involve the gain of either a glycine or proline residue, while the fifth most important mutation (P550H) replaced a proline with histidine (Figs. 5 and 6). Together, these changes highlight the importance of flexibility and rigidity as a mechanism for fine-tuning the activity of the catalytic center. While these mutations are located ~20 Å (except S493G, which is ~13 Å) from the active site, defined as the 2′ terminus of the primer, they have a profound impact on enzyme function. The D615G mutation, first observed in 7–47, directly contacts the primer strand, and the A741P mutation, first observed in 8–64, directly contacts the template strand (Fig. 6; Supplementary Figs. 28 and 29). These interactions balance the need for the enzyme to bind the duplex tightly enough to allow for catalysis yet weakly enough to facilitate efficient translocation. The T548P and P550H mutations, both observed in 10–92, induce structural changes in a loop that optimizes the position of the catalytic triad for efficient coordination to divalent metals (Fig. 6 and Supplementary Figs. 28–30), allowing for enhanced catalytic efficiency. Finally, the S493G mutation observed in 10–92 is located in the active site pocket (Fig. 6; Supplementary Figs. 28 and 29). This mutation appears to reduce suboptimal interactions between the enzyme and templating base, resulting in better substrate binding and catalysis.

**Fig. 6: Structural analysis of key mutations driving improved catalysis.**

Discussion

This study provides a high-resolution structural and biochemical view of how directed evolution shapes the structure and function of an enzyme to enable the synthesis of an artificial TNA polymer. By mapping the evolutionary trajectory of a highly efficient TNAP isolated from a molecular evolution study involving recombination and directed evolution¹¹, we uncover a series of coordinated molecular events that separate the emergence of fidelity from catalytic efficiency, two properties traditionally thought to be coupled activities in enzyme evolution^15,16,17.

Contrary to prevailing assumptions, we observe that fidelity and catalysis diverge early in the evolutionary process and are refined through distinct structural pathways. Fidelity improves rapidly across early variants, corresponding to restoration of Watson–Crick base pair geometry and nascent base pair stability. In contrast, gains in catalytic activity and altered substrate specificity appear only in later generations and correlate with progressive remodeling of the active site architecture and reaction center geometry. These findings suggest that ground-state and transition-state discrimination can evolve independently, offering a modular mechanism by which enzymes acquire new function. A possible reason for this distinction is that transition-state optimization is more difficult to realize than ground-state optimization, as the free energies of transition-state complexes are much higher than those of ground-state complexes¹⁶.

A defining feature of the evolved polymerase lineage is the adoption of a preorganized, closed finger conformation in the absence of a bound nucleotide. This catalytically competent architecture, which first appears in intermediate 5–270, the most active variant identified after recombination and five rounds of positive selection for TNA synthesis activity, is stabilized by a network of interactions that shift the conformational equilibrium away from the open conformation commonly associated with the binary complex of replicative DNAP. This structural transition may represent a critical inflection point in the evolutionary process, enabling the enzyme to escape the kinetic penalties associated with accommodating the unnatural TNA substrate. Notably, this conformational switch was not predicted by deep learning-based structure prediction models (AlphaFold3), underscoring current limitations in modeling dynamic, evolution-driven enzyme conformations.

Thermal profiling further reveals that even extensive mutational remodeling, 51 mutations in the case of 10–92, does not compromise the structural integrity of the enzyme fold. Instead, the evolved polymerase retains full activity following prolonged exposure at 90 °C. We attribute this remarkable thermostability to the stringent selection environment imposed by the DrOPS screening workflow, which includes a high-temperature lysis step. These findings challenge the view that functional gains necessarily incur stability trade-offs and highlight the value of integrating environmental constraints into the selection process. Additionally, this observation supports the view that intrinsic protein folding stability is an important criterion for promoting protein evolvability²⁴.

The evolution of substrate preference provides deeper insight into the adaptive landscape navigated by the polymerase. Across the evolutionary trajectory, successive variants exhibited progressively higher TNA synthesis rates, while DNA synthesis activity remained predominant. A marked inversion in substrate specificity emerged only in variant 10–92 (Supplementary Fig. 7). The preference of 10–92 for TNA likely stems from its remodeled active site, whose reduced volume reflects structural adaptation to the smaller size of the TNA triphosphate. This inversion in specificity represents a clear case of functional divergence and illustrates how exploration of sequence space (the ensemble of all possible protein sequences of a given length) can yield enzymes that are not only structurally robust but also precisely tuned for distinct chemical tasks.

Taken together, our findings illustrate one example of how directed evolution can orchestrate long-range structural rearrangements and local active site refinement to generate an enzyme with tailor-made activity. The modular separation of fidelity and catalysis observed here offers a framework for understanding enzyme adaptation and informs future efforts to design polymerases for applications in synthetic biology and medicine. More broadly, this work underscores the importance of combining structural biology with evolutionary trajectories to illuminate mechanisms of molecular innovation that remain opaque to current predictive models.

Methods

Reagents

DNA oligonucleotides were purchased from Integrated DNA Technologies (IDT, Coralville, Iowa). TNA triphosphates were obtained by chemical synthesis as described previously^34,37. ThermoPol buffer, Q5 DNA polymerase, XL1 Blue competent E. coli, BL21(DE3) competent E. coli, Taq DNA polymerase, and NdeI and NotI restriction enzymes were purchased from New England Biolabs (Ipswich, MA). QIAprep plasmid minikit was purchased from Qiagen (San Diego, CA). Amicon centrifugal filter, hen egg white lysozyme, and chemical reagents including dNTPs, sodium chloride, manganese chloride, magnesium chloride, Tris-HCl, isopropyl β-D-thiogalactoside (IPTG), 1,4-dithiothreitol (DTT), ampicillin, and ammonium persulfate (APS) were purchased from Sigma Aldrich (St. Louis, Missouri). TOPO TA cloning kit, ethylenediaminetetraacetic acid (EDTA), and Dynabeads M270 were purchased from Thermo Fisher Scientific (Waltham, Massachusetts). SequaGel UreaGel 29:1 Denaturing Gel System was purchased from National Diagnostics (Atlanta, GA). Tetramethyl-ethylenediamine (TEMED) was purchased from Bio-Rad (Hercules, California). Chromatography columns were purchased from Cytiva (Little Chalfont, United Kingdom). EvaGreen® dye was purchased from Biotium (Fremont, CA). Clear V-bottom 96-well plates were purchased from Greiner (Monroe, NC). Plastic 1.5 mL micro-centrifuge tubes were purchased from Sigma-Aldrich (St. Louis, MO). Crystallization screens and individual crystallization reagents were purchased from Hampton Research (Aliso Viejo, CA), NeXtal Biosciences (Holland, OH), and Molecular Dimensions (Holland, OH). The Mosquito crystallization robot was purchased from SPT LabTech (Covina, CA). pET-21b(+) plasmid was purchased from Novagen Technology (Glendale, CA). Whole plasmid and Sanger sequencing were performed by Plasmidsaurus (Los Angeles, CA) and Genewiz (San Diego, CA), respectively.

TNAP expression and purification for biochemical assays

Plasmids for TNAP expression and purification were generated previously³⁸. Briefly, a pGDR11 vector containing the TNAP of interest was transformed into XL1 Blue E. coli, grown, induced for expression, and harvested. Cells were resuspended and sonicated in TNAP buffer (10 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10% glycerol). The cell lysate was heat-treated at 70 °C for 1 h, cooled on ice for 1 h, and centrifuged (23,708 × g, 4 °C, 20 min). The clarified supernatant was treated with a final concentration of 0.5% (v/v) PEI for 15 min and centrifuged (23,708 × g, 4 °C, 20 min) to remove nucleic acids. Ammonium sulfate precipitation was performed (final concentration: 60% (w/v)), centrifuged (23,708 × g, 4 °C, 20 min), and the protein pellet was resuspended in equilibration buffer (10 mM Tris-HCl, pH 8.0, 50 mM NaCl, 10% glycerol). The TNAP was purified by hand on a 5 mL heparin affinity column and eluted by stepwise addition of buffers containing: 10 mM Tris-HCl, pH 8.0, 10% glycerol and increasing concentrations of NaCl (50, 100, 250, 500, and 750 mM). Eluted TNAP at 500 mM NaCl was pooled and concentrated to 100 μM for biochemical assays.

TNAP kinetics

TNAP kinetic measurements were performed by monitoring the reaction progress as the DNA primer-template duplex (PBS8: GTCCCCTTGGGGATACCACC and EM619: CCCACACCCTCCTATCGCTAAACACACACTTAATAAAGTTGGTGGTATCCCCAAGGGGAC, Supplementary Table 1) was extended with dNTPs or tNTPs to measure DNA or TNA synthesis, respectively, over time³⁹. For each time point, two individual reaction replicates (15 µL) from a single master mix poised under single-turnover conditions with equimolar concentrations of TNAP (0.2 µM) and pre-annealed duplex (0.2 µM each, heated at 90 °C for 5 min, cooled on ice) in 1x ThermoPol buffer [20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl, 2 mM MgSO₄, 0.1% Triton X-100, (pH 8.8)] were pre-equilibrated for 5 min at 55 °C. The reactions were then initiated by adding a preheated triphosphate mixture (200 µM of each). The reactions were stopped at designated time points by plunging the reaction vessel into powdered dry ice. Once all the time points were collected, the frozen reactions were thawed on a cold plate by adding 15 µL (6 µM) EvaGreen® dye, previously identified as the optimal intercalating dye to monitor the synthesis on a DNA template³⁹. Each reaction (25 µL) was transferred to a clear V-bottom 96-well plate, and fluorescence intensity was measured (ex: 487/20 nm, em: 528/20 nm) using a Synergy Neo2 plate reader with the Gen5 v3.14.03 software (BioTek).

For both DNA and TNA synthesis, baseline fluorescence was measured using Kod (exo-), the parent polymerase with negligible TNA synthesis ability, which was used to collect a 0-s time point by freezing the reaction immediately after adding tNTPs. The maximum fluorescence value was obtained from a 15-min reaction with each polymerase. The raw fluorescence data from the plate reader were normalized by subtracting the baseline and dividing by the difference between the maximum (15-min reaction) and minimum (baseline) fluorescence values. The normalized data were then converted to nucleotides per polymerase³⁹ by multiplying by the conversion factor F = [Template] × L/[Polymerase], where L is the length of extension in bases, and plotted over time. Rates in nucleotides per minute were extracted as the slope of the linear range of each curve. Data were processed using Microsoft Excel v16.66.
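The normalization and conversion described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' Excel workflow: the function names and the fluorescence readings in the usage note are hypothetical, and the extension length L = 40 is an assumption derived from the sequences given above (60-nt EM619 template minus the 20-nt PBS8 primer, assuming full-length extension).

```python
def normalize(raw, baseline, maximum):
    """Scale raw fluorescence to 0-1 using the baseline (0-s Kod exo-)
    and maximum (15-min reaction) readings."""
    span = maximum - baseline
    if span <= 0:
        raise ValueError("maximum must exceed baseline")
    return [(value - baseline) / span for value in raw]


def nucleotides_per_polymerase(normalized, template_uM, polymerase_uM, extension_len):
    """Apply the conversion factor F = [Template] * L / [Polymerase]."""
    factor = template_uM * extension_len / polymerase_uM
    return [factor * value for value in normalized]


def linear_rate(times_min, values):
    """Least-squares slope (nucleotides per minute) over the supplied points.
    The caller is responsible for passing only the linear range of the curve."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_min, values))
    return sxy / sxx
```

For example, hypothetical readings of 100, 200, and 300 fluorescence units at 0, 0.5, and 1.0 min, with a baseline of 100 and a maximum of 500, would be passed through `normalize`, then `nucleotides_per_polymerase(..., 0.2, 0.2, 40)`, then `linear_rate` to obtain a rate in nucleotides per minute.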

TNAP fidelity

Fidelity analysis was performed in hydrogel particles⁴⁰. In brief, an aliquot of ~12 million hydrogel particles displaying the acrydite PBS8 primer (GTCCCCTTGGGGATACCACC, Supplementary Table 1) was placed into a 1.5 mL tube and washed twice with 100 μL 1x ThermoPol buffer. The hydrogels were then resuspended in 100 μL 1x ThermoPol buffer containing 2 μM 4NT.9G fidelity template (Supplementary Table 1) and 1% KF-6012 and heated at 95 °C for 3 min. The reaction mixture was then snap-cooled on ice for 5 min to promote primer-template annealing. To initiate TNA synthesis, 2 μL of tNTPs (5 mM stock) and 20 μL of TNAP (10 μM stock) were added, mixed, and then incubated at 55 °C on a ThermoMixer C for 1 h. After the reaction was complete, the hydrogels were washed twice with 100 μL breaking buffer (10 mM Tris-HCl pH 7.5, 100 mM NaCl, 1 mM EDTA, 1% (v/v) SDS, 1% (v/v) Triton X-100), incubated with 100 μL 100 mM NaOH (37 °C, 3 min) to strip the template, and neutralized with 100 μL breaking buffer. The hydrogels were resuspended in 100 μL 1x ThermoPol buffer containing 2 μM PBS7.PBS9 overhang primer (Supplementary Table 1) with a double nucleotide mismatch (TT-TT), 0.01% KF-6012, and 3 mM magnesium sulfate and then heated at 95 °C for 3 min. The reaction mixture was then snap-cooled on ice for 5 min to promote primer-template annealing. To initiate the reverse transcription reaction, 10 μL of dNTPs (5 mM stock) and 20 μL of Bst-LF (10 μM stock) were added, mixed, and then incubated at 50 °C on a ThermoMixer C for 4 h. The cDNA was PCR amplified with Taq DNA polymerase using PBS8 and PBS9 (CTTTTAAGAACCGGACGAAC) as primers (Supplementary Table 1). PCR material was ligated into a TOPO vector using a TOPO-TA kit and cloned into DH5α competent E. coli cells. Individual clones were picked, grown in liquid LB media, and sequenced. DNA sequences were aligned and analyzed with the template using CLC Main software. Error rates were determined as μ_exp→obs = (#observed/#expected) × 1000. The total error rate was determined by summing the error rate for each substitution.
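As a rough illustration of the error-rate formula, the following Python sketch tallies substitutions from clone sequences that have already been aligned to the template. It is a simplified stand-in for the CLC Main analysis, not the authors' procedure: the function names are hypothetical, and it assumes gap-free reads of the same length as the template (insertions and deletions are not handled).

```python
from collections import Counter


def substitution_error_rates(template, reads):
    """Return errors per 1000 expected bases for each substitution type,
    i.e. (#observed / #expected) * 1000, keyed as 'expected->observed'."""
    expected = Counter()   # how often each template base was sequenced
    observed = Counter()   # how often each (expected, observed) mismatch occurred
    for read in reads:
        if len(read) != len(template):
            raise ValueError("reads must be aligned, gap-free, and template length")
        for exp_base, obs_base in zip(template, read):
            expected[exp_base] += 1
            if obs_base != exp_base:
                observed[(exp_base, obs_base)] += 1
    return {
        f"{exp_base}->{obs_base}": 1000.0 * count / expected[exp_base]
        for (exp_base, obs_base), count in observed.items()
    }


def total_error_rate(rates):
    """Sum the per-substitution error rates."""
    return sum(rates.values())
```

Passing the template sequence and a list of aligned clone reads to `substitution_error_rates`, then the result to `total_error_rate`, gives the per-substitution and summed values in errors per 1000 bases.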

TNAP thermal challenge

Five 1.5 mL Eppendorf tubes, each containing 100 μL Kod or TNAP (10 μM), were placed into an Eppendorf ThermoMixer set at 90 °C with a thermal lid to prevent evaporation. After 6 h, the tubes were spun at 16,300 × g for 5 min at 4 °C and UV quantified using a NanoDrop instrument. TNA synthesis was assayed with a primer-template duplex (PBS8_short: /5IRD680/GTCCCCTTGG and EM619: CCCACACCCTCCTATCGCTAAACACACACTTAATAAAGTTGGTGGTATCCCCAAGGGGAC, Supplementary Table 1). The polymerase activity screen was performed with the following final concentrations: 1 μM primer-template duplex, 100 μM tNTPs, and 1 μM heat-treated polymerase in a 20 μL reaction. After 30 min of incubation, a 1 μL aliquot of the reaction was quenched by the addition of 39 μL of quenching buffer and visualized by denaturing polyacrylamide gel electrophoresis with fluorescent imaging on a Li-Cor Odyssey CLx Imager using Image Studio Lite v5.2.

TNAP cloning, expression, and purification for crystallography

Full-length tnap genes in pGDR11 were PCR amplified using the TNAP-Fwd (ATCCATATGATCCTCGACACTGACTAC) and TNAP-Rvs (ACGCATGCGGCCGCTCAAGTTCCTTTCGGCGTCAG) primers (Supplementary Table 1), containing NdeI and NotI restriction enzyme sites, respectively. C-terminal truncation constructs for all TNAPs, except for 8–64, were PCR amplified by substituting the TNAP-Rvs primer with TNAP-Rvs_760 (ACGCATGCGGCCGCTCACTTCTGGTAGCGCAGGTC, Supplementary Table 1); 8–64-Rvs_760: ACGCATGCGGCCGCTCAAGTTCCTTTCGGCGTCAG was used to amplify 8–64, which harbored acquired mutations at the C-terminus. All reverse primers contain the NotI restriction enzyme site. Purified PCR product of each gene construct and pET-21b(+) were digested with NdeI and NotI restriction enzymes and ligated, resulting in pET21–5–270, pET21–5–270_760, pET21–7–47, pET21–7–47_760, pET21–8–64, pET21–8–64_760, and pET21–10–92. After verification by whole-plasmid sequencing, all TNAP constructs were expressed and purified as described previously³¹. Briefly, BL21(DE3) cells harboring pET21-tnap were grown aerobically at 37 °C in LB medium containing 100 μg mL⁻¹ ampicillin. At an OD₆₀₀ of 0.8, expression of a tagless TNAP was induced with 0.8 mM isopropyl β-D-thiogalactoside at 18 °C for 20 h. Cells were harvested by centrifugation for 20 min at 3315 × g at 4 °C and lysed by sonication in 40 mL lysis buffer (10 mM Tris-HCl, pH 7 for full-length or pH 6.5 for truncated TNAP, 100 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 10% glycerol, 5 mg hen egg white lysozyme). The cell lysate was centrifuged at 23,708 × g for 30 min, and the clarified supernatant was heat-treated for 20 min at 70 °C and centrifuged again at 23,708 × g for 30 min. The supernatant was loaded onto 5 mL HiTrap Q HP and heparin HP columns assembled in series, with the flow-through of the Q column loaded directly onto the heparin column. After washing with lysis buffer, the Q column was removed, and TNAP was eluted from the heparin column with a high salt buffer (10 mM Tris-HCl, pH 7 for full-length or pH 6.5 for truncated TNAP, 1 M NaCl, 0.1 mM EDTA, 1 mM DTT, 10% glycerol) using a linear gradient. Eluted fractions containing TNAP were visualized by SDS-PAGE, pooled, and concentrated using a 30 kDa cutoff Amicon centrifugal filter. Further purification was achieved by size exclusion chromatography (Superdex 200 HiLoad 16/600) pre-equilibrated with TNAP buffer (50 mM Tris-HCl, pH 8.5, 200 mM NaCl, 0.1 mM EDTA, 1 mM DTT). Purified TNAP was concentrated to 20 mg mL⁻¹ for crystallization trials.

TNAP crystallography

The binary DNA template, T1: /5Cy5/AAACGTACGCAGTTCGCG, was HPLC purified, while the remaining oligonucleotides, P: CGCGAACTGCG and T2: TATGCACGTACGCAGTTCGCG, were ordered with standard desalting (Supplementary Table 1). The primer-template duplexes, P/T1 and P/T2 for the binary and ternary complexes, respectively, were prepared by combining equal parts of the respective primer and template strands in TNAP buffer supplemented with 20 mM MgCl₂, heating at 95 °C for 5 min, and slow cooling to 10 °C over 10 min.
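The pairing of primer P with templates T1 and T2 can be checked with a short Python sketch. It is offered only as a sanity check on the sequences quoted above, with hypothetical function names; it assumes perfect Watson-Crick pairing of the whole primer at the 3′ end of the template and ignores the 5′ Cy5 label.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def template_overhang(primer, template):
    """Return the single-stranded 5' template overhang left after the primer
    anneals to the 3' end of the template."""
    site = reverse_complement(primer)
    if not template.endswith(site):
        raise ValueError("primer is not complementary to the template 3' end")
    return template[: len(template) - len(site)]


def next_incoming_bases(primer, template, n):
    """Bases of the first n nucleotides that would be templated, in order
    of incorporation (the overhang is read from its 3' end)."""
    return reverse_complement(template_overhang(primer, template))[:n]


P = "CGCGAACTGCG"
T1 = "AAACGTACGCAGTTCGCG"
T2 = "TATGCACGTACGCAGTTCGCG"
```

With these sequences, `template_overhang(P, T1)` and `template_overhang(P, T2)` return the unpaired template regions, and `next_incoming_bases(P, T2, 2)` gives the first two templated bases, which can be compared with the order in which the triphosphates were added during complex preparation.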

All constructs were screened against ~900 conditions in a hanging-drop format using a Mosquito crystallization robot (SPT LabTech). Positive crystal hits were further optimized in 24-well hanging drop trays, with each drop consisting of 1 μL of sample mixed with 1 μL of mother liquor over 500 μL mother liquor in every well. The specific crystallization conditions for each TNAP complex are described below. Six diffraction datasets were collected at synchrotron sources (Advanced Light Source, Stanford Synchrotron Radiation Lightsource, and National Synchrotron Light Source II) from single crystals. Unless specified, images were indexed, integrated, and scaled using XDS⁴¹. Data collection statistics are summarized in Tables S2 and S3. Initial models were determined by molecular replacement using Phaser⁴². For the binary complexes, the Kod-RSGA binary complex (7RSU) was initially used as the search model; however, the 10–92 closed ternary complex (8T3X) was subsequently used because the binary structures adopt a closed conformation. 8T3X was also used as the search model for the ternary complexes. All final models were determined using iterative rounds of manual building in Coot⁴³ and refinement with Phenix⁴⁴. The stereochemistry and geometry of all structures were validated with MolProbity⁴⁵, with the final refinement parameters summarized in Tables S2 and S3. Final coordinates and structure factors have been deposited in the Protein Data Bank.

Binary complexes were prepared by incubating full-length TNAPs (6 mg mL⁻¹) with 1.2 molar equivalents of the annealed P/T1 duplex at 37 °C for 30 min. A 5-fold molar excess each of tTTP and tATP was then added, with a separate 30 min incubation at 37 °C for each. The resulting 5–270 binary complex crystallized in 0.1 M calcium acetate, 0.1 M 2-(N-morpholino)ethanesulfonic acid pH 6, and 15% polyethylene glycol 400, while the remaining binary complexes crystallized in 0.06 M magnesium chloride hexahydrate, 0.06 M calcium chloride dihydrate, 0.1 M imidazole, 0.1 M 2-(N-morpholino)ethanesulfonic acid pH 6, 17.5–19% glycerol, and 7.5–9% polyethylene glycol 4000. The images for the 7–47 binary complex were indexed, integrated, and scaled using iMOSFLM.

Initial binary complexes were prepared by incubating truncated TNAPs (6 mg mL⁻¹) with 1.2 molar equivalents of the annealed P/T2 duplex at 37 °C for 30 min. A 3-fold molar excess of dtTTP was added and incubated at 37 °C for 30 min. The resulting n + 1 binary complex was further incubated with a 10-fold molar excess of tATP at 37 °C for another 30 min to yield the final ternary complexes. 5–270 ternary complex crystals grew in 0.2 M sodium iodide, 20% polyethylene glycol 3350, and 0.1 M calcium chloride dihydrate, and 8–64 ternary complex crystals grew in 0.2 M sodium iodide, 20% polyethylene glycol 3350, and 0.1 M taurine.

TNAP structural analysis

Molecular visualizations, RMSD, and distance and torsion angle measurements were performed using licensed PyMOL (version 2.5.2; https://www.pymol.org)⁴⁶ and open-source PyMOL (version 3.1.0; https://github.com/schrodinger/pymol-open-source). Structural alignments for visualization and analysis in Figs. 3 and 4 were carried out using the “align” function in PyMOL, applied to the entire protein–nucleic acid complex. Polder omit maps were generated in Phenix using phenix.polder and visualized in PyMOL⁴⁷. Local base-pair and base-pair step parameters were computed using Web 3DNA⁴⁸. 2D interaction maps were assembled with data from Mapiya and DNAproDB. Pocket volumes were calculated using PyVOL.

Integrated mutational analysis

Sequence alignments used to generate the fitness landscape (Supplementary Fig. 3) were performed using the EMBOSS Needle algorithm⁴⁹ on the EMBL–EBI platform⁵⁰. Conservation was quantified with BLOSUM62 substitution scores⁵¹.

Functional effects of reversions to 5–270 were quantified from the 45-s primer extension assay as previously reported¹¹. Band intensities were measured by densitometry in ImageJ⁵² as the fraction of full-length product (top band) relative to the total pixel density of each lane. Activities were normalized to the performance of 10–92.
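The densitometry calculation can be illustrated with a minimal Python sketch. It assumes band intensities have already been exported from ImageJ for each lane, ordered from the top (full-length) band downward; the function names, the variant labels, and the numbers in the usage note are hypothetical and are not data from this study.

```python
def fraction_full_length(band_intensities):
    """Fraction of the lane's total pixel density found in the top
    (full-length) band. Intensities are ordered top band first."""
    total = sum(band_intensities)
    if total <= 0:
        raise ValueError("lane has no signal")
    return band_intensities[0] / total


def normalize_to_reference(fractions, reference):
    """Express each variant's full-length fraction relative to the
    reference polymerase (10-92 in the text above)."""
    ref_value = fractions[reference]
    return {name: value / ref_value for name, value in fractions.items()}
```

For instance, a reference lane with hypothetical intensities `[80, 10, 10]` and a revertant lane with `[40, 30, 30]` would give full-length fractions of 0.8 and 0.4, so the revertant would be reported at half the activity of the reference.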

Mutational effects on folding stability were estimated using FoldX5⁵³ with the BuildModel command applied to a repaired Kod^exo- closed ternary structure (PDB ID: 5OMF). Reported ΔΔG values are in kcal mol⁻¹.

Distances between each mutated alpha carbon atom and the active site, defined by the sugar moiety of the dideoxy primer terminus (C2′ in TNA or C3′ in DNA), were calculated in PyMOL on ternary complexes of Kod, 5–270, and 10–92.

Relative solvent-accessible surface area (SASA) was computed on ternary structures using FreeSASA⁵⁴ with the default parameters. Values were reported as relative exposure (%) in ternary complexes of Kod, 5–270, and 10–92.

AlphaFold3 predictions

AlphaFold3⁵⁵ structural predictions were performed on the online server provided by Google DeepMind (https://alphafoldserver.com/) on 2024-08-16, before the initial public release of the 10–92 ternary structure (PDB ID: 8T3X) on 2024-08-28. Two AlphaFold3 runs (denoted AF3-1 and AF3-2) were performed for every evaluated structure, producing 5 models per run (models 0–4, denoted by subscripts). The binary structural predictions were obtained by inputting the sequence of residues 1–776 in the presence of a DNA primer (CGCGAACTGC), DNA template (AAACGTACGCAGTTCGCG) and 2 Mg²⁺ ions. Ternary structural predictions were obtained by inputting the sequence of residues 1–760 in the presence of a DNA primer (CGCGAACTGCGT), DNA template (TATGCACGTACGCAGTTCGCG), an ATP, and 2 Mg²⁺ ions.

2′-deoxy-α-L-threofuranosyl thymidine-3′-triphosphate (dtTTP) synthesis

Synthetic procedures

All moisture-sensitive reactions were performed in anhydrous solvents under an argon or nitrogen atmosphere. All commercial reagents, solvents, and anhydrous solvents were used without further purification. Reaction progress was monitored by thin-layer chromatography using glass-backed analytical silica plates with UV-active F254 indicator. Flash column chromatography was performed with Silica Flash® P60 silica gel (40–63 µm particle size) for most crude reaction mixtures. Yields are reported as isolated yields of pure compounds. ¹H, ¹³C, and ³¹P NMR spectra were recorded on a 400 MHz NMR spectrometer (Bruker Avance NEO). ¹H values are reported in parts per million relative to Me₄Si or the corresponding deuterated solvent used as an internal standard. ¹³C values are reported in parts per million relative to the corresponding deuterated solvent used as an internal standard. ³¹P NMR values are reported in parts per million relative to an external standard of 85% H₃PO₄. Splitting patterns are designated as follows: s, singlet; d, doublet; dd, doublet of doublets; t, triplet; m, multiplet. High-resolution mass spectrometry data were acquired using the electrospray ionization time-of-flight method at the University of California, Irvine Mass Spectrometry Core Facility.

Chemical synthesis, step 1

To the activated nucleoside monophosphate 1³³ (240 mg, 0.6087 mmol, 1 equiv) was added 1.2 equiv of 1-(2-(pyrenesulfonyl)ethyl)pyrophosphate (343 mg, 0.7305 mmol), along with 8 equiv of a premade solution of ZnCl₂ (664 mg, 4.8670 mmol) in 3.2 mL anhydrous DMF under a nitrogen atmosphere. The mixture was stirred at room temperature for 5 h, and the reaction progress was monitored by HPLC (MeCN/0.05 M TEAB buffer, from 0% to 70% over 42 min). After consumption of the starting material, the product was precipitated by adding the reaction mixture dropwise to stirred diethyl ether (170 mL). The precipitate was collected by centrifuging at 20,000 × g for 5 min at room temperature. The pellet was resuspended in 30 mL 20% H₂O in MeCN containing 2% N,N-diisopropylethylamine (DIPEA) and centrifuged at 20,000 × g for 5 min, and the supernatant was collected. The above process was repeated two times, and the combined supernatants were evaporated under diminished pressure. The crude was loaded on a silica gel column (packed with 2% H₂O/isopropanol containing 1% DIPEA) by liquid loading (2 mL of 2% H₂O/isopropanol containing 1% DIPEA) and eluted with 5%–10% H₂O/isopropanol containing 1% DIPEA and then 5% H₂O in (isopropanol/MeCN 1:1) containing 1% DIPEA. The fractions containing the product were collected and evaporated under diminished pressure at 30–40 °C to afford pure, fully protected nucleoside triphosphate 2 (240 mg, 0.2758 mmol). ³¹P NMR (162 MHz, D₂O) δ −11.93, −12.85 (d, J = 15.6 Hz), −23.10.

Chemical synthesis, step 2

A solution of fully protected nucleoside triphosphate 2 (240 mg, 0.2758 mmol) in 30–33% aqueous NH₄OH was stirred for 18 h at 37 °C in a sealed tube. After the reaction, the solvent was evaporated under diminished pressure. The solid was resuspended in MilliQ water (5 mL), and the aqueous solution was washed with DCM (10 mL) and EtOAc (10 mL). The organic portion was discarded, and the aqueous extract was collected and evaporated under diminished pressure. The crude solid was resuspended in 2 mL of RNase-free water, filtered through a 0.22 μm syringe filter, and added dropwise to forty volumes of acetone (80 mL) at room temperature containing NaClO₄ (390 mg, 3.1815 mmol, 15 equiv). The resulting suspension was centrifuged at 20,000 × g for 5 min at room temperature. The supernatant was discarded, and the pellet was washed with organic solution (acetone/DCM, 10:1, 2 × 30 mL) to afford 1a (85 mg, 0.1577 mmol, 26% two-step yield). ³¹P NMR (162 MHz, D₂O) δ −8.09, −12.42 (d, J = 19.5 Hz), −22.52 (t, J = 19.5 Hz, 1H).
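The stoichiometry and yield arithmetic in the two steps can be reproduced with a small Python sketch. The helper names are hypothetical; the millimole quantities are taken from the procedures above, and the calculation assumes the activated monophosphate 1 (0.6087 mmol) is the limiting reagent for the two-step yield.

```python
def equivalents(reagent_mmol, limiting_mmol):
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


def percent_yield(product_mmol, limiting_mmol):
    """Isolated yield as a percentage of the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


START_MMOL = 0.6087         # activated nucleoside monophosphate 1
PYROPHOSPHATE_MMOL = 0.7305
ZNCL2_MMOL = 4.8670
INTERMEDIATE_MMOL = 0.2758  # fully protected triphosphate 2
PRODUCT_MMOL = 0.1577       # final triphosphate 1a

step1_yield = percent_yield(INTERMEDIATE_MMOL, START_MMOL)
step2_yield = percent_yield(PRODUCT_MMOL, INTERMEDIATE_MMOL)
two_step_yield = percent_yield(PRODUCT_MMOL, START_MMOL)
```

Rounding `two_step_yield` to the nearest percent reproduces the reported 26% two-step yield, and `equivalents(PYROPHOSPHATE_MMOL, START_MMOL)` and `equivalents(ZNCL2_MMOL, START_MMOL)` round to the stated 1.2 and 8 equiv.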

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The crystallographic data generated in this study have been deposited in the Protein Data Bank under accession codes 9OAT, 9OAU, 9OAV, 9OAW, 9OAX, and 9OAY. The Protein Data Bank also contains additional structures referenced in this work: 4K8Z, 5OMF, 7OMB, 8T3X. The kinetics, fidelity, and thermostability data generated in this study are provided in the Supplementary Information. Source data are provided with this paper.

References

Turner, N. J. Directed evolution drives the next generation of biocatalysts. Nat. Chem. Biol. 5, 567–573 (2009).
Buller, R. et al. From nature to industry: harnessing enzymes for biocatalysis. Science 382, eadh8615 (2023).
Davis, A. M., Plowright, A. T. & Valeur, E. Directing evolution: the next revolution in drug discovery? Nat. Rev. Drug Discov. 16, 681–698 (2017).
Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876 (2009).
Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).
Lo Surdo, P., Walsh, M. A. & Sollazzo, M. A novel ADP- and zinc-binding fold from function-directed in vitro evolution. Nat. Struct. Mol. Biol. 11, 382–383 (2004).
Chao, F. A. et al. Structure and dynamics of a primordial catalytic fold generated by in vitro evolution. Nat. Chem. Biol. 9, 81–83 (2013).
Blomberg, R. et al. Precision is essential for efficient catalysis in an evolved Kemp eliminase. Nature 503, 418–421 (2013).
Suzuki, T. et al. Crystal structures reveal an elusive functional domain of pyrrolysyl-tRNA synthetase. Nat. Chem. Biol. 13, 1261–1266 (2017).
Yang, G., Miton, C. M. & Tokuriki, N. A mechanistic view of enzyme evolution. Protein Sci. 29, 1724–1747 (2020).
Maola, V. A. et al. Directed evolution of a highly efficient TNA polymerase achieved by homologous recombination. Nat. Catal. 7, 1173–1185 (2024).
Larsen, A. C. et al. A general strategy for expanding polymerase function by droplet microfluidics. Nat. Commun. 7, 11235 (2016).
Vallejo, D., Nikoomanzar, A., Paegel, B. M. & Chaput, J. C. Fluorescence-activated droplet sorting for single-cell directed evolution. ACS Synth. Biol. 8, 1430–1440 (2019).
Nikoomanzar, A., Vallejo, D., Yik, E. J. & Chaput, J. C. Programmed allelic mutagenesis of a DNA polymerase with single amino acid resolution. ACS Synth. Biol. 9, 1873–1881 (2020).
Ravasio, R. et al. A minimal scenario for the origin of non-equilibrium order. Preprint at https://doi.org/10.48550/arXiv.2405.10911 (2025).
Tawfik, D. S. Accuracy-rate tradeoffs: how do enzymes meet demands of selectivity and catalytic efficiency? Curr. Opin. Chem. Biol. 21, 73–80 (2014).
Beard, W. A., Shock, D. D., Vande Berg, B. J. & Wilson, S. H. Efficiency of correct nucleotide insertion governs DNA polymerase fidelity. J. Biol. Chem. 277, 47393–47398 (2002).
Johnson, S. J. & Beese, L. S. Structures of mismatch replication errors observed in a DNA polymerase. Cell 116, 803–816 (2004).
Johnson, K. A. Role of induced fit in enzyme specificity: a molecular forward/reverse switch. J. Biol. Chem. 283, 26297–26301 (2008).
Berezhna, S. Y., Gill, J. P., Lamichhane, R. & Millar, D. P. Single-molecule Förster resonance energy transfer reveals an innate fidelity checkpoint in DNA polymerase I. J. Am. Chem. Soc. 134, 11261–11268 (2012).
Wu, E. Y. & Beese, L. S. The structure of a high fidelity DNA polymerase bound to a mismatched nucleotide reveals an “ajar” intermediate conformation in the nucleotide selection mechanism. J. Biol. Chem. 286, 19758–19767 (2011).
Yu, H., Zhang, S. & Chaput, J. C. Darwinian evolution of an alternative genetic system provides support for TNA as an RNA progenitor. Nat. Chem. 4, 183–187 (2012).
Tokuriki, N. et al. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat. Commun. 3, 1257 (2012).
Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. Protein stability promotes evolvability. Proc. Natl. Acad. Sci. USA. 103, 5869–5874 (2006).
Vallejo, D., Nikoomanzar, A. & Chaput, J. C. Directed evolution of custom polymerases using droplet microfluidics. Methods Enzymol. 644, 227–253 (2020).
Kropp, H. M., Betz, K., Wirth, J., Diederichs, K. & Marx, A. Crystal structures of ternary complexes of archaeal B-family DNA polymerases. PLoS ONE 12, e0188005 (2017).
Bergen, K., Betz, K., Welte, W., Diederichs, K. & Marx, A. Structures of KOD and 9°N DNA polymerases complexed with primer template duplex. ChemBioChem 14, 1058–1062 (2013).
Steitz, T. A. DNA polymerases: structural diversity and common mechanisms. J. Biol. Chem. 274, 17395–17398 (1999).
Li, Y., Korolev, S. & Waksman, G. Crystal structures of open and closed forms of binary and ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I: structural basis for nucleotide incorporation. EMBO J. 17, 7514–7525 (1998).
Doublie, S., Tabor, S., Long, A. M., Richardson, C. C. & Ellenberger, T. Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391, 251–258 (1998).
Chim, N., Shi, C., Sau, S. P., Nikoomanzar, A. & Chaput, J. C. Structural basis for TNA synthesis by an engineered TNA polymerase. Nat. Commun. 8, 1810 (2017).
Li, Q. et al. Synthesis and polymerase recognition of threose nucleic acid triphosphates equipped with diverse chemical functionalities. J. Am. Chem. Soc. 143, 17761–17768 (2021).
Bala, S. et al. Synthesis of 2’-deoxy-alpha-l-threofuranosyl nucleoside triphosphates. J. Org. Chem. 83, 8840–8850 (2018).
Liao, J.-Y., Bala, S., Ngor, A. K., Yik, E. J. & Chaput, J. C. P(V) reagents for the scalable synthesis of natural and modified nucleoside triphosphates. J. Am. Chem. Soc. 141, 13286–13289 (2019).
Gromiha, M. M. & Selvaraj, S. Importance of long-range interactions in protein folding. Biophys. Chem. 77, 49–68 (1999).
Schöning, K. U. et al. Chemical etiology of nucleic acid structure: the α-threofuranosyl-(3′→2′) oligonucleotide system. Science 290, 1347–1351 (2000).
Sau, S. P., Fahmi, N. E., Liao, J.-Y., Bala, S. & Chaput, J. C. A scalable synthesis of α-L-threose nucleic acid monomers. J. Org. Chem. 81, 2302–2307 (2016).
Nikoomanzar, A., Dunn, M. R. & Chaput, J. C. Engineered polymerases with altered substrate specificity: expression and purification. Curr. Protoc. Nucleic Acid Chem. 69, 75 (2017). 4.
Nikoomanzar, A., Dunn, M. R. & Chaput, J. C. Evaluating the rate and substrate specificity of laboratory evolved XNA polymerases. Anal. Chem. 89, 12622–12625 (2017).
Medina, E. L. & Chaput, J. C. Measuring XNA polymerase fidelity in a hydrogel particle format. Nucleic Acids Res. 53, gkaf038 (2025).
Kabsch, W. XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132 (2010).
McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
Williams, C. J. et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
The PYMOL Molecular Graphics System, Version 3.0 Schrodinger, LLC.
Liebschner, D. et al. Polder maps: improving OMIT maps by excluding bulk solvent. Acta Crystallogr. D Struct. Biol. 73, 148–157 (2017).
Li, S., Olson, W. K. & Lu, X. J. Web 3DNA 2.0 for the analysis, visualization, and modeling of 3D nucleic acid structures. Nucleic Acids Res. 47, W26–W34 (2019).
Needleman, S. B. & Wunsch, C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453 (1970).
Madeira, F. et al. The EMBL-EBI job dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res. 52, W521–W525 (2024).
Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA. 89, 10915–10919 (1992).
Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).
Schymkowitz, J. et al. The FoldX web server: an online force field. Nucleic Acids Res. 33, W382–W388 (2005).
Mitternacht, S. FreeSASA: an open source C library for solvent accessible surface area calculations. F1000Research 5, 189 (2016).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).

Acknowledgements

We wish to thank members of the Chaput lab for helpful comments and suggestions, and the staff at the Advanced Light Source (ALS), Stanford Synchrotron Radiation Lightsource (SSRL), and the National Synchrotron Light Source II (NSLS-II) for technical assistance. This work was supported in part by the National Science Foundation (2433788) Division of Chemistry of Life’s Processes (CLP) and Genetic Mechanisms (GM), and the University of California.

Author information

These authors contributed equally: Mohammad Hajjar, Victoria A. Maola, Joy J. Lee.

Authors and Affiliations

Department of Pharmaceutical Sciences, University of California, Irvine, CA, USA
Mohammad Hajjar, Victoria A. Maola, Joy J. Lee, Manuel J. Holguin, Riley N. Quijano, Kalvin K. Nguyen, Katherine L. Ho, Jenny V. Medina, Elionel Botello-Cornejo, Bhawna Barpuzary, Nicholas Chim & John C. Chaput
Department of Chemistry, University of California, Irvine, CA, USA
John C. Chaput
Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, USA
John C. Chaput
Department of Chemical and Biomolecular Engineering, University of California, Irvine, CA, USA
John C. Chaput

Contributions

J.C. and N.C. conceived the study and designed the research plan. M.H., V.M., J.L., M.J.H., R.Q., K.N., K.H., J.M., E.B., and B.B. performed the experiments and analyzed the data. J.C. and N.C. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Nicholas Chim or John C. Chaput.

Ethics declarations

Competing interests

J.C., V.M., and the University of California-Irvine have filed a patent application (PCT/US24/11595) on the composition and activity of the 10–92 TNA polymerase. No financial competing interests are declared. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Vitor Pinheiro and the other anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Peer Review File (PDF)

Reporting Summary (PDF)

Source data

Source data (XLSX)

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hajjar, M., Maola, V.A., Lee, J.J. et al. Directed evolution of a TNA polymerase identifies independent paths to fidelity and catalysis. Nat Commun 17, 925 (2026). https://doi.org/10.1038/s41467-025-67652-1

Received: 25 July 2025
Accepted: 04 December 2025
Published: 19 December 2025
Version of record: 23 January 2026
DOI: https://doi.org/10.1038/s41467-025-67652-1