Main

Cells produce tens of thousands of distinct proteins from 20 amino acids1. Were these amino acids to polymerize in isolation and then fold, it would result in the formation of a heterogeneous population of products; the amino acid monomers do not encode enough information in their interactions alone to direct the assembly of so many specific proteins from the astronomically large catalogue of possible products2. Instead, biology assembles complex macromolecules from simple monomers with high precision templating processes—RNA transcription and protein translation—wherein sequence information is efficiently copied from a copolymer template into a newly produced daughter copolymer3. Mechanistically, this copying involves sequence-specific recognition interactions between template and daughter. Equally, however, these interactions must eventually be disrupted so that the daughter dissociates, allowing sequence-directed folding of the daughter4 and reuse of the template5,6,7,8.

Although biological templating relies upon enzyme-catalysed reactions, there has been wide interest in rationally engineering enzyme-free templating mechanisms to assemble specific molecules9. Many researchers seek to use templating to enhance reactions that have an otherwise low yield10,11. Others pursue templating as a pathway to synthesize new, complex, sequence-controlled polymers12,13 or even use biological polymers, such as DNA, as an easily synthesized template for directing combinatorial screenings to discover new materials and molecules with therapeutic potential14,15. More ambitiously, biologically relevant polymers are used as templates to understand the origin of life or engineer synthetic life6,16,17,18,19.

When designing enzyme-free templating systems, one of the biggest challenges, rather than efficient monomer recognition, is producing templates that act effectively as catalysts. To ensure a reliable copying system, the reaction of monomers must be slow in solution but occur rapidly and with high turnover in the presence of the catalyst template. To achieve this high turnover, the assembled products must be efficiently released from the template to ensure its reusability5,6. If the templates were not reusable, a new, highly specific template would need to be assembled for each product macromolecule, and creating the template itself would become a self-assembly challenge of similar magnitude to the assembly of the product, defeating the purpose of templating7,8.

The tendency of products to remain bound to catalysts is known as product inhibition and occurs for all types of catalyst20. If the product–catalyst complex’s lifetime is similar to that of the substrate–catalyst complex, the product will ‘compete’ with the monomers for binding. If product binding is irreversible, it will prevent catalysis altogether. Product inhibition is a particular challenge for templated assembly of dimers and longer molecules. After polymerization of the monomers, the now interconnected monomers typically bind the template more strongly due to cooperativity. Indeed, in simple models, the free-energy change of dissociation increases linearly with polymer length21. This cooperative effect results in stronger inhibition as the polymer length increases.
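The scaling behind this argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that every templated unit contributes the same dissociation free energy `eps_kT` (in units of kBT) and that the off-rate follows an Arrhenius-like form, so the lifetime of an n-unit product on the template grows exponentially with n. The function name and the value of 5 kBT are hypothetical and are not taken from this work.

```python
import math

def relative_lifetime(n_units: int, eps_kT: float) -> float:
    """Lifetime of an n-unit product on the template, relative to one unit.

    Assumes the dissociation free energy grows linearly with length,
    dG(n) = n * eps_kT (in units of kBT), and an off-rate k_off ~ exp(-dG),
    so the bound lifetime scales as exp(+dG).
    """
    return math.exp((n_units - 1) * eps_kT)

# Hypothetical per-unit binding strength of 5 kBT (illustrative only).
for n in (1, 2, 3, 4):
    print(n, f"{relative_lifetime(n, 5.0):.3g}")
```

Under these assumptions each additional unit multiplies the bound lifetime by the same factor, which is why inhibition worsens rapidly with product length unless the design actively weakens template binding.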

As a result of cooperative product inhibition, the construction of catalysts for enzyme-free templating has seen limited progress. Several dimer templating systems have been demonstrated, with varying degrees of product inhibition and catalytic efficiency17,19,22,23. These systems are, however, not generalizable to longer templates. They lack a mechanism for overcoming product inhibition while ensuring that the weakly binding, partially formed products do not prematurely detach from the template7. Alternatively, many groups have circumvented product inhibition by cycling external conditions to first favour growth on the template and then separation of product and template24,25,26,27,28. Others have created environments with a non-chemical supply of energy—temperature gradients18 or mechanical agitation29,30—that allow growth and separation. Although early life may have used templating driven by external conditions, such systems lack the versatility and flexibility of the autonomous, chemically driven templating processes, as observed in extant biology.

In this work, we implement, using solely DNA, sequence-specific, autonomous, chemically driven catalytic templating of DNA dimerization with low product inhibition. As is standard in DNA nanotechnology27,31,32,33,34, our products are held together by DNA base pairing, rather than covalent bonds19,26,28,29,30,35. Modular motifs based on DNA base pairing have been used to demonstrate remarkable functionality31,32,36, but common primitives are not well suited to information propagation by catalytic templating of molecular assembly, as defined in Supplementary Note 1, in the absence of non-chemical or non-autonomous driving of dissociation. To demonstrate catalytic templating, we employ a mechanism that diverts free energy from dimerization to weaken the bonds of the reacting monomers with the template. Most previous work that channels the free energy of dimerization in this way has had limited efficacy37 or involved templates that are not reusable catalysts38,39. Lewandowski et al. have demonstrated effective catalysis of assembly for products of up to four units in length but without the ability to selectively template the assembly of different sequences of monomers and hence propagate information40. By contrast, we demonstrate that a DNA-based template can perform sequence-specific catalysis of the formation of one out of nine different dimers in competition, with high turnover and low product inhibition. We then show that the mechanism can be embedded within a multistep network and used to template covalent bond formation. Moreover, the design is, in principle, generalizable to the templated copying of longer templates, further increasing the potential of DNA as a tool to efficiently explore the vast chemical space of sequence-defined polymers.

Results

Catalytic mechanism

The proposed system consists of two DNA reactions: toehold-mediated strand displacement (TMSD) and handhold-mediated strand displacement (HMSD). TMSD (Fig. 1a) is central to dynamic DNA nanotechnology33. The reaction involves three nucleic acid strands—an invader I, an incumbent C and a target R. Initially, R and C form a duplex separate from I. However, R is typically longer than C and presents a single-stranded ‘toehold’ overhang. I is complementary to the whole of R and can thus bind to the toehold and then displace C from its binding with R. The toeholds act as recognition domains for the displacement, with longer toeholds increasing the probability of successful displacement, resulting in an exponential increase in displacement rate with toehold length up to a plateau at around six to seven nucleotides at room temperature34.

Fig. 1: DNA strand displacement topologies, catalysis mechanism of the template and system design.

a, TMSD. Binding to the toehold (t) domain in the target DNA strand (R) mediates displacement of the incumbent (C) by the invader (I). After displacement, the toehold is cooperatively sequestered in duplex IR. b, HMSD. When I binds to the handhold (h) domain in C, the effective concentration of I increases in the vicinity of R, enhancing displacement. The reversible nature of handhold binding allows IR to detach. c, The DNA-based catalytic templating system. The DNA monomers (MxL and Ny) can dimerize after binding to a DNA template (Txy), exploiting first toehold exchange (a TMSD variant) then HMSD. Dimerization between the monomers weakens the interaction with Txy, allowing MxNy to detach and for Txy to undergo another dimerization cycle. d, The specific-sequence domains of Txy can trigger the dimerization of a specific MxL, Ny pair from pools of monomers in solution. The result is a product distribution enriched in MxNy dimers with t and h domains (red boxes) complementary to Txy, propagating the sequence information in the template. Any x,y combination is possible, with the dimerization domain a initially hidden by L, inhibiting any direct reaction in the absence of Txy. The edges of the MxL duplex have additional bases—‘clamps’—suppressing any leak reactions. The two mismatched base pairs in the a domain of MxL ensure that dimerization is thermodynamically favoured. The DNA strands are represented by domains (contiguous sequences of nucleotides considered to hybridize as a unit). The domains are labelled with a lowercase letter; a prime symbol indicates complementarity; for example, a′ binds to a.

HMSD (Fig. 1b) is a recently proposed motif that adds new functionality to strand displacement networks38,41. It operates similarly to TMSD, but the initial recognition domain (the ‘handhold’) is in C rather than R. This change in topology is ideal for templating; the binding of I to R can be templated by the recognition between C and I, just as recognition between DNA and RNA nucleotides templates the polymerization of RNA during transcription. Furthermore, the binding of I to R disrupts the binding of R to C; the displacement process rips apart the CR duplex, allowing the IR duplex to spontaneously detach38.

By combining TMSD and HMSD, we construct a system in which a template can recognize molecules in solution and catalyse their dimerization. In our proposed templating system, shown mechanistically in Fig. 1c and schematically in Fig. 1d, monomers are drawn from two pools of DNA strands, labelled M and N. Both M and N monomers possess a long ‘dimerization’ domain and a short ‘recognition’ domain. These recognition domains (toehold in M and handhold in N) are variable, and we use Mx and Ny with x,y = 1, 2, 3 to differentiate the monomers. The ‘dimerization’ domains are independent of x and y and complementary, so any MxNy dimer can form with roughly equal stability.

Direct dimerization is suppressed by a ‘lock’ strand L. Instead, sequence-specific dimerization is catalysed by a template Txy containing recognition domains complementary to those on both Mx and Ny, and a dimerization domain complementary to the dimerization domain on Mx. As a result, as illustrated in Fig. 1c, Txy can first displace L from MxL via TMSD, and then Ny can displace Txy from its duplex with Mx via HMSD, releasing the dimer MxNy and completing the catalysis. The HMSD process is facilitated by a short ‘secondary’ toehold on Mx revealed during the first TMSD step (which is, therefore, formally a ‘toehold exchange’34).

Both the sequence-specific catalysis and the resultant concatenation of monomers are consistent with the definition of autonomous, chemically driven catalytic templating of assembly provided in Supplementary Note 1. Justification for the design at the level of sequences (Fig. 1d), including the use of mismatched base pairs to provide thermodynamic impetus while retaining catalytic control42, is provided in Methods.

Optimization of system design for catalytic turnover

The programmability of DNA-based engineering allows a systematic optimization of the dimerization mechanism. We will consider variation of two key features: the lengths of the primary toehold (4–8 nt) and handhold (6–10 nt). We will use the notation ut/vh to refer to a system with a primary toehold of u nt and a handhold of v nt. Further discussion of system design principles is given in the Methods.

To compare the different systems with variable toehold and handhold lengths, we consider their initial turnover frequency (TOF), estimated as the initial rate of their dimerization reaction divided by the amount of Txy catalyst present in the reaction solution43. The TOF indicates how effective the template catalyst T is at binding the substrates (via TMSD), converting them into a product (via HMSD) and then releasing that product. Specifically, we consider the recovery of fluorescence of fluorophore-labelled M1 as quencher-labelled L is displaced by the template T13 (Fig. 2a). Sustained catalytic turnover is observed when an excess of N3 is added to the mixture, and we use the resultant reaction rate to calculate the TOF.

Fig. 2: Initial TOF is optimized for toeholds and handholds of moderate length.

a, The experimental setup. A small concentration of template (1 nM) is combined with a larger pool of M1L and N3 monomers (10 nM) for a range of primary toehold and handhold lengths. The catalytic turnover of M1L is reported by an increase in fluorescence signal. b, The example trajectories showing the concentration of reacted M1L over time, for a range of handhold lengths and a primary toehold of 6 nt. Increasing the handhold length above 9 nt results in a decrease of the M1L catalytic turnover due to increased product inhibition. These results illustrate how the overall reaction rate is a balance between the displacement and MxNy detachment from Txy. The leak reaction in the absence of template could not be detected. Its magnitude for a monomer concentration of 100 nM is shown for M1N3 in Fig. 3c, and for all monomer combinations in Supplementary Fig. 12. The concentration of reacted M1L is inferred from the fluorescence data as outlined in Supplementary Note 5. c, The initial rate of reaction per unit of template (TOF) for each primary toehold and handhold condition. An optimum is obtained for a system with a primary toehold of 6 nt and a handhold of 9 nt (6t/9h) (1.01 ± 0.03 h−1) followed by condition 6t/8h (0.622 ± 0.009 h−1).

As illustrated by the kinetics shown in Fig. 2b, 1 nM of T13 is capable of catalysing the transformation of high concentrations of M1L relative to the amount of template, indicating multiple rounds of catalytic turnover. The measured TOF reaches its maximum for 6t/9h at 1.01 ± 0.03 molecules turned over per hour per molecule of template. Here, the recognition domains are long enough to encourage binding to the template and high displacement rates but not so long that release of the product is slow, as happens for 6t/10h in Fig. 2b. It is notable from Fig. 2c that an increase in the primary toehold length tends to decrease the optimal handhold length and vice versa, indicating the importance of minimizing the cooperative binding of MN to T.
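The TOF definition used here can be sketched in a few lines. The example below assumes that the early part of a trajectory is approximately linear and takes the TOF as the least-squares slope of product concentration against time, divided by the template concentration. It is an illustration rather than the fitting procedure used in this work; the function name `initial_tof` and the data points are hypothetical.

```python
def initial_tof(times_h, product_nM, template_nM):
    """Estimate the initial TOF (h^-1) from early-time kinetics.

    The initial rate is taken as the least-squares slope of product
    concentration (nM) against time (h); dividing by the template
    concentration (nM) gives turnovers per template per hour.
    """
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_p = sum(product_nM) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_h, product_nM))
    den = sum((t - mean_t) ** 2 for t in times_h)
    slope_nM_per_h = num / den
    return slope_nM_per_h / template_nM

# Synthetic early-time points (not measured data): 1 nM of template
# producing about 1 nM of product per hour.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
product = [0.0, 0.25, 0.5, 0.75, 1.0]
print(initial_tof(times, product, template_nM=1.0))
```

Restricting the fit to early times matters because the rate falls as monomers are consumed and as free product accumulates and rebinds the template.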

Characterization of resistance to product inhibition

The initial TOF metric ignores the effect of competitive product inhibition from the rebinding of free-floating products to the template. To test the inhibition resistance of the different designs, we ran the same experiments but with variable concentrations of preannealed M1N3 already present in the reaction mix. The results for primary toeholds of length 5–7 nt and handholds of 8–10 nt are plotted in Fig. 3a. From the initial TOFs of the resultant kinetics, we estimated the product concentration at which the reaction’s initial rate is halved (IC50) as a metric to compare product inhibition44. The results demonstrate the expected inverse correlation between product inhibition resistance and domain lengths, with the estimated IC50 of 11 nM for 6t/8h, 4 nM for 6t/9h and 2 nM for 6t/10h (Fig. 3b). We, therefore, select 6t/8h as our optimal design; we prefer 6t/8h relative to 6t/9h specifically due to the emphasis on suppressing product inhibition in this work, and an IC50 similar to the monomer concentrations indicates relatively weak, non-cooperative binding to the template by the product. A screening for all ut/vh conditions and their extracted TOF values are shown in Supplementary Fig. 15 and Supplementary Table 18, respectively.
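As an illustration of how an IC50 can be extracted from such a titration, the sketch below assumes a simple hyperbolic form for competitive inhibition, TOF = TOF0 / (1 + [P]0 / IC50), which halves the initial rate exactly at [P]0 = IC50. This functional form, the function name `fit_ic50` and the synthetic data are assumptions made for the example; the fitting procedure actually used may differ.

```python
def fit_ic50(product0_nM, tof_per_h):
    """Fit TOF = TOF0 / (1 + [P]0 / IC50) and return (TOF0, IC50).

    The assumed model is linear in reciprocal form,
    1/TOF = 1/TOF0 + [P]0 / (TOF0 * IC50),
    so a straight-line fit of 1/TOF against [P]0 gives
    TOF0 = 1 / intercept and IC50 = intercept / slope.
    """
    xs = list(product0_nM)
    ys = [1.0 / v for v in tof_per_h]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return 1.0 / intercept, intercept / slope

# Synthetic, noise-free data generated from the assumed model
# (illustrative parameter values, not measurements).
true_tof0, true_ic50 = 0.6, 11.0
p0 = [0.0, 5.0, 10.0, 20.0, 40.0]
tof = [true_tof0 / (1.0 + p / true_ic50) for p in p0]
tof0_fit, ic50_fit = fit_ic50(p0, tof)
print(f"TOF0 = {tof0_fit:.3f} h^-1, IC50 = {ic50_fit:.1f} nM")
```

With real, noisy data a direct nonlinear fit would usually be preferable, since the reciprocal transform amplifies errors in the smallest TOF values.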

Fig. 3: The optimal design of the HMSD-based catalyst experiences only moderate competitive product inhibition and achieves high turnover.

a, The reacted monomer concentration [M1L] in a system with 10 nM N3, 10 nM M1L, 1 nM T and an initial non-fluorescent pool of products M1N3 at a range of concentrations \({[{M}_{1}{N}_{3}]}_{0}\). The condition 6t/8h, considered optimal, is highlighted in red. b, The initial TOF at different \({[{M}_{1}{N}_{3}]}_{0}\) conditions for 6t/8h, 6t/9h and 6t/10h, obtained from the kinetics depicted in a. The symbols are centred on the best fit of the TOF to a single trajectory, with the height indicating a 95% confidence interval on that fit. Although 6t/9h has a higher TOF in the absence of M1N3, 6t/8h combines rapid growth with a higher resistance to rate reduction at high \({[{M}_{1}{N}_{3}]}_{0}\). c, The turnover of M1L as inferred from fluorescence data, in experiments with 100 nM M1L, 100 nM N3 and variable concentrations of the template \({[{T}_{13}]}_{0}\) (6t/8h). A large proportion of M1L is observed to react, even for a template concentration \({[{T}_{13}]}_{0}\) 400 times lower than the monomer concentration, reaching turnovers above 20 products per template. The template-free leak reactions are essentially negligible (0.32 ± 0.06 M−1 s−1) even compared with the lowest template concentration regimes. d, The initial rates of reactions from c and an additional set of replica experiments (Supplementary Note 7.2), as a function of \({[{T}_{13}]}_{0}\). The symbols are centred on the best fit of the rate to a single trajectory, with the error bars giving a 95% confidence interval on that fit. The red line is the linear fit of the system TOF (3.6 ± 0.3 h−1) to the 11 independent kinetic measurements, the red dashed line is the 95% confidence interval of that fit and the black dashed line is the untemplated rate for monomers at 100 nM = 0.012 ± 0.002 nM h−1.

To test whether the inhibition experienced by 6t/8h is strictly competitive (arising from the competition between M1N3 and M1L for binding to the template), we increased the initial concentration of monomers and products by an order of magnitude and catalysed the reaction with 2.5, 5 or 10 nM of template. The three tested conditions show that the IC50 is indeed independent of \({[{T}_{xy}]}_{0}\) and remains comparable with the concentration of the monomer (~100 nM) (Supplementary Fig. 16 and Supplementary Table 19).

A catalyst must not only act rapidly but must also be able to complete several catalytic cycles. We report catalytic turnover of the 6t/8h design using a very large monomer:template concentration ratio in Fig. 3c. A single template can catalyse the assembly of at least 20–25 dimers. Moreover, the underlying leak rate in the absence of a template is very slow. The template-free control is consistent with a reaction rate of 0.32 ± 0.06 M−1 s−1 (Supplementary Table 16 and Supplementary Fig. 12), slower than previous measurements of toehold-free strand displacement34. The dimer production signal due to the presence of templates far exceeds this leak reaction, even when templates are only present at a ratio of 1:400 with the monomers. In addition, these experiments confirm that the reaction’s initial rate is proportional to the template concentration (fitted TOF for 100 nM monomers regime of 3.6 ± 0.3 h−1) (Fig. 3d).

Sequence-specific copying by templated dimerization

To demonstrate that the optimized 6t/8h design can perform information propagation by sequence-specific templating, we now consider mixtures with three distinct species per monomer type: Mx, Ny with x,y = 1, 2, 3. The monomers of the same type share the same dimerization domain but have different template recognition domain sequences and different fluorophore labels (Fig. 4a and Supplementary Note 2). Thus, there are nine possible MxNy of similar stability, associated with nine catalytic templates Txy. If each Txy templates the formation of only the dimer MxNy from a mixture of monomers, it will have successfully copied its sequence information. The individual characterization of the nine separate templated reactions is given in Supplementary Note 7.2. We evaluate sequence-specific templating both by real-time fluorescence monitoring and by post hoc gel electrophoresis.

Figures 4b and 5 illustrate the results of experiments in which a single Txy is mixed with all six monomer species. In Fig. 4b, we show polyacrylamide gel electrophoresis (PAGE) analysis of the system after 40 h of reaction; the gels demonstrate that information is accurately copied from template to product, with minor variability in speed and specificity probably resulting from differences in recognition sequences and the effects of different fluorescent labels. For each Txy, the expected MxNy band is visible and its constituent monomer bands have faded, with little evidence of unintended product formation.

Fig. 4: Information propagation by sequence-specific catalytic dimerization.

a, The design of monomers to demonstrate accurate information propagation in catalytic dimerization. We consider three types of Mx, differentiated by their primary toehold, and three types of Ny, differentiated by their handholds. Fluorescent labelling, using poly(T) linkers of variable length, allows the identification of all MxNy complexes through gel electrophoresis. Templates Txy are intended to selectively template the formation of MxNy from a pool of all six monomer species. b, The fluorescent scan of gel electrophoresis demonstrating sequence-specific templating. Products control: the signal produced by each possible MxNy dimer produced by annealing 75 nM of each monomer. Templated reactions: the reaction mixture in which a low concentration of a single Txy (5 nM) is combined with 100 nM of each MxL monomer and 75 nM of each Ny monomer. The observed products and signal from unreacted monomers in each well after 40 h of reaction are consistent with the intended MxNy production. The false colours include: blue, Alexa 488; green, Alexa 546; red, Alexa 647; cyan, FRET 488/546; yellow, FRET 546/648; purple, FRET 488/648.

Fig. 5: Real-time kinetics of dimer formation for sequence-specific dimerization.

Six monomer species and a specific template Txy were mixed and allowed to react. We plot inferred concentrations of eight of the nine possible dimers for all templates except T22 (M2N2 is not distinguishable from its constituent monomers via fluorescence alone). Note that the PAGE results in Fig. 4b show that M2N2 does indeed form as intended. Top: each dimer is represented by the same colour in each plot, indicated by the key. Middle: an estimate of the percentage of correct dimer formation after 24 h for each template.

The gel results are supported by the inferred concentrations from the real-time kinetics (Fig. 5). We inferred the concentrations of all products (except M2N2, owing to its low signal-to-noise ratio) by deconvoluting fluorescence signals as described in Supplementary Note 5.2. A quantitative analysis of the kinetic data suggests an accuracy of 80–95% for each template product, despite the challenge of inferring product concentrations through fluorescence alone (Supplementary Results 2 and Supplementary Tables 21–23). Assuming the production of M2N2 is similar to other species, as suggested by the PAGE data, we estimate that the mutual information between a uniform distribution of templates and the resultant products would be 2.45 bits, out of a maximum of 3.17 bits for perfect copying. The results for different reactant concentrations are collected in Supplementary Results 2.
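The mutual information figure can be made concrete with a short sketch. The maximum of 3.17 bits is log2(9), the information needed to select one of nine equally likely dimers. The code below computes the mutual information for any joint distribution of templates and products, and applies it to a simplified channel in which errors are spread evenly over the eight incorrect dimers. That even spread is an assumption made for illustration; the value of 2.45 bits quoted above comes from the measured product distributions, which this sketch does not reproduce. The function names are hypothetical.

```python
import math

def mutual_information_bits(joint):
    """Mutual information I(T;P) in bits from a joint probability table,
    where joint[t][p] is the probability of template t and product p."""
    p_t = [sum(row) for row in joint]
    p_p = [sum(col) for col in zip(*joint)]
    info = 0.0
    for t, row in enumerate(joint):
        for p, q in enumerate(row):
            if q > 0.0:  # zero-probability entries contribute nothing
                info += q * math.log2(q / (p_t[t] * p_p[p]))
    return info

def symmetric_copy_channel(n_species, accuracy):
    """Joint table for uniformly distributed templates, with the correct
    product formed with probability `accuracy` and errors spread evenly
    over the other products (an illustrative assumption)."""
    err = (1.0 - accuracy) / (n_species - 1)
    return [[(accuracy if t == p else err) / n_species
             for p in range(n_species)]
            for t in range(n_species)]

# Perfect copying of nine dimers gives log2(9), about 3.17 bits.
print(f"{mutual_information_bits(symmetric_copy_channel(9, 1.0)):.2f}")
# An assumed 90% accuracy with evenly spread errors.
print(f"{mutual_information_bits(symmetric_copy_channel(9, 0.9)):.2f}")
```

Because mutual information depends on how errors are distributed as well as on how often they occur, a measured distribution with errors concentrated on a few products yields a different value from the evenly spread case at the same accuracy.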

The dimerization products are not inert and can participate in downstream processes. The ‘associative toehold’45 present in MxNy allows it to couple to essentially any DNA strand displacement network, as shown when non-catalytically produced MxNy dimers were used to trigger reporters in ref. 38. Here, we show that dimers formed by templating can participate in a subsequent templating reaction, leading to trimer assembly. To build such a system (Fig. 6a), we consider monomers Ax, By and Cz, with x,y,z = 1, 2 giving the recognition sequence identity. Ax and Cz have a similar form to Mx and Ny presented previously, and By molecules possess two dimerization domains, allowing the formation of a trimer. Ax and By are initially bound to lock strands, inhibiting template-free association. We show in Fig. 6b that two pairs of templates can catalyse the assembly of two distinct species of trimer, ABC111 and ABC222, from solutions of (A1LA, B1LB, C1) and (A2LA, B2LB, C2), respectively. Although the system was not optimized to avoid cross-reactivity, we also show that the products ABC111 and ABC222 are produced from a pool of all monomers only when the appropriate templates are added, albeit with some yield of intermediates and unintended products.

Fig. 6: Extension of templating to trimerization and covalent bond formation.

a, A schematic of a trimerization process Ax + By + Cz → ABCxyz templated by two dimerization catalysts \({T}_{A{B}_{xy}}\) and \({T}_{B{C}_{yz}}\). Each stage is analogous to the catalytic dimerization cycles of Fig. 1. For simplicity, in this diagram, we have assumed template 1 first joins Ax and By before template 2 joins Cz to AxBy. b, A non-denaturing PAGE analysis of reaction products after mixing either A1, B1 and C1; A2, B2 and C2; or both, with various combinations of templates. The formation of intended products is minimal in the absence of the relevant templates but visible when the templates are present. The bands, including intermediates and unintended products, can be identified by comparing the fluorescence in three channels and migration speed to controls (Supplementary Results 5 and Supplementary Figs. 33–35). c, A schematic illustrating the coupling of HMSD-based dimerization to the formation of a covalently linked dimer. The moieties for a click reaction (Cu-catalysed alkyne azide cycloaddition) are conjugated to monomers M1 and N1. d, Denaturing PAGE, which disrupts the duplex formed by HMSD, is used to detect which systems have formed covalent bonds. The monomers are converted to dimers after hybridization of M1 and N1 (‘M-alkyl + N-aza’) but not if binding is inhibited by the presence of lock strands (‘blocked M-alkyl + N-aza’). The action of a template T11 (at a ratio of 1:10 with M1L) during 24 h restores dimerization. Top: the labels indicate the initial concentrations of M1L and N1.

Biological templating typically uses non-covalent recognition interactions to template the formation of covalent bonds. In our default system, non-covalent interactions template the formation of a non-covalent bond held together by DNA base pairs. However, by adding functional groups that can undergo copper-catalysed alkyne azide cycloaddition to the end of monomers M1 and N1 (Fig. 6c), we are able to demonstrate the catalytic dimerization of a covalently linked product. Denaturing PAGE in Fig. 6d shows the formation of covalently linked product is negligible when the monomers are free in solution. However, covalent bond formation occurs once M1 and N1 have undergone dimerization via the HMSD method introduced here, even if template T11 is present at a relatively low concentration.

Discussion

Using our DNA-based dimerization system, which channels the free energy of dimerization into disrupting binding to a template, we demonstrated: (1) weak competitive product inhibition, (2) a catalytic reaction around 1,500 times faster than its leak rate under the conditions considered (5 nM template, 100 nM monomers), (3) a turnover of at least 20–25 reactions per template, (4) information propagation by highly specific molecular templating, with an accuracy of around 90% when selecting a single product from nine alternatives, (5) the incorporation of a templating reaction into a larger network and (6) templated covalent bond formation.
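The rate comparison in point (2) can be checked from values reported in the Results: a leak rate constant of 0.32 M−1 s−1, a fitted TOF of 3.6 h−1 at 100 nM monomers, and 5 nM template. The sketch below assumes that the untemplated reaction is second order in the two monomers and that the templated rate is the TOF multiplied by the template concentration. The function and constant names are hypothetical.

```python
NM = 1e-9                    # mol/L per nM
SECONDS_PER_HOUR = 3600.0

def leak_rate_nM_per_h(k_leak_per_M_s, m_nM, n_nM):
    """Untemplated bimolecular rate k * [M] * [N], converted to nM/h."""
    rate_M_per_s = k_leak_per_M_s * (m_nM * NM) * (n_nM * NM)
    return rate_M_per_s * SECONDS_PER_HOUR / NM

def templated_rate_nM_per_h(tof_per_h, template_nM):
    """Templated initial rate, assumed linear in template concentration."""
    return tof_per_h * template_nM

leak = leak_rate_nM_per_h(0.32, 100.0, 100.0)   # about 0.012 nM/h
templated = templated_rate_nM_per_h(3.6, 5.0)   # 18 nM/h
print(f"leak = {leak:.4f} nM/h, templated = {templated:.1f} nM/h, "
      f"ratio = {templated / leak:.0f}")
```

The leak value agrees with the untemplated rate of 0.012 ± 0.002 nM h−1 quoted for 100 nM monomers, and the ratio comes out at roughly 1,500, in line with the figure stated above.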

Comparing the performance of our DNA-based system with other synthetic, sequence-specific templating mechanisms is not straightforward, since the capabilities of those systems are often couched in the language of autocatalysis and self-replication. However, early biomolecular replication experiments16,22,23,36 typically show catalytic rates only a few times faster than spontaneous reactions, substantial product inhibition at low product concentrations or limited turnover per catalyst. More recent work17,19,37 has demonstrated improvements along various axes of this performance space. However, unlike previous designs, our HMSD-templating system is, in principle, extendable to longer templates. Crucially, the binding of Mx to the template is stable (Supplementary Results 4 and Supplementary Fig. 32) until that binding is disrupted by dimerization. Our template for dimerization can therefore be extended by adding more sites that look like the binding site of Mx, and copies could grow while remaining template-attached until they reach a final truncated binding site analogous to the binding site for monomer Ny in this work (Supplementary Note 8 and Supplementary Fig. 19). Recent theoretical work has demonstrated that a system of this kind can produce copies of longer templates7.

A major challenge when engineering catalytically controlled systems is that the reaction thermodynamics needs to be finely balanced46, and it is difficult to engineer large thermodynamic driving forces without triggering unwanted leak reactions42. The relatively weak thermodynamic driving used here (the elimination of two mismatched base pairs) may limit yield in, for example, Fig. 5. In addition, it is challenging to ensure that the products are sufficiently metastable that slow product interconversion does not occur; we investigate this behaviour for the product M1N3 in Supplementary Results 3. Our work emphasizes the importance of resolving these challenges when building templating systems.

The fundamental advantage of catalytic molecular templating is that a set of monomers can be combined into a combinatorially large number of products, with only a small amount of template required for a high yield. With dimerization, a small concentration of template can be added to a single master mixture of N different ingredients to produce a large amount of any one of \(O({N}^{2})\) different products. For a template of length L, \(O({N}^{L})\) products are possible. Catalytic templating therefore offers the potential to reduce the costs of assembling a diverse set of structures47 and facilitate combinatorial screening15,48,49,50, as each new product can use the same pool of monomers and requires only the introduction of a small amount of novel template.
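The combinatorial scaling can be illustrated by enumerating the dimers of this work and counting the products available at longer template lengths. The sketch assumes that each template position selects independently from the same number of alternative monomers; `possible_products` and its parameter names are hypothetical.

```python
from itertools import product

def possible_products(monomers_per_site, template_length):
    """Number of distinct sequences when each of `template_length`
    positions is chosen independently from `monomers_per_site` monomers."""
    return monomers_per_site ** template_length

# The dimer system studied here: three Mx and three Ny species.
m_pool = ["M1", "M2", "M3"]
n_pool = ["N1", "N2", "N3"]
dimers = [m + n for m, n in product(m_pool, n_pool)]
print(len(dimers), dimers)

# Growth in the number of products for longer templates,
# keeping three alternatives per position.
for length in (2, 3, 4):
    print(length, possible_products(3, length))
```

The monomer pool grows only linearly with the number of alternatives per position, whereas the number of addressable products grows as a power of the template length, which is the economy that catalytic templating exploits.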

The most promising applications of our work exploit these properties. First, the motif introduced here can be immediately coupled to conventional DNA-based strand displacement circuits, which have applications in diagnostics, molecular biophysics and unconventional computing31,32,33. In such circuits, our catalytic templating mechanism provides a simple way to amplify \(O({N}^{2})\) different signals using only \(O(N)\) signal-processing components. Generalizing to longer templates, as proposed in Supplementary Note 8, would allow exponentially more signals at a linear cost. A simpler (albeit less powerful) alternative would be to generalize the mechanism in Fig. 6b to allow a series of dimerization catalysts to select specific products from a large ensemble.

Second, the DNA strands whose assembly we have templated are functionalized with fluorophores, resulting in assembly-dependent fluorescence. Instead, they could be functionalized with other chemical groups that could perform some downstream task, such as selective binding, catalysis or assembly into large structures such as gels51 with properties determined by the templated sequence. A long-term vision would be to assemble labelled oligomers that could then fold into a minimal synthetic analogue of a protein.

One can also use DNA to template covalent interactions directly between the functional groups themselves, with a view to generating combinatorial libraries of biologically, medically or chemically functional small molecules5,10,11,14,15. Here, we show the ability to template click reactions with high catalytic turnover, typically a major challenge in DNA-based templating of covalent chemistry37. The next steps include incorporating other chemistries and going beyond dimerization to the formation of longer products.

An alternative to using DNA to template the assembly of organic molecules would be to use the organic molecules as templates themselves. Our work illustrates the principle that channelling of free energy into the disruption of template bonds is a viable mechanism for information propagation by catalytic assembly of molecules. An important question is whether this mechanism can be translated into other chemistries. Recent work has promisingly shown that a similar principle can be used to template tetrameric organic molecules, albeit without sequence specificity40.

Methods

System design principles

Since the product MxNy must detach from the template Txy, the overall reaction simply replaces a duplex in the MxL complex with an identical one in MxNy, with no extra base pairs formed in the process. As stated, the process would not have any thermodynamic drive pushing it towards dimerization. Therefore, to make the MxNy products more stable than the reactants, without substantially increasing the rate of template-free dimerization, we incorporate two mismatched base pairs in the MxL duplexes42. Each of these sequence mismatches destabilizes the MxL duplex by around 9 kBT relative to the MxNy product21. One of the mismatches is eliminated during the TMSD reaction and the second during HMSD (Fig. 1d).
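As a rough illustration of the size of this thermodynamic drive, the sketch below converts the two mismatch penalties into a Boltzmann factor. It assumes that the two penalties add independently and that each is exactly 9 kBT, which is only the approximate figure quoted above; the constant and function names are illustrative.

```python
import math

MISMATCH_PENALTY_KBT = 9.0   # approximate destabilization per mismatch (from the text)
N_MISMATCHES = 2             # one removed during TMSD, one during HMSD


def total_drive_kbt(n_mismatches: int, penalty_kbt: float) -> float:
    """Total free-energy gain, assuming the mismatch penalties are additive."""
    return n_mismatches * penalty_kbt


def boltzmann_factor(delta_g_kbt: float) -> float:
    """Equilibrium bias exp(dG / kBT) for a free-energy gain given in units of kBT."""
    return math.exp(delta_g_kbt)


drive = total_drive_kbt(N_MISMATCHES, MISMATCH_PENALTY_KBT)
bias = boltzmann_factor(drive)
print(f"drive = {drive:.0f} kBT, equilibrium bias ~ {bias:.2e}")
```

Under these assumptions the exchange MxL + Ny ⇌ MxNy + L is biased towards the dimer by a factor of order 10^7, whereas a design without mismatches would have a bias of order 1. This is an order-of-magnitude estimate only, not a measured equilibrium constant.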

Based on a previous study38, we introduced a secondary toehold of 2 nt, as it ensures fast HMSD while triggering negligible TMSD on its own. In addition, the final design of our system uses a dimerization domain as illustrated in Fig. 1d. This design has ‘clamp’ base pairs in MxL, not present in MxNy, to reduce the spontaneous displacement of L from MxL by Ny. We collect the results for an alternative design without the clamps in Supplementary Results 1 (Supplementary Figs. 20 and 21); this alternative design was slightly faster but suffered from a high dimerization rate in the absence of Txy.

To identify the possible MxNy dimers when performing sequence-specific dimerization, we label both Mx and Ny with fluorophores, as shown in Fig. 4a. As this labelling alone cannot unmistakably distinguish all nine complexes, we also give each monomer different lengths for the poly(T) linkers connecting the fluorophores to the monomers. The different linkers allow dimers to be identified during gel electrophoresis by a combination of their migration speed and strength of Förster resonance energy transfer (FRET).

DNA sequence design

DNA sequences that satisfy the principles outlined above and minimize undesired interactions during HMSD were designed with bespoke scripts using the NUPACK server (http://www.nupack.org)52,53. All strands were purchased from Integrated DNA Technologies with high-performance liquid chromatography purification and normalized at 100 μM in LabReady buffer. All sequences are listed by function in Supplementary Note 2 and Supplementary Tables 1–5.

Complex preparation

MxL duplex preparation for non-covalent dimerization

The MxL duplexes were annealed at a concentration of 2 μM of Mx with a 10% excess of its corresponding L to ensure the sequestration of every Mx. Annealing was performed in 100 μl of experimental buffer (Tris/acetate/EDTA 1× and 1 M NaCl, pH 8.3) by heating to 95 °C for 4 min and cooling to 20 °C at a rate of 1 °C min−1.
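The arithmetic behind this preparation can be sketched as follows. The example assumes that both strands are pipetted directly from the 100 μM stocks described under 'DNA sequence design' and that the remaining volume is made up with buffer components; the function names are hypothetical.

```python
def stock_volume_ul(final_conc_um: float, final_volume_ul: float,
                    stock_conc_um: float) -> float:
    """Volume of stock needed to reach final_conc_um in final_volume_ul (C1*V1 = C2*V2)."""
    return final_conc_um * final_volume_ul / stock_conc_um


def annealing_mix(monomer_um: float, lock_excess_fraction: float,
                  total_ul: float, stock_um: float = 100.0) -> dict:
    """Pipetting volumes for a monomer/lock duplex with a fractional excess of lock."""
    lock_um = monomer_um * (1.0 + lock_excess_fraction)
    v_monomer = stock_volume_ul(monomer_um, total_ul, stock_um)
    v_lock = stock_volume_ul(lock_um, total_ul, stock_um)
    return {
        "monomer_ul": v_monomer,
        "lock_ul": v_lock,
        "remainder_ul": total_ul - v_monomer - v_lock,  # buffer components
    }


def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Duration of a linear cooling ramp."""
    return (start_c - end_c) / rate_c_per_min


mix = annealing_mix(monomer_um=2.0, lock_excess_fraction=0.10, total_ul=100.0)
print(mix)                                   # about 2.0 ul of Mx and 2.2 ul of L
print(4 + ramp_minutes(95.0, 20.0, 1.0))     # hold plus ramp, in minutes
```

The same helper applies to the other preparations in this section by changing the concentration and the excess (for example, a twofold excess corresponds to `lock_excess_fraction=1.0`).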

Complex preparation for trimerization

Trimerization monomers AxL and BxL were annealed at a concentration of 200 nM of each monomer with a twofold excess of its corresponding L. The total volume of 100 μl of the strands in experimental buffer was subjected to the same annealing program as that used for MxL in dimerization.

Complex preparation for covalent dimerization

The monomer M1-alkL was annealed at a concentration of 2.5 μM with a 50% excess of its corresponding L. The complexes were annealed as described for non-covalent dimerization. The buffer used for annealing was 100 mM NaH2PO4 buffer containing 200 mM NaCl.

Bulk fluorescence spectroscopy

Bulk fluorescence assays were carried out in a CLARIOstar Microplate reader (BMG LABTECH) using flat μClear bottom 96-well plates (Greiner) and reading from the bottom. The experimental protocols were based on those previously described in ref. 38. Each experiment consisted of the system’s kinetics and a set of complementary measurements. These complementary measurements quantified a fluorescence baseline and estimated the concentration of each species in the system from the fluorescence signal measured after sequentially triggering the reaction of all the species. More detailed protocols for the different experiments performed in this work are provided in Supplementary Note 4 (Supplementary Tables 9–14 and Supplementary Figs. 5–8). The unprocessed fluorescence signals are available in ref. 54 (see ‘Data availability’ and ‘Code availability’ sections).

Tests for optimizing the dimerization mechanism

The main tests on the kinetics of the dimerization mechanism were performed using the monomers M1 and N3 and the template T13. Unless stated otherwise, the kinetic results were obtained by tracking the fluorescence of the labelled strand M1 at 25 °C in a 200 μl volume. Typically, this fluorescence would increase due to the displacement of the quencher-bearing lock strand L in the presence of T13 and N3. Although we do not directly monitor M1N3 during these experiments, Fig. 2b demonstrates that N3 is required, as well as T13, for sustained M1L turnover, and we provide results for the kinetics of individual TMSD and HMSD steps in Supplementary Note 7.1. Product formation is monitored directly in subsequent experiments assessing catalytic specificity (Figs. 4 and 5).

The kinetics were recorded after injecting 50 μl of the reaction triggering species into 150 μl of experimental buffer containing the rest of the reactant species (pump speed 430 μl s−1). The final mixture was shaken for 3 s (double orbital, at 400 rpm). The injected and reacting volumes were preheated to the experiment temperature. Simultaneously, we recorded the fluorescence of a positive control (M1N3) and a negative control (experimental buffer). These controls were used to correct the measured fluorescence for temperature and volume changes. The samples were contained in Eppendorf LoBind tubes, and the plate reader’s injector system was passivated by incubating with bovine serum albumin at 5% weight per volume for 30 min to maximize the concentration reproducibility during the assays55.

Strand M1 was labelled with Alexa Fluor 488 (excitation: 488/14 nm, emission: 535/30 nm) and strand L with FQ IowaBlack quencher. Every tested system was assayed at least three times, including modifications of either T13 or N3 concentrations to extract further information from the reaction kinetics in different regimes. The fluorescence signal was averaged for 100 flashes in a spiral area scan per data point. For experiments that lasted under 1 h, the signal was averaged for only 20 flashes per data point. The data from further experiments conducted during optimization, including assays on individual reaction substeps, are reported in Supplementary Note 7 (Supplementary Tables 15–20 and Supplementary Figs. 10–18).

Once the basic design had been optimized through tests of variants of M1, N3 and T13, we also measured the dimerization kinetics of other combinations of Mx, Ny and Txy for this optimized design. The results of these experiments are reported in Supplementary Note 7 (Supplementary Tables 15, 16 and 20 and Supplementary Figs. 11, 12, 17 and 18).

The additional monomer species used during these assays were labelled in the following way: M2L: Alexa Fluor 546 (excitation: 540/20 nm, emission: 590/30 nm) and Black Hole quencher-2; M3L: Alexa Fluor 647 (excitation: 625/30 nm, emission: 680/30 nm) and RQ IowaBlack quencher; N1: Alexa Fluor 647; N2: Alexa Fluor 546. To monitor as many species as possible, the experiments recorded these fluorescence signals and FRET resulting from the combinations of Alexa Fluor 488/46 (excitation: 488/14 nm, emission: 590/30 nm), Alexa Fluor 488/47 (excitation: 488/14 nm, emission: 670/30 nm) and Alexa Fluor 546/47 (excitation: 540/20 nm, emission: 680/30 nm).

Kinetics of templated sequence-specific copying

The templating of specific products required more sophisticated bulk fluorescence assays. The experiment reported in Fig. 5 consisted of ten wells loaded with an intended concentration of 100 nM of each monomer M (M1L, M2L and M3L). Nine of these wells also contained 5 nM of one of the nine tested templates, with the tenth containing no template to track the mechanism’s leak reaction. The reaction in each of these wells was triggered with 50 μl of a solution of the N monomers (N1, N2 and N3), each of them at an intended concentration of 75 nM. The experiment also included another set of ten wells. Nine of them contained positive controls for each dimer, formed by annealing 75 nM of each combination of M and N strands. The last well was a buffer-only blank control. The raw fluorescence data for this experiment are presented in Supplementary Fig. 8.

The sequence-specific copying experiments were initially performed using different concentrations of the monomers and templates (Supplementary Results 2; Supplementary Figs. 22–29). The results are qualitatively similar, though an apparent slow conversion of M1N3 into M1N1 and M1N2 was observed when MxL was not initially present in excess of Ny. It appears that empty templates can also catalyse this interconversion reaction on slower timescales than the templating of assembly, reducing the apparent accuracy. This mechanism is further discussed in Supplementary Results 3 (Supplementary Figs. 30 and 31).

PAGE

Sequence-specific dimerization

After recording the kinetics of sequence-specific copying by templated dimerization, aliquots from the nine template conditions and the no-template control were loaded into a polyacrylamide gel. A second gel was produced as a reference, using the dimers assembled in the kinetics positive control and a freshly made mixture of all unreacted monomers (MxL at 100 nM and Ny at 75 nM) in the experimental buffer. The gel used was a precast Novex 10% 37.5:1 acrylamide:Bis-acrylamide gel in Tris/borate/EDTA buffer (TBE) 1× (Invitrogen). The samples were mixed with native gel loading dye solution 10× (Invitrogen), and 15 μl of the mixture was loaded in each well. The gels were run in an X-Cell SureLock electrophoresis chamber, using TBE 1× + 50 mM NaCl as running buffer and a program of 10 V cm−1 for 30 min and 15 V cm−1 for 90 min. The tank was kept in an ice bath during the electrophoresis to avoid sample heating. To avoid band distortions due to the difference in ionic strength between buffer (50 mM NaCl) and samples (900 mM NaCl), the loaded samples were left in the wells for at least 30 min before starting the electrophoresis56.

The gels were imaged with a Typhoon gel scanner (Amersham). The false colours reported in Fig. 4b correspond to the following fluorescence measurements: blue (excitation: 488 nm, emission: 525/20 nm), green (excitation: 532 nm, emission: 570/20 nm), red (excitation: 635 nm, emission: 670/30 nm), cyan (excitation: 488 nm, emission: 570/20 nm), yellow (excitation: 532 nm, emission: 670/30 nm) and purple (excitation: 488 nm, emission: 670/30 nm). The scanner gain was automatically optimized for each scan before imaging. The results reported in Fig. 4b and Supplementary Figs. 23 and 26 are an overlay of the six scans, represented individually in Supplementary Figs. 24, 27 and 29. The scans were merged using ImageJ after redefining the brightness adjustment range of the image to span 10–100%, excluding the lower 0–10% range. This adjustment removed the background produced by the loading dye and the gel matrix in some channels, without removing any monomer bands or any correct or incorrect product bands. The raw data for each scan are contained in ref. 54 (see ‘Data availability’ and ‘Code availability’ sections).
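The effect of restricting the brightness range can be illustrated with a minimal sketch that clips pixel intensities to a fractional window and rescales them linearly. It is not ImageJ's implementation; the 16-bit maximum of 65,535 and the function name are assumptions made for the example.

```python
def rescale_display(pixels, low_frac: float, high_frac: float, max_value: float):
    """Clip intensities to [low_frac, high_frac] of max_value and rescale to [0, 1]."""
    low = low_frac * max_value
    high = high_frac * max_value
    if high <= low:
        raise ValueError("high_frac must be greater than low_frac")
    scaled = []
    for p in pixels:
        clipped = min(max(p, low), high)          # values outside the window saturate
        scaled.append((clipped - low) / (high - low))
    return scaled


# Hypothetical intensities: background, window edge, mid-window band, saturated band
example = rescale_display([0, 6553.5, 36044.25, 65535], 0.10, 1.00, 65535)
print(example)
```

With a 10–100% window, everything in the lowest 10% of the range is displayed as black, which is how a weak uniform background can be suppressed while brighter bands are retained.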

Trimer formation

The experiment reported in Fig. 6b consisted of several aliquots of 200 μl with 10 nM of each monomer (either A1L + B1L + C1, A2L + B2L + C2 or both sets). Where specified, the reaction volume also contained 2 nM of up to two different templates. The samples were incubated at 30 °C for 48 h.

Electrophoresis was performed in similar conditions to the sequence-specific dimerization. The main differences were the volume loaded (5 μl + 2 μl of native gel purple loading dye solution 6×) and the electrophoresis program (11.25 V cm−1 for 90 min in TBE 1×). No ice bath was used due to the lower voltage and ionic strength of the running buffer.

The results reported in Fig. 6b are an overlay of three fluorescence scans (blue: 488 nm laser excitation, 525BP20 emission filter; green: 532 nm laser excitation, 570BP20 emission filter; red: 635 nm laser excitation, 670BP20 emission filter). Due to the lower quantity of DNA loaded in the gel, the brightness range had to be adjusted in ImageJ to span 0–20% so that the bands could be clearly observed. The individual channels in pseudo-colour along with their composite are shown in Supplementary Figs. 33 and 34, and the raw data for each scan are contained in ref. 54 (see ‘Data availability’ and ‘Code availability’ sections).

The identity of the bands is inferred by comparing them to controls (Supplementary Fig. 35). For the controls, the relevant strands were directly annealed at 100 μl volume in experimental buffer (1× Tris/acetate/EDTA, 1 M NaCl, pH 8) with 200–400 nM strand concentrations by heating the solution at 95 °C for 5 min and then cooling it to 20 °C at a rate of 0.5 °C min−1.

Covalent dimerization

The experiment reported in Fig. 6d consists of annealed M1-alkL, mixed with N1-aza. Another aliquot was mixed in the absence of L strand and another in the presence of T11. The final concentrations used were: for [M1-alkL], 2,000 nM; [L], 1,000 nM; [N1-aza], 3,000 nM; and [T11], 200 nM. The three DNA aliquots were each diluted 2×, 4× and 8× ([M1-alk]: 1,000, 500 and 250 nM) to a total volume of 45 μl in the 100 mM NaH2PO4 buffer. A total of 1.5 μl of a Cu-premix was added to each of the nine 45 μl DNA aliquots. The Cu-premix was prepared by mixing 100 mM aqueous solutions of CuSO4 and tris(3-hydroxypropyltriazolylmethyl)amine (THPTA), with a proportion 1:2 and preincubating for 15 min. Afterwards, 1 μl of a 100 mM solution of ascorbic acid was added to each DNA aliquot, and they were incubated at 25 °C for 24 h.
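For orientation, the approximate reagent concentrations in each reaction can be estimated from the stated volumes. The sketch assumes that the 1:2 proportion is a volume ratio of CuSO4 to THPTA and that volumes are additive; neither assumption is confirmed by the text, and the variable names are illustrative.

```python
def final_conc_mm(stock_mm: float, added_ul: float, total_ul: float) -> float:
    """Concentration after diluting added_ul of stock into total_ul."""
    return stock_mm * added_ul / total_ul


DNA_UL, PREMIX_UL, ASCORBATE_UL = 45.0, 1.5, 1.0
total_ul = DNA_UL + PREMIX_UL + ASCORBATE_UL      # assumes additive volumes

# Assumed 1:2 (v/v) mix of 100 mM CuSO4 and 100 mM THPTA
cu_in_premix_mm = 100.0 * 1.0 / 3.0
thpta_in_premix_mm = 100.0 * 2.0 / 3.0

cu_final = final_conc_mm(cu_in_premix_mm, PREMIX_UL, total_ul)
thpta_final = final_conc_mm(thpta_in_premix_mm, PREMIX_UL, total_ul)
ascorbate_final = final_conc_mm(100.0, ASCORBATE_UL, total_ul)

print(f"Cu: {cu_final:.2f} mM, THPTA: {thpta_final:.2f} mM, "
      f"ascorbate: {ascorbate_final:.2f} mM in {total_ul} ul")
```

Under these assumptions each reaction contains roughly 1 mM copper and 2 mM each of ligand and reductant, a large excess over the nanomolar DNA concentrations.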

After incubation, 3 μl of each sample was mixed with 3 μl of 1–2× Gel Loading Buffer II (Invitrogen), and they were incubated at 95 °C for 5 min. Denaturing PAGE was performed by loading 4 μl of the nine samples in a precast Novex 10% 37.5:1 acrylamide:Bis-acrylamide gel in 10% tert-butyl (Invitrogen) and running them at 22.5 V cm−1 for 40 min in 1× TBE buffer at a temperature of 65 °C. The 10% tert-butyl polyacrylamide gel was prerun at 6.25 V cm−1 for 30 min. After electrophoresis the gel was stained for 30 min in TBE 1× buffer containing 1× SYBR Gold. The gel was imaged with the Typhoon gel scanner (excitation: 488 nm, emission: 525/20 nm). The raw data for each scan are contained in ref. 54 (see ‘Data availability’ and ‘Code availability’ sections).

Calibration of fluorescent signals for real-time kinetics

Fluorescence calibrations were made for all the fluorescent species used in this work. The calibrations aimed to estimate the units of fluorescence produced per nanomolar of each complex containing a fluorophore-labelled species to quantify its concentration during the experiments. The calibration curves ranged from 15 to 150 nM, in 200 μl volumes, from stock solutions normalized at 100 μM. Additional calibrations tested the variation of fluorescence of M1 when bound to the template. The calibration protocol and results are given in Supplementary Note 3 (Supplementary Figs. 1–4 and Supplementary Tables 7 and 8).
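A calibration of this kind amounts to fitting fluorescence against known concentration and inverting the fit. The sketch below uses an ordinary least-squares line and assumes a linear response over the calibrated range; the data points are synthetic and purely illustrative, and the actual protocol is the one given in Supplementary Note 3.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def to_concentration_nm(fluorescence, slope, intercept):
    """Invert the calibration line to recover a concentration in nM."""
    return (fluorescence - intercept) / slope


# Synthetic calibration points spanning the 15-150 nM range (hypothetical values)
conc_nm = [15.0, 50.0, 100.0, 150.0]
signal = [800.0, 2200.0, 4200.0, 6200.0]

slope, intercept = fit_line(conc_nm, signal)
print(slope, intercept)                              # fluorescence units per nM, offset
print(to_concentration_nm(3200.0, slope, intercept)) # concentration for a new reading
```

The slope is the quantity described above (fluorescence units per nanomolar), and the intercept absorbs any concentration-independent background.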

Data processing

The data from bulk fluorescence experiments were corrected with each experiment’s positive and negative controls and transformed from fluorescence units to concentrations of the relevant species using fluorescence calibrations. The transformation procedures are described in detail in Supplementary Note 5 (Supplementary Fig. 9), and the scripts are available at ref. 54 (see ‘Data availability’ and ‘Code availability’ sections).
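One common way to apply such controls is a two-point normalization at each time point, sketched below. This is an assumed generic scheme, not the procedure of Supplementary Note 5: it treats the positive control as a fully reacted sample of known concentration and the negative control as the background, and all names and values are hypothetical.

```python
def correct_with_controls(signal, positive, negative, positive_conc_nm: float):
    """Convert raw fluorescence to concentration using per-time-point controls.

    Each reading is placed on a scale where the negative control is 0 nM and
    the positive control is positive_conc_nm, which cancels drifts (for example
    from temperature or volume changes) that affect sample and controls alike.
    """
    corrected = []
    for f, p, n in zip(signal, positive, negative):
        span = p - n
        if span <= 0:
            raise ValueError("positive control must exceed negative control")
        corrected.append(positive_conc_nm * (f - n) / span)
    return corrected


# Hypothetical traces: three time points, 75 nM positive control
trace = correct_with_controls(
    signal=[100.0, 300.0, 500.0],
    positive=[900.0, 900.0, 900.0],
    negative=[100.0, 100.0, 100.0],
    positive_conc_nm=75.0,
)
print(trace)
```

Because the controls are recorded simultaneously with the sample, any multiplicative or additive drift shared by all wells is removed before the concentrations are interpreted.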

Data fitting

Simple models of reaction kinetics (Supplementary Note 6) were used to fit reaction rate constants to the processed data to give more information about system performance. In particular, we estimated the following quantities: the rate constant for binding of MxL to Txy (or the displacement of L from Mx by Txy), the rate constant for spontaneous leak reactions between MxL and Ny, the rate constant of the HMSD substep for M1N3 and the initial reaction rate for catalytic dimerization (TOF) for all templates used.
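To illustrate what fitting a rate constant involves, the sketch below fits a single second-order rate constant to a product time course for an irreversible reaction A + B → P. It is a generic stand-in written in Python, not one of the models in Supplementary Note 6 and not the MATLAB routine used in this work; the model, the golden-section search and the synthetic data are all assumptions made for the example.

```python
import math


def product_conc(t, k, a0, b0):
    """Analytical [P](t) for irreversible A + B -> P (concentrations in nM, k in nM^-1 s^-1)."""
    lo, hi = min(a0, b0), max(a0, b0)   # ordering keeps the exponent non-positive
    if math.isclose(lo, hi):
        return lo * lo * k * t / (1.0 + lo * k * t)
    e = math.exp((lo - hi) * k * t)
    return lo * hi * (e - 1.0) / (lo * e - hi)


def sse(log10_k, times, observed, a0, b0):
    """Sum of squared residuals between model and data for a trial rate constant."""
    k = 10.0 ** log10_k
    return sum((product_conc(t, k, a0, b0) - y) ** 2 for t, y in zip(times, observed))


def fit_rate_constant(times, observed, a0, b0, lo=-8.0, hi=0.0, iters=80):
    """Golden-section search over log10(k); the bounds must bracket the true value."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if sse(c, times, observed, a0, b0) < sse(d, times, observed, a0, b0):
            hi = d
        else:
            lo = c
    return 10.0 ** ((lo + hi) / 2.0)


# Synthetic, noise-free data generated with a hypothetical k of 1e-3 nM^-1 s^-1
times_s = [60.0 * i for i in range(1, 31)]
data = [product_conc(t, 1e-3, 10.0, 15.0) for t in times_s]
k_fit = fit_rate_constant(times_s, data, 10.0, 15.0)
print(f"fitted k = {k_fit:.3e} nM^-1 s^-1")
```

Searching over log10(k) rather than k itself is a design choice that suits rate constants spanning several orders of magnitude. With experimental data the residuals would not vanish, and confidence intervals would need to be estimated separately.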

All fits were performed with MATLAB R2019a Optimization Toolbox. Supplementary Note 7 contains a detailed description of the fitting procedures, with the resultant fits tabulated in Supplementary Tables 15–20 and illustrated in Supplementary Figs. 10–18.