Abstract
Modern experimental methods in programmable self-assembly make it possible to precisely design particle concentrations, shapes and interactions. However, more physical insight is needed before we can take full advantage of this vast design space to assemble nanostructures with complex form and function. Here we show how a substantial part of this design space can be quickly and comprehensively understood by identifying a class of thermodynamic constraints that act on it. These thermodynamic constraints form a high-dimensional convex polyhedron that determines which nanostructures can be assembled at high equilibrium yield and reveals limitations that govern the coexistence of structures. We validate our predictions through detailed, quantitative assembly experiments of nanoscale particles synthesized using DNA origami. Our results uncover physical relationships underpinning many-component programmable self-assembly in equilibrium and form the basis for robust inverse design, applicable to various systems from biological protein complexes to synthetic nanomachines.
Main
Programmable self-assembly holds enormous potential for the construction of complex nanostructures at scale. The past decade has seen the development of an array of advanced experimental platforms for designing and synthesizing particles with tunable and specific interactions that—guided by theoretical design principles—lead to the formation of precisely defined finite-sized nanostructures1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22. However, synthetic self-assembly is still no match for self-assembly in biology, which is capable of assembling a multitude of complex structures from shared components and can steer the assembly outcome based on external cues23. By contrast, the vast majority of work in synthetic programmable assembly starts with a single, static target structure, and achieving high-yield assembly often requires the creation of a large number of distinct and individually addressable particle species6,8,9,24,25. In addition to being highly uneconomical14,26,27, this severely limits our ability to assemble multiple structures simultaneously or design multifarious or reconfigurable assemblies10,28,29.
Moving beyond the design of a single static structure requires less restrictive binding rules. For example, Fig. 1 shows three target ring-like shapes (Fig. 1a) and a set of allowed bonds between four particle types (Fig. 1b(i)) that allow for the assembly of these shapes (Fig. 1c(i)). However, these binding rules are also compatible with 280 other structures (Fig. 1c(ii)), meaning that further design parameters are necessary to achieve any reasonable level of control over the assembly outcome. Fortunately, experimental platforms are able to do more than just control which bonds are allowed; many can independently adjust the binding energies of each bond type and introduce the particle species at different concentrations14,24,30. These binding energies and particle concentrations form a secondary design space (Fig. 1b(ii)), defined for fixed binding rules, which has not been systematically explored.
Fig. 1: a, The design process often starts with the desired target shapes (here three rings) that should be assembled. b, (i) Binding rules that allow the assembly of the desired shapes from few components. Allowed bonds between the sides of the triangles are indicated by black lines. Grey sides are inert. The space of all the binding rules forms the (discrete) primary design space. (ii) For a given set of binding rules, it is possible to change the particle concentrations and binding energies, which together form the (continuous) secondary design space, illustrated here with sliders tuning the individual parameters. c, Assuming in-plane assembly with rigid binding, the binding rules shown in b allow the formation of the three desired target shapes (i) and 280 additional, off-target structures consisting of chimeras and incompletely assembled structures (ii). d, Experimental validation of yield calculations. Measured (points) and theoretical (lines) yields of all the observed structure shapes resulting from the triangular particles and binding rules shown on the top. The colours of the points and lines correspond to the colour outline around the seven observed structure shapes. The structure yields are shown as a function of the concentration cred of the red particle species. Every other particle species is supplied at concentration c0. Data are separated into two plots for better visibility. The error bars show the standard error of the measured yield, and are generally smaller than the plot markers (Methods).
In this paper, we show how to fully and robustly understand this secondary design space. More specifically, we show that equilibrium statistical mechanics implies the existence of a series of thermodynamic constraints that together have the mathematical structure of a high-dimensional convex polyhedral cone. The nature of this cone dictates theoretically allowed yields for a given choice of binding rules, enabling us to quickly and exactly determine whether a desired assembly outcome is possible or not. Furthermore, this polyhedral cone lets us identify ‘necessary chimeras’—off-target structures that are thermodynamically unavoidable given a particular target—and reveals low-dimensional relationships between the relative yields of coexisting structures.
To test the practical utility of our theory, we design and synthesize a set of DNA-origami particles, and perform quantitative experiments to self-assemble the set of ring-shaped structures shown in Fig. 1a. Without a priori knowledge of the binding energies or particle concentrations, and without modelling the details of the interactions, our theory is able to quantitatively predict the possible relative yields of the coexisting structures under various conditions. Together, our results demonstrate an internal logic to programmable self-assembly in equilibrium that leads to far-reaching predictions about physically possible assembly outcomes, which are independent of the microscopic details of the assembling particles. This broad generality makes our results the basis for a robust framework for economical inverse design in a wide range of experimental settings, from lock-and-key colloids18 to protein complexes20 and DNA nanoparticles6.
Equilibrium statistical mechanics predicts experimental yields
We consider the equilibrium self-assembly of finite-sized structures out of smaller programmable particles. As is the case in many experimental settings, we define a number of particle species (usually between 1 and 20), and specific short-ranged interactions lead to the formation of bonds only according to the binding rules (for example, Fig. 1b(i)). Although extensive experimental and theoretical work has focused on altering these binding rules to control the assembly process8,9,13,14,25,26,27,30,31,32, we will fix the binding rules and instead consider the impact of altering the binding energy of each bond type and the chemical potential (or equivalently, the particle concentration) of each particle species (Fig. 1b(ii)).
To proceed, we combine all the binding energies and chemical potentials into a single vector ξ, which we express in units of kBT, the Boltzmann constant multiplied by the temperature. Furthermore, let d be the length of this vector, that is, the combined number of independently adjustable binding energies and chemical potentials. Following a straightforward statistical mechanics formulation of the assembly outcome2,33,34,35,36, the equilibrium number density of a particular structure s is given by the mass action law34,37,38
\[{\rho }_{s}={\Omega }_{s}\,{{\rm{e}}}^{{{\bf{M}}}_{s}\cdot {\bf{\xi }}},\qquad (1)\]
where Ωs is a positive pre-factor related to the symmetry and entropy of s and depends on the system-specific details of the binding interactions. \({{\bf{M}}}_{s}\in {{\mathbb{N}}}^{d}\) is a vector listing the number of each particle species and bond type in s (Methods). Importantly, for particles assembling with short-ranged interactions, the parameters ξ only enter linearly in the exponential in equation (1), which has important consequences, as shown later.
We assume for now that the set \({\mathcal{S}}\) of all possible structures that are allowed by the binding rules and steric interactions is finite and has been computed, for example, through the methods in refs. 27,39,40. By summing over all the possible structures, we can then compute the equilibrium yield of every structure via
\[{Y}_{s}=\frac{{\rho }_{s}}{{\sum }_{{s}^{{\prime} }\in {\mathcal{S}}}{\rho }_{{s}^{{\prime} }}}.\qquad (2)\]
To demonstrate the real-world applicability of equations (1) and (2), we perform assembly experiments with triangular particles synthesized using DNA origami, which are discussed in detail below. Using this experimental platform, we realize the binding rules shown in Fig. 1d (top left), which (ignoring particle ‘colour’ and assuming perfectly rigid interactions without particle overlaps) allow seven distinct structure shapes to form. Transmission electron microscopy (TEM) micrographs of the observed structure shapes are shown in Fig. 1d (bottom left). The dependence of the structure yields on the concentration cred of the ‘red’ particle species is shown in Fig. 1d (right), which compares the experimentally measured absolute yields of each structure shape with theoretical yields computed using equation (2). Because the binding free energy is difficult to measure precisely, we treat it as a fit parameter rather than fixing it in advance; the best-fit value, ε = 12.1kBT, is in line with previous experimental estimates of the binding energy for DNA-origami triangles13,16 (Methods).
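The yield calculation behind equations (1) and (2) can be sketched in a few lines. This is a minimal illustration, assuming the mass-action form ρs = Ωs exp(Ms · ξ) with attractive binding energies counted as positive; the three-structure system, the prefactors `omega` and the parameter values `xi` are hypothetical and are not taken from the experiments.

```python
import math

def number_densities(M, omega, xi):
    """Mass-action densities rho_s = Omega_s * exp(M_s . xi) for every structure s."""
    return [w * math.exp(sum(m * x for m, x in zip(row, xi)))
            for row, w in zip(M, omega)]

def yields(M, omega, xi):
    """Equilibrium yields Y_s = rho_s / sum over all structures of rho_s'."""
    rho = number_densities(M, omega, xi)
    total = sum(rho)
    return [r / total for r in rho]

# Hypothetical toy system. Parameters xi = (mu_A, mu_B, eps) in units of kB*T.
# Each row of M counts (A particles, B particles, bonds) in one structure.
M = [
    (1, 0, 0),  # free A monomer
    (0, 1, 0),  # free B monomer
    (1, 1, 1),  # AB dimer
]
omega = [1.0, 1.0, 1.0]   # assumed prefactors, for illustration only
xi = (-8.0, -8.0, 12.0)   # assumed chemical potentials and binding energy

Y = yields(M, omega, xi)
for name, y in zip(["A monomer", "B monomer", "AB dimer"], Y):
    print(f"{name}: {y:.4f}")
```

Because the parameters enter only through the products Ms · ξ, changing a chemical potential or a binding energy shifts the logarithm of every density linearly, which is the property the following sections build on.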
Thermodynamic constraints and their polyhedral structure
Although equations (1) and (2) enable us to predict structure yields from the design parameters, solving the inverse problem, that is, finding parameters that maximize the yield of the desired structure(s), is highly non-trivial. In fact, it is often unclear if high-yield assembly is possible at all. To understand when, why and how high-yield assembly is possible, we need to dive deeper into the mathematical structure implied by equation (1).
We begin with a simple but far-reaching observation. Achieving 100% yield for a target structure st requires a vanishing number density for all other structures. However, since the design parameters ξ only appear in the exponent in equation (1), this can happen only when \({{\bf{M}}}_{{s}^{{\prime} }}\cdot {\bf{\xi }}\to -\infty\) for all structures \({s}^{{\prime} }\ne {s}_{t}\). This means we must consider limits in the parameter space. We, thus, rewrite ξ as \({\bf{\xi }}=\lambda \hat{{\bf{\xi }}}\), where \(\parallel \hat{{\bf{\xi }}}\parallel =1\), so we can systematically take the limit \(\lambda\rightarrow\infty\). Note that this represents a seemingly diabolical limit, as the divergence of binding energies is accompanied by a corresponding divergence in equilibration times. Nevertheless, we will see that this limit implies thermodynamic constraints that have profound implications even as we pull back to experimentally relevant energy scales.
In the limit of large λ, the density of any structure s is given by
\[\mathop{\lim }\limits_{\lambda \to \infty }{\rho }_{s}=\left\{\begin{array}{ll}\infty & {\rm{if}}\,{{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}} > 0\\ {\Omega }_{s} & {\rm{if}}\,{{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}=0\\ 0 & {\rm{if}}\,{{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}} < 0\end{array}\right..\qquad (3)\]
Importantly, the first case (\({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}} > 0\)) implies diverging particle concentrations, meaning that these limits cannot be physically realized and need to be excluded (Methods). Thus, for each structure s, there exists a thermodynamic constraint in the asymptotic limit given by
\[{{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}\le 0.\qquad (4)\]
These constraints have an interesting geometrical structure. Each constraint slices the d-dimensional parameter space in half (Fig. 2a). As \(\lambda\rightarrow\infty\), the grey region is forbidden, whereas ρs vanishes in the blue region; only when \(\hat{{\bf{\xi }}}\) is placed on the (d – 1)-dimensional constraint plane at which \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}=0\) does the structure s assemble at a finite concentration in this limit. However, there are many such constraint planes—one for every structure allowed by the binding rules. As illustrated in Fig. 2b, these constraints work together to further restrict the allowed regions in the parameter space. Geometrically, this region forms a d-dimensional convex polyhedral cone, or a constraint cone, whose boundary is composed of constraint planes.
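As a rough sketch of how these constraints can be checked numerically, the snippet below classifies the large-λ behaviour of each structure from the sign of Ms · ξ̂ and tests whether a direction lies inside the constraint cone. The function names, the tolerance and the three-structure toy system are assumptions made for illustration. Only the sign of the product matters, so the direction is left unnormalized.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def limit_behaviour(M_s, xi_hat, tol=1e-12):
    """Classify the density of one structure as lambda -> infinity along xi_hat."""
    p = dot(M_s, xi_hat)
    if p > tol:
        return "forbidden"  # density would diverge, so the direction is excluded
    if p < -tol:
        return "vanishes"   # structure is completely suppressed
    return "finite"         # direction lies on the constraint plane of s

def in_constraint_cone(M, xi_hat, tol=1e-12):
    """True if xi_hat satisfies M_s . xi_hat <= 0 for every structure."""
    return all(dot(M_s, xi_hat) <= tol for M_s in M)

# Hypothetical compositions: (A particles, B particles, bonds).
M = {
    "A monomer": (1, 0, 0),
    "B monomer": (0, 1, 0),
    "AB dimer": (1, 1, 1),
}
xi_hat = (-1.0, -1.0, 2.0)  # lies on the dimer plane: -1 - 1 + 2 = 0

behaviour = {name: limit_behaviour(M_s, xi_hat) for name, M_s in M.items()}
print(behaviour)
print(in_constraint_cone(list(M.values()), xi_hat))
```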
Fig. 2: a, A constraint plane corresponding to a structure s, defined by \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}=0\), separates the parameter space into two half-spaces. The grey half-space in which \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}} > 0\) is physically forbidden in the limit of high λ, which restricts the allowed limiting directions to the space \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}\le 0\). b, The intersection of all the allowed half-spaces forms a convex polyhedral cone. Redundant constraint planes may ‘touch’ the cone, but the cone is unaffected by their presence. For clarity, this figure sketches a two-dimensional slice through a higher-dimensional space. c, Every (d – 1)-dimensional face of the constraint cone corresponds to a designable structure. Lower-dimensional faces correspond to designable sets of structures. Structures corresponding to redundant constraints are not individually designable. A three-dimensional cartoon of the constraint cone is shown in the lower-left inset. d, The relationships between the faces of the constraint cone, and therefore between the designable sets, can be visualized as a Hasse diagram, as sketched in the cartoon example here. Nodes in the diagram correspond to faces/designable sets and edges indicate a containment relation: a df-dimensional set is contained in a (df − 1)-dimensional set if they are connected in the diagram.
This constraint cone allows us to understand exactly what can and cannot happen in the \(\lambda\rightarrow\infty\) limit. Placing the parameters in the cone’s interior (blue region) means that the number density of all structures goes to zero. However, if we align the parameters with one of the (d – 1)-dimensional faces of the cone, then the structure corresponding to that constraint plane—and only this structure—will assemble at a finite number density and, thus, achieve 100% yield. For example, placing \(\hat{{\bf{\xi }}}\) anywhere along the dark blue face in Fig. 2c will assemble the hexagonal structure at 100% yield, whereas placing \(\hat{{\bf{\xi }}}\) on the red face will assemble the triangular structure at 100% yield. We say that these two structures are designable.
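The approach to 100% yield can be illustrated numerically. The sketch below assumes the mass-action form ρs = Ωs exp(Ms · ξ), sets every prefactor to 1, and uses a hypothetical two-species system whose compositions (blue particles, red particles, bonds) are chosen for illustration. The direction is picked so that it lies on the constraint plane of the largest structure and strictly inside every other constraint.

```python
import math

# Assumed toy compositions: (blue particles, red particles, bonds).
STRUCTURES = {
    "blue monomer": (1, 0, 0),
    "red monomer": (0, 1, 0),
    "dimer": (1, 1, 1),
    "trimer": (1, 2, 2),
    "tetramer": (1, 3, 3),
}

def yield_of(target, xi_hat, lam, omega=1.0):
    """Yield of `target` at xi = lam * xi_hat, with every prefactor equal to omega."""
    rho = {
        name: omega * math.exp(lam * sum(m * x for m, x in zip(M_s, xi_hat)))
        for name, M_s in STRUCTURES.items()
    }
    return rho[target] / sum(rho.values())

# M_tetramer . raw = -3 - 3 + 6 = 0; every other product is negative.
raw = (-3.0, -1.0, 2.0)
norm = math.sqrt(sum(x * x for x in raw))
xi_hat = tuple(x / norm for x in raw)

for lam in (1, 5, 20, 50):
    print(lam, yield_of("tetramer", xi_hat, lam))
```

The yield rises monotonically with λ because every competing density decays exponentially while the target density stays fixed at its prefactor.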
However, not every structure is designable. The constraint plane corresponding to the rhomboid structure is shown by the dashed line and only intersects the cone at the intersection of the dark blue and red faces (Fig. 2c, purple dot). If we place \(\hat{{\bf{\xi }}}\) at this intersection, then all three structures will assemble with non-zero yield, meaning the rhomboid structure is not designable. Nevertheless, this observation allows us to expand the notion of designability to sets of structures that together can assemble at 100% yield. The three structures shown in Fig. 2c form such a designable set, and others can be found by looking at similar intersections of the constraint planes.
The intersections of various high-dimensional constraint planes are more complicated than in the simple two-dimensional illustration shown in Fig. 2a–c. In a d-dimensional parameter space, the constraint planes define the (d – 1)-dimensional faces of the cone, whereas the intersection of two such faces forms a (d – 2)-dimensional face. Furthermore, (d – 2)-dimensional faces can intersect to form (d – 3)-dimensional faces and so on until we arrive at the zero-dimensional ‘face’ at the origin (ξ = 0). Using tools from polyhedral computation (Methods), we can identify every face f of the constraint cone, which we organize in a so-called Hasse diagram according to each face’s dimensionality df (Fig. 2d). The faces of a convex polyhedron are always nested within each other, and the inclusion relations are visualized by the edges in the Hasse diagram.
The key insight is that each face, regardless of its dimensionality, corresponds to a set of structures that together are designable. These structures can be assembled at a combined 100% yield by aligning the parameters with the face and taking \(\lambda\rightarrow\infty\). Furthermore, this geometrical and combinatorial structure of the designable sets allows us to identify lower-dimensional design spaces—unbreakable rules created by statistical mechanics that govern relative yields within these designable sets in thermal equilibrium.
To see this, we choose an arbitrary face f. We can write any set of parameters ξ as
\[{\bf{\xi }}=\lambda {\hat{{\bf{\xi }}}}_{f}+{{\bf{\xi }}}_{\perp },\qquad (5)\]
where \(\lambda {\hat{{\bf{\xi }}}}_{f}\) is the df-dimensional component of ξ that is parallel to f and ξ⊥ is the cf ≡ (d − df)-dimensional component that is perpendicular to f. For structures not in the corresponding designable set \({{\mathscr{S}}}_{f}\), the extra finite component ξ⊥ does not affect the outcome as \(\lambda\rightarrow\infty\): these structures remain completely suppressed. For structures s in \({{\mathscr{S}}}_{f}\), however, \({{\bf{M}}}_{s}\cdot {\hat{{\bf{\xi }}}}_{f}=0\), so the number density becomes
\[{\rho }_{s}={\Omega }_{s}\,{{\rm{e}}}^{{{\bf{M}}}_{s}\cdot {{\bf{\xi }}}_{\perp }},\qquad (6)\]
meaning that the number densities—and therefore the relative yields—within the set can be tuned by varying ξ⊥. In other words, the number densities of structures within a designable set \({{\mathscr{S}}}_{f}\) are controlled by a cf-dimensional space perpendicular to f. In practice, this means that the relative yields between structures can often be tuned with far fewer degrees of freedom than one would naively expect, because biasing one structure over another can only be done by tuning the design parameters of components that are not shared between them. This is especially relevant to sets of three or more structures, where any two structures might be tunable independently, but component reuse among all of the structures puts constraints on the relative yields of the whole set.
Importantly, equation (6) is valid for any λ, even far away from the asymptotic limit. Methods provides additional details on this and a related, but less powerful, construction for non-designable sets. We will now explore these ideas through a series of examples that will highlight specific insights, consequences and experimental implications.
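A small numerical check of this decomposition is sketched below, assuming the mass-action form ρs = Ωs exp(Ms · ξ) with all prefactors set to 1. The five-structure toy system, the face direction `xi_hat_f` and the perpendicular component `xi_perp` are illustrative assumptions. Structures whose constraint planes contain the face direction keep the same density at every λ, whereas the remaining structure is suppressed as λ grows.

```python
import math

# Assumed toy compositions: (blue particles, red particles, bonds).
STRUCTURES = {
    "blue monomer": (1, 0, 0),
    "red monomer": (0, 1, 0),
    "dimer": (1, 1, 1),
    "trimer": (1, 2, 2),
    "tetramer": (1, 3, 3),
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1.0 / math.sqrt(2.0)
xi_hat_f = (0.0, -s, s)       # unit vector parallel to a one-dimensional face
xi_perp = (-6.0, -1.0, -1.0)  # perpendicular to xi_hat_f; values are assumed

def densities(lam):
    """Densities at xi = lam * xi_hat_f + xi_perp, with Omega_s = 1."""
    xi = tuple(lam * h + p for h, p in zip(xi_hat_f, xi_perp))
    return {name: math.exp(dot(M_s, xi)) for name, M_s in STRUCTURES.items()}

# Structures whose constraint planes contain the face direction.
in_set = [n for n, M_s in STRUCTURES.items() if abs(dot(M_s, xi_hat_f)) < 1e-12]

low, high = densities(2.0), densities(20.0)
for name in STRUCTURES:
    print(name, low[name], high[name])
```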
Exploring the polyhedral structure through specific examples
A minimal example
We begin with a very simple example consisting of two particle species, each with their own chemical potential, which can bind as shown in Fig. 3a. Here all the possible bonds are of the same type, governed by a single binding energy ε, which, together with two chemical potentials, form a three-dimensional parameter space. These binding rules allow five distinct structures to form: blue and red monomers, a dimer, a trimer and a tetramer. Figure 3a shows the five constraint planes, defined by \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}=0\), from which considerable insight can be gained.
Fig. 3: a, Constraint planes and a polyhedral cone derived from the simple binding rules (bottom right). The parameter space consists of the binding energy ε (defined here to be positive for attractive interactions and the same for all bonds; Supplementary Information, section 1.1) and the chemical potentials μ of the two particle species. Non-redundant constraints are shown in colour and redundant constraints are shown in grey. The white dotted line shows a limit direction parallel to the face f3. b, Faces of the polyhedral cone in a, visualized as a Hasse diagram. Every node corresponds to a polyhedral face, which, in turn, corresponds to a designable set of structures. The edges indicate containment relations. c, Section of the Hasse diagram corresponding to the complex, non-deterministic binding rules (top right). Designable sets containing free monomers are not shown here. d, Binding rules (top of (i)) lead to a large number of possible crystal phases. A cartoon sketch of a high-dimensional intersection of faces in the approximate constraint cone is also shown, which predicts that the chequerboard tiling and the tiling with holes can coexist. (ii) Simulation snapshot of a chequerboard tiling. (iii) Simulation snapshot of a tiling with holes. (iv) Simulation snapshot of a coexistence of chequerboard and tiling with holes. Supplementary Information, sections 1.2 and 1.3, provide details on the simulations.
First, notice that the region that satisfies all constraints—the constraint cone—is only bounded by three of the five planes. Therefore, the corresponding three structures, both monomers and the tetramer, are designable: each one can be assembled at high yield by aligning the parameters with the corresponding face (f1, f2 or f3) and taking the asymptotic limit. By contrast, the dimer and the trimer are not individually designable because their constraint planes (shown in grey) only touch the constraint cone at the lower-dimensional face r3. Assembling either structure at high yield, therefore, requires aligning the parameters with r3, but doing so makes it impossible to suppress the tetramer or blue monomer. No matter how energies or concentrations are chosen, the dimer and the trimer can never assemble alone. Instead, these four structures together form a designable set, corresponding to face r3. The other two designable sets are the two monomers (corresponding to r1) and the red monomer plus the tetramer (corresponding to r2); see the Hasse diagram shown in Fig. 3b.
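The face structure described above can be reproduced with a short brute-force calculation. The sketch below is not the polyhedral-computation method referred to in Methods; it is a simple stand-in that only works for a pointed cone in three dimensions, where every extreme ray is the intersection of two constraint planes. The compositions (one blue particle bound to one, two or three red particles) are an assumption consistent with the description in the text, and the parameter ordering (μblue, μred, ε) is chosen for illustration.

```python
import math
from functools import reduce
from itertools import combinations

NAMES = ["blue monomer", "red monomer", "dimer", "trimer", "tetramer"]
# Assumed compositions: (blue particles, red particles, bonds).
M = [(1, 0, 0), (0, 1, 0), (1, 1, 1), (1, 2, 2), (1, 3, 3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def primitive(v):
    """Scale an integer vector to coprime entries; None for the zero vector."""
    g = reduce(math.gcd, (abs(x) for x in v))
    return tuple(x // g for x in v) if g else None

def extreme_rays(M):
    """Intersect every pair of constraint planes and keep directions inside the cone."""
    rays = set()
    for a, b in combinations(M, 2):
        c = cross(a, b)
        for cand in (c, tuple(-x for x in c)):
            p = primitive(cand)
            if p is not None and all(dot(m, p) <= 0 for m in M):
                rays.add(p)
    return sorted(rays)

def active(ray):
    """Structures whose constraint plane contains the ray (a designable set)."""
    return frozenset(n for n, m in zip(NAMES, M) if dot(m, ray) == 0)

rays = extreme_rays(M)
ray_sets = {ray: active(ray) for ray in rays}

# In three dimensions a structure is individually designable if its plane
# contains a two-dimensional face, that is, at least two extreme rays.
designable = [n for n in NAMES if sum(n in s for s in ray_sets.values()) >= 2]

for ray, members in ray_sets.items():
    print(ray, sorted(members))
print(designable)
```

Integer arithmetic keeps the membership tests exact. For higher-dimensional parameter spaces this pairwise construction no longer applies, and a dedicated polyhedral-computation library would be needed.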
Since differences in yield can only be created by tuning the design parameters of particles or bonds that are not common between structures, it is not surprising that compromises become unavoidable when components are shared. In this case, the difference in composition between the dimer and the trimer is one bound red particle—the same difference as that between the trimer and the tetramer. Trying to bias the trimer over the dimer by raising the binding energy or the concentration of red particles, therefore, necessarily also biases the tetramer over the trimer, thereby making it impossible to assemble the dimer or the trimer on their own. Although this is intuitively plausible in the minimal example here, our framework formalizes these ideas in a way that is applicable to much more complicated scenarios, dealing with high-dimensional parameter spaces and multiple target structures, which we now explore.
Reconfigurable assembly with complex binding rules
We now apply our framework to the complex binding rules shown in Fig. 3c, which were originally investigated in ref. 27 as an economical design for a specific structure shape. Here we show that these rules can do much more. Assuming uniform binding energies (not required for our theory, but convenient in many experiments) leads to four degrees of freedom: three chemical potentials and one binding energy. We enumerate all 677 structures that are compatible with these binding rules and steric interactions using the tools of ref. 27, and construct the constraint cone.
Investigating the constraint cone reveals that there are seven individually designable structures (this time, excluding free monomers), which are shown in the partial Hasse diagram in Fig. 3c. In practice, this means that an experiment can control which one of these structures assembles simply by tuning the particle concentrations, without having to redesign the interactions (Supplementary Fig. 2). This is, thus, an example of reconfigurable assembly, where a single set of interactions can lead to different assembly outcomes, depending on the external input. Supplementary Information, section 1.2, provides a detailed discussion.
Going further, the designable sets shown in the next (two-dimensional) level in the Hasse diagram tell us which of these seven structures can be assembled simultaneously. Notice that some of these designable sets also include a third, non-designable structure: these are unavoidable chimeras that cannot be suppressed if the other two structures should assemble together.
Coexisting crystals
Constructing the polyhedral constraint cone requires a complete enumeration of all the allowed structures. For binding rules that lead to crystallization or other bulk assemblies, such an enumeration is not possible. Nevertheless, we now show that an incomplete enumeration can still provide non-trivial insights into, and predictions about, such systems. To this end, we consider the binding rules shown in Fig. 3d(i), consisting of two particle species and eight independently tunable binding energies.
As described in Supplementary Information, section 1.2, we construct an approximate constraint cone by sampling small clusters and periodic unit cells, and identify 77 that are designable. Two of these designable unit cells are shown in Fig. 3d(i). Our theory, thus, predicts that high-quality crystals, with these unit cells, can be achieved by placing the parameters along the corresponding faces and taking the \(\lambda\rightarrow\infty\) limit. To verify this, we simulate crystal formation using a simple grand canonical Monte Carlo scheme similar to the one used in ref. 28 (Supplementary Information, section 1.3). Figure 3d(ii),(iii) shows sections of the resulting crystal phases, confirming high-quality crystallization with only minimal defects.
Furthermore, the constraint planes of the two tilings intersect in a lower-dimensional face without any unavoidable chimeras. Our theory, thus, predicts parameter values at which coexistence between these two bulk phases is possible, which we confirm through simulation (Fig. 3d(iv) and Supplementary Fig. 3). Although this is only the first step towards a full treatment of phase coexistence through the present framework, this example highlights the applicability of our theory to bulk condensation, and shows that a complete enumeration of all structures is not always required for the constraint cone to make accurate predictions.
Coexistence of ring-shaped structures
We now return to our introductory example shown in Fig. 1b, and analyse it using our developed framework. The four particle species interact with five bond types, which, assuming rigid and in-plane assembly without particle overlaps, lead to 283 different possible structures. Out of these, we focus specifically on the three closed rings (Fig. 1c(i)). We treat the four chemical potentials and all five binding energies as free parameters, leading to a (d = 9)-dimensional parameter space.
Constructing the constraint cone reveals that there are 21 individually designable structures, which include the hexagonal ring and the triangular ring. The rhomboid ring, however, is not individually designable: it can only be assembled as part of a larger set that contains all three rings together. This situation was already sketched qualitatively in Fig. 2c: the constraint corresponding to the rhomboid is redundant and only touches the constraint cone at a (df = 7)-dimensional face.
To understand the possible relative yields between the three rings, we follow our earlier discussion and decompose the parameter space into a subspace parallel to the face associated with the designable set and a subspace orthogonal to it. The orthogonal subspace governs relative yields, and is (cf = 2)-dimensional in this case. However, in this specific example, one of these two degrees of freedom only leads to a uniform rescaling of the overall number density (Methods), which leaves only a single degree of freedom to tune the relative yields between the three rings.
Figure 4a shows the relative yields of the three rings as a function of the relevant one-dimensional component of ξ, which we label ζ⊥. No matter how the energies and particle concentrations are chosen, thermodynamically allowed relative yields must follow these curves, whose shapes are determined by structure compositions and entropy (Methods).
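The way a single degree of freedom ties several relative yields together can be sketched with a toy system. The ring compositions are not listed here, so the example below uses a hypothetical two-species stand-in (dimer, trimer and tetramer sharing one blue particle) with all prefactors Ωs set to 1; the parameter names `a` and `b` are assumptions. In this stand-in, `a` only rescales all densities uniformly and `b` alone moves the relative yields along a fixed curve.

```python
import math

# Assumed stand-in compositions: (blue particles, red particles, bonds).
COMPOSITION = {
    "dimer": (1, 1, 1),
    "trimer": (1, 2, 2),
    "tetramer": (1, 3, 3),
}

def relative_yields(a, b):
    """Relative yields for xi_perp = (a, b, b), assuming Omega_s = 1 for all s."""
    xi_perp = (a, b, b)
    rho = {
        name: math.exp(sum(m * x for m, x in zip(M_s, xi_perp)))
        for name, M_s in COMPOSITION.items()
    }
    total = sum(rho.values())
    return {name: r / total for name, r in rho.items()}

b_values = [k / 10.0 for k in range(-20, 21)]
curve = [relative_yields(a=-5.0, b=b) for b in b_values]
shifted = [relative_yields(a=-9.0, b=b) for b in b_values]  # different overall scale

# Replot against each other: trimer and tetramer yields versus dimer yield.
for point in curve[::10]:
    print(point["dimer"], point["trimer"], point["tetramer"])
```

Changing `a` leaves every point on the curve unchanged, and in this stand-in the yields always satisfy y_trimer² = y_dimer · y_tetramer. With unequal prefactors the same relation would hold up to a constant factor set by the Ωs.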
Fig. 4: a, Relative yield space of rhomboid (yr; blue), hexagon (yh; yellow) and triangular (yt; orange) rings. The relative yields ys = Ys/(Ytri + Yrho + Yhex) are shown as a function of the one degree of freedom ζ⊥ in the parameter space that affects the relative yields. b, Comparison between the experimentally measured (points) and theoretically predicted (lines) relative yields of the three rings. Since we do not know the value of ζ⊥ in experiments, yr and yh are shown as a function of yt. Empty symbols correspond to equal interactions, and full symbols correspond to enhanced binding between some bonds, as defined in the main text. Different data points are obtained at varying concentrations of the purple particle cp (Fig. 1b) and MgCl2 concentrations. The error bars show the standard error of the measured relative yields (Methods and Supplementary Information, section 2.1).
Experiments corroborate the polyhedral structure
We now introduce an experimental system to explore the real-world implications of our theory. We synthesize triangular particles using DNA origami, and implement the specific binding rules shown in Fig. 1b(i) by extending single-stranded DNA from the triangles’ sides. Adjusting the sequences of these strands then allows us to program specific interactions by exploiting Watson–Crick base pairing, and folding monomers with various combinations of edge strands allows us to create different ‘colours’ of particles, or species (Supplementary Tables 6 and 7 provide the sequence information). We then perform assembly experiments by mixing the various particle types, each at a concentration between 10 nM and 50 nM, in a buffer containing 20–30 mM of MgCl2. By annealing these assembly mixtures from 40 °C to 25 °C with a ramp rate of 0.1 °C every 1.5 h, we ensure that we pass through the temperature range in which the interactions are weak and reversible, so that the structures can anneal. This protocol gives the structures enough time to assemble near equilibrium at high temperatures; at lower temperatures, the ring assemblies then become stable owing to the increased binding energy of the DNA-mediated interactions. After annealing is complete, we image the assembly result using TEM, allowing us to count different structure shapes to measure their relative yields.
Since the particles assemble in three dimensions with finite out-of-plane flexibility, there are many other possible off-target structures not taken into account in our enumeration (Fig. 1c). We, thus, expect the absolute yield of the target structures to remain low, especially since the experimentally achievable energy scales are not high enough to reach the asymptotic limit, where the complete suppression of all off-target structures becomes possible. On the other hand, our prediction that there is only a single degree of freedom controlling the relative yields of the three target structures is independent of the number of off-target structures or the asymptotic limit and, thus, constitutes a robust prediction that we now test.
To this end, we perform multiple experiments at different parameter values, changing the concentration of purple particles cp (Fig. 1b(i)) and the relative binding energies between particles. Two sets of binding energies were used, one where all the interactions are equally strong, and another where the purple–blue and purple–yellow interactions are stronger than the other interactions (Supplementary Fig. 5). Since the value of ζ⊥ (Fig. 4a) is unknown in the experiments, we replot the different yield curves of Fig. 4a against each other, showing the normalized yields of the hexagonal and rhomboidal rings as a function of the normalized yield of the triangular ring. In this way, we can directly compare our theoretical results with experimental data.
Figure 4b shows excellent agreement between our theoretical expectations and experimental results. The lines show the theoretically predicted relative yields, whereas the experimentally measured relative yields are shown with different markers, according to the experimental parameter values. Every data point corresponds to between 50 and 150 countable structures, together constituting n = 1,817 structure counts in total (Supplementary Table 4). This striking quantitative agreement between theory and experiment, which is achieved without any fit parameters and with minimal assumptions on the particle interactions (Methods), confirms that the polyhedral structure of the thermodynamic constraints leads to robust experimental predictions.
Moreover, this agreement is obtained very far from the asymptotic limit, with absolute yields of the rings ranging from about 0.1% to 1.2% (corresponding to 1.3%–18% mass-weighted yield; Supplementary Information, section 2.1). This confirms the key finding of the paper: equilibrium self-assembly is governed by a polyhedral structure that restricts which yield combinations are allowed. For example, even with nine parameters, the three yields shown in Fig. 4 cannot be tuned independently. Our ability to not only understand but also precisely calculate this underlying polyhedral structure makes it a powerful tool for experimentally accurate inverse design.
Discussion
Although most work in programmable self-assembly focuses on tuning the binding rules, we have shown how to comprehensively understand the secondary design space consisting of binding energies and chemical potentials. For a given choice of binding rules, thermodynamic constraints form a high-dimensional polyhedral cone, and the faces of this cone determine which sets of structures are designable, that is, can theoretically assemble at high yield. Moreover, this framework identifies best-case scenarios for non-designable structures, necessary chimeras, low-dimensional design spaces and reconfigurable assemblies, thereby providing a robust and powerful tool for economical inverse design.
The excellent experimental agreement shown in Fig. 4 both validates a key pillar of our theory and demonstrates its utility in practice. However, assembly outcomes that are theoretically possible may still be challenging to realize experimentally, for example, owing to long equilibration times, or experimental bounds on achievable parameter values. The continued development of experimental capabilities will help to achieve these challenging high-yield assemblies. In addition, despite the success shown in Fig. 3d, our framework rests on the ability to identify important competing structures, and although the tools of ref. 27 are sufficient for our purposes, further work is needed to identify structures with internal stress or more robustly deal with bulk assemblies.
Our theory is derived from a single assumption: the free energy of a self-assembled structure is linear in the relevant design variables. In practice, this holds for binding energies and chemical potentials in systems assembling with short-ranged interactions; owing to the correspondence between the grand canonical and canonical ensembles in the thermodynamic limit41, our results also carry over to the canonical ensemble (Methods). This form of free energy is routinely encountered in self-assembly36; in addition to DNA origami, it has been used to describe systems as diverse as DNA-coated colloids40,42,43,44, colloids with depletion interactions2,18,45, virus capsids and other protein complexes34,46,47,48,49, and magnetic-handshake particles50, among others. This broad generality comes from the fact that the system-specific details of the binding interactions affect the structure entropy Ωs, but leave the essential scaling of the Boltzmann weight, \({\rm{e}}^{{{\bf{M}}}_{s}\cdot {\bf{\xi }}}\), unaffected. This means that the designability of (sets of) structures is independent of system details, whereas the precise shape of the relative yield curves, such as those shown in Fig. 4, may change depending on the details of the binding interactions. In our specific example shown in Fig. 4, we were able to predict the shape of the relative yield curves without modelling the microscopic interactions, because the shared ring-like topology of the structures causes the structure entropies to drop out of the relative yield calculations (Methods).
To see how our framework might be used in practice, start with one or more target structures with the goal of assembling them at high yield. For a proposed set of binding rules (which could be generated with the methods of refs. 26,27, for example), our calculations enable a comprehensive and near-instantaneous view of the secondary design space. We simply read out whether or not the target structures are designable, allowing us to quickly reject binding rules for which the best-case yield is insufficient. For example, we can immediately reject the binding rules in Fig. 1b if we want to assemble the rhomboid on its own. However, these binding rules are excellent if our goal is to dynamically switch between the hexagon and triangle, as this can be achieved by moving along the low-dimensional design space ζ⊥.
Our results raise a number of important and interesting questions. First and foremost, does self-assembly far from equilibrium conform to similar constraints? Although counterexamples surely exist, it will be interesting to learn whether the polyhedral structure provides a useful starting point for understanding and designing non-equilibrium systems. For instance, although many protein complexes can assemble in near-equilibrium conditions37,46,47,49,51,52, proteins themselves are the product of complex non-equilibrium processes23,53,54,55. Many such non-equilibrium processes are nevertheless constrained by thermodynamic aspects, which could potentially be described by our approach. Furthermore, has evolution exploited the polyhedral structure to better achieve robust and economical assembly? A better understanding of the dynamical processes that govern the approach to equilibrium or the drive away from equilibrium will help to address these questions more thoroughly, opening new doors to the design and control of dissipative pathways to synthetic assembly, and a better understanding of self-organization in living systems.
Methods
Yield calculations and their experimental validation
As discussed in the main text, we consider design parameters consisting of the chemical potentials μα of each particle species α and the binding energies εαβ,ij between sites i and j of particle species α and β, where a positive εαβ,ij indicates an attractive interaction. To simplify the notation, we collect the chemical potentials and non-zero binding energies in a parameter vector:
where nspc is the number of particle species and the nbnd allowed bonds are defined by the binding rules (for example, Fig. 1b). ξ has dimension d = nspc + nbnd. The vector Ms that appears in equation (1) counts the number of species and bond types present in structure s. If the structure contains \({n}_{s}^{\alpha }\) particles of species α, and \({b}_{s}^{\alpha \beta ,ij}\) bonds connecting binding sites i and j of species α and β, then
The pre-factor Ωs is the entropic partition function of a structure s, and is given by
where \({Z}_{s}^{{\rm{vib}}}\) and \({Z}_{s}^{{\rm{rot}}}\) are the vibrational and rotational partition functions33,34 of s, respectively; λD is the volume of a phase-space cell; and \({n}_{s}={\sum }_{\alpha }{n}_{s}^{\alpha }\) is the number of particles in s. The symmetry number σs counts the number of permutations of particles in s that are equivalent to a rotation of the structure33. The multiplicity coming from the permutations of identical particle species is taken into account implicitly through the chemical potentials27,34,41. If the microscopic interactions between particles are known, Ωs can be computed using the methods in refs. 33,34,35,44.
We assume Ωs to be independent of ξ, which is exactly true for simple interaction models, such as those in ref. 27, and a good assumption in general, since the effect of the parameters on the structure entropy is much weaker than their contribution to the Boltzmann weight (equation (1)). For the DNA-origami particles that we study here, models of the DNA-mediated binding interactions suggest that the binding entropy and energy can, to some extent, be tuned independently17.
When enumerating the structures in this paper, we assume that bond stiffness is very high, such that building blocks can only fit together side by side and strained structures cannot form. This assumption could be relaxed27, in which case the partition function for a strained structure would carry an additional factor \({\rm{e}}^{-{\varepsilon }_{{\rm{strain}}}/kT}\), set by the strain energy εstrain. As long as this strain energy does not strongly depend on, or is linear in, the design parameters, our framework applies to strained structures as well.
Figure 1d shows that equilibrium yields, as predicted by equation (2), agree excellently with the experimental yields. We have realized the particles and binding rules shown in Fig. 1d (left) with DNA origami (the details of which are discussed below and in Supplementary Information, section 2), and we measured the yields of the seven possible structure shapes as a function of the concentration cred of the particle species shown in red. All other particle species were kept at cblue = cyellow = cpurple = c0, with c0 = 2 nM. We measure the yields by counting the occurrence of each structure shape in the TEM images (excluding occasional unidentifiable aggregates), as shown in Supplementary Fig. 6 and Supplementary Table 3. At each value of cred, we obtain between 300 and 700 structure counts, leading to n = 3,617 structure counts in total. The binding energies of all four bond types were designed to be the same in the experiments. Because the precise value of the binding energy is unknown, we use it as a fit parameter in our theoretical calculations. Minimizing the least-squares error between the yields predicted by equation (2) and the experimental yields, while constraining the values of the chemical potentials to reproduce the experimental particle concentrations, we obtain optimal agreement at a fitted binding energy of approximately 12.1kBT, as discussed in the main text.
For the measured yields shown in Fig. 1 and Fig. 4, we compute the error bars as follows. For a given collection of structure counts ns, the experimental (relative) yield of structure s is defined as
\({Y}_{s}={n}_{s}/N,\)
where N = ∑sns is the total number of structure counts. Assuming that the counting errors are given by the square roots of the counts, we estimate the error of the yields via
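The counting-error estimate described above can be illustrated with a minimal sketch. It assumes that the counts of different structures are independent, each with an uncertainty equal to the square root of the count, and propagates these errors to first order through the ratio ns/N; the resulting expression is one standard form and is not necessarily the exact formula used for the figures. The function name and the example counts are hypothetical.

```python
import math

def yields_with_errors(counts):
    """Return {structure: (relative yield, error)} from raw structure counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one counted structure")
    result = {}
    for name, n in counts.items():
        y = n / total
        # First-order propagation of independent sqrt(n) counting errors
        # through Y = n / N gives var(Y) = n (N - n) / N^3 = Y (1 - Y) / N.
        err = math.sqrt(y * (1.0 - y) / total)
        result[name] = (y, err)
    return result

# Hypothetical counts, chosen only for illustration (not experimental data).
example = yields_with_errors({"triangle": 30, "hexagon": 50, "rhomboid": 20})
for name, (y, err) in example.items():
    print(f"{name}: {y:.3f} +/- {err:.3f}")
```

Under these assumptions the error reduces to the familiar binomial form, so rare structures with few counts carry large relative errors even when N is large.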
Structure enumeration
Most of the results in this paper were obtained with help from the structure enumeration algorithm introduced in ref. 27, which is capable of efficiently generating structures in two or three dimensions that satisfy a given set of binding rules, assuming rigidly locking binding interactions. The algorithm enumerates all physically realizable structures that can be formed from the given binding rules, meaning that steric overlaps and bonds between incompatible binding sites are forbidden. Particle overlaps are easily detected for particles assembling on a lattice (as is the case in all systems shown here); more complex building blocks can, in principle, be represented as rigid clusters of spheres or triangulated meshes to find overlaps and contacts27.
The algorithm can enumerate roughly 10,000 structures per second, and all enumerations in this paper were performed in less than 1 s on a 2024 MacBook Pro. The algorithm makes it possible to quickly detect whether a given set of binding rules leads to infinitely many structures or not, and can optionally generate structures only up to a maximal size. It is, in principle, possible to extend the enumeration algorithm to enumerate structures with flexible bonds. This would require a model of the microscopic binding interactions between particles, and would lead to additional computational costs27.
It is important to note that our theory is completely agnostic towards what enumeration method is used, and in general, the best (that is, most efficient or convenient) method for generating structures depends on the system at hand. For example, very different algorithms have been used to enumerate rigid sphere clusters39,44,56, or conformations of colloidal polymers40.
Thermodynamic constraints
To see why diverging structure densities in the asymptotic limit are unphysical, note that for the parameters violating \({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}\le 0\), particle concentrations rise as λ is increased, meaning that to realize the asymptotic limit, more and more particles need to be added to the system. At some point, steric effects between structures, which are not explicitly modelled here, make it impossible to add more particles, which means that the chemical potentials cannot be raised further, making it impossible to reach the asymptotic limit.
This situation is similar to, but distinct from, other cases in which unphysical chemical potentials emerge, such as standard aggregation theory37, or, for example, degenerate Bose gases41. In these cases, unphysical chemical potentials arise owing to singularities in the partition function, whereas in our case (assuming a finite number of possible structures), the forbidden chemical potentials come from the imposition of the asymptotic limit in parameter space.
Designability in the canonical ensemble
In the main text, we have described the assembly process using the grand canonical ensemble, which is the most convenient for our yield calculations. In practice, however, assembly usually happens in the canonical ensemble, meaning that particle concentrations, and not chemical potentials, are held constant.
In the thermodynamic limit (meaning that there are a large number of particles in the system), which is a good approximation for most self-assembly experiments, the canonical and grand canonical ensembles are equivalent41. This means that the equilibrium state described in the grand canonical ensemble by some choice of chemical potentials μα is equivalent to the state described in the canonical ensemble by specifying the total particle concentrations ϕα, provided the chemical potentials and particle concentrations are related via
\({\phi }_{\alpha }={\sum }_{s}{n}_{s}^{\alpha }{\rho }_{s},\)
where the sum runs over all possible structures s, \({n}_{s}^{\alpha }\) is the number of particles of species α in structure s and ρs is the equilibrium number density in the grand canonical ensemble, given by equation (1). Numerically inverting this relationship to solve for μα as a function of ϕα then makes it possible to compute yields as functions of particle concentrations. This was done, for example, in Fig. 1d to match the experimental particle concentrations and in Supplementary Fig. 2 when computing the concentration-dependent yield.
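As an illustration of this numerical inversion, the following sketch solves for the chemical potential of a toy system with a single particle species that forms only monomers and dimers. It is not the code used for Fig. 1d: the structure list, the choice Ωs = 1, the units (energies in kBT, densities dimensionless) and the function names are all assumptions made for the example. Because the total concentration increases monotonically with μ, simple bisection is sufficient here; systems with several species require a multidimensional root finder.

```python
import math

# Toy structure list: (particles n_s, bonds b_s, entropic prefactor Omega_s).
# One species, one bond type; Omega_s = 1 is an assumption for illustration.
STRUCTURES = [(1, 0, 1.0), (2, 1, 1.0)]  # monomer, dimer

def densities(mu, eps):
    """Number densities rho_s = Omega_s * exp(n_s*mu + b_s*eps), in units of kBT."""
    return [omega * math.exp(n * mu + b * eps) for n, b, omega in STRUCTURES]

def total_concentration(mu, eps):
    """phi = sum_s n_s * rho_s, the total particle concentration."""
    return sum(n * rho for (n, _, _), rho in zip(STRUCTURES, densities(mu, eps)))

def solve_mu(phi, eps, lo=-50.0, hi=50.0, tol=1e-12):
    """Find mu such that total_concentration(mu, eps) == phi by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_concentration(mid, eps) < phi:
            lo = mid  # concentration too low: raise the chemical potential
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

eps, phi = 5.0, 0.01
mu = solve_mu(phi, eps)
rho_monomer, rho_dimer = densities(mu, eps)
print(mu, rho_dimer / (rho_monomer + rho_dimer))  # dimer yield at fixed phi
```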
Importantly, whether or not μα are prescribed directly (grand canonical ensemble) or obtained as functions of the particle concentrations (canonical ensemble) does not change the predictions about designability we make in the main text.
Designable sets and polyhedral faces
We now give more precise definitions to the concepts discussed in the main text. All of the following definitions are discussed at length in any reference on convex polyhedra, such as refs. 57,58,59,60.
Consider a polyhedron defined by a series of linear inequality constraints (equation (4)), with \(s\in {\mathscr{S}}\). A polyhedral face f is a subset of the polyhedron in which certain constraints are active (\({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}}=0\) for \(s\in {{\mathscr{S}}}_{f}\subset {\mathscr{S}}\)), and all other constraints are inactive (\({{\bf{M}}}_{s}\cdot \hat{{\bf{\xi }}} < 0\) for \(s\notin {{\mathscr{S}}}_{f}\)). The faces of a polyhedral cone can have any dimension from df = 0 to df = d. Comparing this definition with the definition of designable sets (equation (3)) shows that the directions in parameter space that lead to high-yield assembly for a designable set \({{\mathscr{S}}}_{f}\) correspond one to one with the polyhedral face f whose active constraints correspond to the structures in \({{\mathscr{S}}}_{f}\).
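The face definition above can be checked directly for any proposed direction in parameter space. The sketch below is a minimal illustration and not part of our analysis pipeline: it uses a hypothetical toy system with one particle species and one bond type, in which each composition vector is (number of particles, number of bonds), and the function name and tolerance are assumptions. Given a direction, it reports which constraints are active, or returns None if any constraint is violated.

```python
def classify_direction(compositions, xi_hat, tol=1e-9):
    """Return the structures with active constraints (M_s . xi_hat = 0),
    or None if some constraint is violated (M_s . xi_hat > 0)."""
    active = []
    for name, m in compositions.items():
        value = sum(mi * xi for mi, xi in zip(m, xi_hat))
        if value > tol:
            return None  # direction lies outside the constraint cone
        if abs(value) <= tol:
            active.append(name)
    return active

# Hypothetical toy system: linear chains of up to three particles.
# Composition vector = (number of particles, number of bonds).
CHAINS = {"monomer": (1, 0), "dimer": (2, 1), "trimer": (3, 2)}

# Direction (mu_hat, eps_hat) = (-2, 3): only the trimer constraint is active,
# so this direction lies on the face associated with the set {trimer}.
print(classify_direction(CHAINS, (-2, 3)))   # ['trimer']

# Direction (-1, 2) makes the dimer constraint active but violates the
# trimer constraint, so it is not an allowed direction.
print(classify_direction(CHAINS, (-1, 2)))   # None
```

In this toy system, no non-zero allowed direction makes the dimer constraint active on its own, which is the kind of statement that the face lattice makes systematic.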
Faces of a polyhedron can be partially ordered by inclusion, and the resulting partially ordered set is called the polyhedron’s face lattice. The combinatorial properties of the face lattice give rise to the combinatorial properties of designable sets, which are visualized with the Hasse diagrams in the main text.
Perhaps the most important combinatorial property of designable sets is that the intersection of two designable sets is again designable. This is proved in Supplementary Section 1.4, and has an important consequence: for any group of structures, there exists a unique, smallest designable set that contains all of them. Optimizing the yield of a structure that is not designable by itself, therefore, consists of finding this minimal encompassing designable set, and then tuning the relative yields to maximize the yield of the structure as much as possible. However, note that there is no guarantee that a given structure is part of any designable set beyond the ‘trivial’, df = 0 set containing all possible structures. In fact, for the reconfigurable square system discussed in Fig. 3c, only 123 out of 677 structures are part of a non-trivial designable set.
For structures within a designable set, their number densities can be tuned via equation (6), independently of the asymptotic limit. A similar relation is true for sets of structures that are not designable. Taking the intersection of their associated constraint planes again decomposes the parameter space into a part that is within this intersection, and its orthogonal complement. Tuning the parameters within this orthogonal complement changes the relative yields within the non-designable set, in full analogy to the designable case discussed in the main text. The key difference, however, is that the intersection of the constraint planes of a non-designable set of structures is not a face of the constraint cone. This means that it is not possible to tune the relative yields within a non-designable set independently of the set’s absolute yield, because there is no asymptotic limiting direction that suppresses all other structures.
In addition to these theoretical aspects, the polyhedral structure has important computational implications. Since all the constraints are linear, checking whether a (set of) structure(s) is asymptotically designable can be done exactly and efficiently (in polynomial time in the number of constraints) with linear programming61 (Supplementary Section 1.5), whereas the enumeration of designable sets can be achieved with algorithms from polyhedral computation, as shown below.
Polyhedral computation
We compute the Hasse diagram of the constraint cones with a simple version of the algorithm described previously62. In brief, we start by filtering the constraints for redundancies using cddlib63, which leaves us with all non-redundant constraints. We then use the double description algorithm64,65 to transform the cone from its inequality representation to its ray (vertex) representation. From this, we can construct an incidence matrix that indicates which rays are contained in which facets of the polyhedron. By generating the unique Boolean products of the columns of this matrix, we can iteratively construct all the faces of the polyhedron. This process takes less than a second for all the systems considered here.
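The final step of this procedure, building all faces from the ray–facet incidence data, can be sketched as follows. This is a simplified illustration rather than the implementation used in this work: it represents each facet by the set of rays it contains and closes the collection under intersection, which plays the role of the Boolean column products. The redundancy removal and double description steps are not reproduced, and the incidence data below are written by hand for a hypothetical cone over a square.

```python
def enumerate_faces(facet_rays, n_rays):
    """Enumerate all faces of a cone, each identified by the set of rays it contains.

    facet_rays: list of sets; facet_rays[k] holds the indices of the rays
    lying in facet k (the columns of the incidence matrix).
    """
    full = frozenset(range(n_rays))      # the cone itself
    faces = {full}
    frontier = [full]
    while frontier:
        face = frontier.pop()
        for rays in facet_rays:
            candidate = face & frozenset(rays)   # Boolean product of incidence columns
            if candidate not in faces:
                faces.add(candidate)
                frontier.append(candidate)
    return faces

# Hypothetical example: a three-dimensional cone over a square,
# with four rays and four facets, each facet containing two adjacent rays.
square_cone_facets = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]
faces = enumerate_faces(square_cone_facets, n_rays=4)

# Ordering faces by inclusion of their ray sets yields the face lattice.
for face in sorted(faces, key=lambda f: (len(f), sorted(f))):
    print(sorted(face))
```

For this example the procedure returns ten faces: the apex (empty ray set), the four rays, the four facets and the full cone.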
Although the computational cost of computing the entire Hasse diagram scales exponentially with the number of non-redundant constraints and, therefore, becomes infeasible for large systems, it is important to note that the diagram can be generated layer by layer. This means that the ‘rightmost’ layers of the Hasse diagram, corresponding to the individually designable structures and small designable sets, can always be computed rather quickly (more precisely, in polynomial time) simply through redundancy removal of the constraints60,66,67.
Predicting allowed relative yields
To find the direction in parameter space that allows tuning of the relative yields, we construct the matrix \({M}_{f}=[{{\bf{M}}}_{{s}_{1}}^{\rm{T}},{{\bf{M}}}_{{s}_{2}}^{\rm{T}}\ldots ]\), \({s}_{i}\in {{\mathscr{S}}}_{f}\), consisting of the composition vectors of the structures in the designable set \({{\mathscr{S}}}_{f}\) we want to tune. In our case, in Fig. 4, where we tune the relative yields of the three rings, we have \({M}_{{f}_{{\rm{rings}}}}=[{{\bf{M}}}_{{\rm{tri}}}^{\rm{T}},{{\bf{M}}}_{{\rm{hex}}}^{\rm{T}},{{\bf{M}}}_{{\rm{rho}}}^{\rm{T}}]\). Computing the singular value decomposition of this matrix shows that it has rank two, equal to the co-dimension \({c}_{{f}_{{\rm{rings}}}}=2\) of the corresponding face of the polyhedral constraint cone, as predicted by our general theory in the main text. In other words, the number densities of the three ring structures are constrained and can only be tuned with two degrees of freedom, which we can calculate from the right singular vectors of \({M}_{{f}_{{\rm{rings}}}}\).
However, since here we are not interested in the number densities directly but in the relative yields between the structures, there is one more thing to consider: if there exists a direction in the parameter space that uniformly scales the number densities of all structures in the designable set, this direction cannot affect their relative yields, since uniformly scaling all the number densities leaves their ratios invariant. In this case, relative yields can only be tuned with cf − 1 degrees of freedom, one fewer compared with the number densities themselves.
This loss of an additional degree of freedom occurs often, but not always, and in general, depends on the specific designable set. For example, if all the structures in the designable set are tree-like, then the uniform scaling of number densities is always possible and one degree of freedom for relative yields is lost. To see this, note that every tree-like structure containing n particles contains exactly n − 1 bonds. This means that increasing all the chemical potentials by α and decreasing all the binding energies by the same amount uniformly scales the number densities of all structures by a factor \({\rm{e}}^{\beta \alpha }\), with β = 1/kBT.
More generally, we can determine if uniform scaling is possible by checking if the vector of ones, \({\mathbb{1}}={[1,1\ldots ]}^{\rm{T}}\), lies in the image of Mf. If that is the case, the direction in parameter space mapping onto \({\mathbb{1}}\) leads to a uniform scaling of the number densities, and relative yields can only be tuned with cf − 1 degrees of freedom. We can calculate the left-over degrees of freedom by projecting the direction corresponding to uniform scaling out from Mf, and then again perform a singular value decomposition. In our example shown in Fig. 4, performing this analysis reveals that uniform scaling of number densities is possible. Therefore, the relative yields between the three rings can only be tuned with a single degree of freedom (Fig. 4).
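The uniform-scaling test can be illustrated with a short sketch. It is not the code used for Fig. 4: instead of a singular value decomposition it uses exact Gaussian elimination over rational numbers, it stores one structure per row (so that the matrix times a parameter direction gives one entry per structure), and the composition vectors are hypothetical single-species examples of the form (number of particles, number of bonds). The vector of ones lies in the image of the matrix exactly when appending a column of ones does not raise the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    mat = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(mat[0]) if mat else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][col] / mat[r][col]
            mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
        if r == len(mat):
            break
    return r

def relative_yield_dof(compositions):
    """Return (degrees of freedom for relative yields, uniform scaling possible?)."""
    base = rank(compositions)
    augmented = [list(row) + [1] for row in compositions]
    uniform = rank(augmented) == base   # ones vector lies in the image
    return (base - 1 if uniform else base), uniform

# Hypothetical tree-like set: dimer, trimer and tetramer chains (n particles, n - 1 bonds).
print(relative_yield_dof([(2, 1), (3, 2), (4, 3)]))   # (1, True)

# Hypothetical mixed set: monomer, dimer and a closed three-particle ring.
print(relative_yield_dof([(1, 0), (2, 1), (3, 3)]))   # (2, False)
```

The first example reproduces the tree-like argument above: the direction (+1, −1) maps every chain onto the same value, so one degree of freedom is lost. In the second example the closed ring breaks this counting and no uniform scaling direction exists.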
In our example shown in Fig. 4, we do not model the microscopic interaction between particles and, thus, cannot exactly calculate the value of Ωs for the three rings. We, therefore, estimate the rings’ entropic partition functions as
where veff is the effective volume a bound particle can explore if all other particles are held fixed and ns is the number of particles in structure s. Importantly, this effective volume can be absorbed into the binding energies ε, making them effective binding free energies \(\tilde{{\bf{\varepsilon }}}={\bf{\varepsilon }}+{k}_{{\rm{B}}}T\log [{v}_{{\rm{eff}}}/{\lambda }^D]\). This shows that veff can be compensated by changing ε, which means that it cannot affect the range of possible relative yields and that we do not need to estimate veff for the predictions we make in the main text. This is not true in general, but works in our case because the number of particles equals the number of bonds for all rings, causing veff to factor out. Therefore, in this approximation, the shapes of the relative yield curves are determined by the structure compositions and the symmetry numbers σhex = 6, σtri = 3 and σrho = 2. Interestingly, this strong dependence on symmetry numbers is a fairly common feature in small-cluster assembly, even if the vibrational entropy does not completely drop out of the calculation2. In the general case in which entropic contributions do not factor out, looking at relative yield curves could be a potential way of measuring the entropic contributions to the partition function, without having to measure binding energies or even particle concentrations directly.
Folding DNA origami
To assemble our DNA-origami monomers, we make a solution with 50 nM of p8064 scaffold (Tilibit), 200 nM of each staple strand (Integrated DNA Technologies and Nanobase structure 247 (ref. 68) for sequences) and 1× folding buffer. We then anneal this solution using a temperature protocol described below. Our folding buffer, referred to as FoBX, contains 5 mM of Tris base, 1 mM of EDTA, 5 mM of NaCl and X mM of MgCl2. We use a Tetrad (Bio-Rad) thermocycler to anneal our samples.
To find the best folding conditions for each sample, we follow a standard screening procedure to search multiple MgCl2 concentrations and temperature ranges13,69, and select the protocol that optimizes the yield of monomers while limiting the number of aggregates that form. All the particles used in this study were folded at 17.5 mM of MgCl2 with the following annealing protocol: (i) hold the sample at 65 °C for 15 min, (ii) ramp the temperature from 58 °C to 50 °C with steps of 1 °C per hour and (iii) hold at 50 °C until the sample can be removed for further processing.
Agarose gel electrophoresis
We use agarose gel electrophoresis to assess the folding protocols and purify our samples with gel extraction. We prepare all gels by bringing a solution of 1.5% (w/w) agarose in 0.5× TBE to a boil in a microwave. Once the solution is homogeneous, we cool it to 60 °C using a water bath. We then add MgCl2 and SYBR-safe (Invitrogen) to have concentrations of 5.5 mM of MgCl2 and 0.5× SYBR-safe. We pour the solution into an Owl B2 gel cast, add gel combs (20-μl wells for screening folding conditions or 200-μl wells for gel extraction) and let the gel cool to room temperature. A buffer solution of 0.5× TBE and 5.5 mM of MgCl2, chilled at 4 °C for an hour, is poured into the gel box. Agarose gel electrophoresis is run at 110 V for 1.5–2 h in a 4 °C cold room. We scan the gel with a Typhoon FLA 9500 laser scanner (GE Healthcare) at 100-μm resolution.
Sample purification
After folding, we purify our DNA-origami particles to remove all the excess staples and misfolded aggregates using gel purification. If the particles have self-complementary interactions, they are diluted 2:1 with 1× FoB2 and held at 47 °C for 30 min to unbind higher-order assemblies. The folded particles are run through an agarose gel (now at a 1× SYBR-safe concentration for visualization) using a custom gel comb, which can hold around 2 ml of solution per gel. We use a blue fluorescent light table to identify the gel band containing the monomers. The monomer band is then extracted using a razor blade. We place the gel slices into a Freeze ‘N Squeeze spin column (Bio-Rad), freeze it in a –20 °C freezer for 5 min and then spin the solution down for 5 min at 12,000g. The concentration of the DNA-origami particles in the subnatant is measured using Nanodrop (Thermo Scientific). We assume that the solution consists only of monomers, where each monomer has 8,064 base pairs.
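The conversion from a measured mass concentration to a molar particle concentration can be sketched as follows. The average molar mass of 650 g/mol per base pair is a commonly used approximation and is an assumption of this example, as are the function name and the input value; the exact conversion factor used for our samples is not specified here.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average molar mass of one DNA base pair

def origami_concentration_nM(mass_conc_ng_per_ul, n_base_pairs=8064):
    """Convert a mass concentration (ng/ul) into a molar particle concentration (nM),
    assuming the solution contains only monomers of n_base_pairs base pairs."""
    molar_mass = n_base_pairs * AVG_BP_MASS_G_PER_MOL   # g/mol per monomer
    grams_per_litre = mass_conc_ng_per_ul * 1e-3        # 1 ng/ul = 1e-3 g/l
    return grams_per_litre / molar_mass * 1e9           # mol/l -> nM

# Hypothetical reading of 10 ng/ul, for illustration only.
print(round(origami_concentration_nM(10.0), 2))
```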
Since the concentration of particles obtained after gel purification is typically not high enough for assembly, we concentrate the solution using ultrafiltration69. First, a 0.5-ml Amicon 100-kDa ultrafiltration spin column (Millipore) is equilibrated by centrifuging down 0.5 ml of 1× FoB5 buffer at 5,000g for 7 min. Then, the DNA-origami solution is added and centrifuged at 14,000g for 15 min. We remove the flow-through and repeat the process until all of the DNA-origami solution is filtered. Finally, we flip the filter upside down into a new Amicon tube and spin down the solution at 1,000g for 2 min. The concentration of the final DNA-origami solution is then measured using a Nanodrop.
Assembly experiments
Assembly experiments are conducted with DNA-origami particle concentrations ranging from 10 nM to 50 nM for the ring experiments (Fig. 4b), and 6 nM to 10.5 nM for the small-cluster experiments (Fig. 1c). Assembly solutions have volumes up to 30 μl with the desired DNA-origami concentration in a 1× FoB buffer with MgCl2 concentrations of 20 mM to 30 mM for the ring experiments and 20 mM for the small-cluster experiments. During the small-cluster experiments, the solution is kept at room temperature. For ring experiments, the solution is placed in a 200-μl PCR tube and loaded into a thermocycler (Bio-Rad), which runs a temperature ramp from 40 °C to 25 °C with a ramp rate of 0.1 °C per 1.5 h. We note that we see similar assembly outcomes for ramp rates of 0.1 °C per 0.75 h, with more kinetic traps occurring at faster rates. The thermocycler lid is held at 100 °C to prevent water condensation on the cap of the PCR tube.
Negative-stain TEM
We first prepare a solution of uranyl formate (UFo). We boil doubly distilled water to deoxygenate it and then mix in UFo powder to create a 2% (w/w) UFo solution. We cover the solution with aluminium foil to avoid light exposure and vortex it vigorously for 20 min, after which we filter the solution with a 0.2-μm filter. Last, we divide the solution into 0.2-ml aliquots, which are stored in a –80 °C freezer until further use.
Before each negative-stain TEM experiment, we take a 0.2-ml UFo aliquot out from the freezer to thaw at room temperature. We add 4 μl of 1-M NaOH and vortex the solution vigorously for 15 s. The solution is centrifuged at 4 °C and 16,000g for 8 min. We extract 170 μl of the supernatant for staining and discard the rest.
The electron microscopy samples are prepared using FCF400-Cu grids (Electron Microscopy Sciences). We glow discharge the grid before use at –20 mA for 30 s at 0.1 mbar, using a Quorum Emitech K100X glow discharger. We place 4 μl of the sample on the carbon side of the grid for 1 min to allow the adsorption of the sample to the grid. During this time, 5-μl and 18-μl droplets of UFo solution are placed on a piece of parafilm. After the adsorption period, the remaining sample solution is blotted on 11-μm Whatman filter paper. We then touch the carbon side of the grid to the 5-μl drop and blot it away immediately to wash away any buffer solution from the grid. This step is followed by picking up the 18-μl UFo drop onto the carbon side of the grid and letting it rest for 30 s to deposit the stain. The UFo solution is then blotted and any excess fluid is vacuumed away. Grids are allowed to dry for a minimum of 15 min before insertion into the TEM.
We image the grids using an FEI Morgagni TEM operated at 80 kV with a Nanosprint5 complementary metal–oxide–semiconductor camera (AMT). Images are acquired at magnifications between ×3,500 and ×5,600.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Design files and folding conditions of DNA origami used in this work are provided in the repository Nanobase68 and are accessible at https://nanobase.org/structures/247. All the TEM images and associated experimental data are available via Zenodo at https://doi.org/10.5281/zenodo.17314727 (ref. 70).
Code availability
Structure enumeration was performed using the Roly.jl27,71 (v.0.1.0) package developed by M.C.H. and C.P.G., which is available via GitHub at https://github.com/mxhbl/Roly.jl. Polyhedral computation and linear programming were performed using the freely available Convex.jl72 (v.0.16.4) package, Polyhedra.jl73 (v.0.7.8) package and cddlib63 (v.0.9.4) library. The example code reproducing the calculations done on the three rings and reconfigurable squares is available via GitHub at https://github.com/mxhbl/PolyhedralStructureOfSelfAssembly. An implementation of the lattice Monte Carlo sampler is available via GitHub at https://github.com/mxhbl/LatticeSampler. Fitting of the yield curves was achieved using the freely available Optim.jl74 (v.1.12.0) package.
References
Mirkin, C. A., Letsinger, R. L., Mucic, R. C. & Storhoff, J. J. A DNA-based method for rationally assembling nanoparticles into macroscopic materials. Nature 382, 607–609 (1996).
Meng, G., Arkus, N., Brenner, M. P. & Manoharan, V. N. The free-energy landscape of clusters of attractive hard spheres. Science 327, 560–563 (2010).
Wang, Y. et al. Colloids with valence and specific directional bonding. Nature 491, 51–55 (2012).
Wang, Y. et al. Crystallization of DNA-coated colloids. Nat. Commun. 6, 7253 (2015).
Rogers, W. B., Shih, W. M. & Manoharan, V. N. Using DNA to program the self-assembly of colloidal nanoparticles and microparticles. Nat. Rev. Mater. 1, 16008 (2016).
Jacobs, W. M. & Rogers, W. B. Assembly of complex colloidal systems using DNA. Annu. Rev. Condens. Matter Phys. 16, 443–463 (2025).
Park, S. H. et al. Finite-size, fully addressable DNA tile lattices formed by hierarchical assembly procedures. Angew. Chem. Int. Ed. 45, 735–739 (2006).
Wei, B., Dai, M. & Yin, P. Complex shapes self-assembled from single-stranded DNA tiles. Nature 485, 623–626 (2012).
Ke, Y., Ong, L. L., Shih, W. M. & Yin, P. Three-dimensional structures self-assembled from DNA bricks. Science 338, 1177–1183 (2012).
Evans, C. G., O'Brien, J., Winfree, E. & Murugan, A. Pattern recognition in the nucleation kinetics of non-equilibrium self-assembly. Nature 625, 500–507 (2024).
Bai, X.-c., Martin, T. G., Scheres, S. H. W. & Dietz, H. Cryo-EM structure of a 3D DNA-origami object. Proc. Natl Acad. Sci. USA 109, 20012–20017 (2012).
Funke, J. J. & Dietz, H. Placing molecules with Bohr radius resolution using DNA origami. Nat. Nanotechnol. 11, 47–52 (2016).
Hayakawa, D. et al. Geometrically programmed self-limited assembly of tubules using DNA origami colloids. Proc. Natl Acad. Sci. USA 119, e2207902119 (2022).
Hayakawa, D., Videbæk, T. E., Grason, G. M. & Rogers, W. B. Symmetry-guided inverse design of self-assembling multiscale DNA origami tilings. ACS Nano 18, 19169–19178 (2024).
Sigl, C. et al. Programmable icosahedral shell system for virus trapping. Nat. Mater. 20, 1281–1289 (2021).
Wei, W.-S. et al. Hierarchical assembly is more robust than egalitarian assembly in synthetic capsids. Proc. Natl Acad. Sci. USA 121, e2312775121 (2024).
Videbæk, T. E. et al. Measuring multisubunit mechanics of geometrically-programmed colloidal assemblies via cryo-EM multi-body refinement. Proc. Natl Acad. Sci. USA 122, e2500716122 (2025).
Sacanna, S., Irvine, W. T. M., Chaikin, P. M. & Pine, D. J. Lock and key colloids. Nature 464, 575–578 (2010).
Huntley, M. H., Murugan, A. & Brenner, M. P. Information capacity of specific interactions. Proc. Natl Acad. Sci. USA 113, 5841–5846 (2016).
Huang, P.-S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).
Boyken, S. E. et al. De novo design of protein homo-oligomers with modular hydrogen-bond network-mediated specificity. Science 352, 680–687 (2016).
Niu, R. et al. Magnetic handshake materials as a scale-invariant platform for programmed self-assembly. Proc. Natl Acad. Sci. USA 116, 24402–24407 (2019).
Alberts, B. et al. Molecular Biology of the Cell 4th edn (Garland, 2002).
Murugan, A., Zou, J. & Brenner, M. P. Undesired usage and the robust self-assembly of heterogeneous structures. Nat. Commun. 6, 6203 (2015).
Zeravcic, Z., Manoharan, V. N. & Brenner, M. P. Size limits of self-assembled colloidal structures made using specific interactions. Proc. Natl Acad. Sci. USA 111, 15918–15923 (2014).
Bohlin, J., Turberfield, A. J., Louis, A. A. & Sulc, P. Designing the self-assembly of arbitrary shapes using minimal complexity building blocks. ACS Nano 17, 5387–5398 (2023).
Hübl, M. C. & Goodrich, C. P. Accessing semiaddressable self-assembly with efficient structure enumeration. Phys. Rev. Lett. 134, 058204 (2025).
Murugan, A., Zeravcic, Z., Brenner, M. P. & Leibler, S. Multifarious assembly mixtures: systems allowing retrieval of diverse stored structures. Proc. Natl Acad. Sci. USA 112, 54–59 (2015).
Osat, S. & Golestanian, R. Non-reciprocal multifarious self-organization. Nat. Nanotechnol. 18, 79–85 (2023).
Videbæk, T. E. et al. Economical routes to size-specific assembly of self-closing structures. Sci. Adv. 10, eado5979 (2024).
Hormoz, S. & Brenner, M. P. Design principles for self-assembly with short-range interactions. Proc. Natl Acad. Sci. USA 108, 5193–5198 (2011).
Videbæk, T. E. et al. Tiling a tubule: how increasing complexity improves the yield of self-limited assembly. J. Phys. Condens. Matter 34, 134003 (2022).
Klein, E. D., Perry, R. W. & Manoharan, V. N. Physical interpretation of the partition function for colloidal clusters. Phys. Rev. E 98, 032608 (2018).
Curatolo, A. I., Kimchi, O., Goodrich, C. P., Krueger, R. K. & Brenner, M. P. A computational toolbox for the assembly yield of complex and heterogeneous structures. Nat. Commun. 14, 8328 (2023).
Holmes-Cerfon, M., Gortler, S. J. & Brenner, M. P. A geometrical approach to computing free-energy landscapes from short-ranged potentials. Proc. Natl Acad. Sci. USA 110, E5–E14 (2013).
Israelachvili, J. N. (ed.) Intermolecular and Surface Forces (Academic Press, 2011).
Hagan, M. F. & Grason, G. M. Equilibrium mechanisms of self-limiting assembly. Rev. Mod. Phys. 93, 025008 (2021).
van Kampen, N. Stochastic Processes in Physics and Chemistry (Elsevier Science Publishers, 1992).
Arkus, N., Manoharan, V. N. & Brenner, M. P. Deriving finite sphere packings. SIAM J. Discrete Math. 25, 1860–1901 (2011).
McMullen, A., Basagoiti, M. M., Zeravcic, Z. & Brujic, J. Self-assembly of emulsion droplets through programmable folding. Nature 610, 502–506 (2022).
Kardar, M. Statistical Physics of Particles (Cambridge Univ. Press, 2007).
Valignat, M.-P., Theodoly, O., Crocker, J. C., Russel, W. B. & Chaikin, P. M. Reversible self-assembly and directed assembly of DNA-linked micrometer-sized colloids. Proc. Natl Acad. Sci. USA 102, 4225–4229 (2005).
Grünwald, M. & Geissler, P. L. Patterns without patches: hierarchical self-assembly of complex structures from simple building blocks. ACS Nano 8, 5891–5897 (2014).
Holmes-Cerfon, M. Sticky-sphere clusters. Annu. Rev. Condens. Matter Phys. 8, 77–98 (2016).
Perry, R. W. & Manoharan, V. N. Segregation of ‘isotope’ particles within colloidal molecules. Soft Matter 12, 2868–2876 (2016).
Zandi, R., van der Schoot, P., Reguera, D., Kegel, W. & Reiss, H. Classical nucleation theory of virus capsids. Biophys. J. 90, 1939–1948 (2006).
Perlmutter, J. D. & Hagan, M. F. Mechanisms of virus assembly. Annu. Rev. Phys. Chem. 66, 1–23 (2015).
Zlotnick, A. Theoretical aspects of virus capsid assembly. J. Mol. Recognit. 18, 479–490 (2005).
Sartori, P. & Leibler, S. Lessons from equilibrium statistical physics regarding the assembly of protein complexes. Proc. Natl Acad. Sci. USA 117, 114–120 (2020).
Du, C. X. et al. Programming interactions in magnetic handshake materials. Soft Matter 18, 6404–6410 (2022).
Traub, P. & Nomura, M. Structure and function of E. coli ribosomes. V. Reconstitution of functionally active 30S ribosomal particles from RNA and proteins. Proc. Natl Acad. Sci. USA 59, 777–784 (1968).
Davis, J. H. & Williamson, J. R. Structure and dynamics of bacterial ribosome biogenesis. Philos. Trans. R. Soc. B Biol. Sci. 372, 20160181 (2017).
Hopfield, J. J. Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc. Natl Acad. Sci. USA 71, 4135–4139 (1974).
Ouldridge, T. E. & ten Wolde, P. R. Fundamental costs in the production and destruction of persistent polymer copies. Phys. Rev. Lett. 118, 158103 (2017).
Qureshi, B., Poulton, J. M. & Ouldridge, T. E. Information propagation in far-from-equilibrium molecular templating networks is optimised by pseudo-equilibrium systems with negligible dissipation. Preprint at https://doi.org/10.48550/arXiv.2404.02791 (2024).
Arkus, N., Manoharan, V. N. & Brenner, M. P. Minimal energy clusters of hard spheres with short range attractions. Phys. Rev. Lett. 103, 118303 (2009).
Ziegler, G. M. Lectures on polytopes. In Graduate Texts in Mathematics Vol. 152 (Springer, 1995).
Grünbaum, B. Convex polytopes. In Graduate Texts in Mathematics Vol. 221 (Springer, 2003).
Barvinok, A. A course in convexity. In Graduate Studies in Mathematics Vol. 54 (American Mathematical Society, 2002).
Fukuda, K. Polyhedral Computation (ETH Zurich, 2020).
Boyd, S. & Vandenberghe, L. Convex Optimization (Cambridge Univ. Press, 2004).
Fukuda, K. & Rosta, V. Combinatorial face enumeration in convex polytopes. Comput. Geom. 4, 191–198 (1994).
Fukuda, K. et al. cddlib. GitHub https://github.com/cddlib/cddlib (2020).
Motzkin, T. S., Raiffa, H., Thompson, G. L. & Thrall, R. M. in Contributions to the Theory of Games Vol. II (eds Kuhn, H. W. & Tucker, A. W.) 51–74 (Princeton Univ. Press, 1953).
Fukuda, K. & Prodon, A. in Combinatorics and Computer Science (eds Deza, M. et al.) 91–111 (Springer, 1996).
Clarkson, K. More output-sensitive geometric algorithms. In Proc. IEEE Annual Symposium on Foundations of Computer Science 695–702 (IEEE, 1994).
Avis, D. & Jordan, C. in Algorithmic Foundations for Social Advancement (eds Minato, S.-i. et al.) 209–221 (Springer Nature Singapore, 2025).
Poppleton, E., Mallya, A., Dey, S., Joseph, J. & Sulc, P. Nanobase.org: a repository for DNA and RNA nanostructures. Nucleic Acids Res. 50, D246–D252 (2022).
Wagenbauer, K. F. et al. How we make DNA origami. ChemBioChem 18, 1873–1885 (2017).
Hübl, M. C., Videbæk, T. E., Hayakawa, D., Rogers, W. B. & Goodrich, C. A polyhedral structure controls programmable self-assembly. Zenodo https://doi.org/10.5281/zenodo.17314727 (2025).
Hübl, M. C. & Goodrich, C. P. Roly.jl: Reverse-search polyform enumerator. GitHub https://github.com/mxhbl/Roly.jl (2024).
Udell, M., Mohan, K., Zeng, D., Hong, J., Diamond, S. & Boyd, S. Convex optimization in Julia. In Proc. 1st Workshop for High Performance Technical Computing in Dynamic Languages 18–28 (IEEE, 2014).
Legat, B. Polyhedra.jl. GitHub https://github.com/JuliaPolyhedra/Polyhedra.jl (2023).
Mogensen, P. K. & Riseth, A. N. Optim: a mathematical optimization package for Julia. J. Open Source Softw. 3, 615 (2018).
Acknowledgements
We thank B. Isaac and A. Tiano for their technical support with the electron microscopy and S. Waitukaitis for helpful comments on the manuscript. The TEM images were prepared and imaged at the Brandeis Electron Microscopy facility. This work was supported by the Gesellschaft für Forschungsförderung Niederösterreich under project FTI23-G-011 (M.C.H. and C.P.G.), the Brandeis University Materials Research Science and Engineering Center (MRSEC) under grant number NSF DMR-2011846 (T.E.V., D.H. and W.B.R.) and the Smith Family Foundation (W.B.R.).
Funding
Open access funding provided by Institute of Science and Technology (IST Austria).
Author information
Contributions
M.C.H. and C.P.G. conceived of and developed the theory. M.C.H. performed the numerical calculations. T.E.V., D.H. and W.B.R. designed the experiments. T.E.V. and D.H. conducted the experiments. All authors analysed the data and wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Physics thanks Thomas Ouldridge and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary sections 1 and 2, Supplementary Figs. 1–8 and Supplementary Tables 1–7.
Supplementary Table 1
DNA-origami staple sequences.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hübl, M.C., Videbæk, T.E., Hayakawa, D. et al. A polyhedral structure controls programmable self-assembly. Nat. Phys. 22, 294–301 (2026). https://doi.org/10.1038/s41567-025-03120-3
DOI: https://doi.org/10.1038/s41567-025-03120-3