Quantifying complexity in DNA structures with high resolution Atomic Force Microscopy

Holmes, Elizabeth P.; Gamill, Max C.; Provan, James I.; Wiggins, Laura; Rusková, Renáta; Whittle, Sylvia; Catley, Thomas E.; Main, Kavit H. S.; Shephard, Neil; Bryant, Helen. E.; Gilhooly, Neville S.; Gambus, Agnieszka; Račko, Dušan; Colloms, Sean D.; Pyne, Alice L. B.

doi:10.1038/s41467-025-60559-x


Article
Open access
Published: 01 July 2025


Nature Communications volume 16, Article number: 5482 (2025)



Abstract

DNA topology is essential for regulating cellular processes and maintaining genome stability, yet it is challenging to quantify due to the size and complexity of topologically constrained DNA molecules. By combining high-resolution Atomic Force Microscopy (AFM) with a new high-throughput automated pipeline, we can quantify the length, conformation, and topology of individual complex DNA molecules with sub-molecular resolution. Our pipeline uses deep-learning methods to trace the backbone of individual DNA molecules and identify crossing points, efficiently determining which segment passes over which. We use this pipeline to determine the structure of stalled replication intermediates from Xenopus egg extracts, including theta structures and late replication products, and the topology of plasmids, knots and catenanes from the E. coli Xer recombination system. We use coarse-grained simulations to quantify the effect of surface immobilisation on twist-writhe partitioning. Our pipeline opens avenues for understanding how fundamental biological processes are regulated by DNA topology.


Introduction

The complex topological landscape of DNA is essential to cellular function. For example, while most DNA is negatively supercoiled, positive supercoiling at transcription start sites has been shown to influence mRNA synthesis¹. DNA can also become tangled, either with itself or with other DNA molecules. This entanglement can impede replication and transcription, increase DNA damage and mutation rates, prevent chromatin assembly, and may even influence cellular differentiation^2,3,4,5,6,7. Misregulation of DNA topology can trigger cell death at cytokinesis, thus ensuring that genome integrity is maintained^8,9. Defects in the regulation of DNA topology can also lead to diseases such as cancer and neurodegeneration^10,11,12. To fully understand the role of DNA topology, including its regulatory processes and pathologies, we must develop new methods to determine how the mechanical and geometric properties of DNA affect its interactions with other biomolecules. DNA topology encompasses both its superhelical properties (over- or under-winding of the DNA helix) and its entanglement. Knots are self-entangled individual DNA circles, while catenanes consist of two or more interlinked circles of DNA¹³ (Supplementary Fig. 1a). These structures can be inferred at the ensemble or single molecule levels from their biophysical properties by, e.g. electrophoretic techniques^{13,14,15,16,17,18,19,20,21,22,23,24,25,26,27}, DNA looping assays²⁸, optical and magnetic tweezer measurements²⁹, and nanopore detectors^30,31,32,33. Of these, gel electrophoresis is by far the most accessible and well-defined method^{34,35,36,37,38}. In the absence of supercoiling, topological species migrate according to topological complexity or average crossing number in 3D space. Thus, if two distinct topological species of the same size have a similar average crossing number (e.g. a 4-node catenane and a 4-node knot) they will migrate to a similar position on the gel.
In contrast, microscopy techniques, such as Atomic Force Microscopy (AFM)^{26,39,40,41,42,43,44,45} or electron microscopy (EM)^{13,17,21,22,23,46,47,48,49,50,51,52,53}, provide full geometrical descriptions of the structure of single DNA molecules, allowing precise topological determination in addition to information on local curvature^54,55, and helical structure.

Microscopy-based determination of entangled DNA must faithfully trace the path of the DNA molecules and then discriminate the “crossing order” of intersecting segments, i.e. which segment passes over which at each crossing. Typically, the technique of choice has been transmission EM of rotary-shadowed specimens, where a DNA sample is coated with RecA, shadowed, and then imaged under vacuum^{13,49,51,52,53,56}. This technique helped elucidate the topological reaction mechanisms of various site-specific DNA recombinases and topoisomerases^{17,22,23,50,57,58}, but it is technically demanding and can suffer from inconsistent rotary-shadowing and incomplete RecA polymerisation on double-stranded DNA, with some crossing geometries remaining ambiguous.

AFM imaging has become a powerful tool to probe the structure and interactions of DNA^{39,42,43,44,59,60,61,62,63}. AFM provides nanometre resolution imaging on single molecules in aqueous conditions with real-time imaging capabilities, and minimal sample preparation. Recent studies have made use of AFM to mechanistically explore DNA-condensin interactions⁶⁴, the binding kinetics of a transposition complex⁶³, and the effect of supercoiling density on DNA minicircle structure⁴³. However, no studies to date have exploited the three-dimensional capabilities of AFM to determine the crossing order where one DNA duplex crosses another in topologically complex species, or have done so in an automated manner that allows this to be quantified at scale.

Here we present an automated pipeline for the high-throughput tracing of single, untreated DNA molecules from high resolution AFM images, captured in aqueous conditions. Our analysis tool traces each DNA molecule, identifies each point where it crosses itself and defines which segment passes over which, i.e. the crossing order, by analysing the differential height profiles of under- and over-passing DNA. This enables topological classification through the Python package Topoly⁶⁵.

To determine the effectiveness of our pipeline on complex DNA structures, we quantify the composition of DNA replication intermediates from Xenopus laevis egg extracts stalled with two orthogonal model replication fork impediments, the Lac repressor and Tus protein^66,67,68,69. We use high-resolution AFM imaging to visualise the entire structure and use our automated pipeline to calculate the contour length of the entire structure. Beyond this, we automatically identify replication forks and quantify the length of the unreplicated DNA between them. Furthermore, we identify a number of additional structural features, including stalled forks, and quantify their length and frequency, giving additional information beyond global structural composition.

To test the fidelity of our pipeline, we utilise the E. coli Xer recombination system (Supplementary Figs. 1–6) in vitro to generate a suite of plasmid-sized predictable topological products including homogeneous right-handed 4‑node catenanes, and knots of increasing node-number^{18,70,71,72,73,74}. Using these we construct a thorough representation and classification of the behaviour and topology of DNA by AFM. In addition, we quantify a recurrent depositional effect observed during our examination of 4-node catenated DNA, in which the clustering of crossings obscures the overall conformation of the molecule. We further explain this effect by comparing our AFM observations with the predictions of coarse-grained molecular dynamics simulations. By providing an objective and efficient means to explore the conformation and topology of varied DNA structures, we can interrogate fundamental processes involving DNA-protein transactions, with potential impact on a range of cellular processes.

Results

High-resolution AFM enables accurate tracing and contour length measurement via crossing order determination

High-resolution AFM imaging enables visualisation of the double helix of DNA on individual molecules in aqueous conditions, without manipulation beyond the process of deposition onto a flat mica substrate. The height information that AFM provides should enable us to determine which DNA duplex passes over and under at each crossing (the “crossing order”), and therefore explicitly determine the topology of individual molecules. We use the defined product of the E. coli Xer recombination system between two directly repeated psi sites on plasmid p4CAT (1651 bp): a right-handed 4-node catenane consisting of one larger (1253 bp) and one smaller (398 bp) DNA circle (Supplementary Fig. 1c). Visual inspection of a high-resolution AFM image shows that the product contains the expected large and small circles, interlinked by 4 well-separated crossings (#1-4), with the addition of one ‘trivial’ self-crossing (#5) within the large circle that does not contribute to the overall topology (Fig. 1a). If each crossing is assigned as over- or under-passing by eye, the catenane appears to have the predicted topology (Fig. 1b).

**Fig. 1: Determining the topology of individual DNA molecules using AFM.**

To define the path each DNA duplex takes through the product and the overall DNA topology, we must determine the crossing order for every intersecting duplex. The over-passing segment will form a “humpback bridge” conformation, as the under-passing segment passes underneath. We use the height information traced through each crossing to determine the crossing order as the over-passing profile should form a wider peak than the under-passing (Fig. 1c). To test this, the height is traced manually along each duplex through the crossing, using the AFM processing software Gwyddion⁷⁵, with the wider peak assigned as the over-passing duplex, and the narrower the under-passing (Fig. 1d). When assigned in this manner, the crossings alternate between over and under as each circle is traced (the small circle overlies the large circle at crossings 1 and 3 but underlies it at crossings 2 and 4). As we cannot reliably determine DNA sequence direction from the AFM images, we cannot absolutely determine the sign notation at each crossing. Independent of sequence orientation, there are two possible right-handed topologies for this 4-node catenane (Supplementary Fig. 1a-iii). If we arbitrarily assign antiparallel orientations to the two circles, the four right-handed crossings will have a negative crossing sign (Supplementary Fig. 1a-iii). This corresponds to the expected right-handed anti-parallel 4-node topology of the catenane (Fig. 1b), which has previously been determined by biochemical methods⁷¹, confirming our visual assignment of the crossing order. In this example, the misassignment of a single crossing order would lead to incorrect classification of the topology as a 2-noded catenane, while misassignment of two crossings would lead to classification as two unlinked circles (Supplementary Fig. 3).
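The effect of a misassigned crossing on the inferred topology can be illustrated with the linking number of the two circles, which is half the sum of the signed crossings between them. The following is a minimal sketch, not the classification used in our pipeline: the function names are hypothetical, the crossing signs follow the arbitrary antiparallel orientation described above, and the trivial self-crossing (#5) is excluded because it involves only one circle. The linking number alone does not uniquely identify a link type, but here it changes in the same way as the classifications described in the text.

```python
def linking_number(inter_circle_signs):
    """Linking number of two closed curves: half the sum of the signed
    crossings between them (self-crossings of one circle are excluded)."""
    if len(inter_circle_signs) % 2:
        raise ValueError("two closed curves cross each other an even number of times")
    return sum(inter_circle_signs) // 2


def flip(signs, misassigned):
    """Swapping over/under at a crossing inverts the sign of that crossing."""
    return [-s if i in misassigned else s for i, s in enumerate(signs)]


# Crossings 1-4 of the catenane, all negative under the assumed orientation.
catenane = [-1, -1, -1, -1]

print(linking_number(catenane))                # -2: 4-node catenane
print(linking_number(flip(catenane, {0})))     # -1: consistent with a 2-node catenane
print(linking_number(flip(catenane, {0, 1})))  # 0: consistent with unlinked circles
```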

However, manual assignment of topology is a prolonged process and subject to observer bias. Additional challenges include tracing the entire molecular backbone and incorporating the measurements of individual crossings into the wider molecular architecture to extract topological characteristics such as writhe. To remove this bottleneck, we develop an automated pipeline to trace and classify individual molecules and determine their length and topology. We image a series of topologically complex molecules, including nicked and negatively supercoiled plasmids and catenated and nicked knotted DNA (Fig. 1e). To enable accurate tracing analysis and crossing detection, the molecules should be in as open a conformation as possible. A buffer solution containing magnesium chloride in place of nickel chloride^60,76,77 is used to adhere the DNA to the mica surface. This buffer immobilised the DNA in a more open conformation (Supplementary Fig. 7a–d), lowering the chance of trivial self-crossings. Given the level of supercoiling of the unknot plasmid (σ = −0.06), the expected writhe should be on the order of 10 self crossings⁷⁸. However, due to the immobilisation in MgCl₂, we only see 1.7 ± 1.4 visible self-crossings within the supercoiled unknot sample. By comparison, for the sample with NiCl₂ immobilisation, 4.4 ± 2.3 self-crossings are visible (Supplementary Fig. 7f). We measure the bounding area of the molecules and observe that the nicked molecules immobilised in MgCl₂ adopt the most open conformations, and the supercoiled molecules in NiCl₂ the most closed conformations (Supplementary Fig. 7e).

Our accurate tracing of the DNA molecules allows us to determine the mean contour length and standard deviation for each sample of unknotted, knotted and catenated structures. For the simplest structure, unknotted plasmids, the contour length is within 1% of the expected length (761 ± 11 nm vs 768 nm expected) (Supplementary Table 1). However, for more complex molecules, multiple DNA crossings cluster into a single crossing, reducing the number of correctly identified molecules obtained after data cleaning steps. This is evidenced by a larger proportion of retained molecules in the 3-node knots (56%), which have a lower number of crossings and hence a lower propensity to cluster, than the 5-twist knots (7%) (Supplementary Table 1). The clustering also affects catenated molecules, where 80 of 145 catenated structures can be separated into their individual molecules. However, for those that are separable, contour lengths are determined to within 5% of their expected length (Supplementary Fig. 8). Beyond contour length, we determine the number of crossings in each molecule and therefore infer whether molecules of the same topology are likely to be supercoiled (Supplementary Figs. 7f, 9).

An automated pipeline for molecular tracing and explicit determination of DNA topology

To further enhance the scope of our methodology, we wanted to automate the topological determination of complex molecules as well as measurements of their contour length. There are two major challenges in automating the classification of DNA topologies from AFM images: accurate path tracing, which can discriminate between adjacent or crossing DNA duplexes, and the determination of the crossing order at each crossing. Therefore, we developed multiple new methods in this work to overcome these challenges and implemented them into the open-source software TopoStats⁷⁹ to maximise their accessibility.

Self-crossing or close-passing molecules cannot be segmented reliably using traditional methods such as binary thresholding (Fig. 2a; Supplementary Figs. 10 and 11). Therefore, we train a deep-learning U-Net⁸⁰ model to perform image segmentation for DNA molecules, producing clearer, more reliable segmentations (Fig. 2b). We then skeletonise the segmented image to produce single pixel traces which follow the molecular backbone (Fig. 2c). To faithfully recapitulate the path of DNA through molecular crossings, we enhance the Zhang and Suen⁸¹ skeletonisation algorithm to take advantage of the height data present in AFM images, to ‘bias’ the skeletonisation onto the centre of the molecule, even at crossings (Fig. 2d). This is a substantial improvement over the manual tracing described above, which took straight-line profiles through crossings, reducing the accuracy of the crossing order determination. We perform convolution to detect the central point or “node” (Fig. 2d - black) of the crossing, and pair the emanating branches as described in the Methods to obtain continuous molecular traces (Fig. 2e), even for complex DNA molecules with several intersections.
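The node detection step can be illustrated with a neighbour count on a binary skeleton. This is a minimal sketch of one common approach and is not taken from the TopoStats source: the function names are hypothetical, and the explicit loops are equivalent to convolving the skeleton with a 3 × 3 kernel of ones with a zero centre. Skeleton pixels with three or more skeleton neighbours are treated as candidate nodes, and the neighbour count gives the number of emanating branches.

```python
def count_neighbours(skeleton, row, col):
    """Number of skeleton pixels among the 8 neighbours of (row, col)."""
    n_rows, n_cols = len(skeleton), len(skeleton[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                total += skeleton[r][c]
    return total


def find_nodes(skeleton):
    """Map each candidate node (row, col) to its number of emanating branches."""
    nodes = {}
    for r, row in enumerate(skeleton):
        for c, value in enumerate(row):
            if value:
                n = count_neighbours(skeleton, r, c)
                if n >= 3:
                    nodes[(r, c)] = n
    return nodes


# Toy skeletons: a 4-branch crossing and a 3-branch (fork-like) junction.
crossing = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
fork = [
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
]

print(find_nodes(crossing))  # {(2, 2): 4}
print(find_nodes(fork))      # {(1, 1): 3}
```

On real skeletons, pixels adjacent to a junction can also exceed the neighbour threshold, so flagged pixels that touch one another would need to be merged into a single node before the branches are paired.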

**Fig. 2: Automated tracing and topological determination for complex DNA molecules from AFM images.**

To determine the crossing order, we classify the height profile of the DNA duplex with the greatest full-width half-maximum (FWHM) through the crossing as over-passing (Fig. 2f - green). The FWHM metric is chosen over simpler metrics, e.g. average height or area under the curve, because classifications based on those metrics are more strongly influenced by trace artefacts such as close nodes or nicks, causing inaccuracies. In addition, FWHM does not require Gaussian curve fitting, which is challenging to auto-fit across a dataset, particularly when trying to differentiate from the noisy background trace. We quantify the average crossing order reliability for each crossing using the ratio between the minimum and maximum FWHM values (FWHM_pair) for N paired branches (Eq. 1). Using a ratio of the true positive to false negative classifications for 83 hand-labelled crossings and the code-produced crossing orders, we suggest that a crossing reliability threshold of 0.263 optimises the ratio of correct to incorrect classifications (Supplementary Fig. 12).

$$\text{average crossing order reliability}=\frac{1}{N}\sum_{\mathrm{FWHM}_{\mathrm{pair}}}\left(1-\frac{\min (\mathrm{FWHM}_{\mathrm{pair}})}{\max (\mathrm{FWHM}_{\mathrm{pair}})}\right)$$

(1)
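The FWHM comparison and Eq. 1 can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code in TopoStats: the function names are hypothetical, the baseline of each profile is taken as its minimum height, the half-maximum positions are found by linear interpolation between samples, and each crossing is treated as contributing one FWHM pair to the average.

```python
def fwhm(heights, spacing=1.0):
    """Full width at half maximum of a sampled height profile."""
    peak = max(range(len(heights)), key=heights.__getitem__)
    base = min(heights)                      # assumed baseline
    if heights[peak] == base:
        return 0.0                           # flat profile has no peak
    half = base + (heights[peak] - base) / 2.0

    def half_position(step):
        i = peak
        while 0 <= i + step < len(heights) and heights[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(heights):
            return float(i)                  # never drops to half: clamp to edge
        # Interpolate between sample i (above half) and j (at or below half).
        frac = (heights[i] - half) / (heights[i] - heights[j])
        return i + step * frac

    return (half_position(+1) - half_position(-1)) * spacing


def crossing_order(profiles, spacing=1.0):
    """Return (index of over-passing profile, reliability) for one crossing."""
    widths = [fwhm(p, spacing) for p in profiles]
    over = max(range(len(widths)), key=widths.__getitem__)
    widest = max(widths)
    reliability = 1.0 - min(widths) / widest if widest > 0 else 0.0
    return over, reliability


def average_reliability(crossings, spacing=1.0):
    """Mean of 1 - min(FWHM)/max(FWHM) over all crossings (Eq. 1)."""
    values = [crossing_order(p, spacing)[1] for p in crossings]
    return sum(values) / len(values)


narrow = [0, 1, 2, 1, 0]        # under-passing duplex: FWHM of 2 samples
wide = [0, 1, 2, 2, 2, 1, 0]    # over-passing "humpback bridge": FWHM of 4 samples

print(crossing_order([narrow, wide]))                           # (1, 0.5)
print(average_reliability([[narrow, wide], [narrow, narrow]]))  # 0.25
```

A reliability near 0 means the two profiles have almost the same width, so the assignment is uncertain; crossings scoring below the 0.263 threshold suggested in the text would be flagged as unreliable.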

The path of a single circular molecule (knotted or unknotted) is traced in its entirety, starting and ending at a single point on the path. However, further complexity is introduced when multiple molecules are entangled, such as in DNA catenanes. For those images, the constituent molecules are separated and then traced (Fig. 2g, h). Single or entwined molecule traces are then classified using Rolfsen’s knot notation format by Topoly^65,82 (Fig. 2i), e.g. 4²₁ for a 4-node catenane. Although we produce a topological classification for each molecule, the general sample topology can also be identified using other measures from the software e.g., the topology with the largest minimum crossing order reliability (Supplementary Fig. 13), or the topology most represented in the distribution (Supplementary Fig. 14).

Building on our existing software TopoStats^43,79, this work demonstrates the first integrated pipeline to identify, trace, and determine the topology of complex DNA molecules, including knotted, catenated and supercoiled DNA substrates. The improvements we have made to the tracing pipeline enable us to trace complex self-intersecting molecules for the first time (Supplementary Fig. 11). To test the tracing capabilities of this new pipeline, we apply it to the complex structures of plasmid replication intermediates, which are akin to theta curves. The complexity in these structures includes replication forks, which appear as nodes with an odd number of branches, where a single piece of un-replicated DNA emanates from an intersection with two newly replicated DNA duplexes.

Structural analysis of Xenopus replication products reveals intermediate structures dependent on the stalling complex

We apply our automated pipeline to determine the structure and composition of plasmids replicated, partially or fully, in Xenopus laevis nucleoplasmic egg extract. The well-characterised Lac repressor and Tus-Ter complexes^66,67,68,69 are used to stall or impede replication fork progression (Fig. 3a). This yields partially replicated intermediates (theta structures) or late replication intermediates (figure-of-8 molecules⁸³), which can be observed using AFM (Fig. 3b, c respectively).

**Fig. 3: Automated determination of DNA replication intermediates stalled using either the Lac Repressor protein or the Tus-Ter Complex, at 40 and 80 minute timepoints.**

Replication in the Xenopus extract system proceeds bidirectionally from the origin of replication in the plasmid. This creates two identical replication forks unwinding the DNA, producing two equal-sized newly replicated segments with an unreplicated portion of DNA between them⁸⁴. The Lac repressor protein binds extremely tightly to operator sequences, and arrays of LacI have previously been used to site-specifically stall replication forks in egg extracts due to the protein blockage on the DNA^69,85. To determine whether we can use our pipeline to determine the extent of DNA replication, we sample two timepoints (40 and 80 minutes) after replication was initiated. For both timepoints, we observe that most of the molecules can be identified as theta structures (Fig. 3bi, bii), with two replication forks (arrow heads pointing inwards highlighting the replication fork) and a length of unreplicated DNA (green) between them. The replication forks can be identified as crossings with 3 emanating branches. By isolating these odd-branched crossing regions, we can trace each individual DNA segment and determine the contour length, and thus the length of unreplicated DNA.

Using our pipeline, we can clearly identify three-way junctions and measure the length of DNA between them (green traces) (Fig. 3biii, biv). The length of each segment of DNA between the 3-way junctions is calculated, with the two newly replicated regions calculated as very close in length, and the unreplicated DNA length identified as the shortest segment between two forks. To ensure the fidelity of this measurement, we carry out stalling via an alternative complex, Tus-Ter, which forms a polar fork arrest in E. coli to prevent over-replication and has been used in mammalian cells as a non-polar fork stalling system⁸⁶. In our conditions, Tus-Ter presents a less severe block to replication fork progression. The replication intermediate structures at 40 minutes (T40) show little to no unreplicated DNA (Fig. 3ci) and the 80-minute time point shows most molecules as fully replicated plasmids (Fig. 3cii). The T40 samples were dimeric in length, 2240 ± 77 nm, and present as a figure-of-8 structure, whereas the structures stalled at 80 minutes appeared to be the length of the original plasmid, suggesting the plasmid had fully undergone replication (Fig. 3ciii, civ, g).
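The segment assignment for a theta structure can be sketched as follows, assuming the three segment lengths between the two forks have already been measured. The function name, the returned keys and the example lengths are hypothetical and chosen only for illustration; the rule that the shortest segment is the unreplicated DNA is the one stated above.

```python
def split_theta_segments(lengths_nm):
    """Assign the three segments joining two forks of a theta structure."""
    if len(lengths_nm) != 3:
        raise ValueError("a theta structure has exactly three segments between its forks")
    unreplicated, arm_a, arm_b = sorted(lengths_nm)
    return {
        "unreplicated_nm": unreplicated,           # shortest segment
        "replicated_arms_nm": (arm_a, arm_b),      # two newly replicated duplexes
        "arm_mismatch": (arm_b - arm_a) / arm_b,   # near 0 when the arms agree
        "total_nm": unreplicated + arm_a + arm_b,  # replicated plus unreplicated DNA
    }


# Illustrative lengths only, not measured values.
result = split_theta_segments([1045.0, 313.0, 1040.0])
print(result["unreplicated_nm"])  # 313.0
print(result["total_nm"])         # 2398.0
```

A large arm mismatch would indicate that the two longer segments are not sister duplexes of equal length, so such a molecule could be flagged for manual inspection.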

Additionally, some structures appear to contain forks with ssDNA gaps (Fig. 3d), which can be identified by resolving the height difference between the DNA duplex and the ssDNA where it joins the main replication fork (Fig. 3dii). We also observe reversed replication forks, although they are a rare occurrence (observed in 7.7% of overall structures). Replication forks are defined as 3-way junctions, whereas reversed replication forks are defined as 4-way junctions where one emanating segment is short (<50 nm in length). A reversed replication fork can occur where a replication fork encounters an obstacle such as a DNA lesion, which enables replication machinery to reverse its course, creating a 4-way junction to avoid replicating through the lesion and therefore avoiding genotoxic stress⁸⁷. Our pipeline detects these forks and measures the length of the incomplete branch, which is made possible due to the high resolution of our data (Fig. 3diii, div). To further ensure the validity of our pipeline, we compare the manual traces of the reversed fork length with the automated trace outputs, revealing a remarkably similar output (Supplementary Fig. 15f).
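The junction definitions above can be expressed as a simple rule on the lengths of the branches emanating from a node. This is a minimal sketch with hypothetical names; the 50 nm threshold is taken from the text, while the "crossing" and "unclassified" labels for the remaining cases are assumptions added for completeness.

```python
SHORT_BRANCH_NM = 50.0  # threshold for the short branch of a reversed fork (from the text)


def classify_junction(branch_lengths_nm):
    """Label a node by the number and length of its emanating branches."""
    n_branches = len(branch_lengths_nm)
    if n_branches == 3:
        return "replication fork"
    if n_branches == 4:
        if min(branch_lengths_nm) < SHORT_BRANCH_NM:
            return "reversed fork"
        return "crossing"      # assumed label: 4-way node with no short branch
    return "unclassified"      # assumed label: any other branch count


# Illustrative branch lengths in nm, not measured values.
print(classify_junction([400.0, 410.0, 300.0]))        # replication fork
print(classify_junction([400.0, 410.0, 300.0, 35.0]))  # reversed fork
print(classify_junction([400.0, 410.0, 300.0, 120.0])) # crossing
```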

There is an obvious difference in the composition of the theta structures at 40 and 80 minutes with LacI (Fig. 3e), with a reduction in the length of unreplicated DNA from 313 ± 156 nm (L40) to 218 ± 153 nm (L80). This correlates to an increase in the overall contour length of the theta structures, including replicated and unreplicated DNA (Fig. 3f), from 2400 ± 168 nm after 40 minutes to 2500 ± 219 nm after 80 minutes. This is consistent with LacI inducing a robust slowdown of replication fork progression rather than a terminal stall. At long time points, the array is eventually read-through, and the replication of a plasmid completed (Fig. 3a +LacI 120 minutes). Manual analysis of the unreplicated contour length shows a similar distribution to the outputs from our pipeline with a reduction in the contour length from 361 ± 56 nm to 196 ± 98 nm (Supplementary Fig. 15e).

For the Tus-Ter complex, a similar increase in replication is indicated after 80 minutes; however, this presents as an increase in the number of completely replicated plasmids, i.e., a reduction in overall contour length. The Tus-Ter sample stalled at 80 minutes contains several fully replicated plasmids, measuring 1190 ± 90 nm in length, only 25 nm (2%) from the expected length of 1215 nm for a 3574 bp plasmid. This is approximately half the contour length measured at the T40 time point, further confirming the presence of the figure-of-8 molecules in that sample (Fig. 3g, Supplementary Fig. 15). These data indicate that stalling using the Tus-Ter complex results in late replication intermediates consistent with a termination stall, in comparison to LacI-induced fork blockades.
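The comparison between measured and expected contour length can be reproduced with a short calculation. This sketch assumes a rise of 0.34 nm per base pair for B-form DNA, which matches the expected length quoted above (3574 bp gives approximately 1215 nm). The function names are hypothetical, and the trace length is computed as the summed distance between consecutive trace points, which may differ from the smoothing used in our pipeline.

```python
import math

RISE_NM_PER_BP = 0.34  # assumed rise per base pair for B-form DNA


def expected_contour_length_nm(n_bp):
    """Expected contour length of a duplex of n_bp base pairs."""
    return n_bp * RISE_NM_PER_BP


def percent_deviation(measured_nm, expected_nm):
    """Absolute deviation of a measurement from the expected length, in percent."""
    return 100.0 * abs(measured_nm - expected_nm) / expected_nm


def trace_length_nm(points_px, nm_per_px, closed=True):
    """Contour length of an ordered pixel trace; closed=True joins last to first."""
    pairs = list(zip(points_px, points_px[1:]))
    if closed:
        pairs.append((points_px[-1], points_px[0]))
    return nm_per_px * sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs)


expected = expected_contour_length_nm(3574)
print(round(expected))                            # 1215
print(round(percent_deviation(1190.0, expected), 1))  # 2.1

square = [(0, 0), (10, 0), (10, 10), (0, 10)]     # toy closed trace, 10 px sides
print(trace_length_nm(square, nm_per_px=2.0))     # 80.0
```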

Having established that we can observe and quantify structural changes in replicating DNA plasmids, we now apply our pipeline to explicitly determine the topology of complex knotted DNAs formed by the synapsis of the Xer recombination accessory proteins PepA and ArgR.

Automated topological identification of complex knotted DNA formed by Xer accessory proteins

The Xer accessory proteins PepA and ArgR form an interwrapped nucleoprotein synapse with a linear DNA substrate containing two cer recombination sites (Supplementary Fig. 2). With the addition of DNA ligase, the linear DNA is circularised and generates knotted species of DNA, referred to as “cer ligation”. The class of knots formed depend on the relative orientation of the cer sites on the substrate DNA; direct repeat sites form a mixture of twist-knot ligation products (e.g. 5₂*, 6₁*), while inverted repeat sites form specific chiral forms of torus knots (e.g. 3₁ and 5₁) and not their mirror images (Fig. 4a). The 5-node torus (5₁) and twist (5₂) knots run similarly on a gel (Fig. 4b), and the two chiral forms of each knot⁸⁸ (i.e. the same knot with positive or negative crossing signs) are indistinguishable by routine gel electrophoresis⁷².

**Fig. 4: Automated topological determination of twist and torus-type 5-node knots.**

We obtained AFM images of an unknotted plasmid (0₁) and nicked 5-node knots gel-purified from the pINV and pDIR circularisation reactions, respectively, which show distinct internal DNA crossings by eye (Fig. 4c–e). We trace each molecule using our pipeline to determine their length, crossing order, a reliability score for each crossing, and to classify the knot type (Fig. 4f–h). We determined that the supercoiled unknot plasmid contains two negative crossings as expected for negatively supercoiled, unknotted DNA (Fig. 4f, i). The product from the pDIR reaction contains 5 negative crossings, and is correctly identified as a 5-node twist knot (5₂*) (Fig. 4g, j) while the product from pINV contains 5 positive crossings, and is correctly identified as a 5-node torus knot (5₁) (Fig. 4h, k).

Incorrect identification of the crossing order of a single crossing for these knotted products can lead to incorrect identification as either an unknot or a 3-node knot (Supplementary Figs. 5, 6, 16, and 17). At some crossings, one DNA duplex does not pass distinctly over the other, but instead the crossing has approximately the same height as a single DNA duplex, implying possible compression by the AFM tip or interdigitation of the DNA helices. This is marked by a ratio of FWHM close to 1 and a low crossing order reliability value for the crossing, increasing the difficulty of determining the correct topology for the molecule. We observe this as an increase in the proportion of correct topologies when considering both the original topological classification and the classification obtained by flipping the crossing order of the least reliable crossing (Supplementary Fig. 13).

Molecules with a higher number of expected crossings will therefore have a lower theoretical probability of obtaining the correct topological classification, which can be calculated using combinatorics. This results in a 56% probability of correct classification for 3-node knots, reducing to 37% for more topologically complex 5-node knots (Supplementary Fig. 18). We obtain an 82% crossing order accuracy from our hand-labelled crossing comparison (Supplementary Fig. 12d). This accounts for the presence of derivative topologies, a subset of incorrectly classified molecules within the automated algorithm topology classification. The differing probabilities between the theoretical and observed values are due to the clustering of crossing points where individual crossings are unable to be resolved and thus provide incorrect classifications.
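The dependence of classification probability on crossing number can be approximated with a simple model. This is a sketch under an assumption that is ours for illustration: each crossing is assigned independently with the same accuracy, so a molecule is classified correctly only if all of its crossings are correct. With the 82% per-crossing accuracy quoted above, this model gives roughly 55% for 3 crossings and 37% for 5 crossings, close to the values stated in the text; the small difference for the 3-node case may reflect rounding of the accuracy or details of the full calculation.

```python
def probability_all_correct(per_crossing_accuracy, n_crossings):
    """Probability that every crossing in a molecule is assigned correctly,
    assuming independent crossings with equal accuracy."""
    return per_crossing_accuracy ** n_crossings


for n in (3, 5):
    print(n, round(probability_all_correct(0.82, n), 3))
# 3 0.551
# 5 0.371
```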

Additionally, the crossing order is more likely to be classified incorrectly for images with lower resolution (Supplementary Fig. 19), or for those with clustered crossings, or close passing DNA duplexes that do not cross. The 5-node twist and torus knots are the most frequently misclassified, as their crossings cluster together in entropically favourable conformations. The colocalization of crossings increases the configurational freedom of the rest of the chain, and hence increases entropy⁸⁹. We observe more clustering for the 5-node torus compared to the 5-node twist by both AFM (Supplementary Fig. 20), and coarse-grained simulations (Supplementary Fig. 21), which implies that the availability of an entropic gain is different in different types of knots. The coarse-grained simulations indicate that the effect is dependent on the presence of an adsorptive force towards the surface, equivalent to an electrostatic potential. The adsorptive force increases the confinement strength and confinement free energy, which results in increased colocalisation of the crossings. The adsorptive force also induces a change in the shape and appearance of the knots, which tend to create rosette-like conformations (Supplementary Fig. 21e). Interestingly, the differences between twist and torus 5-node knots are less pronounced in the absence of the adsorptive force (Supplementary Fig. 21c, d).

Conformational variability is enhanced by surface immobilisation

For catenated molecules in particular, we observed a greater variability in conformation than predicted (Supplementary Figs. 22−23). We would expect the nicked 4-node catenanes to deposit on the mica surface predominantly in the open conformation, because they should have no writhe and are ~1 kb in length, which should minimise the chance of self-crossings. However, after immobilisation on the mica surface for AFM visualisation, a much greater range of conformations is observed, which we categorise as “open” (Fig. 5a), “taut” (Fig. 5b), “clustered” (Fig. 5c) or “bow-tie” (Fig. 5d) conformations. We have interpreted these conformations in schematic line diagrams to accommodate the necessary topology, to the right of each respective AFM image (Fig. 5a–d(ii)). The “taut” and “bow-tie” classes are difficult to interpret due to the tight clustering of several nodes; therefore, each schematic diagram shows two possible conformations that accommodate the geometry of the circles and their four catenated crossings.

**Fig. 5: Variability in catenane conformation is driven by topological state.**

These conformations make automated topological tracing challenging due to complex crossing architectures and close-passing DNA duplexes. For example, the “bow-tie” (Fig. 5d) conformation can be identified as an open circle, as the single crossing in the molecule is traced through multiple times along the best aligned segments. We develop an automated classifier that enables us to categorise, by their conformation, catenanes that could not be topologically traced. This classifier calculates the number of nodes and crossing branches for each molecule and assigns it as open, taut, clustered, bow-tie or unclassified, as defined in Supplementary Table 2.

We compare the conformations of nicked and negatively supercoiled 4-node catenanes immobilised using the same magnesium deposition process to see whether the DNA mechanics influence the DNA conformation. The prevalent conformation differs between the nicked and negatively supercoiled species, as determined by both automated and manual classification. The nicked population adopts a wider range of conformations with a preference for the clustered conformation, whereas the supercoiled population shows a much stronger preference for the open conformation (Fig. 5e, f).

This difference can be explained if the clustered and bow-tie conformations arise from deposition of catenanes where the large circle contains a positive supercoil encircled by the small circle. This positively writhed conformation of the large circle should be strongly disfavoured in the negatively supercoiled catenanes. In contrast, there should be no energetic barrier to the large circle adopting a positively writhed conformation in the nicked catenanes. Indeed, positive writhe induced by the wrapping of the large circle around the small circle has been observed experimentally in the nicked Xer catenane⁷¹. To better understand the physical properties that drive the DNA to adopt these conformations upon deposition, we perform coarse-grained simulations of the 4-node catenane (Fig. 5a-d(iii)).

Coarse-grained simulations reveal that twist-writhe partitioning is altered by surface immobilisation

To understand how surface immobilisation affects the DNA conformation upon deposition in divalent ions, we employ coarse-grained molecular dynamics simulations of the 4-node catenane, in both nicked and supercoiled states. We use the conformations obtained from AFM imaging to parameterise our model and reproduce the experimentally observed images (Fig. 5a–d(iii)). To obtain these conformations, we immobilise the DNA in a buffer solution containing magnesium ions, which facilitates weaker binding to the mica surface compared to nickel⁷⁶,⁹⁰, and stiffens the DNA helix⁷⁷,⁹¹,⁹². This immobilisation method allows the molecules to sit in a more open state than is achieved using nickel ions, improving topological classification. AFM measurements also indicate fewer supercoils than expected for the supercoiled catenanes when using this method (Supplementary Fig. 7).

Given native supercoiling of the original DNA substrate and the assumed +4 linkage change of the Xer recombination reaction, the large DNA circle is expected to be underwound by ~4 turns. It has been reported that alternative DNA conformations may be observed at some salt concentrations⁵⁹ and that negatively supercoiled plasmids can lose 70–80% of plectonemic supercoils under conditions which allow for dynamics and equilibration on the surface⁹³. We hypothesise that the DNA helicity may change during deposition. For our models, we reduce underwinding by altering the number of turns defined by the equilibrium dihedral angle, ϕ≤2πΔLk/N⁹⁴. This enables the coarse-grained simulations to achieve more open conformations (Fig. 6b, c) than with the non-adjusted dihedral angle (Fig. 6a), consistent with our AFM experiments. Simulated supercoiled molecules show much more open conformations, as observed in both AFM images and analyses (Fig. 6b). As discussed above, in nicked catenanes, catenation of the small circle around the larger circle can induce a positive writhe in the larger circle, leading to an increased observation of clustered conformations with the larger circle folded upon itself (Fig. 6c).

**Fig. 6: Surface immobilisation drives conformational changes, however, topological species can still be separated by their linking number difference.**

As open conformations can be characterised by a lack of proximity in catenated crossings or a decrease in the number of self-crossings, we look to confirm this finding by characterising the frequency of self-crossings (Fig. 6d) and the distance between crossings (Fig. 6e). For the supercoiled species, the number of molecules with no crossings is three times that of the nicked molecules, confirming the experimentally observed increase in open conformations. For nicked molecules, catenated crossings are observed in closer proximity to one another, with the distribution of distances between catenated crossings forming a single peak with mean and variance of 6 nm, consistent with “clustered”, “taut” and “bow-tie” conformations (Fig. 6e). However, in supercoiled molecules we see a bimodal distribution of catenated crossings, with the first peak sharing the same mean and variance as for nicked molecules, and the second peak having a larger mean (~16 nm) and variance (~19 nm), more consistent with the “open” conformation (Fig. 5e).

For supercoiled molecules, the presence of attractive forces on the surface drives the observed conformations to lower writhe states. This can be quantified by the change in average writhe of the large circle from −2.59 ± 0.05 to 0.31 ± 0.05 following adsorption, indicating a large number of adsorbed conformations are free from self-crossings (Fig. 6f). The nicked molecules changed little upon surface adsorption with writhe in the larger circle 0.87 ± 0.01 at equilibration and 0.73 ± 0.05 upon adsorption (Fig. 6f). The change in writhe of supercoiled species upon adsorption is compensated for by a reduction in twist, with mean twist of the large circle changing from −1.08 ± 0.02 to −3.94 ± 0.03 following surface adsorption (Fig. 6g).
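The writhe and twist values above can be cross-checked against each other: for a closed circle, twist plus writhe gives the linking number difference (the Călugăreanu-White relation), which adsorption should not change. The following is a minimal sketch of that check using only the means reported in this section; treating the reported twist as a difference from relaxed DNA is an assumption made here for illustration.

```python
# Mean writhe and twist of the large circle in supercoiled catenanes,
# copied from the text (equilibrated in solution vs. adsorbed on the surface).
writhe = {"equilibrated": -2.59, "adsorbed": 0.31}
twist = {"equilibrated": -1.08, "adsorbed": -3.94}


def linking_difference(state):
    """Return twist + writhe for one state.

    Assumption: the reported twist is measured relative to relaxed DNA,
    so the sum approximates the linking number difference.
    """
    return twist[state] + writhe[state]


dlk = {state: linking_difference(state) for state in writhe}
for state, value in dlk.items():
    print(f"{state}: Tw + Wr = {value:.2f}")

# A closed circle should conserve Tw + Wr, so the drift should be near zero.
drift = dlk["adsorbed"] - dlk["equilibrated"]
print(f"drift on adsorption = {drift:.2f} turns")
```

The two sums differ by only about 0.04 turns, which is consistent with the statement that the writhe lost on adsorption is compensated by a change in twist.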

To determine which variables drive the differing behaviours of supercoiled and nicked molecules upon deposition, we perform Principal Component Analysis (PCA) on several statistics extracted from simulations, including writhe, twist, linking number difference and radius of gyration, as well as a measure of the distance between the centres of mass of the small and large catenated circles (Fig. 6h). Through PCA, we achieve good separation between nicked and supercoiled molecules, and also between the equilibrated and adsorbed supercoiled species, but much less separation between the equilibrated and adsorbed nicked species. For PC1, the linking number differences of the large and small circles have the largest loadings (0.46 and 0.44, respectively), indicating that these measures contribute most to the variance observed between nicked and supercoiled species. For PC2, the writhe of the small circle, the twist and writhe of the large circle, the radius of gyration and the distance between the centres of mass of the two circles all contribute (0.49, 0.45, 0.41, 0.38 and 0.37, respectively) (Fig. 6i). Statistical distributions of individual metrics calculated from simulations are provided in Supplementary Table 4 and Supplementary Fig. 24, while Supplementary Fig. 25 illustrates how these metrics influence the conformation of DNA.
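To illustrate how loadings of this kind are obtained, the sketch below runs PCA on a small synthetic table of per-molecule metrics using a singular value decomposition. The data, group means, column names and the choice to standardise each column are all assumptions made for illustration; this is not the authors' code or data, and the numbers it prints are unrelated to those in Fig. 6.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-molecule metrics (rows) for two synthetic groups.
# The values are invented for illustration and are not the paper's data.
features = ["writhe", "twist", "delta_lk", "radius_gyration"]
group_a = rng.normal(loc=[0.8, 0.0, 0.0, 40.0], scale=[0.3, 0.3, 0.2, 3.0], size=(100, 4))
group_b = rng.normal(loc=[0.3, -3.9, -3.6, 45.0], scale=[0.3, 0.3, 0.2, 3.0], size=(100, 4))
X = np.vstack([group_a, group_b])

# Standardise each column so metrics with large units do not dominate.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via singular value decomposition of the standardised matrix.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained = S**2 / np.sum(S**2)  # fraction of variance per component
loadings = Vt                    # rows are components, columns are features
scores = Z @ Vt.T                # molecule coordinates in PC space

for name, weight in zip(features, loadings[0]):
    print(f"PC1 loading for {name}: {weight:+.2f}")
print("explained variance ratio:", np.round(explained, 2))
```

Each row of `loadings` is a unit vector, so the magnitude of an entry indicates how strongly that metric contributes to the component, which is how the loading values quoted above can be read.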

Discussion

Atomic Force Microscopy (AFM) imaging has transformed our ability to visualise DNA in aqueous solution, offering high-resolution imaging, albeit via immobilisation on a surface, without the need for protein coating⁴²,⁴³. We have developed a pipeline which enables us to determine the crossing order, i.e., the over- and under-passing strand for each molecule, enabling tracing and topological classification of complex DNA structures, including plasmids, knots, catenanes and replication intermediates. Our method enables us to determine contour length across DNA species and infer levels of writhe or supercoiling (Fig. 1). We show that we can use these metrics to recover a 56:44 classification of a 50:50 mixed population of nicked and supercoiled DNA plasmids using random forest classification (Supplementary Fig. 9), highlighting the wide-scale applicability of our approach.

However, automating the tracing of topologically complex DNA molecules from AFM images is not without its challenges, particularly in tracing close-passing or overlapping strands. We address these challenges by developing novel methods for segmentation, tracing, identification of crossings, determination of crossing order, and topological determination for single DNA molecules in a high-throughput and transparent manner. We can identify the over- and under-passing strand, even in crossings that show minimal height variation, using the full width at half maximum for each crossing segment, and determine a pseudo-confidence to inform downstream classification. This integrated pipeline represents a significant advancement in the field, facilitating the identification of complex DNA structures that are difficult to resolve using traditional methods¹³,³⁴,³⁵ (Fig. 2).
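As an illustration of the full-width-at-half-maximum comparison described above, the following sketch measures the FWHM of the height profile along each strand through a crossing and ranks the strands. This section does not state the decision rule or the pseudo-confidence formula, so both are assumptions here: the strand with the widest profile is taken as over-passing, and the confidence is one minus the ratio of the smallest to the largest width. The function names and the synthetic Gaussian profiles are hypothetical.

```python
import numpy as np


def fwhm(heights, spacing_nm=1.0):
    """Full width at half maximum of a single-peaked height profile.

    The half-maximum level is midway between the profile minimum and
    maximum; edge positions are linearly interpolated between samples.
    """
    h = np.asarray(heights, dtype=float)
    half = h.min() + 0.5 * (h.max() - h.min())
    above = np.where(h >= half)[0]
    left, right = above[0], above[-1]

    def cross(i_out, i_in):
        # Position where the profile crosses `half` between two samples.
        return i_out + (half - h[i_out]) / (h[i_in] - h[i_out]) * (i_in - i_out)

    x_left = cross(left - 1, left) if left > 0 else float(left)
    x_right = cross(right + 1, right) if right < len(h) - 1 else float(right)
    return (x_right - x_left) * spacing_nm


def rank_crossing(profiles, spacing_nm=1.0):
    """Rank strands at one crossing by FWHM (assumed rule, see lead-in)."""
    widths = [fwhm(p, spacing_nm) for p in profiles]
    over = int(np.argmax(widths))                 # assumed: widest = over-passing
    confidence = 1.0 - min(widths) / max(widths)  # assumed pseudo-confidence
    return over, widths, confidence


# Synthetic Gaussian height profiles (nm) sampled every 0.5 nm.
x = np.arange(-10, 10.5, 0.5)
branch_a = 2.0 + 1.5 * np.exp(-x**2 / (2 * 3.0**2))  # broad peak
branch_b = 2.0 + 1.5 * np.exp(-x**2 / (2 * 1.5**2))  # narrow peak
over, widths, confidence = rank_crossing([branch_a, branch_b], spacing_nm=0.5)
print(over, [round(w, 2) for w in widths], round(confidence, 2))
```

A width-based measure of this kind can still separate two strands whose peak heights are nearly identical, which is the situation the text describes for crossings with minimal height variation.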

We apply our pipeline to analyse the structure of DNA plasmids incubated in nucleoplasmic Xenopus egg extract, a sample that contains a mix of replicated and unreplicated plasmids and theta structures. We show that we can identify replication forks as 3-way junctions and measure the lengths of replicated and unreplicated DNA between them, as well as additional features such as reversed forks (Fig. 3).

By studying the Xer recombination system of E. coli¹⁸,⁷⁰,⁹⁵, we show our method can determine the topology of individual DNA molecules, successfully differentiating between different knot types of the same node number, which are challenging to distinguish by gel electrophoresis. Deriving a 3D topological classification from a topographic image has its challenges, even by eye. As a result, we developed a data cleaning pipeline to remove molecules with clustered crossings from the topological classification analysis, where it was not possible to define a crossing order within the clustered crossing region and accurately extract crossing information from topography. To improve topological determination, each crossing in the automated trace is assigned a reliability score, and the crossing order of singular low-reliability crossings can be inverted. By manually calculating all the possible sign conformations of these AFM-traced molecules, we observe no instances of misidentifying a twist knot as a torus knot, or vice versa (Fig. 4).

Additionally, we observe conformational variability in single topological species, indicating the impact of surface immobilisation and how this varies in different topological species. We uncover unexpected conformational preferences in nicked and negatively supercoiled catenanes, using both manual and automated analysis. These conformational preferences may arise from catenation of the small circle around the larger circle, which likely induces positive writhe in nicked catenanes. Conversely, negative supercoiling in negatively supercoiled samples discourages positive supercoiling, resulting in fewer clustered molecules and more open conformations (Fig. 5).

To determine the driving forces behind the changes in conformation we observe for supercoiled and nicked molecules, we use coarse-grained simulations. We parameterise these using the AFM experimental data and determine that a change to the twist-writhe partitioning of the DNA molecules is necessary to achieve the “open” conformations obtained experimentally. However, the simulations also demonstrate that despite changes to twist, writhe, radius of gyration and molecular separation on adsorption of DNA molecules to a surface, the biggest discriminating factor between the nicked and supercoiled species is still linking number difference (Fig. 6).

Our automated pipeline can identify, trace and classify individual topologically complex DNA molecules across a variety of different unknotted, knotted, catenated and partially replicated plasmids to obtain quantitative and descriptive metrics. Ideal performance is achieved with resolutions between 0.5 and 1 nm/pixel (Supplementary Fig. 19) and when all crossing points are visible and assigned with high reliability, leading to precise contour length estimations (within 7% of predicted length). Though it is theoretically possible to segment and separate molecules that have adsorbed on top of one another, we did not observe these due to the concentration at which this study was performed. The ability to separate overlapping molecules will depend on the orientation of the molecules at the point where they overlap. If overlapping segments are nominally perpendicular to one another, these can be paired effectively and the individual molecules separated.
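The 7% contour-length criterion above can be expressed as a simple check of a traced length against the length predicted from the number of base pairs. The sketch below assumes the canonical B-form rise of 0.34 nm per base pair, which this section does not state, and uses the 2280 bp plasmid length given in the Methods; the two measured lengths are hypothetical values chosen to fall inside and outside the tolerance.

```python
RISE_NM_PER_BP = 0.34  # canonical B-form DNA rise; assumed, not stated in this section


def predicted_contour_nm(n_bp):
    """Predicted contour length in nm for a molecule of n_bp base pairs."""
    return n_bp * RISE_NM_PER_BP


def within_tolerance(measured_nm, n_bp, tol=0.07):
    """True if the measured length is within `tol` (fractional) of prediction."""
    expected = predicted_contour_nm(n_bp)
    return abs(measured_nm - expected) / expected <= tol


expected = predicted_contour_nm(2280)  # pDIRECT / pINVERT length from Methods
print(f"predicted: {expected:.1f} nm")
print(within_tolerance(750.0, 2280))  # hypothetical trace length, about 3% short
print(within_tolerance(700.0, 2280))  # hypothetical trace length, about 10% short
```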

We observe conformational variation within topological classes, induced by DNA deposition, entanglement, electrostatic interactions and individual flexible sites. Immobilisation conditions must be altered depending on what structures are being observed, and what metrics are desired. In order to determine the supercoiling of small plasmids, NiCl₂ immobilisation provides a more representative measure of writhe. When attempting to determine the topology of larger or more complex structures, MgCl₂ should be used, as it will create more open conformations with fully separated crossings, improving the ability to trace the molecule.

Our pipeline can be applied to a range of DNA and RNA structures, including those interacting with proteins, and opens avenues for understanding fundamental biological processes which are regulated by or affect DNA topology. Beyond nucleic acids, this approach has wider applications in structural biology, from fibrillar structures to polymeric networks and could inform further advances in DNA nanotechnology and structural biology.

Methods

Strains used throughout this work can be found in Table 1.

Table 1 E. coli strains used during this work and their associated origin

Protein purification

PepA was overexpressed in an argR^- E. coli strain DS956. PepA cell pellets were resuspended in a lysis buffer [50 mM Tris pH 8, 1 M NaCl, 1 mM EDTA] containing protease inhibitor cocktail (cOmplete™ Mini, Roche) and lysed by sonication. The lysate supernatant was heated to 70 °C for 10 minutes, conditions under which PepA remains soluble, and denatured proteins were then removed by centrifugation at 40,000 g for 20 minutes at 4 °C. PepA was then further crudely purified by ammonium sulphate precipitation to 50% saturation (0.291 g/ml at 0 °C). The ammonium sulphate precipitation pellet was resuspended using a resuspension buffer [20 mM Tris pH 8, 200 mM NaCl, 1 mM EDTA, 2 mM DTT], then dialysed overnight (4 °C) against a low ionic strength dialysis buffer [20 mM Tris pH 8, 10 mM NaCl, 1 mM EDTA, 2 mM DTT], causing precipitation of the PepA within the dialysis tubing. PepA was then further purified by two FPLC (Fast Protein Liquid Chromatography) steps. The precipitated PepA was collected by centrifugation at 40,000 g for 20 minutes at 4 °C, then resuspended with PepA Buffer C [20 mM Tris pH 8, 200 mM NaCl, 1 mM EDTA, 10% Glycerol]. The crude PepA solution was loaded onto a HiTrap Heparin-HP 1 ml column (GE Healthcare), then washed with PepA Buffer A [PepA Buffer C without Glycerol]. A gradient elution was performed from PepA Buffer A to PepA Buffer B [20 mM Tris pH 8, 2 M NaCl, 1 mM EDTA]. Elution fractions were combined into a single pool that was subsequently adjusted to conditions equal to those of PepA Buffer D [20 mM Tris pH 8, 1 M (NH₄)₂SO₄, 1 M NaCl, 1 mM EDTA]. The adjusted pool was loaded onto a 1 ml Phenyl-HP hydrophobic interaction column (GE Healthcare) pre-equilibrated with PepA Buffer D, then washed with the same buffer. Protein was eluted during a gradient from PepA Buffer D to PepA Buffer E [20 mM Tris pH 8, 1 M NaCl, 1 mM EDTA]. Fractions at peak absorbance were pooled and dialysed against PepA storage buffer [50 mM Tris pH 8, 1 M NaCl, 1 mM EDTA, 2 mM TCEP, 50% glycerol] and stored at −20 °C.

ArgR was overexpressed in a pepA^- E. coli strain DS957 and purified essentially following the procedure of Burke et al.⁹⁶, but using FPLC steps of heparin-affinity chromatography on a HiTrap Heparin-HP 1 ml column (GE Healthcare), followed by anion exchange chromatography on a MonoQ 5/50GL (GE Healthcare) column. ArgR was bound to each column using “2xTM” buffer + 50 mM NaCl [20 mM Tris pH 7.5, 20 mM MgCl₂, 2 mM DTT, 50 mM NaCl], and then eluted from each column using “2xTM” buffer + 1 M NaCl. MonoQ-purified pools of ArgR were dialysed to ArgR storage buffer [10 mM Tris pH 7.5, 10 mM MgCl₂, 75 mM NaCl, 2 mM TCEP, 50% Glycerol], and stored at −20 °C.

An MBP-XerC fusion protein was overexpressed in a xerC^- xerD^- E. coli strain DS9029. The cells were resuspended in MBP lysis buffer [20 mM Tris pH 8, 200 mM NaCl, 1 mM EDTA, 2 mM DTT] and lysed by sonication. FPLC was performed by loading the lysate supernatant onto a 1 ml MBP-Trap column (GE Healthcare), followed by a wash step using the same lysis buffer, and then elution with the same buffer containing 10 mM maltose. Pooled elution fractions were subjected to a second FPLC stage using a 1 ml HiTrap Heparin-HP column (GE Healthcare). The bound MBP-XerC was washed using heparin loading buffer [50 mM Tris pH 8, 200 mM NaCl, 1 mM EDTA, 2 mM DTT, 10% glycerol], then eluted by a gradient to the same buffer containing 1 M NaCl. Peak elution fractions were pooled and dialysed against a storage buffer [50 mM Tris pH 8, 1 M NaCl, 1 mM EDTA, 2 mM TCEP, 50% Glycerol] and stored at −20 °C. MBP-XerC was used as an active recombinase without removal of the solubility tag.

XerD was overexpressed in xerC^- xerD^- strain DS9029 and purified by native (non-His-tagged) nickel-affinity FPLC as performed in Subramanya et al. ⁹⁷, using a 1 ml HisTrap-HP column (GE Healthcare), then by a Heparin-affinity purification using a 1 ml HiTrap Heparin-HP column (GE Healthcare). Other FPLC steps from Subramanya et al. were omitted. Enriched Heparin-affinity elution fractions of XerD were pooled and dialysed against a storage buffer [50 mM Tris pH 8, 1 M NaCl, 1 mM EDTA, 2 mM TCEP, 50% Glycerol] and stored at −20 °C.

Creating a miniaturised Xer catenation substrate with uneven product circles

The 3406 bp plasmid pSDC153⁷² contains two directly repeated end-to-end psi recombination sites. This psi-site doublet was excised from pSDC153 by restriction digest using flanking restriction enzymes EcoRI and HindIII, and ligated into the 885 bp microplasmid vector piAN7⁹⁵ (Supplementary Fig. 1b). piAN7 contains the minimal DNA sequences required for plasmid replication by utilising a supF amber-suppressor tRNA selection strategy. For selection, piAN7 must be paired with an amber-codon mutagenized ampicillin/tetracycline resistance plasmid “p3” (60 kb). The newly miniaturised psi recombination substrate, p4CAT, is 1651 bp (Supplementary Fig. 1c). As piAN7-based plasmids rely on amber-suppression for antibiotic selection, the host strain must have a sup⁰ genotype. Empty piAN7 vector was maintained in sup⁰ strain HMS174 and p4CAT was maintained in sup⁰ xerC^- strain DS953xerC-y17. Amber suppressor selection plasmid p3 was freshly conjugated into both of these strains prior to use. piAN7 and derivatives were maintained by antibiotic selection on solid or liquid media with 25 µg/ml ampicillin and 10 µg/ml tetracycline. Xer recombination reactions of p4CAT were analysed by restriction digest with various restriction endonucleases (HindIII or BamHI) and/or nickases (Nb.BssSI or Nt.BsmAI) to reveal the relative mobility of the constituent circles within the p4CAT catenane under supercoiled/nicked/linear conditions by gel electrophoresis (Supplementary Fig. 1d).

Bulk DNA substrate preparation

Large preparations of p4CAT DNA were made using a QIAGEN Maxiprep kit. p3 amber-selection “helper” plasmid DNA was selectively removed from plasmid preparations by a 1-pot endo- and exonuclease digestion reaction protocol adapted from Balagurumoorthy et al.⁹⁸. Such digestion reactions contained EagI-HF restriction endonuclease for multi-linearization of p3, along with lambda-exonuclease and recJ^F-exonucleases for 5′-3′ dsDNA and 5′-3′ ssDNA digestion, respectively. A typical large preparation of substrate contained roughly 200 µg DNA, incubated in 1x NEB Cutsmart buffer conditions for 20 hours at 37 °C with 50 U of EagI-HF, 30 U of Lambda-exonuclease, and 260 U of recJ^F. Enzymes were removed from the reaction products by extraction with phenol:chloroform:isoamylalcohol (25:24:1), followed by two treatments of the aqueous phase with chloroform:isoamylalcohol (24:1) to remove traces of phenol. The product DNA was ethanol precipitated and resuspended in TE buffer.

Assembly of pDIRECT/pINVERT

pKS492 and pKS493⁹⁹ each contain a different orientation of the 280 bp, HpaI-TaqI “cer-site” fragment from the ColE1 natural plasmid, within the pUC18 cloning vector MCS¹⁰⁰. Restriction-digested fragments of pKS492, pKS493, and a PCR amplified chloramphenicol resistance (cmR) were combined in a 3-part DNA ligation to assemble a cer site flanking either side of cmR in a pUC18 vector. This generated four new plasmids containing the four different combinations of cer site orientations with respect to the fixed orientation of cmR. Subsequently, the HindIII-EcoRI restriction fragments containing cer-cmR-cer in the direct- or inverted-repeat site orientations were cloned into piAN7, generating the 2280 bp plasmids pDIRECT and pINVERT. Due to the cmR marker, it was not necessary to maintain these plasmids in sup⁰p3 E. coli strains. pDIRECT was cloned and amplified in strain FC33 without issue. Due to its large inverted repeats, pINVERT had to be cloned and amplified in JAM103, an xerC^- derivative of the commercially available strain SURE (Agilent), which tolerates inverted-repeat DNA. Large quantities of contaminating gDNA in JAM103-derived DNA preparations were removed using the same 1-pot endo-exonuclease reaction described above.

DNA knot pINVERT substrate preparation

DNA preparations of pINVERT from JAM103 contained less than approximately 10% multimeric intermolecular plasmid recombination products. Upon linearisation with HindIII, these multimers yield incorrect DNA orientations, and therefore contaminate the linear DNA in downstream ligation-knotting reactions. Homogeneous pINVERT of the correct type was purified by large-scale low-melting point gel electrophoresis of supercoiled pINVERT plasmid, followed by excision of the correct monomeric supercoiled species. This DNA was then extracted from the low-melt agarose. The homogeneous supercoiled pINVERT was then linearised with HindIII and used for experiments.

Xer recombination reactions

In vitro Xer recombination reactions were performed in a buffer containing 50 mM Tris-HCl pH 8, 25 mM KCl, 1.25 mM EDTA, 5 mM spermidine, 25 μg/ml BSA, and 10% glycerol. Supercoiled plasmid DNA was added to a final concentration of 21 nM. Proteins were added to final concentrations of ~300 nM PepA (Hexamer concentration) and ~250 nM XerC and XerD (monomers). Reactions were incubated at 37 °C for 60 minutes, then the DNA was purified from the reaction components by phenol:chloroform:isoamyl extraction (25:24:1), followed by two treatments with chloroform:isoamyl, then ethanol precipitation and resuspension in TE buffer. Restriction digests and nicking-endonuclease reactions on purified recombinant DNA were carried out as per the manufacturer’s instructions.

DNA catenane purification for AFM

Large quantities of DNA catenanes were homogeneously purified away from un-recombined substrate DNA by agarose gel extraction. Extractions of supercoiled catenanes used the QIAGEN gel extraction kit as per the manufacturer’s instructions. Nicked catenanes (treated with Nt.BsmAI - singly nicking both circles) were extracted using 1% low-melting point agarose (SeaPlaque GTG, Lonza) gels using β-Agarase I (NEB) digestion. Low-melting point gel slices were weighed and an equal volume of nuclease-free water was added. 10x β-Agarase I Buffer was added to equal 1x final concentration. The gel slice was then crushed into a paste with a glass rod, melted at 65 °C for 10 minutes with repeated vortexing, and cooled to 42 °C. 2U of β-Agarase I enzyme was finally added per 200 mg of gel-slice, and the digestion reaction incubated at 42 °C for 16 hours. Agarase reactions were centrifuged at 12,000 g for 10 minutes and their supernatants were transferred to fresh tubes. The extracted DNA supernatants were further purified by the phenol:chloroform-ethanol precipitation steps and resuspended in TE buffer for use. The homogeneity of these gel-extracted samples prior to AFM imaging was observed by gel electrophoresis (Supplementary Fig. 1e).

DNA knot generation

DNA knots were generated by circularisation of linear DNA substrates with T4 DNA ligase in the presence of Xer accessory proteins PepA and ArgR, approximately following the procedure described by Alén et al.⁷². Each reaction contained 17 nM linearised substrate DNA (pDIR or pINV) with 250 nM PepA (hexamer) and 30 nM ArgR (hexamer) in a reaction buffer condition (1 x “cer buffer”) of 50 mM Tris HCl pH 7.5, 25 mM NaCl, 1 mM L-arginine, 25 µg/ml BSA, 2.5 mM spermidine, 2.5 mM DTT. Reaction mixtures were then incubated at 37 °C for 15 minutes for nucleoprotein synapse formation. This was followed by addition of “ligase mix” (“cer buffer” plus T4 DNA ligase, 30 mM MgCl₂, 3 mM ATP), giving final concentrations of 50 units T4 DNA ligase, 10 mM MgCl₂, and 1 mM ATP. DNA was ligated for 60 minutes at room temperature, then reactions were heat-inactivated at 65 °C for 15 minutes. Finally, the reaction products were nicked to remove supercoiling by incubation at 37 °C with Nt.BsmAI or Nb.BssSI nicking endonucleases (NEB) within the existing reaction mixture. For AFM analysis, the mixed knot production reactions were scaled up and finished with DNA purification by phenol:chloroform:isoamyl alcohol extraction (25:24:1), then two extractions with chloroform:isoamyl alcohol (24:1), followed by ethanol precipitation and resuspension in TE buffer. Single knot types (e.g., 5-torus, or 5-twist alone) were isolated by the separation of nicked knot reactions in 1% low-melting point agarose gels (SeaPlaque agarose, Lonza), followed by excision of specific bands. DNA extractions from the agarose slices used β-agarase (NEB) enzymatic digestions as per the manufacturer’s instructions. Crude β-agarase extracted DNA was further purified for AFM by phenol:chloroform:isoamyl alcohol extraction (25:24:1), then two extractions with chloroform:isoamyl alcohol (24:1), followed by ethanol precipitation and resuspension in TE buffer.
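The stock and final concentrations quoted for the ligase mix imply a mixing ratio, which the short calculation below works out. It uses only the MgCl₂ and ATP values from the text. Whether the 17 nM DNA concentration refers to the reaction before or after the ligase mix is added is not stated, so the last step, which treats it as the pre-addition value, is an assumption.

```python
from fractions import Fraction

# Concentrations in the "ligase mix" and the stated final concentrations (mM).
stock = {"MgCl2": Fraction(30), "ATP": Fraction(3)}
final = {"MgCl2": Fraction(10), "ATP": Fraction(1)}

# Fraction of the final volume that must be ligase mix, per component.
mix_fraction = {name: final[name] / stock[name] for name in stock}
assert len(set(mix_fraction.values())) == 1  # both components imply the same ratio

f = mix_fraction["MgCl2"]       # share of the final volume that is ligase mix
reaction_per_mix = (1 - f) / f  # volumes of reaction per volume of ligase mix

# Components present only in the pre-ligation reaction are diluted by (1 - f).
# Assumption: 17 nM is the DNA concentration before the mix is added.
dna_before_nM = Fraction(17)
dna_after_nM = dna_before_nM * (1 - f)

print(f"ligase mix fraction of final volume: {f}")
print(f"reaction volumes per volume of mix: {reaction_per_mix}")
print(f"DNA after adding mix: {float(dna_after_nM):.1f} nM")
```

Both components give the same ratio, one volume of ligase mix to two volumes of reaction. Because the ligase mix is made up in cer buffer, the buffer components stay at their working concentrations while DNA and proteins are diluted.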

DNA electrophoresis

For topological DNA electrophoresis, 1% agarose gels were used with a modified TAE running buffer “TSAE” [40 mM Tris acetate, 20 mM sodium acetate, 1 mM EDTA] using 23 × 16 cm gel electrophoresis kits, for 16−24 hours (1.5 V/cm). Gels were stained with SYBR® Gold DNA stain and laser-scanned using a GE Healthcare Typhoon FLA9500 laser scanner at 450 V PMT on the SYBR® Gold preset (474 nm; LBP filter). Unprocessed gel images are included with the source data.

Replication intermediate production from Xenopus egg extracts

Reagents and chemicals

Creatine phosphate (PC), disodium salt (Sigma), was made up as a 1 M stock dissolved in 0.1 M KHPO₄ and stored at −20 °C as single use aliquots. Creatine phosphokinase (CPK - Roche) was dissolved in a buffer containing 10 mM Hepes (pH 7.5), 50% glycerol and stored at −20 °C. ATP (Cytiva) was prepared as a 180−200 mM aqueous solution at pH 7.5, adjusted with NaOH. 3,4-Dihydroxybenzoic acid (PCA – Sigma) was prepared as a 250 mM solution in water with pH adjusted to 8.0 with KOH. 4,5′,8-Trimethylpsoralen (TMP – Sigma) was dissolved in EtOH at 1 mg/mL and stored in the dark at −20 °C. Phenol:chloroform (Sigma), Chloroform.

DNA substrates

pJD97 – Lac48 was a gift from Professor James Dewar. pBS SK + (Agilent). pET11a-LacI-bio was a gift from Dr Kenneth Marians (MSKCC). pUCattB-Ter24 was digested with HindIII and ligated to a 5′-phosphorylated oligonucleotide duplex (IDT) containing 4 TerB sequences to yield pUCattB-Ter4. GGATCCTCACACCTACAAGGGATGTACATCAATTAGTATGTTGTAACTAAAGTGTTAGGGAGGAATTAGTATGTTGTAACTAAAGTTGGAGTTGATAATTAGTATGTTGTAACTAAAGTGGCTTCAACGTAATTAGTATGTTGTAACTAAAGTTCCGTACGAATGTGCCGAACTTATAAGCTT. The repeat array was excised from pUCattB-Ter4 with BsrGI and BsiWI, gel purified and religated into the cut vector. Colonies were picked and sequenced to yield plasmids with 24 TerB repeats, which were verified by sequencing and restriction digest mapping.

StrepII-Tus expression and purification

The protein sequence (E. coli | EG11038) of Tus was fused to the following N-terminal tandem Strep-tag II and linker sequence: MGSAWSHPQFEKGGGSGGGS GGSAWSHPQFEKGGGS. DNA sequences were codon optimised for E. coli and synthesised to be in frame with pET28a by Twist Biosciences. 4 L of BL21(DE3) transformed with pET28a-Strep_Tus were grown in 2XYT broth containing kanamycin (30 µg/mL) at 37 °C until an OD of 0.5-0.7 was reached, and protein expression was induced by the addition of IPTG to 1 mM. Growth was continued for 3 hrs at 37 °C and cells were harvested by centrifugation at 31,000 g. Cell pellets were resuspended in a buffer containing 50 mM tris.Cl (pH 7.5) and 10% sucrose and stored at −80 °C. Cell suspensions were thawed and supplemented with 4 EDTA-free protease inhibitor tablets (Roche), PMSF (0.1 mM), EDTA (0.1 mM) and DTT (1 mM). Cells were lysed on ice using a Branson 450 sonifier at 70% power and 50% duty for 4 × 1 min with intervening rests on ice-water for 5 mins. The lysate was made up to 500 mM NaCl and clarified by centrifugation at 20,000 RPM, 31,000 g in a JA 25.50 rotor for 30 mins at 4 °C. Polymin P was added to the lysate to 0.4% with stirring at 4 °C for 15 mins. Nucleic acids were pelleted at 10,000 RPM for 20 mins (JA 25.50) at 4 °C. The supernatant was removed and precipitated with solid ammonium sulphate that was added slowly to 80% saturation (0.53 g/mL) at 4 °C. The suspension was stirred for 30 mins and then spun at 20,000 RPM (JA 25.50) for 30 mins at 4 °C. The pellet was resuspended in 40 mL TED25 buffer, 0.2 µm filtered and applied to 2 × 5 mL StrepTrap XT (Cytiva) columns run in series at 1 mL/min at 4 °C. The columns were washed at 1 mL/min until UV reached baseline and proteins were eluted with 60 mL TED150 + 50 mM biotin. Peak fractions were pooled and dialysed overnight against the TED75 buffer at 4 °C. The protein solution, which showed slight precipitation, was 0.2 µm filtered and applied at 1 mL/min to a MonoQ 5/50 GL column that was equilibrated in TED100 buffer at 4 °C. Tus protein was collected from the flowthrough and precipitated by addition of solid ammonium sulphate to 80% saturation with stirring for 30 mins. Protein was recovered by centrifugation at 20,000 RPM (JA 25.50) for 30 mins and the pellet resuspended in TED100 + 25% glycerol. Protein was dialysed overnight at 4 °C against the same buffer. Protein concentration was determined using an extinction coefficient of 50482.5 M⁻¹ cm⁻¹.

bioLacI expression and purification

4 L of BL21(DE3)-pBirACm transformed with pET11a-LacI-bio were grown in LB broth containing ampicillin (100 µg/mL) and chloramphenicol (34 µg/mL) at 37 °C. Protein expression was induced at an OD of 0.5–0.6 with the addition of IPTG and d-biotin to 1 mM and 50 µM, respectively, and growth continued for 3 hrs at 37 °C. Cells were harvested by centrifugation at 31,000 g and pellets were resuspended in 50 mM tris pH 7.5, 100 mM NaCl, 10% sucrose and stored at −80 °C.

Cells were gently thawed and supplemented with 4 EDTA-free protease inhibitor tablets (Roche), PMSF (0.1 mM) and DTT (1 mM). Cells were lysed on ice using a Branson 450 sonifier at 70% power and 50% duty for 4 × 1 min with intervening rests on ice-water for 5 mins. The lysate was clarified by centrifugation at 20,000 RPM, 31,000 g in a JA 25.50 rotor for 30 mins. The supernatant was made 35% saturated with ammonium sulphate (0.21 g/mL) and stirred at 4 °C for 30 mins. The suspension was spun at 20,000 RPM (JA 25.50) for 30 mins. The pellet was resuspended in a buffer containing 50 mM tris.Cl pH 7.5, 0.1 mM EDTA, 1 mM DTT and 100 mM NaCl (abbreviated as TEDX, where X denotes the NaCl concentration in mM). The protein solution was 0.2 µm filtered and applied at ~1 ml/min to a 10 mL softlink avidin (Promega) column that had been equilibrated in TED100 at room temperature. The flow-through was collected and reapplied to the column. The column was washed with 50 ml TED100. Proteins were eluted with 25 mL TED100 + 5 mM biotin. After 30 minutes, a second elution was performed with another 25 mL TED100 + 5 mM biotin. Eluates were pooled and dialysed overnight in TED60 at 4 °C. The protein solution was 0.2 µm filtered and loaded at 2 ml/min onto a 5 mL HiTrap Heparin HP column (Cytiva) that was equilibrated with TED60 at 4 °C. The column was washed with TED60 buffer until UV reached a stable plateau. LacI was eluted with a gradient of TED60-TED1000 over 15 CV. Peak fractions were pooled and dialysed overnight against TED100 + 10% glycerol. Protein concentration was determined using an extinction coefficient of 28482.5 M⁻¹ cm⁻¹.

Extracts

HSS (High-Speed Supernatant) and NPE (NucleoPlasmic Extract) egg extracts were made as described in Sparks et al. ¹⁰¹ with the following modifications. All buffers apart from L-cysteine were chilled at 4 °C overnight and moved to room temperature at the start of the procedure. Crude S phase extract was collected by puncturing tubes with a 5 mL syringe connected to an 18 G needle instead of by gravity. Crude S phase extract was diluted with 0.1 vol ELB buffer prior to ultracentrifugation when making HSS.

Bulk replication reactions

DNA templates were licensed at a final concentration of 10 ng/µL by mixing with HSS extract that had been supplemented with ATP (2 mM), CPK (5 µM) and PC (20 mM) and incubated at room temperature for 5 mins. Licensing was carried out for 30 minutes. Replication was initiated by adding 2 volumes of NPE that was also supplemented with ATP (2 mM), CPK (5 µM), PC (20 mM) and dATP[α-32P] (Perkin Elmer) (1/50 vol) and equilibrated at room temperature for 15−20 mins. Timepoints from the point of NPE addition are denoted in figure legends. At indicated times, 1.5 µL of the reaction mixture was removed and added to 12 volumes of a stop solution to give final concentrations as follows: EDTA (50 mM), Ficoll 400 (5%, w/v), SDS (2.5%, w/v), bromophenol blue (0.0625%, w/v) and proteinase K (1 mg/mL). Reactions were deproteinised for 1–1.5 hrs at 37 °C prior to being run on 1% TAE gels (12 × 14 cm) at 25 V for 16.5 hrs. The gel was trimmed just above the dye front and sandwiched between a layer of filter paper and Hybond N+ membrane. A stack of paper towels was placed on top of the gel, followed by a ~1.5 kg weight, for 30–45 mins. The squashed gel was then dried under vacuum at 80 °C for 1.5 hours. After drying, the filter papers were removed and the dried gel with the attached Hybond membrane was exposed to a storage phosphor screen and imaged using a Typhoon imager (Cytiva). Each NPE preparation was diluted (typically 50–70%) with an amount of LFB1/50 REF buffer that gave maximal rates and extents of DNA replication.

For experiments that induce site-specific fork stalling, LacI (163 pmols) or Tus (279 pmols) proteins were prebound to DNA templates (300 ng) in their storage buffer for 10 mins at room temperature. The entire volume was added to HSS to license the DNA at a final concentration of 10 ng/µL. An appropriate volume of this mixture was removed after 30 mins and added to 2 vols of NPE as described above. Sample processing and gel electrophoresis were performed as described above.

AFM sample preparation

Samples for Atomic Force Microscopy were prepared by setting up licensing reactions as described above at 15 ng/µL for 30 mins prior to replication initiation with 2 vols of NPE. Reactions (26 µL) were terminated at indicated times by the addition of 4 volumes of stop buffer containing 62.5 mM tris.Cl pH 7.5, 0.625% SDS and 12.5 mM MgCl₂, conditions that slow branch migration. TMP was added to 10 µg/mL and samples were crosslinked by irradiation at 340 nm with a UV Stratalinker 3400 for 10 mins. This process was repeated three times in total. RNaseA was added to a final concentration of 1 mg/mL and samples were digested for 3 hrs at 37 °C. Proteinase K was added at 1 mg/mL and samples were incubated for a further 1 hour at 37 °C. Nucleic acids were extracted twice with phenol:chloroform and then with chloroform before ethanol precipitation. DNA was resuspended in 10 mM tris.Cl pH 8.0. For each DNA species. For mixed population samples, 3 ng each of supercoiled and relaxed unknot pDIRECT plasmid DNA were premixed in an Eppendorf tube to make 6 ng total DNA, which was then immobilised as described below with either nickel chloride or magnesium chloride buffer.

AFM imaging

Atomic Force Microscopy of DNA samples was performed by depositing ~5 ng of DNA onto discs of freshly cleaved muscovite mica. A ~20 µl droplet of either nickel [20 mM HEPES pH 7.4, 3 mM NiCl₂] or magnesium [10 mM TRIS, 25 mM MgCl₂] adsorption solution was placed on the mica disc, then the 5 ng DNA sample was pipetted into the droplet and mixed thoroughly by further pipetting⁶⁰. The sample disc was then incubated at room temperature for 30 minutes (nickel solution) or 5 minutes (magnesium solution) in an enclosed humid container to reduce evaporation. After incubation, the sample was washed three times to remove any non-adsorbed DNA by the addition of nickel imaging solution [20 mM HEPES pH 7.4, 3 mM NiCl₂], vigorous pipetting, then removal of the same volume. The volume of the solution droplet was then increased to 25 μl with imaging solution prior to imaging. Sample scans were captured using either a Multimode 8 AFM system (Bruker) or a Dimension XR FastScan (Bruker) using PeakForce Tapping mode. PeakForce Tapping mode has lower throughput than standard Tapping mode, but it was used because it allows accurate force control to prevent damage to DNA and easier quantification of imaging forces, giving consistent high-quality imaging with parameters that rarely vary by more than 5%^44,60,62. For Multimode use, PeakForce HiResB cantilevers (Bruker) and PeakForce Tapping mode were used. Initially, low-resolution scans [e.g., 1−2 μm² scan size, 3 Hz scan rate, 256 × 256 lines per scan] were used to quickly identify areas of interest. High-resolution scans [e.g., 0.18–1 μm² scan sizes, 1.41 Hz scan rate, 512 × 512 lines per scan] of molecules of interest were carried out with fine-tuned operational settings for maximum imaging resolution [20 nm PeakForce amplitude, 4 kHz PeakForce frequency, and PeakForce setpoints in the range of 5–20 mV, corresponding to peak forces of <70 pN]. For Dimension XR FastScan use, FastScan-D cantilevers (Bruker) and PeakForce Tapping mode were used. Larger-area scans were taken at 512 × 512 pixels at a scan rate of between 4 and 8 Hz. High-resolution images were taken at 1024 × 1024 pixels, with an image size (0.3–0.7 µm²) chosen to ensure a resolution better than 0.75 nm/pixel, at a scan rate of 3–4 Hz. For each sample, a minimum of 5 separate immobilisations were performed with multiple different molecules imaged per condition, totalling 400 images. The number of molecules selected for analysis is shown in the caption of the corresponding figure.
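The relationship between scan size, pixel count and pixel size quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the quoted image sizes are edge lengths in micrometres (the text writes them as µm²), and the function name `pixel_size_nm` is hypothetical.

```python
def pixel_size_nm(scan_edge_um: float, pixels_per_line: int) -> float:
    """Lateral size of one pixel in nanometres for a square scan."""
    return scan_edge_um * 1000.0 / pixels_per_line


# High-resolution FastScan settings from the text: 1024 pixels per line,
# image edges assumed to span 0.3 to 0.7 micrometres.
for edge_um in (0.3, 0.7):
    size = pixel_size_nm(edge_um, 1024)
    print(f"{edge_um} um edge at 1024 px: {size:.2f} nm/pixel")
```

Under this assumption both ends of the range give pixels smaller than 0.75 nm.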

Manual processing of AFM images for manual interpretation and analysis

The open-source software package Gwyddion⁷⁵ was used to pre-process the data according to previously published work¹⁰². The pre-processing removed surface tilt via mean plane subtraction, aligned rows by row medians, corrected horizontal scars, obtained a foreground mask, assigned a false-coloured topographic scale, applied a Gaussian filter of 3 pixels, and shifted the minimum height to zero. The arbitrary line profile extraction tool in Gwyddion was used to determine the height across DNA crossings by hand using an average of 3 parallel lines. The same tool was used to manually measure the unreplicated length of DNA in the replication intermediate structures, as well as the fork length.

Automated processing of AFM images for interpretation and analysis

AFM images were processed using the open-source AFM image analysis software TopoStats⁷⁹ (https://doi.org/10.15131/shef.data.22633528.v2). Flattened images were obtained following steps similar to the Gwyddion processing: median row alignment, planar tilt removal, quadratic tilt removal and scar removal. A background mask obtained from pixel heights below 1σ height allowed the above steps to be repeated on just the background data. The image data was translated so that the background average was centred around 0. Finally, a 1.1 px Gaussian filter reduced any high-gain noise. DNA molecules of interest were identified using the standard TopoStats grain finding pipeline with the parameters found in the provided configuration file. Object bounding boxes were used to crop the identified grains for finer segmentation with a U-Net.
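Two of the flattening steps named above, median row alignment and centring the background at zero, can be sketched in a few lines. This is a minimal illustration on nested lists, not the TopoStats implementation; the function names and the convention that `True` in the mask marks foreground are assumptions.

```python
from statistics import median


def align_rows_by_median(image):
    """Subtract each row's median so all scan lines share a baseline."""
    return [[h - median(row) for h in row] for row in image]


def zero_background(image, foreground_mask):
    """Shift all heights so the mean of the background pixels is 0."""
    background = [
        h
        for row, mask_row in zip(image, foreground_mask)
        for h, is_foreground in zip(row, mask_row)
        if not is_foreground
    ]
    offset = sum(background) / len(background)
    return [[h - offset for h in row] for row in image]


# Toy 2 x 4 image (heights in nm) with one raised pixel per row.
raw = [[1.0, 1.0, 3.0, 1.0], [2.0, 2.0, 2.0, 4.0]]
print(align_rows_by_median(raw))
```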

U-Net finer segmentation

For training, 76 hand-labelled cropped AFM images underwent random augmentations: up to a 30% scale increase, up to a 30% translation, rotations by integer multiples of 90 degrees, and horizontal or vertical flipping. All crops were rescaled to 512 × 512 pixels for input to the network. The labelled images were split into training and test sets using an 80:20 ratio. The model was trained for 100 epochs with a learning rate of 0.001, a batch size of 5 and 120 steps per epoch. The Adam optimiser and binary cross-entropy loss function were used. The U-Net architecture was constructed as a 5-layer encoder-decoder network with skip connections with specific parameters (Supplementary Fig. 26). Using a test dataset of 15 images, the resultant trained model obtained a Dice score of 0.82 and an intersection over union score of 0.70 on pixel probabilities, rising to 0.84 and 0.72 when using a binary threshold of >0.1, as in the final workflow, to prevent breaks in the mask.
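The Dice and intersection over union (IoU) scores quoted above follow standard definitions, which the sketch below applies to a thresholded probability map. It is an illustration on nested lists with made-up values, not the evaluation code used for the reported scores; the function names are hypothetical, and only the >0.1 threshold is taken from the text.

```python
def binarise(probabilities, threshold=0.1):
    """Turn per-pixel probabilities into a binary mask (True = DNA)."""
    return [[p > threshold for p in row] for row in probabilities]


def dice_and_iou(predicted, truth):
    """Return (Dice, IoU) for two binary masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # pixel labelled DNA in both masks
            elif p and not t:
                fp += 1          # predicted DNA, labelled background
            elif t and not p:
                fn += 1          # labelled DNA, predicted background
    union = tp + fp + fn
    if union == 0:               # both masks empty: treat as perfect match
        return 1.0, 1.0
    return 2 * tp / (2 * tp + fp + fn), tp / union


# Hypothetical 2 x 3 probability map and hand label.
probs = [[0.05, 0.2, 0.9], [0.0, 0.6, 0.08]]
label = [[False, True, True], [False, False, False]]
print(dice_and_iou(binarise(probs), label))
```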

DNA tracing algorithm

Individual grain masks are skeletonised using our intensity-biased variation of Zhang’s skeletonisation algorithm⁸¹. Terminating branches containing fewer than 15% of the total number of skeleton pixels are pruned, along with branches whose centre height is less than the average skeleton height minus 0.85 nm (the depth of the DNA major groove). A sliding window convolution of the pruned skeleton counts the number of pixel neighbours, where > 3 neighbours indicates a crossing and 2 neighbours indicate the backbone. Crossing points separated by up to 7 nm (~ double the tip-convolved DNA width) are connected, as they likely belong to the same crossing. The shortest path between two nodes with an odd number of emanating branches (resulting from insensitive segmentation) is also connected. Then, for each 20 nm crossing region (below the 45 nm persistence length of DNA⁸⁷), emanating branches are paired based on their bipartite graph maximal matching vectors, and height traces along each pair are used to calculate the FWHM and determine the overlying and underlying branches. To order the trace, the skeleton is split into connecting and crossing regions, and each linear segment is ordered, starting from an endpoint or a random connecting region and following on from the previous segment until the trace reaches its starting point. If not all segments have been used, another molecule trace is started, as is the case for the DNA catenanes. Simplified topological traces use the over/under-passing crossing region classification to label heights as ascending integers.

Crossing order reliability

A pseudo-confidence value, used to rank the crossing order reliability (COR) of each crossing, is obtained from the FWHM of each duplex height trace in the crossing. For a single crossing (which may contain >2 crossing segments), the calculated FWHM of each crossing segment is paired with that of every other segment in the same crossing, giving all N possible combinations (not permutations), FWHM_pairs. A confidence value is obtained for each pair of crossing duplexes and, when more than two duplexes cross, these values are averaged to obtain a crossing reliability value using Eq. 1.

Examples. (Simple) For two crossing duplexes with FWHMs of 0.2 and 0.5, the only possible pair is [0.2, 0.5], so N = 1 and the COR = 0.6. (Complex) For three crossing duplexes with FWHMs of 0.2, 0.5 and 0.7, the possible pairs are [0.2, 0.5], [0.2, 0.7] and [0.5, 0.7], so N = 3 and the values within the summation are 0.6, 0.71 and 0.29. The COR is then the average of these, so COR = 0.53 for this single crossing.

For multiple crossings, this would only work if branches are paired correctly, and if the crossing is a Reidemeister move and not a clustered crossing.
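The worked examples above can be reproduced with a short sketch. Eq. 1 itself is not reproduced in this text, so the per-pair score used here, 1 − min(FWHM_pair)/max(FWHM_pair), averaged over all N pairs, is inferred from the example values and should be read as an assumption rather than the definitive formula. The function name is hypothetical.

```python
from itertools import combinations


def crossing_order_reliability(fwhms):
    """Average pairwise score over all combinations of duplex FWHMs.

    Assumed per-pair score: 1 - min/max, chosen because it reproduces
    the worked examples in the text (0.6 and 0.53).
    """
    pairs = list(combinations(fwhms, 2))
    scores = [1 - min(a, b) / max(a, b) for a, b in pairs]
    return sum(scores) / len(scores)


print(round(crossing_order_reliability([0.2, 0.5]), 2))        # simple case
print(round(crossing_order_reliability([0.2, 0.5, 0.7]), 2))   # complex case
```

With this form, two duplexes of near-identical FWHM give a COR close to 0 (an ambiguous crossing), while very different FWHMs give a COR approaching 1.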

Calculating crossing classification probability

We took AFM images of DNA 3-Node knots (N = 10) and 4-Node catenanes (nicked, N = 5, supercoiled, N = 14) in open configurations with all crossings (N = 106) clearly visible. For both the 3-Node knot and 4-Node catenane samples, only one crossing needs to be clearly determined by eye for the ground truth of all crossings to be known due to the sample topology. Once at least one stacking order is accurately determined, the other more ambiguous crossings can be assigned as they must alternate in their stacking order along the strand path for these specific topologies. Using this simple model, the accuracy of the pipeline was calculated at 82% compared to the hand labels.

Provided a probability of obtaining a correctly classified crossing, p = 0.82, the probability of obtaining the different possible topological species from reversing the stacking order in each crossing can be calculated.

For example, a 3-node knot topology can be obtained by correctly or incorrectly classifying all crossings of a 3-node knot. Incorrectly classifying just one or two crossings will result in an unknot topology. Combinatorics tells us that there are 6 ways to obtain an unknot (3 with one misclassification and 3 with two crossing misclassifications), and 2 ways to obtain a 3-Node knot (Supplementary Fig. 4). Using the probability above with the combinatorial values, we obtain the probability of correctly classifying complex topologies of increasing crossing numbers (Supplementary Fig. 18).
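The counting argument for the 3-node knot can be written out numerically. The sketch below assumes that crossings are misclassified independently with the same probability, which is the simple model described above; the function name is hypothetical and only the three-crossing case is covered.

```python
from math import comb


def three_node_knot_outcomes(p_correct=0.82):
    """Probability of each traced topology for a true 3-node knot.

    k misclassified crossings out of 3 occur with binomial probability.
    k = 0 or k = 3 still yields a 3-node knot (2 ways in total);
    k = 1 or k = 2 yields an unknot (3 + 3 = 6 ways).
    """
    q = 1 - p_correct
    p_k = [comb(3, k) * q**k * p_correct ** (3 - k) for k in range(4)]
    return {"3-node knot": p_k[0] + p_k[3], "unknot": p_k[1] + p_k[2]}


print(three_node_knot_outcomes())
```

With p = 0.82 this gives roughly 0.56 for a 3-node knot and 0.44 for an unknot under the independence assumption.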

Random Forest classification

To assess the ability of our approach to distinguish between relaxed and supercoiled conformations, we trained a random forest classifier on a data set of Ni-immobilised nicked and supercoiled molecules, consisting of 31 relaxed molecules and 49 supercoiled molecules (Supplementary Fig. 9). The model was built using the following list of features: ‘smallest_bounding_width’, ‘smallest_bounding_length’, ‘aspect_ratio’, ‘max_feret’, ‘grain_endpoints’, ‘grain_junctions’, ‘total_branch_lengths’, ‘num_crossings’, ‘avg_crossing_confidence’, ‘min_crossing_confidence’, ‘num_mols’, ‘total_contour_length’, and ‘average_end_to_end_distance’. Following training, the model was used to classify a mixed population of relaxed and supercoiled molecules immobilised in Ni, consisting of 52 molecules in total. Of these, 29 were classified as relaxed and 23 were classified as supercoiled.

Topological determination

The 3D trace coordinates obtained during the ordering process were arranged as the index, N, x-coordinate, X, y-coordinate, Y, and pseudo-height, Z, values into an NXYZ array. If a second trace was present in the same object, e.g., in DNA catenanes, the following traces were re-indexed from 0. The NXYZ array was then input to the Topoly library⁶⁵, which used the Jones polynomial to calculate the topological species. This code is available on the TopoStats GitHub (https://github.com/AFM-SPM/TopoStats.git) and as a package on PyPI (topostats 2.3.0).

Data cleanup

Data directly output from TopoStats was cleaned according to the ‘topo_utils’ script used for topological classification. Molecules were removed in the following cleanup steps: (1) the contour length row was empty, indicating that the tracing pipeline had failed; (2) the trace object contained linear molecules, due to poor masking or skeletonisation breaking up the trace; (3) the molecule contained more than two crossing segments at a single crossing, which removed clustered crossings that cannot be resolved; (4) the number of molecules identified by the trace was not one for knots or two for catenanes. As all samples contain closed loops, any unpaired branches from inaccurate segmentation of close DNA duplexes manifested as small linear molecules or unexpected catenanes in the results.

For topological classification and their confidence rankings, steps 1, 2 and 3 were used because if the sample was unknown, the number of expected molecules in step 4 would also be unknown. For the calculation of contour length, steps 1, 2, 3 and 4 were applied, in order to separate the smaller and larger catenane molecules to ensure accurate length measurements. For the calculation of crossing distributions, steps 1, 2 and 4 were used to ensure we had full molecular representations. For surface compaction measurements, no additional steps were required as only the object masks and not the tracing pipeline were used. For catenane conformation classification, steps 1, 2, and 3 were performed. For the replication intermediate samples, all the TopoStats output data were used to obtain the total contour lengths. By forcing breaks in the molecular skeletons at odd-branched crossings via the configuration file, molecules that contain only 3 traces (two replicated regions and one unreplicated region) were filtered out to identify the unreplicated branch (most dissimilar in length) for analysis.
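The way different analyses apply different subsets of the four cleanup steps can be sketched as a configurable filter. The record fields and function name below are hypothetical and do not correspond to the actual TopoStats output columns or the ‘topo_utils’ script; only the four criteria and the step subsets come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceRecord:
    contour_length_nm: Optional[float]  # None when tracing failed
    has_linear_molecule: bool           # trace broken into open chains
    max_segments_per_crossing: int      # > 2 means a clustered crossing
    n_molecules: int                    # closed traces found in the grain


def passes_cleanup(rec, steps, expected_molecules=1):
    """Return True if the record survives the selected cleanup steps."""
    if 1 in steps and rec.contour_length_nm is None:
        return False
    if 2 in steps and rec.has_linear_molecule:
        return False
    if 3 in steps and rec.max_segments_per_crossing > 2:
        return False
    if 4 in steps and rec.n_molecules != expected_molecules:
        return False
    return True


# A catenane whose trace contains one clustered crossing.
record = TraceRecord(450.0, False, 3, 2)
print(passes_cleanup(record, steps={1, 2, 3}))                        # topology
print(passes_cleanup(record, steps={1, 2, 4}, expected_molecules=2))  # crossings
```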

Calculating contour length averages

After data cleanup, a sample mean and error were obtained for each sample’s contour lengths. For catenated samples, molecules were further separated into single topologies, and into smaller and larger molecules for a catenane topology. To get an average percentage error across multiple samples, Eq. 2 was averaged over the true and predicted contour lengths.

$$\text{percentage error}=\frac{|\text{predicted}-\text{true}|}{\text{true}}\times 100$$

(2)
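Eq. 2 and its average over several samples can be sketched as follows. The contour lengths in the example are placeholder values for illustration only, not measurements from this work, and the function names are hypothetical.

```python
def percentage_error(predicted, true):
    """Eq. 2: absolute deviation from the true value, in percent."""
    return abs(predicted - true) / true * 100


def mean_percentage_error(pairs):
    """Average Eq. 2 over (predicted, true) contour length pairs."""
    errors = [percentage_error(predicted, true) for predicted, true in pairs]
    return sum(errors) / len(errors)


# Placeholder (predicted, true) contour lengths in nm.
example_pairs = [(95.0, 100.0), (210.0, 200.0)]
print(mean_percentage_error(example_pairs))
```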

Molecular simulations

Model of the DNA molecule. The catenated DNA was modelled by two intertwined coarse-grained discretized circular chains of interconnected beads with the diameter of 1 sigma. Each coarse-grained bead represented one turn of DNA 2.5 nm thick and contained 10.5 base-pairs⁴. The larger ring was made of N ≡ N_L = 120 beads and the smaller ring consisted of N_S = 38 beads.

The beads interacted via non-bonded interactions and bonded interactions (Supplementary Fig. 27). The non-bonded pair interactions were modelled by the WCA excluded volume interaction modelled by a fully repulsive truncated and shifted Lennard-Jones potential in the form of U_ex(r) = 4ε₀{[σ/(r-r₀)]¹²-[σ/(r-r₀)]⁶ + c} if r ≤ 2^1/6 and U_ex(r) = 0 otherwise, where c = ¼ and r₀ = 1σ¹⁰³. Furthermore, we considered in the model also the electrostatic repulsion modelled by Debye-Hückel potential in the form U_DH(r) = l_Bε₀q₁q₂exp(-κr)/r, with Bjerrum length l_B = 0.73 nm = 0.29 σ and the Debye-Hückel screening length set to κ = 0.81. The values of U_ex(r)+U_DH(r) were pre-calculated and tabulated to speed up computations. The calculation of the interactions was omitted for the nearest neighbours along the chain. The bonded interactions represented covalent bonds between the beads. The first of the bonded interaction is the harmonic potential between the beads in the form of U_s(r) = k_s(r-r₀)², where k_s is the force constant that was set to 50 ε₀ = 50 k_BT and r₀ is the equilibrium distance set to 1 σ. Furthermore, the bonded interactions were represented by the three-body angular potential and four-body torsional potential. The angular potential was implemented in the form of a harmonic potential U_b(θ) = 0.5 k_b(θ-θ₀)², where k_b is the force constant that corresponds to the energy penalty against angle bending, and θ₀ is the equilibrium angle that was set to θ₀ = π. The force constant of the bending potential was set to 15 k_BT/σ so that the persistence length of the modelled DNA is 50 nm¹⁰⁴ and the value was also corrected for the influence of bending stiffness introduced by electrostatic interaction. In order to involve torsional stiffness, new virtual beads were added following the approach we introduced earlier^105,106,107. The virtual beads did not interact via pair potentials, but they do exhibit mass and hydrodynamic drag. 
There were 5 virtual beads introduced for each real bead. First, a virtual bead was placed axially in between two real beads and bound by strong harmonic potentials. Furthermore, 4 virtual beads (p_1i, p_2i, p_3i, p_4i; i ϵ 1,...,N_L ∨ N_S) were placed periaxially around the first virtual bead, hence forming a cross with the arm length set to 0.5 σ. The arms of the cross were locked with the dihedral potential along the chain, where the harmonic dihedral potential was introduced in the form U_D(ϕ) = k_D(ϕ-ϕ₀)², where ϕ is the dihedral angle formed by successive periaxial beads ϕ(a_i, p_1i, a_i+1, p_1i+1) and ϕ₀ is the equilibrium angle set to ϕ₀ = 0. For our models, we reduce underwinding by altering the number of turns defined by the equilibrium dihedral angle, ϕ ≤ 2πΔLk/N. The underlying physical mechanisms may include electrostatic screening of the phosphate group charges, mechanical distortion and flattening of DNA on the surface, or partial dehydration due to water layer depletion between the DNA and the substrate. In our model, we use a coarse-grained approach to capture conformational phenomena on larger length scales. However, this model does not allow exploration of atomistic structural changes, which are instead incorporated through parameter adjustments, such as the equilibrium angle, in a top-down manner. We found that setting ϕ for ΔLk at 90–100 percent of the initial value produces images consistent with the AFM images and analyses, where the simulated supercoiled molecules exhibit much more open conformations (Fig. 6b). The force constant k_D represents the penalty against torsional deformation/twisting of the chain and it was set to 40 ε₀, so that the ratio between resulting writhe and twist Wr:Tw is about 66% in favour of the writhe.

Molecular dynamics simulations

We carried out Langevin molecular dynamics simulations by solving Langevin equations of motion for each bead $m\ddot{r}=-{\nabla }U\left(r\right)-\gamma m\dot{r}+R\left(t\right){(2{\varepsilon }_{0}m\gamma )}^{0.5}$, where each term in the equation represents the force acting on the bead, $-{\nabla }U({{\bf{r}}})$ is given by the molecular potential interactions, $-\gamma m\dot{r}$ represents the friction, and $R\left(t\right){(2{\varepsilon }_{0}m\gamma )}^{0.5}$ is the random kicking force, where R(t) is a delta-correlated stationary Gaussian process. After the initial structures were generated, we performed long simulation runs of 2 × 10⁸ simulation steps with the size of the integration step Δτ = 0.002 time units until the gyration radius of the larger of the rings was equilibrated. The physical units of time are given as τ = σ (m/ε₀)^0.5. We performed 100 independent simulations for each setting. The simulations were performed by using Extensible Simulation Package for Soft Matter^108,109.
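The structure of the Langevin equation above can be illustrated with a one-dimensional, single-bead integrator. This is a simple Euler–Maruyama sketch in reduced units, not the integrator used by the simulation package; the function name and the discretisation of the random force (amplitude scaled by 1/√Δτ for a finite step) are assumptions of this example.

```python
import math
import random


def langevin_step(x, v, force, m=1.0, gamma=1.0, eps0=1.0, dt=0.002,
                  rng=random):
    """Advance one bead by one step of m*a = F(x) - gamma*m*v + noise."""
    # Delta-correlated noise R(t)*sqrt(2*eps0*m*gamma), discretised so
    # that its impulse over one step has variance 2*eps0*m*gamma*dt.
    noise = math.sqrt(2.0 * eps0 * m * gamma / dt) * rng.gauss(0.0, 1.0)
    accel = (force(x) - gamma * m * v + noise) / m
    v_new = v + accel * dt
    x_new = x + v_new * dt
    return x_new, v_new


# Bead in a harmonic well U(x) = 0.5 * x**2, so F(x) = -x.
x, v = 1.0, 0.0
for _ in range(1000):
    x, v = langevin_step(x, v, force=lambda pos: -pos)
print(x, v)
```

Setting `eps0=0` switches the random force off, leaving pure frictional decay, which is a convenient sanity check for the friction term.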

Modelling of DNA topology

The topology of the DNA chain was modelled as a 4²₁ catenane, a 4-node catenane in which the two circles are intertwined in a right-handed path around each other. The initial conformation of the catenane was generated by using a set of parametric equations. In the initial conformation, the centres of mass of the smaller and the larger ring overlap. After the generation of the real beads, the virtual beads were added by using the previously reported algorithm. During this phase, the crosses formed by the periaxial beads were rotated so that the desired excess of the linking number ΔLk in terms of the twist ΔTw was imposed. In the case of simulations of the nicked DNA molecules, the dihedral lock along the molecule was interrupted at the chosen position by setting the penalty against torsional deformation/twisting k_D = 0. This resulted in the generation of a positive supercoil on the larger ring during the equilibration period, which was mostly preserved during the deposition. In the case of the supercoiled molecules, the ΔLk parameter for the larger ring was set to −4 and the ΔLk imposed on the smaller ring was −1. In order to account for the impact of the surface immobilisation effects observed in the AFM experiments, which exhibited a decreased level of observed supercoiling-induced writhe, the level of writhe was adjusted by re-establishing the equilibrium dihedral angle in the dihedral lock (Fig. 6a–c). The maximum value of the new equilibrium angle considered was limited to ${\varphi }_{0}^{*}$ ≤ 2π ΔLk/N. The simulations in which the change of the equilibrium angle compensates for the initial ΔLk were compared with the nicked ones.

Soft deposition onto the surface

After the equilibration period, the shortest distance between any of the coarse-grained beads and the surface of the wall was calculated as d = min|r_i; z = 0| for i = 1,...,N_L ∨ N_S. Once the shortest distance was found, all the coordinates of the discretized chains were shifted by subtracting the calculated distance d, effectively placing the catenane in the immediate vicinity of the surface. Next, another long simulation of 2 × 10⁸ integration steps took place while the beads were held only by the short-ranged attractive force of the surface, modelled by a Debye-Hückel potential under the stipulation that each DNA bead, having charge −2e, is attracted by a divalent counterion with charge +2e in an environment with Bjerrum length l_B = 0.28 and the Debye-Hückel screening length set to κ = 0.8¹¹⁰. In the case of supercoiled molecules with the adjusted equilibrium angle ${\varphi }_{0}^{*}$ ≤ 2π ΔLk/N, an additional 2 × 10⁷ integration steps were performed at the beginning of the soft deposition period.
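The initial placement step described above amounts to a rigid shift of all bead coordinates along z. The sketch below assumes the wall is the plane z = 0 and that every bead starts above it; the function name and coordinates are illustrative only.

```python
def place_on_surface(beads):
    """Shift beads along z so the closest bead touches the wall at z = 0.

    `beads` is a list of (x, y, z) tuples covering both rings; all z
    values are assumed to be positive before the shift.
    """
    d = min(z for _, _, z in beads)   # shortest bead-to-wall distance
    return [(x, y, z - d) for x, y, z in beads]


# Three illustrative beads, coordinates in units of sigma.
print(place_on_surface([(0.0, 0.0, 3.5), (1.0, 0.0, 2.0), (2.0, 1.0, 4.0)]))
```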

Statistical analysis of simulations

Several metrics were calculated from the simulations to quantify the geometric and topological properties of both the large and small circles within supercoiled and nicked catenanes, namely twist, writhe, linking number difference (ΔLk = Lk - Lk0), radius of gyration and the distance between their centres of mass. To calculate the twist and writhe, we have used a routine developed earlier (available on GitHub at www.github.com/fbenedett/polymer-libraries). The routine is based on the discrete approximation of the Gauss linking integral³⁷, which evaluates pairwise interactions between segments of the curve. For twist, we use the angular deviation of local frames constructed along the curve. The resulting value of the twist takes into account the transformation of reinstalling the equilibrium dihedral angle described above. Writhe is calculated using the Gauss linking integral, where positive and negative values represent right-handed and left-handed crossings, respectively. Twist is determined by integrating the angular rotation of the polymer’s local reference frame along its length, with positive and negative values reflecting right-handed and left-handed twists. The sign conventions for both quantities follow standard definitions in polymer physics and are based on the geometric orientation of the polymer chain in space.
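To make the idea of a discrete Gauss linking integral concrete, the sketch below evaluates the double sum between two closed polygons using a midpoint approximation for each pair of segments. It is not the routine from the repository cited above: the function names are hypothetical, the quadrature is a deliberately simple choice, and the example computes the linking number of two separate rings rather than the writhe of one ring (which uses the same kernel over pairs of segments of a single curve).

```python
import math


def circle(n, centre, u, v, radius=1.0):
    """Closed polygon of n vertices on a circle spanned by unit vectors u, v."""
    points = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        points.append(tuple(c + radius * (math.cos(t) * a + math.sin(t) * b)
                            for c, a, b in zip(centre, u, v)))
    return points


def gauss_linking_number(curve_a, curve_b):
    """Midpoint-rule estimate of (1/4pi) * double integral of r.(da x db)/|r|^3."""
    total = 0.0
    na, nb = len(curve_a), len(curve_b)
    for i in range(na):
        a0, a1 = curve_a[i], curve_a[(i + 1) % na]
        da = [q - p for p, q in zip(a0, a1)]
        ma = [(p + q) / 2.0 for p, q in zip(a0, a1)]
        for j in range(nb):
            b0, b1 = curve_b[j], curve_b[(j + 1) % nb]
            db = [q - p for p, q in zip(b0, b1)]
            mb = [(p + q) / 2.0 for p, q in zip(b0, b1)]
            r = [p - q for p, q in zip(ma, mb)]
            cross = (da[1] * db[2] - da[2] * db[1],
                     da[2] * db[0] - da[0] * db[2],
                     da[0] * db[1] - da[1] * db[0])
            dist = math.sqrt(sum(c * c for c in r))
            total += sum(p * q for p, q in zip(r, cross)) / dist ** 3
    return total / (4.0 * math.pi)


ring_a = circle(200, (0, 0, 0), (1, 0, 0), (0, 1, 0))
linked = circle(200, (1, 0, 0), (1, 0, 0), (0, 0, 1))    # threads ring_a
apart = circle(200, (5, 0, 0), (1, 0, 0), (0, 0, 1))     # far from ring_a
print(gauss_linking_number(ring_a, linked), gauss_linking_number(ring_a, apart))
```

For the two singly linked rings the estimate is close to ±1 (the sign depends on the chosen orientations), and for the separated rings it is close to 0.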

Principal Component Analysis (PCA)

To determine differences in properties of nicked and supercoiled molecules, as well as differences induced following molecule adsorption, Principal Component Analysis (PCA) was performed on just the final structures of each simulation trajectory to avoid unwanted correlation that could arise from including multiple frames from the same trajectory. Data were normalised prior to performing PCA to avoid larger-scale metrics from dominating the analysis. To further probe conformational differences between nicked and supercoiled catenanes adsorbed on the surface, the number of self-crossings and catenated crossings was computed geometrically by projecting line segments onto an XY plane. Here, self-crossings refer to instances when either the large or small catenated circle intersects with itself, whilst catenated crossings refer to inter-molecule intersections. The frequency of each crossing type was calculated, as well as the Euclidean distance between crossings of the same type.
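Counting crossings in the XY projection reduces to testing pairs of line segments for intersection. The sketch below is one straightforward way to do this for closed rings and is not the analysis code used here; the function names are hypothetical, and segments that merely touch at an endpoint or overlap collinearly are deliberately not counted.

```python
def _orient(p, q, r):
    """Signed area test: > 0 if r lies to the left of the line p -> q."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(p1, p2, q1, q2):
    """True if the two segments properly intersect in the XY plane."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)


def count_crossings(ring_a, ring_b=None):
    """Self-crossings of ring_a, or catenated crossings if ring_b is given.

    Rings are lists of (x, y, z) beads; z is ignored (XY projection).
    """
    def segments(ring):
        return [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]

    seg_a = segments(ring_a)
    if ring_b is not None:
        return sum(segments_cross(*s, *t) for s in seg_a for t in segments(ring_b))
    n, count = len(seg_a), 0
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last segments share a vertex
            count += segments_cross(*seg_a[i], *seg_a[j])
    return count


figure_eight = [(0, 0, 0), (2, 2, 1), (2, 0, 0), (0, 2, 0)]
print(count_crossings(figure_eight))
```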

Pseudo-AFM generation

Coarse-grained simulations were converted into “pseudo-AFM” images through a series of processing steps designed to replicate tip convolution and colour maps associated with experimental data, making visual comparison between simulations and AFM images easier. Coarse-grained simulations were converted into 2D heightmaps by projection onto the XY plane, with each coordinate coloured by height in nanometres. These heightmaps were then dilated to simulate the 5 nm tip radius associated with the AFM images presented throughout, allowing for visualisation of clustered crossings that were observed within the experimental data. Finally, Gaussian filtering was applied to smooth the structures, mimicking the resolution typically observed in AFM images. Qualitative comparisons between pseudo-AFM and experimental AFM images were used to guide simulation parameters based on their visual similarity, with greater similarity indicating optimised parameters.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All data used in this publication, and plotting scripts are publicly available on Figshare with https://doi.org/10.15131/shef.data.27143238¹¹¹. Source data are provided with this paper.

Code availability

All code developed in this publication is publicly available on Github via TopoStats v2.3.0 https://github.com/AFM-SPM/TopoStats, https://doi.org/10.15131/shef.data.22633528.v2¹¹².

References

Gartenberg, M. R. & Wang, J. C. Positive supercoiling of DNA greatly diminishes mRNA synthesis in yeast. Proc. Natl Acad. Sci. USA 89, 11461–11465 (1992).
Deibler, R. W., Rahmati, S. & Zechiedrich, E. L. Topoisomerase IV, alone, unknots DNA in E. coli. Genes Dev. 15, 748–761 (2001).
Lemaitre, J.-M., Danis, E., Pasero, P., Vassetzky, Y. & Méchali, M. Mitotic Remodeling Of The Replicon And Chromosome Structure. Cell 123, 787–801 (2005).
Bates, A. D. & Maxwell, A. DNA Topology. (Oxford University Press, Oxford; New York, 2005).
Baxter, J. & Diffley, J. F. X. Topoisomerase II inactivation prevents the completion of DNA replication in budding yeast. Mol. Cell 30, 790–802 (2008).
Portugal, J. & Rodriguez-Campos, A. T7 RNA Polymerase cannot transcribe through a highly knotted DNA template. Nucleic Acids Res. 24, 4890–4894 (1996).
Cuvier, O., Stanojcic, S., Lemaitre, J.-M. & Mechali, M. A topoisomerase II-dependent mechanism for resetting replicons at the S–M-phase transition. Genes Dev. 22, 860–865 (2008).
Holm, C., Goto, T., Wang, J. & Botstein, D. DNA topoisomerase II is required at the time of mitosis in yeast. Cell 41, 553–563 (1985).
Andrews, C. A. et al. A mitotic topoisomerase II checkpoint in budding yeast is required for genome stability but acts independently of Pds1/securin. Genes Dev. 20, 1162–1174 (2006).
Pommier, Y., Nussenzweig, A., Takeda, S. & Austin, C. Human topoisomerases and their roles in genome stability and organization. Nat. Rev. Mol. Cell Biol. 23, 407–427 (2022).
McKie, S. J., Neuman, K. C. & Maxwell, A. DNA topoisomerases: Advances in understanding of cellular roles and multi-protein complexes via structure-function analysis. BioEssays 43, 2000286 (2021).
Sun, Y. et al. Excision repair of topoisomerase DNA-protein crosslinks (TOP-DPC). DNA Repair 89, 102837 (2020).
Krasnow, M. A. et al. Determination of the absolute handedness of knots and catenanes of DNA. Nature 304, 559–560 (1983).
Stark, W. M. & Boocock, M. R. The linkage change of a knotting reaction catalysed by Tn3 Resolvase. J. Mol. Biol. 239, 25–36 (1994).
Olorunniji, F. J. et al. Gated rotation mechanism of site-specific recombination by ϕC31 integrase. Proc. Natl Acad. Sci. USA 109, 19661–19666 (2012).
Crisona, N. J., Weinberg, R. L., Peter, B. J., Sumners, D. W. & Cozzarelli, N. R. The Topological Mechanism of Phage λ Integrase. J. Mol. Biol. 289, 747–775 (1999).
Article CAS PubMed Google Scholar
Crisona, N. J. et al. Processive RECOMBINATION BY WILD-TYPE GIN AND AN ENHANCER-INDEPENDENT MUTAnt. J. Mol. Biol. 243, 437–457 (1994).
Article CAS PubMed Google Scholar
Colloms, S. D., McCulloch, R., Grant, K., Neilson, L. & Sherratt, D. J. Xer-mediated site-specific recombination in vitro. EMBO J. 15, 1172–1181 (1996).
Article CAS PubMed PubMed Central Google Scholar
Pathania, S., Jayaram, M. & Harshey, R. M. Path of DNA within the Mu Transpososome. Cell 109, 425–436 (2002).
Article CAS PubMed Google Scholar
Grainge, I., Buck, D. & Jayaram, M. Geometry of site alignment during Int family recombination: antiparallel synapsis by the Flp recombinase. J. Mol. Biol. 298, 749–764 (2000).
Article CAS PubMed Google Scholar
Pathania, S. A unique right end-enhancer complex precedes synapsis of Mu ends: the enhancer is sequestered within the transpososome throughout transposition. EMBO J. 22, 3725–3736 (2003).
Article CAS PubMed PubMed Central Google Scholar
Wasserman, S. A., Dungan, J. M. & Cozzarelli, N. R. Discovery of a predicted DNA knot substantiates a model for site-specific recombination. Science 229, 171–174 (1985).
Article CAS PubMed Google Scholar
Kanaar, R. et al. Processive recombination by the phage Mu Gin system: Implications for the mechanisms of DNA strand exchange, DNA site alignment, and enhancer action. Cell 62, 353–366 (1990).
Article CAS PubMed Google Scholar
Valdés, A. et al. Quantitative disclosure of DNA knot chirality by high-resolution 2D-gel electrophoresis. Nucleic Acids Res. 47, e29–e29 (2019).
Article PubMed PubMed Central Google Scholar
Schvartzman, J. B., Martínez-Robles, M.-L., Hernández, P. & Krimer, D. B. Plasmid DNA Topology Assayed by Two-Dimensional Agarose Gel Electrophoresis. in DNA Electrophoresis (ed. Makovets, S.) 1054 121–132 (Humana Press, Totowa, NJ, 2013).
López, V., Martínez-Robles, M.-L., Hernández, P., Krimer, D. B. & Schvartzman, J. B. Topo IV is the topoisomerase that knots and unknots sister duplexes during DNA replication. Nucleic Acids Res. 40, 3563–3573 (2012).
Article PubMed Google Scholar
Mitchenall, L. A., Hipkin, R. E., Piperakis, M. M., Burton, N. P. & Maxwell, A. A rapid high-resolution method for resolving DNA topoisomers. BMC Res Notes 11, 37 (2018).
Article PubMed PubMed Central Google Scholar
Basu, A. et al. Measuring DNA mechanics on the genome scale. Nature 589, 462–467 (2021).
Article CAS PubMed Google Scholar
Arai, Y. et al. Tying a molecular knot with optical tweezers. Nature 399, 446–448 (1999).
Article CAS PubMed Google Scholar
Sharma, R. K., Agrawal, I., Dai, L., Doyle, P. S. & Garaj, S. Complex DNA knots detected with a nanopore sensor. Nat. Commun. 10, 4473 (2019).
Article Google Scholar
Plesa, C. et al. Direct observation of DNA knots using a solid-state nanopore. Nat. Nanotech. 11, 1093–1097 (2016).
Article CAS Google Scholar
Sharma, R. K., Agrawal, I., Dai, L., Doyle, P. & Garaj, S. DNA Knot Malleability in Single-Digit Nanopores. Nano Lett. 21, 3772–3779 (2021).
Article CAS PubMed Google Scholar
Rheaume, S. N. & Klotz, A. R. Nanopore translocation of topologically linked DNA catenanes. Phys. Rev. E 107, 024504 (2023).
Article CAS PubMed Google Scholar
Stasiak, A., Katritch, V., Bednar, J., Michoud, D. & Dubochet, J. Electrophoretic mobility of DNA knots. Nature 384, 122–122 (1996).
Article CAS PubMed Google Scholar
Vologodskii, A. V. et al. Sedimentation and electrophoretic migration of DNA knots and catenanes. J. Mol. Biol. 278, 1–3 (1998).
Article CAS PubMed Google Scholar
Laurie, B. et al. Geometry and Physics of Catenanes applied to the study of DNA replication. Biophys. J. 74, 2815–2822 (1998).
Article CAS PubMed PubMed Central Google Scholar
Katritch, V. et al. Geometry and physics of knots. Nature 384, 142–145 (1996).
Article MathSciNet CAS Google Scholar
Weber, C., Carlen, M., Dietler, G., Rawdon, E. J. & Stasiak, A. Sedimentation of macroscopic rigid knots and its relation to gel electrophoretic mobility of DNA knots. Sci. Rep. 3, 1091 (2013).
Article PubMed PubMed Central Google Scholar
Yamaguchi, H., Kubota, K. & Harada, A. Direct observation of DNA Catenanes by Atomic Force Microscopy. Chem. Lett. 29, 384–385 (2000).
Article Google Scholar
Yamaguchi, H., Kubota, K. & Harada, A. Preparation of DNA catenanes and observation of their topological structures by atomic force microscopy. Nucleic Acids Symp. Ser. 44, 229–230 (2000).
Article Google Scholar
Li, T., Zhang, H., Hu, L. & Shao, F. Topoisomerase-Based Preparation and AFM Imaging of Multi-Interlocked Circular DNA. Bioconjug. Chem. 27, 616–620 (2016).
Article CAS PubMed Google Scholar
Main, K. H. S. et al. Atomic force microscopy—A tool for structural and translational DNA research. APL Bioeng. 5, 031504 (2021).
Article CAS PubMed PubMed Central Google Scholar
Pyne, A. L. B. et al. Base-pair resolution analysis of the effect of supercoiling on DNA flexibility and major groove recognition by triplex-forming oligonucleotides. Nat. Commun. 12, 1053 (2021).
Article CAS PubMed PubMed Central Google Scholar
Pyne, A., Thompson, R., Leung, C., Roy, D. & Hoogenboom, B. W. Single-molecule reconstruction of oligonucleotide secondary structure by Atomic Force Microscopy. Small 10, 3257–3261 (2014).
Article CAS PubMed Google Scholar
Hansma, H. G. et al. Reproducible imaging and dissection of plasmid DNA under liquid with the Atomic Force Microscope. Science 256, 1180–1184 (1992).
Article CAS PubMed Google Scholar
Stasiak, A. & Di Capua, E. The helicity of DNA in complexes with RecA protein. Nature 299, 185–186 (1982).
Article CAS PubMed Google Scholar
Kleinschmidt, A. & Zahn, R. K. Uber Desoxyribonucleinsäure-Molekeln in Protein-Mischfilmen. Z. Für Naturforschung 14, 770–779.
Hudson, B. & Vinograd, J. Catenated circular DNA molecules in HeLa Cell Mitochondria. Nature 216, 647–652 (1967).
Article CAS PubMed Google Scholar
Di Capua, E., Engel, A., Stasiak, A. & Koller, T. H. Characterization of complexes between recA protein and duplex DNA by electron microscopy. J. Mol. Biol. 157, 87–103 (1982).
Article PubMed Google Scholar
Dean, F. B., Stasiak, A., Koller, T. & Cozzarelli, N. R. Duplex DNA knots produced by Escherichia coli topoisomerase I. Structure and requirements for formation. J. Biol. Chem. 260, 4975–4983 (1985).
Article CAS PubMed Google Scholar
Griffith, J. D. & Nash, H. A. Genetic rearrangement of DNA induces knots with a unique topology: implications for the mechanism of synapsis and crossing-over. Proc. Natl Acad. Sci. USA 82, 3124–3128 (1985).
Article CAS PubMed PubMed Central Google Scholar
Atwell, S. et al. Probing Rad51-DNA interactions by changing DNA twist. Nucleic Acids Res. 40, 11769–11776 (2012).
Article CAS PubMed PubMed Central Google Scholar
Zechiedrich, E. L. & Crisona, N. J. Coating DNA with RecA Protein to Distinguish DNA Path by Electron Microscopy. in DNA Topoisomerase Protocols 94 99–108 (Humana Press, New Jersey, 1999).
Ficarra, E. et al. Automatic intrinsic DNA curvature computation from AFM Images. IEEE Trans. Biomed. Eng. 52, 2074–2086 (2005).
Article PubMed Google Scholar
Marilley, M., Sanchez-Sevilla, A. & Rocca-Serra, J. Fine mapping of inherent flexibility variation along DNA molecules. Validation by atomic force microscopy (AFM) in buffer. Mol. Genet. Genom.274, 658–670 (2005).
Article CAS Google Scholar
Benson, F. E., Stasiak, A. & West, S. C. Purification and characterization of the human Rad51 protein, an analogue of E. coli RecA. EMBO J. 13, 5764–5771 (1994).
Article CAS PubMed PubMed Central Google Scholar
Mortier-Barrière, I. et al. A key presynaptic role in transformation for a widespread bacterial protein: DprA conveys incoming ssDNA to RecA. Cell 130, 824–836 (2007).
Article PubMed Google Scholar
Wasserman, S. A. & Cozzarelli, N. R. Supercoiled DNA-directed knotting by T4 topoisomerase. J. Biol. Chem. 266, 20567–20573 (1991).
Article CAS PubMed Google Scholar
Lyubchenko, Y. L. & Shlyakhtenko, L. S. Imaging of DNA and Protein-DNA Complexes with Atomic Force Microscopy. Crit. Rev. Eukaryot. Gene Expr. 26, 63–96 (2016).
Article PubMed PubMed Central Google Scholar
Haynes, P. J., Main, K. H. S., Akpinar, B. & Pyne, A. L. B. Atomic Force Microscopy of DNA and DNA-Protein Interactions. In Chromosome Architecture (ed. Leake, M. C.) 2476 43–62 (Springer US, New York, NY, 2022).
Diggines, B. et al. Multiscale topological analysis of kinetoplast DNA via high-resolution AFM. Phys. Chem. Chem. Phys. 26, 25798–25807 (2024).
Article CAS PubMed Google Scholar
Pyne, A. L. B. & Hoogenboom, B. W. Imaging DNA structure by Atomic Force Microscopy. in Chromosome Architecture (ed. Leake, M. C.) 1431 47–60 (Springer New York, New York, NY, 2016).
Fernandez, M. et al. AFM-based force spectroscopy unravels stepwise formation of the DNA transposition complex in the widespread Tn3 family mobile genetic elements. Nucleic Acids Res. 51, 4929–4941 (2023).
Article CAS PubMed PubMed Central Google Scholar
Kim, E., Gonzalez, A. M., Pradhan, B., Van Der Torre, J. & Dekker, C. Condensin-driven loop extrusion on supercoiled DNA. Nat. Struct. Mol. Biol. 29, 719–727 (2022).
Article CAS PubMed Google Scholar
Dabrowski-Tumanski, P., Rubach, P., Niemyska, W., Gren, B. A. & Sulkowska, J. I. Topoly: Python package to analyze topology of polymers. Brief. Bioinforma. 22, bbaa196 (2021).
Article Google Scholar
Vontalge, E. J., Kavlashvili, T., Dahmen, S. N., Cranford, M. T. & Dewar, J. M. Control of DNA replication in vitro using a reversible replication barrier. Nat. Protoc. 19, 1940–1983 (2024).
Article CAS PubMed PubMed Central Google Scholar
Duxin, J. P., Dewar, J. M., Yardimci, H. & Walter, J. C. Repair of a DNA-protein crosslink by replication-coupled proteolysis. Cell 159, 346–357 (2014).
Article CAS PubMed PubMed Central Google Scholar
Zhang, J. et al. DNA interstrand cross-link repair requires replication-fork convergence. Nat. Struct. Mol. Biol. 22, 242–247 (2015).
Article CAS PubMed PubMed Central Google Scholar
Dewar, J. M., Budzowska, M. & Walter, J. C. The mechanism of DNA replication termination in vertebrates. Nature 525, 345–350 (2015).
Article CAS PubMed PubMed Central Google Scholar
Colloms, S. D., Alén, C. & Sherratt, D. J. The ArcA/ArcB two-component regulatory system of Escherichia coli is essential for Xer site-specific recombination at PSI. Mol. Microbiol. 28, 521–530 (1998).
Article CAS PubMed Google Scholar
Colloms, S. D., Bath, J. & Sherratt, D. J. Topological selectivity in Xer site-specific recombination. Cell 88, 855–864 (1997).
Article CAS PubMed Google Scholar
Alén, C., Sherratt, D. J. & Colloms, S. D. Direct interaction of aminopeptidase A with recombination site DNA in Xer site-specific recombination. EMBO J. 16, 5188–5197 (1997).
Article PubMed PubMed Central Google Scholar
Bregu, M. Accessory factors determine the order of strand exchange in Xer recombination at psi. EMBO J. 21, 3888–3897 (2002).
Article CAS PubMed PubMed Central Google Scholar
Cornet, F., Mortier, I., Patte, J. & Louarn, J. M. Plasmid pSC101 harbors a recombination site, psi, which is able to resolve plasmid multimers and to substitute for the analogous chromosomal Escherichia coli site dif. J. Bacteriol. 176, 3188–3195 (1994).
Article CAS PubMed PubMed Central Google Scholar
Nečas, D. & Klapetek, P. Gwyddion: an open-source software for SPM data analysis. centr eur. j. phys. 10, 181–188 (2012).
Google Scholar
Hansma, H. G. & Laney, D. E. DNA binding to mica correlates with cationic radius: assay by atomic force microscopy. Biophys. J. 70, 1933–1939 (1996).
Article CAS PubMed PubMed Central Google Scholar
Broekmans, O. D., King, G. A., Stephens, G. J. & Wuite, G. J. L. DNA Twist Stability Changes with Magnesium(2 +) Concentration. Phys. Rev. Lett. 116, 258102 (2016).
Article PubMed Google Scholar
Zechiedrich, E. L. et al. Roles of Topoisomerases in Maintaining Steady-state DNA Supercoiling in Escherichia coli. J. Biol. Chem. 275, 8103–8113 (2000).
Article CAS PubMed Google Scholar
Beton, J. G. et al. TopoStats – A program for automated tracing of biomolecules from AFM images. Methods 193, 68–79 (2021).
Article CAS PubMed PubMed Central Google Scholar
Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (eds. Navab, N., Hornegger, J., Wells, W. M. & Frangi, A. F.) 234–241 (Springer International Publishing, Cham, 2015).
Zhang, Y. Y. & Wang, P. S. P. A parallel thinning algorithm with two-subiteration that generates one-pixel-wide skeletons. In Proceedings of 13th International Conference on Pattern Recognition 457–461 4 (IEEE, Vienna, Austria, 1996).
Rolfsen, D. Knots and Links. (AMS Chelsea Publishing, Providence, Rhode Island, 2012). https://doi.org/10.1090/chel/346.
Amunugama, R. et al. Replication fork reversal during DNA interstrand crosslink repair requires CMG unloading. Cell Rep. 23, 3419–3428 (2018).
Article CAS PubMed PubMed Central Google Scholar
Masters, M. & Broda, P. Evidence for the bidirectional replication of the Escherichia coli Chromosome. Nat. N. Biol. 232, 137–140 (1971).
Article CAS Google Scholar
Kavlashvili, T., Liu, W., Mohamed, T. M., Cortez, D. & Dewar, J. M. Replication fork uncoupling causes nascent strand degradation and fork reversal. Nat. Struct. Mol. Biol. 30, 115–124 (2023).
Article CAS PubMed PubMed Central Google Scholar
Willis, N. A. et al. BRCA1 controls homologous recombination at Tus/Ter-stalled mammalian replication forks. Nature 510, 556–559 (2014).
Article CAS PubMed PubMed Central Google Scholar
Hagerman, P. J. Flexibility of DNA. Annu. Rev. Biophys. Biophys. Chem. 17, 265–286 (1988).
Article CAS PubMed Google Scholar
Brasher, R., Scharein, R. G. & Vazquez, M. New biologically motivated knot table. Biochem. Soc. Trans. 41, 606–611 (2013).
Article CAS PubMed Google Scholar
Grosberg, A. Y. Do knots self-tighten for entropic reasons?. Polym. Sci. Ser. A 58, 864–872 (2016).
Article CAS Google Scholar
Lee, A. J., Szymonik, M., Hobbs, J. K. & Wälti, C. Tuning the translational freedom of DNA for high speed AFM. Nano Res 8, 1811–1821 (2015).
Article CAS Google Scholar
Mohammad-Rafiee, F. & Golestanian, R. Electrostatic contribution to twist rigidity of DNA. Phys. Rev. E 69, 061919 (2004).
Article Google Scholar
Mosconi, F., Allemand, J. F., Bensimon, D. & Croquette, V. Measurement of the Torque on a single stretched and twisted DNA using magnetic Tweezers. Phys. Rev. Lett. 102, 078301 (2009).
Article PubMed Google Scholar
Brouns, T. et al. Free energy landscape and dynamics of supercoiled DNA by High-Speed Atomic Force Microscopy. ACS Nano 12, 11907–11916 (2018).
Article CAS PubMed Google Scholar
Racko, D., Benedetti, F., Dorier, J., Burnier, Y. & Stasiak, A. Molecular Dynamics Simulation of Supercoiled, Knotted, and Catenated DNA Molecules, Including Modeling of Action of DNA Gyrase. in The Bacterial Nucleoid: Methods and Protocols (ed. Espéli, O.) 339–372 (Springer, New York, NY, 2017).
Stirling, C. J., Colloms, S. D., Collins, J. F., Szatmari, G. & Sherratt, D. J. xerB, an Escherichia coli gene required for plasmid ColE1 site-specific recombination, is identical to pepA, encoding aminopeptidase A, a protein with substantial similarity to bovine lens leucine aminopeptidase. EMBO J. 8, 1623–1627 (1989).
Article CAS PubMed PubMed Central Google Scholar
Burke, M., Merican, A. F. & Sherratt, D. J. Mutant Escherichia coli arginine repressor proteins that fail to bind l -arginine, yet retain the ability to bind their normal DNA-binding sites. Mol. Microbiol. 13, 609–618 (1994).
Article CAS PubMed Google Scholar
Subramanya, H. S. et al. Crystal structure of the site-specific recombinase, XerD. EMBO J. 16, 5178–5187 (1997).
Article CAS PubMed PubMed Central Google Scholar
Balagurumoorthy, P., Adelstein, S. J. & Kassis, A. I. Method to eliminate linear DNA from mixture containing nicked circular, supercoiled, and linear plasmid DNA. Anal. Biochem. 381, 172–174 (2008).
Article CAS PubMed PubMed Central Google Scholar
Stirling, C. J., Stewart, G. & Sherratt, D. J. Multicopy plasmid stability in Escherichia coli requires host-encoded functions that lead to plasmid site-specific recombination. Mol. Gen. Genet 214, 80–84 (1988).
Article CAS PubMed Google Scholar
Norrander, J., Kempe, T. & Messing, J. Construction of improved M13 vectors using oligodeoxynucleotide-directed mutagenesis. Gene 26, 101–106 (1983).
Article CAS PubMed Google Scholar
Sparks, J. & Walter, J. C. Extracts for analysis of DNA replication in a nucleus-free system. Cold Spring Harb. Protoc. 2019, pdb.prot097154 (2019).
Pyne, A. Protocol for AFM image processing using Gwyddion. 3718907 Bytes (2020) https://doi.org/10.15131/SHEF.DATA.12706259.
Weeks, J. D., Chandler, D. & Andersen, H. C. Role of repulsive forces in determining the equilibrium structure of simple liquids. J. Chem. Phys. 54, 5237–5247 (1971).
Article CAS Google Scholar
Langowski, J. Polymer chain models of DNA and chromatin. Eur. Phys. J. E 19, 241–249 (2006).
Article CAS PubMed Google Scholar
Racko, D., Benedetti, F., Dorier, J., Burnier, Y. & Stasiak, A. Generation of supercoils in nicked and gapped DNA drives DNA unknotting and postreplicative decatenation. Nucleic Acids Res.43, 7229–7236 (2015).
Article CAS PubMed PubMed Central Google Scholar
Benedetti, F. et al. Effects of physiological self-crowding of DNA on shape and biological properties of DNA molecules with various levels of supercoiling. Nucleic Acids Res. 43, 2390–2399 (2015).
Article CAS PubMed PubMed Central Google Scholar
Rusková, R. & Račko, D. Entropic competition between supercoiled and torsionally relaxed chromatin fibers drives loop extrusion through pseudo-topologically bound cohesin. Biology 10, 130 (2021).
Article PubMed PubMed Central Google Scholar
Limbach, H. J., Arnold, A., Mann, B. A. & Holm, C. ESPResSo—an extensible simulation package for research on soft matter systems. Comput. Phys. Commun. 174, 704–727 (2006).
Article CAS Google Scholar
Arnold, A. et al. ESPResSo 3.1: Molecular Dynamics Software for Coarse-Grained Models. in Meshfree Methods for Partial Differential Equations VI (eds. Griebel, M. & Schweitzer, M. A.) 89 1–23 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2013).
Manning, G. S. Limiting laws and counterion condensation in polyelectrolyte solutions. Biophys. Chem. 7, 95–102 (1977).
Article CAS PubMed Google Scholar
Gamill, M. et al. DNA knots, catenanes, and replication intermediates via high-resolution AFM. The University of Sheffield. Dataset. https://doi.org/10.15131/shef.data.27143238.v1 (2025).
Shephard, N., Whittle, S., Gamill, M., Du, M. & Pyne, A. TopoStats - Atomic Force Microscopy image processing and analysis. https://doi.org/10.15131/shef.data.22633528.v2 (2023).
Summers, D. K. & Sherratt, D. J. Resolution of ColE1 dimers requires a DNA sequence implicated in the three-dimensional organization of the cer site. EMBO J. 7, 851–858 (1988).
Article CAS PubMed PubMed Central Google Scholar
Flinn, H., Burke, M., Stirling, C. J. & Sherratt, D. J. Use of gene replacement to construct Escherichia coli strains carrying mutations in two genes required for stability of multicopy plasmids. J. Bacteriol. 171, 2241–2243 (1989).
Article CAS PubMed PubMed Central Google Scholar
Campbell, J. L., Richardson, C. C. & Studier, F. W. Genetic recombination and complementation between bacteriophage T7 and cloned fragments of T7 DNA. Proc. Natl Acad. Sci. USA 75, 2276–2280 (1978).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This work was supported by the Leverhulme Trust Research Programme Grant RP2013K-017 to S.D.C., a UKRI Future Leaders Fellowship MR/W00738X/1 to A.L.B.P., the Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic (VEGA 2/0038/24 - Polymers with Active Chiral Topology and Nanotechnology) to R.R. and D.R., a UK Microbiology Society Research Visit Grant GA000994 to J.I.P., an ERC Marie Sklodowska-Curie Fellowship (SinMolTermination - 794962) to N.S.G. and a Wellcome Trust Investigator in Science Award (215510/Z/19/Z) to A.G. We wish to acknowledge the Henry Royce Institute for Advanced Materials, funded through EPSRC grants EP/R00661X/1, EP/S019367/1, EP/P02470X/1 and EP/P025285/1, and Robert Moorehead and Xinyue Chen for Dimension FastScan access and support at Royce@Sheffield. We also acknowledge computational resources from the National competence centre for high performance computing, funded by the European Regional Development Fund (11070AKF2), to R.R. and D.R. We thank Dorothy Buck and Andrzej Stasiak for scientific discussion during the project and Tony Maxwell and Jamie Hobbs for reading and commenting on the final version of the manuscript. We thank Aggeliki Skagia for help with the purification of proteins for Xenopus experiments and Kenneth Marians and James Dewar for the provision of plasmids and experimental protocols.

Author information

James I. Provan
Present address: Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, Gif-sur-Yvette, France
Neville S. Gilhooly
Present address: Oxford Nanopore Technologies plc, Gosling Building, Edmund Halley Road, Oxford Science Park, Oxford, OX4 4DQ, UK
These authors contributed equally: Elizabeth P. Holmes, Max C. Gamill, James I. Provan.

Authors and Affiliations

School of Chemical, Materials and Biological Engineering, University of Sheffield, Sheffield, UK
Elizabeth P. Holmes, Max C. Gamill, Laura Wiggins, Sylvia Whittle, Thomas E. Catley & Alice L. B. Pyne
School of Molecular Biosciences, University of Glasgow, Glasgow, UK
James I. Provan & Sean D. Colloms
Polymer Institute of the Slovak Academy of Sciences, Bratislava, Slovakia
Renáta Rusková & Dušan Račko
London Centre for Nanotechnology, University College London, London, UK
Kavit H. S. Main
School of Computer Science, University of Sheffield, Sheffield, UK
Neil Shephard
School of Medicine and Population Health, University of Sheffield, Sheffield, UK
Helen. E. Bryant
Department of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK
Neville S. Gilhooly & Agnieszka Gambus

Contributions

Formal contributions in authorship order (CRediT taxonomy): Conceptualisation: E.P.H., M.C.G., J.I.P., N.S.G., A.G., S.D.C., and A.L.B.P.; Data curation: M.C.G., L.W., S.W., N.S., and N.S.G.; Formal analysis: E.P.H., M.C.G., L.W., S.W., T.E.C., R.R., K.H.S.M., N.S., H.E.B., N.S.G., D.R., and A.L.B.P.; Funding acquisition: S.D.C., N.S.G., A.G., and A.L.B.P.; Investigation: E.P.H., J.I.P., R.R., T.E.C., K.H.S.M., D.R., and A.L.B.P.; Methodology: E.P.H., M.C.G., J.I.P., L.W., S.W., T.E.C., N.S., H.E.B., N.S.G., S.D.C., and A.L.B.P.; Project administration: A.L.B.P.; Resources: J.I.P., D.R., N.S.G., A.G., S.D.C., and A.L.B.P.; Software: M.C.G., L.W., S.W., and N.S.; Supervision: H.E.B., N.S.G., A.G., S.D.C., and A.L.B.P.; Visualisation: E.P.H., M.C.G., J.I.P., L.W., and T.E.C.; Writing – original draft: E.P.H., M.C.G., J.I.P., S.D.C., and A.L.B.P.; Writing – review & editing: E.P.H., M.C.G., J.I.P., L.W., R.R., S.W., T.E.C., N.S., H.E.B., N.S.G., A.G., D.R., S.D.C., and A.L.B.P.

Corresponding authors

Correspondence to Sean D. Colloms or Alice L. B. Pyne.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Yi-Chih Lin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Holmes, E.P., Gamill, M.C., Provan, J.I. et al. Quantifying complexity in DNA structures with high resolution Atomic Force Microscopy. Nat Commun 16, 5482 (2025). https://doi.org/10.1038/s41467-025-60559-x

Received: 07 February 2025
Accepted: 27 May 2025
Published: 01 July 2025
Version of record: 01 July 2025
DOI: https://doi.org/10.1038/s41467-025-60559-x
