Introduction

Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) enzymes comprise a prokaryotic adaptive immune system with memory that combats invading nucleic acids1,2,3. The type II and type V CRISPR endonucleases Cas9 and Cas12a, respectively, both use a programmable guide RNA (gRNA) to identify and cleave target DNA sequences. CRISPR-Cas enzymes search for a protospacer adjacent motif (PAM) next to a target sequence. Given sufficient complementarity, PAM identification is followed by invasion of the target DNA duplex by the gRNA to create a three-stranded R-loop structure consisting of a 20-base pair RNA-DNA heteroduplex and the displaced non-target strand (NTS)4,5,6. The enzyme cleaves both strands of target DNA after R-loop formation is complete. Cas12a has been used as an important complement to the more widely used Cas9 in genome editing contexts due to its T-rich PAM, autonomous pre-crRNA processing capabilities7, and its ability to indiscriminately cleave non-specific ssDNA in trans after target DNA cleavage8.

The widespread applications of Cas9 and Cas12a in genome editing, CRISPR interference (CRISPRi), base editing, and imaging depend on the specificity and efficiency of R-loop formation and/or DNA cleavage activation9,10,11,12. However, Cas9 and Cas12a are imperfect molecular machines that can bind with mismatches in the R-loop and induce off-target DNA cleavage at low frequency13,14,15,16,17,18,19. The sensitivity of these enzymes to mismatches differs, with Cas12a reported to have less off-target cleavage than Cas9 in a number of studies13,14,15,16,17,18,19. Cas12a orthologs also have different cleavage efficiencies and responses to mismatches, although comparisons between Acidaminococcus sp. (AsCas12a) and Lachnospiraceae bacterium (LbCas12a) using different experimental methods have resulted in different conclusions about their relative specificities15,16,18,19. While enormous effort has been put into reducing off-target cleavage using a variety of methods20,21,22,23,24,25,26, it has not been eliminated and remains an important issue to understand and mitigate.

R-loop formation is an essential step in the DNA binding and cleavage pathway of Cas9 and Cas12a. For DNA cleavage to occur, the enzyme must first sample partial R-loop intermediates with mismatch-dependent stabilities that affect R-loop progression and consequent DNA cutting. High-resolution structural models of Cas12a orthologs indicate that after PAM recognition, ~5 nucleotides of PAM-proximal gRNA, referred to as the seed, serve as a nucleation site for R-loop formation27,28,29. These “seed” nucleotides are preordered in a binary complex structure29, and mismatches in the seed substantially inhibit activity and binding6,15,16,19. Recent work suggests that between the seed and PAM-distal region (bases ~5-17), AsCas12a R-loop formation may be governed primarily by DNA-RNA duplex strand displacement energetics, with minimal influence from enzyme contacts30. Sampling of partial R-loops is thought to be important for detection of mismatches31,32,33. Moreover, complete Cas12a R-loop formation leads to transient unwinding or “breathing” of DNA downstream of the end of the guide, allowing the target strand (TS) to be cleaved by the Cas12a active site30,32,34. Cas12a R-loop formation has been investigated using magnetic tweezers (MT) and single-molecule FRET (smFRET), supporting a dynamic picture of Cas12a R-loop formation and associated conformations32,35,36,37. However, single-molecule experiments on Cas12a have not yet directly observed or kinetically characterized a seed intermediate, and high-resolution measurements are needed to understand the kinetics and energetics of partial R-loops and downstream unwinding.

Magnetic tweezers are well suited for the study of Cas9 and Cas12a R-loop formation due to the ability to simulate cellular environments by supercoiling the DNA target. Cellular DNA is globally underwound38,39,40,41, and local topological stress is dynamically created by processes that mechanically deform DNA42,43. In a gene editing context, these topological features of DNA could have substantial impact on Cas9 and Cas12a function when binding or editing supercoiled loci. MT experiments have established that both Cas9 and Cas12a R-loop formation are facilitated by negative supercoiling32,44,45,46. Single-molecule and bulk biochemical assays have also established that Cas9 DNA interrogation is more promiscuous on underwound DNA, with complete R-loop formation and efficient cleavage possible in the presence of multiple mismatches45,46,47. The supercoiling sensitivity of Cas9 is exploited by an anti-CRISPR protein acrIIA22 that nicks plasmids to relieve torsional stress48, and supercoiling-dependent Cas9 promiscuity has been observed in genome-wide in vitro measurements supported by suggestive observations in living cells47. However, the effects of DNA supercoiling on Cas12a promiscuity have not been established, and measurements are needed to understand the interplay of supercoiling and mismatches during Cas12a R-loop formation and cleavage.

In this study, we use gold (Au) rotor bead tracking (AuRBT)49, a derivative of MT, to directly measure Cas12a R-loop formation at base pair resolution under biologically relevant supercoiling conditions. Previous AuRBT experiments with SpCas9 (hereafter Cas9) found that R-loop formation proceeds through a discrete intermediate corresponding to Cas9’s ~9 bp seed region, and showed that DNA supercoiling strongly modulates Cas9 activity and specificity by controlling R-loop dynamics45. Here, we apply this approach to AsCas12a and LbCas12a and find that Cas12a has a dynamic, multi-step pathway to full R-loop formation. Cas12a R-loops proceed through a complex series of intermediates, including a ~5 bp intermediate corresponding to the seed region and a ~17 bp intermediate that may correspond to a pre-cleavage conformation. As seen for Cas9, negative supercoiling can overcome mismatches to promote R-loop completion, leading to promiscuous cleavage of mismatched targets in a bulk biochemical assay. We summarize our results in the context of a four-state model that illuminates the dynamic, mechanical basis of increased specificity and decreased activity of Cas12a orthologs in comparison to Cas9, and further describes differences between Cas12a orthologs. Comparative models lay the groundwork for understanding the biophysics of specificity and predicting the genome editing activities of other RNA-guided genome editors.

Results

AuRBT measures R-loop formation by dCas12a

An assay based on gold rotor bead tracking (AuRBT) allows us to observe structural transitions during DNA interrogation by Cas12a (Fig. 1). In these experiments, a DNA tether containing a PAM and target sequence (Supplementary Fig. 1; Supplementary Table 1) is constrained between a cover glass and a paramagnetic bead, while a gold nanoparticle attached to the side of the DNA molecule is tracked at high speed to measure changes in torque and twist associated with DNA structural transitions (Fig. 1A). In the presence of Cas12a, R-loops are expected to form readily on negatively supercoiled tethers and collapse under positive supercoiling. Tether supercoiling is controlled by turning magnets positioned above the paramagnetic bead, and torque is measured via angular deflection of the rotor bead. Experiments begin at high overwinding, suppressing R-loop formation. We then add dCas12a, a nuclease-deficient mutant of Cas12a50, to make long and repeated measurements of R-loop dynamics on a single tether without DNA cleavage. After flowing dCas12a:gRNA into the channel, we slowly turn the magnets to unwind the tether (3 rpm). Initially, unwinding results in the linear torsional response of the DNA tether (Fig. 1B). Further unwinding permits binding of dCas12a and R-loop formation, which is observed as a sudden jump in torque. The torque jump corresponds to a change in equilibrium twist \(\Delta \theta_{0}\), which we express in units of base pairs (bp) by attributing the change to bubble formation due to strand separation of B-DNA. We then rewind the magnets to generate positive torque, leading to collapse of the R-loop. Magnets are cycled to repeatedly underwind and overwind the DNA, allowing measurement of many R-loop formation and collapse events (Fig. 1C). Our cycling protocols explore physiologically relevant torque values in both the negative and positive regime51,52,53, elucidating how Cas12a will interact with cellular DNA.
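The conversion from a torque jump to base pairs unwound can be illustrated with a minimal sketch. It is not the analysis code used in this study: it assumes the tether behaves as a linear torsional spring, uses a hypothetical stiffness value chosen only for illustration, and takes the helical repeat of B-DNA to be approximately 10.5 bp per turn.

```python
BP_PER_TURN = 10.5  # approximate helical repeat of B-DNA (bp per turn)


def bp_unwound_from_torque_jump(delta_torque, stiffness):
    """Convert a torque jump into base pairs unwound.

    delta_torque: torque after the jump minus torque before it (pN nm).
    stiffness: torsional spring constant of the tether (pN nm per rotation).
               This parameter is an assumption for illustration only.
    """
    if stiffness <= 0:
        raise ValueError("stiffness must be positive")
    # For a linear spring, torque = stiffness * (twist - equilibrium_twist).
    # At fixed imposed twist, a jump toward more positive torque therefore
    # means the equilibrium twist decreased (the DNA unwound).
    delta_theta0 = -delta_torque / stiffness  # rotations; negative = unwinding
    return -delta_theta0 * BP_PER_TURN


# Hypothetical numbers: a 4 pN nm jump on a tether of 2.1 pN nm per rotation.
example_bp = bp_unwound_from_torque_jump(delta_torque=4.0, stiffness=2.1)
print(f"{example_bp:.1f} bp unwound")
```

Expressing the result in bp follows the convention described above, in which the full change in equilibrium twist is attributed to strand separation of B-DNA.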

Fig. 1: Using AuRBT to detect Cas12a binding and R-loop formation.

A Schematic of rotor bead tracking assay. A torsionally constrained DNA tether is stretched between a cover slip and magnetic bead. An evanescent field generated by TIR illuminates the rotor bead. Torque (τ) is measured from the difference between magnet position (θ) and the angular deflection of the rotor bead (Ψ), reporting on structural transitions in the target sequence. B R-loop formation event in the presence of dLbCas12a. Measured torque (above) and calculated change in equilibrium twist (below) as a function of total twist imposed by rotating magnets. Raw data are shown in black; 1 s averaged traces are shown in red. The change in equilibrium twist is shown in units of base pairs unwound, assuming that changes are due to unwinding of B-DNA. R-loop formation produces a jump in torque and a corresponding step in equilibrium twist. C Torque vs. twist for unwinding and rewinding cycles in the presence of dAsCas12a. Magnets were rotated to cycle between positive and negative twist, driving repeated R-loop formation and collapse events. Raw data for a single cycle of unwinding (black) and rewinding (gray) are shown with overlaid time-averaged (1 s) unwinding (red-orange) and rewinding (blue-green) traces from the same cycle and two additional magnet cycles. Vertical arrows indicate locations of R-loop formation at negative torque (pink) and collapse at positive torque (blue), respectively. Diagonal arrows show the direction of magnet rotation during unwinding (black) and rewinding (gray). D Example R-loop formation event for dAsCas12a with a fully-matched (FM) guide RNA (gRNA). Additional panels show zoomed-in regions. Yellow lines show idealized traces generated by change-point analysis. Dashed magenta lines are at 5 bp, 17 bp, and 20 bp to guide the eye. E Example R-loop formation event for dLbCas12a with a FM sgRNA. Insets: From left to right, zoomed-in views of R-loop formation, hopping between distal R-loop states, and R-loop collapse.

For both Cas12a orthologs loaded with a fully matched (FM) guide for the chosen target sequence (sequence 1), the observed behavior is qualitatively similar to previous measurements of dCas945, showing transitions to long-lived states consistent with stable binding and ~20 bp R-loop formation (Fig. 1C–E). Under our experimental conditions, dCas12a R-loop formation and collapse typically occur at negative and positive torques, respectively. These large unwinding events are not seen in control experiments on bare DNA, with dCas12a protein alone, with guide RNA alone, or with guide RNA that is highly mismatched to the target DNA (Supplementary Fig. 2).

R-loop intermediates differ between Cas12a orthologs

The high resolution of AuRBT allows us to directly probe for the proposed seed intermediate, additional R-loop intermediates, and downstream DNA breathing. Example traces (Fig. 1D,E) and automated change point analysis54 show evidence of all of these features, although there are unexpected differences between orthologs. With FM guide RNA, dAsCas12a (but not dLbCas12a) shows clear short dwells in a ~5 bp intermediate during R-loop formation and collapse, consistent with the seed region (Fig. 1D). dAsCas12a also proceeds through a persistent distal intermediate with ~17 bp unwound, reaching 20 bp only at high values of negative twist (Fig. 1D). At the highest values of negative twist, the dAsCas12a (but not dLbCas12a) R-loop extends beyond 20 bp, consistent with downstream DNA breathing (Fig. 1D). In contrast, dLbCas12a with a FM guide RNA characteristically forms a full 20 bp R-loop without observable intermediates within the resolution of our assay during unwinding (Fig. 1E). During rewinding, R-loop collapse for dLbCas12a proceeds through reversible transitions to a discrete distal intermediate at ~17 bp, with many transitions between this intermediate and the full R-loop (Fig. 1E). Asymmetry between unwinding and rewinding, as seen in this example, can arise from mechanical loading that outpaces equilibration between some kinetic states55.

The differing behaviors of dAsCas12a and dLbCas12a with FM guides are summarized in histograms of AuRBT data (Fig. 2A,B: left plots, bp histogram) and the positions (in bp) and lifetimes of dwells identified by the Steppi automated change-point analysis algorithm (Fig. 2A,B: right plots, Steppi scatter plot). The distribution of scored dwells for dAsCas12a is complex, differing from the discrete, well-separated states seen with Cas945, but shows a clear cluster of dwells at ~5 bp consistent with the expected seed intermediate. This cluster is most apparent in the Steppi scatter plot, where short-lived scored events are more visible; transient states are less apparent in the histogram representation. The broad distribution of more highly unwound dwells includes a substantial population of events where the R-loop extends past 20 bp, consistent with breathing beyond the 20 bp R-loop34. With the FM guide RNA, dLbCas12a shows few detectable events at ~5 bp of unwinding, and a sharper population of ~20 bp R-loop states along with dwells in the incomplete ~17 bp R-loop state visited during rewinding.

Fig. 2: Analysis of R-loop formation by dCas12a across different orthologs, target sequences, and mismatches.

A Left: histogram of measured change in equilibrium twist as a function of the imposed twist on the tether, aggregated from all recorded binding events for dAsCas12a with a FM sgRNA. The plot transitions from unwinding to rewinding at a twist of −5. Right: scatter plot of lifetime and change in equilibrium twist for all Steppi-scored states across all binding events, colored by twist density at state onset. Boxes highlight clusters of dwells at ~5 bp unwound ascribed to the seed region (black) and at >20 bp unwound ascribed to downstream unwinding (magenta). B–H Same as (A) but with different gRNA, target sequence, and enzyme ortholog conditions. Boxes highlight clusters of dwells at ~20 bp ascribed to the full R-loop (red), ~17 bp ascribed to a distal intermediate (blue), and ~5 bp ascribed to the seed (black). See Supplementary Table 2 for trace statistics. Source data are provided for this figure.

Overall, AuRBT allows a detailed observation of stable Cas12a binding and R-loop formation on supercoiled DNA. Analysis of traces reveals dynamic R-loop formation involving multiple intermediates and reversible transitions. In previous experiments with Cas9 under similar conditions, R-loop dynamics showed a stereotypical state structure with a dominant fully unwound state flanked by brief dwells in a well-defined seed intermediate during formation and collapse45. Cas12a R-loops appear more heterogeneous than Cas9 R-loops, with more available R-loop states and different patterns of R-loop intermediates for different orthologs. Cas12a R-loop formation also generally appears mechanically weaker than Cas9: under the same conditions, dCas9 often opened complete R-loops against a backpressure of positive torque, in contrast to the measurements of dCas12a made here.

R-loop dynamics depend on target sequence

Cas12a cleavage efficiency is known to be dependent on the chosen target sequence56,57. We asked whether the landscape of R-loop states was dependent on the target sequence, or whether the states visited were a stereotypical property of the enzyme as previously reported for Cas945. We repeated AuRBT measurements with a second target sequence (sequence 2), chosen from a recent single-molecule magnetic tweezers study of Cas12a32. LbCas12a R-loop state structure is substantially different from sequence 1, with no distinct ~17 bp distal intermediate detected (Fig. 2D; Supplementary Fig. 3). R-loop formation on sequence 2 more closely approximates two-state behavior for this enzyme, with a dominant well-defined ~20 bp R-loop. AsCas12a shows more subtle differences between sequence 1 and sequence 2, with consistent features including the ~5 bp intermediate but stronger population of the complete or hyper-unwound (downstream breathing) R-loops on sequence 2 (Fig. 2C). Sequence-dependent R-loop dynamics seen here could relate to guide-dependent editing efficiency in genomic contexts57.

Mismatches can trap R-loops in intermediate states

Extensive mismatch analysis with Cas12a has been done in vivo and in vitro, characterizing the effect of single and double mismatches throughout the R-loop6,15,16,18. Mismatches throughout the R-loop can cause reduction in cleavage efficiency, with PAM-distal mismatches generally having the smallest effects. We used AuRBT to probe the effect of mismatches on Cas12a R-loop state structure and dynamics (Fig. 2E–H; Supplementary Fig. 3). We chose a single mismatch at the 8th base (8 MM) and a double mismatch in the 19th and 20th bases (19–20 MM) of sequence 1. Mismatches at these positions are cleavage competent in vivo, but at reduced efficiency6,15,16. As was the case with Cas945, we predicted that destabilization caused by these mismatches would reduce occupancy of fully unwound, cleavage-competent R-loop states, increasing occupancy of intermediates and reducing the torque stability of R-loops. Our single-molecule twist assays also probe the extent to which negative supercoiling can overcome the destabilizing effects of mismatches and allow complete R-loop formation for off-target guides.

As expected, both mismatches cause destabilization of states where the R-loop extends through the mismatched positions. Using a 19–20 MM guide RNA (Fig. 2E,F) with dLbCas12a, the 20 bp state is destabilized, making the ~17 bp population more prominent; under our conditions, the ~17 bp state is only populated at high negative twist. dLbCas12a also displays the ~5 bp seed intermediate when distal states are destabilized by mismatches, populating an intermediate that was not resolvable for this enzyme with the FM guide RNA. For dAsCas12a, the 19–20 MM highly destabilizes the fully open R-loop, and the complex heavily populates the seed intermediate. During unwinding, the R-loop becomes largely trapped at ~17 bp at high negative twist, with full R-loop formation occurring only transiently. The 8 MM guide (Fig. 2G,H) also causes significant trapping of dAsCas12a in the ~5 bp seed intermediate as expected. The intermediates are redistributed for LbCas12a with the 8 MM guide, where the most prominent distal intermediate shifts to ~13 bp. Across species and mismatch conditions tested, full R-loop formation is suppressed by mismatches. As seen with Cas9, sufficient negative torque can rescue full R-loop formation in most cases, although LbCas12a with the 19–20 MM guide is strongly trapped in the distal intermediate.

The results above show how the Cas12a R-loop landscape is reshaped for small, 1–2 mismatch perturbations. To investigate the response of Cas12a to more significantly off-target sequences, we conducted further experiments with a series of guides containing 4, 8 or 12 distal mismatches to the 20 nt target (Supplementary Fig. 4). For these experiments we chose LbCas12a and sequence 2, the enzyme and target sequence with the strongest complete R-loop formation under the fully-matched condition. For analogous guides containing fewer than 17 matched nucleotides, previous single-molecule FRET measurements found only transient associations of LbCas12a with the target sequence in the absence of supercoiling33. Under negative supercoiling, we find that even highly mismatched guides can stably form partial R-loops. For these mismatches, the most prominent distal R-loop intermediate is at ~13 bp unwound, as seen for LbCas12a 8 MM with sequence 1. As expected, higher negative torques are required to reach this intermediate when there are larger numbers of mismatches. The 9–20 MM and 13–20 MM conditions do not show substantial opening beyond the ~13 bp intermediate, while 17–20 MM can access more open states only at the highest negative torques.

A four-state model approximately describes R-loop formation

Noting the complexity of Cas12a R-loop formation, we developed a simplified model of discrete transitions by clustering and calculating transition rates between identified R-loop states. For each species, we used peaks in histograms of scored Steppi states unweighted by state lifetime (Supplementary Fig. 5) to seed k-means clustering of states scored by Steppi analysis. Our model approximates the complex state structure seen in both species using four clusters, defined by the number of base pairs unwound (BPU): a closed state (C; 0 BPU), a seed intermediate (I1; ~5 BPU), a distal intermediate (I2; ~17 BPU), and the fully open R-loop (O; ~20 BPU) (Fig. 3A). Dwells in these states were scored by merging raw Steppi dwells as described previously45 (Figs. 3, 4; Supplementary Figs. 6, 7), and transitions were scored between all states. Our analysis allows for all 12 possible transitions between defined states, although under each condition only a subset of possible transitions were observed.
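The clustering and transition-scoring steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in this study: the dwell values are made up, the k-means routine is a plain one-dimensional implementation seeded at the nominal state positions (0, 5, 17, and 20 bp), and consecutive dwells assigned to the same state are simply merged before transitions are counted.

```python
from collections import Counter
from itertools import permutations

STATE_NAMES = ["C", "I1", "I2", "O"]
SEED_CENTERS = [0.0, 5.0, 17.0, 20.0]  # nominal bp unwound for each state


def nearest(value, centers):
    """Index of the cluster center closest to value."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def kmeans_1d(values, centers, n_iter=50):
    """Seeded 1D k-means: returns (labels, final_centers)."""
    centers = list(centers)
    for _ in range(n_iter):
        labels = [nearest(v, centers) for v in values]
        updated = []
        for j, c in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster is empty.
            updated.append(sum(members) / len(members) if members else c)
        if updated == centers:
            break
        centers = updated
    return [nearest(v, centers) for v in values], centers


def count_transitions(labels):
    """Merge consecutive dwells in the same state, then count state changes."""
    merged = labels[:1]
    for lab in labels[1:]:
        if lab != merged[-1]:
            merged.append(lab)
    pairs = zip(merged, merged[1:])
    return Counter((STATE_NAMES[a], STATE_NAMES[b]) for a, b in pairs)


# Made-up dwell positions (bp unwound) in time order, for illustration only.
dwells_bp = [0.2, 4.8, 5.3, 16.6, 17.4, 20.1, 19.8, 17.1, 5.1, -0.1]
labels, centers = kmeans_1d(dwells_bp, SEED_CENTERS)
transitions = count_transitions(labels)

# Four states give 4 x 3 = 12 possible directed transitions.
all_possible = list(permutations(STATE_NAMES, 2))
for pair in all_possible:
    print(pair, transitions.get(pair, 0))
```

In this toy trace only 6 of the 12 directed transitions are visited, which mirrors the observation that only a subset of possible transitions appears under any one condition.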

Fig. 3: Kinetic analysis of state transitions during dLbCas12a R-loop formation.

A Schematic of 4-state model: Dwells are assigned to R-loop states corresponding to different ranges of unwinding, and all transitions between states are scored to obtain effective transition rates as a function of imposed twist. B Clustering and kinetic analysis for dLbCas12a with FM Seq 1 gRNA. Left: dwells from Fig. 2 were clustered into 4 states; merged dwells (see Supplementary Fig. 6) are plotted and color coded by state. The weighted average of the change in equilibrium twist for each state is labeled below in units of bp. Middle: Twist-dependent observed rates k are plotted for all 12 possible transitions between states, shown as ln(k) where k is the rate expressed in s⁻¹. Not all transitions are represented for each condition, due to omitting rates for undersampled bins (see Methods). Right: Rates are used to calculate twist-dependent equilibrium constants K. Best-fit lines are shown for data sets with at least 3 plotted points after omitting undersampled bins. C–E Same as (B) for other experimental conditions. Legends corresponding to all plots are shown in (E). All error bars represent standard errors (see Methods). See Supplementary Table 2 for trace statistics. Source data are provided for this figure.

Fig. 4: Kinetic analysis of state transitions during dAsCas12a R-loop formation.

A–D Analysis as in Fig. 3B–H, applying the model shown in Fig. 3A to measurements of dAsCas12a. Linear fits to lnK plots show trends in the expected direction except for the I1-I2 transitions in (B) (fit not shown; discussed in text) and (D). All error bars represent standard errors (see Methods). See Supplementary Table 2 for trace statistics. Source data are provided for this figure.

As in prior work, transitions were binned to calculate twist-dependent observed transition rates (Figs. 3, 4, Supplementary Fig. 7 middle) for transitions that were sufficiently sampled within each bin (see Methods). For most transitions, negative torque speeds up transitions to more open states (solid lines with negative slopes), and positive torque pushes transitions toward closed states (dashed lines with positive slopes) as expected. Forward and reverse observed rates for each pair of states were used to calculate apparent equilibrium constants, plotted for each species and RNA condition (Figs. 3, 4, Supplementary Fig. 7 right). We model transitions between R-loop intermediates as transitions between states with changes in equilibrium twist \(\Delta {\theta }_{0}\) and differences in free energy \(\Delta G\) on a DNA polymer with torsional stiffness \(\kappa\) (see Methods). As expected, most slopes are negative, indicating that negative torque drives equilibria to more open states. The different intermediate state structure induced by the sequence change complicates interpretation of the approximate four-state model; for AsCas12a with sequence 2 (Fig. 4B) some more highly unwound dwells are assigned to I1, and the apparent K for I1 ↔ I2 is unphysically lower under negative supercoiling, which may reflect limitations of the analysis.
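The step from binned rates to apparent equilibrium constants and their twist dependence can be sketched as below. This is an illustrative outline only: the rate values are invented, the function names are hypothetical, and the fit is an ordinary unweighted least-squares line, whereas the actual binning, error treatment, and physical model are those described in the Methods.

```python
import math


def ln_equilibrium_constants(k_forward, k_reverse):
    """Return {twist: ln K} for twist bins where both rates are available.

    K is taken as the forward rate divided by the reverse rate, so K > 1
    favors the more open state of the pair.
    """
    shared = sorted(set(k_forward) & set(k_reverse))
    return {t: math.log(k_forward[t] / k_reverse[t]) for t in shared}


def fit_line(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(xs)
    if n < 3:
        # Mirrors the figure convention of fitting only sets with >= 3 points.
        raise ValueError("need at least 3 points for a fit")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented rates (s^-1) in three twist bins, for illustration only.
twists = [-6.0, -4.0, -2.0]
k_forward = {t: math.exp(-0.25 * t) for t in twists}  # opening speeds up when underwound
k_reverse = {t: math.exp(0.25 * t) for t in twists}   # closing slows down when underwound

ln_K = ln_equilibrium_constants(k_forward, k_reverse)
slope, intercept = fit_line(list(ln_K), list(ln_K.values()))
print(f"slope = {slope:.2f} per unit twist, intercept = {intercept:.2f}")
```

A negative slope in this representation corresponds to the expected behavior, in which more negative twist shifts the equilibrium toward the more open state.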

The overall model is necessarily coarse-grained, and further limitations include grouping ensembles of conformations into states, missed transitions and skipped intermediates due to finite resolution (Supplementary Fig. 8), and possible hidden kinetic states including bound complexes without R-loops or conformational rearrangements without detectable changes in \(\Delta \theta_{0}\)58. The slopes of apparent lnK vs. twist plots generally have the expected sign, but often deviate from the expected magnitude based on the physical \(\Delta \theta_{0}\) between states, which may reflect these limitations of the model (Supplementary Table 4). Note also that while \(\Delta \theta_{0}\) is given in units of bp, this number could deviate from the number of bp in the heteroduplex for some states due to distortions in the geometry of the R-loop. With these caveats, the analysis captures much of the complex behavior of Cas12a R-loop formation. Mismatches disfavor more open states, reducing apparent equilibrium constants in comparison to FM conditions, and application of sufficient negative torque can then often overcome the destabilization from the mismatch, reflected in higher equilibrium constants. Our model allows for transitions between states along multiple pathways in which intervening states may or may not be visited; where possible, we have verified that equilibrium constants are consistent when calculated across different paths (Supplementary Fig. 9). We also performed an analysis that considers only transitions between adjacent states, which results in substantially similar plots of apparent equilibrium constants (Supplementary Figs. 10, 11).

Free energy landscapes visualize R-loop dynamics

The combined effect of torque and mismatches can be visualized in free energy landscapes for R-loop formation (Fig. 5A–D; Supplementary Fig. 12), which approximate multidimensional landscapes as 1D profiles along the R-loop progress coordinate and help display comparisons between orthologs, target sequences, and supercoiling conditions. Apparent equilibrium constants (Figs. 3, 4) are used to calculate free energy differences between states (see Methods; Supplementary Table 4). A comparison of FM with 8 MM for AsCas12a on relaxed (0 twist) DNA illustrates the effect of mismatches outside the seed region (Fig. 5A): the more distal I2 and O states are strongly and specifically destabilized. Negative supercoiling (-5 twist) tilts the landscape, overcoming the destabilization from the mismatch. The strongest effects of supercoiling are felt on the most open state, and the O state becomes accessible (Fig. 5B). Comparing orthologs (Fig. 5C, shown at -5 twist) illustrates the differences in landscapes for R-loop progression, including a more stable seed intermediate for AsCas12a. Target sequence comparison (Fig. 5D, shown for LbCas12a at -5 twist) shows that open R-loop states are more stable for sequence 2. Overall, these landscapes illustrate the dependence of Cas12a R-loop formation on both species of origin and target sequence, with mismatches causing destabilization that can be balanced by increased stability due to negative supercoiling.
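The relationship between apparent equilibrium constants and a one-dimensional landscape can be sketched with the standard relation ΔG = −k_BT ln K, accumulating free energy differences along the C → I1 → I2 → O coordinate. This is a minimal illustration with invented K values and a hypothetical helper name, not the procedure from the Methods; in particular it ignores barrier heights and any transitions that were not directly quantified.

```python
import math

STATES = ["C", "I1", "I2", "O"]


def landscape_from_equilibrium_constants(k_steps):
    """Build state free energies (in units of kBT) relative to the closed state.

    k_steps: equilibrium constants for the adjacent steps
             C->I1, I1->I2, I2->O, each defined as [more open] / [less open].
    """
    if len(k_steps) != len(STATES) - 1:
        raise ValueError("expected one K per adjacent pair of states")
    energies = [0.0]
    for k in k_steps:
        # Delta G = -kBT * ln K, so K > 1 makes the next state downhill.
        energies.append(energies[-1] - math.log(k))
    return dict(zip(STATES, energies))


# Invented equilibrium constants for a relaxed and an underwound condition.
relaxed = landscape_from_equilibrium_constants([0.1, 2.0, 0.5])
underwound = landscape_from_equilibrium_constants([1.0, 10.0, 4.0])

for name, landscape in (("relaxed", relaxed), ("underwound", underwound)):
    profile = ", ".join(f"{s}: {g:+.2f}" for s, g in landscape.items())
    print(f"{name:>10}  {profile}")
```

With these made-up numbers the open state lies above the closed state on relaxed DNA and below it on underwound DNA, which is the kind of tilt the landscapes are meant to display.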

Fig. 5: Cas12a R-loop free energy landscapes and supercoiling-dependent cleavage.

A–D Cartoons of free energy landscapes (ΔG in units of kBT), depicted based on the results plotted in Figs. 3, 4 (see Methods; dashed lines denote transitions that were not directly quantified under the twist condition shown). A Example of 8 MM effect on AsCas12a R-loop landscape. B Negative supercoiling tilts the 8 MM landscape, making open states accessible even in the presence of the mismatch. C Comparison between AsCas12a and LbCas12a. D Comparison between Seq 1 and Seq 2 with LbCas12a. E Example gels and quantification of cleavage assays for supercoiled and nicked plasmids incubated with WT Cas12a under all mismatch conditions. % cleavage of supercoiled (blue) and nicked (red) plasmids after 5 min of incubation with the indicated WT Cas12a complex is plotted as mean ± s.d. across three independent experiments. Source data are provided for this figure.

Supercoiling favors promiscuous cleavage by Cas12a

The model described above makes predictions for the effects of supercoiling on bulk biochemical assays. Previously, wild-type (WT) Cas12a with a FM guide has been shown to have accelerated cleavage on supercoiled plasmids32,44. Based on the tilting effect of torque on energy landscapes to favor open R-loop states, we predicted that WT Cas12a with mismatched gRNA would be able to cleave supercoiled DNA in cases where cleavage of relaxed DNA is inhibited by the mismatch(es). We tested this prediction by comparing bulk cleavage of supercoiled plasmids and pre-nicked, torsionally relaxed plasmids (Fig. 5E; Supplementary Fig. 13). As expected, cleavage efficiency is high with FM guides for both orthologs on either relaxed or supercoiled DNA. The 8 MM guide displays substantially lower cleavage on the nicked, torsionally relaxed plasmid, but robust cleavage on the supercoiled plasmid, showing that negative supercoiling overcomes the destabilization due to the mismatch. Both orthologs also show reduced cleavage with the 19–20 MM gRNA on relaxed DNA, although the reduction is more pronounced with AsCas12a under our assay conditions. As expected, negative supercoiling can substantially recover cleavage activity with the 19–20 MM gRNA. Generally, as seen with Cas945, negative torque increases cleavage promiscuity by allowing increased cleavage in the presence of mismatches.

Discussion

We investigated CRISPR-AsCas12a and -LbCas12a R-loop formation using high-resolution single-molecule torque spectroscopy measurements to determine DNA interrogation pathways and rates in fully matched versus mismatched conditions. With fully matched sequences, catalytically inactivated Cas12a (dCas12a) orthologs stably bind to negatively supercoiled DNA, and R-loops collapse under positive supercoiling. We also detected a ~5 bp intermediate that forms between the guide RNA and the DNA target, consistent with both a Cas12a binary complex structure and an R-loop intermediate structure determined previously by cryo-EM29,30. Our direct and dynamic measurements of this ~5 bp “seed” interaction confirm the importance of these pre-ordered nucleotides in the guide RNA as an initial state during R-loop formation.

Our data support a dynamic, multi-state R-loop formation process, where state structure and occupancy are sequence-, ortholog-, and mismatch-dependent. A four-state model describes much of the behavior we observe, supported by bulk cleavage experiments that confirm predictions of supercoiling-induced promiscuity. This model provides a framework for understanding how Cas12a R-loop formation is favored and more promiscuous on supercoiled DNA. Previously, off-target Cas12a binding has also been observed on DNA under high tensions, which may occur due to related mechanisms59,60. Comparing R-loop states and mismatch sensitivity of Cas9, LbCas12a, and AsCas12a illustrates differences in behavior relevant to genome editing contexts (Fig. 6A, B). Although base pair mismatches impede initial formation of the RNA-DNA seed interaction by all three enzymes, Cas9 seed formation is more stable and more favored by negative torque than Cas12a. Once the seed interaction forms, Cas9 can complete the R-loop in a strongly downhill process even on relaxed DNA. In contrast, Cas12a proceeds through an additional intermediate, with weaker sampling of the fully open R-loop in the absence of negative supercoiling, particularly for AsCas12a where the complete R-loop is thermodynamically uphill from the distal intermediate. A recent study also reported incomplete population of the full R-loop with a fully matched guide, with fluorescence measurements showing equilibrium coexistence of partial and complete R-loop states31. On DNA under physiological levels of negative supercoiling, R-loop completion by Cas9 is so strongly favored that it can occur in the presence of eight PAM distal mismatches45. While negative supercoiling favors the complete R-loop in Cas12a orthologs, distal intermediates remain accessible, and R-loop completion remains significantly impaired by just two PAM distal mismatches and strongly suppressed by larger numbers of mismatches. The downhill R-loop propagation by Cas9 has been previously described as “excess” unwinding activity that can contribute to promiscuous cleavage31. Overall, the more complex and less favorable energetic landscape for Cas12a leads to sensitive and reversible sampling of the R-loop, enabling a late decision point for productive R-loop completion as suggested by earlier work19,30.

Fig. 6: Model comparing Cas12a and SpCas9.

A Cartoons of R-loop states and the effects of mismatches and torque on R-loop state transitions. Arrows represent effects of perturbations due to mismatches or negative torque. B Landscapes of Cas12a orthologs in comparison to SpCas9 with FM guides on relaxed (0 twist) and negatively supercoiled (−5 twist) DNA molecules. Cas12a landscapes are as in Fig. 5A–D, reproduced here for comparison with Cas9. The Cas9 landscapes are adapted from ref. 45.

Breathing of DNA downstream of the target sequence34 has been observed experimentally in AsCas12a, a finding attributed to the guide RNA having its 3’ end located at the distal end of the R-loop. Downstream breathing is thought to be required for cleavage of the target strand due to the ssDNA substrate preference of the Cas12a active site. At high values of negative torque with AsCas12a, our experiments revealed brief events in which the R-loop extends past 20 bp. This represents a significant difference from Cas9, which does not require or induce downstream DNA breathing. Extra unwinding is much rarer with LbCas12a in our experiments. Naqvi et al. observed extra unwinding with WT LbCas12a in their MT assay in the absence of Mg2+ (ref. 32), so this behavior may be sensitive to experimental conditions in addition to the chosen ortholog61.

Cas12a has been reported to require a minimum of 17 bp of matching guide RNA for long-lived stable binding and cleavage, after which R-loop-protein contacts in the distal region stabilize the full R-loop30,33. We observe a distal intermediate at ~17 bp with both orthologs, supporting a checkpoint as the R-loop nears completion, where interactions of the protein with the R-loop stabilize the intermediate and induce conformational changes that prepare Cas12a for cleavage30,33. Even after reaching the ~17 bp intermediate, further R-loop extension is needed to fully stabilize and activate the enzyme, leading to the strong effect of PAM distal mismatches in the Cas12a R-loop in our experiments and others15,16,18,33. LbCas12a is largely arrested in the ~17 bp intermediate in the presence of 19–20 MM, and the response of AsCas12a to these mismatches is even more striking, with surprising depopulation of the ~17 bp intermediate in favor of the seed intermediate, inviting further investigation. The ~17 bp kinetic intermediate observed here corresponds well with a structurally observed state associated with Rec2 docking30, which was produced with a guide matching positions 1–16 of the target but producing an additional potential 17th noncanonical base pair. The prominence of distal intermediates, which was not seen with Cas945, contributes to a picture of late steps in R-loop completion that distinguish Cas12a from Cas913,14,62 and may underlie specificity differences.

This study highlights the exquisite sensitivity of Cas12a orthologs to both DNA torque and guide RNA-DNA mismatches. Given Cas12a’s naturally high specificity, it is a good target for continued engineering efforts. We show the unique ability of AuRBT to probe the effect of DNA supercoiling on R-loop states with sufficient spatiotemporal resolution to distinguish between states separated by small numbers of base pairs. Expanding the use of high-resolution torque spectroscopy to other RNA-guided nucleoproteins, engineered Cas9s and Cas12s, and additional Cas orthologs will provide the information necessary to understand mechanistic reasons for specificity and activity differences. A closely related approach was recently used to map the R-loop landscape of Cascade63. More detailed models of R-loop formation kinetics may be informed by the measurements reported here, as in prior work64,65, and may help resolve remaining limitations of the coarse-grained model we have presented. Models further describing sensitivity to DNA supercoiling and mismatches could inspire new engineering strategies to bias enzymes to be more active and/or specific. The continuous discovery and augmentation of new Cas enzymes will require direct and detailed biophysical comparisons to inform optimal use practices in research and medicine.

Methods

Protein:RNA complexes and nanoparticles

AuRBT experiments were performed using 2 nM nuclease-deficient Lachnospiraceae bacterium Cas12a or Acidaminococcus sp. Cas12a with their respective guide RNAs. Complexes were formed by mixing dCas12a:RNA (D832A LbCas12a, D908A AsCas12a) at 100 nM:500 nM in C12T buffer and incubating at 37 °C for 10 min prior to being diluted and introduced to the DNA tethers in the experimental flow cell. Guide RNAs were purchased from IDT. See Supplementary Table 1 for RNA sequences. Rotors were 100 nm gold nanospheres (Cytodiagnostics, AC-80-04-15). Protein was prepared as follows8,34: An E. coli BL21 Star(DE3) culture expressing Cas12a was lysed in lysis buffer (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 0.5 mM PMSF, 10 tablets/L cOmplete EDTA-free protease inhibitor cocktail (Roche), 0.25 mg/mL chicken egg white lysozyme (Sigma-Aldrich)). Clarified lysate was loaded onto a Ni-NTA column, which was then washed with wash buffer (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 5% glycerol, 20 mM imidazole). Protein was eluted with Ni-NTA elution buffer (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 5% glycerol, 300 mM imidazole). TEV protease cleavage was performed overnight while dialyzing against dialysis buffer (50 mM HEPES (pH 7.5), 250 mM NaCl, 1 mM TCEP, 5% glycerol). Dialyzed protein was then loaded onto a HiTrap Heparin HP column and eluted with a gradient from low-salt ion exchange buffer (50 mM HEPES (pH 7.5), 250 mM KCl, 1 mM TCEP, 5% glycerol) to high-salt ion exchange buffer (50 mM HEPES (pH 7.5), 1 M KCl, 1 mM TCEP, 5% glycerol). Peak fractions were concentrated and purified on a Superdex 200 Increase 10/300 size exclusion column in size exclusion buffer (20 mM HEPES (pH 7.5), 200 mM KCl, 1 mM TCEP, 5% glycerol).

DNA tether construction

Tethers were prepared by ligation of digested, PCR-generated pieces, as previously described49,66,67. The portion of the tether below the bead was designed with only one canonical PAM site associated with the intended target sequence. See Supplementary Table 1 and Supplementary Fig. 1 for sequences of tether pieces and schematics. Sequence 1 was adopted from ref. 34, while sequence 2 was used in a recent single-molecule study of LbCas12a32. DNA primers were purchased from IDT.

Single-molecule instrumentation and data collection

Experiments were conducted on custom-built microscopes for multimodal single-molecule tracking and manipulation49,66,68 using previously established protocols45. The rotor bead was tracked at 5 kHz using evanescent darkfield scattering. All AuRBT experiments were conducted in C12T (20 mM Tris-HCl, 150 mM KCl, 5 mM MgCl2, 1 mM TCEP) buffer supplemented with 0.2% Tween-20 (Sigma, P9416) and 0.2 mg/mL BSA (Invitrogen, 15561-020). C12T was prepared as a 5x stock and frozen. Fresh C12T was made from frozen stock for each experiment. Tethers were held at 5 pN of tension during experiments. DNA twist was introduced by rotating the magnets at 3 RPM. Three different twist cycling protocols were used, ramping the DNA twist between maximum and minimum values of (+12.5, −5), (+7.5, −5), and (+7.5, −4.5). Torque was calculated from the angular deflection of a transducer segment of torsional stiffness 0.26 pN nm/rad. The zero-torque angular position of the magnets was calculated as the position of maximum extension of the DNA tether at low force67,69 and this zero was subtracted to obtain the reported twist values. A software delay between recorded and actual magnet angle was left uncorrected because of negligible effects on the results for the slow magnet rotation speed used in this study. Each data set was taken on at least two tethers (Supplementary Table 2) using two different sgRNA preparations from IDT (except for each ortholog with sequence 2).
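As a rough illustration of the quantities in this paragraph, the following Python sketch converts an angular deflection of the transducer segment into torque using the stated stiffness, and computes the duration of one twist ramp at 3 RPM. The function names and the sign convention (positive deflection gives positive torque) are assumptions made for illustration; this is not the instrument-control code.

```python
import math

TRANSDUCER_STIFFNESS = 0.26  # pN nm/rad, value stated in the text
MAGNET_SPEED_RPM = 3.0       # magnet rotation speed stated in the text


def torque_from_deflection(deflection_rad, stiffness=TRANSDUCER_STIFFNESS):
    """Torque (pN nm) from the angular deflection (rad) of the transducer.

    Assumed sign convention: positive deflection -> positive torque.
    """
    return stiffness * deflection_rad


def ramp_duration_s(twist_max, twist_min, speed_rpm=MAGNET_SPEED_RPM):
    """Time (s) needed to ramp the magnets between two twist values (turns)."""
    return abs(twist_max - twist_min) / speed_rpm * 60.0


if __name__ == "__main__":
    # A deflection of one radian corresponds to 0.26 pN nm of torque.
    print(torque_from_deflection(1.0))
    # Ramp from +12.5 to -5 turns: 17.5 turns at 3 RPM.
    print(ramp_duration_s(12.5, -5.0))
```

For the widest protocol (+12.5 to −5 turns), a single ramp spans 17.5 turns and therefore takes about 350 s at 3 RPM.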

Bulk DNA cleavage assays

Negatively supercoiled plasmid for bulk cleavage experiments was isolated from Escherichia coli. The Cas12a target site was cloned by Golden Gate assembly into the pGGAselect vector (https://www.addgene.org/195714/). Nicked plasmids were generated by treating the supercoiled plasmid with Nb.BssSI nickase. The nick was placed at a site separate from the Cas12a cleavage site. Reactions contained 1 nM DNA plasmid, 50 nM Cas12a, and 100 nM gRNA in C12T buffer with 5% glycerol. Protein:RNA complexes were incubated for 10 min at 37 °C before being combined with the DNA template. Reactions were run for 5 min at 37 °C, then stopped and denatured by adding 25 nM EDTA and 1x TBE-Urea sample buffer (Thermo Fisher Scientific, LC6876), then heating to 70 °C for 5 min. Cleavage products were resolved on 0.8% agarose gels stained with SYBR Gold (Thermo Fisher Scientific, S11494) and imaged on a Typhoon FLA 9500 gel scanner (GE Healthcare).

Data analysis

Rotor bead tracking and analysis were performed as previously described45. We expressed torque changes due to Cas12a R-loop formation in units of base pairs, assuming that any change in equilibrium twist (\({\Delta \theta }_{0}\)) of the tether is the result of bubble formation on B-DNA (10.5 bp per turn):

$${\Delta \theta }_{0}\left({BPU}\right)=10.5 * \frac{\tau -{\tau }_{B}}{2\pi \kappa }$$

where \(\tau\) is the measured torque, \({\tau }_{B}\) is the expected torque on a B-DNA tether, and \(\kappa\) is the torsional stiffness of the tether. Unlike prior work45, displayed BPU histograms were normalized for each twist bin according to the number of contributing trajectories in that bin.
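The conversion above can be sketched in a few lines of Python. This is a minimal illustration of the formula only, not the authors' analysis code; the function name and the numerical stiffness used in the example are hypothetical.

```python
import math

BP_PER_TURN = 10.5  # helical repeat of B-DNA used in the text


def torque_to_bpu(tau, tau_b, kappa):
    """Change in equilibrium twist expressed in base pairs (BPU).

    tau   : measured torque (pN nm)
    tau_b : torque expected for a bare B-DNA tether (pN nm)
    kappa : torsional stiffness of the tether (pN nm/rad)
    """
    delta_turns = (tau - tau_b) / (2.0 * math.pi * kappa)
    return BP_PER_TURN * delta_turns


if __name__ == "__main__":
    kappa = 0.1  # hypothetical stiffness, for illustration only
    # A torque offset of 2*pi*kappa corresponds to one full turn, i.e. 10.5 bp.
    print(torque_to_bpu(2.0 * math.pi * kappa, 0.0, kappa))
```

By construction, a torque offset equal to \(2\pi \kappa\) maps to one helical turn, or 10.5 bp.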

R-loop states were identified using the Steppi change point analysis algorithm54. Angular noise was modeled as an Ornstein-Uhlenbeck process. Coupling and stiffness parameters were fixed globally by fitting to a short segment of data for each trace where only bare DNA was present, and change point times and mean unwinding levels were free parameters for each dwell. Steppi overlays presented in the paper are unmodified idealized output from the algorithm.

Twist-dependent transition rates, apparent equilibrium constants, and free energies were calculated45 using an approach adapted from an earlier procedure for analyzing force-dependent protein unfolding58. Briefly, the number of transitions between states i and j in each twist bin was counted and normalized to obtain each rate kij. If a twist bin had <3 sampled transitions, it was excluded from analysis. Twist-dependent state lifetimes (Supplementary Fig. 8), which correspond to the inverse of the sum of rates leaving a given state, were similarly calculated from the total number of scored transitions leaving each state in each twist bin, omitting bins with fewer than 3 total observed transitions. Standard errors of transition rates were calculated by assuming Poisson statistics for transition events.
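The counting procedure described above might look roughly like the following Python sketch. It assumes that counts are normalized by the total time spent in the originating state within each twist bin, which the text does not state explicitly; the input format and function names are hypothetical.

```python
import math
from collections import Counter

MIN_TRANSITIONS = 3  # bins with fewer transitions are excluded (from the text)


def transition_rates(transitions, time_in_state):
    """Estimate k_ij per twist bin with Poisson standard errors.

    transitions   : list of (twist_bin, i, j) tuples, one per scored transition
    time_in_state : dict mapping (twist_bin, i) -> total time (s) in state i
                    (assumed normalization; hypothetical input format)
    Returns a dict mapping (twist_bin, i, j) -> (rate, standard_error) in 1/s.
    """
    rates = {}
    for (twist_bin, i, j), n in Counter(transitions).items():
        if n < MIN_TRANSITIONS:
            continue  # too few events in this bin
        total_time = time_in_state[(twist_bin, i)]
        # Poisson statistics: the standard error of a count n is sqrt(n).
        rates[(twist_bin, i, j)] = (n / total_time, math.sqrt(n) / total_time)
    return rates


def state_lifetimes(transitions, time_in_state):
    """Lifetime of each state per bin: inverse of the summed exit rates."""
    leaving = Counter((twist_bin, i) for twist_bin, i, _ in transitions)
    return {
        key: time_in_state[key] / n
        for key, n in leaving.items()
        if n >= MIN_TRANSITIONS
    }
```

For example, four C → I1 transitions and two C → I2 transitions observed during 8 s in state C within one bin give a C → I1 rate of 0.5 ± 0.25 s⁻¹, no reported C → I2 rate (fewer than 3 events), and a lifetime of 8/6 s for state C.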

Apparent equilibrium constants were then calculated as

$${K}_{{ij}}=\frac{{k}_{{ij}}}{{k}_{{ji}}}$$

As described previously45, transitions between R-loop intermediates were modeled as transitions between states with changes in equilibrium twist \(\Delta {\theta }_{{ij}}\) and differences in free energy \(\Delta {G}_{{ij}}\) on a DNA polymer with torsional stiffness \(\kappa\). Model parameters are derived from a linear fit of \({{\mathrm{ln}}}({K}_{{ij}})\) as a function of imposed twist \(\theta\) :

$${{{\mathrm{ln}}}}({K}_{{ij}})={\beta \kappa \varDelta \theta }_{{ij}}\theta -{\beta \varDelta G}_{{ij}}(0)$$
$${{\rm{where}}} \, \beta=\frac{1}{{k}_{b}T}.$$

Estimated free energy differences were obtained as \(\Delta {G}_{{ij}}=-{k}_{b}{{{\rm{T}}}}{{{\mathrm{ln}}}}({K}_{{ij}})\).
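A minimal sketch of the linear fit described above is given below, using NumPy. It assumes an unweighted least-squares fit, twist supplied in turns and converted to radians, and a thermal energy of about 4.1 pN nm; these choices and the function name are illustrative assumptions rather than details taken from the analysis.

```python
import numpy as np

KBT_PN_NM = 4.1  # approximate thermal energy near room temperature (assumed)


def fit_equilibrium(twist_turns, ln_k, kappa, kbt=KBT_PN_NM):
    """Fit ln(K_ij) = beta*kappa*dtheta_ij*theta - beta*dG_ij(0).

    twist_turns : imposed twist for each bin (turns)
    ln_k        : ln(K_ij) for each bin
    kappa       : torsional stiffness (pN nm/rad), angles in radians
    Returns (dtheta_ij in rad, dG_ij(0) in units of k_b*T).
    """
    theta_rad = 2.0 * np.pi * np.asarray(twist_turns, dtype=float)
    slope, intercept = np.polyfit(theta_rad, np.asarray(ln_k, dtype=float), 1)
    dtheta_rad = slope * kbt / kappa  # slope = beta * kappa * dtheta_ij
    dg0_kbt = -intercept              # intercept = -beta * dG_ij(0)
    return dtheta_rad, dg0_kbt
```

The fitted \(\Delta {\theta }_{{ij}}\) can be converted from radians to base pairs by dividing by \(2\pi\) and multiplying by 10.5, following the B-DNA convention used above.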

Data obtained under all presented AsCas12a R-loop conditions were clustered into 4 states using k-means clustering seeded with peak locations observed in 1D histograms (Supplementary Fig. 4) of identified R-loop states. For LbCas12a with a FM sgRNA, state boundaries were adjusted manually to match state definitions in other data. Reported BPU averages were calculated as a time-weighted average over the dwells in each state (Supplementary Table 3).
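The clustering and averaging steps can be illustrated with the short Python sketch below. It implements plain one-dimensional k-means (Lloyd's algorithm) initialized at user-supplied peak locations, followed by a time-weighted mean per state. Clustering on the mean unwinding level of each dwell, the seed values in the example, and the function names are assumptions for illustration.

```python
import numpy as np


def seeded_kmeans_1d(values, seeds, max_iter=100):
    """1D k-means (Lloyd's algorithm) started from given seed centers."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(seeds, dtype=float).copy()
    for _ in range(max_iter):
        # Assign each dwell to the nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (keep it if empty).
        updated = np.array([
            values[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        converged = np.allclose(updated, centers)
        centers = updated
        if converged:
            break
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


def time_weighted_means(values, durations, labels, n_states):
    """Mean unwinding level per state, weighted by dwell duration."""
    values = np.asarray(values, dtype=float)
    durations = np.asarray(durations, dtype=float)
    labels = np.asarray(labels)
    means = np.full(n_states, np.nan)
    for k in range(n_states):
        mask = labels == k
        if durations[mask].sum() > 0:
            means[k] = np.average(values[mask], weights=durations[mask])
    return means


if __name__ == "__main__":
    dwell_bpu = [0.1, -0.2, 5.2, 4.8, 17.1, 16.9, 20.3, 19.7]  # made-up dwells
    dwell_s = [1.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0]         # made-up durations
    labels, centers = seeded_kmeans_1d(dwell_bpu, seeds=[0, 5, 17, 20])
    print(labels, centers)
    print(time_weighted_means(dwell_bpu, dwell_s, labels, 4))
```

Seeding with histogram peaks fixes the correspondence between cluster index and physical state, which randomly initialized k-means would not guarantee.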

Landscape construction

Cartoons of free energy landscapes (Fig. 5; Supplementary Fig. 12) were drawn with reference to the calculations described above. In regions where data were insufficient for linear fits, traces to guide the eye were informed by individual data points from the kinetic and equilibrium analysis (Figs. 3 and 4) and intuition from the model and dataset as outlined below.

Within each cartoon landscape, the well positions and free energies, transition state locations, and relative barrier heights were derived from model parameters. The shapes of the curves between peaks and valleys were drawn for illustration and do not convey physical meaning; dashed portions of traces are also not rigorously derived from data, and were drawn in (often using other data sets as a guide) to connect sampled regions together.

Well positions correspond to the mean unwinding level (in bp) for each state, averaged over all dwells and weighted by lifetime (Supplementary Table 3).

Well free energies were calculated from transition equilibrium constants (Supplementary Table 4). Adjacent transitions (C ↔ I1, I1 ↔ I2, I2 ↔ O) were used to calculate plotted free energies in cases where they were adequately sampled and fit. In cases where adjacent transitions were not sufficiently sampled, but alternative paths (Supplementary Fig. 9) were scored, the alternative paths were used to calculate \(\Delta {G}_{{ij}}\). Where free energies could be calculated along multiple paths, we also performed consistency checks (Supplementary Fig. 9) to confirm that equivalent results were obtained using alternative paths.
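The idea behind the path consistency check can be sketched as follows: the free energy difference obtained from a direct transition (for example C ↔ I2) should match the sum of free energy differences along an indirect path (C ↔ I1 followed by I1 ↔ I2). The Python sketch below works in units of \({k}_{b}T\); the tolerance and function names are hypothetical and are not taken from the analysis.

```python
import math


def delta_g_kbt(k_eq):
    """Free energy difference (in k_b*T) from an equilibrium constant K_ij."""
    return -math.log(k_eq)


def path_free_energy(k_eqs):
    """Sum of free energy differences (in k_b*T) along consecutive steps."""
    return sum(delta_g_kbt(k) for k in k_eqs)


def paths_consistent(direct_k, path_ks, tol_kbt=0.5):
    """True if direct and indirect estimates agree within tol_kbt (assumed)."""
    return abs(delta_g_kbt(direct_k) - path_free_energy(path_ks)) <= tol_kbt


if __name__ == "__main__":
    # Made-up equilibrium constants: K_CI1 = 0.5 and K_I1I2 = 0.2 imply
    # K_CI2 = 0.1 if the two paths are thermodynamically consistent.
    print(path_free_energy([0.5, 0.2]), delta_g_kbt(0.1))
    print(paths_consistent(0.1, [0.5, 0.2]))
```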

Barrier heights are depictions of measured transition rates \({k}_{{ij}}\) and are represented as \(-{k}_{b}T\,{{\mathrm{ln}}}({k}_{{ij}})+C\), where \(C=7\,{k}_{b}T\) is an arbitrary constant chosen for readable display.

Transition state locations were estimated by fitting a selected approximately linear portion of each \({{\mathrm{ln}}}({k}_{{ij}})\) vs twist plot, assuming:

$${{{\mathrm{ln}}}}\left({k}_{{ij}}(\theta )/{k}_{{ij}}(0)\right)=\beta \kappa {\Delta \theta }_{{ij}}^{{{\dagger}} }\theta$$

The \({\Delta \theta }_{{ij}}^{{{\dagger}} }\) fit values were then scaled for display, so that the fractional positions of the displayed transition states correspond to \({\Delta \theta }_{{ij}}^{{{\dagger}} }/\Delta {\theta }_{{ij}}\) where \(\Delta {\theta }_{{ij}}\) is obtained from the equilibrium fits described above. Scaling was applied since fit values of \(\Delta {\theta }_{{ij}}\) did not reliably correspond to the physical differences in \(\Delta {\theta }_{0}\) values used to plot well positions.
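The display quantities described in this section can be sketched in Python as follows. Because the kinetic and equilibrium fits share the same prefactor \(\beta \kappa\), the fractional transition state position reduces to the ratio of the two fitted slopes. The function names, the unweighted fits, and the assumption that rates are expressed in s⁻¹ are illustrative choices, not details confirmed by the text.

```python
import numpy as np

DISPLAY_OFFSET_KBT = 7.0  # arbitrary constant C from the text, in k_b*T


def barrier_height_kbt(rate):
    """Displayed barrier height, -ln(k_ij) + C, in units of k_b*T."""
    return -np.log(rate) + DISPLAY_OFFSET_KBT


def transition_state_fraction(twist, ln_rate, ln_k_eq):
    """Fractional transition state position between states i and j.

    twist   : imposed twist for each bin (any consistent angular unit)
    ln_rate : ln(k_ij) per bin, over an approximately linear region
    ln_k_eq : ln(K_ij) per bin
    Returns dtheta_dagger / dtheta_ij; beta*kappa cancels in the ratio.
    """
    twist = np.asarray(twist, dtype=float)
    slope_rate, _ = np.polyfit(twist, np.asarray(ln_rate, dtype=float), 1)
    slope_eq, _ = np.polyfit(twist, np.asarray(ln_k_eq, dtype=float), 1)
    return slope_rate / slope_eq


if __name__ == "__main__":
    twist = np.array([-5.0, -4.0, -3.0, -2.0, -1.0])  # made-up twist bins
    ln_k_eq = -2.0 * twist - 3.0                      # made-up equilibrium data
    ln_rate = -0.5 * twist + 1.0                      # made-up kinetic data
    print(transition_state_fraction(twist, ln_rate, ln_k_eq))
    print(barrier_height_kbt(1.0))
```

With these made-up inputs, the transition state sits a quarter of the way from state i to state j, and a rate of 1 s⁻¹ is drawn as a barrier of 7 \({k}_{b}T\).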

Detailed notes are given below for differing approaches used for differing enzyme and mismatch conditions dependent on data availability.

AsCas12a FM sequence 1

All adjacent transitions were well-sampled.

AsCas12a 8MM sequence 1

All adjacent transitions were well-sampled; the position of the dashed I1 ↔ I2 barrier was placed arbitrarily due to the unphysical slightly positive slope of \({{\mathrm{ln}}}({K}_{I1I2})\) vs twist and associated negative slope of \({{\mathrm{ln}}}({k}_{I1I2})\) vs twist.

LbCas12a FM sequence 1

At 0 twist, most transitions are not sampled, so C ↔ I1 and I1 ↔ I2 transitions were drawn in using data from LbCas12a 19–20 MM. At -5 twist, single data points from the nearby final twist bin centered at -4.5 were used to calculate \(\Delta {G}_{{CI}1}\) and \(\Delta {G}_{{CI}2}\), with transition state heights and locations for C ↔ I1 and I1 ↔ I2 drawn in using LbCas12a 19–20 MM data.

LbCas12a FM sequence 2

Linear fits of C ↔ I1, C ↔ I2, and I2 ↔ O were used to calculate the relative well positions of all states. The dashed I1 ↔ I2 barrier was drawn in using LbCas12a 19–20 MM data. The dashed I2 ↔ O barrier was drawn arbitrarily, due to the unusual positive slope of \({{\mathrm{ln}}}({k}_{I2O})\) vs twist.

LbCas12a 8 MM sequence 1

All adjacent transitions were well-sampled.

LbCas12a 19–20 MM sequence 1

Linear fits of C ↔ I1 and I1 ↔ I2 were used to calculate landscapes, with the I2 ↔ O transition omitted due to lack of population in the O state.

Data selection

Tethers were selected for displaying circular RB orbits and the expected torsional response of the complete torsionally constrained DNA construct. After Steppi processing, regions of the trace containing apparent R-loop events were selected for kinetic analysis. Unlike prior work45, flanking regions around R-loop events were also included for explicit representation of ~0 BPU closed states, which appear in the displayed scatter plots of scored dwells. Regions of the traces containing tracking artifacts (due to RB sticking, background diffusing nanoparticles, or occasional RB crossings through the center of the orbit49) were excluded from kinetic analysis. Regions of each trace outside of the scored R-loop events were assumed to be in the closed state for kinetic analysis, and complete datasets including these regions were used to generate BPU histograms for each dCas12a:gRNA condition.

Software used

Single-molecule data were collected using MATLAB v2017b for instrument control, and analyzed using MATLAB v2024b. Figures were generated using MATLAB v2024b and Inkscape v1.2.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.