Abstract
CASTs use both CRISPR-associated proteins and Tn7-family transposons for RNA-guided vertical and horizontal transmission. CASTs encode minimal CRISPR arrays but cannot acquire new spacers. Here, we report that CASTs can co-opt defense-associated CRISPR arrays for horizontal transmission. A bioinformatic analysis shows that CASTs co-occur with defense-associated CRISPR systems, with the highest prevalence for type I-B and type V CAST sub-types. Using an E. coli quantitative transposition assay and in vitro reconstitution, we show that CASTs can use CRISPR RNAs from these defense systems. A high-resolution structure of the type I-F CAST-Cascade in complex with a type III-B CRISPR RNA reveals that Cas6 recognizes direct repeats via sequence-independent π–π interactions. In addition to using heterologous CRISPR arrays, type V CASTs can also transpose via an unguided mechanism, even when the S15 co-factor is over-expressed. Over-expressing S15 and the trans-activating CRISPR RNA or a single guide RNA reduces, but does not abrogate, off-target integration for type V CASTs. Our findings suggest that some CASTs may exploit defense-associated CRISPR arrays and that this fact must be considered when porting CASTs to heterologous bacterial hosts. More broadly, this work will guide further efforts to engineer the activity and specificity of CASTs for gene editing applications.
Introduction
CRISPR-Cas components are associated with multiple systems beyond adaptive immunity1,2,3. For example, CRISPR-associated transposons (CASTs) are an amalgam of a nuclease-inactive CRISPR effector complex and a Tn7-family transposon4,5,6. Tn7 consists of five genes, termed tnsA-E6,7,8,9,10,11. Transposition is catalyzed by tnsA-C, whereas tnsD and tnsE participate in target selection via two distinct mechanisms6,10,11. TnsD is a DNA-binding protein that recognizes and binds to specific sequences within the host genome, termed the attachment site (e.g., the glmS gene in the best-studied Tn7). In contrast, TnsE is a structure-specific DNA-binding protein that directs Tn7 to the lagging strand during DNA replication12. CASTs partially substitute both tnsD and tnsE with a CRISPR RNA (crRNA)-guided effector complex4,13,14,15. In some systems, the guiding functions of tnsD are replaced by “homing” spacers that target the genomic attachment site for vertical transmission16,17. The mechanism of horizontal transmission, however, remains poorly understood.
CAST systems are organized into two broad categories14,15,16. Type I CASTs, which are closely related to the Tn7 transposase, use a Cascade effector complex to target transposition. By contrast, type V CASTs evolved from a distinct Tn5053-family transposon and use Cas12k as a single RNA-guided effector protein18,19. Both CAST sub-types encode CRISPR arrays that are markedly different from defense-associated CRISPR-Cas systems. First, CAST-associated CRISPR arrays are extremely short, generally fewer than three repeats14,15,16. For example, the type I-F3c system only retains a single self-targeting (“homing”) spacer, raising the question of how it can also target invading mobile DNA. In addition, the putative type I-C CASTs do not encode any recognizable CRISPR arrays and some I-F CASTs only encode a homing spacer17,20. In contrast, defense-associated CRISPR arrays have tens to hundreds of repeats21,22,23. Second, CASTs do not encode the adaptation genes cas1 and cas2, suggesting that they do not update their own CRISPR arrays22,24,25,26,27. Third, CASTs encode an “atypical” repeat that flanks a self-targeting spacer that is used for vertical transmission16,17. These differences raise the question of how CASTs use these limited CRISPR arrays to target invading mobile elements. Alternatively, there may be one or more mechanisms, not previously considered, that CASTs employ during horizontal gene transfer.
Here, we show that type I and type V CASTs can use spacers from heterologous defense-associated CRISPR arrays. A bioinformatic analysis reveals that all CAST sub-types can co-occur with defense-associated CRISPR-Cas systems. Mate-out transposition assays demonstrate that type I-F, I-B, and V CASTs can use crRNAs derived from defense-associated CRISPR systems nearly as efficiently as their own spacers. A cryo-electron microscopy structure of a type I-F TniQ-Cascade in complex with a type III-B crRNA shows that Cas6 interacts with the direct repeat (DR) of the crRNA via sequence-independent electrostatic and π–π stacking interactions. Interactions between an evolutionarily conserved Cas6 residue and a nucleotide at the apex of the DR stem-loop are essential for transposition and act as a molecular ruler for the length of the DR stem. In agreement with this structure, we show that the DR must include a five-basepair stem and a five-nucleotide loop for efficient transposition. This mechanism suggests that CASTs may mobilize into invading mobile genetic elements (MGEs) because a history of MGE infections will be updated in the active CRISPR-Cas defense locus28. In addition, type V CASTs also integrate non-specifically via a crRNA-independent copy-and-paste mechanism that requires the Cas12k effector. This process is independent of the S15 specificity factor. Our findings highlight that care must be taken when using CASTs in heterologous hosts that encode additional CRISPR arrays. More broadly, we reveal design principles and potential considerations for optimizing CAST crRNAs for precision gene insertion in diverse organisms.
Results
CASTs co-exist with active CRISPR-Cas defense systems
We reasoned that CASTs may co-opt other CRISPR arrays that are scattered throughout the host genome for horizontal transmission. To test this hypothesis, we searched for all CRISPR arrays in the genomes of CAST-encoding organisms (Fig. 1). We identified 921 genomes that encoded a CAST amongst the ~1M high-quality assembled genomes in the NCBI reference sequences (RefSeq) database (Fig. 1A)20,29. All CASTs encoded very short or undetectable CRISPR arrays (Fig. 1B). Next, we searched these CAST-encoding genomes for co-occurring CRISPR-Cas systems and orphaned CRISPR arrays. Defense systems included an active nuclease (i.e., cas3), adaptation genes (i.e., cas1, cas2, cas4), and CRISPR arrays with ~10–120 spacers, suggesting active spacer acquisition (Fig. 1B)30,31,32. We also observed isolated examples of “orphaned” arrays that were not adjacent to a recognizable CRISPR-Cas defense system33,34,35. Fifteen percent of genomes that encode a type I-F CAST also encode additional CRISPR-Cas systems. This statistic likely represents a lower bound on the number of co-occurring CRISPR arrays because these microbes had relatively short, highly fragmented genome assemblies in the RefSeq database (Fig. S1A). All organisms with a type I-B or type V CAST encode at least one additional CRISPR array (Fig. 1C and S1B)35. In addition, 12.5% of type I-B CASTs and 11% of type V CASTs co-occurred with two or more additional CRISPR-Cas systems (Fig. 1C). Type I-F CASTs mainly co-occurred with type III-B, I-F, and I-E CRISPR defense systems (Fig. S1B). In two genomes, the type I-F CAST co-existed with a type II-A defense system (Fig. 1D). By contrast, type I-B and V CASTs co-occurred with type III-B and type I-D defense systems (Fig. 1D). Below, we test the hypothesis that CASTs can use spacers from co-occurring defense-associated CRISPR arrays for horizontal transmission.
A A bioinformatics workflow for annotating CRISPR defense systems that co-occur with CASTs in the same genome. Blue: CRISPR-associated genes (cas); brown: transposase genes (tns). B CAST CRISPR arrays are shorter than defense-associated CRISPR arrays in the same genomes. C CASTs co-exist with one or more additional CRISPR arrays. D Defense-associated CRISPR-Cas sub-types that co-exist with CASTs in the NCBI microbial genome database.
Type I-F CASTs mobilize using heterologous CRISPR arrays
To determine whether CASTs can co-opt other CRISPR arrays, we first compared the sequences and secondary structures of their direct repeats (DRs)36. DRs from the type I-F CAST are structurally identical to defense-associated I-F and III-B CRISPR-Cas systems, with a five nucleotide (nt) loop, five basepair (bp) stem, and an eight nt 3’-handle (Fig. 2A). By contrast, the type I-E DR consists of a four nt loop, seven bp stem, and four nt 3’-handle. The type I-C and II-A DRs are even more divergent from the CAST I-F (Fig. S2A).
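The structural comparison above can be made concrete with a small sketch. Assuming each DR has already been folded by an RNA secondary-structure tool and written in dot-bracket notation, the function below measures the 5' handle, stem, loop, and 3' handle of a simple hairpin. The names `HairpinShape`, `hairpin_shape`, and `fits_type_if_cast_geometry` are hypothetical, and the example strings are illustrative shapes built from the lengths quoted in the text (the 5' handle lengths are assumptions), not real DR folds.

```python
from typing import NamedTuple


class HairpinShape(NamedTuple):
    five_prime_handle: int   # unpaired nt before the stem
    stem: int                # base pairs in the stem
    loop: int                # unpaired nt enclosed by the stem
    three_prime_handle: int  # unpaired nt after the stem


def hairpin_shape(dot_bracket: str) -> HairpinShape:
    """Measure one bulge-free hairpin written in dot-bracket notation."""
    if set(dot_bracket) - set("()."):
        raise ValueError("only '(', ')' and '.' are allowed")
    opens, closes = dot_bracket.count("("), dot_bracket.count(")")
    if opens == 0 or opens != closes:
        raise ValueError("expected one balanced hairpin")
    first_open, last_open = dot_bracket.index("("), dot_bracket.rindex("(")
    first_close, last_close = dot_bracket.index(")"), dot_bracket.rindex(")")
    # A simple hairpin has one contiguous run of '(' followed by one run of ')'.
    contiguous = (last_open - first_open + 1 == opens
                  and last_close - first_close + 1 == closes)
    if not contiguous or first_close < last_open:
        raise ValueError("not a simple hairpin (bulge or more than one helix)")
    return HairpinShape(
        five_prime_handle=first_open,
        stem=opens,
        loop=first_close - last_open - 1,
        three_prime_handle=len(dot_bracket) - last_close - 1,
    )


def fits_type_if_cast_geometry(shape: HairpinShape) -> bool:
    """Assumed rule of thumb: a five-bp stem enclosing a five-nt loop."""
    return shape.stem == 5 and shape.loop == 5


# Illustrative shapes only (5' handle lengths are assumptions).
cast_like = hairpin_shape(".....(((((.....)))))........")  # 5 bp stem, 5 nt loop
type_ie_like = hairpin_shape("(((((((....)))))))....")     # 7 bp stem, 4 nt loop
print(cast_like, fits_type_if_cast_geometry(cast_like))
print(type_ie_like, fits_type_if_cast_geometry(type_ie_like))
```

The check is deliberately restricted to bulge-free hairpins, so any fold with internal loops or several helices is rejected rather than mis-measured.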
A The predicted structures of direct repeats (DRs) from a type I-F CAST and co-occurring defense CRISPR-Cas systems. Blue: 4-5 nt loop; green: 5-7 bp stem; yellow: 5'- and 3'-handles. B Schematic of a quantitative conjugation-based mate-out transposition assay. A plasmid harboring the CAST, along with the cargo antibiotic resistance (green), and a minimal CRISPR array is conjugated into the recipient strain and plated on the indicated Luria-Bertani (LB) agar plates. Guided transposition into lacZ is scored as white, chloramphenicol-resistant (CmR) clones. The donor strain MFDpir+ is a conjugative, DAP auxotrophic, Mu-free donor strain of E. coli designed to support the replication of R6K suicide vectors; donor cells are removed via counter-selection on plates lacking diaminopimelic acid (DAP)39. C Direct repeats from the defense-associated CRISPR arrays support transposition, but a scrambled direct repeat does not. D Colony-resolved long-read sequencing and E Sanger sequencing confirm cut-and-paste transposition into lacZ (triangle in E). Target site duplication (TSD) is also visible in these data. F Quantification of transposition from the native CAST array and co-occurring defense systems, in colony forming units (CFU). Error bars are the standard deviation across three biological replicates. Scrambling either the repeat or spacer suppressed transposition below our detection limit of <10−6 CFU.
We developed a conjugation-based chromosomal transposition assay to determine whether CASTs can exploit these heterologous CRISPR arrays (Fig. 2B)17,37. In this assay, the CAST genes, a CRISPR array, and a chloramphenicol (Cm) resistance marker surrounded by left and right inverted repeats are assembled into a conditionally replicative R6K plasmid that only replicates in pir+ strains38,39,40. The pir+ donor also includes a chromosomally integrated RP4 conjugation system40,41. Donor cells are auxotrophic for diaminopimelic acid (DAP), allowing for counter-selection on DAP- plates following conjugation with a recipient strain39,42. The BL21(DE3) recipient cells support CAST expression and transposition14,15. Conjugative transfer of the R6K plasmid into the recipient cells and subsequent transposition of the CAST cargo into the host genome (targeting lacZ) results in chloramphenicol-resistant, ΔlacZ recipient cells. The R6K plasmid is lost shortly after conjugation in the recipient cells (pir-) and the donor cells are also removed due to the absence of DAP43. Genomic transposition efficiency can be scored quantitatively as the ratio of CmR colonies to all recipient colonies on standard (DAP-) agar plates. Targeting lacZ results in white colonies on Cm/X-gal plates; integration outside lacZ produces blue colonies on the same plates44,45. Finally, we also scored the insertion accuracy via both Sanger- and whole-genome long-read sequencing.
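As a worked illustration of how such an efficiency could be computed, the sketch below divides chloramphenicol-resistant CFU by the total recipient CFU for each replicate and reports the mean and sample standard deviation. The function names and the colony counts are hypothetical; only the ratio itself and the 10−6 detection limit come from the text.

```python
from statistics import mean, stdev

DETECTION_LIMIT = 1e-6  # fraction of recipients; value quoted in the text


def transposition_efficiency(cm_resistant_cfu: float, recipient_cfu: float) -> float:
    """Fraction of viable recipient cells that carry the Cm-resistance cargo."""
    if recipient_cfu <= 0:
        raise ValueError("recipient CFU must be positive")
    return cm_resistant_cfu / recipient_cfu


def summarize(replicates: list[tuple[float, float]]) -> tuple[float, float]:
    """Mean and sample standard deviation across biological replicates."""
    efficiencies = [transposition_efficiency(cm, total) for cm, total in replicates]
    return mean(efficiencies), stdev(efficiencies)


# Hypothetical (CmR CFU, total recipient CFU) counts for three replicates.
counts = [(1.2e6, 1.0e8), (1.4e6, 1.0e8), (1.6e6, 1.0e8)]
avg, sd = summarize(counts)
print(f"efficiency = {100 * avg:.1f} ± {100 * sd:.1f}% of recipients")
print("detectable" if avg >= DETECTION_LIMIT else "below detection limit")
```

Both CFU values should be corrected for dilution and plated volume before they are passed in, so that the ratio compares like with like.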
We first tested this assay with the native and atypical direct repeats from the well-characterized V. cholerae HE-45 Type I-F3a system (Fig. 2C)14. This CAST encodes an atypical direct repeat and a homing spacer for site-specific integration into the host’s genome. We removed the homing spacer to avoid spurious transposition events17. Transposition efficiency was scored using a lacZ-targeting spacer14. A scrambled spacer or a scrambled direct repeat served as negative controls. The transposition efficiency was 1.4 ± 0.2% of all viable recipient cells. This was suppressed below the limit of detection (<10−6 CFU) when either the spacer or the repeat was scrambled. All chloramphenicol-resistant colonies (n = 395 across three biological replicates) were white on X-gal plates, suggesting transposition into lacZ (Fig. S2B). Whole-genome long-read sequencing indicated a single transposition event at the expected target site (Fig. 2D). Sanger sequencing of the insertion junctions from 32 colonies showed that the cargo inserted ~42–46 bp downstream of the end of the target site (Figs. 2E, S2A). Integration occurred in the forward direction in 91% of all cases and in the reverse direction in the remaining 9%. An atypical direct repeat supported a nearly identical transposition efficiency and insertion orientation (Fig. S2C, D). The atypical direct repeat maintains the same overall stem-loop structure but has 12 nucleotide substitutions relative to the typical direct repeat17. Because the typical and atypical direct repeats maintained a high transposition rate, we conclude that the CAST effector complex can tolerate DRs with divergent RNA sequences.
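The junction statistics reported above (insertion offset range and orientation bias) amount to a simple tally over sequenced colonies. A minimal sketch follows, assuming each colony has already been reduced to an offset in bp from the end of the target site and an orientation label. The record format, the function name, and the example values are hypothetical and are not the measured data.

```python
from collections import Counter


def summarize_insertions(records: list[tuple[int, str]]) -> dict[str, float]:
    """Summarize (offset_bp, orientation) pairs from junction sequencing.

    orientation is assumed to be the string 'forward' or 'reverse'.
    """
    if not records:
        raise ValueError("no insertion records supplied")
    offsets = [offset for offset, _ in records]
    orientations = Counter(orientation for _, orientation in records)
    return {
        "min_offset_bp": min(offsets),
        "max_offset_bp": max(offsets),
        "forward_fraction": orientations["forward"] / len(records),
    }


# Hypothetical example: 32 colonies, 29 forward and 3 reverse insertions.
example = [(42 + i % 5, "forward") for i in range(29)] + [(44, "reverse")] * 3
print(summarize_insertions(example))
```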
Next, we tested whether this CAST can use DRs from co-occurring CRISPR defense systems (Fig. 2C, F)14. For this assay, the native CAST array targeted lacZ but encoded the DR from defense-associated CRISPR-Cas systems. All other protein and cargo components remained unchanged. Surprisingly, type I-F and III-B DRs supported transposition efficiencies that were comparable to those from the native CAST, despite differing in the RNA sequence (Fig. S2). CRISPR RNAs with type I-E DRs transposed ~1,000-fold less efficiently than the native CAST crRNAs (Fig. 2F). In all cases, >99% of the resulting colonies were white on X-gal plates, indicating targeted transposition into lacZ (Fig. S2B). Integration occurred ~43–45 bp from the 3’ end of the target, with 90% of all events in the forward orientation (Figs. 2E, S2B). Long-read sequencing showed that a single copy of the cargo was inserted into lacZ (Fig. 2D)14. By contrast, type I-C and II-A direct repeats did not support any transposition activity (<10−6 CFU) in vivo. Expressing the CAST proteins in the recipient cells and conjugating in the cargo confirmed that defense-associated type III-B and I-F direct repeats support robust transposition activity (Fig. S3A, B). We also used an established in vitro transposition assay to confirm the in vivo results (Fig. S4A, B)46. We conclude that the structures of the type I-E, I-C, and II-A DRs differ substantially from the I-F DR, indicating that the DR stem loop is a major determinant of transposition (see below).
Cas6 stabilizes direct repeats via sequence-independent electrostatic interactions
To investigate the molecular basis for how CASTs exploit heterologous CRISPR arrays, we used cryo-electron microscopy to solve the structure of the V. cholerae HE-45 Cascade co-purified with a type III-B crRNA (Fig. 3). The crRNA contained a native direct repeat from the type III-B system and a 32 bp spacer. The density for Cascade and the crRNA was refined with a prior model (PDB: 6PIG [https://doi.org/10.2210/pdb6PIG/pdb]; Fig. S5)47,48,49,50. The overall structure was similar to the prior model (Cα RMSD = 0.83 Å), indicating that the native DR from the type III-B system can assemble a functional Cascade.
A Structural overview of a type I-F TniQ-Cascade purified with a type III-B crRNA. B Magnified view of Cas6 (salmon) interacting with the direct repeat (gray). F138 interacts with the sole flipped nucleotide, C54, at the tip of the direct repeat through a stacking interaction. An arginine-rich helix stabilizes the entire stem-loop structure via a network of interactions along the sugar-phosphate backbone. C Schematic of the hydrophobic and electrostatic interactions between key Cas6 residues and the direct repeat from the type III-B crRNA. D Multiple sequence alignment across all CAST I-F cas6 genes reveals conserved residues in the arginine-rich helix. E Transposition requires Cas6 residues R121, R125, R129, and F138 to coordinate the direct repeat. F Quantification of transposition from the type I-F CAST system with mutated Cas6. Error bars are the standard deviation across three biological replicates.
The type III-B direct repeat engages Cas6 via sequence-independent interactions with the ribose phosphate backbone (Fig. 3B–C). The cytosine (C54) at the apex of the stem loop is flipped out of the plane and enters a long-range π–π interaction with Cas6(F138). An arginine/lysine-rich helix also forms a strongly positive pocket that stabilizes the crRNA handle. A multiple sequence alignment of I-F Cas6 proteins indicates that these electrostatic interactions are conserved across the entire CAST sub-family (Fig. 3D). Thus, Cascade engages diverse direct repeats via a crRNA sequence-independent mechanism.
We tested the functional significance of the conserved Cas6 residues by mutating all arginines and lysines in the Arg-rich helix (Fig. 3E, F). Mutating any of the arginines and lysines that contacted the crRNA suppressed transposition below our detection range (<10−6 CFU). Similarly, the Cas6(F138A) mutation also abolished transposition, indicating that the π–π interaction is also necessary for stably engaging the DR (Fig. 3C). Two point mutants, K118A and R120A, do not make any contacts with the crRNA and did not impact transposition efficiency. We conclude that Cas6 stabilizes diverse DRs via RNA sequence-independent electrostatic and π–π stacking interactions.
The direct repeat tunes transposition efficiency
The reduced transposition efficiency with type I-E DRs indicates additional constraints on the CAST crRNA. To test these constraints, we systematically varied the DR sequence and/or structure and assayed the resulting transposition efficiency (Figs. 4A and S6). We first scrambled the DR nucleotide sequence but retained the 5 bp stem, the 5 nt loop, the 5 nt 5’ handle, and the 8 nt 3’ handle of the type I-F CAST. Surprisingly, three crRNA variants with scrambled DR sequences that retained this structure still maintained wild-type transposition efficiency (Fig. S6). By contrast, scrambling the stem-loop entirely abolished transposition. These results confirm that Cas6-DR contacts are sequence-independent but require a structured DR to maintain activity.
A We tested changes in the direct repeat sequence, stem (green), loop (blue), and handle lengths (5'-orange, 3'-yellow). B The effect of each feature on the integration efficiency. Data are shown as mean ± S.D., n = 3 biologically independent experiments. Black: DR from the native CAST CRISPR array; gray: a sequence-scrambled DR that preserved the wild-type stem loop structure; and other colors correspond to the schematic in (A).
Next, we systematically varied the length of the stem, loop, and the 5’ and 3’ handles to identify the key determinants of efficient transposition (Figs. 4A and S6). Starting with the type I-F CAST DR, changing the stem length by even a single basepair reduced transposition efficiency up to five-fold. Increasing the length of the stem from five to seven basepairs (as in the type I-E DR) decreased transposition efficiency 500-fold as compared with the type I-F CAST DR (Fig. S6A, B). Decreasing the loop by one nucleotide also reduced transposition efficiency 100-fold (Fig. 4B). Changing the length of the 5’ and 3’ handles modestly reduced transposition efficiency. Consistent with these findings, shortening the type I-E DR stem from seven to five basepairs significantly increased transposition. Adding one nucleotide to the loop, for a total of five nucleotides, also improved transposition 500-fold relative to the type I-E DR (Fig. S6). These results underscore that the DR structure is the key determinant for assembling a TniQ-Cascade effector complex. The stem must be five basepairs, whereas the loop can tolerate one-nucleotide changes from the five-nucleotide native sequence. The structural basis for both effects likely arises from the base stacking interaction with Cas6.
Type I-B CASTs co-opt co-occurring CRISPR arrays for horizontal transfer
All type I-B CASTs co-occur with type I-A, I-B, I-D, or III-B defense systems or with orphaned CRISPR arrays (Fig. 1C). To test whether type I-B CASTs can use these CRISPR arrays, we adapted the mate-out transposition assay to the Anabaena variabilis ATCC 29413 type I-B CAST (Fig. S7A)16. Transposition efficiency with the native CRISPR DR was ~1,000-fold lower than the type I-F CAST (Fig. S7B, C). This may be due to poor expression in E. coli since we did not optimize codon usage, promoters, or translation efficiency. Most of the chloramphenicol-resistant colonies were white (~90%, n = 152), indicating that lacZ was disrupted. Sanger sequencing across the insertion junctions confirmed on-target integration, with ~75% of insertions in the forward direction and ~44–48 bp away from the target site (Fig. S7D, E). Scrambling the crRNA without preserving the DR structure ablated all transposition activity (Fig. S7B). These results indicate that type I-B CASTs are active in the mate-out transposition assay.
Next, we tested whether the type I-B CAST can use DRs from two orphaned CRISPR arrays and a co-occurring type I-A defense system. When targeted to lacZ, the first orphaned DR supported transposition, albeit ~20-fold lower than the native DR (Fig. S7C). As expected, this system’s DR was structurally most similar to that of the CAST. We did not detect integration for any other DRs (<10−6 CFU). This may be because the predicted DR structures of the second orphaned CRISPR array and the type I-A defense system are both divergent from the type I-B CAST (Fig. S7B). We conclude that type I-B and I-F CASTs can both co-opt heterologous CRISPR arrays, so long as the crRNA DRs can be structurally accommodated within the Cascade effector complex.
Type V CASTs transpose via crRNA-dependent and independent mechanisms
All type V CASTs co-exist with either type I or III defense-associated CRISPR systems (Fig. 1). Therefore, we assayed whether the S. hofmannii (Sh) type V CAST can use spacers from these CRISPR arrays for horizontal transfer15. As before, we removed the CAST’s homing spacer and targeted the native tracr-crRNA or a single guide RNA (sgRNA) to lacZ. Recent structural studies identified the small ribosomal protein S15 as a core CAST subunit that increases the on-target transposition rate50,51. We therefore also tested whether the E. coli or S. hofmannii S15 aids transposition in our assay50,51. Transposition efficiency was scored by measuring the colony forming units on chloramphenicol plates. On-/off-target events were confirmed via Sanger and long-read whole genome sequencing.
Transposition remained high with the native crRNA, the sgRNA, or crRNAs with type I-D direct repeats (Fig. 5A). In addition, the ShCAST co-occurs with two orphaned CRISPR arrays, which also support transposition (Fig. 5A). Deleting cas12k dropped transposition below our detection limit (<10−6 CFU), unless all CAST components were expressed in the recipient cells (Fig. S8). Surprisingly, deleting the entire crRNA (ΔcrRNA) did not diminish this activity, indicating that unguided transposition is robust (Figs. 5 and S8B, C). Expressing either E. coli or S. hofmannii S15 (EcS15 or ShS15) reduced transposition ~50- and ~250-fold, respectively. When S15 was not over-expressed, off-target integration dominated all insertion events (Fig. 5B). Over-expressing S15 in the recipient cells increased on-target integration up to ~60% of all events. Integration with the defense-associated type I-D DR was indistinguishable from the native crRNA (Fig. 5C). CASTs assembled with a sgRNA transposed at the target site most frequently, especially in recipient cells over-expressing ShS15. Transposition was >95% in the L/R orientation (n = 613 insertion events). The cargo DNA inserted via a co-integration mechanism ~29–34 bp away from the end of the lacZ target site (Fig. 5C–E). We also confirmed robust in vitro transposition activity with purified CAST proteins (Fig. S9). Taken together, we conclude that ShCAST supports robust guided transposition with heterologous defense-associated CRISPR arrays.
A The small ribosomal protein S15 reduces overall transposition. Data are shown as mean ± S.D., n = 3 biologically independent experiments. B Over-expression of E. coli S15 stimulates on-target transposition, even with a defense-associated type I-D crRNA and the sgRNA. However, significant off-target transposition remains even when S15 is over-expressed. C Long-read sequencing with the type I-D crRNA confirms that most, but not all, insertions are in lacZ when ShS15 is over-expressed. Triangle: target site. The target site duplication (TSD, gray) and left and right inverted repeats (blue) are also visible at the insertion site (bottom). D The cargo is inserted ~35–42 bp away from the target site (purple in B). E Schematic (left) and quantification (right) of simple insertion and co-integration products via long-read sequencing. Over-expressing S15 and using heterologous crRNA does not significantly alter this ratio. cmR: chloramphenicol resistance gene; tracr: trans-activating crRNA; CR: CRISPR array; L, R: left and right inverted repeats.
Discussion
Here, we show that CASTs can use defense-associated CRISPR arrays already present in the host (Fig. 6). This allows CASTs to use the spacers in active defense systems that continuously update their own CRISPR arrays with a record of prior infections. The most recent mobile genetic elements are inserted proximal to the leader of the CRISPR array and are expressed at the highest levels, potentially allowing the CAST to use spacers for horizontal gene transfer28. Many mobile genetic elements harbor anti-CRISPR systems that inactivate CRISPR defense systems (see refs. 52,53 for a review). Bacteria are still partially immune to such phages, but multiple rounds of infection lead to eventual CRISPR resistance, and a sustained high titer phage infection54. Most anti-CRISPR proteins are highly specific to a narrow group of defense systems, potentially providing a window of opportunity for the CAST to transpose into the mobile element. Follow-up studies will be required to define whether CASTs are also impeded by anti-CRISPR proteins.
All CASTs co-opt defense-associated CRISPR arrays (red) for horizontal transmission. These arrays are updated by defense-associated Cas1-Cas2 integrases. Type V CASTs can also integrate non-specifically via crRNA-independent transposition. Both systems use a homing spacer for vertical transmission (purple).
Our bioinformatic analysis shows that all type I-B and type V systems co-exist with at least one CRISPR defense system that can be co-opted by the CAST for horizontal gene transfer. However, we only detected additional CRISPR arrays in ~ 15% of genomes that harbor a type I-F CAST. Several possibilities may explain this observation. First, type I-F CASTs were primarily found in short genomic assemblies, indicating that these are not high-quality closed genomes (Fig. S1). Our analysis may also be oversampling type I-F systems from a small group of cultivatable organisms that are over-represented in the NCBI database. These considerations can bias the results towards fewer systems with co-occurring CRISPR arrays. Second, CASTs may recognize invading DNA by interacting with replisome-associated DNA structures (e.g., the replication fork) or other replisome components (e.g., the sliding clamp). For example, Tn7 encodes tnsE, a gene that directs transposition to newly-replicated DNA6,55. Third, the majority of CRISPR systems in Vibrio, including type I-F CASTs, are already encoded in mobile genetic elements and may have been transmitted by horizontal gene transfer56. Additional genome-resolved metagenomic studies, as well as a focus on interactions between CASTs and host proteins, will shed light on these hypotheses.
A type I-F CAST reconstituted with type I-E, I-F, and III-B crRNAs mobilizes via on-target transposition with high integration efficiency. In addition, these CASTs also recognize a wide range of PAMs57. Indeed, type I-F CASTs recognize a broader range of PAMs than defense-associated type I-F systems58,59. Type I-F Cascades can also tolerate variable crRNA lengths by adjusting the number of Cas7 repeats in the complex46,60,61,62,63. We speculate that type I-F CASTs may also assemble with a variable number of Cas7 subunits, especially on heterologous crRNAs. Taken together, the extraordinary plasticity of type I CASTs in using various spacers and recognizing diverse PAMs may facilitate using defense-associated CRISPR arrays. These factors may lead to unplanned CAST transposition when these systems are ported to heterologous hosts that encode additional CRISPR arrays64. Conversely, the atypical CAST I-F homing spacer prevents defense-associated systems from targeting the host’s genome. Finally, additional Tn7-like (crRNA-independent) transposition activities may also support horizontal transfer.
Defense-associated CRISPR sub-types can also share crRNAs. For example, type III systems lack cas1 and cas2 and co-occur in genomes containing type I CRISPR-Cas loci65. Type III systems can use the pre-processed crRNAs from type I-F systems, acting as secondary defenses that counteract viral escape66,67. Another example is the type VI-B system of Flavobacterium columnare, which is also acquisition-deficient. This system can acquire spacers in trans from a type II-C system that is encoded in the same genome68. Therefore, we cannot rule out that CASTs may also use heterologous cas1 and cas2 for spacer acquisition into their own short arrays. The plasticity of spacer acquisition between CRISPR sub-types suggests that other cas1/cas2-deficient systems may use similar mechanisms to target viral pathogens.
Type V CASTs can randomly transpose via a mechanism that does not require a CRISPR array. However, type V CASTs are exclusively found in cyanobacteria, suggesting limited horizontal transmission4,15. The small ribosomal protein S15 suppresses, but does not completely abrogate, random transposition. Mechanistically, S15 forms a complex with Cas12k and TniQ to stabilize the R-loop50,51. Interestingly, over-expressing S15 also suppressed overall integration via an unknown mechanism. Additional host factors may further participate in transposition in vivo as compared to the purified in vitro system50. The dependence on S15 for on-target integration, the limited range of the homing spacer, and the possible host toxicity associated with random integration may all limit horizontal transmission. Boosting on-target integration while limiting off-target activity will be increasingly important for domesticating type V CASTs in heterologous organisms.
Methods
Bioinformatic analysis of CAST co-occurrence with other CRISPR systems
Genomes containing CAST systems were collected from NCBI genomic databases29. We searched for CRISPR-Cas systems in these genomes using Opfi [https://github.com/wilkelab/Opfi], a Python library to search DNA sequencing data for putative CRISPR systems69. First, we located all regions containing a CRISPR array that was not associated with a CAST. Within those regions, we next searched for cas genes located no more than 25 kilobase pairs away from the CRISPR array using BLAST and a previously developed database of diverse cas genes20,30. We sub-typed CRISPR-Cas systems based on signature genes65,70.
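The 25 kilobase pair proximity criterion can be illustrated with a short, self-contained sketch. This is not Opfi's API: the `Feature` class, the function names, and the coordinates below are hypothetical, and the sketch assumes that gene and array coordinates have already been extracted from the search results.

```python
from dataclasses import dataclass

MAX_DISTANCE_BP = 25_000  # window quoted in the text


@dataclass(frozen=True)
class Feature:
    name: str
    contig: str
    start: int  # 0-based start coordinate
    end: int    # end coordinate, with end > start


def gap_bp(a: Feature, b: Feature) -> int:
    """Distance between two intervals on one contig; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def cas_genes_near_array(array: Feature, genes: list[Feature],
                         max_distance: int = MAX_DISTANCE_BP) -> list[Feature]:
    """Keep genes on the array's contig that lie within max_distance bp."""
    return [g for g in genes
            if g.contig == array.contig and gap_bp(array, g) <= max_distance]


# Hypothetical coordinates for illustration.
array = Feature("CRISPR_array", "contig_1", 100_000, 101_000)
genes = [
    Feature("cas1", "contig_1", 110_000, 111_000),  # 9 kb away: kept
    Feature("cas3", "contig_1", 130_000, 132_000),  # 29 kb away: dropped
    Feature("cas2", "contig_2", 100_500, 100_800),  # other contig: dropped
]
print([g.name for g in cas_genes_near_array(array, genes)])
```

Measuring the gap between interval edges, rather than between midpoints, keeps long genes that start just inside the window from being excluded.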
Software packages
The Python package that was used to identify CRISPR systems in genomic data is accessible on GitHub at the Opfi GitHub Repository [https://github.com/wilkelab/Opfi]69. Specific details regarding the version, dependencies, and configuration settings used are documented within the code repository to ensure reproducibility.
Proteins and nucleic acids
Oligonucleotides were purchased from IDT. Gene blocks for CRISPR arrays were purchased from Twist Biosciences. The R6K plasmid for mate-out transposition assays was obtained from Addgene (#64968)71. The type I-F Vibrio cholerae HE-45 CAST was sub-cloned from Addgene (#130637 and #130633)14. The type V Scytonema hofmannii (Sh)CAST was obtained from Addgene (#127922)15. The type I-B Anabaena variabilis ATCC 29413 CAST was sub-cloned from Addgene (#168137)16. For mate-out transposition assays, each of these systems was PCR amplified and cloned into pTNS2 to replace the parental mini-Tn7 (Addgene #64968) by Golden Gate assembly. The repeat, spacer, chloramphenicol resistance cargo, and left and right inverted repeats were synthesized by IDT and cloned into the same plasmid. Full plasmid information can be found in Table S1.
Cascade purification
Plasmids for type I-F CAST Cascade over-expression were constructed by sub-cloning the individual genes into pRSFDuet1 (Addgene #126878) to create pIF1008. Type I-F Cascade was co-expressed with 6 × His-MBP-TEV-TniQ and a type III-B crRNA in NiCo21 cells (NEB). Cells were then induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 18 °C for another 18–20 hours before harvesting. Cells were centrifuged and re-suspended in lysis buffer containing 25 mM Tris pH 7.5, 200 mM NaCl, 5% glycerol, and 1 mM DTT. Cascade complexes were purified via the N-terminal maltose binding protein (MBP) tag using amylose beads (NEB) and eluted with lysis buffer containing 10 mM maltose. MBP was removed using TEV protease at 4 °C overnight. The sample was then diluted to 100 mM NaCl and loaded onto an anion exchange column (5 mL Q column HP). After loading Cascade, the column was washed extensively with buffer A (25 mM Tris pH 7.5, 100 mM NaCl, 5% glycerol, and 1 mM DTT). The complex was eluted with a 25 column volume gradient of buffer B (25 mM Tris pH 7.5, 1 M NaCl, 5% glycerol, and 1 mM DTT). Cascade was further purified by size exclusion chromatography using a Superose 6 Increase column (GE Healthcare) in SEC buffer (25 mM Tris pH 7.5, 200 mM NaCl, 5% glycerol, and 1 mM DTT). Fractions were pooled, concentrated to 0.25 mg/ml, and stored at -80 °C.
Cryo-electron microscopy (cryo-EM)
Sample preparation and data collection
Purified TniQ-Cascade complexes were diluted to a concentration of 0.25 mg/ml in 25 mM Tris pH 7.5, 200 mM NaCl, 5% glycerol, and 1 mM DTT. Samples were deposited on an Ultra Au foil R 1.2/1.3 grid (Quantifoil) that was plasma-cleaned for 1.5 min (Gatan Solarus 950). Excess liquid was blotted away for 4 s in a Vitrobot Mark IV (FEI) operating at 4 °C and 100% humidity before being plunge-frozen into liquid ethane. Data were collected on a Glacios cryo-transmission electron microscope (TEM; Thermo Fisher Scientific) operating at 200 kV, equipped with a Falcon IV direct electron detector camera (Thermo Fisher Scientific). Movies were collected using SerialEM at a pixel size of 0.94 Å with a total exposure dose of 40 e−/Å².
Data processing and model building
Motion correction, contrast transfer function (CTF) estimation, and particle picking were all performed on cryoSPARC Live and further transferred to cryoSPARC for two-dimensional (2D) classification, ab initio 3D reconstruction calculation, 3D classification, and nonuniform refinement72. Because of the flexibility of TniQ and Cas6, particle subtraction and focused refinement were also performed in cryoSPARC. A full description of the cryo-EM data processing workflows can be found in Fig. S3. A published Cascade structure (PDB: 8FUK [https://doi.org/10.2210/pdb8fuk/pdb]) was docked into cryo-EM density maps using Chimera before being refined in Coot, ISOLDE, and PHENIX47,73,74,75. Full cryo-EM data collection and refinement statistics can be found in Table S2.
Conjugation-based transposition assays
CASTs were cloned into a conditionally replicative R6K plasmid (Addgene #64968). The CAST I-F system’s proteins, CRISPR array, and inverted repeats were sub-cloned from Addgene plasmids #130637, #130634, and #130633 to generate pf1001. The CAST V system and inverted repeat constructs were sub-cloned from Addgene plasmids #127922 and #127924 to generate pIF1005. The CAST I-B system’s proteins and inverted repeat constructs were sub-cloned from Addgene plasmids #168137 and #168146 to generate pIF1003.
For transposition, the R6K plasmid was transformed into MFDpir cells (termed the donor strain), which contain the genomically-integrated RP4-based transfer machinery39. All growth steps were conducted at 37 °C. The donor strain was grown with 0.3 mM diaminopimelic acid (DAP) and appropriate antibiotics. The recipient strain was grown in lysogeny broth (LB). The donor and recipient cells were gently washed four times in PBS (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4) by spinning and re-suspending 1 mL cultures. The cell density was estimated by taking an optical density reading after re-suspension, and the donor and recipient cells were combined in a 3:1 ratio. This mixture was plated on a non-selective plate containing DAP (0.3 mM) for conjugation. The conjugation plate was incubated overnight. The conjugation mixture was collected and washed four times by mixing with 1 mL PBS, vortexing, and gently spinning down. Multiple ten-fold dilutions of this mixture were plated onto selective (LB+12 μl/ml chloramphenicol) and non-selective plates. The cfu/ml was calculated by counting the colonies on plates with 50–500 colonies. The integration efficiency equals the cfu/ml on the selective plate divided by the cfu/ml on the non-selective plate.
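The efficiency calculation in the preceding paragraph can be written out as a short Python sketch. The function names, the plated volume, and the example colony counts are illustrative assumptions; only the 50–500 colony counting window and the ratio itself come from the text.

```python
# Countable window from the text: plates with 50-500 colonies.
MIN_COLONIES, MAX_COLONIES = 50, 500


def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Colony-forming units per ml of the undiluted conjugation mixture.

    dilution_factor is the fold-dilution of the plated sample (e.g. 1e3 for
    a 1:1000 dilution); plated_volume_ml is an assumed plating volume.
    """
    if not MIN_COLONIES <= colonies <= MAX_COLONIES:
        raise ValueError("plate is outside the 50-500 colony counting window")
    return colonies * dilution_factor / plated_volume_ml


def integration_efficiency(selective_cfu_per_ml: float, nonselective_cfu_per_ml: float) -> float:
    """cfu/ml on the selective plate divided by cfu/ml on the non-selective plate."""
    return selective_cfu_per_ml / nonselective_cfu_per_ml


# Hypothetical counts: 120 colonies at 1:100 on chloramphenicol,
# 300 colonies at 1:1,000,000 without selection, 0.1 ml plated each.
selective = cfu_per_ml(120, 1e2, 0.1)
nonselective = cfu_per_ml(300, 1e6, 0.1)
efficiency = integration_efficiency(selective, nonselective)
```

With these made-up counts the efficiency is about 4 × 10⁻⁵ integrants per recipient-plate colony-forming unit.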
DNA sequencing of transposition products
Sanger sequencing
Individual colonies were re-suspended in LB, pelleted by centrifugation at 16,000 g for 1 min and re-suspended in 80 μl of H2O, before being lysed by incubating at 98 °C for 10 min in a thermal cycler. Cell debris was pelleted by centrifugation at 16,000 g for 1 min, and the supernatant was removed and serially diluted with 90 μl of H2O to generate lysate dilutions for PCR analysis. PCR products were generated with Q5 Hot Start High-Fidelity DNA Polymerase (NEB) using 1 μl of the diluted lysate per 10 μl reaction volume. Reactions contained 200 μM dNTPs and 0.5 μM primers and were subjected to 30 thermal cycles. PCR amplicons were resolved by 1% agarose gel electrophoresis and visualized by staining with ethidium bromide (Thermo Scientific). To map integration sites by Sanger sequencing, bands were excised after separation by gel electrophoresis, DNA was gel-extracted (Qiagen), and samples were submitted to Sanger sequencing (Eton).
High-throughput long-read whole genome sequencing
Colonies from plate-based transposition reactions were washed off and diluted to an OD600 ~ 0.5 using LB. The liquid culture was then grown for two hours at 37 °C. Genomic DNA was extracted (Promega Wizard Genomic DNA kit) and barcoded with an Oxford Nanopore Technologies rapid barcoding kit. The barcoded DNA was sequenced on a MinION nanopore sequencer using the manufacturer-suggested protocol. All sequencing data are available at PRJNA1000618. Output reads were analyzed using seqkit76. First, the reads were processed by seqkit to collect adjacent target-DNA sequences from all the reads containing 40 bp of the shCAST left-end sequence. Then these sequences were mapped onto the BL21(DE3) genome using BLAST+ with a 95% sequence identity cutoff30. The position of the transposition reaction was defined as the number of base pairs between the end of the PAM and the beginning of the shCAST left-end sequence.
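The read-processing logic above can be sketched in plain Python. This is an illustration, not the seqkit and BLAST+ pipeline used in the study: the function names, the 50 bp flank length, exact-match searching, and the assumption that the target DNA lies immediately 5′ of the left-end sequence as written are all hypothetical choices made for the example.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def flank_upstream_of_left_end(read: str, left_end: str, flank_len: int = 50) -> Optional[str]:
    """Return the sequence immediately 5' of the left-end match, or None.

    Both strands of the read are searched with an exact match; a real
    long-read pipeline would tolerate mismatches and indels.
    """
    for strand in (read, reverse_complement(read)):
        idx = strand.find(left_end)
        if idx != -1:
            return strand[max(0, idx - flank_len):idx]
    return None


def insertion_distance(pam_end: int, left_end_start: int) -> int:
    """Base pairs between the end of the PAM and the start of the left end.

    pam_end is the 0-based index of the first base after the PAM and
    left_end_start is the 0-based index of the first left-end base.
    """
    return left_end_start - pam_end


# Toy example with a made-up 10 bp stand-in for the 40 bp left-end sequence.
left_end = "TGTACAGTGA"
read = "AAACCCGGGTTT" + left_end + "ACGT"
flank = flank_upstream_of_left_end(read, left_end, flank_len=6)
flank_rc = flank_upstream_of_left_end(reverse_complement(read), left_end, flank_len=6)
```

The recovered flank would then be aligned to the host genome, and `insertion_distance` applied to the mapped coordinates.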
Single-colony whole genome sequencing
Single colonies were grown overnight in LB with 17 ng/μl chloramphenicol at 37 °C. 1 ml of the overnight culture was pelleted by centrifugation at 16,000 g for 1 min and the gDNA was extracted as described above. Genomic DNA samples were separated and barcoded in units of 12 per batch using the MinION rapid barcoding kit. Samples were loaded into a MinION flowcell (FLO-MIN106D) and sequenced with a MinION Mk1B device. The raw read fastq files were assembled with flye77.
In vitro transposition assays
To test type I-F CAST transposition, we adapted the in vitro transcription/translation (IVTT) assay from Wimmer et al. with minor modifications46. We prepared 5 μl IVTT reactions containing 3.75 μl myTXTL Sigma 70 Master Mix, 0.2 nM p70a T7 RNAP, 0.5 mM IPTG, 1 nM pSL0527 (donor plasmid), 2 nM pSL0283 (TnsABC plasmid), 1 nM p70a deGFP, and 1 nM pVchCasQ with a CRISPR I-D spacer which was cloned using Golden Gate assembly. The reactions were incubated at 29 °C for 16 hours. Transposition events were detected by PCR in a 1:400 dilution of the IVTT reaction using Q5 Hot Start High-Fidelity 2X Master Mix (NEB). Transposition products were amplified via junction PCR and verified via Sanger sequencing of the PCR products46.
To test type V transposition, we purified Cas12k, TnsB, TnsC, and TniQ following the protein expression and purification protocols outlined in previous studies6,15. The crRNA and tracrRNA were produced via in vitro transcription using the HiScribe T7 High Yield RNA Synthesis Kit (NEB). Purified RNA was aliquoted and stored at -20 °C. The proteins Cas12k, TnsC, TniQ, TnsB, and ShS15 were diluted in a buffer (25 mM Tris pH 8.0, 500 mM NaCl, 1 mM EDTA, 1 mM DTT, 25% glycerol) to specific concentrations. Separate reactions were prepared for the target and donor pots. The target pot contained 2.24 nM pTarget, 104 nM Cas12k, TnsC, TniQ, 1.25 μM crRNA, and 1.25 μM tracrRNA in transposition buffer (26 mM HEPES, 50 mM KCl, 0.2 mM MgCl2, with 2 mM DTT, 50 μg/ml BSA, and 2 mM ATP). The donor pot contained 1.08 nM pDonor, 104 nM TnsB, 2 mM DTT, 50 μg/ml BSA, and 2 mM ATP in transposition buffer. Both pots were incubated at 37 °C for 30 min and then combined. Mg(OAc)2 was added to a final concentration of 15 mM, and the reaction was incubated at 37 °C for a further 2 h. After incubation, transposition activity was detected by junction PCR and verified by Sanger sequencing.
Statistics and reproducibility
This study employed quantitative transposition assays, in vitro reconstitution, and high-resolution structural analysis to investigate the mechanisms of RNA-guided transposition by CRISPR-associated transposons (CASTs). Sample sizes for quantitative assays were determined to ensure adequate statistical power for detecting significant differences in transposition efficiency. Reproducibility was bolstered by the use of biological triplicates for all key experiments. Statistical analyses were performed using standard methods implemented in Python. Data normalization and transformation procedures were applied as needed based on the type of data and analytical requirements.
Data points were excluded only if technical failures were identified, such as pipetting errors or sample contamination, which were logged and reported. No data were excluded based on outlier status or deviation from expected results. To ensure reproducibility, constructs used in transposition assays were validated by sequencing. Structural data were deposited in publicly accessible databases, allowing for independent validation of molecular models.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data collected or analyzed during this study are included in this published article and its Source Data. Additionally, the genomic sequences and associated CRISPR-Cas systems analyzed in this research are available in the NCBI genomic databases, accessible through the NCBI reference sequence database (RefSeq) under accession number PRJNA1000618 [https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1000618/]. The previously published structure of CAST type I-F is under the accession code 6PIG [https://doi.org/10.2210/pdb6PIG/pdb]. High-resolution structural data and images generated for the cryo-electron microscopy analysis are deposited in the Protein Data Bank (PDB) under the accession code 8FUK [https://doi.org/10.2210/pdb8FUK/pdb], and the corresponding electron microscopy data are available in the Electron Microscopy Data Bank (EMDB) under the code EMD-29459. Source data are provided with this paper.
References
Barrangou, R. & Horvath, P. A decade of discovery: CRISPR functions and applications. Nat. Microbiol. 2, 1–9 (2017).
Marraffini, L. A. CRISPR-Cas immunity in prokaryotes. Nature 526, 55–61 (2015).
Mohanraju, P. et al. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science 353, aad5147 (2016).
Faure, G. et al. CRISPR–Cas in mobile genetic elements: counter-defence and beyond. Nat. Rev. Microbiol. 17, 513–525 (2019).
Faure, G., Makarova, K. S. & Koonin, E. V. CRISPR–Cas: complex functional networks and multiple roles beyond adaptive immunity. J. Mol. Biol. 431, 3–20 (2019).
Shen, Y. et al. Structural basis for DNA targeting by the Tn7 transposon. Nat. Struct. Mol. Biol. 29, 143–151 (2022).
Waddell, C. S. & Craig, N. L. Tn7 transposition: two transposition pathways directed by five Tn7-encoded genes. Genes Dev. 2, 137–149 (1988).
Kubo, K. M. & Craig, N. L. Bacterial transposon Tn7 utilizes two different classes of target sites. J. Bacteriol. 172, 2774–2778 (1990).
Craig, N. L. Transposon Tn7. Curr Top Microbiol Immunol. 204, 27–48 (1996).
Parks, A. R. & Peters, J. E. Tn7 elements: engendering diversity from chromosomes to episomes. Plasmid 61, 1–14 (2009).
Peters, J. E. Tn7. Microbiol Spectr. https://doi.org/10.1128/microbiolspec.MDNA3-0010-2014 (2014).
Peters, J. E. & Craig, N. L. Tn7 recognizes transposition target structures associated with DNA replication using the DNA-binding protein TnsE. Genes Dev. 15, 737–747 (2001).
Peters, J. E., Makarova, K. S., Shmakov, S. & Koonin, E. V. Recruitment of CRISPR-Cas systems by Tn7-like transposons. Proc. Natl Acad. Sci. USA 114, 7358–7366 (2017).
Klompe, S. E., Vo, P. L. H., Halpin-Healy, T. S. & Sternberg, S. H. Transposon-encoded CRISPR-Cas systems direct RNA-guided DNA integration. Nature 571, 219–225 (2019).
Strecker, J. et al. RNA-guided DNA insertion with CRISPR-associated transposases. Science 365, 48–53 (2019).
Saito, M. et al. Dual modes of CRISPR-associated transposon homing. Cell 184, 2441–2453 (2021).
Petassi, M. T., Hsieh, S.-C. & Peters, J. E. Guide RNA categorization enables target site choice in Tn7-CRISPR-Cas transposons. Cell 183, 1757–1771 (2020).
Minakhina, S., Kholodii, G., Mindlin, S., Yurieva, O. & Nikiforov, V. Tn5053 family transposons are res site hunters sensing plasmidal res sites occupied by cognate resolvases. Mol. Microbiol. 33, 1059–1068 (1999).
Blackwell, G. A., Iqbal, Z. & Thomson, N. R. Evolution and spread of bacterial transposons. Access Microbiology 1, https://doi.org/10.1099/acmi.ac2019.po0568 (2019).
Rybarski, J. R., Hu, K., Hill, A. M., Wilke, C. O. & Finkelstein, I. J. Metagenomic discovery of CRISPR-associated transposons. Proc. Natl Acad. Sci. USA 118, e2112279118 (2021).
Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561 (2005).
Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007).
Brouns, S. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008).
Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71 (2010).
Nuñez, J. K. et al. Cas1-Cas2 complex formation mediates spacer acquisition during CRISPR-Cas adaptive immunity. Nat. Struct. Mol. Biol. 21, 528–534 (2014).
Dyda, F. & Hickman, A. B. Mechanism of spacer integration links the CRISPR/Cas system to transposition as a form of mobile DNA. Mobile DNA 6, 1–5 (2015).
Lee, H. & Sashital, D. G. Creating memories: molecular mechanisms of CRISPR adaptation. Trends Biochem. Sci. 47, 464–476 (2022).
McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR-Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).
Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI reference sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 33, D501–D504 (2005).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 1–9 (2009).
Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
Skennerton, C. Minced-mining CRISPRs in environmental datasets. Github. https://github.com/ctSkennerton/minced (2016).
Hullahalli, K. et al. Comparative analysis of the orphan CRISPR2 locus in 242 Enterococcus faecalis strains. PLoS ONE 10, e0138890 (2015).
Almendros, C., Guzmán, N. M., García-Martínez, J. & Mojica, F. J. M. Anti-cas spacers in orphan CRISPR4 arrays prevent uptake of active CRISPR-Cas I-F systems. Nat. Microbiol. 1, 1–8 (2016).
Shmakov, S. A. et al. CRISPR arrays away from cas genes. CRISPR J. 3, 535–549 (2020).
Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 1–14 (2011).
Curtiss, R. Bacterial conjugation. Annu. Rev. Microbiol. 23, 69–136 (1969).
Kolter, R., Inuzuka, M. & Helinski, D. R. Trans-complementation-dependent replication of a low molecular weight origin fragment from plasmid R6K. Cell 15, 1199–1208 (1978).
Ferrières, L. et al. Silent mischief: bacteriophage Mu insertions contaminate products of Escherichia coli random mutagenesis performed using suicidal transposon delivery plasmids mobilized by broad-host-range RP4 conjugative machinery. J. Bacteriol. 192, 6418–6427 (2010).
Rakowski, S. A. & Filutowicz, M. Plasmid R6K replication control. Plasmid 69, 231–242 (2013).
Bradley, D. E. & Whelan, J. Conjugation systems of IncT plasmids. Microbiology 131, 2665–2671 (1985).
Bauman, N. & Davis, B. D. Selection of auxotrophic bacterial mutants through diaminopimelic acid or thymine deprival. Science 126, 170–170 (1957).
Choi, K.-H. & Schweizer, H. P. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protocols 1, 153–161 (2006).
Chaffin, D. & Rubens, C. Blue/white screening of recombinant plasmids in Gram-positive bacteria by interruption of alkaline phosphatase gene (phoZ) expression. Gene 219, 91–99 (1998).
Wehmeier, U. F. New multifunctional Escherichia coli-Streptomyces shuttle vectors allowing blue-white screening on XGal plates. Gene 165, 149–150 (1995).
Wimmer, F., Mougiakos, I., Englert, F. & Beisel, C. L. Rapid cell-free characterization of multi-subunit CRISPR effectors and transposons. Mol. Cell 82, 1210–1224 (2022).
Halpin-Healy, T. S., Klompe, S. E., Sternberg, S. H. & Fernández, I. S. Structural basis of DNA targeting by a transposon-encoded CRISPR–Cas system. Nature 577, 271–274 (2020).
Jia, N., Xie, W., de la Cruz, M. J., Eng, E. T. & Patel, D. J. Structure-function insights into the initial step of DNA integration by a CRISPR-Cas-Transposon complex. Cell Res. 30, 182–184 (2020).
Li, Z., Zhang, H., Xiao, R. & Chang, L. Cryo-EM structure of a type IF CRISPR RNA guided surveillance complex bound to transposition protein TniQ. Cell Res. 30, 179–181 (2020).
Park, J.-U. et al. Structures of the holo CRISPR RNA-guided transposon integration complex. Nature 613, 775–782 (2022).
Schmitz, M., Querques, I., Oberli, S., Chanez, C. & Jinek, M. Structural basis for the assembly of the type V CRISPR-associated transposon complex. Cell 185, 4999–5010 (2022).
Davidson, A. R. et al. Anti-CRISPRs: protein Inhibitors of CRISPR-Cas Systems. Annu. Rev. Biochem. 89, 309–332 (2020).
Pawluk, A., Davidson, A. R. & Maxwell, K. L. Anti-CRISPR: Discovery, mechanism and function. Nat. Rev. Microbiol. 16, 12–17 (2018).
Landsberger, M. et al. Anti-CRISPR Phages Cooperate to Overcome CRISPR-Cas Immunity. Cell 174, 908–916.e12 (2018).
Craig, N. L. Tn7: a target site-specific transposon. Mol. Microbiol. 5, 2569–2573 (1991).
McDonald, N. D., Regmi, A., Morreale, D. P., Borowski, J. D. & Boyd, E. F. CRISPR–Cas systems are present predominantly on mobile genetic elements in Vibrio species. BMC Genomics 20, 105 (2019).
Klompe, S. E. et al. Evolutionary and mechanistic diversity of Type IF CRISPR-associated transposons. Mol. Cell 82, 616–628 (2022).
Park, J.-U. et al. Multiple adaptations underly co-option of a CRISPR surveillance complex for RNA-guided DNA transposition. Mol. Cell 83, 1827–1838 (2023).
Yang, S. et al. Orthogonal CRISPR-associated transposases for parallel and multiplexed chromosomal integration. Nucleic Acids Res. 49, 10192–10202 (2021).
Kuznedelov, K. et al. Altered stoichiometry Escherichia coli Cascade complexes with shortened CRISPR RNA spacers are capable of interference and primed adaptation. Nucleic Acids Res. 44, 10849–10861 (2016).
Gleditzsch, D. et al. Modulating the Cascade architecture of a minimal type I-F CRISPR-Cas system. Nucleic Acids Res. 44, 5872–5882 (2016).
Songailiene, I. et al. Decision-making in Cascade complexes harboring crRNAs of altered length. Cell Rep. 28, 3157–3166 (2019).
Tuminauskaite, D. et al. DNA interference is controlled by R-loop length in a type I-F1 CRISPR-Cas system. BMC Biol. 18, 1–16 (2020).
Rubin, B. E. et al. Species- and site-specific genome editing in complex bacterial communities. Nat. Microbiol. 7, 34–47 (2022).
Makarova, K. S. et al. An updated evolutionary classification of CRISPR-Cas systems. Nat. Rev. Microbiol. 13, 722–736 (2015).
Silas, S. et al. Type III CRISPR-Cas systems can provide redundancy to counteract viral escape from type I systems. Elife 6, e27601 (2017).
Vink, J. N., Baijens, J. H. & Brouns, S. J. PAM-repeat associations and spacer selection preferences in single and co-occurring CRISPR-Cas systems. Genome Biol. 22, 1–25 (2021).
Hoikkala, V. et al. Cooperation between different CRISPR-Cas types enables adaptation in an RNA-targeting system. Mbio 12, e03338–20 (2021).
Hill, A. M., Rybarski, J. R., Hu, K., Finkelstein, I. J. & Wilke, C. O. Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets. J. Open Source Softw. 6 (2021).
Makarova, K. S. & Koonin, E. V. Annotation and classification of CRISPR-Cas systems. Methods Mol Biol. 1311, 47–75 (2015).
Choi, K.-H. et al. A Tn7-based broad-range bacterial cloning and expression system. Nat. Methods 2, 443–448 (2005).
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D: Biol. Crystallogr. 66, 486–501 (2010).
Croll, T. I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. D: Struct. Biol. 74, 519–530 (2018).
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D: Biol. Crystallogr. 66, 213–221 (2010).
Shen, W., Le, S., Li, Y. & Hu, F. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE 11, e0163962 (2016).
Kolmogorov, M. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110 (2020).
Acknowledgements
The authors would like to acknowledge Prof. Jeff Barrick and Dr. Sean Leonard for helping us to establish the conjugation assay. This work was supported by NIGMS grants R01GM124141 (to I.J.F.) and R01GM088344 (to C.O.W.), the Welch Foundation grant F-1808 (to I.J.F.), and the College of Natural Sciences Catalyst Award for seed funding.
Author information
Contributions
K.H. and I.J.F. conceived the project. K.H. and C.-W.C. performed all experiments and analyzed the data. K.H. and C.-W.C. prepared the figures. I.J.F. and C.O.W. secured the funding. I.J.F. and C.O.W. supervised the project. K.H., C.-W.C., C.O.W., and I.J.F. wrote the manuscript with input from all co-authors.
Ethics declarations
Competing interests
K.H., C.O.W., I.J.F., and UT-Austin have filed a patent disclosure relating to using CRISPR-associated transposons for bacterial and mammalian gene editing. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Quanjiang Ji, who co-reviewed with Weizhong Chen, Sheng Yang, who co-reviewed with Siqi Yang, and the other, anonymous, reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hu, K., Chou, CW., Wilke, C.O. et al. Distinct horizontal transfer mechanisms for type I and type V CRISPR-associated transposons. Nat Commun 15, 6653 (2024). https://doi.org/10.1038/s41467-024-50816-w
DOI: https://doi.org/10.1038/s41467-024-50816-w








