Exapted CRISPR–Cas12f homologues drive RNA-guided transcription

Abstract

Bacterial transcription initiation is a tightly regulated process that canonically relies on sequence-specific promoter recognition by dedicated sigma (σ) factors, leading to functional DNA engagement by RNA polymerase (RNAP)1. Although the seven σ factors in Escherichia coli have been extensively characterized2, Bacteroidetes species encode dozens of specialized, extracytoplasmic function σ factors (σE) whose precise roles are unknown, pointing to additional layers of regulatory potential3. Here we uncover a mechanism of RNA-guided gene activation involving the coordinated action of σE factors in complex with nuclease-dead Cas12f (dCas12f). We screened a large set of genetically linked dCas12f and σE homologues in E. coli using RNA and chromatin immunoprecipitation experiments, revealing systems that exhibit robust guide RNA enrichment and DNA target binding with a minimal 5′-G target-adjacent motif. Recruitment of σE was dependent on dCas12f and guide RNA, suggesting direct protein–protein interactions, and co-expression experiments demonstrated that the dCas12f–gRNA–σE ternary complex was competent for programmable recruitment of the RNAP holoenzyme. Remarkably, dCas12f–gRNA–σE complexes drove potent gene expression in the absence of any requisite promoter motifs, with de novo transcription start sites defined exclusively by the relative distance from the dCas12f-mediated R-loop. Our findings highlight a new paradigm of RNA-guided transcription that embodies natural features reminiscent of CRISPR activation (CRISPRa) technology4,5.

Fig. 1: Nuclease-dead Cas12f homologues are genetically associated with atypical σ factor genes.
Fig. 2: Experimental discovery of guide RNA and target DNA substrates of dCas12f.
Fig. 3: dCas12f-associated gRNAs target conserved ncRNA loci and regulate diverse gene expression programmes.
Fig. 4: RNA-guided dCas12f recruits σE, but not HTH, to genomic target sites.
Fig. 5: dCas12f and σE direct programmable, RNA-guided transcription with single base-pair resolution.

Data availability

Next-generation sequencing data are available in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (BioProject accession: PRJNA1247282) and the Gene Expression Omnibus (GSE293889). The published genomes used for bioinformatics analyses were obtained from NCBI (Supplementary Table 4). Datasets generated and analysed in the current study are available from the corresponding authors on reasonable request.

Code availability

Custom scripts used for bioinformatics are available at https://github.com/sternberglab/Hoffmann_et_al_2026.

References

  1. Feklistov, A., Sharon, B. D., Darst, S. A. & Gross, C. A. Bacterial sigma factors: a historical, structural, and genomic perspective. Annu. Rev. Microbiol. 68, 357–376 (2014).

  2. Sharma, U. K. & Chatterji, D. Transcriptional switching in Escherichia coli during stress and starvation by modulation of σ70 activity. FEMS Microbiol. Rev. 34, 646–657 (2010).

  3. Casas-Pastor, D. et al. Expansion and re-classification of the extracytoplasmic function (ECF) σ factor family. Nucleic Acids Res. 49, 986–1005 (2021).

  4. Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR–Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).

  5. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).

  6. Makarova, K. S. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).

  7. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).

  8. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR–Cas system. Cell 163, 759–771 (2015).

  9. Shmakov, S. et al. Diversity and evolution of class 2 CRISPR–Cas systems. Nat. Rev. Microbiol. 15, 169–182 (2017).

  10. Altae-Tran, H. et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science 374, 57–65 (2021).

  11. Sasnauskas, G. et al. TnpB structure reveals minimal functional core of Cas12 nuclease family. Nature 616, 384–389 (2023).

  12. Meers, C. et al. Transposon-encoded nucleases use guide RNAs to promote their selfish spread. Nature 622, 863–871 (2023).

  13. Durrant, M. G. et al. Bridge RNAs direct programmable recombination of target and donor DNA. Nature 630, 984–993 (2024).

  14. Siddiquee, R., Pong, C. H., Hall, R. M. & Ataide, S. F. A programmable seekRNA guides target selection by IS1111 and IS110 type insertion sequences. Nat. Commun. 15, 5235 (2024).

  15. Vaysset, H., Meers, C., Cury, J., Bernheim, A. & Sternberg, S. H. Evolutionary origins of archaeal and eukaryotic RNA-guided RNA modification in bacterial IS110 transposons. Nat. Microbiol. 10, 20–27 (2025).

  16. Saito, M. et al. Fanzor is a eukaryotic programmable RNA-guided endonuclease. Nature 620, 660–668 (2023).

  17. Altae-Tran, H. et al. Diversity, evolution, and classification of the RNA-guided nucleases TnpB and Cas12. Proc. Natl Acad. Sci. USA 120, e2308224120 (2023).

  18. Wiegand, T. et al. TnpB homologues exapted from transposons are RNA-guided transcription factors. Nature 631, 439–448 (2024).

  19. Workman, R. E. et al. A natural single-guide RNA repurposes Cas9 to autoregulate CRISPR–Cas expression. Cell 184, 675–688.e19 (2021).

  20. Sampson, T. R., Saroj, S. D., Llewellyn, A. C., Tzeng, Y.-L. & Weiss, D. S. A CRISPR/Cas system mediates bacterial innate immune evasion and virulence. Nature 497, 254–257 (2013).

  21. Ratner, H. K. et al. Catalytically active Cas9 mediates transcriptional interference to facilitate bacterial virulence. Mol. Cell 75, 498–510.e5 (2019).

  22. Wu, W. Y. et al. The miniature CRISPR–Cas12m effector binds DNA to block transcription. Mol. Cell 82, 4487–4502.e7 (2022).

  23. Huang, C. J., Adler, B. A. & Doudna, J. A. A naturally DNase-free CRISPR–Cas12c enzyme silences gene expression. Mol. Cell 82, 2148–2160.e4 (2022).

  24. Li, M. et al. Toxin-antitoxin RNA pairs safeguard CRISPR–Cas systems. Science 372, eabe5601 (2021).

  25. Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).

  26. Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).

  27. Zhang, X. et al. Multiplex gene regulation by CRISPR–ddCpf1. Cell Discov. 3, 17018 (2017).

  28. Zalatan, J. G. et al. Engineering complex synthetic transcriptional programs with CRISPR RNA scaffolds. Cell 160, 339–350 (2015).

  29. Fontana, J. et al. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618 (2020).

  30. Burgess, R. R., Travers, A. A., Dunn, J. J. & Bautz, E. K. F. Factor stimulating transcription by RNA polymerase. Nature 221, 43–46 (1969).

  31. Ross, W. et al. A third recognition element in bacterial promoters: DNA binding by the α subunit of RNA polymerase. Science 262, 1407–1413 (1993).

  32. Ishihama, A. Functional modulation of Escherichia coli RNA polymerase. Annu. Rev. Microbiol. 54, 499–518 (2000).

  33. Paget, M. S. Bacterial sigma factors and anti-sigma factors: structure, function and distribution. Biomolecules 5, 1245–1265 (2015).

  34. Erickson, J. W. & Gross, C. A. Identification of the sigma E subunit of Escherichia coli RNA polymerase: a second alternate sigma factor involved in high-temperature gene expression. Genes Dev. 3, 1462–1471 (1989).

  35. Mecsas, J., Rouviere, P. E., Erickson, J. W., Donohue, T. J. & Gross, C. A. The activity of sigma E, an Escherichia coli heat-inducible sigma-factor, is modulated by expression of outer membrane proteins. Genes Dev. 7, 2618–2628 (1993).

  36. Hove, B. V., Staudenmaier, H. & Braun, V. Novel two-component transmembrane transcription control: regulation of iron dicitrate transport in Escherichia coli K-12. J. Bacteriol. 172, 6749–6758 (1990).

  37. Pinto, D. & da Fonseca, R. R. Evolution of the extracytoplasmic function σ factor protein family. NAR Genom. Bioinformatics 2, lqz026 (2020).

  38. Martens, E. C., Roth, R., Heuser, J. E. & Gordon, J. I. Coordinate regulation of glycan degradation and polysaccharide capsule biosynthesis by a prominent human gut symbiont. J. Biol. Chem. 284, 18445–18457 (2009).

  39. Ades, S. E. Regulation by destruction: design of the σE envelope stress response. Curr. Opin. Microbiol. 11, 535–540 (2008).

  40. Xiao, R., Li, Z., Wang, S., Han, R. & Chang, L. Structural basis for substrate recognition and cleavage by the dimerization-dependent CRISPR–Cas12f nuclease. Nucleic Acids Res. 49, 4120–4128 (2021).

  41. Aravind, L., Anantharaman, V., Balaji, S., Babu, M. & Iyer, L. The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol. Rev. 29, 231–262 (2005).

  42. Yokoyama, T. et al. The Escherichia coli S2P intramembrane protease RseP regulates ferric citrate uptake by cleaving the sigma factor regulator FecR. J. Biol. Chem. 296, 100673 (2021).

  43. Hoffmann, F. T. et al. Selective TnsC recruitment enhances the fidelity of RNA-guided transposition. Nature 609, 384–393 (2022).

  44. Harrington, L. B. et al. Programmed DNA destruction by miniature CRISPR–Cas14 enzymes. Science 362, 839–842 (2018).

  45. Karvelis, T. et al. PAM recognition by miniature CRISPR–Cas12f nucleases triggers programmable double-stranded DNA target cleavage. Nucleic Acids Res. 48, 5016–5023 (2020).

  46. Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692–696 (2021).

  47. Brooks, B. E. & Buchanan, S. K. Signaling mechanisms for activation of extracytoplasmic function (ECF) sigma factors. Biochim. Biophys. Acta 1778, 1930–1945 (2008).

  48. Noinaj, N., Guillier, M., Barnard, T. J. & Buchanan, S. K. TonB-dependent transporters: regulation, structure, and function. Annu. Rev. Microbiol. 64, 43–60 (2010).

  49. Birkholz, N. et al. Phage anti-CRISPR control by an RNA- and DNA-binding helix–turn–helix protein. Nature 631, 670–677 (2024).

  50. Bayley, D. P., Rocha, E. R. & Smith, C. J. Analysis of cepA and other Bacteroides fragilis genes reveals a unique promoter structure. FEMS Microbiol. Lett. 193, 149–154 (2000).

  51. Chen, S., Bagdasarian, M., Kaufman, M. G. & Walker, E. D. Characterization of strong promoters from an environmental Flavobacterium hibernum strain by using a green fluorescent protein-based reporter system. Appl. Environ. Microbiol. 73, 1089–1100 (2007).

  52. Xiao, R. et al. Structural basis of RNA-guided transcription by a dCas12f–σE–RNAP complex. Nature https://doi.org/10.1038/s41586-026-10178-3 (2026).

  53. Harden, T. T. et al. Bacterial RNA polymerase can retain σ70 throughout transcription. Proc. Natl Acad. Sci. USA 113, 602–607 (2016).

  54. Jacob, F. & Monod, J. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318–356 (1961).

  55. Balleza, E. et al. Regulation by transcription factors in bacteria: beyond description. FEMS Microbiol. Rev. 33, 133–151 (2009).

  56. Bak, G., Han, K., Kim, D. & Lee, Y. Roles of rpoS-activating small RNAs in pathways leading to acid resistance of Escherichia coli. MicrobiologyOpen 3, 15–28 (2014).

  57. Massé, E. & Gottesman, S. A small RNA regulates the expression of genes involved in iron metabolism in Escherichia coli. Proc. Natl Acad. Sci. USA 99, 4620–4625 (2002).

  58. Madej, M. et al. Structural and functional insights into oligopeptide acquisition by the RagAB transporter from Porphyromonas gingivalis. Nat. Microbiol. 5, 1016–1025 (2020).

  59. Grondin, J. M., Tamura, K., Déjean, G., Abbott, D. W. & Brumer, H. Polysaccharide utilization loci: fueling microbial communities. J. Bacteriol. 199, e00860-16 (2017).

  60. Lapébie, P., Lombard, V., Drula, E., Terrapon, N. & Henrissat, B. Bacteroidetes use thousands of enzyme combinations to break down glycans. Nat. Commun. 10, 2043 (2019).

  61. Tong, M. et al. A highly conserved SusCD transporter determines the import and species-specific antagonism of Bacteroides ubiquitin homologues. Nat. Commun. 15, 8794 (2024).

  62. Martens, E. C., Chiang, H. C. & Gordon, J. I. Mucosal glycan foraging enhances fitness and transmission of a saccharolytic human gut bacterial symbiont. Cell Host Microbe 4, 447–457 (2008).

  63. Feng, J. et al. Polysaccharide utilization loci in Bacteroides determine population fitness and community-level interactions. Cell Host Microbe 30, 200–215.e12 (2022).

  64. Todor, H. et al. Rewiring the specificity of extracytoplasmic function sigma factors. Proc. Natl Acad. Sci. USA 117, 33496–33506 (2020).

  65. Gray, D. A. et al. Insights into SusCD-mediated glycan import by a prominent gut symbiont. Nat. Commun. 12, 44 (2021).

  66. Takeda, S. N. et al. Structure of the miniature type V-F CRISPR–Cas effector enzyme. Mol. Cell 81, 558–570.e3 (2021).

  67. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).

  68. Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).

  69. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  70. Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).

  71. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  72. Winter, D. J. rentrez: An R package for the NCBI eUtils API. The R Journal 9, 520–526 (2017).

  73. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: Functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).

  74. Tang, S. et al. De novo gene synthesis by an antiviral reverse transcriptase. Science 386, eadq0876 (2024).

  75. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).

  76. Vasimuddin, M., Misra, S., Li, H. & Aluru, S. in 2019 IEEE International Parallel and Distributed Processing Symposium 314–324 (IEEE Computer Society, 2019).

  77. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  78. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).

  79. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  80. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).

  81. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  82. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  83. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  84. Cooper, L. A., Stringer, A. M. & Wade, J. T. Determining the specificity of cascade binding, interference, and primed adaptation in vivo in the Escherichia coli Type I-E CRISPR–Cas system. mBio 9, e02100-17 (2018).

  85. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  86. Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).

  87. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).

  88. Will, S., Joshi, T., Hofacker, I. L., Stadler, P. F. & Backofen, R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA 18, 900–914 (2012).

  89. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  90. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).

  91. Schwengers, O. et al. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom. 7, 000685 (2021).

  92. Rice, P., Longden, I. & Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).

  93. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).

Acknowledgements

We thank A. J. Robinson, S. Kang, J. L. Ramirez and T. M. Smith for laboratory support; E. A. Campbell and S. A. Darst for helpful discussion; Z. Hua for custom scripts; Z. Quan for providing Fta strains; the JP Sulzberger Columbia Genome Center for NGS support; and L. F. Landweber for qPCR and gel imager instrument access. S.T. was supported by a Ruth L. Kirschstein Individual Predoctoral Fellowship (F30AI183830) from the NIH. L.C. was supported by NIH grant R01GM138675 and by the National Science Foundation (NSF) Faculty Early Career Development Program (CAREER) Award 2339799. S.H.S. was supported by NIH grant R01EB031935, NSF Faculty Early Career Development Program (CAREER) Award 2239685, a Pew Biomedical Scholarship, an Irma T. Hirschl Career Scientist Award, the Howard Hughes Medical Institute Investigator Program, and a generous startup package from the Columbia University Irving Medical Center Dean’s Office and the Vagelos Precision Medicine Fund.

Author information

Contributions

F.T.H. and S.H.S. conceived the project. F.T.H. designed and performed most experiments. T.W. performed most bioinformatics analyses. F.T.H., A.I.P. and J.G.-K. performed cloning and RFP activation assays. R.X. cloned RNAP expression plasmids and contributed to data interpretation. F.T.H. and S.T. analysed RNA-seq and RIP-seq data. H.C.L. and C.M. performed initial phylogenetics and bioinformatics analyses. G.D.L. and F.T.H. performed flow cytometry assays. L.C. contributed to data interpretation. S.H.S. oversaw the project. F.T.H., T.W. and S.H.S. discussed the data and wrote the manuscript, with input from all authors.

Corresponding author

Correspondence to Samuel H. Sternberg.

Ethics declarations

Competing interests

S.H.S. is a co-founder and scientific advisor to Dahlia Biosciences, a scientific advisor to CrisprBits and Prime Medicine, and an equity holder in Dahlia Biosciences and CrisprBits. S.H.S., F.T.H. and T.W. are inventors on patents related to CRISPR–Cas-like systems and uses thereof. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Wen Wu who co-reviewed with Prarthana Mohanraju, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Evolutionary analysis and genomic context of diverse dcas12f and rpoE genes.

a, Representative genomic neighbourhoods of predicted nuclease-dead Cas12f homologues that are not associated with rpoE (σE) genes. CRISPR arrays and putative gRNAs are annotated, as are nearby genes. Putative gRNAs were identified by detecting covariance in the intergenic regions upstream of CRISPR loci, and the predicted secondary structure of a representative example is shown in the inset. b, Map of a dcas12f locus and its putative gRNA target upstream of an RND efflux system (middle), in a metagenome-assembled genome of a Chitinophagaceae bacterium (top). The putative guide-target duplex and predicted gRNA structure are highlighted (bottom). c, Correlation between the percent identity of select dCas12f homologues (y-axis) and either σE (rpoE) or HTH homologues (x-axis) to the F. taeanensis homologues. The stronger correlation with σE (R² = 0.78) and weaker correlation with HTH (R² = 0.19) suggest a tighter genetic linkage of dcas12f with rpoE than with hth. d, Magnified and simplified view of a partial σE phylogenetic tree from Fig. 1f (left), showing homologues containing an additional C-terminal domain (CTD) that are associated with either dCas12f homologues (pink) or restriction enzyme (RE)-like homologues (lavender). Representative genomic neighbourhoods (right) highlight the tight operonic arrangement of rpoE and RE-like genes, supporting a potential model in which nuclease-dead RE proteins similarly recruit atypical σE proteins to sites of transcription.
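
As an illustration of the correlation reported in panel c, the following minimal sketch computes R² for two lists of percent identities. It assumes that R² refers to a simple one-variable linear fit, in which case it equals the squared Pearson correlation coefficient; the legend does not state how the value was calculated. The function name and the example values are hypothetical and are not data from this study.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x.

    For a one-variable least-squares line, this equals the squared
    Pearson correlation coefficient.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical percent identities to a reference homologue (illustrative only).
sigma_identity = [35.0, 48.0, 60.0, 72.0, 90.0]    # x-axis: sigma factor homologues
dcas12f_identity = [30.0, 45.0, 52.0, 70.0, 88.0]  # y-axis: dCas12f homologues
print(round(r_squared(sigma_identity, dcas12f_identity), 3))
```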

Extended Data Fig. 2 Pairwise sequence identity matrices for dCas12f, σE, HTH, and gRNAs.

a, Heatmap of pairwise amino acid sequence identity percentages among dCas12f homologues tested in this study. The matrix is color-coded by sequence identity (see legend inset), and percentages are listed. b, Pairwise amino acid sequence identity between tested σE homologues, shown as in a. The Bba locus lacks an rpoE gene. c, Pairwise amino acid sequence identity between tested HTH homologues, shown as in a. Note that Sda, Lby, and Pdi loci contain two hth genes, while this gene is absent in the Bba locus. d, Pairwise nucleotide sequence identity between tested gRNAs, shown as in a. e, Annotated genomic loci of all homologous dCas12f-σE systems tested in this study, with hth, rpoE, and dcas12f genes labeled and colored in yellow, blue, and red, respectively. Predicted functions for other neighbouring genes in grey are indicated by COG (Clusters of Orthologous Groups) letters, as indicated in the legend below.
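
The pairwise identity matrices in panels a–d can be sketched as follows, assuming the sequences have already been aligned to equal length and that identity is computed over columns in which neither sequence has a gap. Other denominators are in common use, and the legend does not specify which definition was applied. The homologue names and sequences below are hypothetical placeholders.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Build a nested dict of pairwise percent identities from {name: aligned_seq}."""
    names = list(aligned)
    return {
        p: {q: percent_identity(aligned[p], aligned[q]) for q in names}
        for p in names
    }


# Hypothetical toy alignment (not sequences from this study).
toy_alignment = {
    "homA": "MKT-LLA",
    "homB": "MKTALLG",
    "homC": "MRT-LIA",
}
for name, row in identity_matrix(toy_alignment).items():
    print(name, {other: round(value, 1) for other, value in row.items()})
```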

Extended Data Fig. 3 Additional analysis of dCas12f-associated gRNAs from RIP–seq experiments and comparative genomics analyses.

a, RIP–seq coverage plots for the five indicated dCas12f homologues, revealing a well-defined gRNA with scaffold (orange) and guide (purple) regions. Nucleotides outside the boundary of the presumed full-length gRNA are coloured in grey. Total gRNA lengths are noted to the right of each plot. The guide region of Zunongwangia profunda is plotted on a separate y-axis scale, for visual clarity. Coverage is shown as counts per million (CPM). b, RIP–seq coverage plots for an additional dCas12f orthologue from Leeuwenhoekiella palythoae (Lpa), showing both a zoomed-out view (left), depicting the same data as shown in Fig. 2c, and a magnified view as in a (right). Region 1 annotated in the operon schematic (top left) encodes multiple tandem gRNAs with similar scaffold and guide sequences. Coverage is shown as in a. c, RIP–seq coverage plots for an additional dCas12f orthologue from Paenimyroides ummariense (Pum), shown as in b, with the magnified view at right comparing the aligned scaffold and guide sequences. The first two gRNA sequences are identical, and bases that differ in the third gRNA are highlighted in red. d, Annotated genomic neighbourhood of an rpoE-dcas12f operon that encodes both predicted full-length gRNAs (orange/purple) and multiple discontinuous CRISPR arrays (repeats in tan). The sequence similarity between the CRISPR repeats and gRNA scaffold (grey circles, bottom) suggests a potential evolutionary emergence of chimeric, dCas12f-associated single guide RNAs from CRISPR arrays.
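
For readers unfamiliar with the counts per million (CPM) unit used in these coverage plots, the sketch below shows the usual scaling: each raw count is multiplied by 10^6 and divided by the library size. It assumes the library size is the total number of mapped reads; the exact normalization settings used for the figures are not stated in this legend. The function name and numbers are hypothetical.

```python
def counts_per_million(coverage, total_mapped_reads):
    """Scale raw per-position read counts to counts per million (CPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [count * scale for count in coverage]


# Hypothetical raw coverage over five positions and a hypothetical library size.
raw_coverage = [0, 12, 480, 950, 30]
library_size = 4_000_000
print(counts_per_million(raw_coverage, library_size))
```

Because CPM depends only on library size, values are comparable across samples sequenced to different depths, which is why the legends can quote a shared y-axis scale.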

Extended Data Fig. 4 Culturing, whole genome sequencing, and RNA–seq of Flagellimonas taeanensis strains that encode dCas12f-σE systems.

a, Summary table of strain information and culturing conditions for five F. taeanensis (Fta) strains and one Mucilaginibacter rigui strain, which were acquired because of the likely presence of rpoE-dcas12f loci. The number of loci identified after whole-genome sequencing (WGS) is highlighted. b, BioCircos plots of the six strains in a after WGS analysis, with the positions of rpoE-dcas12f loci highlighted in red. The internal strain ID, total genome size and GC content are indicated; contigs are denoted in cases where genome assembly was incomplete. c, Genomic neighbourhoods of rpoE-dcas12f loci in the indicated strains from a. Genes encoding HTH (hth), σE (rpoE) and dCas12f (dcas12f) are shown in yellow, blue, and red, respectively; gRNAs and hth-associated ncRNAs are annotated in orange and magenta, respectively. d, Magnified RNA–seq coverage plots from Fta strain sSL4759 for two distinct loci, highlighting the abundance of reads corresponding to hth-associated ncRNAs at both rpoE-dcas12f locus 1 (left) and the presumed susC target site of dCas12f-associated gRNAs (right). The top panels show a 2-kbp window; the bottom panels zoom in on the ncRNA sequence. Coverage is shown as counts per million (CPM).

Extended Data Fig. 5 ChIP–seq experiments reveal TAM and gRNA guide sequences for additional dCas12f homologues.

a, Genome-wide representation of ChIP–seq data for the indicated dCas12f homologues (purple), compared to the input control scaled the same as Ebr (top). Coverage is shown as counts per million (CPM), normalized to the highest peak within each sample or to a value of 200, as shown. b, Binding events were analysed by MEME-ChIP, which revealed strongly conserved consensus motifs for eight dCas12f homologues that correspond to the putative target-adjacent motif (TAM) and gRNA-matching target DNA sequence within the seed, for each homologue. E, E-value significance; n, number of peaks contributing to the motif. The percentage of total peaks contributing to each motif is shown in parentheses. Motifs could not be confidently determined for the remaining five dCas12f homologues due to a paucity of enriched peaks in heterologous expression experiments.

Extended Data Fig. 6 Investigative strategy to uncover putative regulatory functions of dCas12f-σE systems.

a, Bioinformatics strategy to determine high-confidence covariance models (CM) for both the dCas12f-associated gRNA (top) and hth-associated ncRNA (bottom). b, Schematic of bioinformatics workflow to globally identify RNA-guided DNA targets of dCas12f-σE systems in sequenced bacterial genomes. After identifying dCas12f-associated gRNAs and extracting guide sequences, putative targets were identified that exhibit perfect complementarity within a 6-nt seed sequence, reside within intergenic regions, and exist upstream of protein-coding genes. Target loci were also analysed for the presence of predicted hth-associated ncRNAs. c, Histogram quantifying distances between the TAM of predicted dCas12f target sites and the start codons of associated target genes. 156 bioinformatically predicted dCas12f targets were included in this analysis; distances for Fta dCas12f.1 and dCas12f.2 are highlighted for reference. d, Table summarizing the bioinformatics results from b, listing the number of predicted DNA targets, gRNA guide sequences, and genomes for each class. Putative RNA-guided DNA targets fall into four functional categories that include regulation of transmembrane transport, regulation of two-component system (TCS)-like systems, auto-regulation of rpoE-dcas12f loci, and regulation of rpoE-dcas12f loci in trans; other predicted targets await further categorization and analysis. Note that some guides have multiple targets within a genome, so a single guide is represented in multiple functional categories and totals do not match totals in b. Each guide within a genome is also capable of targeting multiple loci. Thus, some genomes have gRNAs with targets spanning multiple functional classes and totals similarly do not match totals in b. e, Exemplary dCas12f-σE system from Chryseobacterium gleum (Cgl), in which the dCas12f-associated gRNA putatively targets and transcriptionally regulates seven distinct genetic loci (1–7). 
The genome schematic (top) visualizes the approximate genomic location of each locus, and the magnified insets (below) report the position of dCas12f-gRNA targets (purple triangles) relative to nearby genes; note that each of the targets flanks a nearby hth gene and overlaps precisely with the predicted position of an hth-associated ncRNA (magenta rectangle). The schematics at right depict patterns of gRNA-DNA complementarity at each target site, relative to the TAM. f, Exemplary dCas12f-σE system from Sphingobacterium sp. DR205, in which both a single chimeric gRNA guide and two spacers from a vestigial CRISPR array target genomic sites proximal to susCD operons. The genome schematic (top) visualizes the approximate location of the rpoE-dcas12f locus and putative target sites, and the magnified insets (below) visualize the rpoE-dcas12f locus and RNA-guided DNA target sites, alongside corresponding published RNA–seq data for each locus. Coverage is shown as CPM. Three guides/spacers are indicated and labeled (circles), as well as their complementary targets (purple triangles), the predicted guide-target complementarity (purple shading), and the putative TAM (yellow shading).
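
The target-search criteria in panel b (a TAM, perfect complementarity within a 6-nt seed, and an intergenic location) can be illustrated with a short sketch. This is not the authors' workflow. It assumes, for illustration only, that the TAM is a single G immediately 5' of the target on the strand that reads like the guide, and that the seed is the TAM-proximal 6 nt of the guide. All function names and the gene-interval format are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_seed_matches(genome, guide, tam="G", seed_len=6):
    """Return (strand, tam_pos) for every exact TAM + seed match on either strand.

    tam_pos is the 0-based forward-strand coordinate of the TAM base.
    Assumes the TAM lies directly 5' of the seed (illustrative assumption).
    """
    genome = genome.upper()
    query = tam + guide.upper().replace("U", "T")[:seed_len]
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        start = seq.find(query)
        while start != -1:
            # Map reverse-strand indices back to forward-strand coordinates.
            pos = start if strand == "+" else len(genome) - start - 1
            hits.append((strand, pos))
            start = seq.find(query, start + 1)
    return hits


def in_intergenic(pos, genes):
    """True if pos lies outside every (start, end) gene interval (end exclusive)."""
    return all(not (s <= pos < e) for s, e in genes)
```

A full filter would also require each hit to sit upstream of a protein-coding gene, which depends on gene orientation and is omitted here for brevity.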

Extended Data Fig. 7 ChIP–seq experiments and analyses for dCas12f and HTH homologues.

a, Schematic of ChIP–seq assay to study genome-wide binding of dCas12f homologues programmed with gRNAs targeting the indicated site upstream of yidX (left), and genome-wide representation of ChIP–seq data alongside the input control (right). Coverage is shown as counts per million (CPM), normalized to the highest peak in the targeting samples. b, Bioinformatics strategy to globally discover putative binding sites of HTH proteins (left). Based on ChIP–seq data for Fta HTH that revealed a highly enriched binding site with inverted repeats (IRs) upstream of its own open reading frame (ORF), 369 additional potential HTH motifs were identified that similarly exhibit IRs but share little conservation (WebLogo, bottom right), consistent with the broad diversity of hth genes encoded in rpoE-dcas12f loci. The histogram of IR mismatch (top right) plots the number of mismatches between left and right copies of putative IR substrates of HTH, and their relative distance from the start of the hth ORF.
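
The inverted-repeat (IR) mismatch count plotted in panel b can be sketched as follows. This is a minimal illustration that assumes the left and right IR arms have already been extracted and have equal length. The function name is hypothetical, and the study's actual scoring may differ.

```python
def ir_mismatches(left, right):
    """Count mismatches between the left arm and the reverse complement of the right arm.

    A perfect inverted repeat gives 0.
    """
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    rc_right = "".join(comp[b] for b in reversed(right.upper()))
    if len(left) != len(rc_right):
        raise ValueError("IR arms must have equal length")
    return sum(a != b for a, b in zip(left.upper(), rc_right))
```

For instance, the arms `TTGACA` and `TGTCAA` form a perfect inverted repeat, whereas changing the last base of the right arm introduces a single mismatch.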

Extended Data Fig. 8 Optimization and additional analyses relating to RFP fluorescence reporter and ChIP–seq data.

a, Scatter plot of RNA–seq data from mRFP transcriptional reporter assays in Fig. 5a, comparing three replicates of targeting (T) or non-targeting (NT) gRNA experiments. TPM, transcripts per million. b, Panel of modified mRFP reporter plasmids (left), in which perturbations were made within the native susC target region including deletions or multi-terminator cassette insertions, as indicated. The resulting OD-normalized RFP fluorescence measurements for transcriptional reporter assays using these constructs are shown (right), with the starting (WT) construct tested in both targeting (T) and non-targeting (NT) conditions. c, OD-normalized RFP fluorescence measurements from control assays to determine signal detection limits. Cell culture samples from targeting (T) and non-targeting (NT) experiments (left) were mixed together to simulate variable ratios of RFP signal (middle), and the resulting regression analysis (right) revealed the expected linear relationship, with excellent sensitivity down to low single-digit percentages. d, OD-normalized RFP fluorescence for the indicated reporter DNA constructs, in which the TAM was mutated to each of the alternative nucleotides while maintaining an invariant target and guide sequence. e, Flow cytometry analysis for T and NT samples from b, alongside a negative control (N.C.) encoding no mRFP1. f, Consensus WebLogo of dCas12f gRNA-matching target sites from 156 aligned genomic loci, demonstrating conservation within the TAM, the target region, and a short T-rich stretch immediately adjacent to the target, but an absence of substantial conservation in other promoter motifs or the TSS itself. This observation suggests that RNA-guided transcription proceeds largely without fixed DNA sequence requirements. Coordinates are numbered relative to the TAM (top, black text) or to the TSS (bottom, red text). 
g, Panel of mutations in the region between the RNA-matching target site and empirically determined TSS, which were tested in the transcription reporter assay shown in Fig. 5a. Nucleotides highlighted in dark grey were mutated; coordinates are numbered relative to the TAM (top, black text) or to the TSS (bottom, red text). h, OD-normalized RFP fluorescence for the indicated reporter DNA constructs shown in g. T, targeting gRNA with WT (unmutated) intergenic region; NT, non-targeting gRNA control. i, Magnified view of ChIP–seq data for Flag-tagged σE in the absence (-FtaRNAP) and presence (+FtaRNAP) of the native FtaRNAP, alongside an input control (top). Coverage is shown as counts per million (CPM), normalized to the highest peak in the +FtaRNAP sample. j, Panel of guide sequence truncations in 2-bp increments, which were tested in the transcription reporter assay shown in Fig. 5a. The nucleotides highlighted in brown were truncated from the targeting gRNA. k, OD-normalized RFP fluorescence for the indicated gRNA constructs shown in j, using the transcriptional reporter assay shown in Fig. 5a. NT, non-targeting gRNA control. Data in b, h, and k are shown as mean ± s.d. for n = 3 biologically independent samples. Data in c, d, and e are shown as mean ± s.d. for n = 3, n = 3, and n = 5 technical replicates, respectively.
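
The mixing control in panel c rests on a simple expectation: if targeting (T) and non-targeting (NT) cultures are combined at a known fraction, the OD-normalized signal should vary linearly with that fraction. The sketch below illustrates this with an ordinary least-squares fit. The function names and any example fluorescence values are hypothetical and are not data from the study.

```python
def expected_mix_signal(frac_t, signal_t, signal_nt):
    """Expected signal when a fraction frac_t of the sample is the T culture."""
    return frac_t * signal_t + (1 - frac_t) * signal_nt


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Under this model the fitted intercept estimates the NT background and the slope estimates the difference between T and NT signals, so the smallest fraction that rises above replicate noise gives an approximate detection limit.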

Extended Data Fig. 9 Additional analyses relating to RNA–seq data.

a, RNA enrichment measured by reverse transcription (RT)–qPCR of target-E RNA–seq sample (Fig. 5b) for two distinct primer pairs (left). The targeting condition (T) reveals around 140-fold enrichment of transcripts versus the non-targeting (NT) condition. Standard curves for simulated gene activation to determine the limit of detection for primer pair 1 (centre) and primer pair 2 (right) suggest a detection limit of around 5%. b, RNA–seq coverage plots for the target-E off-target site near tyrS, shown in Fig. 5f. The approximately 46 bp distance between TAM and TSS upstream of tyrS is indicated. The zoomed-out view (right) shows that pdxY is upregulated because it is encoded directly downstream of tyrS. The TAM and extensive 11-bp complementarity between the off-target DNA site and guide sequence are shown below the coverage track. Coverage is shown for the reverse strand as counts per million (CPM). c, TSS plots derived from RNA-seq data, as shown in Fig. 5c, for all the individual gRNAs in Fig. 5g. Coverage is shown as CPM. d, RNA–seq tracks for a NT control and three distinct gRNAs designed for target-3 showing weak or no transcription initiation, likely due to binding site occlusion by other DNA binding factors involved in yidX regulation. Coverage is shown for the forward strand as CPM. NT, non-targeting. e, RNA–seq coverage plots for a gRNA designed for Target 6 (as shown in Fig. 5g) and three additional gRNAs that incrementally shift the TAM by 1 bp, changing it to A, T, and C. Targets flanked by a non-G-TAM fail to initiate transcription. Coverage is shown for the forward strand as CPM.
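
The legend does not state how the RT–qPCR enrichment in panel a was calculated. One common approach, shown here purely as an assumption, is the comparative Ct method, which compares the target transcript with a reference transcript and assumes roughly 100% amplification efficiency. The function name and all Ct values used with it are hypothetical.

```python
def fold_enrichment(ct_target_t, ct_ref_t, ct_target_nt, ct_ref_nt):
    """Comparative Ct (2^-ddCt) fold change of T versus NT.

    Assumes ~100% amplification efficiency (a doubling per cycle).
    """
    delta_t = ct_target_t - ct_ref_t      # target normalized to reference, T sample
    delta_nt = ct_target_nt - ct_ref_nt   # target normalized to reference, NT sample
    return 2.0 ** (-(delta_t - delta_nt))
```

Under this assumption, a 140-fold enrichment corresponds to a difference of roughly 7 cycles, since 2^7 = 128.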

Extended Data Fig. 10 Mechanisms of bacterial gene activation and repression, including newly discovered, RNA-guided pathways driven by dCas12f and TldR.

RNA-based mechanisms (top row) can activate gene expression, such as the RprA small RNA (sRNA) activating rpoS by relieving secondary structure inhibition, or repress gene expression, such as the RyhB sRNA destabilizing sodB through base pairing and RNase recruitment. Protein-based mechanisms (second row from top) can activate gene expression, such as Crp (cAMP receptor protein) enhancing RNAP recruitment at the envZ promoter, or repress gene expression, such as HTH transcription factors binding DNA to block RNAP-promoter recognition. σ factor-based mechanisms (second row from bottom) can activate gene expression, such as extracytoplasmic function (ECF) σE factors recruiting RNAP to specific promoters to drive transcription, or repress gene expression, such as when their activities are inhibited by FecR anti-σ factors. Finally, we report novel RNA-guided pathways of gene regulation (bottom row). In previous work, we uncovered TnpB-like nuclease-dead repressors (TldR) that exploit gRNAs to bind complementary DNA target sites, thereby preventing promoter recognition by RNAP (bottom right). In this study, we uncover nuclease-dead Cas12f proteins that exploit gRNAs to bind complementary DNA target sites and directly recruit σE factors and RNAP, thereby driving promoter-independent transcription of diverse genetic operons such as susCD polysaccharide utilization loci (bottom left). Collectively, our work highlights a new axis of gene regulation control via exapted, RNA-guided transcription factors akin to CRISPRi and CRISPRa.

Supplementary information

Supplementary Figure 1

Representative gating schemes for flow cytometry analysis of RFP fluorescence reporter.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–8.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hoffmann, F.T., Wiegand, T., Palmieri, A.I. et al. Exapted CRISPR–Cas12f homologues drive RNA-guided transcription. Nature 653, 277–287 (2026). https://doi.org/10.1038/s41586-026-10166-7


  • DOI: https://doi.org/10.1038/s41586-026-10166-7
