Extended Data Fig. 6: Investigative strategy to uncover putative regulatory functions of dCas12f-σE systems.
From: Exapted CRISPR–Cas12f homologues drive RNA-guided transcription

a, Bioinformatics strategy to determine high-confidence covariance models (CMs) for both the dCas12f-associated gRNA (top) and hth-associated ncRNA (bottom). b, Schematic of bioinformatics workflow to globally identify RNA-guided DNA targets of dCas12f-σE systems in sequenced bacterial genomes. After identifying dCas12f-associated gRNAs and extracting guide sequences, putative targets were identified that exhibit perfect complementarity within a 6-nt seed sequence, reside within intergenic regions, and exist upstream of protein-coding genes. Target loci were also analysed for the presence of predicted hth-associated ncRNAs. c, Histogram quantifying distances between the TAM of predicted dCas12f target sites and the start codons of associated target genes. 156 bioinformatically predicted dCas12f targets were included in this analysis; distances for Fta dCas12f.1 and dCas12f.2 are highlighted for reference. d, Table summarizing the bioinformatics results from b, listing the number of predicted DNA targets, gRNA guide sequences, and genomes for each class. Putative RNA-guided DNA targets fall into four functional categories that include regulation of transmembrane transport, regulation of two-component system (TCS)-like systems, auto-regulation of rpoE-dcas12f loci, and regulation of rpoE-dcas12f loci in trans; other predicted targets await further categorization and analysis. Note that some guides have multiple targets within a genome, so a single guide can be represented in multiple functional categories and the guide totals do not match the totals in b. Each guide within a genome is also capable of targeting multiple loci. Thus, some genomes have gRNAs with targets spanning multiple functional classes, and the genome totals similarly do not match the totals in b. e, Exemplary dCas12f-σE system from Chryseobacterium gleum (Cgl), in which the dCas12f-associated gRNA putatively targets and transcriptionally regulates seven distinct genetic loci (1–7).
The genome schematic (top) visualizes the approximate genomic location of each locus, and the magnified insets (below) report the position of dCas12f-gRNA targets (purple triangles) relative to nearby genes; note that each of the targets flanks a nearby hth gene and overlaps precisely with the predicted position of an hth-associated ncRNA (magenta rectangle). The schematics at right depict patterns of gRNA-DNA complementarity at each target site, relative to the TAM. f, Exemplary dCas12f-σE system from Sphingobacterium sp. DR205, in which both a single chimeric gRNA guide and two spacers from a vestigial CRISPR array target genomic sites proximal to susCD operons. The genome schematic (top) visualizes the approximate location of the rpoE-dcas12f locus and putative target sites, and the magnified insets (below) visualize the rpoE-dcas12f locus and RNA-guided DNA target sites, alongside corresponding published RNA–seq data for each locus. Coverage is shown as CPM. Three guides/spacers are indicated and labeled (circles), as well as their complementary targets (purple triangles), the predicted guide-target complementarity (purple shading), and the putative TAM (yellow shading).
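The three target-selection criteria described for panel b (perfect complementarity within a 6-nt seed, intergenic location, and position upstream of a protein-coding gene) can be illustrated with a minimal sketch. This is not the authors' pipeline. The `Candidate` record, the function names, and the convention that the seed is the first 6 nt of the guide and is compared against the non-target strand (so that complementarity to the target strand reduces to sequence identity) are all assumptions made for illustration.

```python
from dataclasses import dataclass

SEED_LEN = 6  # seed length stated in the legend for panel b


@dataclass
class Candidate:
    """Hypothetical record for one candidate genomic site (illustrative only)."""
    protospacer: str       # non-target strand, seed-proximal end first (assumption)
    intergenic: bool       # site lies within an intergenic region
    upstream_of_cds: bool  # site lies upstream of a protein-coding gene


def seed_matches(guide: str, protospacer: str, seed_len: int = SEED_LEN) -> bool:
    """Return True if the first seed_len nt of the guide match the site exactly.

    The guide is treated as RNA and converted to DNA letters (U -> T) so it can
    be compared directly with the non-target strand.
    """
    if len(guide) < seed_len or len(protospacer) < seed_len:
        return False
    guide_seed = guide[:seed_len].upper().replace("U", "T")
    return guide_seed == protospacer[:seed_len].upper()


def is_putative_target(guide: str, cand: Candidate) -> bool:
    """Apply all three criteria from panel b to one candidate site."""
    return (
        seed_matches(guide, cand.protospacer)
        and cand.intergenic
        and cand.upstream_of_cds
    )
```

Under these assumptions, a site passes only if all three conditions hold; a single seed mismatch or a location inside a coding region excludes it.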