Fig. 2
From: SMRT-Cappable-seq reveals complex operon variants in bacteria

Operons in E. coli. a An example of the tff-rpsB-tsf operon in RegulonDB elongated at its 3′ end with two additional genes: pyrH and frr. The x-axis represents the position (in bp) on the reference genome (NC_000913.3) and the y-axis represents individual mapped reads in ascending order of read size. For clarity, only reads from one TSS (at position 189712) are shown. The red arrow indicates the previously described Rho-independent terminator. tff encodes a putative small RNA. The product of pyrH is involved in nucleotide biosynthesis, while the products of rpsB, tsf, and frr are involved in translation. b An example of extending the predicted operon acpP-fabF [18] with two additional genes: pabC and mltG. c Genome-wide distribution of operons elongated with additional gene(s) relative to RegulonDB. An operon is defined as elongated when at least one SMRT-Cappable-seq read covers the entire known operon and extends it to include at least one additional fully covered gene. Only the longest known operons were used; sub-operons fully contained within the longest operons were excluded from this analysis. Each line represents the size of an operon (in bp), with the previously annotated operons in pink and the extended operons in blue. Positions are relative to the 5′ end of the annotated operon, with 0 being the TSS of the annotated operon. In total, 883 RegulonDB operons are extended by SMRT-Cappable-seq at the 5′ end, the 3′ end, or both. d An example of an excludon. The transcripts encoding the can gene in the sense direction (gray) overlap with the transcripts (blue) containing the entire can gene in the antisense direction. Both sets of transcripts share a bidirectional terminator (red arrow) with differential read-through: the majority (93%) of the antisense transcripts coding for the can gene terminate, while the majority (85%) of the transcripts coding for hpt read through.
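
The elongation criterion used in panel c can be illustrated with a minimal sketch. This is not the authors' pipeline; it only restates the stated definition (a read must cover the whole annotated operon and fully cover at least one gene outside it) as code. The `Interval` type, the function names, and the coordinates in the usage note are hypothetical, and the sketch assumes that reads, operons, and genes have already been restricted to the same strand.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Closed genomic interval in bp (hypothetical helper type)."""
    start: int
    end: int


def covers(outer: Interval, inner: Interval) -> bool:
    """True if `outer` spans `inner` from its first to its last base."""
    return outer.start <= inner.start and outer.end >= inner.end


def extending_genes(read: Interval, operon: Interval,
                    genes: List[Interval]) -> List[Interval]:
    """Genes fully covered by `read` that are not already inside `operon`.

    Returns an empty list if the read does not cover the entire operon.
    Assumption: all intervals are on the same strand.
    """
    if not covers(read, operon):
        return []
    return [g for g in genes if covers(read, g) and not covers(operon, g)]


def is_elongated(reads: List[Interval], operon: Interval,
                 genes: List[Interval]) -> bool:
    """True if at least one read extends the operon by one or more genes."""
    return any(extending_genes(r, operon, genes) for r in reads)
```

With made-up coordinates, an operon at `Interval(100, 500)` and a neighboring gene at `Interval(550, 700)`, a read spanning `Interval(100, 720)` would mark the operon as elongated, whereas a read spanning `Interval(100, 520)` would not, because it covers no additional gene in full.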