Extended Data Fig. 1: Bioinformatic analyses of IscB and TnpB homologs.
From: Transposon-encoded nucleases use guide RNAs to promote their selfish spread

a, Phylogenetic tree of IscB and IsrB protein homologs; IscB contains HNH and RuvC nuclease domains, whereas IsrB lacks the HNH nuclease domain. Genetic neighborhood analyses demonstrate that most homologs are encoded proximal to a predicted ωRNA (inner ring), whereas the vast majority do not reside near a predicted tyrosine-family TnpA transposase gene (outer ring). The GstIscB homolog encoded by ISGst6, which was used in this study, is indicated. Bootstrap values are indicated for major nodes. b, Schematic of a non-autonomous IS element encoding IscB and its associated ωRNA; a structural covariation model is shown in the inset (top). The red rectangle and dotted black line indicate the transposon boundaries, and the guide portion of the ωRNA is shown in blue. LE and RE, transposon left end and right end. c, Orientation bias of the ORFs nearest upstream of the indicated protein-coding gene (iscB, tnpB, or IS630 transposase), demonstrating that IS elements encoding IscB are preferentially integrated (or retained) in an orientation matching that of the upstream gene. The y-axis indicates the frequency of ORFs in the same orientation as the gene, at a distance from the gene start codon defined by the x-axis. 242 bp represents the average length of IscB-associated ωRNAs upstream of the IscB ORF. The spike at ~0 bp for TnpB corresponds to IS elements that encode adjacent/overlapping tnpA and tnpB genes. IS630 transposase genes are included as representative genes from unrelated transposable elements. d, Phylogenetic tree of TnpB homologs, with bootstrap values shown for major nodes. Genetic neighborhood analyses demonstrate that most homologs are encoded proximal to a predicted ωRNA (inner ring), whereas the vast majority do not reside near a predicted tyrosine- or serine-family TnpA transposase gene (outer rings).
Interestingly, TnpB homologs are associated with two distinct transposase families in prokaryotes: tyrosine transposases (denoted TnpA (Y)) within IS200/605-family elements, and serine transposases (denoted TnpA (S)) within IS607-family elements. GstTnpB homologs used in this study are highlighted, along with the predicted structures of their associated ωRNAs based on covariance modeling. The ωRNA encoded by ISGst2 did not show strong covariation in structure and was therefore omitted. e, Read coverage for RNA-seq data from G. stearothermophilus strain DSM 458, demonstrating expression of putative ωRNAs from each of the indicated ISGst families. ISGst5 is a PATE-like element that lacks any protein-coding ORFs. Other TnpB-associated ωRNAs are encoded within/downstream of the ORF, whereas IscB-associated ωRNAs are encoded upstream of the ORF.
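
The orientation-bias quantity plotted in panel c (frequency of nearest upstream ORFs sharing the gene's orientation, as a function of distance from the start codon) could be computed along the following lines. This is only a minimal sketch and not the authors' pipeline: the function name `orientation_bias`, the `(gene_strand, orf_strand, distance_bp)` record format, the 50-bp bin width, the 1,000-bp cutoff and the example records are all assumptions made for illustration.

```python
from collections import defaultdict


def orientation_bias(records, bin_size=50, max_distance=1000):
    """Fraction of upstream ORFs on the same strand as the gene, per distance bin.

    records: iterable of (gene_strand, orf_strand, distance_bp) tuples, where
    distance_bp is the gap between the upstream ORF and the gene start codon.
    The record format and the binning scheme are hypothetical.
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for gene_strand, orf_strand, distance in records:
        # Ignore records outside the plotted window.
        if distance < 0 or distance >= max_distance:
            continue
        bin_start = (distance // bin_size) * bin_size
        total[bin_start] += 1
        if gene_strand == orf_strand:
            same[bin_start] += 1
    # Map each bin start (bp) to the same-orientation frequency in that bin.
    return {b: same[b] / total[b] for b in sorted(total)}


# Made-up records, for illustration only (not data from the study).
example_records = [
    ("+", "+", 240),
    ("-", "-", 245),
    ("+", "+", 260),
    ("+", "-", 255),
    ("-", "+", 610),
    ("+", "+", 630),
]
example_bias = orientation_bias(example_records)
```

Under these assumptions, a value near 0.5 in a bin would indicate no orientation preference at that distance, whereas values approaching 1 would indicate that upstream ORFs tend to share the orientation of the gene.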