Fig. 1: Identification of the RAGATH-18 RNA-associated protein.
From: Discovery and structural mechanism of DNA endonucleases guided by RAGATH-18-derived RNAs

a Schematic of the computational pipeline for identifying RNA-associated proteins across all genomic and metagenomic data. We identified non-coding RNAs with predicted conserved secondary structures that are enriched near defense-associated genes, and clustered the five upstream and five downstream proteins of each non-coding RNA to explore conserved cassettes. DEGD, defense-associated gene database; HKGD, housekeeping gene database; RM, restriction modification; IGR, intergenic region. Gabija is a recently described defense system.11 Group II introns, T-box, pemK, and RAGATH-18 are examples of predicted RNAs in the vicinity of defense-associated genes. b Sources of the microbiome metagenomic data used for the analysis. c Source distribution of RAGATH-18 RNA-associated proteins. d Phylogenetic tree of RAGATH-18 RNA-associated proteins. The microbial source is shown on the outermost ring, with the human gut microbiome in brown. The cluster identifier, clade identifier, and copy number of IS607 TnpBs are shown on the rings from the innermost to the second outermost.