Extended Data Fig. 1: RNAcompete-measured RNA sequence specificities. | Nature Biotechnology

From: A resource of RNA-binding protein motifs across eukaryotes reveals evolutionary dynamics and gene-regulatory function

a, The center donut chart depicts the breakdown of RNAcompete-measured RBPs across major eukaryotic clades and 33 species, including both the RBPs measured for this study and those from Ray et al. (ref. 17). Surrounding donut charts depict the breakdown of RBPs by species for metazoa, fungi, plants, and other clades. The “other” category encompasses all species outside of metazoa, fungi, and land plants, including algae, excavates, amoebozoa, and SAR (Stramenopiles, Alveolates, and Rhizarians) supergroup species. Legends adjacent to the donut charts show the number of measured RBPs for individual species. b, The RNAcompete-measured RBPs were split into clusters based on RNA-binding profile similarity: sequence specificities were grouped by agglomerative hierarchical clustering with complete linkage, using the Pearson correlation coefficients (PCCs) between RNA-binding profiles. A PCC cut-off of 0.6 yielded 157 clusters (Table S1); the distribution of their sizes is displayed. c, Percentage of all 7-mers that are “specifically bound”, or recognized, by at least one RNAcompete-measured RBP, or by at least one RNAcompete-measured RRM- or KH-domain RBP (one-sided Z-test; FDR < 0.01 or < 0.001, Benjamini-Hochberg correction over the 16382 7-mers). d, e, Box plots show the distribution of RNA-binding profile PCCs for pairs of RBPs whose RNA-binding regions (RBRs) fall within the percent amino acid sequence identity (AA SID) range indicated on the x-axis. d depicts only the 308 RRM-domain-containing RBPs and e depicts only the 47 KH-domain-containing RBPs. As a control, the distribution of PCCs between RNAcompete Set A and Set B for the same experiment is displayed to the right. The number of RBP pairs (N) in each AA SID range is indicated above each box. Boxes span the interquartile range (IQR), with the center line marking the median. Whiskers extend to the minimum and maximum values within 1.5 × IQR of the box boundaries. Outliers are displayed as dots.
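The computations named in panels b and c can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a clustering distance of 1 − PCC, a cut-off that is inclusive at PCC = 0.6, and per-7-mer Z-scores that are already computed and treated as standard normal. All function names (`pearson`, `complete_linkage_clusters`, `upper_tail_p`, `benjamini_hochberg`, `specifically_bound`) are hypothetical and use only the Python standard library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def complete_linkage_clusters(profiles, pcc_cutoff=0.6):
    """Agglomerative clustering with complete linkage.
    Assumption: distance = 1 - PCC; merging stops once the closest pair of
    clusters is farther apart than 1 - pcc_cutoff."""
    n = len(profiles)
    dist = [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest
                # pairwise distance between their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > 1.0 - pcc_cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def upper_tail_p(z):
    """One-sided (upper-tail) p-value for a standard-normal Z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def specifically_bound(zscores, fdr=0.01):
    """Indices of 7-mers whose adjusted one-sided p-value is below fdr."""
    adjusted = benjamini_hochberg([upper_tail_p(z) for z in zscores])
    return [i for i, q in enumerate(adjusted) if q < fdr]
```

In this sketch, each cluster returned by `complete_linkage_clusters` contains only RBPs whose pairwise PCCs all meet the cut-off, which is what complete linkage guarantees. For panel c, `specifically_bound` would be applied once per RBP to its vector of 7-mer Z-scores, and a 7-mer counts as recognized if it appears in the result for at least one RBP.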