Fig. 4: Motif co-occurrence uncovers the regulatory rules of gene expression. | Nature Communications

From: Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure

a Distribution of the signal-to-noise ratio (SNR) of expression levels across genes carrying co-occurring motifs (co-occurrence rules; see Results text and Methods). Inset: comparison of SNR for single motifs (n = 2098) and for motif co-occurrence rules (n = 116,734). b The range of gene expression levels, relative to the full range observed in the initial RNAseq data (~4 orders of magnitude, Fig. 1a), that could be retrieved on average by genes carrying either single motifs or motif co-occurrence rules. c The number of genes carrying a given motif co-occurrence rule versus the average expression level across that set of genes, at increasing levels of statistical significance from a chi-squared test (ref. 76). FDR denotes the Benjamini-Hochberg (BH) adjusted p-value; data are colored gray except at the significance cutoffs p < 0.05 (black), FDR < 0.05 (dark red) and FDR < 2.7e-6 (equivalent to Bonferroni correction, red). Inset: distribution of the number of co-occurring motifs in significant (FDR < 0.05) rules. d Distribution of motif co-occurrence rules across single or multiple cis-regulatory regions, according to the locations of the co-occurring motifs. e Distribution of gene expression levels for groups of motif co-occurrence rules that have one motif in common (i.e. unchanged). f Illustration of four genes (CDC6, RIO2, NSP1, EXG1) that carry a group of motif co-occurrence rules with a common motif in their promoter region (NHP6B transcription factor binding site; Tomtom, ref. 67, BH adj. p-value < 0.005; SGD:S000002157, ref. 104; blue line), but differ in carrying 2 to 4 other DNA motifs (red lines) across the remaining regulatory regions. These genes span a 648-fold range of expression levels. Red lines in the histogram denote the expression levels of these four genes. g Distribution of gene expression levels for groups of motif co-occurrence rules that differ by a single motif. Source data are provided as a Source data file.
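
The significance tiers used for coloring in panel c can be illustrated with a minimal Python sketch. It assumes the raw chi-squared p-values are already available as an array; the function names `bh_adjust` and `significance_tier` are hypothetical and not taken from the article. The strict cutoff of 2.7e-6 is copied from the caption and applied to the BH-adjusted values as the caption states, without assuming how it was derived.

```python
import numpy as np

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest rank.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

def significance_tier(pvalues, alpha=0.05, strict_cutoff=2.7e-6):
    """Assign each test a color tier as described for panel c.

    strict_cutoff is the value quoted in the caption (assumed here,
    not recomputed from the number of tests).
    """
    p = np.asarray(pvalues, dtype=float)
    fdr = bh_adjust(p)
    tiers = np.full(p.size, "gray", dtype=object)
    tiers[p < alpha] = "black"            # nominal p < 0.05
    tiers[fdr < alpha] = "dark red"       # FDR < 0.05
    tiers[fdr < strict_cutoff] = "red"    # FDR < 2.7e-6
    return tiers

if __name__ == "__main__":
    example = [1e-9, 0.01, 0.04, 0.5]  # made-up p-values for illustration
    print(list(significance_tier(example)))
```

Later assignments overwrite earlier ones, so each test ends up in the strictest tier it satisfies.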