Fig. 3: Deep learning identifies the DNA positions predictive of expression levels. | Nature Communications

From: Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure

a Relevance profiles across cis-regulatory region sequences obtained by querying the deep models (Fig. 1f). Red lines denote the cutoff of 2 standard deviations. The decrease in relevance at the left edges of UTR sequences was due to the considerable number of sequences that were shorter than the analysed regions (see Fig. 1d). b Clustered relevance profiles across the cis-regulatory region sequences. Clusters 1 through 4 are colored dark red, light red, light blue and dark blue, respectively. Lines and shaded regions represent means and standard deviations, respectively. c GC content in the cis-regulatory regions (n = 368, 383, 834 and 625, respectively). d Median expression levels of genes in the clustered relevance profiles (n = 1385, 1187, 1206 and 460 with increasing cluster size, respectively). e Total number of regulatory DNA motifs uncovered in the cis-regulatory regions, with the proportions of motifs that overlap between adjacent regions (black) and with the JASPAR database (ref. 68) (red) highlighted. f Examples of regulatory DNA motifs uncovered across all the cis-regulatory regions that correspond to published motifs and sequence elements (see Supplementary Fig. 1c). “TFBS” denotes transcription factor binding sites. For box plots in c and d, boxes denote interquartile ranges (IQR), centres mark medians and whiskers extend to 1.5 × IQR from the quartiles. Source data are provided as a Source data file.
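
The box plot convention stated for panels c and d can be made concrete with a minimal sketch. It is not the authors' plotting code. The function name `box_plot_stats` and the toy values are hypothetical, and it assumes the common convention that each whisker ends at the most extreme data point lying within 1.5 × IQR of the nearer quartile.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Summary statistics for a box plot with 1.5 x IQR whiskers (illustrative helper)."""
    data = sorted(float(v) for v in values)
    # Quartiles by linear interpolation between data points (inclusive method)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences sit 1.5 x IQR beyond the quartiles
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Assumed convention: whiskers stop at the last data point inside the fences
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical toy values, not data from the paper
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 50])
```

With these toy values the box spans 3 to 7 around a median of 5, the whiskers end at 1 and 8, and the value 50 falls outside the upper fence.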
