Fig. 4: Whole gene regulatory structure unlocks a wider range of expression control than single regulatory regions.
From: Controlling gene expression with deep generative design of regulatory DNA

a Predicted gene expression levels with optimized generators of different single regulatory region parts or sequences spanning the whole gene regulatory structure (n = 64 per specific generator optimization target sample, maximization marked red, minimization blue). Boxes denote interquartile ranges (IQR), centers mark medians and whiskers extend to 1.5 × IQR from the quartiles. b Dynamic ranges between median (gray) and extreme values (red) in the optimized sequence samples from a. c Correlation analysis between published experimentally measured gene expression levels (defined medium) of 80 bp proximal promoter sequences (−170 to −90 relative to the TSS; ref. 15) and our predictions (n = 10,282). The red line denotes the least-squares fit. The t-test was used. d Increases (red) and decreases (blue) of predicted gene expression levels with a random subset of the 80 bp proximal promoter designs (ref. 15) when expanded and combined with all 4238 native gene regulatory structures to create 1000 bp constructs (n = 542,464). Black dots denote median levels, black lines the interquartile range and gray lines the 10th and 90th percentiles. e Correlation analysis between published estimated cell growth of 5′ UTR designs (at the optimal level of evolutionary rounds; ref. 24) and our predicted gene expression levels (n = 200). The red line denotes the least-squares fit. The t-test was used. f Increases (red) and decreases (blue) of predicted gene expression levels with a random subset of the 5′ UTR designs (ref. 24) when expanded and combined with all 4238 native gene regulatory structures to create 1000 bp constructs (n = 542,464). Black dots denote median levels, black lines the interquartile range and gray lines the 10th and 90th percentiles. Source data are provided as a Source Data file.