Extended Data Fig. 2: Surrounding sequence of TSSs modulates initiation efficiency. | Nature Structural & Molecular Biology

From: Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels

(A) +1 TSS efficiency of all −7 to −2 sequences within each N₋₈N₋₁N₊₁ motif in WT, shown as a heat map rank-ordered by the efficiency of the A₋₈C₋₁A₊₁ version. The x-axis is ordered by median efficiency for each N₋₈N₋₁N₊₁ motif group, as shown in Fig. 2B. Spearman’s rank correlation tests between the A₋₈C₋₁A₊₁ group and all groups are shown beneath the heat map.

(B) Efficiencies of designed +1 TSSs grouped by base identities between positions −8 and +1. Statistical analyses by Kruskal-Wallis with Dunn’s multiple comparisons test for base preference at individual positions relative to the +1 TSS are shown beneath the plots. Lines represent median values of subgroups. ****, P ≤ 0.0001; ***, P ≤ 0.001; **, P ≤ 0.01; *, P ≤ 0.05.

(C) Histogram showing the distribution of measured efficiencies for all designed −8 to +4 TSSs of all promoter variants from the ‘AYR’, ‘BYR’ and ‘ARY’ libraries in WT. The dashed line marks the 5% efficiency cutoff.

(D) A₊₂G₊₃G₊₄ motif enrichment is apparent for the top 10% most efficient designed −8 TSSs. A(/G)₊₂G(/C)₊₃G(/C)₊₄ motif enrichment was observed for the top 10% most efficient −8 TSSs but not for the next 10% most efficient TSSs. The A(/G)₊₁ enrichment observed for the top 20% most efficient TSSs is consistent with the +1 R preference of TSSs. Numbers (N) of variants assessed are shown. Sequence logos were generated using WebLogo 3. Bars represent an approximate Bayesian 95% confidence interval.

(E) An A at position −9 results in different sequence preferences at position −8. The dataset of designed +4 TSSs deriving from the ‘AYR’, ‘BYR’ and ‘ARY’ libraries was used to detect the −9/−8 interaction. All variants were divided into 16 subgroups defined by the bases at positions −9 and −8 relative to the designed +4 TSS, and their TSS efficiencies were then plotted. Lines represent median values of subgroups.

(F) An A at position −8 results in different sequence preferences at position −7. The dataset of designed +1 TSSs deriving from the ‘AYR’ and ‘BYR’ libraries was used to detect the −8/−7 interaction. Calculations were the same as for the −9/−8 interaction described in E.
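The subgrouping used for panels E and F can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names (`base_at`, `subgroup_medians`), the input layout (sequence plus percent efficiency), and the toy values are all hypothetical and chosen only for illustration. The sketch assumes the usual convention that the TSS is position +1 and that there is no position 0.

```python
from collections import defaultdict
from statistics import median

BASES = "ACGT"


def base_at(seq, tss_index, pos):
    """Return the base at `pos` relative to the TSS (TSS = +1, no position 0).

    `tss_index` is the 0-based index of the TSS base within `seq`.
    """
    if pos == 0:
        raise ValueError("position 0 does not exist in TSS-relative numbering")
    offset = pos if pos < 0 else pos - 1
    return seq[tss_index + offset]


def subgroup_medians(variants, tss_index, pos_a=-9, pos_b=-8):
    """Split variants into 16 subgroups by the bases at two positions
    and return the median efficiency of each subgroup.

    `variants` is an iterable of (sequence, efficiency) pairs.
    Subgroups with no variants map to None.
    """
    groups = defaultdict(list)
    for seq, efficiency in variants:
        key = (base_at(seq, tss_index, pos_a), base_at(seq, tss_index, pos_b))
        groups[key].append(efficiency)
    return {
        (a, b): (median(groups[(a, b)]) if (a, b) in groups else None)
        for a in BASES
        for b in BASES
    }


# Hypothetical toy data: 10-nt sequences whose last base (index 9) is the TSS,
# so position -9 is index 0 and position -8 is index 1.
toy_variants = [
    ("AACCCCCCCA", 10.0),
    ("AACGGGGGGA", 20.0),
    ("ATCCCCCCCA", 4.0),
    ("CACCCCCCCA", 6.0),
]
medians = subgroup_medians(toy_variants, tss_index=9, pos_a=-9, pos_b=-8)

for (a, b), value in medians.items():
    if value is not None:
        print(f"-9={a} -8={b}: median efficiency {value}")
```

Calling the same function with `pos_a=-8, pos_b=-7` gives the grouping described for panel F.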
