Fig. 4: Impact of transcription on the DNA replication origin landscape.
From: A predictable conserved DNA base composition signature defines human core DNA replication origins

a Plot representing the percentage of DNA replication origins in each quantile that overlap a promoter region (±2 Kb of the TSS) of a GENCODE gene (in red). Overlaps with control regions (paler colour), which are randomly shuffled genomic regions equal in size and number to the origins, are also shown. P-values were obtained by a Chi-square Goodness-of-Fit test using observed and expected values for overlap. b As in a for overlaps with intergenic regions (>2 Kb upstream of a GENCODE gene; TSSs are excluded). c As in a for overlaps with gene bodies (the genic region within 2 Kb downstream of the TSS is excluded). d Bar plot representing the percentage of CpG island-containing gene promoters that host a DNA replication origin within ±2 Kb of their TSS. Promoters with different transcriptional activity levels in hematopoietic cells are shown (silent = 0, low = 0–15, medium = 15–60, and high = >60 RPKM). In this figure, a promoter is considered CpG island-containing (CpGi(+)) if a CpG island is present within ±2 Kb of the TSS (GENCODE v25). e Bar plot showing the average number of origins localised within 2 Kb of the TSS of genes with different transcriptional output levels (silent = 0, low = 0–15, medium = 15–60, and high = >60 RPKM) in hematopoietic cells. f Boxplots showing the average activity of origins localised within 2 Kb of the TSS of genes with different transcriptional output levels (as in d) in hematopoietic cells. P-values were obtained using the Wilcoxon test in R. g Dot plot showing the correlation between the transcriptional output of CpGi(+) promoters in hematopoietic progenitors (y-axis; RPKMs, Log2) and the activity of core origins located within ±2 Kb of the TSS of these genes in hematopoietic progenitors (x-axis; normalised SNS-seq counts, Log2). Top and bottom 5% outliers were removed. The Pearson’s correlation coefficient (r) and P-value for the correlation are indicated at the top, and the trendline is shown in blue. h As in d for CpGi(−) promoter regions. i As in e for CpGi(−) promoter regions. j As in f for CpGi(−) promoter regions. k As in g for CpGi(−) promoter regions. l Schematic summary of findings. CpGi(+) promoters (black) tend to host DNA replication origins, irrespective of their transcriptional status, while CpGi(−) promoters (grey) tend to host origins when they are transcriptionally active.
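
As a rough illustration of two operations named in this caption, the sketch below bins a gene by RPKM into the four transcriptional classes and computes a Chi-square Goodness-of-Fit statistic for origin overlap against a shuffled control. It is not the authors' code. The function names, the handling of the class boundaries at 15 and 60 RPKM, and the two-category design (overlapping versus non-overlapping, with expected counts taken from the control regions) are all assumptions made for the example.

```python
import math


def expression_class(rpkm: float) -> str:
    """Assign a transcriptional class from an RPKM value.

    Thresholds follow the caption (silent = 0, low = 0-15,
    medium = 15-60, high = >60). Placing exactly 15 in "low" and
    exactly 60 in "medium" is an assumption; the caption does not say.
    """
    if rpkm < 0:
        raise ValueError("RPKM cannot be negative")
    if rpkm == 0:
        return "silent"
    if rpkm <= 15:
        return "low"
    if rpkm <= 60:
        return "medium"
    return "high"


def overlap_chi_square(n_regions: int, observed_overlaps: int,
                       control_overlaps: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit for overlap vs. non-overlap counts.

    Assumed design: two categories (overlapping, not overlapping), with
    expected counts taken from an equal number of shuffled control
    regions. With two categories there is 1 degree of freedom, so the
    p-value is erfc(sqrt(statistic / 2)).
    """
    observed = [observed_overlaps, n_regions - observed_overlaps]
    expected = [control_overlaps, n_regions - control_overlaps]
    if min(expected) <= 0:
        raise ValueError("expected counts must be positive")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts, not data from the paper.
    stat, p = overlap_chi_square(n_regions=1000,
                                 observed_overlaps=400,
                                 control_overlaps=300)
    print(expression_class(42.0), round(stat, 3), p)
```

The counts in the usage lines are invented purely to exercise the functions; the real analysis would take them from the origin and shuffled-region overlaps in each quantile.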