Figure 3: Binding site analysis of cancer-associated regulatory elements.

(a) Frequency of ENCODE-defined TFBSs overlapping with cancer-associated promoters (gained and lost). Values are presented as the number of TFBSs per 10 kb of coverage. TFs were sorted according to their frequency in all K4me3-defined promoter sets. EZH2, SUZ12 and ZNF217 binding sites are enriched (P<0.05, one-tailed Fisher’s exact test). The complete TF list is presented in Supplementary Fig. 17 and Supplementary Table 7. (b) TFBS frequency in cancer-associated predicted enhancer regions. (c) Overlap analysis between ESC-defined univalent (K4me3 only, K27me3 only) or bivalent (K4me3 and K27me3) regions and GC promoters (all and cancer-associated) and GC-predicted enhancers (all and cancer-associated). Cancer-associated GC promoters exhibit an elevated proportion of bivalent regions, exceeding that of univalent regions (P<2.2 × 10⁻¹⁶, one-tailed Fisher’s exact test). (d) Genome browser view of the ONECUT2 locus as a representative cancer-associated promoter overlapping with an ESC-defined bivalent region. (e) Box plot depicting changes in DNA methylation β-values in all promoters and cancer-associated promoters (gained or lost). P-values (Wilcoxon test) are: P=7 × 10⁻⁴⁸ (all promoters versus gained promoters); P=0.48 (all promoters versus lost promoters); P=5.37 × 10⁻⁴¹ (gained promoters versus lost promoters).
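
The normalization used in panels (a) and (b), the number of TFBSs per 10 kb of coverage, can be illustrated with a minimal sketch. The function name `tfbs_per_10kb` and the input numbers are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' pipeline.

```python
def tfbs_per_10kb(n_sites: int, covered_bp: int) -> float:
    """Return the number of binding sites per 10 kb of covered sequence.

    n_sites    -- count of TFBSs overlapping the region set (hypothetical input)
    covered_bp -- total base pairs covered by the region set (hypothetical input)
    """
    if covered_bp <= 0:
        raise ValueError("covered_bp must be a positive number of base pairs")
    # Scale the raw count to a common 10,000 bp window so that region sets
    # of different total size can be compared directly.
    return n_sites * 10_000 / covered_bp


# Illustrative values only: 45 sites overlapping 150 kb of promoter sequence.
example_frequency = tfbs_per_10kb(45, 150_000)
print(example_frequency)  # 3.0
```

Scaling by coverage rather than reporting raw counts keeps the gained and lost sets comparable even when they span different amounts of sequence.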