Fig. 5: Upregulated PRC2+-CGI genes are linked to distal enhancers targeted by specific transcription factor binding sites (TFBSs).

a Sequence motif enrichment analysis was performed for upregulated PRC2+-CGI and PRC2−-CGI genes using either promoter or enhancer regions. The linked enhancers are from the “enhancer-to-gene links” defined by the TCGA ATAC-Seq consortium. b The number of linked enhancers per gene in both gene classes. c The number of significantly enriched TF motifs in promoter and enhancer regions. d The top 15 enriched TFs identified in promoter regions, selected by taking the most significant p-values across cancer types. e IGV plots showing the promoter region of CDC6 (a PRC2−-CGI gene) with predicted SP1 motifs and SP1 occupancy in HCT116 COAD cells (left) and A549 LUAD cells (right), as measured by ChIP-Seq from the ENCODE project. f Overlap of SP1 ChIP-Seq binding sites with PRC2+-CGI vs. PRC2−-CGI promoters. g The top 15 enriched TFs identified in enhancer regions, selected by taking the most significant p-values across cancer types. A one-sided hypergeometric test was performed in c, d and g, and enriched TFs with FPKM > 10 and unadjusted p-value < 0.01 in the corresponding cancer types were chosen. h HNF4A-binding motifs were predicted within distal enhancers of the PRC2+-CGI genes MLXIPL in COAD and EFNA2 in EAC, and were validated by HNF4A ChIP-Seq in COAD cells (Caco-2) and EAC cells (OE19). ChIP-Seq datasets were re-analyzed from GSE23436, GSE96069, E-MTAB-6858 and GSE132686. i Overlap of HNF4A ChIP-Seq binding sites with PRC2+-CGI vs. PRC2−-CGI enhancers, using the same COAD and EAC datasets as in h. j Expression differences between TCGA HNF4A-high and HNF4A-low EAC/COAD tumors for the HNF4A target genes whose enhancers overlap HNF4A binding sites in EAC or COAD cells (from panel i). HNF4A-high and HNF4A-low tumors were those in the upper and lower quintiles of HNF4A expression, respectively. The cutoff for coloring is an absolute fold-change ≥ 1.5. p-values in f and i were determined by a two-sided Fisher’s exact test. ****p < 0.0001; ***p < 0.001; **p < 0.01; *p < 0.05.
The exact p-values are shown in Supplementary Data 4.
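
For readers who want to see how the two tests named in this legend behave, the following is a minimal sketch in standard-library Python. It is not the authors' pipeline: the function names (`hypergeom_upper_tail`, `fisher_exact_two_sided`, `stars`) and all counts are hypothetical and chosen only for illustration. The Fisher's exact test uses the common convention of summing all tables that are no more probable than the observed one. The legend does not define a label for p ≥ 0.05, so the sketch returns an empty string in that case.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """One-sided enrichment p-value P(X >= k).

    N: regions in the background, K: background regions carrying the motif,
    n: regions in the test set, k: test-set regions carrying the motif.
    """
    denom = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / denom


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)

    def prob(x):
        return comb(row1, x) * comb(total - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (total - row1))
    hi = min(row1, col1)
    # Small relative tolerance so that ties with p_obs are counted.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def stars(p):
    """Map a p-value to the asterisk code used in the legend."""
    for cutoff, label in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p < cutoff:
            return label
    return ""  # assumption: no label when p >= 0.05


if __name__ == "__main__":
    # Hypothetical counts: 30 of 200 test enhancers carry a motif,
    # against 400 of 5000 background enhancers.
    p_motif = hypergeom_upper_tail(k=30, K=400, n=200, N=5000)
    # Hypothetical overlap table: rows = gene class, columns = bound / not bound.
    p_overlap = fisher_exact_two_sided(40, 60, 15, 85)
    print(p_motif, p_overlap, stars(p_overlap))
```

The tail sum is written with exact integer binomials, which keeps the sketch dependency-free; for large region counts a statistics library would be the more practical choice.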