Extended Data Fig. 9: Genome-wide analysis of inter-CTCF contacts and peak-calling analysis of MCC.

a, b, MCC profiles of the promoters at the α- and β-globin loci (Hba-a1 and Hba-a2, and Hbb-b1 and Hbb-bs, respectively), together with DNase-seq and ChIP–seq data for CTCF, RAD21, H3K27ac, NIPBL, GATA1, NF-E2 and KLF1. c, Metaplots of RAD21 and CTCF binding density (RPKM) at the two nearest CTCF-binding sites flanking erythroid- and ES-cell-specific superenhancers in erythroid (red, n = 190) and ES (green, n = 462) cells. A significant difference in RAD21 enrichment was found at CTCF sites flanking superenhancers in the different cell types. Higher levels of RAD21 were found at active ES superenhancers in ES cells compared with the same sequences in erythroid cells, in which these enhancers are inactive. Conversely, higher levels of RAD21 were found at active erythroid superenhancers in erythroid cells compared with the same sequences in ES cells. Binding density of RAD21 and CTCF at random sites (n = 500) was quantified as a control and was similar in both cell types. Box plots show RAD21 binding (RPKM) at the two CTCF-binding sites (1-kb region around the centre of each CTCF site) flanking erythroid- and ES-cell-specific superenhancers in erythroid (red) and ES (green) cells. Two-tailed Student’s t-test; box plots show the median, interquartile range and maximum points within 1.5× the interquartile range of quartile 1 or 3. d, Heat maps of 10-kb genomic regions surrounding promoters, enhancers or CTCF-binding sites showing DNase I hypersensitivity and ChIP–seq data for H3K27ac, H3K4me3, mediator (MED1), NIPBL, RAD21 and CTCF. The chromatin loader NIPBL is highly enriched at enhancers and promoters compared with CTCF sites. e, Manhattan plot showing highly significant peaks of interaction irrespective of the method of peak calling (–log10 transformations of the P values are plotted on the y axis). The data were peak-called with three orthogonal methods: MACS2, a custom Poisson-based model and a machine-learning-based model.
All of these methods calculate the enrichment over the background data, which have undergone targeted capture. f, Histogram of the percentage of peaks falling within the topologically associating domain (TAD) in erythroid cells from promoters and CTCF sites. g, Percentage of ligation junctions falling within the TAD in cis from erythroid promoters. n = 576, data are mean ± s.d. h, Analysis of the strength of contacts at promoters with different classes of element as categorized by GenoSTAN. This analysis shows that promoters broadly contact all classes of element but have a slight preference for contacting promoters and enhancers compared with CTCF sites. By contrast, CTCF sites preferentially contact other CTCF sites compared with other categories. Normalized total numbers of junctions in the 1-kb region surrounding different classes of elements within 400 kb of the viewpoint. n = 6, data are mean ± s.d., two-way ANOVA, Tukey’s multiple comparisons test.
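
For readers unfamiliar with the quantity plotted in panel e, the following is a minimal sketch of how a Poisson test of enrichment over background can be turned into a –log10 P value. It is an illustration only and is not the custom Poisson-based model used for the figure; the function names (poisson_upper_tail, neg_log10_p), the example counts and the assumption that the expected background count is already scaled to the same window and sequencing depth are all hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with lam > 0.

    The upper tail is summed directly (rather than as 1 - CDF) so that
    very small P values are not lost to floating-point cancellation.
    """
    if k <= 0:
        return 1.0
    # P(X = k), computed in log space to avoid overflow in k!
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = term
    i = k
    # Add successive terms P(X = i) until they no longer change the sum.
    while term > total * 1e-16:
        i += 1
        term *= lam / i
        total += term
    return min(total, 1.0)


def neg_log10_p(observed, expected_background):
    """-log10 of the Poisson P value for observing at least `observed`
    junctions when `expected_background` are expected (hypothetical inputs)."""
    p = poisson_upper_tail(observed, expected_background)
    # Floor the P value so that an underflow to 0 does not break the log.
    return -math.log10(max(p, 1e-300))


if __name__ == "__main__":
    # Hypothetical window: 20 junctions observed, 2 expected from background.
    print(neg_log10_p(20, 2.0))
```

Under these assumptions, a window with many more junctions than the background expectation receives a larger –log10 P value and therefore sits higher on the y axis of a Manhattan plot.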