Extended Data Fig. 2: Technical details of library preparation, reproducibility and biases.

a, Optimal MNase 3C library digestion keeps the nucleosome tails intact whereas overdigestion leads to the loss of nucleosome linkers. This results in a failure of fragments to religate. Note that the digestion controls show that MNase cuts chromatin into mononucleosomes and that there are very few fragments over 1,000 bp. The fragment size is considerably smaller than that of DpnII-digested chromatin. b, MCC profiles of the Hba-a1 and Hba-a2 promoters in erythroid cells showing the interaction profiles derived from MNase-based 3C library preparation using a conventional NP-40-based nuclear extract, compared with data generated from intact cells permeabilized with digitonin. Data from three mice with two aliquots of cells from each mouse treated with two different concentrations of MNase are shown. Counts are normalized to the total number of reads across the genome. The assay is highly reproducible between replicates. To look for biases caused by MNase digestion, we sequenced the digestion controls. In addition, we sequenced the MNase 3C library without oligonucleotide capture to look for biases resulting from the ligation reaction. The global distributions of reads from the MNase digestion and of ligation junctions from the uncaptured library are very similar to background (bottom two panels), without obvious biases towards the hypersensitive sites. c, Violin plots from the genome-wide analysis show the number of reads in a 1-kb window around different classes of element compared with control regions 10 kb downstream of each element. The data were generated by sequencing the MNase digestion controls and the ligation junctions in the unenriched MNase 3C library. Sequencing of the digestion controls shows no evidence of biases in MNase digestion at enhancers or CTCF sites.
There is a small reduction in the number of reads at promoters, possibly due to the loss of smaller fragments from histone-depleted regions in the DNA extraction and sequencing library preparation. Conversely, sequencing of the ligation junctions reveals slightly higher numbers of junctions at promoters and regulatory elements including CTCF sites, which is probably due to the ligation process. d, Analysis of the DNA sequence at ligation junctions detected no biases towards ligation junctions in AT-rich sequences (which MNase is reported to cut preferentially). e, Metaplots of the junction count from the uncaptured MNase 3C library at DNase I hypersensitive sites show a small bias to the central 200 bp where there are more junctions, but this is partially offset because the fragment size is reduced in the hypersensitive sites. There was no correlation between the strength of the hypersensitive site and the number of junctions per kb within the site. A model of the background distribution of reads was generated to correct for this effect using a 20-bp moving window across the metaplot of the hypersensitive sites. f, Plots of single-normalized MCC data (normalized to the total number of reads across the genome from the viewpoint) compared with double-normalized data, which are additionally corrected for the small bias at hypersensitive sites. This analysis shows that double normalization for the hypersensitive site effect does not significantly change the interaction profile compared with single normalization. Peak calling with the machine-learning-based peak caller LanceOtron on both single- and double-normalized data showed that 94% of the significant peaks remain unchanged by this correction (Supplementary Table 2).
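
The two normalization steps described in e and f can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names, the per-million scaling, the edge handling of the moving window, and the choice to apply the correction by dividing by a background model rescaled to mean 1 are all assumptions made for illustration, because the legend states only that a 20-bp moving window was used and that counts were normalized to total reads.

```python
def moving_average(values, window=20):
    """Moving average over `window` positions; windows are truncated at the edges."""
    n = len(values)
    half = window // 2
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, lo + window)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def single_normalize(counts, total_reads, scale=1_000_000):
    """Scale raw counts by the total number of reads (per-million scaling is an assumption)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return [c * scale / total_reads for c in counts]


def background_model(metaplot_junctions, window=20):
    """Smooth a hypersensitive-site metaplot and rescale it to mean 1.0.

    Values above 1.0 mark positions with more junctions than average
    in the uncaptured library (hypothetical formulation).
    """
    smoothed = moving_average(metaplot_junctions, window)
    mean = sum(smoothed) / len(smoothed)
    if mean == 0:
        raise ValueError("metaplot contains no junctions")
    return [s / mean for s in smoothed]


def double_normalize(single_normalized, background):
    """Divide single-normalized signal by the background model (assumed correction)."""
    return [x / b if b > 0 else x for x, b in zip(single_normalized, background)]
```

Under these assumptions a flat background leaves the profile unchanged, and a position where the background is twice the average has its signal halved, which matches the qualitative statement that the correction has only a small effect on the interaction profile.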