Supplementary Figure 2: Extended identification and validation of off-target binding in DIP-seq.
From: A reassessment of DNA-immunoprecipitation-based genomic profiling

(a, b) Immuno dot blot, n = 1 (a), and ELISA, n = 3 biologically independent experiments (b), of 5mC, 5hmC, 5fC and 5caC antibodies in synthetic 426 bp oligos containing the different marks. Boxplots represent median and first and third quartiles, with whiskers extending to 1.5 × the interquartile range. (c) Enrichment of IgG or Input reads over the intersection of DIP-seq 5modC (5mC+5hmC+5fC+5caC) n = 592 enriched regions or non-intersecting (5mC/5hmC/5fC/5caC) n = 259002 enriched regions. Data represented as in b. P-values calculated using two-tailed t-test. (d) Correlation matrix of enriched DIP-seq regions per Mbp of mm9. Correlation was calculated as pairwise two-tailed Pearson correlation r² for each n = 1 biologically independent experiment. (e) Venn diagram of overlapping enriched regions for 5hmC, 5mC and IgG (left). Dinucleotide frequencies for overlapping IgG+5mC+5hmC n = 23317 regions, 5mC+5hmC n = 6683 regions and mm9 n = 23317 randomly sampled regions. Data represented as in b. (f) Number of methylated CpH from WGBS data per IgG n = 137557 enriched region or 5mC n = 19091 enriched region. P-values calculated using two-tailed Mann-Whitney U-test. (g) Enrichment profile of IgG and 5hmC in DnmtTKO (left) or TetTKO (right) and WT mESCs over IgG n = 137557 enriched regions. Data shown as mean for WT and DnmtTKO n = 1 biologically independent sample (left) and mean and 95% confidence intervals for WT n = 2 and TetTKO n = 3 biologically independent samples (right). (h) DIP using a 5hmC antibody in wild-type (WT) (left) and DnmtTKO (right) mESCs for DIP-qPCR n = 3 and DIP-seq n = 1 biologically independent samples. Data shown as mean ± s.d. Correlation between mean DIP-qPCR and DIP-seq values calculated using two-tailed Spearman correlation. (i) CG content of enriched fragments for DIP and Seal profiling for 5hmC (left) and 5fC (right). Theoretical normal distribution modelled based on mean and s.d. for each mark (Norm).
P-values calculated by two-tailed Kolmogorov-Smirnov test using the mean of n = 2 biologically independent experiments. (j) Estimation of PCR duplication for sequencing libraries at a depth of 10 million reads, shown as the non-redundant fraction (i.e. the fraction of reads that are not duplicated) for n = 2 biologically independent samples. Data represented as in b.
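
The non-redundant fraction in (j) can be illustrated with a minimal sketch. It assumes the common definition (distinct alignment positions divided by total mapped reads after subsampling to a fixed depth); the legend does not state which tool or exact definition was used, and the function name, the read representation as (chromosome, start, strand) tuples and the random subsampling are hypothetical choices made for illustration.

```python
import random


def non_redundant_fraction(reads, depth=None, seed=0):
    """Fraction of reads mapping to distinct positions (illustrative only).

    reads: iterable of (chrom, start, strand) tuples, one per mapped read.
    depth: optional number of reads to subsample to (assumed here to be a
           uniform random subsample; the legend uses 10 million reads).
    """
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")
    if depth is not None and depth < len(reads):
        # Subsample without replacement so libraries are compared at equal depth.
        reads = random.Random(seed).sample(reads, depth)
    # Reads sharing chromosome, start and strand are treated as duplicates.
    return len(set(reads)) / len(reads)


example = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),  # duplicate of the first read
    ("chr1", 250, "-"),
    ("chr2", 40, "+"),
]
print(non_redundant_fraction(example))  # 3 distinct positions / 4 reads = 0.75
```

A value close to 1 indicates few PCR duplicates, whereas a low value indicates a low-complexity library at that depth.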