Extended Data Fig. 2: Joint analysis of 5mC and 5hmC from single cells. | Nature Biotechnology


From: Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq


a, Barplot showing the enrichment of 5mCG and 5hmCG sites detected by SIMPLE-seq, WGBS and TAB-seq over different genomic regions. b, Boxplots showing the 5mC or 5hmC modification levels on different genomic regions from bisulfite sequencing, TAB-seq and SIMPLE-seq. For all boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× IQR. The minima/maxima/numbers of elements of all boxplots:
60.76%/74.60%/157,324 (Alu,(1)), 58.52%/68.61%/137,869 (Alu,(2)), 3.71%/10.71%/37,810 (Alu,(3)), 0.00%/10.91%/8,332 (Alu,(4)),
14.71%/67.84%/5,828 (CGI,(1)), 12.36%/54.72%/3,662 (CGI,(2)), 1.96%/5.87%/1,059 (CGI,(3)), 1.45%/5.24%/304 (CGI,(4)),
6.02%/72.62%/45,014 (Intron,(1)), 8.48%/79.35%/30,294 (Intron,(2)), 4.19%/5.01%/7,333 (Intron,(3)), 3.41%/4.85%/1,669 (Intron,(4)),
70.19%/75.69%/124,777 (L1,(1)), 64.68%/94.66%/103,247 (L1,(2)), 6.89%/10.32%/22,350 (L1,(3)), 4.90%/8.53%/5,414 (L1,(4)),
67.40%/75.88%/26,295 (L2,(1)), 68.26%/87.71%/18,819 (L2,(2)), 1.86%/7.30%/4,342 (L2,(3)), 2.20%/6.31%/875 (L2,(4)),
64.16%/73.22%/909 (LCP,(1)), 57.34%/79.78%/739 (LCP,(2)), 2.88%/10.32%/171 (LCP,(3)), 2.56%/7.71%/62 (LCP,(4)),
70.15%/75.10%/194,933 (LINE,(1)), 65.93%/81.40%/177,226 (LINE,(2)), 3.79%/7.49%/34,376 (LINE,(3)), 3.99%/5.61%/9,412 (LINE,(4)),
68.59%/75.74%/189,543 (LTR,(1)), 60.01%/72.15%/170,132 (LTR,(2)), 2.85%/7.75%/32,147 (LTR,(3)), 3.24%/5.59%/10,073 (LTR,(4)),
68.60%/74.58%/47,771 (MIR,(1)), 66.88%/85.86%/30,355 (MIR,(2)), 2.88%/9.75%/7,707 (MIR,(3)), 0.00%/7.62%/1,520 (MIR,(4)),
62.68%/74.58%/325,867 (SINE,(1)), 58.25%/73.48%/302,881 (SINE,(2)), 4.93%/9.37%/56,712 (SINE,(3)), 0.00%/9.52%/19,715 (SINE,(4)),
62.82%/79.57%/4,651 (H3K9me3,(1)), 48.15%/94.31%/3,514 (H3K9me3,(2)), 3.79%/12.09%/862 (H3K9me3,(3)), 0.00%/11.97%/226 (H3K9me3,(4)),
14.28%/69.99%/21,711 (DNase,(1)), 12.70%/70.84%/15,340 (DNase,(2)), 0.00%/10.11%/3,961 (DNase,(3)), 0.00%/6.10%/973 (DNase,(4)).
c, Venn diagram showing the overlap of 5mCG sites between SIMPLE-seq and TAPS. P-value, two-sided Fisher’s exact test. d, Venn diagram showing the overlap of 5hmCG sites between SIMPLE-seq and TAB-seq. P-value, two-sided Fisher’s exact test. e, Stacked barplot showing the fraction of called 5hmC sites that overlapped with a called 5mC site (grey, 5hmC-shared) and the fraction that did not (blue, 5hmC-only). f, Boxplot showing the 5hmC modification levels of 5mC-5hmC shared sites and 5hmC-only sites. For both boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× IQR. For 5mC-5hmC shared sites, minimum = 0.01, maximum = 1.00 and number of sites n = 622,323; for 5hmC-only sites, minimum = 0.01, maximum = 1.00 and number of sites n = 435,143. P-value, two-sided Fisher’s exact test. g, Barplot showing the enrichment of 5mC-5hmC shared sites and 5hmC-only sites over different genomic regions. h, Barplot showing the relative enrichment of 5hmC-only sites over 5mC-5hmC shared sites on different genomic regions. i-k, UMAP embeddings of cells based on their (i) 5mCG, (j) 5mCHG and (k) 5hmCHG levels (in 100-kb non-overlapping bins). Each dot represents a single cell and is colored according to its original identity. l, Assignment of 2i mES cells and serum mES cells to two distinct clusters by unsupervised clustering. m, Silhouette plot evaluating the degree of separation of the clusters based on 5mC or 5hmC. n, Line plots showing the cumulative coverage of 10-kb non-overlapping bins at different depths for 5mC (green) and 5hmC (blue) from different numbers of single cells. The shaded area shows the error range across 5 randomly sampled cell sets. o, Smoothed line plots showing the 5hmCG levels around genic regions of genes with different expression levels (using the smooth.spline function with parameter df = 30).
p-q, Line plots showing the relationship between promoter 5mCG and 5hmCG modification levels and gene expression levels in (p) 2i mES cells and (q) serum mES cells.
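
The boxplot convention stated for panels b and f (hinges at the 25th and 75th percentiles, median line, whiskers with maximum 2× IQR) can be illustrated with a minimal Python sketch. This is not the authors' plotting code. The function name box_stats, the example values and the choice of linear-interpolation quartiles are assumptions made for illustration, and "whiskers with maximum 2× IQR" is read here as whiskers reaching the most extreme data point that lies within 2× IQR of each hinge.

```python
from statistics import quantiles


def box_stats(values, whisker_factor=2.0):
    """Return (lower_whisker, q1, median, q3, upper_whisker).

    Hypothetical helper: hinges are the 25th/75th percentiles and each
    whisker stops at the most extreme value within whisker_factor * IQR
    of its hinge (an assumed reading of the legend).
    """
    data = sorted(values)
    # Quartiles by linear interpolation over the observed range.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence = q1 - whisker_factor * iqr
    hi_fence = q3 + whisker_factor * iqr
    # Whiskers end at real data points inside the fences.
    lower = min(v for v in data if v >= lo_fence)
    upper = max(v for v in data if v <= hi_fence)
    return lower, q1, median, q3, upper


# Made-up modification levels (%), with one high outlier.
levels = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 90.0]
print(box_stats(levels))
```

With these made-up values the upper whisker stops at 16.0 and the value 90.0 falls outside it, so under this reading the reported minima and maxima would correspond to whisker ends and not necessarily to the raw data extremes.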
