Extended Data Fig. 9: Single-cell methylation in vivo and in MEEBs.
From: DNA methyltransferases 3A and 3B target specific sequences during mouse gastrulation

a, Number of single cells assayed from E7.5 embryos and from MEEBs at day 5. b, Distribution of estimated unconverted C in CHH contexts across single cells. c, Distribution of the total number of methylation calls per cell. Note that our analysis aimed at a large sample of single cells with low per-cell coverage. The number of cells was 2217 for d5_3a, 2197 for d5_3b, 2984 for d5_wt and 2146 for e7.5. The middle line indicates the median, box limits represent quartiles, and whiskers are 1.5× the interquartile range. d, e, Scatter plots show the distribution of CXCR4 and EPCAM levels in RNA (left; points are metacells from a gastrulation scRNA-seq manifold, color coded by cell type as in S7) and in FACS indices (cells sorted from E7.5 embryos). Gating on EPCAM/CXCR4 allows in silico identification of epiblast, endoderm and mesoderm sub-populations. Blood cells were shown to have low EPCAM and low CXCR4 levels and were gated separately. Mean methylation analysis allows further separation of extraembryonic cells (right, green points). f, Comparison of total read coverage over early- and late-replicating domains across single cells. Color coding represents the inferred cell cycle ordering, which is based on the early/late coverage ratio as well as the early/late methylation difference (Fig. 6, not shown in this panel). g, CpGs were grouped according to their sequence-based model score (MEEB_3b/3a) and according to their replication time (early or late). Box plots show the distribution of the computed differences in average methylation between CpGs with high (3b-favoring) and low (3a-favoring) scores, for each single cell (n = 199 for ectoderm, 356 for mesoderm and 25 for endoderm). Single cells were grouped according to their germ layer (using index sorting as in d, color coded) and according to their inferred cell cycle phase (start, mid and end). The middle line indicates the median, box limits represent quartiles, and whiskers are 1.5× the interquartile range.
Distributions are compared using two-sided Kolmogorov-Smirnov tests (**** = p < 0.0001, *** = p < 0.001, ** = p < 0.01, * = p < 0.05). h, Phased cell cycle ordering for additional batches of single-cell data from WT and mutant MEEBs. i, Cell cycle trends are computed for loci with high/low sequence preferences in early- and late-replicating domains (as in Fig. 6), but here values are normalized to the maximum level across the replication cycle in each group, to allow comparison of the relative trends. Top: the sequence model is MEEB_3b/3a (regressing the mutant difference) and methylation data are from WT MEEBs. Middle: the sequence model is MEEB_3a (regressing the WT-Dnmt3a−/− difference) and methylation data are from Dnmt3b−/− MEEBs. Bottom: the sequence model is MEEB_3b and methylation data are from Dnmt3a−/−. We note the slower re-methylation of low-affinity sites in the middle panel and the deeper reduction in methylation at high-affinity sites in the lower panel.
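
As a reading aid, two conventions in this legend can be sketched in a few lines of Python: the star coding of p-values in panel g and the per-group normalization to the maximum level in panel i. This is a minimal illustration, not the authors' analysis code. The function names, the group labels, the numeric values, and the choice to print no label for p ≥ 0.05 are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def significance_stars(p: float) -> str:
    """Map a p-value to the star code listed in the legend."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""  # assumption: no label when p >= 0.05


def normalize_to_max(trend: Sequence[float]) -> List[float]:
    """Scale one group's trend so that its maximum across the cycle equals 1."""
    peak = max(trend)
    if peak <= 0:
        raise ValueError("trend needs a positive maximum to be normalized")
    return [value / peak for value in trend]


# Hypothetical mean methylation per cell cycle bin (start to end);
# these numbers are illustrative and are not data from the paper.
trends: Dict[str, List[float]] = {
    "early_high_score": [0.80, 0.60, 0.70, 0.80],
    "late_low_score": [0.40, 0.20, 0.30, 0.40],
}

# Each group is normalized on its own, so relative trends become comparable
# even when absolute methylation levels differ between groups.
normalized = {name: normalize_to_max(values) for name, values in trends.items()}
```

After normalization every group peaks at 1, so a group that drops to half of its own maximum is shown at 0.5 regardless of its absolute methylation level.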