Extended Data Fig. 8: Single-cell analysis of chromatin dynamics in early-stage neoplasia.
From: A gene–environment-induced epigenetic program initiates tumorigenesis

a, UMAP representation of scATAC-seq profiles of mKate2+ cells isolated from Kras* and Kras* + injury tissue conditions (n = 1 mouse each) and co-embedded together, revealing chromatin heterogeneity across Kras-mutant pancreatic epithelial cells from pre-malignant tissues. Dots represent individual cells (n = 6,369) and colours indicate cluster identity based on initial PhenoGraph clustering (left). The heat map shows the degree of intersection of significantly enriched peaks (Fisher’s exact test, adjusted P < 0.05) between each pair of PhenoGraph clusters (coloured to match the UMAP plot), normalized by the total number of enriched peaks in the cluster for that row (right). Rows and columns are ordered according to their grouping into seven larger subpopulations derived from the merging of PhenoGraph clusters on the basis of the overlap of their differentially accessible peak sets (see Methods for details). b, UMAP representation of the same mKate2+ scATAC-seq profiles shown in a, coloured by major subpopulations (see Methods for details). c, Heat maps showing patterns of accessibility at subpopulation-defining peaks, shown across each of the major subpopulations defined in b separated by tissue injury (+/−) condition. Colour illustrates the proportion of all cells in each subpopulation and condition with an accessible peak, where values have been z-scored. The complete list of subpopulation-defining peaks is provided in Supplementary Table 8. d, Visualization of differential chromatin opening for the indicated peaks associated with known pancreatic cell-state-defining markers or the housekeeping gene Gapdh, illustrated by opened-peak density plots for nearby proximal or distal elements within 50 kb of the TSS. Colour scale indicates a Gaussian kernel density estimate of cells containing the open peak in the UMAP visualization, with yellow signal marking increased density of cells with open chromatin at that specific locus.
e, UMAP projection of scATAC-seq profiles of Kras-mutant (mKate2+) epithelial cells shown in a–d, coloured by the indicated tissue states. f, Correlation analysis comparing normalized accessibility signals per peak captured in scATAC-seq and bulk ATAC-seq analyses of the indicated conditions. For scATAC-seq data, the values shown are depth-normalized accessibility signals per condition, generated by pooling all individual cells (pseudo-bulk). For bulk ATAC-seq data, values from a representative sample (independent mouse) of a total of n = 3 (Kras*) or n = 6 (Kras* + injury) are shown. g, Volcano plot showing dynamic peaks identified between PDAC and normal conditions in bulk ATAC-seq analyses (Fig. 1), coloured according to their relative accessibility fold change detected between Kras* + injury and Kras* samples in scATAC-seq analyses. Peaks gained or lost in PDAC versus normal are differentially represented in scATAC-seq data from early-stage neoplasia, correlating with tissue injury status. h, UMAP projection illustrating examples of peaks exhibiting chromatin closing (left) or opening (right) within the same Kras-mutant cell cluster upon tissue injury (+), visualized by opened-peak density plots in which colour indicates a Gaussian kernel density estimate of cells containing the open peak in the UMAP visualization. i, scATAC-seq tracks of the indicated loci showing chromatin accessibility patterns across the indicated subpopulations, marked with colour labels matching b and separated by experimental condition. The first two rows (aggregate, in grey) show global patterns from pooling all cells from each condition, regardless of subpopulation identity, and population-specific dynamics are shown below. Blue- and red-coloured boxes mark ATAC gains or losses detected in aggregate populations, and dashed boxes highlight examples of peaks displaying injury-associated accessibility changes between Kras-mutant cells from the same subpopulation.
j, AP-1 and NR5A2 activity scores are anticorrelated across single-cell epigenetic profiles, separated by subpopulation. Logged activity scores are plotted as a heat map, with cells (columns) ordered by the ratio of AP-1 to NR5A2 activity within each subpopulation. k, Heat maps showing accessibility signals for the indicated cluster of peaks (columns) identified from bulk ATAC-seq analyses (see Extended Data Fig. 2a) across each major subpopulation of Kras-mutant cells, separated by experimental condition. The colour scale represents the proportion of all cells in each subpopulation and condition with an accessible peak, where values have been z-scored. As above, the first two rows (aggregate) show global accessibility patterns from pooling all individual cells in each condition, regardless of subpopulation, and subpopulation-specific dynamics are shown below. l, Proportion of mKate2+ cells per cluster (marked with colour labels matching Extended Data Fig. 8c) derived from Kras* (grey) or Kras* + injury (orange) tissue conditions.
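
Illustrative note on the colour scale in c and k: the quantity described there (proportion of cells in each subpopulation and condition with an accessible peak, then z-scored) can be sketched as below. This is a minimal sketch and not the authors' pipeline. It assumes a binarized cell-by-peak matrix and assumes the z-score is taken per peak across groups, which the legend does not specify; the function name and the group labels are hypothetical.

```python
import numpy as np

def zscored_accessibility(binary_counts, group_labels):
    """Proportion of cells with an open peak per group, z-scored per peak.

    binary_counts: (n_cells, n_peaks) array of 0/1 accessibility calls.
    group_labels:  length n_cells, one label per cell, e.g. a combined
                   subpopulation-and-condition label (hypothetical naming).
    Returns the sorted group names and a (n_groups, n_peaks) z-score matrix.
    """
    binary_counts = np.asarray(binary_counts, dtype=float)
    group_labels = np.asarray(group_labels)
    groups = sorted(set(group_labels.tolist()))

    # Fraction of cells in each group with the peak called accessible.
    proportions = np.vstack(
        [binary_counts[group_labels == g].mean(axis=0) for g in groups]
    )

    # Assumption: z-score each peak (column) across groups (rows).
    mean = proportions.mean(axis=0)
    std = proportions.std(axis=0)
    std[std == 0] = 1.0  # peaks with identical proportions get z = 0
    return groups, (proportions - mean) / std


# Toy input: 4 cells, 2 peaks, one subpopulation split by injury status.
counts = [[1, 0], [1, 0], [0, 0], [0, 0]]
labels = ["sub1_noinjury", "sub1_noinjury", "sub1_injury", "sub1_injury"]
groups, z = zscored_accessibility(counts, labels)
```

In this toy input the first peak is open only in the uninjured cells, so it receives opposite-signed z-scores in the two groups, while the second peak is closed everywhere and is set to zero.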