Fig. 4: Multiple gastrointestinal cancers develop via a metaplastic intermediate. | Nature Communications

From: Learning the cellular origins across cancers using single-cell chromatin landscapes

a Left: UMAP of epithelial cells (n = 1161; 2 samples) from human chronic pancreatitis single-cell RNA sequencing (scRNA-seq) data (ref. 68), colored by human stomach goblet cell module score. Warm colors (red, right end of the scale) indicate high module scores, whereas cold colors (blue, left end of the scale) indicate low module scores. Right: Violin plots comparing human pancreatic acinar and stomach goblet cell module scores in acinar (top) and metaplasia (bottom) cell clusters. Two-sided Mann-Whitney test p-values are displayed; *\(p < 0.0001\).

b Left: UMAP of epithelial cells (n = 13,362; 4 samples) from mouse pancreas injury model scRNA-seq data (ref. 69), colored by mouse stomach goblet cell module score. Warm colors (red, right end of the scale) indicate high module scores, whereas cold colors (blue, left end of the scale) indicate low module scores. Right: Violin plots comparing mouse pancreatic acinar and stomach goblet cell module scores in acinar (top) and metaplasia (bottom) cell clusters. Two-sided Mann-Whitney test p-values are displayed; *\(p < 0.0001\).

c Violin plots comparing mouse pancreatic acinar and stomach goblet cell module scores in epithelial cells (21 samples) per experimental condition: normal (N1), regenerating (N2), pre-malignant (K1-K4), and malignant (K5, K6). Mouse models and treatment conditions are represented on the x-axis. Two-sided Mann-Whitney test p-values are displayed; *\(p < 0.0001\). Mouse model illustration created in BioRender. Tsankov, A. (2025) https://BioRender.com/2uc0u4y.

d Violin plots comparing human colon stem and goblet cell module scores in precancerous stem-like (left) and metaplastic (right) cell clusters (55 samples). Two-sided Mann-Whitney test p-values are displayed; *\(p < 0.0001\).

e Box plots of the feature importance distribution (100 SCOOP runs) of the top 5 cell-of-origin (COO) predictions for biliary (n = 34), esophageal (n = 97), and stomach cancer (n = 68; predicted COOs are highlighted in red, similar cell subsets in pink). Each cell subset is followed by a dataset indicator for that cell subset: D1 for ref. 28, D2 for ref. 29, D3 for ref. 31, D4 for ref. 32, D5 for ref. 34, D6 for ref. 33. One-sided Mann-Whitney test p-values are displayed, with Bonferroni correction for multiple hypothesis testing in the case of stomach and esophageal adenocarcinoma.

f Left: Box plots of the feature importance distribution (100 SCOOP runs) for the most predictive scATAC-seq feature (highlighted in red, similar cell subsets in pink) for the binned mutational profile of intestinal metaplasia whole-genome sequencing (WGS) samples (ref. 74; n = 5). Also displayed is the number of times the feature appeared in the top 5 features across 100 SCOOP runs (n). Each cell subset is followed by a dataset indicator for that cell subset: D1 for ref. 28, D2 for ref. 29, D3 for ref. 31, D4 for ref. 32, D5 for ref. 34, D6 for ref. 33. One-sided Mann-Whitney test p-values are displayed, with Bonferroni correction for multiple hypothesis testing. Middle: Bar plots of the number of times cell subsets appeared as the top feature across 100 runs of SCOOP, with the most frequently appearing feature highlighted in red. Exact binomial test p-values are shown, with Bonferroni correction for multiple hypothesis testing. Right: Box plots displaying the test set variance explained (test \({R}^{2}\)) by the model runs (n) for which goblet cells (title) were the top predicted feature when 10, 5, 2, and 1 features remained following backward feature selection.

Box plot horizontal (or vertical) lines show 25th, 50th (median), and 75th percentiles, with vertical (or horizontal) whiskers extending to a maximum distance of 1.5 × interquartile range from the hinge. Data beyond the whisker ends are plotted individually.
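
The box plot convention described above can be illustrated with a minimal Python sketch. It is not the authors' plotting code. The function name `box_plot_stats`, the example values, and the choice of linear-interpolation quantiles (`method="inclusive"`) are assumptions made for illustration; the caption does not state which quantile definition was used.

```python
from statistics import quantiles


def box_plot_stats(values, whisker_factor=1.5):
    """Summarize values the way the caption describes a box plot.

    Hypothetical helper: the quantile method is an assumption,
    not something stated in the figure caption.
    """
    data = sorted(values)
    # 25th, 50th (median), and 75th percentiles (the box lines).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers may reach at most whisker_factor * IQR beyond each hinge.
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whisker ends are the most extreme points still inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Points beyond the whisker ends are plotted individually.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Made-up feature importance values, for illustration only.
example = [0.10, 0.12, 0.13, 0.15, 0.16, 0.18, 0.20, 0.60]
stats = box_plot_stats(example)
print(stats)
```

In this made-up example the value 0.60 lies more than 1.5 × interquartile range above the upper hinge, so it is reported as an individually plotted point and the upper whisker stops at 0.20.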
