Fig. 3: Histological cancer subtypes are associated with different COOs.
From: Learning the cellular origins across cancers using single-cell chromatin landscapes

a Box plots of the feature importance distribution (100 SCOOP runs) of the top 5 cell-of-origin (COO) predictions (predicted COO is shown in red) amongst kidney-related cell subsets for papillary renal cell carcinoma (pRCC, n = 32), clear cell renal cell carcinoma (ccRCC, n = 111), and chromophobe renal cell carcinoma (chRCC, n = 43). Each cell subset is followed by a dataset indicator for that cell subset: D2 for ref. 29, D6 for ref. 33. Also displayed is the number of times the feature appeared in the top 5 features across the 100 runs (n). One-sided Mann-Whitney test p-values are displayed. Kidney model created in BioRender. Tsankov, A. (2025) https://BioRender.com/ht2q3vc. b UMAP of individual kidney cancer whole-genome sequencing (WGS) samples' binned mutational profiles (dots) colored by cancer subtype (chRCC, n = 43; ccRCC, n = 111; pRCC, n = 32). c UMAP dimensionality reduction of individual pancreatic cancer WGS samples' binned mutational profiles (dots) colored by cancer subtype (adenocarcinoma, n = 232; neuroendocrine, n = 47). d Left: UMAP of stomach and pancreas single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) data from ref. 29 (n = 58,175 cells; 12 samples). The thinner dashed line encompasses pancreatic cells, whereas the thicker dashed line demarcates stomach cells. Right: UMAPs displaying the Pearson correlation coefficient (r) between aggregated pancreas adenocarcinoma (PDAC, n = 232) and pancreatic neuroendocrine tumor (PNET, n = 47) mutational profiles and scATAC-seq meta-cells. For PDAC, the strongest anti-correlations (r ≈ −0.60, red/bottom end of the scale) are observed in stomach goblet and parietal and chief cells, while the weakest anti-correlations (r ≈ −0.52, blue/top end of the scale) occur in stromal cells. For PNET, the strongest anti-correlations (r ≈ −0.52) are enriched in pancreas islet endocrine cells, whereas stromal and pancreas acinar cells tend to have the weakest anti-correlations (r ≈ −0.47).
e Box plots of the feature importance distribution (100 SCOOP runs) of the top 5 COO predictions amongst pancreas- and stomach-related cell subsets (refs. 28, 29) for PDAC (n = 232) and PNET (n = 47); predicted COOs are highlighted in red and similar cell subsets in pink. Each cell subset is followed by a dataset indicator for that cell subset: D1 for ref. 28, D2 for ref. 29. One-sided Mann-Whitney test p-values are displayed, with Bonferroni correction for multiple hypothesis testing. f Accepted model of colorectal cancer (CRC) COOs agrees with SCOOP’s predictions: colon goblet cells for microsatellite-instable (MSI) CRC, and intestinal epithelial stem cells for microsatellite-stable (MSS) CRC. Intestinal model created in BioRender. Tsankov, A. (2025) https://BioRender.com/001gcxg. g Box plot of the feature importance distribution (100 SCOOP runs) of the top 5 COO predictions amongst colon-related cell subsets (refs. 28, 29, 31) for MSI CRC (n = 7; predicted COO in red). Each cell subset is followed by a dataset indicator for that cell subset: D1 for ref. 28, D2 for ref. 29, D3 for ref. 31. One-sided Mann-Whitney test p-values are displayed. Cell type abbreviations are defined in Supplementary Data 3. Box plot vertical lines show 25th, 50th (median), and 75th percentiles, with horizontal whiskers extending to a maximum distance of 1.5 × interquartile range from the hinge. Data beyond the whisker ends are plotted individually.
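
The Pearson correlations shown in panel d can be illustrated with a minimal sketch. It assumes that a tumor's mutational profile and each meta-cell's chromatin accessibility are both represented as one value per genomic bin; the function name `pearson_r`, the meta-cell labels, and all numbers below are hypothetical toy values chosen for illustration and are not taken from the study or from SCOOP itself.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: aggregated mutation counts per genomic bin ...
mutations_per_bin = [12, 9, 7, 3, 1]

# ... and accessibility per bin for two hypothetical meta-cells.
accessibility_per_bin = {
    "meta_cell_A": [1, 2, 4, 8, 10],
    "meta_cell_B": [5, 4, 6, 5, 5],
}

# One correlation per meta-cell, as would be overlaid on a UMAP.
r_by_meta_cell = {
    name: pearson_r(mutations_per_bin, accessibility)
    for name, accessibility in accessibility_per_bin.items()
}

# The meta-cell with the most negative r is the most anti-correlated one.
most_anticorrelated = min(r_by_meta_cell, key=r_by_meta_cell.get)
```

In this toy example `meta_cell_A` is the most anti-correlated meta-cell, mirroring how the strongest anti-correlations in panel d single out particular cell subsets.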