Fig. 9: Patient-specific high-risk cell distributions and precision therapeutic insights derived from in silico perturbation using SIDISH.

a, b Composition of high-risk cells across patients in the PDAC (24 patients) and BRCA TNBC (ten patients) scRNA-seq datasets, highlighting variability in cell type contributions. c Heatmap comparing the marker gene expression profiles of the top-ranking PDAC patient (T20) and the lowest-ranking patient (T3). d Box plots showing gene set scores for known PDAC markers in patients T20 and T3. Gene set scores were calculated for each cell in a single scoring pass. Sample sizes were T20 (N = 482 cells) and T3 (N = 1317 cells). Boxes indicate the interquartile range (IQR; 25th–75th percentile), with the line inside each box representing the median. Whiskers extend to the 5th–95th percentiles. P value was calculated using a one-sided Mann–Whitney U-test. e, f Kaplan–Meier survival curves stratifying TCGA-PDAC patients based on disease markers derived from the top-ranking patient T20 (e) and the lowest-ranking patient T3 (f). P values were calculated using the two-tailed log-rank-sum test to compare survival curves between high-risk and background patient groups. g Heatmap comparing the marker gene expression profiles between the top-ranking patient (CID3946) and the lowest-ranking patient (CID4523). h Box plots showing gene set scores for known BRCA markers in patients CID3946 and CID4523. Sample sizes were CID3946 (N = 774 cells) and CID4523 (N = 1754 cells). Boxes indicate the interquartile range (IQR; 25th–75th percentile), with the line inside each box representing the median. Whiskers extend to the 5th–95th percentiles. P value was calculated using a one-sided Mann–Whitney U-test. i, j Kaplan–Meier survival curves stratifying TCGA-BRCA patients based on disease markers derived from the top-ranking patient CID3946 (i) and the lowest-ranking patient CID4523 (j). P values were calculated using the two-tailed log-rank-sum test to compare survival curves between high-risk and background patient groups.
k Heatmap displaying single-gene perturbation scores for PDAC patients. l UMAP visualizations illustrating the effects of SPARC perturbation in PDAC for patients T3 (top), T8 (middle), and T20 (bottom). m Heatmap of single-gene perturbation scores for BRCA patients. n UMAP visualizations showing the impact of CTLA4 perturbation in BRCA for CID4495 (top), CID3946 (middle), and CID4523 (bottom). o Heatmap of combinatorial in silico perturbation scores in PDAC patients. p UMAP visualizations for PDAC patients showing effects of combinatorial perturbation of SPARC and SLC12A2 in T3 (top), T8 (middle), and T20 (bottom). q Heatmap of combinatorial perturbation scores in BRCA patients. r UMAP visualizations for BRCA patients showing combinatorial perturbation effects of CTLA4 and IL6 in CID4495 (top), CID3946 (middle), and CID4523 (bottom). The color legend in the UMAPs indicates: gray for unchanged background cells (background to background), blue for high-risk cells transitioned to background cells (high-risk to background), purple for background cells transitioned to high-risk cells (background to high-risk), and red for persistent high-risk cells (high-risk to high-risk). ST represents the perturbation score. Significance thresholds: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; n.s. not significant. Exact P values are provided in the Source Data File.
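
The four UMAP color categories described in the legend can be read as a cross of each cell's high-risk status before and after perturbation. The following is a minimal illustrative sketch of that labeling, not SIDISH's implementation: the function `label_transitions`, the dictionary `TRANSITION_COLORS`, and the toy input are hypothetical names and values introduced here, and the sketch assumes each cell carries a boolean high-risk call before and after the in silico perturbation.

```python
from collections import Counter

# Hypothetical mapping that mirrors the color legend in the caption.
TRANSITION_COLORS = {
    "background to background": "gray",
    "high-risk to background": "blue",
    "background to high-risk": "purple",
    "high-risk to high-risk": "red",
}


def label_transitions(before, after):
    """Label each cell by its high-risk status before and after perturbation.

    `before` and `after` are equal-length sequences of booleans
    (True = high-risk, False = background). This is an assumed input format.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    names = {True: "high-risk", False: "background"}
    return [f"{names[bool(b)]} to {names[bool(a)]}" for b, a in zip(before, after)]


# Toy example with four cells, one per category (made-up values).
before = [True, True, False, False]
after = [False, True, False, True]

labels = label_transitions(before, after)
colors = [TRANSITION_COLORS[label] for label in labels]
counts = Counter(labels)  # number of cells per transition category
```

Counting cells per category, as in `counts`, gives the per-patient tallies that such a UMAP summarizes visually; how these tallies relate to the perturbation score (ST) is not specified in the caption and is not modeled here.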