Fig. 3: SIDISH identifies high-risk cell populations in breast cancer subtype TNBC. | Nature Communications

From: SIDISH integrates single-cell and bulk transcriptomics to identify high-risk cells and guide precision therapeutics through in silico perturbation

a UMAP clustering of major cell types in the TNBC scRNA-seq dataset, including cancer epithelial cells, cancer-associated fibroblasts (CAFs), perivascular-like (PVL) cells, normal epithelial cells, myeloid cells, T cells, B cells, plasmablasts, and endothelial cells. b UMAP visualization in which the 3789 high-risk cells identified by SIDISH are shown in red and the 38,723 background cells in gray. c Bar plot depicting the distribution of high-risk cells across cell types. Cancer epithelial cells account for the majority (65.4%) of high-risk cells, followed by CAFs (16.4%) and PVL cells (8.6%). d SIDISH was applied to HER2+ and TNBC datasets, revealing a higher prevalence of high-risk cells in TNBC. e Heatmap of differential gene expression analysis between high-risk and background cells. High-risk cells show significantly higher expression of marker genes (P = 9.25 × 10⁻⁹). The P value for the difference in expression levels between high-risk and background cell subpopulations was calculated using a one-sided Mann–Whitney U-test. f Functional enrichment analysis, including GO terms, pathways, and disease terms, highlights terms related to BRCA progression. g Kaplan–Meier survival curve for the TCGA breast cancer cohort (TCGA-BRCA). High-risk patients (pink) are clearly stratified from background patients (gray) (P = 7.44 × 10⁻¹⁹). h, i Kaplan–Meier survival curves for two independent bulk validation datasets: Caldas 2007 cohort (h) and Chin 2006 cohort (i). Both plots confirm significantly worse survival outcomes for high-risk patients (P = 2.19 × 10⁻¹⁰ and P = 1.51 × 10⁻¹², respectively). P values were calculated using the two-tailed log-rank-sum test to compare survival curves between high-risk and background patient groups in all three cohorts.
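
The survival comparisons in panels g, h, and i rest on a test between two Kaplan–Meier curves. As a rough illustration of how such a P value can be obtained, the following minimal sketch implements the standard two-group log-rank test using only the Python standard library. It assumes that the test named in the caption corresponds to the standard log-rank test; the function name `logrank_test` and all survival times below are hypothetical toy values, not data from the paper or from SIDISH.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sided log-rank test for two groups.

    times_*  : follow-up times
    events_* : 1 if the event (death) was observed, 0 if censored
    Returns (chi2, p) with chi2 referred to a chi-square with 1 d.f.
    """
    # Pool both groups, tagging each record with its group (0 = a, 1 = b).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0  # accumulated for group a
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b

        # Expected events in group a under the null hypothesis.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of d_a at this time point.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Toy example (hypothetical data): high-risk patients fail earlier.
high_risk_times = [2, 3, 5, 6, 8, 9]
high_risk_events = [1, 1, 1, 1, 0, 1]
background_times = [7, 10, 12, 14, 15, 18]
background_events = [1, 0, 1, 1, 0, 0]

chi2, p = logrank_test(high_risk_times, high_risk_events,
                       background_times, background_events)
print(f"chi2 = {chi2:.3f}, P = {p:.4f}")
```

In practice a maintained survival-analysis library would normally be used instead of a hand-written routine; the sketch only shows where the statistic and its P value come from.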
