Comprehensive molecular characterization of high-stemness gastric cancer cells using single-cell transcriptomics, spatial mapping, and machine learning

Wang, Ziyi; Li, Xuehao; Wang, Jin; Yu, Huidong; Zhao, Defeng; Xu, Yan; Zhou, Siyu; Men, Wanfu

doi:10.1038/s41698-025-01177-0

Article
Open access
Published: 17 December 2025

npj Precision Oncology volume 9, Article number: 400 (2025)

Abstract

Gastric cancer (GC) remains a global clinical challenge due to late diagnosis, high heterogeneity, and poor prognosis. Tumor stemness has emerged as a key factor driving tumor aggressiveness and therapeutic resistance. However, the systematic characterization of high-stemness GC cells and their molecular features remains limited. We integrated single-cell RNA sequencing (scRNA-seq), spatial transcriptomics, and bulk RNA-seq data to identify and characterize high-stemness GC cells. Stemness scores were calculated using CytoTRACE, and malignant cells were classified into high stemness (top 25% CytoTRACE-scored cells, HighStem), dynamic transition stemness (DTStem), and low stemness (LowStem) subpopulations based on the quartile method cutoff. ScPagwas and cell–cell communication profiling were used to explore genomic instability, genetic susceptibility, and microenvironmental interactions. HighStem-specific co-expression modules were identified via high-dimensional WGCNA (hdWGCNA), and features were screened using six machine learning algorithms. A benchmark model was constructed for HighStem prediction and interpreted using SHAP analysis. HighStem GC cells exhibited enhanced intercellular signaling, metabolic reprogramming, and stemness-related pathway activity. Five genes—APMAP, MAPRE1, GLB1, TSPAN6, and CDKN2A—were identified as robust HighStem features. Spatial and bulk transcriptomic validation confirmed their tumor-specific expression and prognostic relevance. The Support Vector Machine (SVM) model incorporating these genes achieved high accuracy (AUC = 0.973) in distinguishing HighStem cells, demonstrating strong clinical utility at the scRNA-seq level. In addition, experimental validation through knockdown of core genes (APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1) in SGC7901 and HGC-27 gastric cancer cell lines revealed a significant reduction in JAK1-STAT3 pathway activity, supporting their functional involvement in tumor stemness regulation. 
Furthermore, knockdown of these genes increased the sensitivity of GC cells to chemotherapeutic agents like 5-FU and cisplatin, indicating their potential role in chemoresistance. This study provides a comprehensive molecular and functional characterization of high-stemness GC cells. The identified signature genes and predictive models offer novel insights into GC stemness biology and could guide personalized therapeutic strategies. Furthermore, our findings suggest that the core genes identified in this study may serve as potential biomarkers for predicting treatment outcomes and monitoring therapeutic resistance in GC.

Introduction

Gastric cancer (GC) is one of the most common malignant tumors worldwide, ranking fifth in incidence and fourth in cancer-related mortality^1,2. Despite advances in treatment, GC remains a major clinical challenge due to its low 5-year survival rate among digestive system malignancies and its high propensity for distant metastasis^3,4. Clinically, early-stage GC is often asymptomatic or presents with nonspecific gastrointestinal symptoms, leading to frequent misdiagnosis as gastritis or other benign conditions^5,6. As a result, most patients are diagnosed at an advanced stage. GC is characterized by significant heterogeneity⁷. Pathologically, it includes several subtypes such as adenocarcinoma, squamous cell carcinoma, and carcinoid tumors¹. From a molecular perspective, The Cancer Genome Atlas (TCGA) categorizes GC into four major subtypes: Epstein–Barr virus (EBV)-positive, microsatellite instability (MSI), genomically stable (GS), and chromosomal instability (CIN)^8,9. Numerous studies have demonstrated that these molecular subtypes exhibit distinct prognoses and responses to immunotherapy or targeted therapies. However, a robust and clinically applicable molecular classification system or predictive model for guiding individualized treatment remains lacking¹⁰. Therefore, it is crucial to further elucidate the molecular mechanisms underlying GC and to promote the development of reliable biomarkers and personalized therapeutic strategies, which may significantly enhance diagnostic accuracy and treatment efficacy in GC management.

Cancer stem cells (CSCs) are a distinct subpopulation of cancer cells with unlimited self-renewal and differentiation potential. They play critical roles in tumor initiation, progression, metastasis, and therapeutic resistance¹¹. In GC, beyond the classical CSCs, recent transcriptomic analyses—particularly at the single-cell level—have identified cancer cell subsets with high stemness scores^12,13. Although these cells may not fully meet the phenotypic or functional criteria of traditional gastric cancer stem cells (GCSCs), they display pronounced stemness-like features and contribute similarly to tumor aggressiveness, immune evasion, and drug resistance^14,15. As a result, high-stemness GC cells have garnered increasing research interest. Given their malignant molecular characteristics and transcriptional profiles, these cells represent a biologically and clinically relevant subpopulation. Investigating their regulatory networks, functional roles, and interactions with the tumor microenvironment (TME) may enhance our understanding of GC heterogeneity and uncover novel therapeutic targets^16,17,18.

Based on these insights, our study identified a subset of malignant GC cells with high stemness scores (HighStem) using CytoTRACE analysis of scRNA-seq data. We comprehensively characterized the biological features of these cells and explored their interactions with other cell types within the TME. Through high-dimensional WGCNA (hdWGCNA) and multiple machine learning approaches, we further identified five core marker genes of HighStem cells: APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1. Leveraging these key features, we constructed a robust and highly accurate predictive model for HighStem cell identification using a benchmark machine learning framework. Together, our findings offer novel insights into the molecular landscape of high-stemness GC cells and may provide potential targets and strategies for improving the clinical diagnosis and treatment of gastric cancer.

Results

Identification of cell populations

scRNA-seq analysis of gastric cancer samples with the Seurat pipeline revealed 38 transcriptionally distinct clusters at a clustering resolution of 0.8 (Fig. 1A). Based on canonical marker genes, these clusters were annotated into major cell types including T cells, B cells, NK cells, monocytes, macrophages, dendritic cells, epithelial cells, fibroblasts, endothelial cells, and others (Fig. 1B). Marker gene expression supported accurate cell-type classification (Fig. 1C). For example, CD3D and CD8A marked T cells, NKG7 and KLRD1 marked NK cells, CD68 and CD163 marked macrophages, EPCAM and CDH1 marked epithelial cells, while CD1C, FCER1A, and CLEC9A were enriched in dendritic cells. To visualize gene expression patterns across the UMAP space, gene density plots were generated for selected marker genes (Fig. 1D).

**Fig. 1: Identification and annotation of cell populations in GC tissue.**

Identification and characterization of HighStem malignant gastric cancer cells

Given that GC originates from epithelial cells, we first isolated epithelial subpopulations for CNV analysis (Fig. 2A). Cluster 3 exhibited significantly lower CNV scores compared to clusters 1, 2, and 4 (Fig. 2B), indicating a lack of large-scale chromosomal alterations. Therefore, cluster 3 was annotated as normal epithelial cells, while the remaining epithelial clusters were considered malignant GC cells. To evaluate stemness, we applied CytoTRACE, which estimates cellular differentiation potential based on transcriptional diversity. As shown in Fig. 2C, malignant epithelial cells exhibited a continuum of differentiation states. Based on the distribution of CytoTRACE scores, cells were stratified into three groups: LowStem, DTStem (intermediate), and HighStem, corresponding to the bottom 25%, middle 50%, and top 25% of the score range, respectively (Fig. 2D). UMAP visualization demonstrated distinct spatial separation of these subgroups (Fig. 2E), and HighStem cells displayed significantly elevated CytoTRACE scores compared to the other groups (Fig. 2F).

**Fig. 2: Identification and characterization of high-stemness malignant GC cells.**

To further explore the genetic basis underlying stemness, we performed scPagwas analysis. The tumor relevance score (TRS), representing the degree of GWAS signal enrichment at the single-cell level, was visualized across the UMAP space (Fig. 2G). HighStem cells exhibited significantly higher TRS scores (Fig. 2H), suggesting a potential genetic contribution to their stemness phenotype. Consistently, UMAP plots of stemness score, TRS score, and their combination revealed overlapping high-scoring regions (Fig. 2J). Finally, correlation analysis revealed a strong positive relationship between CytoTRACE-based stemness scores and TRS values (Fig. 2I), supporting the hypothesis that stemness in malignant GC cells may be driven by inherited genetic regulatory programs.
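
The relationship between stemness scores and TRS values described above can be illustrated with a plain correlation coefficient. The text does not state which coefficient was used, so the sketch below assumes Pearson's r; the function name `pearson_r` and the toy values are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Toy per-cell values (assumed): CytoTRACE-like stemness scores and TRS values.
stemness = [0.1, 0.3, 0.5, 0.7, 0.9]
trs = [0.2, 0.25, 0.55, 0.6, 0.95]
r = pearson_r(stemness, trs)
```

With real data, each list would hold one value per malignant cell, in the same cell order.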

HighStem cells exhibit enhanced intercellular communication, active signaling pathways, and distinct metabolic reprogramming

To explore the functional characteristics of HighStem GC cells, we analyzed intercellular communication patterns using the CellChat framework. Compared with LowStem and DTStem groups, HighStem cells exhibited markedly increased interaction frequency and strength with diverse cell types in the TME, particularly with macrophages, endothelial cells, and fibroblasts (Fig. 3A–C). Outgoing and incoming signaling analysis further revealed that HighStem cells were active hubs in multiple signaling pathways, suggesting enhanced crosstalk potential (Fig. 3B).

**Fig. 3: Cell-cell crosstalk profiling of HighStem cells.**

Further quantitative comparison of ligand–receptor interactions showed that HighStem cells engaged in significantly more communication events than LowStem cells, both as ligand providers and receptor recipients (Fig. 3D, E). Key ligand–receptor pairs enriched in HighStem cells included MIF–CD74, NAMPT–INSR, and MDK–SDC1, indicating their involvement in immune modulation, stress responses, and stemness maintenance (Fig. 3F, G).

Pathway enrichment analysis highlighted that HighStem cells were positively associated with signaling cascades such as PI3K, WNT, TGF-β, and JAK–STAT, all of which are known to promote stemness and tumor progression (Fig. 3H). Moreover, metabolic profiling revealed that HighStem cells exhibited upregulation of a wide range of metabolic pathways, including glutathione metabolism, fatty acid metabolism, steroid biosynthesis, and glycosaminoglycan biosynthesis, indicating active metabolic reprogramming that supports their aggressive phenotype (Fig. 3I).

Identification of HighStem gene co-expression modules via hdWGCNA

To uncover gene expression programs associated with HighStem cells, we performed hdWGCNA. A soft-thresholding power of 6 was selected based on scale-free topology and connectivity criteria (Fig. 4A). Hierarchical clustering identified five distinct co-expression modules, each represented by a unique color (Fig. 4B). Module eigengene (ME) analysis revealed key genes contributing to each module (Fig. 4C). The UMAP projection of module expression confirmed the distribution of the modules across malignant cells (Fig. 4D). Among them, the brown, green, yellow, and blue modules showed distinct expression patterns across cells with different stemness states.
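
For reference, soft thresholding in WGCNA-style analyses converts pairwise gene correlations into network adjacencies by raising them to a power β. The text does not state the network type, so both standard forms are given here as an illustration, with β = 6 as selected above; x_i and x_j denote the expression profiles of genes i and j.

```latex
% Unsigned network: strong negative and positive correlations are treated alike.
a_{ij} = \left| \operatorname{cor}(x_i, x_j) \right|^{\beta}, \qquad \beta = 6

% Signed network: only positive correlations yield high adjacency.
a_{ij} = \left( \frac{1 + \operatorname{cor}(x_i, x_j)}{2} \right)^{\beta}, \qquad \beta = 6
```

In either form, a larger β suppresses weak correlations more strongly, which pushes the network toward the scale-free topology used as the selection criterion.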

**Fig. 4: Identification of HighStem-associated gene co-expression modules using hdWGCNA.**

Correlation analysis between modules revealed moderate co-expression relationships, with the turquoise module showing less connectivity to others (Fig. 4E). Importantly, the brown, green and yellow modules were specifically enriched in HighStem cells, both in terms of average expression and proportion of expressing cells (Fig. 4F). The 246 genes included in these modules are shown in Supplementary Table 1.

HighStem signature gene selection

To obtain a robust HighStem gene set, we initially identified the 171 genes showing the strongest positive correlation with the HighStem phenotype from three hdWGCNA modules (brown, green, and yellow) in 3367 HighStem cells using Pearson correlation analysis (Fig. 5A, Supplementary Table 2), collectively defined as the HighStem activity gene set. Spatial transcriptomics analysis demonstrated that HighStem activity scores were predominantly enriched within tumor regions (Fig. 5B). Consistently, RNA-seq data from the TCGA cohort revealed that HighStem activity was significantly elevated in GC tissues compared to adjacent normal tissues (Fig. 5C). All GC samples were subsequently scored for HighStem activity and divided into high- and low-activity groups based on the optimal cutoff values. Kaplan–Meier survival analysis showed that patients with high HighStem activity had significantly shorter overall survival than those with low activity (Fig. 5D), highlighting its prognostic relevance.
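
The survival comparison above rests on the Kaplan–Meier estimator, which multiplies the conditional survival fractions at each observed event time. The following is a minimal sketch under stated assumptions: the function name `kaplan_meier` is hypothetical, follow-up times and event flags are toy values rather than TCGA data, and the optimal-cutoff grouping step is not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the fraction of at-risk patients surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy groups (assumed): follow-up in months, 1 = death, 0 = censored.
high_activity = kaplan_meier([5, 8, 12, 20], [1, 1, 1, 0])
low_activity = kaplan_meier([10, 25, 30, 40], [1, 0, 1, 0])
```

Comparing the two curves statistically would additionally require a log-rank test, which is omitted here.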

Six machine learning algorithms were then applied for feature screening. Boruta and random forest (RF) were first used to rank gene importance based on their contribution to classification accuracy, with Boruta evaluating feature relevance through comparison with shadow features (Fig. 5E, F, Supplementary Table 3), and decision tree (DT) analysis provided a clear hierarchical view of gene importance based on node-splitting criteria (Fig. 5G, Supplementary Table 4). LASSO regression applied L1 regularization for dimensionality reduction, screening the most valuable predictors while minimizing overfitting (Fig. 5H, Supplementary Table 5). RF analysis further evaluated the importance of each gene based on the mean decrease in accuracy, with a stable decline in model error observed as the number of decision trees increased (Fig. 5I, Supplementary Table 6). GBM, an ensemble boosting method, iteratively optimized the residuals of weak learners to improve the model’s accuracy and prioritized candidate genes based on feature importance (Fig. 5J, Supplementary Table 7). Additionally, ABESS selected the optimal gene subset by evaluating different feature combinations and choosing the one with the lowest loss function, enhancing the reliability of feature screening (Fig. 5K, Supplementary Table 8). By integrating the results of these algorithms, five consistently selected hub genes (APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1) were identified (Fig. 5L).
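
The integration step can be pictured as a set intersection over the genes retained by each algorithm. This is a sketch under stated assumptions: the function name `consensus_genes` is hypothetical, and the per-algorithm lists below are illustrative placeholders (the `GENE_*` entries are invented); the actual selections are those reported in Supplementary Tables 3–8.

```python
from functools import reduce

def consensus_genes(selections):
    """Return the genes selected by every algorithm, sorted alphabetically."""
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    return sorted(reduce(set.intersection, gene_sets))

# Hub genes named in the text, padded with placeholder genes (assumed)
# so that each algorithm's toy selection differs.
hub = ["APMAP", "CDKN2A", "TSPAN6", "MAPRE1", "GLB1"]
selections = {
    "Boruta": hub + ["GENE_A", "GENE_B"],
    "DT": hub + ["GENE_C"],
    "LASSO": hub + ["GENE_A", "GENE_D"],
    "RF": hub + ["GENE_B", "GENE_C"],
    "GBM": hub + ["GENE_D"],
    "ABESS": hub,
}
shared = consensus_genes(selections)
```

A strict intersection is only one possible integration rule; a vote threshold (for example, genes chosen by most algorithms) would be a looser alternative.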

Validation of HighStem hub genes and machine learning model performance

To investigate the spatial expression characteristics of the HighStem hub genes, we first visualized their expression density across all single cells. The results showed that APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1 were broadly expressed in the tumor tissue microenvironment, with relatively high signal intensity in specific cell populations (Fig. 6A). When focusing on malignant epithelial cells, these five genes demonstrated expression patterns that closely overlapped with the distribution of stemness scores, as indicated by CytoTRACE, suggesting that their expression was tightly associated with the high-stemness phenotype (Fig. 6B). Next, we evaluated the predictive performance of each gene in distinguishing malignant from non-malignant cells at the single-cell level. ROC curve analysis showed that all five genes had moderate to strong discriminatory power, with AUCs ranging from 0.709 to 0.832, among which APMAP (AUC = 0.832) and MAPRE1 (AUC = 0.827) showed the highest accuracy (Fig. 6C).
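
A single-gene AUC of this kind can be read as the probability that a randomly chosen positive cell shows higher expression than a randomly chosen negative cell. The sketch below computes it directly from that definition; the function name `roc_auc` and the expression values are hypothetical, and the pairwise loop is meant for illustration rather than for datasets of this size.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data (assumed): 1 = positive class, 0 = negative class,
# with normalized expression of one gene as the score.
labels = [1, 1, 1, 0, 0, 0]
expression = [2.5, 1.8, 0.9, 1.0, 0.4, 0.2]
auc = roc_auc(labels, expression)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two classes.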

**Fig. 6: Validation of HighStem hub genes and evaluation of machine learning classifier performance.**

Furthermore, to develop a reliable classifier for identifying HighStem cells, we systematically compared the performance of eight machine learning algorithms, including support vector machine (SVM), random forest (ranger), XGBoost, and decision tree (rpart), among others. Benchmark analysis showed that the SVM model achieved the highest average AUC across cross-validation folds, showing excellent and stable predictive performance (Fig. 6D, Supplementary Table 9). The SVM model also demonstrated superior precision–recall and ROC curve performance (Fig. 6E), with a final AUC of 0.973 in the independent test set (Fig. 6F). Confusion matrix analysis revealed high classification accuracy in both training and test cohorts, with balanced sensitivity and specificity (Fig. 6G). Decision curve analysis further confirmed the clinical utility of the SVM model, with the proposed classifier showing a significantly higher net benefit across a wide range of thresholds (Fig. 6H). SHAP analysis identified APMAP, MAPRE1, GLB1, TSPAN6, and CDKN2A as the top contributors to model output (Fig. 6I). Predicted probability plots demonstrated a strong positive correlation between normalized gene expression and the predicted HighStem probability (Fig. 6J). Similarly, SHAP dependency plots revealed that increased expression levels of these genes positively contributed to the classification score, further supporting their functional relevance in defining HighStem GC cells (Fig. 6K).
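
Decision curve analysis compares models by their net benefit at a chosen threshold probability, weighing true positives against false positives. The sketch below uses the standard net-benefit definition, TP/n − (FP/n) × pt/(1 − pt); the function names, labels, and predicted probabilities are hypothetical and do not come from the SVM model described above.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of calling a case positive when its probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)

def treat_all_net_benefit(labels, threshold):
    """Reference strategy: classify every case as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data (assumed): 1 = HighStem, 0 = other, with model probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
nb_model = net_benefit(labels, probs, 0.5)
nb_all = treat_all_net_benefit(labels, 0.5)
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the model against the classify-all and classify-none (net benefit 0) strategies.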

Validation of HighStem hub gene expression and prognostic significance

To further validate the HighStem hub genes, we assessed their expression and clinical relevance using bulk RNA-seq data from TCGA. All five genes—APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1—were significantly upregulated in GC tissues compared to adjacent normal tissues (Fig. 7A). These findings were further confirmed in a paired GC cohort, where consistent overexpression of APMAP, CDKN2A, MAPRE1, and GLB1 was observed, while TSPAN6 showed an upward trend that did not reach statistical significance (Fig. 7B). Time-dependent ROC analysis further supported the diagnostic potential of these genes, with APMAP demonstrating the highest AUC (0.934), followed by TSPAN6 (0.836), MAPRE1 (0.801), CDKN2A (0.750), and GLB1 (0.697) (Fig. 7C). We next explored the prognostic value of these genes. K–M survival analysis revealed that high expression of TSPAN6 (p = 0.018) and MAPRE1 (p = 0.027) was significantly associated with worse overall survival in GC patients (Fig. 7D). GLB1 and APMAP showed borderline significance (p = 0.053 and p = 0.079, respectively), while CDKN2A did not reach statistical significance. These results highlight the potential of the five hub genes as diagnostic and prognostic biomarkers for high-stemness GC cells.

**Fig. 7: Validation of HighStem hub genes in TCGA datasets.**

Five core marker genes were positively correlated with stem cell markers

To further validate the correlation among the core marker genes (APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1), we first analyzed their association with tumor stem cell-related markers, specifically components of the JAK1-STAT3 signaling pathway, using the TIMER 2.0 platform. The analysis revealed that APMAP (also known as C20orf3), GLB1, TSPAN6, and MAPRE1 were positively correlated with both JAK1 and STAT3 expression levels. In contrast, CDKN2A showed no significant correlation with either JAK1 or STAT3 (Fig. 8A).

**Fig. 8: Analysis of the correlation between core marker genes and stem cell markers.**

In addition, we explored the functional involvement of these core genes in tumor stemness. We assessed the expression of HighStem signature genes following knockdown or overexpression of the core genes in the SGC7901 and HGC-27 GC cell lines. The mRNA expression levels of HighStem signature genes, including JAK1 (a tyrosine kinase that mediates cytokine signaling), STAT3 (a transcription factor crucial for cell proliferation and stemness maintenance), Hippo (a signaling pathway that restricts organ size and regulates stem cell self-renewal), YAP1 (a key effector of the Hippo pathway involved in cell growth and survival), and WNT3A (a ligand in the Wnt signaling pathway essential for stem cell regulation and tumor progression), were downregulated upon knockdown of the core genes, whereas their expression was upregulated in cells overexpressing the core genes (Fig. 8B).

Knockdown of core genes suppressed JAK1-STAT3 pathway

The JAK-STAT3 signaling pathway plays a pivotal role in the progression of various cancers by promoting cell proliferation, survival, immune evasion, and the maintenance of cancer stem cell properties. Aberrant activation of this pathway has been closely associated with tumor development and poor clinical outcomes in multiple malignancies, including GC.

Then, we analyzed the protein expression levels of components of the JAK1-STAT3 pathway in SGC7901 and HGC-27 GC cell lines following the knockdown of the core marker genes. The results demonstrated that in both SGC7901 and HGC-27 cell lines, knockdown of the core marker genes—APMAP, CDKN2A, TSPAN6, MAPRE1, and GLB1—led to a marked reduction in the protein expression of JAK1 and STAT3 (Fig. 9A–E), suggesting that these genes may act as upstream regulators of the JAK-STAT3 axis in GC cells.

**Fig. 9: Effect of core gene knockdown on the JAK1-STAT3 pathway.**

Knockdown of core genes enhances drug sensitivity in GC cell lines

5-Fluorouracil (5-FU) and cisplatin are widely used chemotherapeutic agents in the treatment of GC and other solid tumors. 5-FU functions primarily as a pyrimidine analog that inhibits thymidylate synthase, thereby disrupting DNA synthesis and inducing apoptosis in rapidly dividing cells. Cisplatin exerts its antitumor effect by forming DNA crosslinks, which interfere with DNA replication and transcription, ultimately triggering cell death. However, resistance to these agents remains a major clinical challenge, often associated with tumor stemness and molecular alterations.

Further, we investigated the role of the core marker genes in modulating chemotherapy sensitivity. Upon treatment with 5-FU or cisplatin, knockdown of the core genes enhanced the sensitivity of both SGC7901 and HGC-27 GC cell lines to these agents. Notably, silencing of TSPAN6 and MAPRE1 led to a pronounced inhibition of cell proliferation under chemotherapeutic stress (Fig. 10A–D), suggesting that these genes may contribute to drug resistance mechanisms and could serve as potential therapeutic targets to overcome chemoresistance.

**Fig. 10: Core gene knockdown enhances drug sensitivity in gastric cancer cell lines.**

Discussion

Tumor cell stemness has attracted increasing attention in recent years. Many cancer cells acquire enhanced proliferative capacity and immune evasion ability through the activation of embryonic development-related gene programs^19,20. In 2006, the American Association for Cancer Research (AACR) defined CSCs as a subpopulation of tumor cells with self-renewal ability and the potential to generate heterogeneous tumor progeny. Experts at the time also suggested that CSCs may be inherently resistant to conventional therapies, making tumor stemness one of the emerging hallmarks of cancer²¹.

To date, extensive efforts have been made to identify markers associated with high-stemness tumor cells²². Surface markers such as CD44, CD24, CD29, CD90, and CD133 have been widely used to isolate CSCs from various cancers and cell lines^23,24. With the rapid development of transcriptome-based analytical tools such as CytoTRACE²⁵, mRNAsi (OCLR)²⁶, and StemSC²⁷, numerous potential markers have been further identified. For example, Yibo Fan et al. demonstrated that SOX9 maintains stem-like properties in advanced gastric cancer²⁸, while Xiaoli Liu et al. showed that the activation of the Hippo-YAP1 signaling pathway upregulates FOXP4 to sustain gastric cancer stemness, highlighting FOXP4 as both a biomarker and a therapeutic target²⁹. In addition, several non-coding RNAs such as lncRNA HCP5³⁰ and circSLC4A7³¹ have also been implicated in stemness regulation. Despite these advances, few studies have systematically screened for high-stemness markers in GC or comprehensively evaluated their predictive value. To address this gap, we applied CytoTRACE to quantitatively assess stemness in malignant GC cells and further identified five robust HighStem signature genes through hdWGCNA combined with multiple machine learning algorithms. A predictive model with high classification accuracy was then constructed using a benchmark machine learning framework. This study not only provides a novel strategy for identifying high-stemness GC cells but also offers new insights into addressing the current clinical challenges in GC treatment.

In our study, five HighStem hub genes were identified, namely APMAP, MAPRE1, GLB1, TSPAN6, and CDKN2A. APMAP (Adipocyte Plasma Membrane Associated Protein) encodes a protein predominantly localized to the plasma membrane and endoplasmic reticulum of adipocytes^32,33,34. Recent studies suggest that APMAP plays a potential regulatory role in promoting tumor progression, particularly through the induction of epithelial-mesenchymal transition (EMT) in certain tumor types^35,36,37. Notably, EMT is closely linked to tumor stemness. During EMT, tumor cells lose epithelial traits and acquire mesenchymal features, including enhanced migratory capacity and stem cell-like properties^38,39. Conversely, CSCs are often characterized by high expression of EMT-related signaling pathways, such as TGF-β⁴⁰, Notch⁴¹, Wnt⁴², and Hippo/YAP1⁴³. Interestingly, these pathways were also enriched in the HighStem subpopulation in our cell–cell communication analysis. Furthermore, APMAP had the highest SHAP value among all features, indicating that it made the strongest positive contribution to predicting the HighStem phenotype and supporting its potential role as a key regulator of stemness in GC.

MAPRE1 (Microtubule-Associated Protein RP/EB Family Member 1) encodes a protein that regulates microtubule dynamics and was originally identified through its interaction with the adenomatous polyposis coli (APC) gene⁴⁴. It plays a critical role in maintaining microtubule organization and chromosomal stability^45,46. The aberrant expression of MAPRE1 has been implicated in the pathogenesis of various malignancies by disrupting these essential cellular processes^47,48,49. In GC, a study by Ye Feng et al. reported CNV of the MAPRE1 gene, suggesting its potential genomic instability⁵⁰. These features may contribute to the self-renewal capacity and undifferentiated phenotype observed in HighStem GC cells.

The GLB1 gene encodes β-galactosidase, a lysosomal enzyme responsible for the degradation of specific glycolipids and glycoproteins⁵¹. Although GLB1 is widely recognized as a canonical marker of cellular senescence^52,53, its high expression in the HighStem cell population appears paradoxical. However, it is important to note that GLB1, beyond serving as a senescence marker, plays broader biological roles in intracellular metabolism, glycolipid catabolism, and lysosomal homeostasis^54,55. Interestingly, a study by Monique Bernard et al. highlighted that both cancer stem cells and senescent cells exhibit enhanced stress resistance, suggesting that elevated GLB1 expression may reflect a quiescent or dormant cellular state⁵⁶. In this context, GLB1 may help HighStem cells reduce proliferation and metabolic burden under adverse conditions, thereby facilitating long-term survival and potentially contributing to tumor recurrence or metastasis at later stages. Moreover, emerging evidence indicates that certain cells in EMT or high-stemness states can concurrently activate senescence-associated signaling pathways, such as p16 and p21^57,58, implying that HighStem cells are not constantly in a proliferative state. Instead, they may undergo dynamic transitions between dormancy and reactivation in response to tumor microenvironmental cues. This regulated balance between quiescence and activation could represent a self-protective mechanism by which tumor cells maintain homeostasis and stemness potential.

Similarly, CDKN2A (Cyclin-Dependent Kinase Inhibitor 2A) is a well-established tumor suppressor gene involved in the regulation of the cell cycle, induction of cellular senescence, and inhibition of tumorigenesis^59,60,61. While the CDKN2A gene can exert antiproliferative effects when properly transcribed and translated, numerous studies have shown that in many tumor types, it is frequently subject to genetic alterations such as mutations, deletions, or epigenetic modifications—particularly promoter hypermethylation—which result in reduced expression or complete functional loss^62,63,64. This downregulation contributes to uncontrolled cell proliferation and tumor progression. In GC, CDKN2A promoter hypermethylation has been associated with the loss of p16^INK4a protein expression^65,66, enabling GC cells to escape cell cycle control and acquire a highly proliferative phenotype. Moreover, copy number loss of CDKN2A has been identified as a potential biomarker for predicting hematogenous metastasis in GC patients⁶⁷. These findings align, to a certain extent, with the biological features of high-stemness GC cells, which are characterized by enhanced proliferative capacity and metastatic potential. It is also important to note that, although our study observed elevated CDKN2A mRNA levels in HighStem cells, this may not directly reflect functional protein expression due to possible post-transcriptional regulatory mechanisms such as RNA methylation, miRNA-mediated repression, or impaired translation. Therefore, when interpreting the role of CDKN2A in the context of tumor stemness, both transcriptomic and proteomic levels should be considered.

TSPAN6 (tetraspanin 6) is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Current studies have reported conflicting roles of TSPAN6 across different cancer types^68,69. For instance, in colorectal cancer, TSPAN6 is considered to function as a tumor suppressor, and its downregulation has been associated with tumor progression⁷⁰. In contrast, in glioblastoma, high TSPAN6 expression has been linked to malignant progression and poor patient prognosis, suggesting a context-dependent role in tumor biology⁷¹. Although TSPAN6 has been relatively understudied in GC, other members of the tetraspanin family, such as TSPAN8⁷², TSPAN1⁷³, CD151⁷⁴, and TSPAN4⁷⁵, have been shown to be highly expressed in GC tissues and are closely associated with enhanced proliferation, migration, and invasiveness of GC cells. Given the functional similarity within the tetraspanin family, it is plausible that TSPAN6 may play a comparable role in GC. However, further experimental validation is required to clarify its function and clinical relevance in GC.

The JAK-STAT3 signaling pathway has long been recognized as a central mediator in cancer stemness, tumor cell proliferation, and anti-apoptosis processes^76,77,78. Our data indicate that APMAP, TSPAN6, MAPRE1, and GLB1 are positively correlated with the expression levels of JAK1 and STAT3, suggesting that these genes may promote GC stemness by modulating the JAK-STAT3 axis. Notably, although CDKN2A did not show a direct correlation with either JAK1 or STAT3, its known role in cell cycle regulation may contribute indirectly to the modulation of cell differentiation and proliferation. Thus, these core genes likely interact in a complex molecular network to regulate tumor stemness and promote cancer progression. Further experimental validation demonstrated that knockdown of these core genes resulted in a significant reduction in JAK1 and STAT3 protein expression, underscoring their role as upstream regulators in the JAK-STAT3 pathway. This finding not only enhances our understanding of the involvement of these genes in maintaining tumor stemness but also suggests their potential as therapeutic targets, especially in strategies targeting cancer stemness and overcoming therapeutic resistance. In terms of chemoresistance, knockdown of these genes notably increased the sensitivity of GC cells to common chemotherapy agents such as 5-FU and cisplatin. This suggests that these genes play a crucial role not only in maintaining tumor stemness but also in mediating drug resistance mechanisms. Specifically, the silencing of TSPAN6 and MAPRE1 led to a pronounced inhibition of cell proliferation under chemotherapy stress, further supporting their potential as targets for overcoming chemoresistance.

In summary, the five identified HighStem hub genes may contribute to maintaining stemness in gastric cancer through diverse yet complementary mechanisms. These findings provide a foundation for future studies aimed at validating their roles and exploring their potential as biomarkers or therapeutic targets in GC.

Methods

Data collection and processing

scRNA-seq datasets (GSE183904⁷, GSE206785⁷⁹) and stRNA-seq data (GSE251950⁸⁰) were obtained from GEO. Bulk RNA-seq data were retrieved from TCGA, and GWAS summary statistics for scPagwas analysis were downloaded from the IEU OpenGWAS database. Dataset details are provided in Supplementary Table 10. Cells with >20% mitochondrial content or <200 detected genes were excluded. Genes expressed in ≥3 cells and within 200–7000 counts were retained, yielding 269,213 high-quality cells from 88 samples for analysis. Data were processed using Seurat, including normalization, PCA, UMAP, and clustering (resolution = 0.8), with batch correction via Harmony. Cell types were annotated using known markers. For stRNA-seq, SCTransform normalization and unsupervised clustering defined spatial domains, supported by H&E staining and marker gene expression. Spatial patterns were visualized with “SpatialDimPlot” and “SpatialFeaturePlot”.
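The cell-level exclusion rule stated above can be illustrated with a minimal Python sketch. The study itself used Seurat in R; the `CellQC` record, its field names, and the toy cells below are hypothetical and serve only to show how the two thresholds (>20% mitochondrial content, <200 detected genes) act on individual cells.

```python
from dataclasses import dataclass

MAX_PCT_MITO = 20.0  # cells with more than 20% mitochondrial content are excluded
MIN_GENES = 200      # cells with fewer than 200 detected genes are excluded


@dataclass
class CellQC:
    """Hypothetical per-cell QC summary (not a Seurat object)."""
    barcode: str
    n_genes: int     # number of genes detected in the cell
    pct_mito: float  # percentage of counts from mitochondrial genes


def passes_qc(cell: CellQC) -> bool:
    """Return True unless the cell meets either exclusion criterion."""
    return cell.pct_mito <= MAX_PCT_MITO and cell.n_genes >= MIN_GENES


def filter_cells(cells):
    """Keep only the cells that pass both thresholds."""
    return [c for c in cells if passes_qc(c)]


# Toy cells invented for illustration
cells = [
    CellQC("cell_a", n_genes=1500, pct_mito=5.0),
    CellQC("cell_b", n_genes=150, pct_mito=3.0),    # too few detected genes
    CellQC("cell_c", n_genes=3000, pct_mito=25.0),  # too much mitochondrial signal
    CellQC("cell_d", n_genes=200, pct_mito=20.0),   # exactly on both thresholds, kept
]
kept = [c.barcode for c in filter_cells(cells)]
print(kept)
```

Cells lying exactly on a threshold are retained here because the text excludes only values strictly above 20% or strictly below 200.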

Inference of copy number variations

Copy number variations (CNVs) were inferred from scRNA-seq data using the inferCNV R package. To reduce technical variability, normalization procedures were applied, and malignant cells were analyzed relative to normal reference cells to detect regions with abnormal expression indicative of genomic instability^81,82. A CNV score was then calculated to quantify the extent of deviation in each cell from the reference baseline. Malignant cells were extracted according to CNV-based classification, yielding 13,483 tumor cells for downstream analysis.

Stemness scoring of malignant gastric cancer cells

To evaluate cellular stemness, we applied the CytoTRACE algorithm to malignant GC cells identified through CNV analysis. CytoTRACE is a computational framework that infers the differentiation state of individual cells based on transcriptional diversity, under the premise that less differentiated cells express a broader array of genes. Unlike traditional stemness assessments, CytoTRACE does not depend on predefined gene sets or prior biological assumptions, making it broadly applicable across diverse cell types and tissues^25,83. Based on the quartile distribution of CytoTRACE scores, cells were stratified into three groups: high stemness (top 25%), dynamic transition stemness (25–75%), and low stemness (bottom 25%)⁸⁴.
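The quartile-based stratification can be sketched as follows. This is an illustration rather than the authors' code: it assumes the CytoTRACE scores are available as a plain list of numbers, uses the standard-library `statistics.quantiles` with the inclusive method, and assigns cells lying exactly on a quartile boundary to the DTStem group, a detail the text does not specify.

```python
import statistics


def stratify_by_quartiles(scores):
    """Label each score HighStem (> Q3), LowStem (< Q1), or DTStem (otherwise)."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score > q3:
            labels.append("HighStem")
        elif score < q1:
            labels.append("LowStem")
        else:
            labels.append("DTStem")
    return labels


# Toy stemness scores in [0, 1], invented for illustration
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_labels = stratify_by_quartiles(toy_scores)
print(list(zip(toy_scores, toy_labels)))
```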

scPagwas analysis

To explore the genetic underpinnings of cellular stemness and tumor progression, we employed scPagwas, an integrative computational framework implemented in the “scPagwas” R package. This method enables the integration of scRNA-seq data with GWAS summary statistics to uncover trait-associated genetic variants that potentially influence cell fate decisions at single-cell resolution. In this study, we focused on mapping genes associated with stemness scores to GWAS summary data derived from large-scale population studies⁸⁵. By linking cell-type-specific gene expression patterns to genomic loci associated with cancer-related traits, scPagwas allowed us to identify candidate genetic variants that may drive intercellular heterogeneity in stemness and contribute to tumor evolution.

High-dimensional WGCNA (hdWGCNA) analysis

As CytoTRACE provides a global stemness score for individual cells without pinpointing gene-level expression patterns across subpopulations, we employed high-dimensional weighted gene co-expression network analysis (hdWGCNA) to further elucidate the transcriptional characteristics of malignant GC cells. A weighted co-expression network was constructed by calculating pairwise gene expression correlations, and genes were clustered into distinct co-expression modules^86,87. To identify modules associated with tumor progression, we performed module–trait relationship analysis, focusing on stemness and metastatic phenotypes. Modules showing strong correlations with high-stemness or metastatic features were considered functionally relevant. Within these key modules, hub genes were defined based on high intra-module connectivity, representing potential core regulators that may drive tumor heterogeneity and malignant progression in GC.
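To make the notion of intra-module connectivity concrete, the sketch below computes a WGCNA-style weighted adjacency, the absolute Pearson correlation raised to a soft-threshold power, and sums it per gene. It is a deliberate simplification: hdWGCNA additionally aggregates cells into metacells and works with a topological overlap matrix, both omitted here. The power of 6, the gene labels, and the expression values are assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def intramodular_connectivity(expr, power=6):
    """Sum of |correlation|**power between each gene and every other gene.

    expr maps a gene label to its expression values across cells.
    """
    genes = list(expr)
    connectivity = {g: 0.0 for g in genes}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            weight = abs(pearson(expr[gene_i], expr[gene_j])) ** power
            connectivity[gene_i] += weight
            connectivity[gene_j] += weight
    return connectivity


# Toy module: three tightly co-expressed genes and one weakly related gene
toy_expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "gene_c": [1.1, 2.0, 2.9, 4.2, 5.0],
    "gene_d": [5.0, 1.0, 4.0, 2.0, 3.0],
}
k = intramodular_connectivity(toy_expr)
hub = max(k, key=k.get)
print(k, hub)
```

Raising the correlation to a power suppresses weak correlations, so the weakly related gene receives a connectivity near zero while the co-expressed genes rank as hub candidates.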

Cell–cell interaction analysis

To investigate intercellular communication within the tumor microenvironment, we utilized the CellChat R package, which infers ligand–receptor interactions based on scRNA-seq data⁸⁸. Communication networks were constructed to delineate signaling exchanges among annotated cell populations. Visualization of interaction strength and frequency between specific cell types was performed using the netVisual_circle function, which provides a circular plot representing outgoing and incoming signaling patterns. To further dissect individual signaling pathways, the netVisual_bubble function was applied, generating bubble plots that highlight key ligand–receptor pairs and their associated signaling axes.

Screening of HighStem (top 25% CytoTRACE-scored cells) signature genes

To identify key signature genes of the HighStem subpopulation, we employed an integrated machine learning approach combining six algorithms: random forest, LASSO, Boruta, decision tree (DT), Adaptive Best Subset Selection (ABESS), and Gradient Boosting Machine (GBM). These methods were chosen for their complementary strengths in feature selection and bias reduction.

Random forest⁸⁹ and Boruta⁹⁰ (a random forest-based wrapper algorithm) were used to rank gene importance. LASSO regression⁹¹ applied L1 regularization to eliminate redundant features, while decision tree⁹² analysis provided interpretable hierarchical classification. ABESS⁹³, a recently developed algorithm that performs optimal subset selection with theoretical guarantees, was introduced to further refine the gene set while avoiding overfitting or selection bias. Meanwhile, the GBM algorithm⁹⁴, an ensemble learning method that builds additive models in a forward stage-wise fashion using decision trees, was applied to capture complex nonlinear relationships between genes and stemness phenotypes. Genes identified by all six methods were considered hub HighStem markers, and their intersection was visualized using a Venn diagram.
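The consensus step, keeping only genes selected by every method, amounts to a set intersection. The sketch below assumes each method's output has already been reduced to a list of gene labels; the labels G1 to G6 and the selections are placeholders, not results from the study.

```python
from functools import reduce


def consensus_features(selections):
    """Return the sorted genes that appear in every method's selection.

    selections maps a method name to an iterable of selected gene labels.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    return sorted(reduce(set.intersection, gene_sets))


# Placeholder selections, invented for illustration
toy_selections = {
    "random_forest": ["G1", "G2", "G3", "G4"],
    "lasso": ["G1", "G2", "G5"],
    "boruta": ["G1", "G2", "G3"],
    "decision_tree": ["G2", "G1", "G6"],
    "abess": ["G1", "G2"],
    "gbm": ["G1", "G2", "G4", "G5"],
}
hub_genes = consensus_features(toy_selections)
print(hub_genes)
```

Requiring agreement across all methods is a conservative choice: a gene is dropped as soon as one algorithm fails to select it.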

Machine learning benchmark models for HighStem features

To identify the most effective predictive model for HighStem features at the single-cell level, we benchmarked nine machine learning algorithms using the “mlr3” R package. The models included k-nearest neighbor (KNN), linear discriminant analysis (LDA), naive Bayes (NB), random forest (Ranger), recursive partitioning and regression trees (RPART), support vector machine (SVM), and extreme gradient boosting (XGBoost). Cells classified as HighStem were used as positive samples, and LowStem cells served as controls. The dataset was randomly split into training (80%) and test (20%) sets. Hyperparameter tuning was conducted using five-fold internal cross-validation (CV), and model generalization was evaluated via ten-fold external CV. The model with the highest average area under the curve (AUC) was selected as the optimal framework for HighStem signature prediction.
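Two ingredients of this benchmark, partitioning samples into cross-validation folds and scoring a classifier by AUC, can be sketched in plain Python. The study used the mlr3 R package; the functions below are simplified stand-ins written for illustration, with HighStem cells coded as 1 and LowStem cells as 0, and the labels and scores are invented.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predictions: 1 = HighStem, 0 = LowStem
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(toy_labels, toy_scores)
folds = k_fold_indices(n_samples=20, k=10)
print(round(auc, 3), [len(f) for f in folds])
```

In a full benchmark, each fold would serve once as the held-out set and the per-fold AUC values would be averaged for each candidate model.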

SHAP analysis for feature interpretation

To interpret the contributions of individual genes to the HighStem prediction model, we applied SHAP (SHapley Additive exPlanations) analysis⁹⁵. SHAP is a model-agnostic method that quantifies the impact of each feature on the model’s output, based on cooperative game theory. This approach enables both global and local interpretation of feature importance. We computed SHAP values for all samples to evaluate the contribution of each gene to model predictions. The mean absolute SHAP value (mean_phi) for each gene was used as an indicator of its overall importance, with higher values reflecting greater average influence on classification outcomes. Genes with the highest mean SHAP values were considered the most influential predictors and were selected for further biological interpretation.
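The global importance measure described above, the mean absolute SHAP value per gene, can be computed directly from a matrix of per-sample SHAP values. The sketch assumes that matrix has already been produced by a SHAP implementation; the feature labels and numbers are toy values, not results from the study.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Average the absolute SHAP value of each feature over all samples.

    shap_rows is a list of per-sample lists, one SHAP value per feature.
    """
    n_samples = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return {name: total / n_samples for name, total in zip(feature_names, totals)}


# Toy SHAP matrix: 4 samples x 3 features, invented for illustration
toy_features = ["gene_1", "gene_2", "gene_3"]
toy_shap = [
    [0.5, -0.25, 0.0],
    [-0.5, 0.25, 0.0],
    [0.25, -0.25, 0.125],
    [-0.75, 0.25, -0.125],
]
importance = mean_abs_shap(toy_shap, toy_features)
ranking = sorted(importance, key=importance.get, reverse=True)
print(importance, ranking)
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, so a gene that pushes predictions strongly in both directions still ranks as influential.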

Cell culture

The SGC7901 and HGC-27 GC cell lines, originally from the American Type Culture Collection (Manassas, VA), were stored by our laboratory and grown in RPMI 1640 supplemented with 2 mM glutamine, 10 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid, pH 7.4, and 10% fetal bovine serum at 37 °C in a 5% CO₂ humidified incubator.

Transfection with siRNA

Core gene expression was silenced using siRNA transfection mediated by Lipofectamine 2000 (Invitrogen). Cells were seeded in 6-well plates at 50–70% confluency and transfected the following day. For each well, 5 μL of core gene-targeting siRNA (final concentration 50 nM) was diluted in 125 μL of Opti-MEM (Gibco), and separately, 5 μL of Lipofectamine 2000 was diluted in another 125 μL of Opti-MEM. After a 5-min incubation at room temperature, the two solutions were combined and incubated for 20 min to form siRNA-lipid complexes. The complexes were then added to the cells in antibiotic-free complete medium. After 4–6 h, the medium was replaced with fresh complete medium. Cells were harvested 24–48 h post-transfection for RNA and protein extraction to evaluate gene silencing efficiency. The siRNA sequences are listed in Supplementary Table 11.

Western blot

Cells were transiently transfected or treated with 5-Fluorouracil (5-FU) and cisplatin. The cells were lysed in RIPA buffer. The lysates were subjected to immunoprecipitation with anti-Flag antibody. The lysates and immunoprecipitates were subjected to 12.5% SDS-PAGE, transferred onto PVDF membranes, and probed with the antibodies described. The following antibodies were used: JAK1 (1:1000, Santa Cruz) and STAT3 (1:1000, Santa Cruz). Secondary HRP-conjugated antibodies were obtained from GE Healthcare Life Sciences, and the light emission was quantified with a Lumino image analyzer LAS-1000 (FUJI, Japan). Signal quantification was performed using ImageJ software.

Realtime PCR

Total RNA was extracted from cells by using Trizol reagent (Life Technologies). Taqman probes were used for the detection of miRs (Applied Biosystems), as described by the manufacturer, by using beta-actin as endogenous control. For mRNA-level analysis, cDNA was generated by using reverse transcriptase SuperScript II and poly dT primers (Invitrogen). Realtime PCR was performed by using SYBR Green Master Mix (Invitrogen) and the ABI 7900HT fast real-time PCR System (Applied Biosystems). S18 was used as endogenous control. Primers are described in Supplementary Table 12. β-actin was used as endogenous control.

Statistical analysis

All data processing, statistical analyses, and visualizations were performed using R software (version 4.1.3). Group comparisons for continuous variables were conducted using either the Wilcoxon rank-sum test or Student’s t test, depending on the distribution. Categorical variables were compared using the chi-squared test or Fisher’s exact test, as appropriate. Multiple testing correction was applied using the False Discovery Rate (FDR) method. Pearson correlation analysis was used to evaluate associations between continuous variables. All statistical tests were two-sided, and a p-value < 0.05 was considered statistically significant.
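The multiple-testing correction can be illustrated with the Benjamini-Hochberg procedure, which is the method R's `p.adjust` applies under the name "fdr". Whether the authors used exactly this procedure is an assumption; the sketch below is a plain-Python illustration with invented p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, invented for illustration
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_p)
significant = [p < 0.05 for p in toy_adjusted]
print(toy_adjusted, significant)
```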

Data availability

The data that support the findings of this study are available in the manuscript and supplementary materials. Further inquiries can be directed to the corresponding authors.

References

Smyth, E. C. et al. Gastric cancer. Lancet 396, 635–648 (2020).
Chia, N. Y. & Tan, P. Molecular classification of gastric cancer. Ann. Oncol. 27, 763–769 (2016).
Joshi, S. S. & Badgwell, B. D. Current treatment and recent progress in gastric cancer. CA Cancer J. Clin. 71, 264–279 (2021).
Guan, W. L., He, Y. & Xu, R. H. Gastric cancer treatment: recent progress and future perspectives. J. Hematol. Oncol. 16, 57 (2023).
Machlowska, J. et al. Gastric Cancer: Epidemiology, Risk Factors, Classification, Genomic Characteristics and Treatment Strategies. Int. J. Mol. Sci. 21, 4012 (2020).
Yang, W. J. et al. Updates on global epidemiology, risk and prognostic factors of gastric cancer. World J. Gastroenterol. 29, 2452–2468 (2023).
Kumar, V. et al. Single-Cell Atlas of Lineage States, Tumor Microenvironment, and Subtype-Specific Expression Programs in Gastric Cancer. Cancer Discov. 12, 670–691 (2022).
Gullo, I. et al. Heterogeneity in Gastric Cancer: From Pure Morphology to Molecular Classifications. Pathobiology 85, 50–63 (2018).
Lim, B. et al. Genomic and epigenomic heterogeneity in molecular subtypes of gastric cancer. World J. Gastroenterol. 22, 1190–1201 (2016).
Wang, F. H. et al. The Chinese Society of Clinical Oncology (CSCO): Clinical guidelines for the diagnosis and treatment of gastric cancer, 2023. Cancer Commun. 44, 127–172 (2024).
Loh, J. J. & Ma, S. Hallmarks of cancer stemness. Cell Stem Cell 31, 617–639 (2024).
Pan, G. et al. EMT-associated microRNAs and their roles in cancer stemness and drug resistance. Cancer Commun. 41, 199–217 (2021).
Kim, J. et al. Single-cell analysis of gastric pre-cancerous and cancer lesions reveals cell lineage diversity and intratumoral heterogeneity. NPJ Precis. Oncol. 6, 9 (2022).
Rao, X. et al. Targeting Gastric Cancer Stem Cells to Enhance Treatment Response. Cells 11, 2828 (2022).
Chang, Z. et al. THBS2 promotes gastric cancer progression and stemness via the Notch signaling pathway. Am. J. Cancer Res. 14, 3433–3450 (2024).
Fan, X. et al. YTHDF2-mediated m(6)A modification of ONECUT2 promotes stemness and oxaliplatin resistance in gastric cancer through transcriptionally activating TFPI. Drug Resist Updat 79, 101200 (2025).
Liu, T. et al. FAP(+) gastric cancer mesenchymal stromal cells via paracrining INHBA and remodeling ECM promote tumor progression. Int. Immunopharmacol. 144, 113697 (2025).
He, C. et al. Stem cell landscape aids in tumor microenvironment identification and selection of therapeutic agents in gastric cancer. Cell Signal 113, 110965 (2024).
Marimuthu, S. et al. Mucins reprogram stemness, metabolism and promote chemoresistance during cancer progression. Cancer Metastasis Rev. 40, 575–588 (2021).
Zeng, L. et al. New insights into KLFs and SOXs in cancer pathogenesis, stemness, and therapy. Semin Cancer Biol. 90, 29–44 (2023).
Clarke, M. F. et al. Cancer stem cells-perspectives on current status and future directions: AACR Workshop on cancer stem cells. Cancer Res. 66, 9339–9344 (2006).
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
Medema, J. P. Cancer stem cells: the challenges ahead. Nat. Cell Biol. 15, 338–344 (2013).
Nassar, D. & Blanpain, C. Cancer Stem Cells: Basic Concepts and Therapeutic Implications. Annu. Rev. Pathol. 11, 47–76 (2016).
Gulati, G. S. et al. Profiling cell identity and tissue architecture with single-cell and spatial transcriptomics. Nat. Rev. Mol. Cell Biol. 26, 11–31 (2025).
Chen, D. et al. Integrated Machine Learning and Bioinformatic Analyses Constructed a Novel Stemness-Related Classifier to Predict Prognosis and Immunotherapy Responses for Hepatocellular Carcinoma Patients. Int. J. Biol. Sci. 18, 360–373 (2022).
Zheng, H. et al. StemSC: a cross-dataset human stemness index for single-cell samples. Stem Cell Res. Ther. 13, 115 (2022).
Fan, Y. et al. Epithelial SOX9 drives progression and metastases of gastric adenocarcinoma by promoting immunosuppressive tumour microenvironment. Gut 72, 624–637 (2023).
Liu, X. et al. FOXP4 Is a Direct YAP1 Target That Promotes Gastric Cancer Stemness and Drives Metastasis. Cancer Res. 84, 3574–3588 (2024).
Wu, H. et al. MSC-induced lncRNA HCP5 drove fatty acid oxidation through miR-3619-5p/AMPK/PGC1α/CEBPB axis to promote stemness and chemo-resistance of gastric cancer. Cell Death Dis. 11, 233 (2020).
Hui, Y. et al. circSLC4A7 accelerates stemness and progression of gastric cancer by interacting with HSP90 to activate NOTCH1 signaling pathway. Cell Death Dis. 14, 452 (2023).
Albrektsen, T. et al. Identification of a novel integral plasma membrane protein induced during adipocyte differentiation. Biochem. J. 359, 393–402 (2001).
Ilhan, A. et al. Localization and characterization of the novel protein encoded by C20orf3. Biochem. J. 414, 485–495 (2008).
Mekenkamp, L. J. et al. Chromosome 20p11 gains are associated with liver-specific metastasis in patients with colorectal cancer. Gut 62, 94–101 (2013).
Jiang, S. et al. Cholesterol Induces Epithelial-to-Mesenchymal Transition of Prostate Cancer Cells by Suppressing Degradation of EGFR through APMAP. Cancer Res. 79, 3063–3075 (2019).
Zhu, X. et al. APMAP Promotes Epithelial-Mesenchymal Transition and Metastasis of Cervical Cancer Cells by Activating the Wnt/β-catenin Pathway. J. Cancer 12, 6265–6273 (2021).
Akhter, M. D. Q. et al. An exploratory SWATH plasma proteomics analysis of phyllodes tumor- a type of female breast tumor. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 1254, 124508 (2025).
Huang, Y., Hong, W. & Wei, X. The molecular mechanisms and therapeutic strategies of EMT in tumor progression and metastasis. J. Hematol. Oncol. 15, 129 (2022).
Pastushenko, I. & Blanpain, C. EMT Transition States during Tumor Progression and Metastasis. Trends Cell Biol. 29, 212–226 (2019).
Peng, D. et al. Targeting TGF-β signal transduction for fibrosis and cancer therapy. Mol. Cancer 21, 104 (2022).
Espinoza, I. & Miele, L. Deadly crosstalk: Notch signaling at the intersection of EMT and cancer stem cells. Cancer Lett. 341, 41–45 (2013).
Xue, W. et al. Wnt/β-catenin-driven EMT regulation in human cancers. Cell Mol. Life Sci. 81, 79 (2024).
Wang, H. et al. GABAB receptor inhibits tumor progression and epithelial-mesenchymal transition via the regulation of Hippo/YAP1 pathway in colorectal cancer. Int. J. Biol. Sci. 17, 1953–1962 (2021).
Ladd, J. J. et al. Increased plasma levels of the APC-interacting protein MAPRE1, LRG1, and IGFBP2 preceding a diagnosis of colorectal cancer in women. Cancer Prev. Res. 5, 655–664 (2012).
Su, L. K. & Qi, Y. Characterization of human MAPRE genes and their proteins. Genomics 71, 142–149 (2001).
Islam, F. et al. Protein interactions of FAM134B with EB1 and APC/beta-catenin in vitro in colon carcinoma. Mol. Carcinog. 57, 1480–1491 (2018).
Tang, L. et al. circSETD3 regulates MAPRE1 through miR-615-5p and miR-1538 sponges to promote migration and invasion in nasopharyngeal carcinoma. Oncogene 40, 307–321 (2021).
Chanez, B. et al. EB1 Restricts Breast Cancer Cell Invadopodia Formation and Matrix Proteolysis via FAK. Cells 10, 388 (2021).
Rodrigues-Ferreira, S. et al. Combinatorial expression of microtubule-associated EB1 and ATIP3 biomarkers improves breast cancer prognosis. Breast Cancer Res. Treat. 173, 573–583 (2019).
Feng, Y. et al. Integrated Analysis of DNA Copy Number Changes and Gene Expression Identifies Key Genes in Gastric Cancer. J. Comput. Biol. 27, 877–887 (2020).
Lee, B. Y. et al. Senescence-associated beta-galactosidase is lysosomal beta-galactosidase. Aging Cell 5, 187–195 (2006).
Sun, J. et al. A Glb1-2A-mCherry reporter monitors systemic aging and predicts lifespan in middle-aged mice. Nat. Commun. 13, 7028 (2022).
Guerrero, A. et al. Galactose-modified duocarmycin prodrugs as senolytics. Aging Cell 19, e13133 (2020).
Liu, J. et al. Proteomics-Derived Biomarker Panel Facilitates Distinguishing Primary Lung Adenocarcinomas With Intestinal or Mucinous Differentiation From Lung Metastatic Colorectal Cancer. Mol. Cell Proteom. 23, 100766 (2024).
Mou, L. et al. Construction of a lipid metabolism-related risk model for hepatocellular carcinoma by single cell and machine learning analysis. Front. Immunol. 14, 1036562 (2023).
Bernard, M. et al. Autophagy drives fibroblast senescence through MTORC2 regulation. Autophagy 16, 2004–2016 (2020).
Park, S. S. et al. Cellular senescence is associated with the spatial evolution toward a higher metastatic phenotype in colorectal cancer. Cell Rep. 43, 113912 (2024).
Brambs, C. E. et al. Epithelial-mesenchymal transition (EMT) in vulvar cancer with and without inguinal lymph node involvement. J. Cancer Res. Clin. Oncol. 148, 1183–1193 (2022).
Negrini, S., Gorgoulis, V. G. & Halazonetis, T. D. Genomic instability-an evolving hallmark of cancer. Nat. Rev. Mol. Cell Biol. 11, 220–228 (2010).
Zucman-Rossi, J. et al. Genetic Landscape and Biomarkers of Hepatocellular Carcinoma. Gastroenterology 149, 1226–1239.e4 (2015).
Sherr, C. J. Cancer cell cycles. Science 274, 1672–1677 (1996).
Dilworth, D. et al. Germline CDKN2A mutation implicated in predisposition to multiple myeloma. Blood 95, 1869–1871 (2000).
Tang, V. et al. Loss of p16 expression is a sensitive marker of CDKN2A homozygous deletion in malignant meningiomas. Acta Neuropathol. 145, 497–500 (2023).
Kreuger, I. Z. M. et al. Therapeutic Strategies for Targeting CDKN2A Loss in Melanoma. J. Invest. Dermatol. 143, 18–25.e1 (2023).
Spagnol, L. W. et al. P16 gene promoter methylation is associated with oncogenesis and progression of gastric carcinomas: A systematic review and meta-analysis. Crit. Rev. Oncol. Hematol. 180, 103843 (2022).
Wang, M. et al. The SMAD2/miR-4256/HDAC5/p16(INK4a) signaling axis contributes to gastric cancer progression. Oncol. Res. 31, 515–541 (2023).
Dermawan, J. K. et al. Novel Genomic Risk Stratification Model for Primary Gastrointestinal Stromal Tumors (GIST) in the Adjuvant Therapy Era. Clin. Cancer Res. 29, 3974–3985 (2023).
Qi, Y. et al. Expression and function of transmembrane 4 superfamily proteins in digestive system cancers. Cancer Cell Int. 20, 314 (2020).
Huang, R. et al. The role of tetraspanins pan-cancer. iScience 25, 104777 (2022).
Andrijes, R. et al. Tetraspanin 6 is a regulator of carcinogenesis in colorectal cancer. Proc. Natl Acad. Sci. USA, 118, e2011411118 (2021).
Zhang, C. et al. TSPAN6 reinforces the malignant progression of glioblastoma via interacting with CDK5RAP3 and regulating STAT3 signaling pathway. Int. J. Biol. Sci. 20, 2440–2453 (2024).
Zhang, L. et al. TSPAN8 regulates EGFR/AKT pathway to enhance metastasis in gastric cancer. Mol. Biol. Rep. 50, 7955–7965 (2023).
Lu, Z. et al. Delivery of TSPAN1 siRNA by Novel Th17 Targeted Cationic Liposomes for Gastric Cancer Intervention. J. Pharm. Sci. 109, 2854–2860 (2020).
Deng, Y. et al. Tetraspanins: Novel Molecular Regulators of Gastric Cancer. Front. Oncol. 11, 702510 (2021).
Qi, W. et al. Tetraspanin family identified as the central genes detected in gastric cancer using bioinformatics analysis. Mol. Med. Rep. 18, 3599–3610 (2018).
Johnson, D. E. et al. Targeting the IL-6/JAK/STAT3 signalling axis in cancer. Nat. Rev. Clin. Oncol. 15, 234–248 (2018).
Wang, T. et al. JAK/STAT3-Regulated Fatty Acid β-Oxidation Is Critical for Breast Cancer Stem Cell Self-Renewal and Chemoresistance. Cell Metab. 27, 136–150.e5 (2018).
Jin, W. Role of JAK/STAT3 Signaling in the Regulation of Metastasis, the Transition of Cancer Stem Cells, and Chemoresistance of Cancer by Epithelial-Mesenchymal Transition. Cells 9, 217 (2020).
Kang, B. et al. Parallel single-cell and bulk transcriptome analyses reveal key features of the gastric tumor microenvironment. Genome Biol. 23, 265 (2022).
Lee, S. H. et al. Spatial dissection of tumour microenvironments in gastric cancers reveals the immunosuppressive crosstalk between CCL2+ fibroblasts and STAT3-activated macrophages. Gut 74, 714–727 (2024).
Luo, L. et al. Single-cell RNA sequencing identifies molecular biomarkers predicting late progression to CDK4/6 inhibition in patients with HR+/HER2- metastatic breast cancer. Mol. Cancer 24, 48 (2025).
Xin, S. et al. ScRNA-seq revealed an immunosuppression state and tumor microenvironment heterogeneity related to lymph node metastasis in prostate cancer. Exp. Hematol. Oncol. 12, 49 (2023).
Gulati, G. S. et al. Single-cell transcriptional diversity is a hallmark of developmental potential. Science 367, 405–411 (2020).
Fan, G. et al. Single-cell and spatial analyses revealed the co-location of cancer stem cells and SPP1+ macrophage in hypoxic region that determines the poor prognosis in hepatocellular carcinoma. NPJ Precis Oncol. 8, 75 (2024).
Ye, L. et al. Inferring the genetic relationships between unsupervised deep learning-derived imaging phenotypes and glioblastoma through multi-omics approaches. Brief Bioinform 26, bbaf037 (2024).
Dai, L. et al. Single-cell and spatial transcriptomics reveal a high glycolysis B cell and tumor-associated macrophages cluster correlated with poor prognosis and exhausted immune microenvironment in diffuse large B-cell lymphoma. Biomark. Res. 12, 58 (2024).
Sun, Y. et al. Single-cell hdWGCNA reveals metastatic protective macrophages and development of deep learning model in uveal melanoma. J. Transl. Med. 22, 695 (2024).
Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
Wallace, M. L. et al. Use and misuse of random forest variable importance metrics in medicine: demonstrations through incident stroke prediction. BMC Med. Res. Methodol. 23, 144 (2023).
Yan, F. et al. Association between the stress hyperglycemia ratio and 28-day all-cause mortality in critically ill patients with sepsis: a retrospective cohort study and predictive model establishment based on machine learning. Cardiovasc. Diabetol. 23, 163 (2024).
Kang, J. et al. LASSO-Based Machine Learning Algorithm for Prediction of Lymph Node Metastasis in T1 Colorectal Cancer. Cancer Res. Treat. 53, 773–783 (2021).
Lyu, M. et al. Maxillary sinus floor augmentation: a review of current evidence on anatomical factors and a decision tree. Int. J. Oral. Sci. 15, 41 (2023).
Kong, W. et al. Adaptive best subset selection algorithm and genetic algorithm aided ensemble learning method identified a robust severity score of COVID-19 patients. Imeta 2, e126 (2023).
Liu, X. P. et al. Clinical significance and molecular annotation of cellular morphometric subtypes in lower-grade gliomas discovered by machine learning. Neuro Oncol. 25, 68–81 (2023).
Qi, X. et al. Machine learning and SHAP value interpretation for predicting comorbidity of cardiovascular disease and cancer with dietary antioxidants. Redox Biol. 79, 103470 (2025).

Acknowledgements

This study was supported by the National Natural Science Foundation of China (82072733) and the Natural Science Foundation of Liaoning Province (2020-BS-088).

Author information

These authors contributed equally: Ziyi Wang, Xuehao Li, Jin Wang.

Authors and Affiliations

Department of Surgical Oncology and General Surgery, The First Hospital of China Medical University, Shenyang, Liaoning, China
Ziyi Wang & Yan Xu
Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Ziyi Wang
Department of Thoracic Surgery, The First Hospital of China Medical University, Shenyang, Liaoning, China
Ziyi Wang, Xuehao Li, Huidong Yu, Defeng Zhao, Siyu Zhou & Wanfu Men
Department of E.N.T., Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
Jin Wang

Authors

Ziyi Wang
Xuehao Li
Jin Wang
Huidong Yu
Defeng Zhao
Yan Xu
Siyu Zhou
Wanfu Men

Contributions

W.M. conceived, designed and funded the study. Z.W. prepared and wrote the paper and prepared the figures. X.L. and J.W. performed data analyses. Z.W. and X.L. performed experiments. H.Y. and D.Z. acquired tissues and data. Supervision, manuscript reviewing, and editing were performed by Y.X. and S.Z. All the authors contributed to writing the manuscript.

Corresponding authors

Correspondence to Yan Xu, Siyu Zhou or Wanfu Men.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wang, Z., Li, X., Wang, J. et al. Comprehensive molecular characterization of high-stemness gastric cancer cells using single-cell transcriptomics, spatial mapping, and machine learning. npj Precis. Onc. 9, 400 (2025). https://doi.org/10.1038/s41698-025-01177-0

Received: 05 July 2025
Accepted: 31 October 2025
Published: 17 December 2025
Version of record: 17 December 2025
DOI: https://doi.org/10.1038/s41698-025-01177-0