Introduction

Chondrosarcoma (CS) is a primary bone tumor that arises in both the axial and appendicular skeleton. It is the second most common primary bone tumor after osteosarcoma1,2. In 2021, approximately 3,610 new cases of primary malignant bone tumors were reported in the United States3. The survival rate of CS patients declines with age4, which imposes a heavy burden on an aging society. Surgical resection is the most effective treatment for CS. Other options, such as chemotherapy and radiotherapy, have no significant effect on CS, so the available treatments remain very limited5. Biomarkers have a wide range of applications, including cancer screening, differential diagnosis, prognosis prediction, and progression monitoring, and are of great value in the management of malignant tumors. Identifying biomarkers could therefore provide new targets for precise CS treatment, expanding treatment options and improving patient prognosis, which is of clear clinical significance.

Angiogenesis, the formation of new capillaries from existing vessels, is tightly regulated by the balance of angiogenic factors in the vascular microenvironment6. It is crucial in various diseases and is a key feature of cancers, diabetic retinopathy, autoimmune diseases, rheumatoid arthritis, atherosclerosis, cerebral ischemia, cardiovascular disease, and delayed wound healing7. Studies have found that wingless-type MMTV integration site family, member 1 (WNT1)-induced signaling protein 3 (WISP-3) promotes chondrosarcoma metastasis by regulating angiogenesis8. Adipocytokines inhibit the production of miR-1264 through the PI3K/Akt/mTOR signaling cascade, thereby promoting platelet-derived growth factor expression and chondrosarcoma angiogenesis9. These studies suggest that inhibiting tumor angiogenesis is an important strategy for the treatment of CS.

In eukaryotic cells, the integrated stress response (ISR) is a key signaling mechanism activated by physiological and pathological stimuli. Hypoxia, nutrient deprivation, and viral infection are common external triggers of this activation10. In addition, oncogene activation can trigger the ISR. Activating transcription factor 4 (ATF4) is the main effector of the ISR, and global or fibroblast-specific deletion of ATF4 leads to insufficient angiogenesis and significantly delays the growth of syngeneic melanoma and pancreatic tumors11. Platelet-rich plasma, as a toll-like receptor ligand, can increase the unfolded protein response in CS cells12. These studies suggest that abnormal activation of the ISR is associated with the development and progression of various tumors. However, the potential role of angiogenesis- and ISR-related genes in the diagnosis and treatment of CS remains to be fully elucidated. Further exploration of how the ISR acts in the development and progression of CS is needed to improve treatment outcomes for CS patients.

Single-cell RNA sequencing (scRNA-seq) is a powerful high-throughput technology that profiles the transcriptome at the single-cell level. It can detect clinically important tumor subpopulations, explore the heterogeneity of complex tissues and intercellular communication, and provide transcriptome characterizations with higher resolution and accuracy13. scRNA-seq has proven invaluable in identifying rare malignant cells, contributing to the study of tumor biology and personalized medicine14. Because it captures gene expression differences at the single-cell level, it is particularly suitable for complex and highly heterogeneous tumors such as CS15, and it can more accurately pinpoint specific cell subpopulations and molecular targets associated with angiogenesis and the ISR. Its application not only deepens our understanding of the biological characteristics of CS but also provides important experimental and theoretical support for the development of novel targeted therapeutics. Furthermore, the tumor immune microenvironment plays a crucial role in tumor progression and treatment response, and its interaction with angiogenesis and stress signaling pathways has become an emerging research area16,17,18,19. In summary, angiogenesis and the ISR may play a vital role in the development of CS and may help improve the clinical treatment of CS patients. Therefore, this study used public transcriptome data to identify potential biomarkers through differential expression analysis, weighted gene co-expression network analysis (WGCNA), and expression validation, and explored their biological pathways and molecular regulatory networks to uncover potential mechanisms in CS. Finally, scRNA-seq analysis identified the distribution of key cells, and the potential biomarkers were validated to offer new insights into CS pathogenesis and treatment.

Materials and methods

Data selection

Transcriptome data associated with chondrosarcoma (CS), comprising two microarray datasets (GSE30835 and GSE22855) and the scRNA-seq dataset GSE184118, were retrieved from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). GSE30835 (platform: GPL6884) served as the training set and included 21 CS tumor tissue samples and 6 healthy controls. Sample preparation, RNA isolation, cDNA synthesis, cRNA amplification, hybridization onto Illumina Human-6 v3.0 Expression BeadChips, microarray data preprocessing, and quality control were performed as previously described20,21. The validation set, GSE22855 (platform: GPL6884), comprised 7 CS tumor tissues and 6 healthy controls. For GSE22855, single nucleotide polymorphism (SNP) data were analyzed using Affymetrix SNP 6.0 arrays and genotyped with the Birdseed and CRLMM algorithms; copy number analysis was conducted using the aroma.affymetrix package with a reference set of 29 controls. Expression data were subjected to variance stabilization and quantile normalization, followed by differential analysis using the limma package with false discovery rate (FDR) correction. The scRNA-seq dataset GSE184118 (platform: GPL24676) included 8 CS tumor tissue samples of different grades.

Additionally, the angiogenesis-related genes (A-RGs) and integrated stress response-related genes (ISR-RGs) used in this study were both derived from published literature. The A-RGs were adopted directly from a systematically screened set of 36 genes (Table S1) in the supplementary materials of a gastric cancer study22. The ISR-RGs were sourced from a tumour stress network study23, from whose supplementary materials we incorporated 529 genes (Table S2) associated with various tumours. The analytical workflow of this study is illustrated in Figure S1.

Differential expression analysis

The limma package (v 3.54.0)24 was employed to conduct differential expression analysis between normal and CS samples in GSE30835. The purpose of this analysis was to identify differentially expressed genes (DEGs) (|log2 fold change (FC)| > 0.5 and p < 0.05)25,26. To ensure that enough genes were available for the subsequent multi-step joint analyses, a relatively lenient threshold (raw p-value) was used at this stage. The ggplot2 package (v 3.4.1)27 and the circlize package (v 0.4.15)28 were then used to draw the volcano plot and the heatmap, respectively, to display the labeled genes and expression patterns of the DEGs.
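
The DEG thresholds described above can be sketched as a simple filter. This is a minimal illustration in Python rather than the R/limma workflow used in the study; the `GeneStat` record, the `classify_deg` helper, and the toy values are assumptions introduced only for the example.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential expression table (field names are illustrative)."""
    gene: str
    log2_fc: float
    p_value: float


def classify_deg(stat: GeneStat, lfc_cutoff: float = 0.5, p_cutoff: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' using |log2FC| > 0.5 and raw p < 0.05."""
    if stat.p_value < p_cutoff and abs(stat.log2_fc) > lfc_cutoff:
        return "up" if stat.log2_fc > 0 else "down"
    return "ns"


# Hypothetical toy values for illustration only; not taken from GSE30835.
toy_table = [
    GeneStat("GENE_A", 1.2, 0.001),   # large positive change, significant
    GeneStat("GENE_B", -0.8, 0.03),   # large negative change, significant
    GeneStat("GENE_C", 0.3, 0.0001),  # significant, but fold change too small
    GeneStat("GENE_D", 2.0, 0.2),     # large change, but not significant
]
labels = {stat.gene: classify_deg(stat) for stat in toy_table}
print(labels)
```

Both conditions must hold at once, so a gene with a very small p-value but a modest fold change is still labeled as not significant.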

Weighted gene co-expression network analysis (WGCNA)

To explore the potential role of A-RGs in CS, A-RG scores were calculated in GSE30835 using the ssGSEA algorithm in the GSVA package (v 1.53.28)29. The Bayesian t-test (p < 0.05) was used to assess differences in A-RG scores between the normal and CS groups. These A-RG scores were then treated as phenotypic traits to build an unsigned co-expression network with the WGCNA package (v 1.73)30. A-RGs were not excluded before network construction, because the design aimed to identify co-expression modules associated with the overall angiogenic phenotype rather than independent novel modules. First, samples were clustered on the basis of their gene expression profiles. Second, the pickSoftThreshold function determined the optimal soft threshold needed to maintain a scale-free topology (R² = 0.80). The minimum module size was set to 200 genes and the mergeCutHeight to 0.1, following a previous study31, to ensure the biological significance of the modules while avoiding excessive merging. Finally, the Pearson correlation coefficients (cor) between the A-RG scores and the modules were computed and visualized as a heatmap (|cor| > 0.3 and p < 0.05)32. The two modules showing the strongest positive and the strongest negative correlation with the A-RG scores were designated as key modules33,34. This approach was intended to enable a bidirectional mechanistic dissection of the angiogenesis phenotype and to enhance the robustness of gene screening. Genes in the key modules were denoted as angiogenesis-related module genes (ARMGs).
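
To illustrate what an unsigned network with a soft threshold means, the sketch below computes the standard unsigned WGCNA adjacency, |cor|^β, for a pair of expression profiles. It is a hedged Python illustration, not the WGCNA package itself; the function names and the toy profiles are assumptions, and the power of 11 is the value reported in the Results.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def unsigned_adjacency(x, y, power):
    """Unsigned soft-threshold adjacency: |cor(x, y)| raised to the chosen power."""
    return abs(pearson(x, y)) ** power


# Hypothetical expression profiles across four samples (illustration only).
gene_1 = [1.0, 2.0, 3.0, 4.0]
gene_2 = [8.0, 6.0, 4.0, 2.0]   # perfectly anti-correlated with gene_1
gene_3 = [1.0, 3.0, 2.0, 4.0]   # moderately correlated with gene_1 (r = 0.8)

strong_link = unsigned_adjacency(gene_1, gene_2, power=11)
weak_link = unsigned_adjacency(gene_1, gene_3, power=11)
print(strong_link, weak_link)
```

In an unsigned network the sign of the correlation is discarded, so anti-correlated genes are connected as strongly as positively correlated ones, while raising to a high power shrinks moderate correlations toward zero.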

Functional enrichment analysis

The differentially expressed integrated stress response-related angiogenesis genes (DE-IAGs) were identified by intersecting the sets of DEGs, ARMGs, and ISR-RGs. The resulting overlap was visualized using the VennDiagram package (v 1.7.1)35. To clarify the functions and processes of the DE-IAGs in CS, the clusterProfiler package (v 4.2.2)36 was used to perform GO enrichment analysis, which covered three categories: biological process (BP), cellular component (CC), and molecular function (MF). The org.Hs.eg.db database was used for human gene annotation, with the whole genome as the background gene set. The significance thresholds were pvalueCutoff = 0.05 and qvalueCutoff = 0.05, and an adjusted p-value (padjust) < 0.05 was the criterion for statistically significant enrichment.
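
The three-way intersection that defines the DE-IAGs can be sketched as follows. This is a minimal Python illustration; the helper name `intersect_gene_sets` and the placeholder gene symbols are assumptions and do not reproduce the study's gene lists.

```python
def intersect_gene_sets(*gene_sets):
    """Return the sorted genes that are present in every input collection."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


# Placeholder symbols for illustration only; not the study's DEGs, ARMGs, or ISR-RGs.
degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
armgs = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
isr_rgs = {"GENE_C", "GENE_D", "GENE_F"}

de_iags = intersect_gene_sets(degs, armgs, isr_rgs)
print(de_iags)
```

A gene is retained only if it appears in all three lists, which is why the overlap is typically much smaller than any single input set.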

Chromosome and subcellular localization of potential biomarkers

Among the DE-IAGs, the Wilcoxon test was used to identify genes that differed significantly in expression (p < 0.05) between the CS and normal groups and showed consistent expression trends in both the GSE30835 and GSE22855 datasets. These genes were defined as potential biomarkers. Because gene localization is informative for the study of gene structure, function, and interactions, the chromosomal distribution of the potential biomarkers was analyzed and visualized using the RCircos package (v 1.2.2)37.

Furthermore, the FASTA sequences of the potential biomarkers were downloaded from the National Center for Biotechnology Information (NCBI) website (https://www.ncbi.nlm.nih.gov/). Subsequently, the FASTA sequences were input into the mRNALocater database (http://bio-bigdata.cn/mRNALocater) to obtain the predicted scores of the subcellular localization for each potential biomarker.

Gene set enrichment analysis (GSEA)

GSEA was applied to all GSE30835 samples to investigate the biological functions of the potential biomarkers. Specifically, the Spearman correlation coefficients between each potential biomarker and all other genes were calculated. The cor values were then ranked in descending order, and GSEA was performed with the clusterProfiler package (v 4.2.2)36 (p < 0.05, NES > 1, FDR < 0.25)38. The reference gene set was “c2.cp.kegg_medicus.v2024.1.Hs.symbols.gmt” from the Molecular Signatures Database (MSigDB) (http://www.gsea-msigdb.org).
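
The ranking step that feeds GSEA can be sketched as below: each gene is scored by its Spearman correlation with the biomarker, and the genes are sorted in descending order. This is a hedged Python illustration, not the clusterProfiler implementation; the function names and toy expression vectors are assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ranked_gene_list(biomarker_expr, other_genes):
    """Sort genes by Spearman correlation with the biomarker, highest first."""
    scores = {gene: spearman(biomarker_expr, expr) for gene, expr in other_genes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical expression values across five samples (illustration only).
biomarker = [1.0, 2.0, 3.0, 4.0, 5.0]
others = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 50.0],
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 3.0, 2.0, 5.0, 4.0],
}
ranked = ranked_gene_list(biomarker, others)
print(ranked)
```

Because Spearman correlation works on ranks, the outlying value in the toy `GENE_A` profile does not reduce its score, which is one reason rank-based correlation is a common choice for building such lists.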

Molecular regulatory network

MicroRNAs (miRNAs) and transcription factors (TFs) act as cross-regulators that control gene regulatory networks in either a dependent or an independent manner. A deeper understanding of the functions of messenger RNAs (mRNAs) may therefore lie in the regulatory interactions between TFs and miRNAs. To further explore the regulatory mechanisms of the potential biomarkers during the development of CS, the iRegulon plugin in Cytoscape (v 3.7.2)39 was used to predict the TFs of the potential biomarkers (enrichment score threshold of 5.0). miRNAs targeting the potential biomarkers were retrieved from the miRWalk (http://mirwalk.umm.uni-heidelberg.de/) and miRTarBase (https://mirtarbase.cuhk.edu.cn/) databases. For miRWalk, the default prediction results were adopted (i.e., mRNA-miRNA interaction scores > 0). To determine the key miRNAs, the eulerr package (v 7.0.2)40 was used to find the intersection of the miRNAs from the two databases. Long non-coding RNAs (lncRNAs) can impair or reduce the function of miRNAs. Therefore, the miRNAs that simultaneously targeted the potential biomarkers were selected for the prediction of upstream lncRNAs. The ENCORI database (https://rnasysu.com/encori/index.php) was used to identify the lncRNAs, with support from at least two independent CrossLinking and Immunoprecipitation followed by high-throughput Sequencing (CLIP-Seq) datasets required as the screening criterion. Finally, the TF-potential biomarker and potential biomarker-miRNA-lncRNA networks were built in Cytoscape (v 3.7.2).

Compound prediction and molecular docking

Potential therapeutic drugs or molecular compounds that might act on the potential biomarkers were explored via the Drug-Gene Interaction database (DGIdb) (https://www.dgidb.org/), and Cytoscape (v 3.7.2) was used to present the compound-potential biomarker interaction network.

Molecular docking is commonly used in compound design to predict binding affinities between candidate compounds and their targets and to identify binding sites. In this study, the potential biomarker-compound pairs with the highest interaction scores were selected for molecular docking. The Protein Data Bank (PDB) database (https://www.rcsb.org/) provided the 3D structures of the potential biomarker proteins, which were used as receptors. Molecular structures of the potential compounds were extracted from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) and used as ligands. Binding between the potential biomarkers and the potential compounds was analyzed using AutoDock Vina (https://autodock-vina.readthedocs.io/en/latest/index.html), and the results were displayed with PyMOL (v 3.1.1)41. A binding energy below −5.0 kcal/mol was taken to indicate strong binding affinity between a compound and a protein.

The scRNA-seq data processing

In the GSE184118 dataset, scRNA-seq data were filtered with the Seurat package (v 4.3.0)42 to select high-quality cells (nCount_RNA between 800 and 6,500, and mitochondrial proportion less than 10%). First, the data were normalized with NormalizeData, and FindVariableFeatures identified the top 2,000 highly variable genes (HVGs). To further identify key genes driving cellular heterogeneity, the HVGs were screened using dispersion analysis (dispersionTable), and genes with an average expression level ≥ 0.1 were retained for subsequent analysis. Next, principal component analysis (PCA) was run with the RunPCA function, and the JackStraw and ElbowPlot functions were used to determine the top significant principal components (PCs) (p < 0.05)43. Cell clusters were determined with the FindNeighbors and FindClusters functions (resolution = 0.3, dimension = 20), and uniform manifold approximation and projection (UMAP) was applied for dimensionality reduction using the top PCs. This parameter setting preserves the major cell type heterogeneity while limiting the generation of subpopulations driven by technical noise. Cells were then annotated as different types based on marker genes reported in the literature44.
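
The cell-level quality control criteria above can be sketched as a simple predicate. This Python illustration is not the Seurat workflow used in the study; the `passes_qc` helper and the toy cells are assumptions, and strict inequalities at the count boundaries are assumed because the text does not state whether the limits are inclusive.

```python
def passes_qc(n_count_rna, percent_mt, min_count=800, max_count=6500, max_percent_mt=10.0):
    """Keep a cell if its total RNA count lies within the window and its
    mitochondrial percentage is below the cutoff (strict bounds are an assumption)."""
    return min_count < n_count_rna < max_count and percent_mt < max_percent_mt


# Hypothetical cells as (total RNA count, mitochondrial percentage); illustration only.
toy_cells = {
    "cell_1": (1200, 3.5),   # within both limits
    "cell_2": (500, 2.0),    # too few counts
    "cell_3": (7000, 4.0),   # too many counts, possibly a doublet
    "cell_4": (3000, 15.0),  # high mitochondrial fraction
}
kept_cells = [name for name, (count, mt) in toy_cells.items() if passes_qc(count, mt)]
print(kept_cells)
```

Low counts usually indicate empty droplets or damaged cells, unusually high counts can indicate doublets, and a high mitochondrial fraction is a common sign of stressed or dying cells.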

A bubble plot was used to display the expression profiles of the potential biomarkers in the annotated cell types. Annotated cell types with a high proportion of cells expressing the potential biomarkers and a literature-reported association with CS were defined as key cells. To further resolve the subpopulations of the key cells, dimensionality reduction and clustering (resolution = 0.12, dimension = 15) were repeated on these cells, followed by cell annotation. A lower clustering resolution was used here to avoid over-partitioning biologically homogeneous populations. Adjusting parameters to the data characteristics of the target cell group in this way is a common strategy in single-cell data analysis. The CellMarker database (http://117.50.127.228/CellMarker/) was consulted to annotate the key cell subpopulations. The ReactomeGSA package (v 1.16.1)45 was then used to investigate pathways linked to the key cell subpopulations.

Pseudo-time and cell communication analyses

In the GSE184118 dataset, pseudotemporal analysis was first performed on the key cells using the monocle package (v 2.28.0)46. The cell state containing the highest proportion of key cell differentiation subpopulations was designated as the root state, and the developmental trajectory was reconstructed to obtain a pseudotime axis with clear biological directionality. This axis was used to analyze the differentiation process of the key cells and the expression dynamics of the potential biomarkers along the trajectory. To assess the robustness of the pseudotime results, a bootstrap procedure (n = 10) was applied: in each iteration, 80% of cells were randomly sampled without replacement, and the dimensionality reduction and trajectory construction steps were repeated. Subsequently, cell communication analysis was conducted across all annotated cell types using the CellChat package (v 1.6.1)47. Low-confidence interactions, in which either the sender or the receiver cell population contained fewer than 10 cells, were removed with the filterCommunication function, yielding a robust intercellular communication network between cell types while retaining as much interaction information as possible. All of the above analyses were based on data merged from all samples; they aimed to reveal overall interaction patterns at the cell type level and did not examine communication differences between samples.
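
The resampling scheme used for the robustness assessment can be sketched as follows. This is a hedged Python illustration that only draws the subsamples and computes a per-cell coefficient of variation; the trajectory inference itself is not reproduced, the function names are assumptions, and the use of the sample standard deviation in the CV is likewise an assumption.

```python
import random
import statistics


def subsample_ids(cell_ids, fraction=0.8, n_iter=10, seed=0):
    """Draw n_iter subsamples, each containing `fraction` of the cells without replacement."""
    rng = random.Random(seed)
    size = round(len(cell_ids) * fraction)
    return [rng.sample(cell_ids, size) for _ in range(n_iter)]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes a non-zero mean)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical cell identifiers (illustration only).
cells = [f"cell_{i}" for i in range(100)]
draws = subsample_ids(cells)

# Hypothetical pseudotime values assigned to one cell across three iterations.
stable_cell_cv = coefficient_of_variation([10.0, 10.0, 10.0])
unstable_cell_cv = coefficient_of_variation([1.0, 2.0, 3.0])
print(len(draws), len(draws[0]), stable_cell_cv, unstable_cell_cv)
```

In such a scheme, a cell whose pseudotime varies little across iterations has a CV near zero, whereas a cell with a high CV is sensitive to which cells happen to be sampled.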

Experimental validation by reverse transcription quantitative PCR (RT-qPCR)

Tissue samples from three CS patients and three controls were collected at the General Hospital of Western Theater Command, and all participants signed an informed consent form. Ethics approval was granted by the Ethics Committee of the General Hospital of Western Theater Command. Total RNA was extracted with TRIzol reagent (Ambion, Texas, USA) and reverse transcribed into cDNA with the SweScript First Strand cDNA Synthesis Kit (ServiceBio, Wuhan, China). The amplification conditions followed the manufacturer's instructions, and PCR amplification was carried out in a 10 µL reaction mixture. The relative mRNA expression levels of the potential biomarkers were calculated with the 2^(−ΔΔCt) method. The RT-qPCR primer pairs were designed by Sangon (Shanghai), and their sequences are shown in Table 1. GAPDH was used as the internal control for data normalization.
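
The relative quantification step can be written out explicitly. The sketch below implements the standard 2^(−ΔΔCt) calculation in Python under the usual assumption of approximately 100% amplification efficiency; the function name and the Ct values are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control, by 2^(-ddCt).

    Each dCt is the target Ct minus the reference-gene Ct (GAPDH in this study).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # dCt in tumor tissue = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # dCt in control tissue = 8
)
print(fold_change)
```

A lower Ct means more template, so a ΔΔCt of −2 corresponds to a fourfold higher relative expression in the sample than in the control.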

Table 1 The primer sequences.

Statistical analysis

Data were analyzed with R software (v 4.2.2) and GraphPad Prism (v 10). All statistical tests were two-tailed. Comparisons were analyzed through the Wilcoxon test or Bayesian t-test, and p < 0.05 denoted statistical significance.

Results

Identification of 701 DEGs and 3,777 ARMGs

Analysis of GSE30835 revealed 701 DEGs, comprising 403 significantly upregulated and 298 significantly downregulated genes in the CS group (Fig. 1A,B).

Fig. 1

Identification of candidate genes associated with A-RGs and ISR-RGs in CS. (A, B) Analysis of differentially expressed genes in CS. In the DEG volcano plot, the horizontal axis is Log2FC and the vertical axis is -Log10(P. Value). Each point represents a gene. The horizontal reference line represents -Log10(0.05) = 1.3, and the vertical reference lines represent log2FC = ± 0.5. Genes in the upper right corner are up-regulated DEGs (red), genes in the upper left corner are down-regulated DEGs (blue), and the remaining genes are not statistically significant (gray). The labeled genes are the top 10 up-regulated and top 10 down-regulated genes with the largest |log2(FoldChange)|. The DEG circular heat map shows the expression levels of the top 10 up-regulated and top 10 down-regulated DEGs in different samples. Red represents tumor samples (tumor) and green represents normal samples (normal). Each ring represents a different gene, and the color depth indicates the intensity of gene expression: the darker the red, the higher the expression, and the darker the blue, the lower the expression. (C) Differences in the ssGSEA-derived A-RG scores between groups. The figure shows the data distribution of the different groups. * indicates p < 0.05, ** indicates p < 0.01, and *** indicates p < 0.001. (D) Sample clustering tree analysis. The sample clustering dendrogram is used to detect outliers. The horizontal axis indicates the sample number, and the vertical axis indicates the cluster height. (E) Soft threshold screening. Scale independence and mean connectivity of co-expression modules under different soft thresholds. (F) Hierarchical clustering tree analysis. The dendrogram shows the similarity hierarchy between genes, while the colored strips at the bottom represent different gene modules. (G) Heat map of module-trait correlation. The heat map shows the relationship between different modules and traits. The color of each module indicates the strength and direction of its correlation. The number in each cell is the correlation coefficient, and the value in brackets is the corresponding P value.

Subsequently, significantly higher A-RG scores were observed in the CS group (p = 4.56 × 10⁻²⁷, Fig. 1C). Sample clustering showed no outliers, so all samples were retained for subsequent analysis (Fig. 1D). The optimal soft threshold (power) was 11 (R² > 0.80), at which mean connectivity approached 0 (Fig. 1E). Using this threshold, genes were organized into a scale-free network, and 14 co-expression modules were identified (excluding MEgrey, which contains genes that could not be assigned to any module) (Fig. 1F). The two key modules most positively and most negatively correlated with the A-RG scores were MEpink (1,434 genes, cor = 0.65, p = 0.0002) and MEbrown (2,343 genes, cor = −0.57, p = 0.002), respectively (Fig. 1G). All genes in these two key modules were therefore selected, giving a total of 3,777 ARMGs (Table S3).

Identification and functional evaluation of 8 DE-IAGs

The intersection among 701 DEGs, 529 ISR-RGs, and 3,777 ARMGs was taken to obtain 8 DE-IAGs, namely SERPINH1, HSPA8, LMNA, FBXO31, CCND1, SOD3, BCL6 and GPX3 (Fig. 2A).

Fig. 2

Identification and functional evaluation of 8 DE-IAGs. (A) Identification of candidate genes. The Venn diagram shows the intersection of the expanded A-RGs (ARMGs), the ISR-RGs, and the differentially expressed genes. (B) GO enrichment analysis of candidate genes. The nodes are arranged along the circumference and connected by weighted arcs. The width of each arc represents the strength of the relationship between the connected groups.

To explore the underlying biological processes, GO analysis was then performed on the 8 DE-IAGs, yielding 25 enriched terms. Among the 23 BP terms, “response to unfolded protein”, “response to topologically incorrect protein”, “mitotic G1 DNA damage checkpoint signaling”, “mitotic G1/S transition checkpoint signaling”, and “negative regulation of mitotic cell cycle” were significantly enriched. The two MF terms were “antioxidant activity” and “unfolded protein binding” (padjust < 0.05, Fig. 2B, Table S4). These analyses provided insights into the molecular mechanisms of CS related to angiogenesis and the integrated stress response (ISR).

Chromosome and subcellular localization of HSPA8, LMNA and SERPINH1

Among the eight DE-IAGs, three genes (HSPA8, LMNA, and SERPINH1) were selected as potential biomarkers for further validation because they were consistently and significantly upregulated in both the training (GSE30835) and validation (GSE22855) datasets (p < 0.05, Fig. 3A). As shown in Fig. 3B, HSPA8 and SERPINH1 were located on chromosome 11, and LMNA was located on chromosome 1. The top five predicted subcellular locations of the potential biomarkers are shown in Fig. 3C. The locations with higher predicted scores might shed light on the potential mechanisms related to angiogenesis and the ISR in CS. The subcellular localization results indicated that HSPA8, LMNA, and SERPINH1 were mainly localized in the cytoplasm (Table S5). Knowing the intracellular locations of HSPA8, LMNA, and SERPINH1 can help predict their interactions with other cellular components.

Fig. 3

Chromosomal and subcellular localization of key genes. (A) Expression levels of the candidate genes in the training set GSE30835 and the validation set GSE22855. The boxes in the box plot represent the interquartile range of the data (25% to 75%), and the horizontal line in each box represents the median. The whiskers above and below the box extend to the maximum and minimum values of the data (excluding outliers). * indicates p < 0.05, ** indicates p < 0.01, and *** indicates p < 0.001. (B) Chromosomal localization of potential biomarkers. The chromosomes are arranged counterclockwise from top to bottom. (C) Subcellular localization of potential biomarkers. The stacked bar chart shows the predicted score percentage for each subcellular location; the horizontal axis represents the gene, and the bars are colored by subcellular location.

GSEA of the potential biomarkers in CS

GSEA helps uncover broad changes in biological processes. In CS, HSPA8 was enriched in 28 pathways, such as “medicus reference citrate cycle second carbon oxidation 2” (p < 0.05, Fig. 4A, Table S6); LMNA was enriched in 35 pathways, such as “medicus reference 26S proteasome mediated protein degradation” (p < 0.05, Fig. 4B, Table S7); and SERPINH1 was enriched in 22 pathways, such as “medicus reference 26S proteasome mediated protein degradation” (p < 0.05, Fig. 4C, Table S8). Notably, HSPA8, LMNA, and SERPINH1 were all enriched in pathways such as “medicus variant mutation caused aberrant Abeta to 26S proteasome mediated protein degradation” and “medicus variant mutation caused aberrant HTT to 26S proteasome mediated protein degradation” (p < 0.05). This implies that the potential biomarkers might influence these functional pathways during the pathophysiological process of CS.

Fig. 4

GSEA of the potential biomarkers in CS. (A-C) GSEA results for HSPA8, LMNA and SERPINH1. The horizontal axis represents the gene ranking in the dataset sorted by enrichment, and the vertical axis represents the Running Enrichment Score. Different curves represent different KEGG pathways. The color and style of each curve corresponds to the specific variant listed in the legend. The gray shaded area represents the random expectation range, while the solid line represents the actual observed enrichment score.

Molecular regulatory network of potential biomarkers

Given the pivotal roles of the three potential biomarkers, their potential regulators, including TFs, miRNAs, and lncRNAs, were predicted to delineate a multifaceted regulatory landscape. In the TF-potential biomarker network, a total of 9 TFs (KLF4, HSF4, ATF6, HSF2, SRF, HSF1, JAZF1, STAT1, and CREB3) were predicted as regulators. Notably, STAT1 was predicted to co-target and regulate all of the potential biomarkers (Fig. 5A). In the potential biomarker-miRNA-lncRNA network, a total of 69 key miRNAs that could target and regulate the potential biomarkers were predicted using the miRWalk and miRTarBase databases (Fig. 5B). Meanwhile, a total of 78 lncRNAs (such as SNHG7, MEG3, MALAT1, NEAT1, and XIST) might affect the potential biomarkers by regulating two miRNAs (hsa-miR-361-3p and hsa-miR-6504-5p) that simultaneously targeted all three biomarkers (Fig. 5C). These network analyses provided a basis for elucidating the regulatory mechanisms underlying CS.

Fig. 5

Molecular regulatory network of potential biomarkers. (A) TF-mRNA regulatory network. The figure shows the structure of the TF-mRNA regulatory network, where blue square nodes represent transcription factors (TFs) and pink oval nodes represent messenger RNAs (mRNAs). Arrows point to the direct regulatory relationship between TF and mRNA. (B) Acquisition of key miRNAs. The figure shows the number of miRNAs predicted for potential biomarkers by the two databases miRWalk and miRTarBase. The intersection of the two circles represents the common part of the predicted miRNAs, which is 69 in size. (C) Potential biomarker-key miRNA relationship network. The figure shows the structure of the TF-mRNA regulatory network, where green diamond nodes represent transcription factors (TFs) and pink triangle nodes represent messenger RNAs (mRNAs).

Protein-ligand docking results

The potential biomarkers were queried in DGIdb, which identified 25 targeted compounds. Of these, 16 potential compounds (such as ellagic acid, cisplatin, cyclosporine, and adenosine diphosphate (ADP)) targeted HSPA8, 6 (such as antisense oligonucleotides and aspartic acid) targeted SERPINH1, and 3 (such as histamine and lonafarnib) targeted LMNA. These results indicated potential associations between the compounds and the potential biomarkers (Fig. 6A).

Fig. 6

Protein-ligand docking results. (A) Potential biomarker-drug network. This figure shows the potential biomarker-drug interaction network. The yellow diamond nodes represent potential biomarkers, the blue circle nodes represent drugs, and the thickness of the line represents the interaction score. (B-D) Molecular docking prediction of potential biomarkers and drugs.

For molecular docking, the compounds with the highest scores for HSPA8, LMNA, and SERPINH1 were ADP, lonafarnib, and threo-3-methylglutamic acid, respectively. HSPA8 bound to ADP with a binding energy of −7.0 kcal/mol (Fig. 6B), and LMNA bound to lonafarnib with a binding energy of −7.2 kcal/mol (Fig. 6C). In contrast, the binding energy between SERPINH1 and threo-3-methylglutamic acid was −4.3 kcal/mol (Fig. 6D), which did not reach the −5.0 kcal/mol threshold for strong binding. These findings implied that the compounds, particularly ADP and lonafarnib, might serve as potential therapeutic agents for CS.
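
The strong-binding criterion from the Methods (binding energy below −5.0 kcal/mol) can be applied to the three reported docking results as a quick check. This is a minimal Python sketch; the helper name `is_strong_binder` is an assumption, while the energies are the values reported above.

```python
STRONG_BINDING_CUTOFF = -5.0  # kcal/mol, threshold stated in the Methods

# Binding energies reported in the text (kcal/mol).
docking_results = {
    ("HSPA8", "ADP"): -7.0,
    ("LMNA", "lonafarnib"): -7.2,
    ("SERPINH1", "threo-3-methylglutamic acid"): -4.3,
}


def is_strong_binder(binding_energy, cutoff=STRONG_BINDING_CUTOFF):
    """More negative energy means tighter predicted binding; strong if below the cutoff."""
    return binding_energy < cutoff


strong = sorted(gene for (gene, _), energy in docking_results.items() if is_strong_binder(energy))
weak = sorted(gene for (gene, _), energy in docking_results.items() if not is_strong_binder(energy))
print("strong:", strong, "weak:", weak)
```

Under this criterion the HSPA8-ADP and LMNA-lonafarnib pairs qualify as strong binders, whereas the SERPINH1 pair does not.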

Identification of cell clusters in scRNA-seq datasets

After integration and quality control (QC), the GSE184118 dataset for CS contained 23,746 cells and 33,742 genes (Figure S2A), and the top 2,000 HVGs were selected for subsequent analyses (Fig. 7A). The scree plot indicated an optimal dimensionality of 20, so the top 20 PCs were retained for downstream analyses (p < 0.05, Figure S2B-C). UMAP clustering analysis then identified 8 distinct cell clusters (Fig. 7B), which were annotated (Table S9). The annotated cell populations included Chon1: chondroid clusters 1; Chon2: chondroid clusters 2; High1: major neoplastic cell clusters of High_1; High2: major neoplastic cell clusters of High_2; Cos: major neoplastic cell clusters of COS; Prol: proliferating clusters; Stro: stromal cells (excluding leukocytes); Leuk: leukocytes (Fig. 7C).

Fig. 7

Identification of cell clusters in scRNA-seq datasets. (A) Screening of highly variable genes. The scatter plot shows the relationship between the average expression and standardized variance of the 2,000 highly variable genes. The horizontal axis represents the average expression, and the vertical axis represents the standardized variance. Each red dot represents a gene. The higher the dot, the larger its standardized variance; the further to the right, the higher its average expression. (B) UMAP map of cell cluster classification. (C) Cell annotation map of different cell types. (D) Expression of potential biomarkers in each cell type. (E) Subpopulation classification of key cells. (F) Enrichment analysis of key cells.

To screen for key cells, a bubble plot was used to display the expression patterns of the potential biomarkers across the annotated cell types. HSPA8, LMNA, and SERPINH1 exhibited relatively high expression levels in stromal cells (Fig. 7D). Because stromal cells are capable of proliferation and differentiation and might play a crucial role in tumor progression48, they were selected as the key cells. Subsequently, stromal cells were clustered and annotated into 4 subpopulations: fibrocartilage chondrocytes, stromal cells 1 (Stro1), stromal cells 2 (Stro2), and cartilage progenitor cells (Fig. 7E). These key cell subpopulations were associated with pathways such as “FCERI mediated MAPK activation”, “Fcgamma receptor (FCGR) dependent phagocytosis”, and “regulation of actin dynamics for phagocytic cup formation” (Fig. 7F, Table S10).

Role of potential biomarkers in stromal cells and cell communication landscaping in CS

To infer differentiation trajectories, pseudo-time analysis was performed on the stromal cells in GSE184118, which were divided into four subpopulations. Notably, fibrocartilage chondrocytes remained relatively active throughout the entire differentiation period of stromal cells (Fig. 8A-C). Bootstrap validation (n = 10 iterations) showed that the pseudo-time values of most cells fluctuated minimally between bootstrap samples, with stable core trajectory trends. Only a minority of cells (coefficient of variation (CV) > 0.2) were sensitive to sampling, so their trajectory positions should be interpreted with caution. Cell clustering results were highly consistent, with the majority of cells exhibiting inconsistency rates ≤ 0.2 (Figure S3). In addition, the expression patterns of the potential biomarkers along the pseudo-time trajectory were examined in stromal cells. As stromal cells differentiated, the expression of HSPA8 and LMNA first increased significantly and then leveled off, whereas the expression of SERPINH1 showed a fluctuating downward trend (Fig. 8D).
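
The stability criterion described above (flagging cells whose pseudo-time CV across bootstrap runs exceeds 0.2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the cell IDs and pseudo-time values are invented toy data, the function names are hypothetical, and the CV is computed as the sample standard deviation divided by the absolute mean.

```python
import statistics


def coefficient_of_variation(values):
    """Return sample standard deviation / |mean|; infinity when the mean is zero."""
    mean = statistics.fmean(values)
    if mean == 0:
        return float("inf")
    return statistics.stdev(values) / abs(mean)


def flag_unstable_cells(pseudotime_runs, cv_threshold=0.2):
    """Compute the per-cell CV and list cells whose CV exceeds the threshold.

    pseudotime_runs maps a cell ID to its pseudo-time values, one per bootstrap run.
    """
    cv = {cell: coefficient_of_variation(vals) for cell, vals in pseudotime_runs.items()}
    unstable = sorted(cell for cell, value in cv.items() if value > cv_threshold)
    return cv, unstable


# HYPOTHETICAL toy data: 10 bootstrap runs per cell (matching n = 10 in the text).
runs = {
    "cell_a": [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9],
    "cell_b": [4.0, 12.0, 7.5, 15.0, 9.0, 3.5, 14.0, 6.0, 11.0, 8.0],
}

cv, unstable = flag_unstable_cells(runs)
for cell, value in cv.items():
    status = "unstable" if cell in unstable else "stable"
    print(f"{cell}: CV = {value:.3f} ({status})")
```

In this toy example, `cell_a` varies little between runs and stays below the threshold, while `cell_b` is flagged, so its trajectory position would be interpreted with caution.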

Fig. 8

Role of potential biomarkers in stromal cells and cell communication landscaping in CS. (A-D) Pseudo-temporal differentiation trajectories of key cells. (E, F) Communication between key cells. In the communication network, colors represent different cell types. In the circle diagram, the size of the dot is proportional to the number of cells, the color of the line is consistent with the source, and the thickness of the line is proportional to the communication strength.

The cell communication analysis network diagram illustrated the number and strength of interactions among annotated cells, revealing that stromal cells communicated with several other cell types. Overall, there was a relatively strong communication intensity between stromal cells and chondroid clusters 1 (Fig. 8E-F). These findings provided crucial insights into the complex biological processes underlying stromal cells’ differentiation, potential biomarker expression dynamics, and the interactions between stromal cells and chondroid clusters 1.

RT-qPCR confirmation of HSPA8, LMNA and SERPINH1

To confirm the differential expression of the potential biomarkers HSPA8, LMNA, and SERPINH1 in CS patients, RT-qPCR was performed. Compared with the control group, the mRNA expression levels of HSPA8, LMNA, and SERPINH1 were significantly increased in CS patients (p < 0.05, Fig. 9A-C). These findings suggest that the potential biomarkers may be important for developing novel therapies and targets for CS associated with angiogenesis and ISR.
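
Relative mRNA expression from RT-qPCR is commonly calculated with the 2^-ΔΔCt method. The sketch below illustrates that calculation under explicit assumptions: this section does not state which quantification method or reference gene was used, so the method choice, the function names, and all Ct values are hypothetical toy inputs rather than data from the study.

```python
import statistics


def delta_ct(target_cts, reference_cts):
    """Per-sample delta-Ct: target gene Ct minus reference gene Ct."""
    return [t - r for t, r in zip(target_cts, reference_cts)]


def fold_changes(case_target, case_ref, ctrl_target, ctrl_ref):
    """Relative expression of each case sample by the 2^-ddCt method.

    The mean delta-Ct of the control group serves as the calibrator.
    """
    calibrator = statistics.fmean(delta_ct(ctrl_target, ctrl_ref))
    return [2 ** -(d - calibrator) for d in delta_ct(case_target, case_ref)]


# HYPOTHETICAL Ct values for one target gene and one reference gene.
ctrl_target = [24.0, 24.2, 23.8]
ctrl_ref = [18.0, 18.1, 17.9]
case_target = [22.0, 22.3, 21.7]
case_ref = [18.0, 18.2, 17.8]

folds = fold_changes(case_target, case_ref, ctrl_target, ctrl_ref)
print([round(f, 2) for f in folds])
```

A fold change above 1 indicates higher expression in the case sample than in the control calibrator. Statistical testing of the group difference is a separate step not shown here.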

Fig. 9

RT-qPCR confirmation of expression difference of potential biomarkers. (A) mRNA expression levels of HSPA8. (B) mRNA expression levels of LMNA. (C) mRNA expression levels of SERPINH1. (*** represents p < 0.001, ** represents p < 0.01, * represents p < 0.05).

Discussion

Visfatin promotes PDGF expression by inhibiting miR-1264 and activating the PI3K/Akt/mTOR pathway9, and WISP-3 mediates metastasis by regulating angiogenesis49. Furthermore, the integrated stress response is a core mechanism for cellular responses to microenvironmental stress50. Loss of its effector, ATF4, significantly inhibits tumor angiogenesis11, and platelet-rich plasma can activate the unfolded protein response in CS cells51. However, the mechanisms of synergy between angiogenesis and the ISR, and their key drivers, remain largely undefined in CS. In this study, we used bioinformatics analysis and expression validation to identify potential biomarkers associated with angiogenesis and ISR in CS (HSPA8, LMNA, and SERPINH1), and conducted a preliminary study of their potential molecular pathways and regulatory network factors in CS progression. By integrating bulk RNA sequencing and single-cell RNA sequencing data, this study examined the molecular mechanisms of CS and provides new ideas and potential targets for the diagnosis, treatment, and prognostic assessment of CS.

In the CS dataset, we identified three potential biomarkers related to angiogenesis and ISR (HSPA8, LMNA, and SERPINH1) by integrating differential expression analysis, WGCNA, and cross-dataset validation. HSPA8, as a highly conserved molecular chaperone, may play an important role in the pathological progression of chondrosarcoma by linking ISR to angiogenesis52,53. Within the ISR, HSPA8 is a core regulator of cellular responses to stresses such as hypoxia and protein misfolding54. It mitigates endoplasmic reticulum stress by recognizing and facilitating the proper folding of unfolded proteins and collaborating with the ubiquitination system to clear abnormal proteins, thereby preventing apoptosis caused by excessive ISR activation55. Furthermore, HSPA8 interacts with key ISR transcription factors such as ATF4 to regulate downstream target genes such as HIF-1α, balancing stress adaptation and cell survival56. In terms of angiogenesis, HSPA8 promotes the formation of tumor vascular networks through multiple mechanisms. It can stabilize the conformation of angiogenic factors such as VEGF and PDGF and promote their secretion, enhance the adaptability of endothelial cells to stresses such as hypoxia (for example, by promoting migration and lumen formation through the PI3K/Akt pathway)57, and activate MMPs (such as MMP-2) to accelerate the degradation of the extracellular matrix58, providing space for the invasion of new blood vessels. Studies have found that significant overexpression of HSPA8 in tumor tissues is considered an adaptive response of cells to microenvironmental stresses (such as endoplasmic reticulum stress and hypoxia)59. This study also observed significant overexpression of HSPA8 in chondrosarcoma tissues. This suggests that the upregulation of HSPA8 may be an important molecular event in the response of chondrosarcoma cells to tumor microenvironmental stress, providing a basis for further investigation into its role in the development and progression of chondrosarcoma. In extraskeletal myxoid chondrosarcoma (EMC), fusion of HSPA8 and NR4A3 can abnormally activate the NR4A3 pathway, forming a vicious cycle of enhanced ISR and promoted angiogenesis60,61. Specifically, HSPA8 alleviates stress by assisting collagen folding to maintain tumor cell survival, while promoting the secretion of factors such as VEGF by stabilizing HIF-1α62. Activation of MMP-13 accelerates cartilage matrix degradation, providing a route for vascular invasion, and may ultimately promote the invasion and progression of chondrosarcoma through ISR adaptation and vascular support63,64.

The LMNA gene, a member of the lamin family, encodes lamin A/C. Lamin A/C is the main component of the nuclear lamina, providing structural support for the nucleus and participating in a variety of cellular processes, such as DNA replication, transcriptional regulation, cell cycle progression and apoptosis65,66. The occurrence of chondrosarcoma is related to chromosomal abnormalities, and the protein encoded by LMNA is essential for maintaining the stability of nuclear structure and function; its abnormal function may lead to instability of cellular genetic material67. The SERPINH1 gene, also known as HSP47, is a member of the serpin family and is located on chromosome 11q13.568. The protein encoded by this gene is a molecular chaperone that specifically recognizes and binds to unfolded collagen molecules, helping them to fold correctly and eventually form functional fibers, and plays a key role in the synthesis of collagen in the extracellular matrix69,70. The main feature of chondrosarcoma is the production of cartilage matrix, of which collagen is an important component. Therefore, SERPINH1 may be indirectly related to the occurrence and development of chondrosarcoma through its effect on collagen synthesis71,72. In summary, the three potential biomarkers might aid in the understanding and management of CS.

We also identified a total of 9 transcription factors (TFs) that can regulate the potential biomarkers. Signal transducer and activator of transcription 1 (STAT1), which regulates all three potential biomarkers, may be a key TF. STAT1 is a core cross-regulatory node integrating the ISR and angiogenesis. Within the ISR, it is directly activated by stress signals (such as IFN-γ and oxidative stress)73. By regulating molecular chaperones like HSPA8 and target genes like CHOP74,75, it enhances cellular stress adaptation and balances survival and apoptosis. During angiogenesis, it directly binds to the VEGF and PDGF promoters76,77, synergizing with HIF-1α to promote factor secretion, thereby regulating endothelial cell survival, permeability, and vascular maturation. STAT1 also regulates the growth and apoptosis of chondrocytes. Studies have found that STAT1 can mediate the inhibitory effect of fibroblast growth factor (FGF) on chondrocyte growth78. FGF signaling can induce STAT1 phosphorylation. STAT1 deficiency can significantly correct the chondrodysplasia phenotype of mice overexpressing FGF2, and restore chondrocyte proliferation and apoptosis to near normal levels79,80. As a malignant tumor, chondrosarcoma grows and progresses in close relation to the body’s immune status81. STAT1 may affect the immune surveillance and tumor microenvironment of chondrosarcoma by regulating immune responses82. Studies have suggested that STAT1 may be involved in the NF-κB pathway, which plays an important role in the proliferation, survival, invasion and metastasis of tumor cells83. NF-κB pathway activation is abnormal in chondrosarcoma84. Notably, STAT1 is predicted to have many binding sites with the three potential biomarkers, which provides a direction for further elucidation of their regulatory relationships. Considering the widespread expression of TFs in humans, their specific regulation merits further exploration.

In this study, we predicted small molecule compounds (ADP, lonafarnib and threo-3-methylglutamic acid) based on potential biomarkers. The study showed that some compounds have therapeutic effects on CS in basic experiments. These compounds may provide ideas for the development of new targets for anti-CS drugs. For example, lonafarnib is an oral small molecule targeted drug, and its mechanism of action is mainly to inhibit the activity of farnesyl transferase. Farnesyl transferase can catalyze the transfer of farnesyl groups to Ras protein85. The Ras protein family regulates cell signaling, influencing growth, proliferation, differentiation, and apoptosis. Abnormal Ras signaling activation is linked to the occurrence and progression of many tumors. Therefore, lonafarnib can inhibit farnesyl transferase, block the farnesyl modification of Ras protein, and then inhibit the abnormally activated Ras signaling pathway to exert anti-tumor effects86,87.

Tumorigenesis heavily depends on the tumor microenvironment (TME), which includes tumor cells, stromal cells, and immune cells. Stromal cells include cancer-associated fibroblasts (CAFs), cancer-associated adipocytes (CAAs), and tumor endothelial cells (TECs), among others88. Studies have found that periostin from cancer-associated fibroblasts can promote chondrosarcoma cell survival, migration and lymphangiogenesis89. Vascular endothelial growth factor (VEGF), produced by vascular endothelial cells, is vital to the growth and metastasis of chondrosarcoma8,90. Resistin, a cytokine secreted by adipocytes, enhances the expression of VEGF-C in human chondrosarcoma cells and lymphangiogenesis associated with lymphatic endothelial cells (LECs), thereby enhancing the distant metastasis of chondrosarcoma91. In summary, stromal cells affect the occurrence, development and metastasis of CS, and this effect may be related to angiogenesis to a certain extent. MAPK mediates apoptosis and cell cycle arrest in human chondrosarcoma cells92. Visfatin promotes the metastatic potential of chondrosarcoma cells by stimulating the production of AP-1-dependent MMP-2 in the MAPK pathway93. Treatment of chondrosarcoma cells with MAPK inhibitors reduced the levels of MMP-1, MMP-13 and TIMP-1, thereby inhibiting the invasion of chondrosarcoma94. The MAPK pathway might therefore represent a novel target for treating chondrosarcoma. However, the specific molecular mechanism by which FCERI-mediated MAPK activation affects chondrosarcoma needs further study.

Conclusion

Overall, this study explored angiogenesis- and ISR-related genes in CS by integrating bulk RNA sequencing and single-cell RNA sequencing data, and examined their biological functions and potential regulatory mechanisms, providing a new perspective for the treatment and drug development of CS. In clinical diagnosis, detecting the levels of HSPA8, LMNA, and SERPINH1 in patients may help support CS diagnosis and the assessment of disease progression. The single-cell analysis results deepen our understanding of CS pathogenesis. However, this study has limitations. Although public disease databases were combined, the sample size of the datasets used for bioinformatics analysis remains limited. The mechanisms and pathways of the potential biomarkers in CS, as well as the potential therapeutic compounds, should be verified through further biological model experiments and more clinical samples. Therefore, further research is necessary to clarify the molecular mechanisms linked to angiogenesis and ISR in CS.