Introduction

Bladder cancer (BLCA) ranks among the most frequent malignant tumors of the urinary tract worldwide. It is characterised mainly by painless haematuria, and its morbidity and mortality differ significantly between regions and populations. Globally, there are approximately 550,000 new cases and 200,000 deaths per year 1. Risk factors for the development of BLCA include smoking, occupational exposure to aromatic amines, chronic cystitis and pelvic radiotherapy; of these, smoking is the most significant. Currently, the gold standard for the diagnosis of BLCA is cystoscopic biopsy, and computed tomography (CT) or magnetic resonance imaging (MRI) can be used as an adjunctive test to assess tumor infiltration and metastasis 2. In terms of treatment, patients with non-muscle-invasive bladder cancer typically undergo transurethral resection of the bladder tumor (TURBT) followed by risk-stratified intravesical adjuvant therapy, with an overall survival rate of up to 90%, whereas patients with muscle-invasive bladder cancer are mainly treated with radical cystectomy followed by systemic adjuvant therapy, and the cure rate remains low 3. Over the past decade, despite great progress in the molecular typing and targeted therapy of BLCA, its high recurrence rate and poor prognosis have not improved significantly. In-depth exploration of the pathogenesis of BLCA, the search for reliable biomarkers, and the development of novel therapeutic strategies therefore remain the focus of current research.

Mitochondrial autophagy (mitophagy) is a selective form of autophagy in which the autophagosome selectively engulfs dysfunctional mitochondria and delivers them to the lysosome for degradation; it mainly involves selective recognition, engulfment by the autophagosome, and lysosomal degradation 4. Its role is to ensure the efficient removal of malfunctioning mitochondria under stress or physiological conditions, to reduce the accumulation of reactive oxygen species, and to balance the quality and quantity of mitochondria in the cell, thereby maintaining cellular metabolic homeostasis. Mitochondrial autophagy has a ‘double-edged sword’ effect in cancer cells: some tumor cells sustain aberrant proliferation by inhibiting mitochondrial autophagy, while others rely on it for survival 5. Studies have reported that DARS2 promotes BLCA progression by enhancing PINK1-mediated mitochondrial autophagy 6, that corosolic acid inhibits BLCA cell proliferation in vivo and in vitro by inducing mitochondrial autophagy 7, and that histone lactylation regulates PRKN-mediated mitochondrial autophagy to promote M2 macrophage polarisation in BLCA 8. However, studies on mitochondrial autophagy in BLCA are currently scarce, and the exact molecular mechanism of its action remains unclear and deserves further investigation.

Therefore, in this study, based on BLCA transcriptome data combined with mitochondrial autophagy-related genes, we performed differential expression analysis, univariate Cox regression with a proportional hazards (PH) assumption test, and LASSO regression to screen genes related to BLCA prognosis and calculate risk scores. A prognostic model was then constructed and validated. Finally, differences in the immune microenvironment, immunotherapy response, gene mutation and drug sensitivity between risk groups were explored to provide new theoretical support and a reference for the treatment of BLCA.

Results

Acquisition of 42 candidate genes

In the TCGA-BLCA dataset, 12,479 differentially expressed genes (DEGs) were identified, with 8,037 genes upregulated and 4,442 genes downregulated in the BLCA group (Fig. 1a–b). Enrichment analysis demonstrated that the DEGs were significantly associated with 491 Gene Ontology (GO) terms and 60 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The enriched GO terms included “extracellular structure organization” and “contractile fiber” (adj.P < 0.05) (Fig. 1c, Supplementary Table S1–3). The enriched KEGG pathways primarily included the “cGMP-PKG signaling pathway” and “Axon guidance” (adj.P < 0.05) (Fig. 1d, Supplementary Table S4).
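The adj.P values used to filter GO terms and KEGG pathways are P values corrected for multiple testing. The sketch below illustrates one common correction, the Benjamini–Hochberg procedure. That this is the method behind the reported adj.P values is an assumption (it is the default in clusterProfiler), and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Assumption: illustrative only; the source reports adj.P without naming
    the correction method.
    """
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_terms(term_pvalues, cutoff=0.05):
    """Return the names of terms whose adjusted P value is below the cutoff."""
    names = list(term_pvalues)
    adjusted = bh_adjust([term_pvalues[name] for name in names])
    return [name for name, adj in zip(names, adjusted) if adj < cutoff]
```

A term is retained only when its adjusted value falls below the cutoff (0.05 in this study), which is stricter than filtering on the raw P value.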

Fig. 1

Acquisition of 42 candidate genes. (a) Volcano plot of differentially expressed genes between the BLCA and Control groups. Genes in the upper right corner are upregulated (red), genes in the upper left corner are downregulated (blue), and the remaining genes are not statistically significant (gray). (b) Heatmap of gene expression levels. Red indicates high expression and blue indicates low expression. (c) Gene Ontology (GO) enrichment analysis: biological processes (BP), cellular components (CC), and molecular functions (MF). Each circle represents a term, the size of the circle represents the number of genes enriched in the term, and the color of the circle represents the significance of the enrichment. (d) Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. (e) Candidate key genes. The blue circle represents DEGs, the orange circle represents MRGs, and the overlap represents genes present in both sets. (f) PPI network of candidate key genes. Each circle represents a candidate key gene, and the darker the color, the more genes it interacts with.

By intersecting the DEGs with mitophagy-related genes (MRGs), 42 candidate genes were identified (Fig. 1e). In addition, a protein–protein interaction (PPI) network was constructed. The network consisted of 42 nodes and 157 edges, with SRC interacting closely with several other genes (Fig. 1f).

Identification of CTSK, MTERF3, SRC, and CSNK2B as biomarkers

A total of 5 candidate biomarkers (CTSK, RHOT2, MTERF3, SRC, and CSNK2B) were identified through univariate Cox regression analysis (HR ≠ 1 and P < 0.05) and PH assumption test (P > 0.05) (Fig. 2a, Supplementary Fig. S1a). The results of k-fold cross-validation show that the Cox model achieves a high and stable C-index (0.323) under tenfold cross-validation, indicating good predictive performance and strong generalization ability (Supplementary Fig. S1b). A total of 4 biomarkers (CTSK, MTERF3, SRC, and CSNK2B) were then selected using LASSO, with a minimum lambda of 0.014 as the threshold (Fig. 2b).
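The C-index reported for the cross-validated Cox model measures how well predicted risk orders patients by survival time. The following is a minimal sketch of Harrell's concordance index, where 1.0 is perfect ordering and 0.5 is what random scores would give. The function name and the simple handling of tied times are assumptions for illustration, not the authors' implementation.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk score (higher means worse expected survival)

    A pair is comparable when the patient with the shorter follow-up
    had an observed event. Pairs with tied times are skipped (assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk died earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

In k-fold cross-validation, this value would be computed on each held-out fold and then summarised across folds.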

Fig. 2

Identification of CTSK, MTERF3, SRC, and CSNK2B as biomarkers. (a) A total of 5 candidate biomarkers (CTSK, RHOT2, MTERF3, SRC, and CSNK2B) were identified through univariate Cox regression analysis (HR ≠ 1 and P < 0.05) and PH assumption test (P > 0.05). (b) LASSO regression analysis to screen prognostic genes. (c) Survival curves for patients in the high-risk and low-risk groups. The horizontal axis represents the follow-up time and the vertical axis represents the survival probability; the lower panel shows the actual survival time range and follow-up time points of patients. (d) Distribution of risk scores and survival status in patients from TCGA-BLCA. Each point is a BLCA sample. (e) ROC curve analysis evaluating the effectiveness of the model in predicting 1-, 3- and 5-year survival. (f–h) Validation of the prognostic risk model in the GSE32894 dataset.

Development of a risk model with high predictive accuracy

Based on 4 biomarkers, a risk model was constructed. Based on the median risk score value of 0.0134, the 406 BLCA samples in the TCGA-BLCA dataset were classified into 203 high-risk group (HRG) samples and 203 low-risk group (LRG) samples. Kaplan–Meier (KM) curves showed that the survival rate was considerably reduced in the HRG (P < 0.001) (Fig. 2c). Risk curves and survival status maps further showed that the HRG experienced more deaths and had a higher mortality rate than the LRG (Fig. 2d). ROC curve analysis demonstrated that the AUC values for TCGA-BLCA were all greater than 0.6, indicating that the risk model exhibited stable predictive performance (Fig. 2e). Additionally, based on the median risk score of -0.0718, the 224 BLCA samples in the GSE32894 dataset were divided into 112 HRG and 112 LRG samples, and analyzed in the same way, producing results consistent with those from the TCGA-BLCA dataset (Fig. 2f–h). These results emphasized the reliability of the risk model’s predictions, showcasing its potential for guiding personalized treatment in BLCA.
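The grouping step above can be sketched as follows, assuming the usual LASSO-Cox form in which the risk score is the sum of each biomarker's expression multiplied by its model coefficient. The coefficients are not given in this section, so any values passed in are hypothetical, and assigning scores strictly above the median to the HRG is also an assumption.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the model genes.

    expression   : dict gene -> expression value for one patient
    coefficients : dict gene -> LASSO-Cox coefficient (hypothetical values)
    """
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores):
    """Split patients at the median risk score.

    scores : dict sample_id -> risk score
    Returns (cutoff, dict sample_id -> "HRG" or "LRG").
    Assumption: scores strictly above the median are high risk.
    """
    cutoff = median(scores.values())
    groups = {
        sample: ("HRG" if score > cutoff else "LRG")
        for sample, score in scores.items()
    }
    return cutoff, groups
```

With an even number of samples and no tied scores at the cutoff, this split yields two groups of equal size, as in the 203/203 and 112/112 partitions reported above.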

Association between risk score and clinical features

The risk score was strongly associated with N stage, grade, T stage, stage, and survival status (P < 0.05). The risk score increased with higher pathological grade, progression of T and N stages, and advancement of tumor stage (Fig. 3).

Fig. 3

Association between risk score and clinical features. Each violin in this figure represents a set of pathological features, and the width of the violin corresponds to the density of the data at that point. P < 0.05 indicates significant difference between the two groups. ns indicates p > 0.05, ***indicates p < 0.001.

Exploration of signaling pathways

Gene set enrichment analysis (GSEA) results indicated that the main signaling pathways significantly enriched for DEGs between the HRG and LRG were “chemical carcinogenesis-DNA adducts” and “cytokine-cytokine receptor interaction” (Fig. 4a, Supplementary Table S5). Pathway correlation analysis indicated that there were common genes between the “viral protein interaction with cytokine” pathway and the “cytokine receptor” pathway, as well as between the “cytokine-cytokine receptor interaction” pathway and the “hematopoietic cell lineage” pathway (Fig. 4b). These pathways interacted in the development of BLCA.

Fig. 4

Exploration of signaling pathways. (a) GSEA of the high-risk and low-risk groups. The main pathways enriched in the high-risk group are shown. (b) Relationships between the KEGG signaling pathways enriched by GSEA. Each point represents a different KEGG pathway, and the position and connection of the points represent the correlation and regulation between these pathways.

Significant association between risk score and immune microenvironment

The infiltration of 22 types of immune cells is presented in the stacked plot (Fig. 5a). Among them, 14 immune cell types, such as CD8 T cells and M2 macrophages, differed significantly in their scores between the HRG and LRG (P < 0.05) (Fig. 5b). Additionally, Spearman correlation analysis revealed that CTSK exhibited the strongest correlation with naive B cells (cor = 0.33, P < 0.001), while SRC exhibited the strongest correlation with regulatory T cells (Tregs) (cor = 0.33, P < 0.001) (Fig. 5c, Supplementary Table S6). SRC was more strongly associated with the CD4 T cell subtype, MTERF3 and CTSK with the macrophage subtype, and CSNK2B with the mast cell subtype (Fig. 5d).
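The gene-to-cell correlations above are Spearman coefficients, i.e. Pearson correlations computed on ranks. The sketch below shows the calculation for one gene and one immune cell type; the function names are hypothetical, and P values are not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(gene_expression, cell_fraction):
    """Spearman correlation between a gene and an immune cell score."""
    return pearson(average_ranks(gene_expression), average_ranks(cell_fraction))
```

Because only ranks are used, the coefficient captures any monotonic relationship between expression and infiltration, not just a linear one.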

Fig. 5

Significant association between risk score and immune microenvironment. (a) The abundance of 22 immune cells in the high and low risk groups. (b) Differences in infiltration scores of 22 immune cells between high and low risk groups. The length of the box and whiskers in the figure can intuitively reflect the distribution and differences of different immune cells between the two groups. ns indicates p > 0.05, * indicates p < 0.05, ** indicates p < 0.01, ***indicates p < 0.001, **** indicates p < 0.0001. (c) Correlation analysis between prognostic genes and 22 kinds of immune cell infiltration. The rows in the heat map represent different prognostic genes and the columns represent different types of immune cells. The color changes indicate the strength of the correlation, with the gradient from blue (negative correlation) to red (positive correlation). (d) Circular tree diagrams show the correlation between prognostic genes and immune cell infiltration in different subtypes of tumors. Each axis represents a subtype of immune cell, and the correlation between prognostic genes and immune cells is indicated by points. The size of the points generally correlates with the strength of the correlation. Colors can be used to distinguish types of immune cells.

Expression of biomarkers and immune response correlation

The expression of 26 immune checkpoint genes differed significantly between the HRG and LRG. In the HRG, most of these immune checkpoint genes were expressed at higher levels, such as CTLA4, LAG3, and PDCD1 (Fig. 6a). Additionally, patients with high and low expression of CTSK, SRC, and CSNK2B showed differences in their response to immunotherapy (Fig. 6b–e). These varying immunotherapy responses were significant for the treatment of BLCA.

Fig. 6

Expression of biomarkers and immune response correlation. (a) Differential expression analysis of immune checkpoint genes in the high-risk and low-risk groups. The length of the box and whiskers reflects the distribution of, and difference in, immune checkpoint gene expression between the two groups. (b–e) The relationship between prognostic gene expression and patient response to immunotherapy. The vertical axis is the immune phenotype score associated with CTLA4 and PD-1, and the horizontal axis shows the prognostic gene groups split at the median gene expression value. ns indicates p > 0.05, * indicates p < 0.05, ** indicates p < 0.01, *** indicates p < 0.001, **** indicates p < 0.0001.

High mutation of TP53

In both the HRG and LRG, TP53 exhibited the highest mutation frequency, with missense mutations being the most common (Fig. 7a–b). When TP53 is mutated, the encoded p53 protein can lose its function and may even acquire oncogenic properties, making cells more prone to malignant transformation and uncontrolled proliferation and thereby driving tumor formation and progression.

Fig. 7

Genetic mutation and drug sensitivity analysis. (a) Waterfall plots of somatic mutations in the high-risk and low-risk groups. (b) Mutation sites of the TP53 gene in the high-risk and low-risk groups. In the lollipop plot, the horizontal axis represents the position along the gene (exon) or protein (amino acid), and the vertical axis represents the frequency or effect of the mutation. The length and color of each stick indicate the type of mutation or the severity of its effect. (c) IC50 sensitivity analysis of chemotherapeutic drugs in the high-risk and low-risk groups. The 5 drugs with the smallest P values are shown.

Revealing drug sensitivity

The drug sensitivity analysis revealed that 135 drugs differed in sensitivity between the HRG and LRG (P < 0.05). The drugs KU.55933, NU7441, AZD8186, SB216763, and BMS.754807 exhibited higher IC50 values in the HRG (Fig. 7c).

Exploring the molecular regulatory mechanisms of biomarkers

A total of 84 differentially expressed miRNAs, 915 differentially expressed lncRNAs, and 4035 differentially expressed circRNAs were identified (Fig. 8a). After integrating with data from databases, 2 key miRNAs, 42 key lncRNAs, and 1124 circRNAs were obtained (Fig. 8b). The identified regulatory networks included ADAMTSL4-AS1-hsa-miR-149-5p-SRC and DMD-hsa-miR-149-5p-SRC (Fig. 8c). These potential regulatory mechanisms had implications for the development of BLCA.

Fig. 8

Exploring the molecular regulatory mechanisms of biomarkers. (a) Volcano plots of differentially expressed miRNAs, lncRNAs, and circRNAs, showing the significance and fold change of differential expression. (b) Venn diagrams of the targeting miRNAs, lncRNAs and circRNAs. (c) ceRNA network constructed from the targeting lncRNAs, circRNAs, miRNAs, and prognostic genes. Red dots represent targeting lncRNAs and circRNAs, yellow dots represent targeting miRNAs, and blue dots represent prognostic genes. The lines connecting the dots represent the regulatory relationships between genes.

Biomarker expression and survival analysis

Expression analysis showed that CTSK was significantly downregulated in the BLCA group, while MTERF3, SRC, and CSNK2B were significantly upregulated (Supplementary Fig. S2a). KM curve analysis revealed that patients in the high expression group of CTSK had poor survival outcomes, whereas patients in the high expression groups of MTERF3, SRC, and CSNK2B had better survival outcomes (Supplementary Fig. S2b).
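The KM curves compared above are built with the Kaplan–Meier product-limit estimator, applied separately to the high- and low-expression groups. A minimal sketch of the estimator for a single group follows; the function name is hypothetical, and the log-rank comparison between groups is not included.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group of patients.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimator can use patients with incomplete follow-up.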

Verification of expression of CTSK, MTERF3, SRC and CSNK2B

Quantitative RT-PCR analysis revealed significantly elevated mRNA expression levels of MTERF3, SRC and CSNK2B in bladder tumor cells compared to normal bladder cells (Supplementary Fig. S3), which is consistent with our bioinformatics analysis of public datasets. However, there was no statistically significant difference in the expression of CTSK.

Discussion

BLCA is one of the most common tumors of the urinary tract, and the overall prognosis for patients with intermediate to advanced disease remains poor 9. Mitochondrial autophagy is a mechanism by which cells selectively remove dysfunctional or damaged mitochondria to safeguard the functional integrity of the mitochondrial network 10. In different cancers, dysregulation of mitochondrial autophagy is closely associated with tumorigenesis, progression and drug resistance. Mitochondrial autophagy is closely related to prostate cancer, and mitochondrial autophagy-related genes provide new genetic markers for predicting the prognosis of patients with prostate cancer 11. In hepatocellular carcinoma (HCC), Parkin deficiency decreased the level of mitochondrial autophagy, weakened the cytotoxicity of CD8+ T cells and increased the proportion of exhausted T cells, forming an immunosuppressive network that accelerated tumor growth 12. In urological cancers, dysregulation of this pathway leads to the accumulation of mitochondrial damage, which in turn promotes tumor cell proliferation and the development of drug resistance 13. In bladder cancer cells, lncRNA-RP11-498C9.13 promotes ROS generation and mitochondrial autophagy by stabilizing PYCR1 mRNA, driving tumorigenesis, and this axis is expected to be a potential target for bladder cancer therapy 14.
In this study, to explore the molecular links between mitochondrial autophagy and BLCA, we obtained clinical information and gene expression data of BLCA from public databases, screened four prognosis-related genes by LASSO regression analysis, and calculated patient risk scores based on these genes. Through construction and validation of the risk model, functional enrichment analysis, tumor immune microenvironment assessment, somatic mutation profiling, pharmacogenomic sensitivity evaluation, molecular regulatory network construction, and expression level analysis, we provide new insights for exploring the treatment of BLCA.

Four biomarkers (CTSK, MTERF3, SRC and CSNK2B) were obtained in this study. CTSK (Cathepsin K) is a lysosomal cysteine protease that degrades the extracellular matrix, is involved in osteoclast-mediated bone resorption, and may be partially involved in impaired bone remodelling 15. Research has shown that glucocorticoids upregulate CTSK in osteoblasts through PINK1-mediated mitochondrial autophagy, and that aberrant accumulation of CTSK may interfere with lysosomal function, indirectly affecting mitochondrial autophagy 16. CTSK has also been found to inhibit mitochondrial autophagy through the mTOR pathway, leading to accumulation of damaged mitochondria and oxidative stress and thereby promoting tumor survival 15. CTSK may further contribute to reactive oxygen species (ROS) accumulation through degradation of antioxidant proteins or promotion of metabolic reprogramming. Excessive ROS can activate mitochondrial autophagy to scavenge damaged mitochondria, which in turn promotes bladder cancer development via the lncRNA-RP11-498C9.13/PYCR1 axis 14, 16. Therefore, CTSK may affect bladder cancer by regulating mitochondrial autophagy and mediating oxidative stress and metabolic reprogramming, among other mechanisms. Numerous findings have linked CTSK to various tumors, such as Helicobacter pylori-associated gastric cancer 17 and prostate cancer 18. In addition, survival analysis showed that the risk of death in gastric cancer (GC) patients with elevated CTSK expression was 1.73-fold higher than in patients with low CTSK expression. High CTSK expression also markedly increased stromal and immune cell infiltration in the tumor microenvironment (TME), which was closely associated with GC (notably early lymph node involvement and survival prognosis), and patients in the CTSK high-expression group had significantly lower overall survival (OS) and recurrence-free survival (RFS) than those in the low-expression group 19.
It has also been shown that CTSK is lowly expressed in BLCA tumor tissues, and this has been validated in clinical samples. In this study, CTSK was likewise lowly expressed in BLCA tumor tissues, which is consistent with the results of a previous study 20. However, the KM curve analysis showed that patients in the high CTSK expression group had poorer survival outcomes. This paradoxical association between high CTSK expression and adverse prognosis may stem from two factors. First, stage discrepancy: gene pathways associated with invasion, such as extracellular matrix degradation and cell migration, are significantly enriched in muscle-invasive bladder cancer (MIBC), and CTSK, a key regulator of these pathways, may be expressed at a higher level in MIBC than in non-muscle-invasive bladder cancer (NMIBC) 21. Second, functional biphasicity: in a study of colorectal cancer development, an imbalance in the intestinal microbiota induced tumor cells to secrete CTSK at an early stage in an attempt to limit aberrant proliferation and maintain tissue homeostasis 22. When the tumor progresses to advanced stages, the role of CTSK shifts markedly, and CTSK promotes tumor metastasis through degradation of the extracellular matrix. In a variety of cancer types, such as colorectal 23, prostate, and ovarian cancers 15, high CTSK expression is closely associated with tumor aggressiveness and poor prognosis.

MTERF3 (Mitochondrial Transcription Termination Factor 3), the most conserved MTERF family member, negatively regulates mtDNA transcription 24. MTERF3 may affect the functional state of mitochondria by regulating mitochondrial transcription and protein synthesis. When mitochondrial function is impaired, MTERF3 may be involved in the regulation of mitochondrial autophagy and help to remove damaged mitochondria 25. It has also been shown that knockdown of MTERF3 inhibits lung cancer cell proliferation and migration, enhances mitochondrial autophagy, and increases mitochondrial superoxide production 26. In bladder cancer, inhibition of the metabolic regulator AMPKα promotes mTORC1 activation through a macrophage-mediated mechanism, and MTERF3 may indirectly regulate this pathway by maintaining mitochondrial function. Taken together, MTERF3 may influence bladder cancer by regulating mitochondrial transcription and protein synthesis, by participating in mitochondrial autophagy to remove damaged mitochondria, and by indirectly regulating the AMPKα/mTORC1 pathway through the maintenance of mitochondrial function. In addition, integrated IHC and RNA-seq analysis revealed significantly higher MTERF3 protein and transcript levels in adjacent non-tumorous tissues than in thyroid carcinoma (THCA) specimens 27. Survival analyses showed that reduced expression of MTERF3 was associated with poorer prognosis. Furthermore, elevated MTERF3 expression showed a significant inverse correlation with cancer stemness properties and chemotherapeutic response in THCA 27.

SRC (Src Kinase), a tyrosine kinase, exerts its pro-carcinogenic function mainly by catalysing the tyrosine phosphorylation of various protein substrates 28. One study constructed a risk model based on genes including SRC to predict clinical outcomes in BLCA, and patients stratified into the high-risk category showed a low survival rate. When BLCA patients were divided into high and low expression groups based on SRC expression, survival differed significantly between the two groups, suggesting that SRC is associated with survival in BLCA 29. SRC is a signature gene associated with mitochondrial autophagy 30. In research on urothelial carcinoma (UC), SRC was found to help tumor cells resist the killing effect of chemotherapeutic drugs by maintaining mitochondrial function and inhibiting apoptotic signaling. Because most UC cases (about 90–95%) occur in the bladder, this suggests that SRC may play a role in bladder cancer development by affecting mitochondrial autophagy-related pathways 31. CSNK2B (Casein Kinase 2 Beta Subunit) is a widely expressed protein kinase that modulates diverse cellular processes such as proliferation, differentiation and survival through serine/threonine phosphorylation of downstream targets 32. Emerging evidence suggests that CSNK2B plays a key role in various malignancies. Elevated CSNK2B expression has been found to activate NF-κB signalling in hepatocellular carcinoma (HCC), thereby promoting cell proliferation and inhibiting apoptosis 33. In a breast cancer (BC) study, higher CSNK2B expression was significantly associated with poor prognosis in BC patients 32. It has also been shown that CSNK2B is associated with survival in cervical cancer 34.
CSNK2B is closely related to mitochondrial autophagy: it may regulate the phosphorylation status of mitochondria-related proteins, which in turn affects mitochondrial stability and promotes the initiation of mitochondrial autophagy 35. In colorectal cancer, CSNK2B enhances cell proliferation by activating the mTOR signaling pathway 32, suggesting that CSNK2B may regulate mitochondrial autophagy and tumor cell proliferation through a similar mechanism in bladder cancer and thereby affect its progression.

GSEA showed that the key signaling pathways significantly enriched in differentially expressed genes between the HRG and LRG were ‘chemical carcinogenesis-DNA adducts’ and ‘cytokine-cytokine receptor interaction’. The chemical carcinogenesis-DNA adducts pathway refers to the covalent binding of carcinogenic chemicals to bases in the DNA molecule, resulting in the formation of chemical-DNA adducts. The formation of these adducts is a key step in chemical carcinogenesis 36 and contributes to the development of cancer by causing DNA damage, gene mutations and chromosomal aberrations. It has been shown that the chemical carcinogenesis-DNA adducts pathway is significantly enriched in BLCA 37. Our study likewise found this pathway to be associated with BLCA, supporting a possible role for it in the disease. In addition, this pathway is also associated with other cancers, such as colorectal cancer 38 and glioma 39. The association of cytokine-cytokine receptor interactions with mitochondrial autophagy has been reported in several studies. The interaction of the cytokine IL-1β with its receptor IL-1R promotes mitochondrial autophagy by activating the NLRP3 inflammasome. IL-1β-induced activation of the NLRP3 inflammasome after LPS stimulation of macrophages was found to depend on the modulation of mitochondrial autophagy 40, while inhibitory cytokines limited excessive inflammation by modulating autophagy 41. One study found that MSCs in the tumor microenvironment are able to produce the tryptophan metabolite kynurenine (Kyn), which in turn activates the AMPK signaling pathway.
This enhanced mitochondrial autophagy and mitochondrial biogenesis in bladder cancer cells, increasing energy production and thus promoting the proliferation, migration and invasion of cancer cells, which suggests that mitochondrial autophagy supports the survival and progression of tumor cells to a certain extent 42. Other researchers developed in situ convertible nanoparticles, KCKT, which prevent damaged mitochondria from being cleared by autophagy, exacerbate oxidative damage and thereby effectively inhibit bladder cancer, highlighting the importance of mitochondrial autophagy in bladder cancer therapy 43. These mechanisms suggest a link between the signaling pathways enriched for the biomarkers and mitochondrial autophagy, and indicate that these pathways may influence bladder cancer development by affecting mitochondrial autophagy.

The analysis of immune cell infiltration profiles in the high- and low-risk patient subgroups showed significant differences in the scores of 14 immune cell subsets, particularly CD8+ T lymphocytes and M2-polarized macrophages. Macrophages are highly heterogeneous and are classified into M1 and M2 types according to their activation status and function; a high abundance of M2 macrophages indicates a poor prognosis for bladder cancer patients and has a significant impact on the development of malignant tumors 44. In addition, studies have shown that the signature genes CTSK, MTERF3, SRC and CSNK2B are closely related to various immune cells and immune functions, and they may be involved in the regulation of immune-inflammatory responses in BLCA.

Analysis of TP53 mutations in the high- and low-risk groups revealed that TP53 had the highest mutation frequency in both groups and that most of these mutations were missense mutations, while the high-risk group had more mutation sites. TP53 has been reported to be central to cancer-associated cellular functions, and its mutation can render the p53 protein dysfunctional and oncogenic, driving tumor development 45. TP53 mutations have been shown to be associated with BLCA progression and poor prognosis 46. The present study further supports a close association between TP53 mutation and BLCA pathogenesis. We also compared drug sensitivity between the two groups and found 135 agents with differing sensitivity (such as KU.55933, NU7441, AZD8186, SB216763, and BMS.754807). NU7441 has been found to affect triple-negative breast cancer genes and signalling pathways 47. In addition, expression analysis revealed that CTSK was lowly expressed in the bladder cancer group, whereas MTERF3, SRC, and CSNK2B were highly expressed. The qRT-PCR results were largely consistent with this, except for CTSK. Kaplan–Meier analysis demonstrated significantly worse survival in CTSK-high patients, while favorable outcomes were observed in cases with elevated expression of MTERF3, SRC, and CSNK2B.

In conclusion, four biomarkers associated with MRGs were identified in this study by bioinformatics methods: CTSK, MTERF3, SRC and CSNK2B. A multivariable risk prediction model was established using these biomarkers and subsequently validated in an independent cohort. Biological processes enriched in the high-risk versus low-risk groups, associations with the immune microenvironment, drug sensitivity, and regulatory networks were analysed. These results may provide new insights for the development of new strategies for BLCA treatment. Notwithstanding these findings, certain methodological limitations warrant consideration. Firstly, the sample size was insufficient and there was some heterogeneity between datasets, which may have introduced bias into our results. Secondly, we only validated the expression of the biomarkers at the cellular level and lacked clinical sample validation. In the qRT-PCR experiment, no statistically significant difference in CTSK expression levels was observed. We hypothesize that this result may be attributed to two factors: first, the relatively limited sample size of the experiment resulted in insufficient statistical power to detect potential expression differences; second, the qRT-PCR technique itself has limitations in detection sensitivity and experimental variability, which may have prevented the real expression changes of CTSK from being captured accurately. In future research, we plan to carry out immunohistochemistry (IHC) and Western blot experiments using animal models for in-depth validation of CTSK; overexpression and knockdown experiments to clarify the specific roles of CTSK in BLCA; and functional experiments, such as autophagy activity assays in cellular models, to verify our hypotheses and enhance the reliability and validity of our findings.
In addition, we will continue to focus on BLCA and conduct a large number of experimental mechanism studies and clinical application researches in order to provide new strategies and targets for the diagnosis and treatment of BLCA.

Materials and methods

Data acquisition

The bladder cancer (BLCA) dataset (TCGA-BLCA) was obtained from UCSC Xena (https://xenabrowser.net/datapages/) (accessed on May 9, 2024) and served as the training set. This dataset included 411 BLCA tumor tissue samples (BLCA group) and 19 adjacent normal tissue samples (Control group), with survival information available for 406 of the BLCA samples. Clinical feature data, survival information, somatic mutation data, as well as miRNA, lncRNA, and circRNA data were also downloaded. GSE32894 (GPL6947) was downloaded from the GEO database as the validation set. This dataset included 224 BLCA tumor tissue samples with complete survival information. Additionally, based on existing literature and results from public database searches, a total of 137 MRGs (Supplementary Table S7) were identified 48.

Candidate genes acquisition

DEGs between the BLCA and Control groups in the TCGA-BLCA dataset were identified using the DESeq2 package (v 1.42.0) 49 (|log2 fold-change (FC)| > 0.5 and P < 0.05). The ggpubr package (v 0.6.0) 50 was used to generate a volcano plot of the DEGs, with the top 10 upregulated and top 10 downregulated genes annotated. The pheatmap package (v 1.0.12) 51 was used to generate a heatmap of the same top 10 upregulated and downregulated genes. The clusterProfiler package (v 4.8.3) 52 was used to conduct enrichment analysis of the DEGs, including GO and KEGG enrichment (adj.P < 0.05). The top 5 GO terms for each category and the top 5 KEGG pathways were displayed based on adj.P-value.
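As an illustration of the DEG thresholds stated above (|log2FC| > 0.5 and P < 0.05), the following minimal Python sketch filters a table of differential expression results. It is not the DESeq2 workflow itself; the function name `filter_degs`, the gene labels, and the toy values are hypothetical and chosen only for demonstration.

```python
from typing import Dict, List, Tuple


def filter_degs(results: Dict[str, Tuple[float, float]],
                lfc_cutoff: float = 0.5,
                p_cutoff: float = 0.05) -> Tuple[List[str], List[str]]:
    """Split genes into up- and downregulated DEGs.

    `results` maps a gene label to (log2 fold-change, P-value).
    A gene is kept when |log2FC| > lfc_cutoff and P < p_cutoff.
    """
    up, down = [], []
    for gene, (log2fc, p_value) in results.items():
        if p_value >= p_cutoff or abs(log2fc) <= lfc_cutoff:
            continue  # fails at least one threshold
        (up if log2fc > 0 else down).append(gene)
    # Order each list by effect size, strongest first.
    up.sort(key=lambda g: results[g][0], reverse=True)
    down.sort(key=lambda g: results[g][0])
    return up, down


# Toy input (invented values, not taken from the study).
toy_results = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-0.9, 0.02),
    "GENE_C": (0.3, 0.001),   # effect too small
    "GENE_D": (2.5, 0.20),    # not significant
    "GENE_E": (0.7, 0.04),
}
up_genes, down_genes = filter_degs(toy_results)
print(up_genes, down_genes)
```

Taking the first ten entries of each returned list would correspond to the "top 10" genes annotated in the volcano plot, under the assumption that ranking is by log2FC.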

Candidate genes were selected by intersecting the DEGs with the MRGs using the VennDiagram package (v 1.7.3) 53. A PPI network was established based on the candidate genes using the STRING database (confidence score > 0.4). The result was visualized using Cytoscape (v 3.10.2) 54.

Acquisition of biomarkers and construction of the risk model

In the TCGA-BLCA samples with survival information, a univariate Cox regression analysis was conducted using the survival package (v 3.7.0) 55 to identify candidate biomarkers (HR ≠ 1, P < 0.05), and a forest plot was generated using the forestplot package (v 3.1.3) 56 to visualize the results. k-fold cross-validation was used to assess the performance of the Cox regression model for values of k ranging from 2 to 10, evaluating how well the model predicts outcomes and generalizes across different subsets of the data. For each iteration of cross-validation, the Cox model was fitted on the training dataset and the C-index, a measure of predictive accuracy, was determined on the corresponding validation dataset. The mean and standard deviation (SD) of the C-index were calculated for each value of k, and these results were plotted with error bars to illustrate trends. Candidate biomarkers were then screened using the proportional hazards (PH) assumption test (P > 0.05). The LASSO method, implemented in the glmnet package (v 4.1-8) 57, was then used to pinpoint biomarkers (lambda min). The biomarkers were used to develop a risk model, and the risk score was calculated based on the following formula:

$$\text{Risk score} = \sum\limits_{i = 1}^{n} {Coef(gene_{i} ) \times Expr(gene_{i} )}$$

Here, n represented the number of genes included in the scoring system, Coef(gene_i) denoted the LASSO regression coefficient of the i-th gene, and Expr(gene_i) represented the expression level of the i-th gene. Next, based on the median risk score, the 406 BLCA samples in the TCGA-BLCA dataset were divided into HRG and LRG. The survival package (v 3.7.0) was employed for survival analysis of HRG and LRG, as well as for plotting the KM curve (Log-rank test, P < 0.05). In addition, risk curves and survival status maps were plotted to demonstrate the distribution of patients with BLCA. The timeROC package (v 0.4) 58 was used to generate ROC curves for the BLCA samples at 1, 3, and 5 years, and the AUC values were computed (AUC > 0.6). To validate the risk model, the same analytical approach was used to assess the model in the 224 BLCA samples from the GSE32894 dataset.
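To make the scoring and grouping steps concrete, the sketch below computes the weighted-sum risk score, splits samples at the median, and evaluates Harrell's C-index, the concordance measure mentioned for the cross-validation. It is a simplified illustration rather than the survival/glmnet implementation: the gene labels, coefficients, and survival times are invented, and assigning samples with a score exactly equal to the median to the LRG is an assumption, since the text does not specify how ties were handled.

```python
from statistics import median
from typing import Dict, List, Sequence


def risk_score(coefs: Dict[str, float], expr: Dict[str, float]) -> float:
    """Sum of coefficient * expression over the genes in the model."""
    return sum(coef * expr[gene] for gene, coef in coefs.items())


def split_by_median(scores: Sequence[float]) -> List[str]:
    """Label each sample HRG (score > median) or LRG (otherwise; assumption)."""
    cut = median(scores)
    return ["HRG" if s > cut else "LRG" for s in scores]


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    scores: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the higher score fails first.

    A pair (i, j) is comparable when sample i had an event (events[i] == 1)
    strictly before time j. Tied scores count as half-concordant.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical coefficients and expression values (not from the study).
toy_coefs = {"gene_1": 0.5, "gene_2": -0.25}
toy_expr = [
    {"gene_1": 2.0, "gene_2": 4.0},
    {"gene_1": 4.0, "gene_2": 2.0},
    {"gene_1": 1.0, "gene_2": 0.0},
    {"gene_1": 3.0, "gene_2": 2.0},
]
toy_scores = [risk_score(toy_coefs, e) for e in toy_expr]
toy_groups = split_by_median(toy_scores)
toy_c = harrell_c_index([60, 10, 45, 20], [0, 1, 1, 1], toy_scores)
print(toy_scores, toy_groups, toy_c)
```

In this toy example every comparable pair is ordered correctly by the score, so the C-index equals 1; a value of 0.5 would indicate no better than random ordering.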

Analysis of the association between risk score and clinical characteristics

In the TCGA-BLCA samples with survival information, the association between risk scores and clinical characteristics (including Age, T classification, Gender, N classification, Grade, M classification, TNM stage, and Status) was analyzed using the Wilcoxon test or Kruskal–Wallis test (P < 0.05).
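For readers unfamiliar with the two-group comparison used here and in later sections, the following sketch implements the Wilcoxon rank-sum (Mann-Whitney) test with a tie-corrected normal approximation. It is an illustration only: statistical software may use an exact test or a continuity correction, so P-values can differ slightly, and the function names are hypothetical. The Kruskal–Wallis test for more than two groups is not shown.

```python
import math
from typing import Dict, List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided P-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    counts: Dict[float, int] = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, no continuity correction
    return u, p_value


# Toy risk scores for two clinical subgroups (invented values).
toy_u, toy_p = rank_sum_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(toy_u, round(toy_p, 4))
```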

GSEA

The TCGA-BLCA dataset was analyzed using the DESeq2 package (v 1.42.0) to identify differences between the HRG and LRG. The log2FC was calculated, and the genes were sorted by log2FC from largest to smallest, generating a ranked list of genes associated with HRG and LRG. The reference gene set “c2.kegg.v7.4.symbols” downloaded from the MSigDB was used for GSEA with the clusterProfiler package (v 4.8.3) (|NES| > 1, FDR < 0.25, and adj.P < 0.05). The top 5 pathways were visualized based on |NES| value. The cnetplot function in the clusterProfiler package (v 4.8.3) was also applied to visualize the potential regulatory relationships among the top 5 pathways.
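The two bookkeeping steps around GSEA, building the ranked gene list and selecting the reported pathways, can be sketched as follows. The enrichment computation itself is not reproduced here; the dictionary keys (`NES`, `FDR`, `p_adj`), gene labels, and pathway names are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def ranked_gene_list(log2fc: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort genes by log2FC from largest to smallest."""
    return sorted(log2fc.items(), key=lambda item: item[1], reverse=True)


def top_pathways(results: List[Dict], n_top: int = 5) -> List[Dict]:
    """Keep pathways with |NES| > 1, FDR < 0.25 and adj.P < 0.05,
    then return the n_top entries with the largest |NES|."""
    kept = [r for r in results
            if abs(r["NES"]) > 1 and r["FDR"] < 0.25 and r["p_adj"] < 0.05]
    kept.sort(key=lambda r: abs(r["NES"]), reverse=True)
    return kept[:n_top]


# Invented values for demonstration only.
toy_ranked = ranked_gene_list({"G1": -1.2, "G2": 2.4, "G3": 0.1})
toy_gsea = [
    {"pathway": "PATHWAY_A", "NES": 2.1, "FDR": 0.01, "p_adj": 0.01},
    {"pathway": "PATHWAY_B", "NES": -2.6, "FDR": 0.02, "p_adj": 0.02},
    {"pathway": "PATHWAY_C", "NES": 0.8, "FDR": 0.01, "p_adj": 0.01},
    {"pathway": "PATHWAY_D", "NES": 1.5, "FDR": 0.30, "p_adj": 0.04},
]
toy_top = [r["pathway"] for r in top_pathways(toy_gsea)]
print(toy_ranked, toy_top)
```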

Immune microenvironment analysis

To characterize the immune microenvironment in BLCA, an immune infiltration analysis was first conducted in the TCGA-BLCA samples with survival information. The IOBR package (v 0.99.8) 59 was employed to determine the scores of 22 immune cell types in each sample. The Wilcoxon test was used to compare infiltration scores between the HRG and LRG in order to identify immune cells exhibiting significant differences (P < 0.05) for subsequent investigation. Spearman correlation analysis was conducted to explore the relationships between biomarkers and immune cells using the linkET package (v 0.0.7.4) 60 (|cor| > 0.30, P < 0.05). Furthermore, the correlation between biomarkers and 98 immune cell subtypes was analyzed using the TIMER database.
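As an illustration of the Spearman correlation used to relate biomarker expression to immune cell scores, the sketch below computes the coefficient as the Pearson correlation of tie-averaged ranks. It does not compute the accompanying P-value, and the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    rx, ry = midranks(x), midranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sd_x * sd_y)


# Invented expression and infiltration scores.
toy_expression = [1.0, 2.0, 3.0, 4.0]
toy_infiltration = [0.1, 0.3, 0.2, 0.4]
toy_rho = spearman(toy_expression, toy_infiltration)
passes_cutoff = abs(toy_rho) > 0.30  # the |cor| threshold stated in the text
print(round(toy_rho, 3), passes_cutoff)
```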

Immune therapy analysis

In the TCGA-BLCA samples with survival information, the Wilcoxon test (P < 0.05) was employed to analyze the expression of immune checkpoints (BTNL2, HHLA2, IDO2, ADORA2A, CD160, BTLA, CD40LG, TNFSF14, TNFRSF8, CD200R1, CD244, CD28, CD70, ICOS, TMIGD2, PDCD1, TNFSF4, TIGIT, TNFSF9, CTLA4, LAG3, LAIR1, CD48, CD27, TNFRSF4, CD200, CD276, CD44) between the HRG and LRG. Additionally, immune phenotype scores of BLCA patients (including anti-PD-1/PD-L1 treatment and anti-CTLA-4 treatment scores) were obtained from the TCIA database. BLCA patients were divided into high and low expression groups based on the median value of biomarker expression. The immune phenotype scores between the high and low expression groups of biomarkers were compared using the Wilcoxon test (P < 0.05).

Somatic mutation analysis

BLCA samples from the TCGA-BLCA dataset, which included somatic mutation data, were analyzed for mutation frequencies in HRG and LRG using the maftools package (v 2.18.0) 61. The top 20 genes with the highest mutation frequencies were displayed using a waterfall plot. At the same time, bar plots were used to display the different mutation locations of the most common mutated genes between the HRG and LRG.

Drug sensitivity analysis

To assess the drug sensitivity of HRG and LRG from the TCGA-BLCA dataset, half-maximal inhibitory concentration (IC50) values for 198 conventional drugs obtained from the Genomics of Drug Sensitivity in Cancer (GDSC) database were calculated using the oncoPredict package (v 1.2) 62. Subsequently, differences between groups were compared using the Wilcoxon test (P < 0.05). The top 5 drugs were displayed based on P-value.

Regulation network analysis

The DESeq2 package (v 1.42.0) was utilized to identify differentially expressed miRNAs, lncRNAs, and circRNAs between the BLCA and Control groups in the TCGA-BLCA dataset (using the same thresholds as in the differential expression analysis of DEGs). The ENCORI and miRWalk databases were used to predict miRNAs associated with the biomarkers. The VennDiagram package (v 1.7.3) was employed to intersect the results from the 2 databases with the differentially expressed miRNAs to obtain the key miRNAs. ENCORI and miRNet were used to predict lncRNAs associated with the key miRNAs. The VennDiagram package (v 1.7.3) was employed to intersect the results from the 2 databases with the differentially expressed lncRNAs to obtain the key lncRNAs. ENCORI was used to predict circRNAs associated with the key miRNAs. The VennDiagram package (v 1.7.3) was employed to intersect the results from the database with the differentially expressed circRNAs to obtain the key circRNAs. The igraph package (v 2.0.3) 63 was utilized to visualize the ceRNA network.
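The integration step repeated for miRNAs, lncRNAs, and circRNAs amounts to a set intersection between database predictions and the differentially expressed RNAs. A minimal sketch, assuming that "integrate" means keeping only RNAs present in every source, is given below; the helper name `key_rnas` and the RNA labels are hypothetical.

```python
from typing import Iterable, List


def key_rnas(*sources: Iterable[str]) -> List[str]:
    """Return the RNAs present in every source, sorted alphabetically."""
    if not sources:
        return []
    common = set(sources[0])
    for source in sources[1:]:
        common &= set(source)
    return sorted(common)


# Invented labels standing in for predictions and differential expression results.
encori_predictions = {"miR-a", "miR-b", "miR-c"}
mirwalk_predictions = {"miR-b", "miR-c", "miR-d"}
de_mirnas = {"miR-c", "miR-b", "miR-e"}
toy_key_mirnas = key_rnas(encori_predictions, mirwalk_predictions, de_mirnas)
print(toy_key_mirnas)
```

The same helper applies to the circRNA step with two inputs, since only one prediction database was used there.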

Expression levels and survival analysis

The mRNA expression of biomarkers in the BLCA and Control groups was analyzed using the Wilcoxon test (P < 0.05) in all samples of the TCGA-BLCA dataset. Meanwhile, the BLCA patient samples with survival information from the TCGA-BLCA training set were split into high and low expression groups according to the median expression levels of the biomarkers. Subsequently, the survminer package (v 0.4.9) 64 was used to plot the KM curve (P < 0.05) to assess the differences in OS between the high and low expression groups of the biomarkers.
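The KM curves compared between expression groups rest on the Kaplan–Meier product-limit estimator, which can be sketched in a few lines. This is an illustrative implementation rather than the survminer/survival code, it omits confidence intervals and the log-rank test, and the follow-up times below are invented.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed and 0 if the sample was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every sample sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored samples both leave the risk set
    return curve


# Invented follow-up data: one sample is censored at time 2.
toy_curve = kaplan_meier([1, 2, 2, 3], [1, 0, 1, 1])
print(toy_curve)
```

Applying the function separately to the high and low expression groups yields the two step curves that are then compared.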

Cell culture

The immortalised normal human ureteral epithelial cell line SV-HUC-1 and the human bladder cancer cell line 5637 were used in this study. SV-HUC-1 cells were cultured in F-12K medium and 5637 cells in RPMI-1640 medium. Both culture media were supplemented with 1% penicillin solution and 10% fetal bovine serum. Both cell lines were purchased from Procell (Wuhan, China).

qRT-PCR

RNA was isolated from SV-HUC-1 and 5637 cells with Trizol reagent (Servicebio), and cDNA was synthesized following the manufacturer’s protocol for the Hifair® AdvanceFast One-step RT-gDNA Digestion SuperMix for qPCR (Yeasen, China). qRT-PCR was carried out using a LightCycler® 96 (Roche Applied Science) with an SYBR Green-based reagent (Yeasen, China). Primer sequences were as follows: CTSK, Forward: GGCTCAAGGTTCTGCTGCTAC, Reverse: TGTTATATTGCTTCCTGTGGGTCTTC; SRC, Forward: TCCAAGCCGCAGACTCAGG, Reverse: CATCCACACCTCGCCAAAGC; MTERF3, Forward: GCAGCCAATTTCAGAGGAAGAGG, Reverse: CAGAGTCTCAGAATGATCCACATAGTC; CSNK2B, Forward: GAGCCTGATGAAGAACTGGAAGAC, Reverse: CACGGTTGGTAAGGATGTAGCG.

Statistical assessment

The qRT-PCR results were analyzed and reported as mean ± standard error of the mean (SEM). Statistical comparisons between the two groups were performed using Student’s t-test with GraphPad Prism 10.1.2 (GraphPad Software, Inc.). Statistical significance was set at p < 0.05, with all experiments repeated three times. The bioinformatics analysis was carried out using R (v 4.3.1). The Wilcoxon test (P < 0.05) was employed to evaluate differences between two groups. In addition, a multiple comparison correction was performed to ensure the reliability of the results: first, the significance level for accepting or rejecting the null hypothesis in a single hypothesis test was determined; then, an appropriate multiple comparison correction method was selected; finally, the adjusted p-value was compared with the significance level to determine the result of the significance test. In this study, to control the risk of false positives from multiple hypothesis tests, all original p-values were corrected for false discovery rate (FDR) using the Benjamini–Hochberg method. Significance levels were assigned based on the corrected p-values: p.adj < 0.001 is marked as “***”, p.adj < 0.01 as “**”, and p.adj < 0.05 as “*”, indicating significant differences, while all others are marked as “ns” (no significant difference).
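The Benjamini–Hochberg adjustment and the star notation described above can be illustrated with the following short sketch. It follows the standard step-up procedure (each sorted p-value multiplied by m/rank, then made monotone from the largest rank downwards); the function names and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significance_label(p_adj: float) -> str:
    """Map an adjusted p-value to the star notation used in the figures."""
    if p_adj < 0.001:
        return "***"
    if p_adj < 0.01:
        return "**"
    if p_adj < 0.05:
        return "*"
    return "ns"


# Invented raw p-values from four hypothetical tests.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_p)
toy_labels = [significance_label(p) for p in toy_adjusted]
print([round(p, 4) for p in toy_adjusted], toy_labels)
```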