Introduction

Acute B-lymphoblastic leukemia (B-ALL) is a hematological malignancy characterized by the uncontrolled proliferation of immature B-lymphocytes1. With the development of minimal residual disease (MRD) monitoring and dose-intensive chemotherapy, a large proportion of pediatric patients achieve sustained complete remission, leading to an estimated 5-year overall survival of 90% in childhood acute lymphoblastic leukemia (ALL)2. However, relapsed B-ALL remains a significant challenge, as 15–20% of patients still experience recurrence, with substantially lower survival rates than at initial diagnosis3. Recent data indicate that the 5-year overall survival (OS) rate for patients experiencing a first relapse of ALL is approximately 50%, with significantly worse outcomes after second or subsequent relapses4.

The evolution of leukemia leading to relapse follows a complex and branched pathway, in which numerous abnormalities persist from initial diagnosis5. Genetic alterations in epigenetic regulators and chromatin modifiers are commonly observed in patients with relapsed ALL and may reduce treatment efficacy by influencing the pharmacokinetics of anti-leukemic agents, the interactions between the tumor microenvironment and ALL cells, and tumor biology itself6. Risk variants in the PDE4B gene are related to relapse through their effects on methotrexate pharmacodynamics and tumor sensitivity to steroids7. Although comprehensive genomic profiling provides deeper insights into the biology of relapse, the incorporation of molecularly targeted therapies remains an active and evolving area of research3. This underscores the critical need to identify novel biomarkers that can more accurately predict relapse and improve long-term outcomes in pediatric B-ALL.

Next-generation sequencing has significantly enhanced risk stratification by identifying novel acute lymphoblastic leukemia (ALL) subtypes and uncovering prognostic genes and signaling pathways8. Programmed cell death (PCD) is a regulated biological process essential for tumor progression9. Recent discoveries have revealed multiple forms of PCD, including apoptosis, pyroptosis, ferroptosis, autophagy, necroptosis, cuproptosis, disulfidptosis, parthanatos, entotic cell death, NETosis, lysosome-dependent cell death, alkaliptosis, oxeiptosis and zinc-dependent cell death10. In B-ALL, dysregulation of the PI3K/AKT/mTORC1 and NOTCH1 signaling pathways has been shown to affect the regulation of autophagy. Glucocorticoids have been employed to induce autophagy and cell death in B-ALL treatment11. However, autophagy has also been shown to promote cell proliferation and drug resistance in ETV6-RUNX1-positive cells, further complicating treatment outcomes12. In pediatric ALL, remission failure and relapse are associated with a reduced Bax/Bcl-2 ratio, a key indicator of the balance between pro-apoptotic and anti-apoptotic activities13. Despite these insights, comprehensive studies examining the relationship between PCD patterns and pediatric B-ALL are still needed14,15,16.

In this study, we aim to identify a PCD-related signature to enhance relapse prediction in B-ALL. This approach intends to elucidate the role of PCD in B-ALL relapse, ultimately contributing to more personalized and effective therapeutic strategies for patients.

Materials and methods

Data collection

RNA-sequencing (RNA-seq) data and clinical profiles of pediatric B-ALL patients and healthy controls were obtained from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) via the UCSC Xena platform, the Gene Expression Omnibus (GEO) and the GTEx database. Additionally, RNA-sequencing data from 78 pediatric B-ALL patients, provided by Nanfang Hospital in China, were collected along with comprehensive clinical information. The TARGET cohort served as the discovery cohort, while the Nanfang cohort and GSE7440 were employed as validation cohorts. Notably, GSE7440 did not report relapse-free survival (RFS) but instead classified patients by relapse status, with groups categorized as either continuous complete remission for more than 4 years (CCR) or relapse within 3 years of initial diagnosis. This study received approval from the Ethics Committee of Nanfang Hospital, Southern Medical University (NFEC-201506-K3) and was conducted in accordance with the Declaration of Helsinki. Clinical characteristics of the TARGET cohort, Nanfang cohort and GSE7440 are shown (Table 1).

Table 1 Characteristics of the TARGET cohort, Nanfang cohort and GSE7440.

A comprehensive collection of 1571 genes associated with PCD was compiled from GSEA gene sets, KEGG, and published literature17. This collection encompassed 578 apoptosis genes, 52 pyroptosis genes, 88 ferroptosis genes, 368 autophagy genes, 101 necroptosis genes, 14 cuproptosis genes, 24 disulfidptosis genes, 9 parthanatos genes, 15 entotic cell death genes, 8 netotic cell death genes, 220 lysosome-dependent cell death genes, 7 alkaliptosis genes, 5 oxeiptosis genes, 24 zinc-dependent cell death genes, 24 NETosis genes, and 34 immunogenic cell death genes.

Identification of PCD patterns related to relapse

To examine the relapse-related characteristics of each PCD pattern, we applied single-sample gene set enrichment analysis (ssGSEA), implemented in the R package “GSVA” (Gene Set Variation Analysis), to calculate the enrichment scores of 16 PCD patterns in the TARGET cohort. Subsequently, we conducted univariate Cox regression analysis to identify significant relapse-associated PCD patterns (p < 0.05)18.
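The single-sample enrichment idea can be illustrated with a minimal sketch. This is a simplified rank-based score for one sample and one gene set, not the GSVA package's exact implementation (which includes further normalization across samples). The function name `ssgsea_score`, the weighting exponent `alpha = 0.25`, and the toy expression values are assumptions made for illustration.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score (illustrative sketch only).

    expression: dict mapping gene symbol -> expression value for ONE sample.
    gene_set:   iterable of gene symbols (e.g. one PCD pattern).
    alpha:      rank-weighting exponent (assumed value, not from the paper).
    """
    # Order genes from highest to lowest expression.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    members = set(gene_set) & set(ordered)
    if not members or len(members) == n:
        raise ValueError("gene set must overlap the sample partially")

    # Genes in the set are weighted by rank**alpha (rank n = top gene).
    weights = [(n - i) ** alpha if g in members else 0.0
               for i, g in enumerate(ordered)]
    total_weight = sum(weights)
    miss_step = 1.0 / (n - len(members))

    # Walk down the ranked list and integrate the difference between the
    # weighted hit distribution and the uniform miss distribution.
    hit_cdf = miss_cdf = score = 0.0
    for gene, weight in zip(ordered, weights):
        if gene in members:
            hit_cdf += weight / total_weight
        else:
            miss_cdf += miss_step
        score += hit_cdf - miss_cdf
    return score


# Hypothetical toy sample: gene symbols and values are made up.
toy_sample = {"A": 6.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0, "F": 1.0}
top_set_score = ssgsea_score(toy_sample, ["A", "B"])
bottom_set_score = ssgsea_score(toy_sample, ["E", "F"])
```

A set concentrated among the most highly expressed genes yields a positive score, and a set concentrated among the least expressed genes yields a negative one. Per-sample scores of this kind are what a univariate Cox model would then take as a covariate.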

Single-cell RNA-seq data processing

This study utilized single-cell RNA sequencing (scRNA-seq) data from seven pediatric B-ALL patients at initial diagnosis, obtained from GSE130116. Quality control was performed using the R package “Seurat”, retaining cells with nFeature_RNA > 300, percent.mt < 30, and nCount_RNA > 1000 (ref. 19). The top 3,000 highly variable genes were subsequently selected using the “FindVariableFeatures” function. Cell clusters were annotated with specific cell types using the R package “SingleR”20. The activity of each PCD pattern was evaluated using the R package “AUCell”.
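The quality-control thresholds stated above can be expressed as a simple per-cell filter. The sketch below is language-agnostic logic written in Python rather than the Seurat workflow itself; the function name `passes_qc` and the example cells are hypothetical.

```python
def passes_qc(cell, min_features=300, max_percent_mt=30.0, min_counts=1000):
    """Return True if a cell meets all three thresholds stated in the text."""
    return (
        cell["nFeature_RNA"] > min_features      # detected genes per cell
        and cell["percent.mt"] < max_percent_mt  # mitochondrial read percentage
        and cell["nCount_RNA"] > min_counts      # total counts (UMIs) per cell
    )


# Hypothetical cells; barcodes and values are made up for illustration.
cells = [
    {"barcode": "cell_1", "nFeature_RNA": 1200, "percent.mt": 4.5, "nCount_RNA": 3500},
    {"barcode": "cell_2", "nFeature_RNA": 250, "percent.mt": 3.0, "nCount_RNA": 1500},
    {"barcode": "cell_3", "nFeature_RNA": 900, "percent.mt": 42.0, "nCount_RNA": 2800},
    {"barcode": "cell_4", "nFeature_RNA": 800, "percent.mt": 5.0, "nCount_RNA": 950},
]

kept_barcodes = [c["barcode"] for c in cells if passes_qc(c)]
```

Only `cell_1` satisfies all three conditions; the other cells each fail exactly one threshold.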

Construction of the CDI model

The TARGET cohort was randomly divided into a training cohort (n = 75) and a testing cohort (n = 49) using a 6:4 ratio. Initially, univariate Cox regression identified relapse-associated genes (p < 0.05) within the relapse-related PCD patterns. Subsequently, LASSO regression analysis, implemented through the R package “glmnet,” selected genes with non-zero coefficients. These selected genes underwent multivariate Cox regression analysis to construct a cell death index (CDI) model21. Finally, patients were classified into high-CDI and low-CDI groups based on the optimal cut-off value of the CDI score.
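A random 6:4 split of this kind might look like the following minimal sketch. The seed, the rounding rule (rounding the training size up, which reproduces 75 and 49 for 124 patients), and the patient identifiers are assumptions; the source does not state how the split was implemented.

```python
import math
import random


def split_cohort(sample_ids, train_fraction=0.6, seed=0):
    """Shuffle sample IDs reproducibly and split them into train/test lists."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducibility
    # Rounding up is an assumption chosen to match the reported group sizes.
    n_train = math.ceil(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers for a cohort of 124 patients.
patients = [f"patient_{i:03d}" for i in range(124)]
train_ids, test_ids = split_cohort(patients)
```

Fixing the seed keeps the partition stable between runs, which matters because the downstream gene selection depends on which patients fall in the training cohort.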

Construction and evaluation of the nomogram

To evaluate the independent predictive capability of the CDI model, we conducted both univariate and multivariate Cox regression analyses. We generated and visualized a nomogram using the R packages ‘rms’ and ‘regplot’, respectively. Furthermore, we employed calibration plots to assess the model’s accuracy.

Tumor microenvironment analysis

CIBERSORT was utilized to compare the proportions of 22 immune cell types between high-CDI and low-CDI groups within the TARGET cohort. Pearson correlation analysis was employed to assess the relationships between CDI scores and immune cell types. Additionally, the study examined the correlations between CDI scores and immune checkpoints, as well as immune modulators. ESTIMATE was used to compare the stromal cell score, immune cell score and ESTIMATE score between high-CDI and low-CDI groups.
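For reference, the Pearson correlation coefficient between two paired vectors (for example, CDI scores and the estimated fraction of one immune cell type) can be computed as in this minimal sketch. The function name and the toy vectors are hypothetical and carry no study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up toy vectors: one rises with the score, the other falls.
toy_scores = [1.0, 2.0, 3.0, 4.0]
r_positive = pearson_r(toy_scores, [2.0, 4.0, 6.0, 8.0])
r_negative = pearson_r(toy_scores, [8.0, 6.0, 4.0, 2.0])
```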

Analysis of functional enrichment and drug sensitivity

To examine the relationship between CDI scores and enriched signaling pathways, we utilized the R package ‘GSVA’ to evaluate biological functions and signaling pathways. Subsequently, we employed the R package ‘pRRophetic’ to compute the half maximal inhibitory concentration (IC50) values for 138 drugs in both high-CDI and low-CDI groups, thus assessing drug sensitivity22.

Statistical analysis

All statistical analyses were performed using R software (version 4.1.2). The Wilcoxon test was employed for comparisons between two groups. The ‘survminer’ package was utilized to determine the optimal cut-off value. Kaplan-Meier curves were compared using the log-rank test. Receiver operating characteristic (ROC) curves were used to evaluate the specificity and sensitivity of the risk score, and the area under the curve (AUC) was calculated.
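The AUC has a simple interpretation: it is the probability that a randomly chosen relapsed patient has a higher score than a randomly chosen non-relapsed patient. The sketch below computes this for a binary outcome and deliberately ignores censoring, so it is not the time-dependent ROC analysis reported in the Results; the function name and example values are hypothetical.

```python
def roc_auc(scores, relapsed):
    """AUC for a binary outcome via pairwise comparison (ties count 0.5)."""
    positives = [s for s, r in zip(scores, relapsed) if r]
    negatives = [s for s, r in zip(scores, relapsed) if not r]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    # Count the pairs in which the relapsed patient scores higher.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up scores: two relapsed (1) and two non-relapsed (0) patients.
example_auc = roc_auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

In this toy case three of the four relapsed/non-relapsed pairs are ordered correctly, so the AUC is 0.75.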

Results

General landscape of programmed cell death patterns in pediatric B-ALL patients

Figure 1 shows the workflow of the integrative bioinformatics analyses. To comprehensively characterize the landscape of PCD patterns in B-ALL, we compiled gene sets representing 16 distinct forms of PCD: apoptosis, pyroptosis, ferroptosis, autophagy, necroptosis, cuproptosis, disulfidptosis, parthanatos, entotic cell death, netotic cell death, lysosome-dependent cell death, alkaliptosis, oxeiptosis, zinc-dependent cell death, NETosis, and immunogenic cell death. To evaluate the significance of these PCD patterns within the TARGET cohort, we employed univariate Cox regression analysis to identify patterns significantly associated with relapse (p < 0.05) (Fig. 2A). Nine PCD patterns emerged from this analysis: ferroptosis, autophagy, necroptosis, entotic cell death, and alkaliptosis were identified as protective factors, whereas apoptosis, netotic cell death, oxeiptosis, and NETosis were categorized as risk factors. Subsequently, Kaplan-Meier analysis was used to examine the relapse-related roles of these nine PCD patterns in pediatric B-ALL (Fig. 2B).

Fig. 1 Workflow of the integrative bioinformatics analyses.

Fig. 2 Landscape of PCD patterns in B-ALL patients. (A) Univariate Cox analysis of 9 relapse-associated PCD patterns. (B) Kaplan-Meier survival curves of 9 PCD patterns associated with relapse.

Exploration of PCD patterns in single-cell sequencing

To further investigate 16 PCD patterns in B-ALL, we utilized the single-cell database GSE130116, which comprises data from 7 pediatric B-ALL patients at initial diagnosis. After conducting multiple quality control and filtering steps, we obtained an expression matrix consisting of 21,209 genes across 39,926 single cells. Using the ‘SingleR’ package, we classified the cells into seven major cell types: T-cells, Pro-B-cell-CD34+, CMP, NK-cells, B-cells, Monocytes, and Erythroblasts (Fig. 3A). Subsequently, we employed AUCell to calculate the activity scores of 16 PCD patterns in each cell (Fig. 3B-J, Supplementary Fig. 1). Notably, with the exception of oxeiptosis and netotic cell death, most relapse-related PCD patterns exhibited widespread activation, particularly apoptosis, autophagy, ferroptosis and necroptosis.

Fig. 3 Investigation of various PCD patterns in single-cell RNA sequencing (scRNA-seq). (A) The distribution of cell types. (B-J) The distribution of activity scores for: apoptosis (B), autophagy (C), entotic cell death (D), ferroptosis (E), necroptosis (F), alkaliptosis (G), netotic cell death (H), NETosis (I), and oxeiptosis (J).

Construction of the CDI model for pediatric B-ALL patients at diagnosis

Through univariate Cox regression analysis (p < 0.05), we identified 356 relapse-related genes from nine relapse-associated PCD patterns. Subsequently, LASSO regression was employed to refine these candidate genes, resulting in the identification of seven genes with non-zero coefficients: Bcl-2-interacting killer (BIK), translocator protein (TSPO), BCL2L2, PIP4K2C, mixed-lineage kinase domain-like (MLKL), STAT2, and WW domain-containing oxidoreductase (WWOX) (Fig. 4A-B). A multivariate Cox regression model was constructed based on these seven genes, yielding the following formula: cell death index (CDI) = (0.08118 × BIK exp) + (0.35611 × TSPO exp) + (−0.03171 × BCL2L2 exp) + (0.24034 × PIP4K2C exp) + (−0.18908 × MLKL exp) + (−0.15406 × STAT2 exp) + (0.74128 × WWOX exp). BIK, BCL2L2, and WWOX are associated with apoptosis, TSPO and PIP4K2C are linked to autophagy, and MLKL and STAT2 are related to necroptosis. To assess the significance of these model genes, we compared their expression levels between B-ALL and normal samples using the Wilcoxon test (Supplementary Fig. 2A). The training and testing cohorts were classified into high-CDI and low-CDI groups based on optimal cut-off values of 6.378772 and 4.704651, respectively. Kaplan-Meier analysis revealed that patients in the high-CDI group exhibited worse RFS than those in the low-CDI group in both the training and testing cohorts (Fig. 4C-D). Risk curves and survival status plots confirmed that patients with high CDI had lower relapse-free survival than those with low CDI. The AUC values for 1-year, 3-year, and 5-year relapse-free rates were 0.890, 0.883, and 0.943, respectively, in the training cohort (Fig. 4E) and 0.896, 0.752, and 0.935, respectively, in the testing cohort (Fig. 4F). Additionally, we utilized the Nanfang cohort and GSE7440 as validation cohorts. The optimal cut-off value for the Nanfang cohort was 2.398343.
Kaplan-Meier analysis demonstrated worse RFS in the high-CDI group compared to the low-CDI group (Supplementary Fig. 2B), with a 2-year AUC of 0.778 (Supplementary Fig. 2C). In GSE7440, the CDI scores of relapsed patients were significantly higher than those of patients in continuous complete remission (CCR), as determined by the Wilcoxon test, indicating that the CDI distinguishes CCR from relapsed patients (Supplementary Fig. 2D).
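The CDI formula reported above is a weighted sum of the seven model genes' expression values, followed by dichotomization at a cohort-specific cut-off. The sketch below transcribes the published coefficients. The function names, the use of a strict "greater than" at the cut-off, and the example expression values are assumptions for illustration, and the expression scale must match whatever normalization the study used.

```python
# Coefficients transcribed from the multivariate Cox model in the text.
CDI_COEFFICIENTS = {
    "BIK": 0.08118,
    "TSPO": 0.35611,
    "BCL2L2": -0.03171,
    "PIP4K2C": 0.24034,
    "MLKL": -0.18908,
    "STAT2": -0.15406,
    "WWOX": 0.74128,
}


def cell_death_index(expression):
    """Weighted sum of model-gene expression values for one patient."""
    return sum(coef * expression[gene] for gene, coef in CDI_COEFFICIENTS.items())


def cdi_group(score, cutoff):
    """Assign a CDI group; strict '>' at the cut-off is an assumption."""
    return "high-CDI" if score > cutoff else "low-CDI"


# Hypothetical patient with every model gene at expression 1.0, so the
# score equals the sum of the coefficients.
unit_expression = {gene: 1.0 for gene in CDI_COEFFICIENTS}
unit_score = cell_death_index(unit_expression)

# Training-cohort cut-off reported in the text.
TRAINING_CUTOFF = 6.378772
example_group = cdi_group(7.0, TRAINING_CUTOFF)
```

Because the cut-offs differ between cohorts (6.378772, 4.704651 and 2.398343 in the text), a score should only be dichotomized with the cut-off derived for a comparable cohort and expression scale.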

Fig. 4 Construction of the CDI model. (A-B) LASSO regression analysis for identifying candidate genes. (C-D) The Kaplan-Meier survival curves and survival status of different CDI groups are illustrated in the training cohort (C) and testing cohort (D). (E-F) The time-dependent ROC curves of the training cohort (E) and testing cohort (F).

The CDI signature emerged as an independent prognostic factor for pediatric B-ALL

This study employed both univariate and multivariate Cox regression analyses to evaluate the independent predictive value of the CDI for relapse. The CDI model was compared with clinicopathological features including age at diagnosis, white blood cell count (WBC) at diagnosis, MRD at the end of induction (day 29), and four fusion genes (BCR::ABL1, MLL-r, TCF3::PBX1, ETV6::RUNX1). Age and WBC at diagnosis were considered jointly for risk stratification into a high-risk group (WBC > 50 or age between 10 and 18 years) and a standard-risk group (WBC < 50 and age between 1 and 10 years). The univariate Cox regression analysis identified CDI as a significant risk factor (HR = 1.26, 95% CI: 1.20–1.34, p < 0.001) (Fig. 5A). Multivariate Cox regression analyses, after adjusting for potential confounding factors, further substantiated the predictive role of CDI for relapse (HR = 1.3, 95% CI: 1.20–1.30, p < 0.001) (Fig. 5B). Interestingly, both univariate and multivariate Cox regression analyses indicated that the high-risk group had a better prognosis than the standard-risk group. In contrast, the TCF3::PBX1 fusion gene was associated with a significantly worse prognosis in both univariate (p < 0.001) and multivariate (p = 0.014) Cox analyses. The prognostic impact of different genotypes may explain this unexpected stratification result. Subsequently, we conducted univariate and multivariate Cox analyses separately in the high-risk and standard-risk groups. In the high-risk group, CDI and age were risk factors in both univariate and multivariate Cox analyses (Fig. 5C-D). Furthermore, a nomogram was developed to predict the 1-year, 3-year, and 5-year relapse-free survival rates in the high-risk group (Fig. 5E). The calibration plots demonstrated that the nomogram exhibited good accuracy in predicting RFS (Fig. 5F).
Decision curve analysis (DCA) demonstrated that the nomogram and CDI model performed better than the other predictors evaluated in the high-risk group (Supplementary Fig. 3A). Univariate and multivariate Cox analyses were also conducted in the standard-risk group (Supplementary Fig. 3B-C). The results showed that CDI and TCF3::PBX1 were significant risk factors in the standard-risk group.
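The age- and WBC-based stratification rule described above can be written as a small function. The text does not give units for WBC or say how boundary values are handled, so the boundary choices below (and the "unclassified" fallback) are assumptions; the function name is hypothetical.

```python
def risk_group(wbc, age_years):
    """Stratify a patient using the rule stated in the text.

    wbc:       white blood cell count at diagnosis, in the same units as the
               threshold of 50 used in the text (units not stated there).
    age_years: age at diagnosis in years.
    """
    # High-risk: WBC > 50 OR age between 10 and 18 years (inclusive bounds
    # are an assumption).
    if wbc > 50 or 10 <= age_years <= 18:
        return "high-risk"
    # Standard-risk: WBC < 50 AND age between 1 and 10 years (age 10 itself
    # is assigned to high-risk above; this boundary choice is an assumption).
    if wbc < 50 and 1 <= age_years < 10:
        return "standard-risk"
    # Cases the stated rule does not cover (e.g. WBC exactly 50, age < 1).
    return "unclassified"


# Hypothetical patients.
group_a = risk_group(wbc=60, age_years=5)   # high WBC
group_b = risk_group(wbc=20, age_years=12)  # older age
group_c = risk_group(wbc=20, age_years=5)   # neither criterion met
```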

Fig. 5 (A) Univariate Cox regression analysis and (B) multivariate Cox regression analysis demonstrated that the CDI model was an independent prognostic factor. (C) Univariate Cox regression analysis and (D) multivariate Cox regression analysis in the high-risk group. (E) A nomogram was constructed using the CDI score and clinicopathological factors to predict 1-year, 3-year, and 5-year relapse-free survival rates in the high-risk group. (F) Calibration plots for 1-year, 3-year, and 5-year relapse probabilities in the high-risk group.

Tumor microenvironment analysis

To investigate the relationships between immune cell types and CDI groups, we employed the CIBERSORT algorithm to compare the proportions of 22 immune cell types between the high-CDI and low-CDI groups. The analysis revealed that the high-CDI group exhibited a higher proportion of plasma cells, while the low-CDI group showed increased levels of M2 macrophages, eosinophils, and neutrophils (Fig. 6A). All model genes correlated with immune cell infiltration (Fig. 6B). Moreover, the CDI score was significantly associated with immune checkpoints and modulators, indicating potential immune-related implications of the CDI model (Fig. 6C). Furthermore, we examined the correlation between the CDI score and immune cell types. The CDI score correlated positively with monocytes and negatively with M2 macrophages (p < 0.05) (Fig. 6D-E). ESTIMATE analysis showed a significant difference in stromal cell score between the high-CDI and low-CDI groups (p = 0.0014) (Fig. 6F).

Fig. 6 The relationship between CDI score and tumor microenvironment. (A) Differences in the infiltration of 22 immune cells between high-CDI and low-CDI patient groups. (B) Correlation between immune cells and 7 model genes. (C) Correlation between CDI score and immune checkpoints. (D-E) Correlation between CDI score with monocytes (D) and macrophages M2 (E). (F) Differences in stromal cell score, immune cell score and ESTIMATE score between high-CDI and low-CDI patient groups.

Functional enrichment analysis of CDI signature

We utilized ‘GSVA’ to investigate the correlation between HALLMARK pathways and CDI scores. The analysis demonstrated that CDI scores exhibited positive correlations with multiple HALLMARK pathways, including adipogenesis, DNA repair, estrogen response late, fatty acid metabolism, glycolysis, MTORC1 signaling, MYC targets v1, NOTCH signaling, oxidative phosphorylation, pancreatic beta cells, peroxisome, reactive oxygen species pathway, UV response up, and xenobiotic metabolism. In contrast, CDI scores showed inverse correlations with IL6 JAK STAT3 signaling, interferon alpha response, interferon gamma response, KRAS signaling up, and mitotic spindle (Fig. 7).

Fig. 7 Correlation between CDI score and the enrichment scores of HALLMARK pathways.

Therapeutic value of prognostic signature

To examine the correlation between the model and drug sensitivity, we estimated the IC50 values for each drug in B-ALL samples and compared them between CDI groups. The analysis demonstrated that the low-CDI group exhibited enhanced sensitivity to sunitinib, ifosfamide, imatinib, tandutinib and bortezomib (Fig. 8A-E).

Fig. 8 Differences in chemotherapy sensitivity between high-CDI and low-CDI groups (A-E).

Discussion

This study investigated the activity of 16 PCD patterns in B-ALL and developed a CDI model for relapse prediction. The model incorporates seven PCD-related genes: BIK, TSPO, BCL2L2, PIP4K2C, MLKL, STAT2, and WWOX. The utility of the CDI model in clinical management was demonstrated through Kaplan-Meier and ROC curve analyses. Furthermore, its role as an independent risk factor was confirmed using both univariate and multivariate Cox regression analyses. The study also explored correlations between the CDI model and the tumor microenvironment, immunomodulators, and drug sensitivity.

Through a comprehensive analysis of various PCD patterns using RNA-seq and scRNA-seq, this study identified nine PCD patterns strongly associated with relapse in B-ALL: ferroptosis, autophagy, necroptosis, entotic cell death, alkaliptosis, apoptosis, netotic cell death, oxeiptosis, and NETosis. Apoptosis is a form of regulated cell death mediated by caspases, a family of cysteine proteases17. Three of the model genes are involved in apoptosis: BIK, BCL2L2, and WWOX. BIK and BCL2L2 belong to the BCL-2 family23. Previous research has linked remission failure in ALL to high Bcl-2/Bax ratios9. BIK, a pro-apoptotic BH3-only protein, interacts with anti-apoptotic and pro-survival proteins of the BCL2 family24. Overexpression of anti-apoptotic genes, including BCL2L1 (BCL-XL) and BCL2L2 (BCL-W), can inhibit cell death induced by anti-cancer drugs, leading to chemotherapy resistance25. The WWOX gene functions as a tumor suppressor in leukemia, potentially by inhibiting proliferation and promoting apoptosis through the mitochondrial pathway26. Autophagy, crucial for regulating energy metabolism, influences B-ALL development and drug resistance27. This study identified TSPO and PIP4K2C, both drawn from the autophagy gene set, as model genes. TSPO, localized at the outer mitochondrial membrane, is involved in regulating oxidative stress, iron homeostasis, cholesterol transport, and immune cell infiltration28,29,30. The PIP4K2 family includes PIP4K2A, PIP4K2B, and PIP4K2C. Pharmacological inhibition of these kinases induces cell death and mitochondrial damage in ALL cells31,32,33. Necroptosis, a regulated form of necrosis, relies on the phosphorylation of MLKL by receptor-interacting protein kinases 1 and 3 (RIPK1 and RIPK3)34.
Notably, mixed-lineage kinase domain-like protein (MLKL), a key mediator of necroptotic cell death that modulates membrane permeabilization and promotes the release of granulocyte colony-stimulating factor (G-CSF) in leukemia cells, was identified as one of the model genes35,36. Another necroptosis-related model gene, signal transducer and activator of transcription 2 (STAT2), a member of the STAT family that modulates responses to extracellular signaling proteins, acts as a survival factor in BCR-ABL-positive cells37,38. Further investigation is necessary to elucidate the role of STAT2 in B-ALL. Ferroptosis, a form of programmed cell death regulated by iron-dependent lipid peroxidation, plays a critical role in Philadelphia chromosome-negative B-ALL patients39. Additionally, this study demonstrated that entotic cell death, alkaliptosis, netotic cell death, oxeiptosis, and NETosis are associated with relapse in B-ALL. However, their roles in B-ALL require further investigation.

Acute B-lymphoblastic leukemia (B-ALL) modifies the surrounding microenvironment to establish a niche that supports self-renewal and maintenance40. Notably, our analysis revealed a higher estimated proportion of M2 macrophages in the low-CDI group of the TARGET cohort, and the CDI score correlated negatively with M2 macrophages. M0 macrophages differentiate into two subtypes: M1-like macrophages, which primarily participate in pro-inflammatory responses, and M2-like macrophages, which are mainly involved in anti-inflammatory responses41. Dysregulated monocyte/macrophage compartments are part of the extensive bone marrow (BM) microenvironment remodeling observed at B-ALL diagnosis and could represent potential targets for novel treatments in combination with conventional chemotherapy42.

Despite the promising potential of our model, several limitations should be acknowledged. First, the external validation cohort (Nanfang cohort) may introduce bias due to the relatively low number of relapse events. Second, the nomogram developed in this study was specifically designed for the high-risk group (WBC > 50 or age between 10 and 18 years), which restricts its generalizability to other patient populations. Moreover, it lacks external validation, underscoring the need for further validation in larger cohorts to assess the applicability of the CDI model and nomogram. Third, the lack of immunotherapy-related data limited our ability to explore the relationship between CDI and drug sensitivity. This limitation emphasizes the need for additional research to evaluate the responsiveness of ALL to both conventional and emerging therapeutic agents. To further validate the role and function of the CDI genes in pediatric B-ALL, more comprehensive experiments are needed, including protein-level analyses such as Western blotting and cellular functional assays. These studies will be crucial in elucidating the specific mechanisms underlying the CDI in pediatric B-ALL.

In conclusion, this study provides a comprehensive characterization of the activity of various PCD patterns in B-ALL. Additionally, it establishes correlations between the CDI model and tumor microenvironment (TME) characteristics, immune status, and drug sensitivity. These findings offer novel insights that may contribute to the development of personalized treatment strategies for B-ALL patients.