Introduction

Acute myeloid leukemia (AML) is the most common indication for allogeneic hematopoietic cell transplantation (allo-HCT), which offers a potentially curative option but carries risks that must be considered before the procedure [1,2,3]. These risks include mortality, both short and long-term morbidity, and a significant impact on subsequent quality of life directly linked to allo-HCT adverse events [4,5,6,7]. Therefore, clinicians must balance the potential toxicities of allo-HCT against the relapse risk to determine allo-HCT eligibility.

According to the latest European LeukemiaNet AML risk classification guidelines (ELN 2022) [8], allo-HCT should be considered when the relapse risk without transplantation exceeds 35–40%. This applies to a large group of patients with intermediate or adverse-risk AML in first complete response (CR1) after induction therapy and to all patients who achieve a second or subsequent complete response. Additionally, treatment response and measurable residual disease (MRD) status serve as dynamic disease modifiers during follow-up, meaning that patients across all genetic risk groups may benefit from allo-HCT if the depth of response is insufficient.

Given the widespread use of the ELN 2022 guidelines in clinical practice, several studies have demonstrated the accuracy of these guidelines in predicting outcomes for patients with AML [9, 10] and more specifically in those undergoing allo-HCT consolidation [11, 12]. In this context, a recent single-center study validated the predictive ability of the ELN 2022 risk classification in patients diagnosed with AML in CR, and proposed a new Allo-HCT Refined ELN 2022 classification, introducing an additional subgroup within the adverse-risk category, termed Adverse-plus (AdvP). This subgroup included patients with complex karyotype (CK), MECOM(EVI1) rearrangements, and TP53 mutations or del(17p), who have a notably worse prognosis.

Considering the broad implementation of the ELN 2022 guidelines in clinical practice, and the findings from the application of the Allo-HCT Refined ELN 2022 classification in a single institution, the Grupo Español de Trasplante Hematopoyético y Terapia Celular (GETH-TC) initiated this multicenter, retrospective study to further investigate the prognostic value of the Allo-HCT Refined ELN 2022 classification in a large cohort of AML patients undergoing allo-HCT in CR.

Patients and methods

Patient selection

This retrospective, multicenter, registry-based analysis was conducted under the auspices of the Myeloid Malignancies Working Committee of the GETH-TC. GETH-TC is a non-profit, scientific society representing all HCT and cell therapy units in Spain and Portugal. All affiliates of the GETH-TC were invited to participate in the study, and sixteen institutions contributed to the project.

Inclusion criteria were as follows: adults aged 18 or older with AML treated with at least one line of anthracycline-based induction therapy who underwent their first allo-HCT in complete remission (CR1/CR2 or beyond) between January 2015 and June 2023. Only patients with information at diagnosis that permitted the retrospective risk classification into the ELN 2022 risk categories were included.

All patients signed informed consent for retrospective data collection. The study received ethical approval from the Ethics Committee of the Hospital Clínic de Barcelona and the GETH-TC, and was conducted following the standards set forth by the Declaration of Helsinki.

AML diagnosis, biological risk assessment (Allo-HCT refined ELN 2022 risk classification), and treatment

AML risk was defined after data collection according to the updated 2022 AML classifications [13, 14]. All patients included were retrospectively classified into favorable (Fav), intermediate (Int), and adverse (Adv) risk groups according to the ELN 2022 risk classification [13, 14], based on cytogenetic and mutational information documented at diagnosis. Subsequently, patients classified into the Adv risk group were redistributed into the AdvP or the Adv* risk categories, according to the Allo-HCT Refined ELN 2022 classification proposed by Jimenez-Vicente et al. [12]. Specifically, patients with a complex karyotype, inv(3)/t(3;3)/MECOM(EVI1) rearrangement, or TP53 mutations and/or loss of the 17p region at diagnosis were included in the newly defined AdvP group, while the rest of the adverse-risk genetic subgroups were reclassified into the Adv* group.
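The reclassification rule described above can be written as a short sketch. This is only an illustration of the rule as stated in the text, not the code used in the study; the class name `Diagnosis`, its field names, and the function `refined_risk` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    """Hypothetical container for the diagnostic data needed by the rule."""
    eln2022: str                      # ELN 2022 group: "Fav", "Int" or "Adv"
    complex_karyotype: bool = False
    mecom_rearranged: bool = False    # inv(3)/t(3;3)/MECOM(EVI1) rearrangement
    tp53_or_del17p: bool = False      # TP53 mutation and/or loss of 17p


def refined_risk(dx: Diagnosis) -> str:
    """Map an ELN 2022 group to the Allo-HCT Refined ELN 2022 group."""
    if dx.eln2022 not in {"Fav", "Int", "Adv"}:
        raise ValueError(f"unknown ELN 2022 group: {dx.eln2022!r}")
    # Only patients in the ELN 2022 adverse group are redistributed.
    if dx.eln2022 != "Adv":
        return dx.eln2022
    if dx.complex_karyotype or dx.mecom_rearranged or dx.tp53_or_del17p:
        return "AdvP"
    return "Adv*"


print(refined_risk(Diagnosis("Adv", tp53_or_del17p=True)))  # AdvP
print(refined_risk(Diagnosis("Adv")))                       # Adv*
```

The Fav and Int groups pass through unchanged, so the refined classification differs from ELN 2022 only within the adverse category.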

Biological abnormalities at diagnosis were defined through cytogenetic analyses (G-banding karyotype) and mutational analysis based on standard polymerase chain reaction (PCR)-based analysis (NPM1, FLT3-ITD,…) and targeted next-generation sequencing (NGS). MRD was assessed before allo-HCT by each center according to the ELN-DAVID MRD 2021 updated recommendations [15].

All patients were treated with at least one line of anthracycline-based induction therapy, undergoing their first allo-HCT in a morphological CR (CR1/CR2 or beyond). Response criteria were defined according to the ELN 2022 Guidelines and assessed by the participant institutions.

Allo-HCT information and main definitions

Eligibility criteria for allo-HCT, donor selection and conditioning regimen adhered to standard practices and internal protocols of the participating institutions. Conditioning regimen intensity was defined using classical criteria and was tailored to chronological age and comorbidities. Most patients older than 55 years and those with clinically relevant comorbidities received reduced-intensity conditioning (RIC) regimens. All patients received peripheral blood stem cell grafts. Grading of acute and chronic graft-versus-host disease (aGVHD and cGVHD) followed established criteria [16,17,18,19]. Main definitions are detailed in the Supplementary Methods.

Endpoint and statistical analysis

The primary endpoint was the validation of the prognostic value of the Allo-HCT Refined ELN 2022 classification for overall survival (OS) prediction. Secondary outcome variables included leukemia-free survival (LFS) and cumulative incidence of relapse (CIR).

Categorical variables were presented as counts and percentages and continuous variables as medians with interquartile range (IQR). χ2, Fisher’s exact and analysis of variance tests were used to compare categorical and continuous variables.
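As an illustration of the descriptive statistics named above, the following minimal sketch computes a median with IQR and a Pearson χ2 statistic for a contingency table using only the Python standard library. The study itself used R; the function names and the example numbers here are hypothetical, and the p value would still have to be read from a χ2 distribution with the returned degrees of freedom.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return the median and the (Q1, Q3) interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical ages and a hypothetical 2 x 2 table, for illustration only.
print(median_iqr([41, 48, 55, 60, 66]))
print(chi_square([[10, 20], [20, 10]]))
```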

OS and LFS were estimated using the Kaplan-Meier method. LFS was calculated until the date of first relapse, death from any cause, or the last follow-up for patients in morphological CR. CIR and non-relapse mortality (NRM) were estimated using cumulative incidence and competing risk analyses. Death was used as a competing risk for CIR, and relapse was used as a competing risk for NRM. The log-rank test was used for univariate analysis of LFS and OS, and Gray’s test for cumulative incidence. The impact of the main variable of interest (Allo-HCT Refined ELN 2022 risk groups) on transplant outcomes was explored using univariate (UVA) and multivariate (MVA) Cox regression analyses. The baseline risk factors included in each of the multivariable models were those selected on clinical judgment prior to the analysis and those with a p value lower than 0.1 in the UVA. All p values were two-sided, with statistical significance evaluated at the 0.05 alpha level. All statistical analyses were performed with R statistics version 4.3.2 (R Core Team, R Foundation for Statistical Computing, Vienna, Austria).
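To make the two estimators concrete, the sketch below implements a plain Kaplan-Meier estimate and a nonparametric cumulative incidence estimate with competing risks (the Aalen-Johansen form) in standard-library Python. It is a didactic sketch, not the R code used for the analyses; the function names, the cause coding (0 = censored, 1 = relapse, 2 = death without relapse) and the six-patient data set are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival)] at each event time.
    events: 1 = event observed, 0 = censored."""
    surv, curve = 1.0, []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if d:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
    return curve


def cumulative_incidence(times, causes, cause):
    """Cumulative incidence of `cause` in the presence of competing risks.
    causes: 0 = censored, any other integer = type of first event."""
    event_free = 1.0  # probability of being event-free just before t
    cif, curve = 0.0, []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        d_cause = sum(1 for x, c in zip(times, causes) if x == t and c == cause)
        d_any = sum(1 for x, c in zip(times, causes) if x == t and c != 0)
        if d_cause:
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        event_free *= 1 - d_any / at_risk
    return curve


# Hypothetical follow-up times (months) and first-event causes.
times = [2, 3, 3, 5, 8, 10]
causes = [1, 2, 0, 1, 0, 2]

lfs = kaplan_meier(times, [1 if c else 0 for c in causes])
cir = cumulative_incidence(times, causes, cause=1)
nrm = cumulative_incidence(times, causes, cause=2)
print(lfs, cir, nrm, sep="\n")
```

In this formulation CIR and NRM add up to one minus LFS at any time point, which is why the two are estimated jointly rather than by censoring the competing event.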

Results

AML and allo-HCT baseline information

All baseline characteristics are displayed in Table 1. Overall, 651 patients were included in the study, with a median age of 55 years (interquartile range (IQR): 44–62). Males comprised 52.7% (n = 343) of the patients, and 15.4% (n = 100) had an HCT-CI > 3. Most patients were diagnosed with de novo AML (86.6%, n = 564), and 13.4% (n = 87) with secondary or therapy-related AML. All patients were treated with at least one line of anthracycline-based induction therapy, and 84.9% (n = 553) of them underwent allo-HCT in CR1.

Table 1 Baseline characteristics.

Regarding allo-HCT characteristics, 61.3% (n = 399) of the patients received myeloablative conditioning (MAC) regimens, while 38.7% (n = 252) underwent allo-HCT using RIC protocols. Allo-HCT was performed from matched sibling donors (MSD) in 32.7% (n = 213) of the patients, from 10/10 HLA-matched unrelated donors (MUD) in 31.2% (n = 203), from 9/10 mismatched unrelated donors (MMUD) in 7.5% (n = 48), and from haploidentical donors in 27.1% (n = 187). GVHD prophylaxis with post-transplant cyclophosphamide (PTCY) platforms was administered to 51.9% (n = 334) of the patients.

Allo-HCT refined ELN 2022 risk classification

According to the Allo-HCT Refined ELN 2022 risk classification, 19.4% (n = 126) of the patients were categorized as Fav, 38.1% (n = 248) as Int, 27.2% (n = 177) as Adv* and 15.4% (n = 100) as AdvP. As reported in Table 1, median age (58 and 57 vs. 53 and 48 years, p < 0.001) and the proportion of patients with secondary or therapy-related AML (15% and 21.7% vs. 9.3% and 4.2%, p < 0.001) were higher in the AdvP and Adv* groups than in the Fav and Int groups. In contrast, the proportions of patients undergoing allo-HCT in CR2 or beyond (41.3% vs. 10.9%, 6.2% and 8%, p < 0.001) and with positive MRD status (52.3% vs. 31.9%, 33.9% and 26%, p < 0.001) were higher in the Fav group than in the Int, Adv* and AdvP groups.
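As a quick arithmetic check, the group percentages can be recomputed from the reported counts. The snippet uses only the four counts given above; the variable names are illustrative.

```python
# Patients per Allo-HCT Refined ELN 2022 risk group, as reported in the text.
counts = {"Fav": 126, "Int": 248, "Adv*": 177, "AdvP": 100}

total = sum(counts.values())
shares = {group: round(100 * n / total, 1) for group, n in counts.items()}

print(total)   # 651
print(shares)  # {'Fav': 19.4, 'Int': 38.1, 'Adv*': 27.2, 'AdvP': 15.4}
```

The four groups add up to the 651 patients of the full cohort, and the rounded shares match the percentages reported.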

Key allo-HCT features, including conditioning intensity, donor type, and GVHD prophylaxis, were balanced among the different groups, with no significant differences observed.

Allo-HCT refined ELN 2022 and ELN 2022 risk classification outcomes

The median follow-up for LFS of the study cohort was 44.2 months (95% CI: 37–49). During follow-up, 28.4% (n = 185) of the patients relapsed and 34.5% (n = 225) died, with relapse being the most frequent cause of death (50.7%, n = 109). At 5 years, OS and LFS rates were 63.4% (95%CI: 59.5–67.5) and 55.8% (95%CI: 51.9–60.1), respectively (Fig. 1a). The 5-year CIR of the entire cohort was 30.2% (95%CI: 26.8–33.4), with a 5-year NRM rate of 13.9% (95%CI: 11.9–15.9) (Table 2).

Fig. 1: Survival outcomes.

Overall survival and leukemia-free survival of all patients included in the study (a). Overall survival (b) and leukemia-free survival (c) of patients according to the Allo-HCT Refined ELN 2022 risk classification.

Table 2 Main outcome and relevant morbidity estimators.

Post-transplant outcomes were analyzed according to the ELN 2022 risk classification and the Allo-HCT Refined ELN 2022 risk classification. The ELN 2022 risk classification discriminated survival between the Int and Adv risk groups (p < 0.001), with 5-year OS rates of 70% (95%CI: 64–76) and 52% (95%CI: 46–59), respectively, while there was no difference between the Fav and Int risk groups (74% vs. 70%, p = 0.32). LFS showed the same pattern, with clear discrimination between the Int and Adv risk groups (p < 0.001) and no significant difference between the Fav and Int risk groups (p = 0.41) (Supplementary Data, Fig. 1).

According to the Allo-HCT Refined ELN 2022 risk classification, patients classified into the AdvP group had different outcomes from those in the other groups. They had lower OS and LFS (5-y OS: 28.4% (95%CI: 20.2–39.9), 5-y LFS: 24.3% (95%CI: 16.8–35.2), both p < 0.001) (Fig. 1b, c) and higher CIR (5-y CIR: 64.3% (95%CI: 54.5–74.1), p < 0.001) (Fig. 2). Patients allocated to the Adv* group had similar OS (5-y OS: 66.7% (95%CI: 59.5–74.7) vs. 70.2% (95%CI: 64.5–76.4), p = 0.69) and LFS (5-y LFS: 55.9% (95%CI: 48.4–64.7) vs. 63.8% (95%CI: 57.9–70.4), p = 0.33) to those classified into the Int risk group. NRM rates showed no differences among the risk groups (AdvP 5-y NRM: 11.6% (95%CI: 5.1–18.1), p = 0.523).

Fig. 2: Cumulative incidence of non-relapse mortality and relapse in the patients of this study according to the Allo-HCT Refined ELN 2022 risk classification.

Cumulative incidence of non-relapse mortality and relapse at 2 years (a) and at 5 years (b). 1: Favorable risk, 2: Intermediate Risk, 3: Adverse Risk, 4: Adverse Plus Risk.

The restricted analysis for patients allografted in CR1 confirmed these poor outcomes for the AdvP risk group (5-y OS: 29.5% (95%CI: 21–41.5), 5-y LFS: 25.1% (95%CI: 17.3–36.3), both p < 0.001) (Supplementary Data, Fig. 2).

The predictive power of the Allo-HCT Refined ELN 2022 risk classification for OS and LFS was further investigated using MVA (Fig. 3). It confirmed that patients in the AdvP group had lower OS (HR: 3.05, p < 0.001) and LFS (HR: 2.66, p < 0.001) than the rest of the subgroups, and that patients classified into the Adv* risk group had similar outcomes to those included in the Int risk group (OS (HR: 1.02, p = 0.90), LFS (HR: 1.17, p = 0.37)).

Fig. 3: Multivariate analysis of survival outcomes according to the Allo-HCT Refined ELN 2022 risk classification.

Overall survival (a) and leukemia-free survival (b). PTCY post-transplant cyclophosphamide, AML acute myeloid leukemia, MRD measurable residual disease, CR complete response.

In addition, the MVA identified two further significant risk factors for OS and LFS: age older than 55 years (OS (HR: 1.72, p = 0.002), LFS (HR: 1.41, p = 0.011)) and positive MRD status at allo-HCT (OS (HR: 1.49, p = 0.007), LFS (HR: 1.51, p = 0.002)). Lastly, the use of PTCY-based prophylaxis was associated with improved LFS (HR: 0.67 (95%CI: 0.5–0.91), p = 0.01) (Supplementary Data, Figs. 3 and 4).
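A reported hazard ratio, its confidence interval and its p value can be cross-checked against each other. The sketch below does this for the PTCY estimate, assuming a Wald-type 95% interval that is symmetric on the log-hazard scale; this is an assumption about how the interval was derived, and the function name `wald_from_ci` is hypothetical. Because the published bounds are rounded, the result is only approximate.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error of log(HR), the z statistic and the
    two-sided p value from a hazard ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return se, z, p


# PTCY-based prophylaxis and LFS: HR 0.67 (95% CI 0.5-0.91).
se, z, p = wald_from_ci(0.67, 0.5, 0.91)
print(round(se, 3), round(z, 2), round(p, 2))
```

Under this assumption the recovered p value rounds to 0.01, in line with the value reported for the PTCY estimate.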

Fig. 4: Multivariate analysis of survival outcomes in patients included in the Adverse-Plus risk group.

Overall survival (a) and leukemia-free survival (b). PTCY post-transplant cyclophosphamide, AML acute myeloid leukemia, MRD measurable residual disease, CR complete response.

Subanalysis of the Adverse-Plus risk group

Further investigations were conducted in the 100 patients included in the AdvP group, based on the genetic abnormalities identified at diagnosis. Among these patients, 22% (n = 22) had MECOM(EVI1) rearrangements, and 78% (n = 78) had a CK and/or mutated TP53/del(17p).

During follow-up, 62% (n = 62) of the patients relapsed and 67% (n = 67) died. Patients with MECOM(EVI1) rearrangements had better two-year OS than those with CK and/or TP53 mutated/del(17p) (2-y OS rate: 51% vs. 34%, p = 0.044), and there was also a trend towards longer LFS (2-y LFS: 40% vs. 30%, p = 0.07) (Supplementary Data, Fig. 5). MVA confirmed the UVA results for OS, with patients having CK, TP53 mutations or del(17p) showing a higher risk of death than those with MECOM(EVI1) rearrangements (HR: 2.39 (95%CI: 1.1–5.2), p = 0.027). There was also a trend towards shorter LFS (HR: 1.91 (95%CI: 0.63–3.9), p = 0.067).

MVA also showed worse outcomes in patients older than 55 years (OS (HR: 2.16 (95%CI: 1.15–4.1), p = 0.017), LFS (HR: 1.91 (95%CI: 1.05–3.5), p = 0.033)). Interestingly, a negative pre-transplant MRD status did not show a significant survival benefit in this subgroup, either for OS (HR: 1.31 (95%CI: 0.95–2.7), p = 0.349) or for LFS (HR: 1.56, p = 0.11) (Fig. 4).

Discussion

This multicenter retrospective study validates the prognostic value of the Allo-HCT Refined ELN 2022 risk classification for AML patients treated with anthracycline-based induction therapies and undergoing allo-HCT. These findings highlight the enhanced value of this refined classification, particularly in distinguishing outcomes within the broad adverse risk group of the ELN 2022 guidelines.

One of the most important contributions of this study is the reclassification of patients from the ELN 2022 Adv category into two distinct risk groups: Adverse* (Adv*) and Adverse-Plus (AdvP). This refinement is critical, as it highlights the poor prognosis of patients in the AdvP group, namely those with complex karyotype [20], MECOM(EVI1) rearrangement [21], and/or TP53 mutation/del(17p) [22,23,24,25]. Patients in the AdvP subgroup showed significantly worse outcomes compared to the Adv* group, with a median OS of only 12 months post-transplant and 5-year OS and LFS rates of 28.4% and 24.3%, respectively. These results align with existing literature, which has consistently demonstrated the aggressive nature of these genetic abnormalities and their association with a higher risk of relapse after the procedure [11, 12].

Within the AdvP risk group, patients with MECOM(EVI1) rearrangements had relatively better outcomes compared to those with CK and/or TP53 mutations or del(17p). The 2-year OS rate for MECOM(EVI1) rearranged patients was 51% versus 34% for CK/TP53m patients (p = 0.044). However, despite this apparent difference, the overall prognosis of all patients in the AdvP group remains poor, and there is an urgent need for more personalized and intensive therapeutic strategies in these patients.

Interestingly, patients classified into the Adv* subgroup had similar outcomes to those in the Int group. This suggests that the Allo-HCT Refined ELN 2022 classification stratifies risk more accurately, revealing that some patients previously classified as high-risk may actually have more favorable outcomes after allo-HCT. These findings align with recent evidence showing that patients with the most common abnormalities within the Adv* group, such as myelodysplasia-related abnormalities [26] and KMT2A rearrangements [27], have better outcomes when transplanted in first complete remission.

Furthermore, the Allo-HCT Refined ELN 2022 classification identifies a subgroup of patients (AdvP) with particularly dismal outcomes after allo-HCT. These findings indicate that the benefit of allo-HCT consolidation in AdvP patients is possible but uncertain, and highlight the urgent need for more effective strategies both before and after transplantation. Despite the poor outcomes for AdvP patients, allo-HCT remains the only potentially curative option, particularly for younger individuals. Improving the response prior to allo-HCT with new frontline treatments, promoting a stronger graft-versus-leukemia effect through maintenance therapies, or implementing preemptive treatment after molecular relapse could help increase survival rates in this challenging subgroup.

Patients over 55 years and those with positive MRD status prior to allo-HCT had lower OS and LFS. The higher proportion of older patients in the Adv and Int groups may partially explain their worse prognosis, as well as the fact that older adults are often treated with RIC regimens, which, while improving safety, may be associated with higher relapse rates.

MRD-positive status, measured according to the ELN-DAVID MRD 2021 recommendations, was found to be a predictor of disease relapse in the global MVA [28, 29]. However, MRD-positive status prior to allo-HCT was not associated with worse outcomes in patients classified into the AdvP subgroup. This result may suggest that the genetic features of the AdvP subgroup confer such a poor prognosis that MRD status becomes less relevant. However, no molecular markers (e.g., NPM1 [30] and CBF mutation) were used to assess MRD, limiting the whole analysis to multiparameter flow cytometry (MFC). Recent evidence highlights that more sensitive MRD techniques in this context, such as NGS-MRD [31, 32], DNA sequencing [33,34,35] or ddPCR analysis of clonal hematopoiesis-linked mutations [36], may better define the AML response after treatment, guide therapeutic options, and support decision making.

Finally, the use of PTCY-based prophylaxis was associated with improved LFS, supporting its ongoing use in clinical practice. The potential benefit of PTCY may stem from a lower risk of severe chronic GVHD or a shorter duration of immunosuppressive therapy, potentially reducing relapse risk. This observation suggests that PTCY may be particularly advantageous for high-risk patients, as it could enable quicker immune reconstitution and a stronger graft-versus-leukemia effect, thereby improving outcomes [37].

There are limitations in the study that should be mentioned. First, the retrospective nature of the study and the limited number of patients in some subgroups may have impacted the robustness of the findings. Second, the MRD measurement heterogeneity across institutions limited the ability to thoroughly assess its impact on outcomes in the AdvP group. Lastly, there is a lack of information regarding the impact of new findings on disease biology, such as DDX41 mutations [38,39,40], or the effects of lower-intensity induction treatments such as venetoclax and azacitidine [41], which have been increasingly used in clinical practice and may affect outcomes in the future.

To conclude, this study validates the Allo-HCT Refined ELN 2022 classification for prognostic stratification of AML patients undergoing allo-HCT, particularly by identifying the AdvP subgroup, which requires special attention due to its significantly poorer outcomes [12]. This refined classification provides clinicians with a more precise tool for anticipating allo-HCT outcomes and may help physicians personalize treatment strategies for patients allocated to the AdvP risk group in order to prevent post-transplant relapse and improve survival.