Introduction

Acute myeloid leukemia (AML) currently represents the most frequent indication for hematopoietic stem cell transplantation (HSCT) [1], especially for patients with an intermediate or poor risk profile in first complete remission (CR) [2], and HSCT is the sole potentially curative treatment for relapsed/refractory disease [3]. For decades, the combination of busulfan and cyclophosphamide (BuCy2) has been the standard-of-care conditioning regimen for HSCT in AML patients, showing favorable outcomes when compared to the combination of cyclophosphamide and total body irradiation (TBI) [4, 5]. However, pivotal studies incorporating this regimen showed significant rates of non-relapse mortality (NRM), which make this therapy difficult to apply in patients older than 40 years. Reduced-intensity conditioning regimens are associated with lower NRM, but this benefit could be offset by an increased incidence of relapse [6]. Therefore, new conditioning regimens were developed to maintain anti-leukemic and myeloablative properties while reducing therapy-related toxicity, making them suitable also for older or comorbid AML patients. These are generally referred to as reduced toxicity myeloablative conditioning regimens. Among them, the combination of busulfan and fludarabine (BuFlu) represents a prototype [7]. Early experiences showed the efficacy of this combination in AML patients [8], although non-randomized studies yielded conflicting data on disease relapse when compared to BuCy2 [9, 10]. The AML-R2 trial was a phase 3 trial planned and conducted by the Gruppo Italiano Trapianto di Midollo Osseo (GITMO) comparing BuFlu and BuCy2 as myeloablative conditioning regimens in AML patients undergoing allogeneic stem cell transplantation (ClinicalTrials.gov identifier: NCT01191957) [11]. The primary endpoint was 1-year NRM, which proved significantly lower in the BuFlu arm (7.9% vs. 17.2% in the BuCy2 arm, p = 0.026). There was no difference between the two arms in terms of cumulative incidence of relapse (CIR), leukemia-free survival (LFS), or overall survival (OS) at 1 and 2 years after transplant. Grade III–IV acute graft-versus-host disease (GvHD) was significantly more frequent in the BuCy2 group (12 patients [10%]) than in the BuFlu group (3 patients [2%]). Here, we present the long-term follow-up analysis of this clinical trial and provide a sub-analysis on patients older than 51 years.

Methods

Patients’ study population and trial design

Details concerning eligibility criteria and study treatments were reported in the original publication [11]. Briefly, patients were eligible if they were aged 40–65 years, had AML in first or second CR, had an Eastern Cooperative Oncology Group (ECOG) performance status less than 3, and had a sibling or matched unrelated donor with molecular high-resolution typing of HLA-A, B, C and DRB1 (one antigen/allele disparity in class I or one allele disparity in class II, DRB1, was acceptable). Key exclusion criteria were AML with t(15;17) or PML/RARα-positive acute promyelocytic leukemia, or AML with t(8;21)(q22;q22), inv(16) or t(16;16)(p13;q22) without additional adverse cytogenetic abnormalities; a previous HSCT; the presence of comorbidities that would preclude HSCT or the ability to provide truly informed consent; and an active neoplasm or a history of malignancy within 2 years before enrollment. Both treatment groups received the same intravenous myeloablative dose of busulfan, 0.8 mg/kg four times per day for four consecutive days. In the BuCy2 group, busulfan was combined with a standard dose of cyclophosphamide, 60 mg/kg per day for two consecutive days (on days −4 and −3, total dose 120 mg/kg); in the BuFlu group, cyclophosphamide was replaced by fludarabine 40 mg/sqm per day for four consecutive days (from day −6 through day −3, total dose 160 mg/sqm). Transplant was performed on day 0, with patients receiving either bone marrow or granulocyte colony-stimulating factor (G-CSF) mobilized peripheral blood stem cells (PBSC). GvHD prophylaxis was based on cyclosporine-A (1.5 mg/kg twice per day, starting from day −1) and methotrexate (15 mg/sqm on day 1, 10 mg/sqm on days 3, 6, 11); in case of unrelated donors, anti-thymocyte globulin (ATG) was given at a total dose of 5 mg/kg (or 7.5 mg/kg in case of an acceptable HLA disparity).

For this long-term analysis, after obtaining written informed consent, we evaluated long-term NRM, LFS, CIR, OS, chronic GvHD-free and relapse-free survival (GRFS), and the incidence of chronic GvHD (cGvHD) of any grade. We also searched for long-term adverse events, especially second neoplasms, which were defined as invasive cancers, excluding basal cell and in situ skin cancers [12]. Outcomes were evaluated from the day of transplant in the intention-to-treat population, for the entire study population and then further stratified according to the median age (51 years). The choice of analyzing this subpopulation is based on the paucity of data from randomized trials concerning the use of myeloablative conditioning regimens for AML transplantation in older adults. The study was approved by the Bergamo ethics committee (n. 61/2022) and conducted in accordance with the Declaration of Helsinki.

Statistical analysis

All clinical outcomes were calculated from transplant to the first event or the last follow-up. Non-relapse mortality (NRM) was defined as death from any cause not subsequent to relapse. Overall survival (OS) was defined as the time from transplant to death from any cause. Leukemia-free survival (LFS) was defined as the time from transplant to disease relapse or death from any cause, whichever came first. Chronic GvHD-free and leukemia-free survival (GLFS) was defined as the time from transplant to disease relapse, chronic GvHD or death from any cause, whichever came first. Disease relapse was defined as hematological relapse (blast counts >5% in bone marrow). NRM, cumulative incidence of relapse (CIR) and chronic GvHD incidence were estimated using the cumulative incidence function, considering relapse as a competing event for NRM and death as a competing event for the other incidences; Gray's non-parametric test was used to assess group differences. OS, LFS and GLFS were estimated using the Kaplan-Meier method, and the log-rank test was applied to test differences between groups. Univariate analysis was performed by fitting Fine and Gray models for cumulative incidences and Cox models for survival outcomes. Hazard ratios with 95% confidence intervals were reported. The significance level was set at 0.05. All analyses were performed with R software (version 4.0.0).
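To illustrate why cumulative incidences are estimated with a competing-risks approach rather than as one minus a Kaplan-Meier curve, the following is a minimal sketch in Python. It is not the code used for the trial (the analyses were run in R); the function names `kaplan_meier` and `cumulative_incidence`, the cause codes and the six-patient toy dataset are all assumptions made up for illustration. The sketch uses the standard non-parametric (Aalen-Johansen type) cumulative incidence estimator and does not implement Gray's test or Fine and Gray regression.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each event time.
    events[i] is truthy if the event occurred at times[i], falsy if censored."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

def cumulative_incidence(times, causes, cause):
    """Non-parametric cumulative incidence of `cause` with competing events.
    causes[i] is 0 if censored, otherwise an integer cause code."""
    event_times = sorted({t for t, c in zip(times, causes) if c != 0})
    surv_prev, cif, curve = 1.0, 0.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        # Probability of being event-free just before t, times the
        # cause-specific hazard at t.
        cif += surv_prev * d_cause / at_risk
        surv_prev *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve

# Hypothetical toy data (years from transplant); NOT trial data.
RELAPSE, NRM = 1, 2            # assumed cause codes, 0 = censored
times = [1, 2, 3, 4, 5, 6]
causes = [RELAPSE, NRM, 0, RELAPSE, NRM, 0]

cir = cumulative_incidence(times, causes, RELAPSE)[-1][1]
nrm = cumulative_incidence(times, causes, NRM)[-1][1]
lfs = kaplan_meier(times, [c != 0 for c in causes])[-1][1]
# Naive approach: treat relapse as censoring and take 1 - KM.
naive_nrm = 1.0 - kaplan_meier(times, [c == NRM for c in causes])[-1][1]

print(f"CIR={cir:.3f}  NRM={nrm:.3f}  LFS={lfs:.3f}  naive NRM={naive_nrm:.3f}")
```

In this toy example CIR, NRM and LFS sum to one at the last event time, whereas the naive estimate, which censors patients at relapse, overstates NRM. This is the reason relapse is handled as a competing event for NRM.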

Results

From January 3, 2008, to December 20, 2012, 252 AML patients with a median age of 51 years were randomized 1:1 to BuCy2 (n = 125) or BuFlu (n = 127). Treatment was delivered to 121 patients in the BuCy2 arm and 124 patients in the BuFlu arm; baseline patient characteristics are reported in Table 1. The median follow-up for the study population was 6 years (range 0.03–13). The NRM for the entire study population remained significantly different up to 4 years after transplant (10% in the BuFlu arm vs. 20% in the BuCy2 arm, p = 0.0388). The NRM was not different for patients younger than 51 years (10% in BuFlu vs. 14% in BuCy2, p = NS), but it was significantly different in patients older than 51 years (11% in BuFlu vs. 27% in BuCy2, p = 0.0262) (Fig. 1).

Table 1 Baseline patients’ and transplant characteristics.
Fig. 1: Non-relapse mortality.

Non-relapse mortality (NRM) at 4 and 10 years after transplantation in the entire study population (A); non-relapse mortality at 4 and 10 years in patients younger than 51 years (B); non-relapse mortality at 4 and 10 years in patients older than 51 years (C).

Moreover, patients older than 51 years in the BuCy2 arm had an incidence of non-relapse death of 23% in the first year after transplant, with infections being the leading cause of death (14.3%), exceeding even disease relapse (10.7%); by contrast, in this age group, non-relapse death in the BuFlu arm was 9.5% in the first year. Death events in the first year after transplant for patients older than 51 years are summarized in Fig. 2.

Fig. 2: Causes of death.

Death events in patients older than 51 years in the first year after transplant.

Patients older than 51 years receiving BuFlu also experienced fewer organ-specific toxicities, which are early-onset adverse events strictly related to the conditioning regimen itself. As shown in Table 2, patients receiving BuCy2 had a higher number of toxic events of any grade, especially grade 2–3 (27 events in the BuCy2 arm vs. 17 events in the BuFlu arm).

Table 2 Conditioning-related toxicities in the first month after transplant in patients older than 51 years.

We observed 46 relapses in the BuCy2 arm and 53 in the BuFlu arm, with most relapses diagnosed in the first 2 years after transplant; 8% of relapses occurred later than 5 years after transplant (3 in the BuCy2 arm, 5 in the BuFlu arm).

The CIR was not statistically different between the two arms (39% in the BuFlu arm vs. 37% in the BuCy2 arm, Fig. 3). Similarly, we did not observe any difference in LFS (45% vs. 38% in the BuFlu and BuCy2 arms, respectively) or OS (45% in both arms); curves are shown in Fig. 4.

Fig. 3: Cumulative incidence of relapse.

A Cumulative incidence of relapse (CIR) at 10 years in the entire study population; B Cumulative incidence of relapse at 10 years in patients younger than 51 years; C Cumulative incidence of relapse at 10 years in patients older than 51 years.

Fig. 4: Leukemia Free and Overall Survival.

Leukemia-free survival (LFS) at 10 years in the entire study population (A); Leukemia-free survival in patients younger than 51 (B); Leukemia-free survival in patients older than 51 (C); Overall survival (OS) at 10 years in the entire study population (D); Overall survival in patients younger than 51 (E); Overall survival in patients older than 51 (F).

We also did not observe a difference in cumulative incidence of moderate/severe cGvHD (15% in the BuFlu arm vs. 22% in the BuCy2 arm) and GRFS in the BuFlu arm vs. the BuCy2 arm (25% vs. 20% at 4 years; 20% vs. 17% at 10 years, respectively).

During the entire available follow-up, we observed a total of 137 deaths, 66 in the BuCy2 arm and 71 in the BuFlu arm. Disease relapse was the main cause of death, occurring in 88 patients (37 in the BuCy2 arm vs. 51 in the BuFlu arm), while non-relapse events leading to death occurred in 49 patients (29 in the BuCy2 arm vs. 20 in the BuFlu arm). In the original trial, there were 36 NRM events, 23 in the BuCy2 arm and 13 in the BuFlu arm (p = 0.060). Long-term analysis showed 13 new NRM events, 5 in the BuCy2 arm (2 for second neoplasm, 2 for infection, 1 for other unspecified reason) and 8 in the BuFlu arm (5 for second neoplasm, 2 for infection, 1 for cardiovascular disease). Among these 13 deceased patients, 6 had concomitant moderate/severe cGvHD. Table 3 summarizes all death events occurring during the entire available follow-up.

Table 3 Death events along the entire available post-transplant follow-up.

We detected a total of 19 secondary neoplasms, 6 in BuCy2 arm and 13 in BuFlu arm. Histotypes are detailed in Supplementary Table 1. In 7 cases, the second tumor was the cause of death (n = 5 in the BuFlu and n = 2 in the BuCy2 arm).

Discussion

The AML-R2 study, planned and conducted by the GITMO group, showed a reduction in NRM in the BuFlu arm compared to the BuCy2 arm, with no detrimental effects on disease relapse. This long-term analysis aimed to confirm and further evaluate these findings throughout the available follow-up. We showed that the benefit of NRM reduction is sustained up to 4 years after transplant and that the incidence of late relapses does not offset this advantage. Moreover, NRM remains halved in the BuFlu arm, as initially hypothesized and shown in the clinical trial 1 year after transplant. NRM remained lower in the BuFlu arm even 10 years after transplant but, owing to the occurrence of late non-relapse deaths in the BuFlu arm (mainly infections and second neoplasms), statistical significance was lost. Our data are in line with the analysis led by Fasslrinner et al., who analyzed the long-term outcomes of a randomized trial comparing two TBI-containing conditioning regimens in HSCT for AML and showed that NRM at 10 years was not significantly different between the two study arms [13]. Indeed, many registry analyses with follow-up close to 10 years showed that patients undergoing HSCT have an increased risk of death compared to the general age-matched population for the sole reason of having received an HSCT, independently of the type of conditioning regimen and the disease for which the patient was transplanted [14].

The sub-analysis on patients older than 51 years showed that the overall benefit of BuFlu is due to NRM reduction in this specific population. Randomized data on myeloablative conditioning in patients above 51 years are rather sparse. Indeed, most of the available clinical data on myeloablative conditioning come from non-randomized studies in younger patients. Our analysis shows the feasibility of the myeloablative BuFlu regimen even in adults older than 51 years, with an NRM similar to that of younger patients and an acceptable safety profile. Our results also confirm that the myeloablative BuFlu regimen allows an optimal balance between anti-leukemic activity and toxicity [15], with NRM comparable to that reported by recently published randomized trials comparing reduced-intensity conditioning regimens [16, 17].

A recent retrospective analysis of the Acute Leukemia Working Party of the European Society for Blood and Marrow Transplantation (EBMT) compared outcomes of a myeloablative combination of treosulfan and fludarabine (FT14) with BuFlu in patients with AML stratified by age older or younger than 55 years. With the limitations of a retrospective analysis and a relatively short follow-up, FT14 conditioning was associated with a greater incidence of relapse and shorter LFS in patients younger than 55, while all the other outcomes were similar in patients older than 55. Limiting the analysis to patients receiving BuFlu, 2-year NRM was 7.7% and CIR was 27.5% [18]. Another recent publication of the Acute Leukemia Working Party of the EBMT retrospectively compared BuFlu with a myeloablative regimen based on TBI 12 Gy and fludarabine, showing no difference between these two conditioning regimens, with a 2-year NRM and CIR in the BuFlu arm of 10.9% and 29.4%, respectively [19]. Collectively, these findings from an international registry indicate that real-life outcomes with BuFlu closely match those observed in the clinical trial.

A very recent randomized phase 3 trial from a Chinese group compared BuFlu and BuCy2 in the haploidentical setting. The 1-year NRM was 7.2% in BuFlu vs. 14.1% in BuCy2, whereas the 5-year relapse incidence was 17.9% and 14.2%, respectively. This study suggested that outcomes with BuFlu are independent of the donor type and of the GvHD prophylaxis subsequently adopted. Moreover, the authors found a reduced incidence of grade 3 regimen-related toxicities (0% in BuFlu vs. 9% in BuCy2), thus confirming that BuFlu is less toxic than BuCy2, even in patients receiving post-transplant cyclophosphamide [20].

The long-term analysis of the AML-R2 trial showed that disease relapse nearly reaches 40% in both arms, with most relapses occurring within the first 2 years, in line with other reports [13, 21]. This finding prompts some considerations. First, the prolonged reduction in NRM is not offset by increased disease relapse, and this is sustained even at a very long follow-up. In addition, disease relapse remains the leading cause of death in both arms and up to 8% of relapses occur later than 5 years after transplant, meaning that the treating physician should always be aware of the risk of relapse, even at a very long follow-up, and that any suspected relapse must be thoroughly investigated. Finally, the risk definition according to the ELN classification was not associated with a major impact on late relapses. It should be noted that, when the trial was conducted, post-transplant therapies (e.g., FLT3 inhibitors) were not available, and this might have played a role in the high incidence of relapse we observed in both arms. Innovative post-transplant therapies are urgently needed.

We observed a non-significant increase in relapse-related deaths in the BuFlu arm (53 relapses with 51 deaths) compared with the BuCy2 arm (46 relapses and 39 deaths). This finding was already outlined in the original clinical trial publication and was potentially linked to a deeper immunosuppressive microenvironment. However, the number of relapses does not allow us to draw a definitive conclusion, and the mechanisms underlying late relapses are probably much more complex and likely unrelated to the conditioning regimen. Nonetheless, most patients who relapsed after either BuFlu or BuCy2 unfortunately died. This confirms that the probability of prolonged leukemia control when transplant fails after myeloablative conditioning remains very limited.

The combination of late non-relapse and relapse-associated deaths in the BuFlu arm explains the absence of any meaningful difference in the OS observed in the two randomized arms.

Concerning long-term toxicities, we focused on second neoplasms, whose incidence was 8%. The incidence of second neoplasms in the entire cohort is in line with data showing a progressive rise when an appropriate long-term follow-up analysis is conducted [22]. We observed a slight increase in second neoplasm incidence in the BuFlu arm, although the absolute number of events is too low to reach statistical significance. The deeper immunosuppressive activity of fludarabine could also be related to this late development of second neoplasms. Nonetheless, carcinogenesis is a complex process in which many other transplant-related features could play a role, mainly cGvHD with all the related immunosuppressive therapy. In addition, retrospective studies did not find any difference between RIC and MAC regimens in the development of second neoplasms [23, 24], further suggesting that the effect of the type of conditioning on second neoplasms is not straightforward.

This long-term analysis has some limitations. First, these long-term follow-up data were collected retrospectively and were merged with data from the clinical trial, which were collected prospectively. Second, this long-term analysis was not pre-planned, so information concerning long-term follow-up has been collected according to the local clinical practice of the participating centers. Finally, the clinical trial was not designed to identify differences according to the median age of the study population.

Nonetheless, we confirmed a sustained reduction in NRM when myeloablative BuFlu is used as a conditioning regimen in adult AML. Maintaining a low NRM, as BuFlu does, is a central point because the indication to proceed to transplant largely depends on an accurate evaluation of the benefit potentially associated with the lower risk of relapse granted by the transplant, weighed against the risk of death due to the transplant itself [25].

Based on these long-term results from a randomized multicentre study, we believe that this preparative regimen should be considered as a standard of care conditioning regimen for AML patients undergoing HSCT, even in the haploidentical setting. The possibility to use this conditioning regimen can be extended to a selected group of AML patients older than 51 years, for whom a myeloablative regimen could provide a more active anti-leukemic effect [6, 26]. For these patients, BuFlu is associated with a reduction in early conditioning-related organ toxicities.

Since the low NRM has been confirmed after a long-term follow-up, this offers the possibility to plan effective innovative post-transplant therapies to tackle the risk of disease relapse, which remains the major unmet clinical need after transplantation.