Introduction

Allogeneic hematopoietic cell transplantation (allo-HCT) is currently the only treatment with curative potential for patients with myelofibrosis (MF) [1, 2]. Nonetheless, allo-HCT in MF poses several challenges due to the risks of delayed engraftment, relapse and substantial treatment-related toxicity [3]. The majority of MF patients are older individuals with comorbidities and virtually all patients undergo the procedure with ‘active’ disease, making this population extraordinarily demanding in terms of allo-HCT. Several variables must be taken into account when considering patients’ eligibility for the procedure, such as age, performance status, and molecular and cytogenetic risk, most of which are largely unmodifiable [4]. One of the few modifiable variables, however, is the choice of conditioning regimen. Classically, regimen intensity was defined as either standard myeloablative (MAC) or reduced intensity (RIC) [5]. However, with the introduction of modifications in preparative regimens, the traditional classification became ambiguous. Recent retrospective studies show that the choice of either MAC or RIC regimens has no impact on outcomes in MF patients undergoing allo-HCT [6, 7]. In this study, we applied the Transplant Conditioning Intensity (TCI) index, an objective tool to define regimen intensity that was originally developed in acute myeloid leukemia (AML) patients, to MF allo-HCT patients and explored its utility and association with outcomes.

Materials/subjects and methods

Data source

This was a retrospective, multi-center EBMT registry-based study. EBMT is a voluntary organization comprising more than 600 transplant centers from Europe and beyond. Accreditation as a member center requires submission of minimal essential data (patient clinical data, including aspects of the diagnosis and disease, first-line treatments, HCT- or cell-therapy-associated procedures, transplant type, donor type, stem cell source, complications and outcome after the procedure) from all consecutive patients to a central database, which provides a pool of data for EBMT members to perform retrospective studies, assess epidemiological trends, and ultimately contribute to improving healthcare. EBMT centers commit to obtaining informed consent according to the local regulations applicable at the time in order to report pseudonymized data to the EBMT.

Patient selection and endpoints

We selected adult patients undergoing first allo-HCT for primary or post-polycythemia vera (PV)/post-essential thrombocythemia (ET) MF between 2012 and 2021, with the use of peripheral blood or bone marrow-derived hematopoietic cells from HLA-matched related (MRD), unrelated (MUD), or HLA-mismatched (MMUD 9/10) unrelated donors, and with available data on follow-up and conditioning regimen. Patients with blast phase disease were excluded.

The TCI score was calculated for each procedure as described previously [8]. Post-transplant cyclophosphamide (PTCy) was not considered in TCI calculations. For comparison, classification as reported by the centers as MAC or RIC was used.

The primary objective of the study was to assess the association of TCI with outcomes after allo-HCT, including overall survival (OS), progression-free survival (PFS), cumulative incidence of relapse (CIR) and non-relapse mortality (NRM). Secondary endpoints included acute (aGvHD) and chronic graft-versus-host disease (cGvHD), GvHD-free-relapse-free survival (GRFS), primary and secondary graft failure, engraftment and specific causes of death compared to MAC/RIC conditioning groups.

Statistical analysis

Patient-, disease-, and transplant-related variables were expressed as median and interquartile range (IQR) for continuous variables and frequencies and percentages (of all patients with data available) for categorical variables. Groups were compared using the χ2 test for categorical variables and the Kruskal-Wallis test for continuous data. Median follow-up was calculated using the reverse Kaplan–Meier estimator. OS was defined as the length of time from allo-HCT to death, PFS as the length of time from allo-HCT to relapse, progression or death, whichever occurred first. NRM was defined as death before relapse or progression. GRFS was defined as absence of death, relapse or progression, grade III-IV aGvHD or extensive cGvHD. Neutrophil engraftment was defined as the time at which the absolute neutrophil count (ANC) was ≥0.5 × 10^9/L for 3 consecutive days and platelet engraftment as a platelet (PLT) count >20 × 10^9/L for 7 consecutive days without transfusion support. Primary graft failure was defined as no evidence of engraftment, availability of date of no engraftment ≥28 days, or earlier when followed by a second allo-HCT. Secondary graft failure was defined according to the EBMT criteria as the presence of an ANC < 0.5 × 10^9/L occurring after initial engraftment and not related to relapse, infection, or drug toxicity [9].
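As an illustration of the reverse Kaplan–Meier approach to median follow-up mentioned above, the following is a minimal sketch in plain Python. It is not the code used for the study (the analyses were run in R); the function names `kaplan_meier`, `median_time` and `median_follow_up`, as well as the toy data, are hypothetical. The idea is that the roles of death and censoring are swapped, so that the estimated "survival" curve describes the distribution of potential follow-up time.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time with >= 1 event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed exactly at time t
        n_leaving = 0  # everyone leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which the curve is <= 0.5; None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def median_follow_up(times: List[float], died: List[bool]) -> Optional[float]:
    """Reverse Kaplan-Meier: censoring is treated as the event, death as censoring."""
    return median_time(kaplan_meier(times, [not d for d in died]))


# Toy data (hypothetical): follow-up in months and death indicator per patient.
months = [6, 12, 24, 36, 48, 60]
died = [True, False, True, False, False, False]
print(median_follow_up(months, died))
```

In this toy example the median OS is not reached (the ordinary Kaplan–Meier curve never falls to 0.5), while the reverse estimator still yields a median follow-up, which is why it is commonly preferred over the simple median of observed times.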

The Kaplan-Meier estimator was used for OS, PFS and GRFS, and the log-rank test was used to compare differences between groups. The crude cumulative incidence estimator and Gray’s test were used for competing events (relapse together with NRM, aGvHD together with competing events graft failure, 2nd allo-HCT and death before aGvHD, cGvHD together with competing events graft failure, 2nd allo-HCT and death before cGvHD, neutrophil and platelet engraftment together with death before engraftment, primary and secondary graft failure together with death before graft failure). Variables with potential association with endpoints were evaluated in univariable analyses (UVA).
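The crude cumulative incidence estimator for competing events can be sketched as follows. This is a simplified, standard-library Python illustration under stated assumptions, not the study's R code: the function name `cumulative_incidence`, the cause coding (0 = censored, 1 = relapse, 2 = NRM) and the toy data are hypothetical, and Gray's test is not implemented here.

```python
from typing import List, Tuple


def cumulative_incidence(
    times: List[float], causes: List[int], cause: int
) -> List[Tuple[float, float]]:
    """Crude cumulative incidence of `cause`; a cause code of 0 means censored."""
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # probability of being free of ANY event just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = 0  # events of the cause of interest at t
        d_any = 0    # events of any cause at t
        leaving = 0  # all subjects leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            d_cause += int(c == cause)
            d_any += int(c != 0)
            leaving += 1
            i += 1
        if d_any > 0:
            # Increment: P(event-free until t) * cause-specific hazard at t
            cif += event_free * d_cause / n_at_risk
            event_free *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= leaving
    return curve


# Toy data (hypothetical): 1 = relapse, 2 = death without relapse (NRM), 0 = censored.
times = [2, 4, 5, 7, 9, 12]
causes = [1, 2, 0, 1, 2, 0]
relapse = cumulative_incidence(times, causes, cause=1)
nrm = cumulative_incidence(times, causes, cause=2)
print(relapse[-1], nrm[-1])
```

Unlike one minus the Kaplan–Meier estimate with competing events censored, this estimator weights each cause-specific increment by the probability of still being free of any event. As a consequence, the cumulative incidences of relapse and NRM computed on the same data add up to one minus the Kaplan–Meier estimate of PFS at any time point.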

Multivariable analysis (MVA) using (cause-specific) Cox proportional hazards models was run to estimate the association of the TCI index and, in separate models, RIC/MAC conditioning with the endpoints - OS, PFS, relapse, NRM and a/cGvHD. Variables included in MVA apart from conditioning intensity were age at allo-HCT, year of allo-HCT, Karnofsky performance status (KPS), dynamic international prognostic scoring system (DIPSS) at allo-HCT [10], donor type, hematopoietic cell transplantation-specific comorbidity index (HCT-CI) [11], cytomegalovirus (CMV) IgG serostatus, driver mutation genotype and ATG use. Additionally, PTCy use was included in MVA models for a/cGvHD only. All models also included a center random effect.

All statistical tests were two-sided and p-values < 0.05 were considered significant. All analyses were performed in R version 4.4.1 using “survival”, “cmprsk”, “coxme”, and “prodlim” packages.

Results

A total of 2454 patients from 247 centers were included. Overall, 1339 (55%), 987 (40%), and 128 (5%) patients were categorized as receiving low (TCI-low), intermediate (TCI-int), and high (TCI-high) intensity regimens according to the TCI score, whereas 1719 (70%) and 735 (30%) were categorized by the centers as receiving RIC or MAC regimens, respectively. Baseline characteristics of the study group are summarized in Table 1. No significant differences existed between TCI groups according to DIPSS at allo-HCT (p = 0.19), HCT-CI (p = 0.18) and time from MF diagnosis to allo-HCT (p = 0.44). However, groups were significantly different in terms of age (p < 0.0001), KPS (p = 0.001), donor type (p = 0.003) and ATG use (p < 0.0001), with patients conditioned with TCI-high regimens being younger, having a higher KPS and being less frequently transplanted from a MUD. The distribution of the transplant procedures according to the MAC/RIC classification and TCI score is presented in Fig. 1. The most frequently used conditioning regimens in each TCI group are presented in Supplementary Fig. 1. As 207 (15.5%) patients in the TCI-low group were reported by centers as having received MAC and 12 (9.4%) in the TCI-high group were reported as having received RIC, we analyzed the regimens used in those procedures (Supplementary Table 1). Since TCI-high constituted only 5% of the study population, further analyses were performed grouping TCI-int with TCI-high (TCI-int/high).

Fig. 1

Distribution of the patient cohort according to the MAC/RIC classification and the TCI index.

Table 1 Patient characteristics according to the TCI.

Median time to neutrophil engraftment was 18 (IQR 14–23) days and 18 (IQR 14–22) days in TCI-int/high and TCI-low regimens, respectively (p = 0.36). Median time to platelet engraftment was 24 (IQR 17–78) days in patients conditioned with TCI-int/high regimens, and 22 (IQR 15–42) days in TCI-low regimens (p < 0.0001). Median time to neutrophil engraftment was 17 (IQR 14–22) days in patients receiving RIC regimens, and 18 (IQR 15–23) days for MAC regimens (p = 0.03). Median time to platelet engraftment was 23 (IQR 16–49) days in patients receiving RIC regimens, and 23 (IQR 16–56) days for MAC regimens (p = 0.33). Cumulative incidence of primary graft failure in the whole cohort was 1.6% (95% CI 1.1–2.1%), with no significant differences between patients with TCI-int/high and TCI-low (p = 0.37) or when comparing RIC and MAC (p = 0.42). Cumulative incidence of secondary graft failure was significantly higher (p = 0.002) in those with TCI-low compared to TCI-int/high (at 5 years: 8%, 95% CI 7–10% vs. 5%, 95% CI 4–6%, respectively). Similarly, the cumulative incidence of secondary graft failure was significantly higher (p = 0.03) in those undergoing RIC compared to MAC (at 5 years: 7%, 95% CI 6–9% vs. 5%, 95% CI 3–7%, respectively).

Median follow-up after allo-HCT was 46 (IQR 24–72) months. Five-year OS, PFS, CIR and NRM rates were 56% (95% CI 53–58%), 46% (95% CI 44–48%), 23% (95% CI 21–25%) and 31.7% (95% CI 29–33%), respectively. In UVA (Supplementary Table 2), OS (p = 0.60) and PFS (p = 0.39) did not significantly differ in patients conditioned with TCI-int/high and TCI-low regimens (Fig. 2a, b). However, patients conditioned with TCI-int/high regimens had lower CIR (p < 0.001) and higher cumulative incidence of NRM (p = 0.02) when compared to those conditioned with TCI-low regimens (Fig. 2c, d). In contrast, no significant differences were observed in OS, PFS, CIR, and NRM between patients receiving MAC and RIC regimens (Fig. 2e–h). Cumulative 5-year incidence of death for all causes was comparable between patients receiving TCI-int/high and TCI-low regimens (Table 2). However, patients receiving MAC regimens had higher cumulative 5-year incidence of death due to organ damage (p = 0.004) but less due to disease progression (p = 0.001) compared to RIC.

Fig. 2: Primary endpoints according to the conditioning group.

Probability of OS, PFS, CIR and NRM according to the TCI index (a–d) and RIC/MAC (e–h).

Table 2 5-year cumulative incidence of death (95% CI) by cause according to the conditioning groups.

In MVA, the use of TCI-int/high regimens was not associated with different OS (HR 1.12, 95% CI 0.97–1.30, p = 0.13) or PFS (HR 1.00, 95% CI 0.88–1.14, p = 0.95) as compared to TCI-low regimens, but was associated with a lower risk of relapse (HR 0.74, 95% CI 0.61–0.91, p = 0.004), and higher risk of NRM (HR 1.24, 95% CI 1.04–1.48, p = 0.02) (Table 3). Similar to the results of the UVA, there were no significant differences in patients receiving MAC regimens when compared to those receiving RIC in OS (HR 1.03, 95% CI 0.87–1.20, p = 0.76), PFS (HR 1.00, 95% CI 0.87–1.15, p = 0.97), risk of relapse (HR 0.83, 95% CI 0.67–1.04, p = 0.1) and NRM (HR 1.16, 95% CI 0.96–1.39, p = 0.13).
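Readers who wish to cross-check reported hazard ratios, confidence intervals and p-values against one another can use the usual Wald approximation on the log-hazard scale. The sketch below is a plain Python illustration, not part of the study's analysis; the function name `wald_p_from_ci` is hypothetical, and the calculation assumes a symmetric 95% confidence interval on the log scale. Because the published values are rounded, only approximate agreement with the reported p-values should be expected.

```python
import math


def wald_p_from_ci(hr: float, lower: float, upper: float, z_crit: float = 1.959964) -> float:
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI."""
    # Standard error of log(HR), recovered from the width of the CI on the log scale
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values taken from the text above (TCI-int/high vs. TCI-low).
p_relapse = wald_p_from_ci(0.74, 0.61, 0.91)  # reported p = 0.004
p_nrm = wald_p_from_ci(1.24, 1.04, 1.48)      # reported p = 0.02
p_os = wald_p_from_ci(1.12, 0.97, 1.30)       # reported p = 0.13
print(round(p_relapse, 3), round(p_nrm, 3), round(p_os, 3))
```

The recomputed values land close to those reported, on the same side of the 0.05 threshold in each case, which is the level of agreement one can expect from inputs rounded to two decimals.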

Table 3 Hazard ratios (HR) for overall survival (OS) and progression-free survival (PFS), cause-specific hazard ratios for cumulative incidence of relapse (CIR) and non-relapse mortality (NRM) and 95% confidence intervals (CI) obtained with multivariable Cox proportional hazards models.

For the whole study cohort, cumulative incidence of aGvHD at day 100, cumulative incidence of cGvHD at 5 years and probability of GRFS at 5 years after allo-HCT were 28% (95% CI 26–30%), 44% (95% CI 42–46%) and 18% (95% CI 17–20%), respectively. In UVA, patients conditioned with TCI-int/high and TCI-low regimens had similar cumulative incidence of cGvHD (p = 0.94) and GRFS probability (p = 0.28) (Fig. 3a–c). On the other hand, cumulative incidence of aGvHD at day 100 after allo-HCT was higher for TCI-int/high (32%, 95% CI 29–34%) compared to TCI-low (24%, 95% CI 22–27%, p < 0.001). There were no significant differences in the cumulative incidence of aGvHD, cGvHD and GRFS when comparing MAC with RIC regimens (Fig. 3d–f). Similar results were obtained in the MVA (Supplementary Table 3). Here, the use of TCI-int/high regimens was associated with a higher risk of aGvHD (HR 1.34, 95% CI 1.11–1.61, p = 0.002) but not cGvHD (HR 1.04, 95% CI 0.89–1.21, p = 0.65). Similar to the UVA results, no significant differences in the risk of aGvHD (HR 1.12, 95% CI 0.92–1.37, p = 0.24) and cGvHD (HR 1.16, 95% CI 0.99–1.37, p = 0.07) were found between patients receiving MAC regimens when compared to RIC.

Fig. 3: Secondary endpoints according to the conditioning group.

Cumulative incidence of aGvHD, cGvHD and GRFS probability according to the TCI index (a–c) and RIC/MAC (d–f).

Discussion

Over the years, the composition of conditioning regimens has evolved, making it difficult to precisely define the intensity of particular drug combinations. In MF, a large registry-based study by McLornan et al., involving 2224 allo-HCT patients, showed that conditioning intensity defined conventionally as MAC or RIC had no impact on OS, NRM or incidence of relapse after allo-HCT [6]. We hypothesized that utilizing a more refined tool, such as the TCI, may overcome the limitations of the traditional classification system [8, 12]. To define conditioning intensity in MF more precisely, we applied the TCI score to 2454 MF patients undergoing allo-HCT and explored its association with outcomes.

First, we looked at the distribution of TCI scores within the MF patient population. In the original AML series (2005–2017), TCI-high constituted 21% of the study population, whereas this percentage had decreased to 4% in the more contemporary validation cohort (2018–2021), which included AML patients aged between 55 and 75 years [8, 12]. In our study, TCI-high regimens constituted only 5% of the study group, with a notable decrease in their use over time: 2012–2014: 11%, 2015–2017: 6%, 2018–2021: 2%. Hence, in contemporary MF cohorts, TCI-low and intermediate are the groups including the vast majority of patients. As expected, most TCI-low patients were assigned to the RIC group, while most TCI-high patients were assigned to the MAC group. Interestingly, within the TCI-intermediate group, patients were almost evenly split between RIC and MAC. Notably, there were still some exceptions: a subset of TCI-low patients received MAC (n = 207; 15.5%), and a small number of TCI-high patients received RIC (n = 12; 9.4%) - a distribution not seen in the original AML study [8].

Time to engraftment is longer and the incidence of graft failure is higher after allo-HCT for MF compared to other hematological malignancies, and both largely depend on disease biology, splenomegaly, CD34+ cell dose or JAK inhibitor use in the peri-transplantation period [13,14,15,16,17]. Here, delayed platelet engraftment was observed in patients receiving TCI-int/high regimens compared to TCI-low, and delayed neutrophil engraftment was observed in those receiving MAC regimens compared with RIC. However, the incidence of primary graft failure was not influenced by conditioning intensity, defined either by TCI or MAC/RIC. In contrast, patients receiving higher intensity conditioning (TCI-int/high and MAC) had a significantly lower incidence of secondary graft failure. These findings underline that delayed engraftment in MF is expected and that primary graft failure is relatively rare in the more contemporary era. While the risk of secondary graft failure may be reduced by the use of higher intensity regimens, the risk of accumulated treatment-related toxicity has to be taken into consideration in the frail MF patient population.

When comparing transplant outcomes between TCI groups, patients conditioned with TCI-int/high regimens had lower CIR but higher NRM rates than TCI-low patients. As a result, both OS and PFS were comparable between TCI-int/high and TCI-low, while other factors, such as age, HCT-CI, KPS, or DIPSS, were stronger predictors of OS. The use of a MAC or RIC regimen was not significantly associated with OS, PFS, relapse or NRM, as shown previously [6]. In the whole study cohort, GvHD was the cause of death with the highest cumulative incidence, followed by infection. Whereas we did not observe an association between RIC or MAC regimens and GvHD incidence, the use of TCI-int/high regimens was significantly associated with a higher risk of aGvHD, but not of cGvHD.

While allo-HCT provides a survival benefit for patients with high-risk disease, treatment-related toxicity remains a serious concern [18, 19]. In a study by Hernandez-Boluda et al., older MF patients (>65 years) undergoing allo-HCT had better long-term survival than those receiving conventional non-transplant approaches, but the OS curves crossed only after 4 years of follow-up due to the significant early mortality of transplantation [20]. Moreover, Gagelmann and colleagues showed that higher conditioning intensity does not improve outcomes in patients with an adverse molecular background [7]. Hence, it is crucial for the conditioning regimen type and intensity to be tailored to each patient’s fitness, while still preserving its anti-neoplastic activity. Based on our results, the benefit in lower relapse risk gained from higher conditioning intensity is at some point abrogated by accumulated toxicity. Despite being more precise than the RIC and MAC classification, the TCI index can still attribute the same score to regimens with different toxicity profiles.

Considering the worse NRM and higher rates of aGvHD in patients conditioned with TCI-int/high regimens, we further evaluated the particular regimens used in this group. Here, fludarabine plus busulfan (FluBu) was the most frequently used combination. Gagelmann et al. compared fludarabine plus busulfan or treosulfan in the RIC versus MAC setting, showing better outcomes with RIC for both combinations [21]. In another study, Joseph et al. reported on 65 MF allo-HCT patients conditioned with fludarabine plus pharmacologically guided busulfan [22]. While the authors reported improvements in early NRM (1-year 16%, 3-year 20%), at 5 years all outcomes were comparable to the current cohort. In contrast, in a prospective trial, Popat et al. demonstrated a lower relapse incidence (RI) without impact on NRM in patients receiving fludarabine plus high-dose but pharmacologically guided busulfan [23]. In summary, the early toxicity of FluBu regimens may be mitigated by area under the curve (AUC)-guided busulfan, which ensures the delivery of the desired dose.

The second most represented regimens in the TCI-int/high group were melphalan-based. In the CIBMTR cohort of 872 patients, Murthy et al. showed that in the RIC setting, patients receiving the FluBu regimen had better rates of OS, NRM, aGvHD and engraftment compared with fludarabine plus melphalan (FluMel) [24]. In a study by Hernandez-Boluda et al. looking at 556 older MF patients, the authors showed better OS with busulfan-based conditioning compared to melphalan, which was mediated mainly by the reduction in NRM in the former [20]. In contrast, Robin et al. compared FluBu (TCI-low) and FluMel (TCI-int) regimens in the RIC setting, showing better RI for FluMel and comparable NRM across both regimens, but underlining early FluMel toxicity [25]. Finally, Jain et al. showed similar rates of OS, RI, NRM and aGvHD in MF patients receiving melphalan-based regimens (FluMel or FluMelCar) compared with the FluBu regimen; however, the sample size was relatively small [26]. Hence, it is possible that the toxicity observed in the TCI-int/high group may be attributed to melphalan use.

Finally, the third most common regimen group in TCI-int/high was based on two alkylating drugs (FluBuThio, FluBuMel, FluMelCar and FluMelTreo). Chiusolo et al. highlighted good outcomes in terms of RI and OS while maintaining low NRM in MF patients conditioned with double-alkylating agent-based conditioning (TBF: thiotepa, busulfan, fludarabine) compared to single-alkylating agent conditioning (either melphalan-, busulfan- or thiotepa-based) [27]. In summary, FluBu with pharmacologically guided busulfan or a two-alkylating agent-based regimen, especially TBF, appear to be encouraging choices from the TCI-int/high repertoire for MF patients. Nevertheless, neither the TCI index nor the MAC/RIC classification captures the individualized dosing achieved by therapeutic drug monitoring. Further prospective trials are required.

Conclusions

Our findings highlight that the TCI index may be more useful in discriminating transplant outcomes compared to the traditional MAC/RIC classification. However, regardless of the conditioning intensity definition used, tools to improve survival outcomes of MF patients undergoing allo-HCT and to better tailor conditioning intensity to each patient’s individual risk of disease relapse and treatment-related mortality are lacking, representing an important unmet need in the field.