Introduction

Allogeneic stem cell transplantation (allo-SCT) is a potentially curative strategy for hematological malignancies due to the graft-versus-tumor effect [1, 2]. However, graft-versus-host disease (GVHD) and infections may offset its benefits by increasing non-relapse mortality (NRM) [3]. Randomized trials have demonstrated that pre-transplant anti-T-lymphocyte globulin (ATLG) can reduce severe acute and chronic GVHD (a/cGVHD) [4,5,6]. While lower doses of ATLG may compromise its immunosuppressive effects, higher doses could offset its benefits by reducing antiviral immunity and graft-versus-malignancy effects through the depletion of donor effector T-cells [7]. Despite the established use of ATLG in allo-SCT, no studies have compared ATLG dosing specifically in the context of matched sibling donor peripheral blood stem cell transplantation (MSD-PBSCT), and only two studies have compared different ATLG doses in matched unrelated donor (MUD) PBSCT [8, 9]. In this study, we aimed to compare transplant outcomes between 15 mg/kg and 30 mg/kg ATLG as an in vivo T-cell depletion (TCD) strategy for patients undergoing MSD-PBSCT for hematological malignancies.

Materials and methods

This retrospective study was conducted at the University Medical Center Hamburg-Eppendorf (UKE). The primary goal was to compare cGVHD between 15 mg/kg (ATLG-15) and 30 mg/kg (ATLG-30) ATLG in recipients of HLA 10/10 MSD-PBSCT. Secondary outcomes included engraftment, aGVHD, NRM, cumulative incidence of relapse (CIR), progression-free survival (PFS), overall survival (OS), and GVHD-free, relapse-free survival (GRFS).

Ethics approval and consent to participate

This study was approved by the Ethics Committee of the University Medical Center Hamburg-Eppendorf (UKE) (reference number: 2022-100940-BO-ff). The study was performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants.

Myeloablative conditioning (MAC) regimens were defined according to published working group definitions [10]. ATLG (Grafalon®, Neovii, Switzerland) was given at a total dose of 15 mg/kg or 30 mg/kg. It was administered with a test dose of 200 mg on day −4, and the remaining doses were fractionated between days −3 and −1. Post-transplant GVHD prophylaxis consisted of ciclosporin A from day −1. The ATLG dose was selected based on physician preference, with a trend toward using the lower dose (15 mg/kg) in more recent years.

To account for disease-related heterogeneity, the Disease Risk Stratification System (DRSS) was used as a covariate in all multivariate and propensity score models. For the analysis of NRM, the DRSS was regrouped to enable stable estimation in the competing risks model, as no NRM events occurred in the Standard-risk group. Specifically, Standard, Intermediate-1, and Intermediate-2 were combined into a Low-risk category, while High and Very High were grouped as High-risk. This regrouping was applied only for the NRM model. For all other endpoints, the original DRSS categories were retained [11].

Neutrophil engraftment was defined as the first of 3 consecutive days with an absolute neutrophil count > 0.5 × 10^9/L. Platelet engraftment was defined as the first of 3 consecutive days with a platelet count > 20 × 10^9/L without transfusion support. Acute GVHD was graded according to standard criteria [12]. Chronic GVHD was graded according to National Institutes of Health (NIH) criteria, routinely at every visit after transplantation [13].
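The engraftment definition above can be illustrated with a minimal sketch. It assumes daily counts in units of 10^9/L and assumes that the engraftment day is the first day of the qualifying three-day run; the function name and the example values are hypothetical, and transfusion independence for platelets is not modeled.

```python
from typing import Optional, Sequence


def engraftment_day(
    days: Sequence[int],
    counts: Sequence[float],
    threshold: float,
    run_length: int = 3,
) -> Optional[int]:
    """Return the first day of the first run of `run_length` consecutive
    calendar days with a count strictly above `threshold`, or None."""
    run = 0            # length of the current qualifying run
    run_start = None   # first day of the current qualifying run
    prev_day = None
    for day, value in zip(days, counts):
        consecutive = prev_day is not None and day == prev_day + 1
        if value > threshold:
            if run > 0 and consecutive:
                run += 1
            else:
                # a gap in sampling or a fresh run restarts the counter
                run = 1
                run_start = day
            if run >= run_length:
                return run_start
        else:
            run = 0
        prev_day = day
    return None


# Hypothetical neutrophil counts (x 10^9/L) on days +8 to +14
days = [8, 9, 10, 11, 12, 13, 14]
anc = [0.1, 0.6, 0.4, 0.6, 0.7, 0.9, 1.2]
print(engraftment_day(days, anc, threshold=0.5))  # -> 11
```

The single day above threshold on day +9 does not count, because the run is broken on day +10; the same function could be applied to platelet counts with a threshold of 20.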

All outcomes were measured from the time of allo-SCT. PFS was defined as the duration of survival without relapse or progression, with censoring for patients without these events at last follow-up. OS was defined as survival without death from any cause, while NRM was defined as death without evidence of relapse. For statistical analysis, Kaplan–Meier methods were employed to estimate probabilities for PFS and OS, with group differences assessed via the log-rank test. Cumulative incidence functions were used to estimate engraftment, CIR, NRM, aGVHD, and cGVHD in a competing risk framework. Specifically, CIR and NRM were treated as competing events, and death without the respective event was considered the competing risk for aGVHD, cGVHD, and engraftment. GRFS was defined as survival without disease relapse or progression and without Grade III-IV aGVHD or moderate to severe cGVHD.
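To make the competing risk framework concrete, the following is a minimal sketch of a nonparametric cumulative incidence estimate (Aalen-Johansen type). It is not the analysis code of the study, which used R; the status coding (0 = censored, 1 = event of interest, 2 = competing event) and the toy data are assumptions for illustration only.

```python
from typing import Sequence


def cumulative_incidence(
    times: Sequence[float],
    status: Sequence[int],
    cause: int,
    horizon: float,
) -> float:
    """Cumulative incidence of `cause` by `horizon` with competing risks.

    status: 0 = censored, any other integer = type of event.
    """
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    surv = 1.0   # probability of being free of ANY event just before time t
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        d_cause = d_any = n_t = 0
        # collect all subjects whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            n_t += 1
            if s != 0:
                d_any += 1
            if s == cause:
                d_cause += 1
            i += 1
        cif += surv * d_cause / n_at_risk    # increment for the cause of interest
        surv *= 1.0 - d_any / n_at_risk      # all-cause Kaplan-Meier step
        n_at_risk -= n_t
    return cif


# Toy data: 1 = relapse, 2 = death without relapse (NRM), 0 = censored
times = [1, 2, 3, 4, 5]
status = [1, 0, 2, 1, 0]
print(round(cumulative_incidence(times, status, cause=1, horizon=5), 3))  # -> 0.467
print(round(cumulative_incidence(times, status, cause=2, horizon=5), 3))  # -> 0.267
```

Unlike one minus a Kaplan-Meier estimate that censors competing events, the two cumulative incidences and the event-free probability sum to one, which is why this estimator is preferred when relapse and NRM compete.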

For univariate analyses, continuous variables were categorized. Univariate comparisons were performed using the log-rank test for survival outcomes and Gray’s test for cumulative incidences. Multivariate analysis (MVA) was conducted using a Cox proportional hazards model to calculate adjusted hazard ratios and 95% confidence intervals. MVA using Fine and Gray’s competing risks regression model was performed to identify independent prognostic factors for NRM, CIR, GVHD, and engraftment. Variables with a p value of less than 0.1 on univariate analysis, or considered clinically relevant, were included in the multivariate models. A p value of less than 0.05 was considered statistically significant.
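As an illustration of the univariate comparison for survival outcomes, a two-group log-rank test can be sketched as follows. This is a simplified, unstratified version written for clarity rather than speed; the function name and the toy data are hypothetical. Gray's test for cumulative incidences follows a different construction and is not shown.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float],
    events_a: Sequence[int],
    times_b: Sequence[float],
    events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test. events: 1 = event, 0 = censored.
    Returns (chi-square statistic with 1 df, two-sided p value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # hypergeometric variance of the number of events in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data: all events observed, group A fails earlier than group B
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(p < 0.05)  # -> True
```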

To improve comparability between the ATLG-15 and ATLG-30 groups with respect to confounding variables, we employed propensity score matching (PSM) using the MatchIt package in R. Due to the limited number of cases and potential issues with perfect separation, we excluded patients with an ECOG performance status of 2 and those who received total body irradiation (TBI) to create a more homogeneous study population. The final matching model included the following covariates: ECOG performance status, year of allogeneic SCT, patient age, patient CMV serology, conditioning regimen intensity (MAC vs. RIC), and a grouped version of the DRSS, in which Standard, Intermediate-1, and Intermediate-2 were combined and compared with High and Very High. Matching was performed using nearest neighbor matching with a 1:2 ratio and a caliper width of 0.2 to ensure stricter match quality. The balance of covariates between the matched groups was assessed using standardized mean differences (SMD) and visualized with Love plots generated using the cobalt package. The resulting matched dataset was used for all subsequent outcome analyses.
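The matching step can be sketched in simplified form. The example below is not the MatchIt implementation; it is a greedy nearest neighbor matcher without replacement that assumes the propensity scores have already been estimated, assumes the caliper is expressed in standard deviations of the pooled propensity score, and processes cases in descending score order. All identifiers and scores are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Sequence


def match_nearest(
    ps_cases: Dict[str, float],
    ps_controls: Dict[str, float],
    ratio: int = 2,
    caliper_sd: float = 0.2,
) -> Dict[str, List[str]]:
    """Greedy 1:ratio nearest neighbor matching without replacement."""
    scores = list(ps_cases.values()) + list(ps_controls.values())
    caliper = caliper_sd * statistics.stdev(scores)
    available = dict(ps_controls)
    matches: Dict[str, List[str]] = {case: [] for case in ps_cases}
    order = sorted(ps_cases, key=ps_cases.get, reverse=True)
    # one pass per requested control, so every case gets a first match
    # before any case gets a second one
    for _ in range(ratio):
        for case in order:
            if not available:
                break
            best = min(available, key=lambda c: abs(available[c] - ps_cases[case]))
            if abs(available[best] - ps_cases[case]) <= caliper:
                matches[case].append(best)
                del available[best]
    # cases without any control inside the caliper are dropped
    return {case: ctrls for case, ctrls in matches.items() if ctrls}


def standardized_mean_difference(x_cases: Sequence[float], x_controls: Sequence[float]) -> float:
    """Mean difference divided by the pooled standard deviation (one common definition)."""
    pooled_sd = math.sqrt((statistics.variance(x_cases) + statistics.variance(x_controls)) / 2)
    return (statistics.mean(x_cases) - statistics.mean(x_controls)) / pooled_sd


cases = {"t1": 0.60, "t2": 0.30}
controls = {"c1": 0.58, "c2": 0.63, "c3": 0.31, "c4": 0.90, "c5": 0.10}
print(match_nearest(cases, controls))  # -> {'t1': ['c1', 'c2'], 't2': ['c3']}
```

In this toy example the second case receives only one control, because its next-nearest candidate lies outside the caliper. The same mechanism explains why a 1:2 design can yield fewer than two controls per case on average.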

To address potential confounding from disease heterogeneity and variations in conditioning intensity, we performed stratified univariate analyses in a more homogeneous subgroup of the full cohort. The cohort was first restricted to patients with AML or MDS who received conditioning without TBI, to reduce diagnosis-related variability. Within this subgroup, patients were further stratified by conditioning intensity (MAC vs. RIC). Within each stratum, outcomes were compared between patients who received ATLG 15 mg/kg and those who received 30 mg/kg. Survival outcomes (OS, PFS, GRFS) were analyzed using Kaplan–Meier estimates and the log-rank test. Cumulative incidence analyses were performed for NRM, relapse, and acute and chronic GVHD, using Gray’s test to account for competing risks. These stratified analyses aimed to isolate the effect of ATLG dose within clinically relevant and biologically more homogeneous subpopulations.

All statistical analyses were conducted using R (version 3.0, R Development Core Team, Vienna, Austria; https://www.r-project.org/). The following R packages were used: survival, cmprsk, ggplot2, dplyr, survminer, mstate, tableone, MatchIt, cobalt, and forestplot.

Results

Patients and transplant characteristics

A total of 165 consecutive patients were included in the study. Seventy-one patients received ATLG-15, and 94 patients received ATLG-30. The median year of transplant was 2020 (2018–2022) in the ATLG-15 group and 2018 (2011–2022) in the ATLG-30 group (p < 0.001). The median age at transplant was 57 years (range, 20–72) in the ATLG-15 group and 58 years (range, 20–71) in the ATLG-30 group (p = 0.88). ECOG performance status scores were higher in the ATLG-15 group (ECOG 0: 18%, ECOG 1: 72%, ECOG 2: 9%) than in the ATLG-30 group (ECOG 0: 36%, ECOG 1: 64%) (p = 0.001). Sixty-eight percent of patients received MAC in the ATLG-15 group compared to 48% in the ATLG-30 group (p = 0.01). All patient, donor, and transplant characteristics are listed in Table 1.

Table 1 Patient, donor, and transplant characteristics.

Transplant outcomes

All univariate analyses of transplant outcomes are summarized in Table 2.

Table 2 Univariate analysis.

Engraftment

One patient died prior to engraftment and three patients had primary graft failure in the ATLG-15 group; all remaining patients successfully engrafted. The ATLG-15 cohort showed earlier leukocyte engraftment (median 11 days, range 8–19, vs. median 12 days, range 8–16, in the ATLG-30 cohort; p = 0.004) and earlier platelet engraftment (median 12 days, range 8–107, vs. median 15 days, range 3–249; p = 0.0002).

GVHD

The cumulative incidence of aGVHD grade II-IV at day 100 did not differ significantly between the two groups (ATLG-15: 34% vs ATLG-30: 18%, p = 0.2) (Fig. 1a). Only patient CMV serology significantly affected aGVHD II-IV, with a cumulative incidence of 12% in patients with negative serology compared to 33% in those with positive serology on univariate analysis (p = 0.007). This difference persisted on MVA (HR: 2.40 [95% CI: 1.30–4.43], p = 0.005). Additionally, there was a trend toward an increased risk of aGVHD II-IV in patients transplanted from female donors versus male donors (HR: 1.75 [95% CI: 0.96–3.18], p = 0.068).

Fig. 1: Graft versus host disease, ATLG 15 mg/kg vs 30 mg/kg.

a aGVHD grade II-IV. b aGVHD grade III-IV. c cGVHD all grade. d cGVHD moderate/severe.

The cumulative incidence of aGVHD grade III-IV at day 100 was comparable between the ATLG-15 (13%) and the ATLG-30 (9%) groups (p = 0.7) (Fig. 1b). Only conditioning intensity significantly affected aGVHD grade III-IV, with a cumulative incidence of 4% in patients who received MAC and 19% in those who received RIC (p = 0.0006). This difference persisted on MVA (HR: 5.89 [95% CI: 1.71–20.26], p = 0.0049).

We observed no differences in the cumulative incidence of all grade cGVHD between the two groups, with a cumulative incidence at 2 years of 73% versus 62% in the ATLG-15 and ATLG-30 groups, respectively (p = 0.21) (Fig. 1c). On MVA, none of the factors affected all grade cGVHD.

Patients in the ATLG-15 group had a significantly higher cumulative incidence of moderate/severe cGVHD compared to patients in the ATLG-30 group (ATLG-15: 43% vs ATLG-30: 28%, p = 0.045) (Fig. 1d). This difference persisted on MVA (HR for ATLG-30 vs. ATLG-15: 0.450 [95% CI: 0.214–0.946], p = 0.035). No other factor significantly affected moderate/severe cGVHD. The MVA results for GVHD are summarized in Table 3.
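As a worked check of how a hazard ratio, its confidence interval, and the p value relate, the sketch below back-calculates the p value from the interval reported above for moderate/severe cGVHD. It assumes a Wald-type 95% interval that is symmetric on the log scale, which is the usual construction but is not stated in the text; the function name is hypothetical.

```python
import math
from typing import Tuple


def wald_p_from_ci(hr: float, lower: float, upper: float, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Recover the Wald z statistic and two-sided p value from a hazard
    ratio and its confidence interval (assumed symmetric on the log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))               # two-sided normal tail
    return z, p_value


z, p = wald_p_from_ci(0.450, 0.214, 0.946)
print(round(z, 2), round(p, 3))  # -> -2.11 0.035
```

The recovered value agrees with the reported p = 0.035, which is what one would expect if the interval and p value come from the same Wald statistic.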

Table 3 Multivariate analysis of graft-versus-host disease, non-relapse mortality, and cumulative incidence of relapse.

OS and PFS

The estimated 2-year OS was 72% for patients in the ATLG-15 group and 77% in the ATLG-30 group (p = 0.2) (Fig. 2a). The estimated 2-year PFS was 60% for patients in the ATLG-15 group and 65% in the ATLG-30 group (p = 0.4) (Fig. 2b). On univariate analysis, older age, TBI, higher DRSS, and progressive disease at transplant were associated with lower OS and PFS. There was a trend toward lower OS for patients with higher ECOG performance status and toward lower PFS for patients with negative CMV serology. No other factor affected OS or PFS on univariate analysis. In MVA, older age (HR 1.04 [95% CI: 1.01–1.08], p = 0.02) and Very High DRSS risk vs. Standard (HR 6.70 [95% CI: 1.45–30.98], p = 0.01) were significantly associated with inferior OS (Fig. 2c). For PFS, older age (HR 1.03 [95% CI: 1.00–1.06], p = 0.03) and Very High DRSS risk vs. Standard (HR 4.27 [95% CI: 1.37–13.31], p = 0.01) were also significantly associated with worse outcomes (Fig. 2d).

Fig. 2: Overall survival and progression-free survival, ATLG 15 mg/kg vs 30 mg/kg.

a OS. b PFS. c Multivariate analysis OS. d Multivariate analysis PFS.

NRM and CIR

The 2-year cumulative incidence of NRM did not differ significantly between the two groups, with 13% in the ATLG-15 group and 6% in the ATLG-30 group (p = 0.11) (Fig. 3a). Only patient age negatively impacted NRM on univariate analysis. On MVA, none of the factors impacted NRM (Table 3).

Fig. 3: Non-relapse mortality, cumulative incidence of relapse, and graft-versus-host disease-free, relapse-free survival, ATLG 15 mg/kg vs 30 mg/kg.

a Non-relapse mortality. b Cumulative incidence of relapse. c GRFS. d Multivariate analysis GRFS.

The CIR at 2 years was similar between the ATLG-15 and ATLG-30 groups, with rates of 25% and 28%, respectively (p = 0.64) (Fig. 3b). However, we observed a higher CIR in patients with negative CMV serology compared to those with positive CMV serology (at 2 years: CMV neg 37% vs CMV pos 21%, p = 0.01). Patients who received TBI had a significantly higher CIR at 2 years compared to those who did not receive TBI (64% vs 24%; p = 0.005). Additionally, patients transplanted with progressive disease (PD) exhibited a higher CIR at 2 years (PD: 55%) compared to those in complete remission (CR: 22%), partial remission (PR: 21%), or untreated (17%) (p = 0.004). The CIR was highest in the Very High DRSS group at 60% [95% CI: 37–77], compared to 16% [95% CI: 4–36] in Standard, 14% [95% CI: 6–26] in Intermediate 1, 21% [95% CI: 12–32] in Intermediate 2, and 15% [95% CI: 4–34] in the High-risk group (p < 0.001). No other factors significantly impacted CIR in the univariate analysis. In MVA, Very High DRSS risk vs. Standard (HR 3.37 [95% CI: 1.10–10.31], p = 0.03) and TBI vs. no TBI (HR 2.51 [95% CI: 1.04–6.04], p = 0.04) were significantly associated with increased relapse risk.

GRFS

The GRFS at 2 years did not differ significantly between the ATLG-15 and ATLG-30 groups, with rates of 24% and 36%, respectively (p = 0.1) (Fig. 3c). On univariate analysis, older patient age (at 2 years: age ≤ 58 years 36% vs > 58 years 24%, p = 0.02) and RIC (at 2 years: MAC 37% vs RIC 22%, p = 0.02) negatively impacted GRFS. GRFS at 2 years was markedly lower in the Very High DRSS group, at 5% [95% CI: 1–33], compared to 39% [95% CI: 21–76] in Standard, 40% [95% CI: 26–61] in Intermediate 1, 30% [95% CI: 19–47] in Intermediate 2, and 43% [95% CI: 26–72] in the High-risk group (p = 0.005).

In MVA, older age (HR 1.02 [95% CI: 1.00–1.04], p = 0.04) was significantly associated with inferior GRFS, while ATLG-30 vs. ATLG-15 was associated with improved GRFS (HR 0.47 [95% CI: 0.25–0.88], p = 0.02). Additionally, there was a trend toward inferior GRFS in patients receiving RIC compared to MAC (HR 1.60 [95% CI: 0.98–2.63], p = 0.06), and in those with Very High DRSS risk vs. Standard (HR 2.08 [95% CI: 0.93–4.68], p = 0.08) (Fig. 3d).

Propensity score matching

Propensity score matching yielded 19 matched cases and 31 matched controls. Covariate balance improved substantially after matching, with standardized mean differences for most variables falling well below 0.2. The variance ratio for the propensity scores was close to 1 (1.16), indicating strong balance between groups. Visual assessment using a Love plot and a density plot confirmed the reduction in imbalance across covariates (Supplementary Fig. 1A, B). In competing risk analysis, the 2-year cumulative incidence of NRM was 30% in the ATLG-15 group and 0% in the ATLG-30 group. Due to the absence of NRM events in the ATLG-30 group, the subdistribution hazard could not be reliably estimated using Fine-Gray regression. No significant differences were observed in OS, PFS, CIR, GRFS, aGVHD grade II-IV, aGVHD grade III-IV, all grade cGVHD, or moderate/severe cGVHD. All results are summarized in Supplementary Table 1.

Subgroup analysis of AML/MDS patients who did not receive TBI by conditioning intensity

Among AML/MDS patients who did not receive TBI (n = 108), 61 received ATLG 15 mg/kg and 47 received ATLG 30 mg/kg. In the MAC subgroup (n = 69), 38 received 15 mg/kg and 31 received 30 mg/kg. In the RIC subgroup (n = 39), 23 received 15 mg/kg and 16 received 30 mg/kg.

In the overall AML/MDS no-TBI population, the incidence of moderate-to-severe cGVHD was significantly higher with ATLG 15 mg/kg compared to 30 mg/kg (38% vs. 19%, p = 0.039). In the MAC subgroup, moderate-to-severe cGVHD tended to occur more frequently with ATLG 15 mg/kg, although the difference did not reach statistical significance (39% vs. 19%, p = 0.079). In the RIC subgroup, no statistically significant differences were observed between ATLG 15 mg/kg and 30 mg/kg for any outcome.

All results are summarized in Supplementary Table 2.

Discussion

Although ATLG is recommended for GVHD prevention in allo-SCT [14], data on optimal dosing of ATLG in the setting of MSD-PBSCT are still lacking. A consensus statement by an international expert panel recommends the use of 30 mg/kg ATLG in MSD and 60 mg/kg in MUD allo-SCT [15].

To our knowledge, this is the first study comparing ATLG doses in the MSD-PBSCT setting. Our findings demonstrate that ATLG-30 was associated with a reduction in moderate to severe cGVHD and improved GRFS. Engraftment was faster in the ATLG-15 group; however, no significant differences were observed between the groups for aGVHD, CIR, NRM, PFS, or OS. These results suggest that ATLG-30 may provide a preferable balance between GVHD prevention and relapse risk without compromising key transplant outcomes.

Our results align with previous evidence demonstrating that pre-transplant ATLG can effectively reduce severe acute and chronic GVHD in allo-SCT [4,5,6]. Notably, although the overall incidence of aGVHD in our cohorts did not differ significantly, we observed a clear benefit in terms of lower moderate/severe cGVHD with ATLG-30. This is in line with prior work suggesting that sufficiently dosed ATLG might confer more pronounced immunomodulatory effects on donor T cells, thereby reducing the risk of severe cGVHD [7].

While multiple studies have reported the benefits of ATLG in MUD-PBSCT, only two have specifically compared different doses in that setting [8, 9]. In our previous study comparing 30 mg/kg (ATLG-30) vs 60 mg/kg (ATLG-60) ATLG in MUD-PBSCT, we reported an earlier engraftment in patients receiving ATLG-30, with no differences in other transplant outcomes [9]. Our current findings in MSD-PBSCT mirror this pattern, as we observed earlier engraftment in the ATLG-15 group. Another study published from our center compared ATLG-30 with ATLG-60 in the MUD-allo-SCT setting between 1997 and 2005, reporting a higher NRM in the ATLG-60 group with no differences in other outcomes [16]. In our study, NRM was comparable between the groups, which may be explained by improved HLA matching and supportive care over the years. Moreover, our data in MSD-PBSCT suggest that a 30 mg/kg dose achieves a favorable GVHD profile without significantly increasing CIR, indicating that donor type may play a role in modulating the impact of ATLG dosing.

A retrospective study from 2003 comparing two doses of ATLG (<60 mg/kg vs 60 mg/kg) in CML patients undergoing MUD allo-SCT reported improved OS and DFS in patients receiving ≥60 mg/kg ATLG, attributing these differences to a higher incidence of severe aGVHD in the lower-dose group [17]. However, in that study, patients in the <60 mg/kg arm received non-uniform dosing, with 58% receiving 20 mg/kg and 36% receiving 40 mg/kg, while our study in the MSD-PBSCT setting compared uniform doses of 15 mg/kg and 30 mg/kg. Moreover, differences in donor types and advancements such as improved HLA matching and supportive care over the years further complicate direct comparisons. These factors may explain why our results did not demonstrate significant differences in aGVHD or survival outcomes between the two dosing regimens.

Several studies have examined different doses of anti-thymocyte globulin (Thymoglobulin, ATG) in various transplant settings, highlighting the delicate balance between GVHD prevention and infection risks [18,19,20,21]. Butera et al. retrospectively compared two Thymoglobulin doses (5 mg/kg vs. 6.5 mg/kg) in adults undergoing MUD allo-SCT and found no significant differences in long-term transplant outcomes [18]. Bacigalupo et al. reported that a higher Thymoglobulin dose (15 mg/kg vs. 7.5 mg/kg) in MUD allo-SCT reduced acute GVHD (37% vs. 69%) [19]. Importantly, unlike ATLG, Thymoglobulin may also contain antibodies targeting thymus-specific cells, thus further impairing thymic T-cell regeneration [22]. Additional work in haplo-identical and cord blood transplant settings observed increased infection rates with higher ATLG doses [20, 21]. To refine dosing, it has been proposed that both body weight and absolute lymphocyte count (ALC) be considered, an approach that has effectively reduced GVHD and infection/relapse rates in two studies [23, 24]. A post hoc analysis of a randomized trial confirmed worse survival outcomes for patients with lower ALC on the first infusion day [25]. In our study, we did not evaluate the impact of ALC on outcomes.

Taken together, these findings underscore that optimal ATLG dosing should not only minimize GVHD but also preserve anti-leukemic immunity and maintain stable engraftment. Our observation of improved GRFS in the ATLG-30 group parallels results from other investigations highlighting the potential of higher ATLG doses to mitigate long-term complications [4,5,6, 16]. Future prospective trials focusing on patient-reported outcomes and immunologic monitoring could further elucidate the best dose to balance GVHD control and relapse, particularly in the MSD-PBSCT setting.

This is, to our knowledge, the first study comparing uniform ATLG doses (15 vs. 30 mg/kg) specifically in the MSD-PBSCT setting. Although retrospective and single-center in design, we sought to mitigate selection bias by employing propensity score matching, conducting a multivariate analysis and subgroup analyses in more homogeneous patient populations and conditioning regimens, thereby improving comparability and accounting for potential confounders. Nonetheless, the relatively low number of matched pairs remaining after propensity score matching substantially limits statistical power and increases the risk of overfitting or type II errors. These findings warrant validation in larger, multi-center randomized prospective trials. Going forward, efforts to refine ATLG dosing strategies should incorporate carefully stratified patient populations and could benefit from the inclusion of patient-reported outcomes to enhance our understanding of how different dosing regimens influence both clinical efficacy and quality of life.

Notably, our finding that TBI was associated with an increased risk of relapse contrasts with previous EBMT studies in AML and ALL, which reported a protective effect of TBI against relapse. In our analysis, this association remained significant even after adjustment for DRSS in the multivariate model. A likely explanation lies in the composition of our cohort: patients who received TBI predominantly had lymphoid malignancies, many of which were in advanced or refractory stages, or refractory AML. These disease contexts carry a high inherent relapse risk, which may have outweighed any potential antileukemic effect of TBI. Thus, the observed association likely reflects adverse disease biology and treatment resistance rather than a direct causal effect of TBI, and should be interpreted with caution.

A further limitation is potential confounding from evolving supportive care practices during 2011–2022, including letermovir introduction and enhanced antifungal strategies, with registry data showing 20–30% NRM reductions over similar periods [26,27,28]. While our single-institution standardized protocols minimize bias, incremental practice improvements may have contributed to the observed outcomes.

Contemporary transplant practice is evolving, with recent evidence challenging traditional donor selection paradigms, particularly regarding the preference for older matched sibling donors over younger matched unrelated donors in elderly patients [29, 30]. Additionally, post-transplant cyclophosphamide (PTCy) is rapidly expanding from haploidentical to all donor types, demonstrating the ability to eliminate HLA matching disparities and improve outcomes compared to traditional GVHD prophylaxis [31, 32]. These developments highlight the dynamic nature of transplant practice and the need for continued optimization of donor selection and GVHD prevention strategies alongside established approaches such as ATLG.

In conclusion, our study suggests that a 30 mg/kg ATLG dose in MSD-PBSCT can reduce moderate to severe cGVHD and potentially improve GRFS without significantly increasing relapse or NRM. Larger, multicenter trials are needed to confirm these observations and refine ATLG dosing in the MSD-PBSCT setting.