Introduction

Adenocarcinoma of the oesophagus and gastroesophageal junction (GEJ) continues to pose a formidable oncologic challenge, characterised by a rising incidence and persistently poor outcomes. For years, the field was shaped by two dominant therapeutic philosophies: intensive perioperative chemotherapy, culminating in the highly effective FLOT regimen [1], and neoadjuvant chemoradiotherapy (nCRT) as defined by the CROSS protocol [2]. Recent evidence has tilted the scale in favour of perioperative chemotherapy (CT) as the standard of care for most patients with adenocarcinoma [3].

Yet, in real-world practice, such apparent clarity at the population level often conceals significant complexity when examined at the individual level. Average trial effects may not yield durable benefit across all subgroups, and whether observed gains reflect true cure or simply deferred recurrence is unclear—a distinction standard survival analysis fails to capture due to the inherent design of the statistical models typically employed [4, 5]. While some fully parametric survival models, by their mathematical formulation, assume no possibility of cure, the semi-parametric Cox model—the cornerstone of inference in most trials—is more flexible. It allows for a cure fraction once the non-parametric baseline hazard approaches zero, yet it is not designed to explicitly quantify it. Its primary output, a single hazard ratio (HR), provides an average effect that cannot disentangle a true increase in cured patients from a mere postponement of recurrence in others [6].

This leads to the central paradox of curative-intent research: we study therapies to cure, yet our primary analytical tools treat cured survivors as ‘censored’ cases whose success remains statistically unresolved. This highlights a critical need for models that explicitly contemplate and quantify the possibility of cure, and not merely hazard reductions that can be difficult to interpret clinically [7, 8]. Scenarios suggesting a dormancy-inducing rather than a curative effect on micrometastases include transient disease-free survival (DFS) improvements or an increased relapse risk after therapy discontinuation, phenomena described in resected cancers such as breast, prostate, GIST, and adrenocortical cancer [8,9,10,11]. Identifying these scenarios is critical, as they directly inform clinical decisions regarding therapy duration, transitions to maintenance or consolidation phases, and the planning of surveillance strategies [8].

These conceptual challenges are exemplified by the evolving evidence in oesophageal-gastric cancer itself. In the CROSS trial, for instance, adenocarcinoma was associated with only a modest, statistically non-significant survival benefit from nCRT (HR 0.74; p = 0.07) and substantially lower pathological complete response rates compared to squamous cell carcinoma [2], casting doubt on its curative potential in this histology [12]. The CheckMate-577 trial attempted to address this by evaluating the addition of adjuvant nivolumab, demonstrating a significant DFS benefit (HR 0.69; 96.4% CI, 0.56 to 0.86; P < 0.001) [13]. However, with longer follow-up, the DFS curves showed progressive convergence and the HR increased to 0.76, indicating a diminishing treatment effect over time [14]. Consistent with this attenuation, the DFS gain did not translate into a statistically significant overall survival (OS) benefit (HR 0.85; 95.87% CI, 0.70–1.04). This is critical because, while DFS is often considered a surrogate for OS, this surrogacy is unproven for adjuvant immunotherapy [15], leaving the question of nivolumab’s true curative potential unresolved.

Compounding these interpretive challenges is the impact of real-world treatment feasibility. In the pivotal FLOT4 trial, only 46% of patients completed the full perioperative regimen [1], suggesting limitations to its applicability. This low completion rate complicates the interpretation of head-to-head trials such as NeoAEGIS and ESOPEC [16, 17]. For instance, while ESOPEC demonstrated a survival advantage for FLOT, its generalisability has been debated, partly because the CROSS regimen used as a control may no longer represent the optimal standard of care for nCRT. Indeed, the optimal neoadjuvant chemoradiotherapy backbone remains a subject of intense debate, as highlighted by the divergent findings of recent trials: the CALGB 80803 study favoured a fluoropyrimidine-platinum regimen in a PET-selected population, whereas the UK NeoSCOPE trial, which administered induction chemotherapy with capecitabine and oxaliplatin (OxCap) to both arms before randomisation, found superior outcomes with a CROSS-like regimen [18, 19]. These divergent findings highlight that the efficacy of nCRT is highly context-dependent, likely influenced by patient selection strategies and the specific chemotherapy backbone used.

Furthermore, the subsequent integration of immune checkpoint inhibitors (ICIs) into practice, as seen in the MATTERHORN trial [3], once again exemplifies this crucial yet unresolved aspect of perioperative therapies, as it remains uncertain to what extent the observed DFS gains reflect definitive cure, merely delayed relapse, or some combination of both.

Given these multifaceted uncertainties, there is a clear need for analytical approaches that can separate patients who achieve durable remission from those with persistent risk. Mixture cure models directly address this need by simultaneously estimating the probability of cure (the ‘incidence’ component) and modelling the survival dynamics of non-cured patients (the ‘latency’ component), yielding clinically actionable insights [5,6,7].
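As a brief formal sketch of this decomposition, the population survival function in a mixture cure model can be written as below. The symbols are notation introduced here for illustration and are not taken from the registry analysis: \pi is the cured fraction, S_u(t) is the survival function of non-cured patients (assumed to fall to zero over time), and S_pop(t) is the survival of the whole population.

```latex
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_u(t),
\qquad
\lim_{t \to \infty} S_{\mathrm{pop}}(t) = \pi
```

Under this formulation, the plateau of the survival curve estimates the cured fraction, while the shape of the descent towards that plateau is governed by S_u(t) alone.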

To address these gaps, we analysed the AGAMENON-SEOM Spanish registry (NCT04958720), which captures granular, real-world data across a heterogeneous national network. Our objectives were to map treatment-selection patterns and to describe whether intensive multimodality regimens, as implemented in this registry, appeared truly curative or primarily postponed relapse. To this end, we applied mixture cure models to separate the probability of definitive cure from the survival dynamics among patients who remain at risk, thus providing a more nuanced assessment of therapeutic benefit than conventional endpoints.

Material and methods

Study design and population

This retrospective cohort study utilised data from the AGAMENON-SEOM national registry (NCT04958720), a database focused on esophagogastric cancer, sponsored by the Spanish Society of Medical Oncology (SEOM) and encompassing 35 hospitals across Spain. Detailed methodology, quality standards, and eligibility criteria for the registry have been published previously [20,21,22,23].

Eligible patients for this analysis were adults with histologically confirmed, locally advanced (clinical stage II–IVA), resectable adenocarcinoma of the oesophagus or GEJ, who initiated curative-intent treatment with either perioperative CT or nCRT. Key exclusion criteria included stage IVB disease (as it is not considered amenable to curative-intent therapy), primary gastric adenocarcinoma, squamous histology, lack of neoadjuvant treatment, definitive radiotherapy/chemoradiotherapy, or immunotherapy as monotherapy.

To address key clinical questions using this real-world data, we examined treatment strategies as implemented in routine practice, reflecting those evaluated in pivotal trials. We defined a primary cohort for our main comparison and two secondary cohorts for exploratory, hypothesis-generating purposes. The primary analysis focused on the ESOPEC-like cohort (n = 321), which included patients receiving either perioperative FLOT or CROSS neoadjuvant chemoradiotherapy (nCRT) without adjuvant immunotherapy. The secondary cohorts were designed to assess outcomes in other real-world scenarios and to apply our cure model to strategies where this endpoint has not been formally evaluated. These were: (1) The NeoAEGIS-like cohort (n = 436), comparing any perioperative chemotherapy regimen versus CROSS nCRT; and (2) The hypothetical-strategy cohort (n = 308), comparing perioperative FLOT against a composite group of patients who received either various fluoropyrimidine-platinum based neoadjuvant chemoradiotherapy regimens (FP-based nCRT) or CROSS nCRT followed by adjuvant immunotherapy.

The study complied with the ethical standards of the Declaration of Helsinki and Good Clinical Practice guidelines, with approval obtained from a central research ethics committee and the institutional review boards or ethics committees of all participating sites.

Variables and outcomes

The database comprises over 500 variables capturing demographic, clinical, treatment, and outcome data. To adjust for confounders, a multivariable model incorporated a predefined set of baseline variables identified via systematic literature review and investigator consensus, mitigating biases from data-driven variable selection. These baseline variables were age, sex, ECOG performance status (ECOG-PS), dysphagia, tumour location (oesophagus or GEJ), Siewert type, tumour grade, HER2 status, neutrophil-to-lymphocyte ratio (NLR), albumin level (categorised), and treatment modality (perioperative CT or nCRT). All tumours were staged according to the 8th edition of the American Joint Committee on Cancer (AJCC) TNM staging system [24].

Perioperative treatment data included regimen type, selection rationale, compliance, and toxicity profiles, with treatments categorised as perioperative CT or nCRT. Chemotherapy-related toxicities were assessed according to the current Common Terminology Criteria for Adverse Events (CTCAE).

Primary endpoints were cure rates, DFS, and OS, measured from the initiation of perioperative CT or nCRT. OS was defined as time to death from any cause, with censoring at the last follow-up for patients without events. DFS, assessed only in resected patients, was defined as time to relapse or death, whichever occurred first, with censoring for event-free patients. The cure rate was defined as the estimated proportion of patients who would remain event-free with indefinite follow-up, as derived from the mixture cure model. Secondary endpoints included R0 resection rate, pathologic complete response, treatment-related toxicity, surgical outcomes, and rationale for treatment selection.

Statistical analysis

A multivariable logistic mixed-effects model was used to identify factors influencing the choice between perioperative CT and nCRT. Patient-level variables were entered as fixed effects, and a random intercept for hospital was included to account for interhospital variation. The proportion of variance attributable to hospital-level clustering was quantified using the intraclass correlation coefficient (ICC), which reflects the extent to which patients treated at the same institution share similar treatment allocation patterns [25]. Proportions were compared using χ² tests, and continuous variables using t-tests. Survival outcomes were analysed with the Kaplan–Meier method. To control for potential immortal time bias (18), which can arise in retrospective studies when, unlike in intention-to-treat analyses, group assignment depends on completing post-surgical therapy (e.g., postoperative immunotherapy), landmark analyses at 6, 9, and 12 months were performed [26]. To account for confounding, multivariable Cox proportional hazards (PH) models examined the association between treatment strategy and both DFS and OS, incorporating a prespecified set of covariates (see “Variables and outcomes”). Interactions between treatment type (CT vs. nCRT) and each covariate were tested individually, with p-values for interaction reported. To validate these findings, pseudo-individual patient data were derived from the Kaplan–Meier curves of the ESOPEC trial [16] using the method by Guyot et al. [27]. The reconstructed dataset was then reanalysed using the same modelling approach.
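The landmark step described above can be illustrated with a minimal Python sketch. It is not the study code (the analyses were run in R); the `Patient` and `LandmarkRecord` classes, their field names, and the `landmark_cohort` function are hypothetical names introduced here. The sketch assumes that exposure is classified using only information available at the landmark and that follow-up time is re-measured from the landmark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    time: float                      # months from treatment start to event or censoring
    event: bool                      # True if relapse/death was observed, False if censored
    exposure_start: Optional[float]  # months at which post-surgical therapy began, None if never


@dataclass
class LandmarkRecord:
    time: float    # months from the landmark to event or censoring
    event: bool
    exposed: bool  # exposure status as known at the landmark


def landmark_cohort(patients: List[Patient], landmark: float) -> List[LandmarkRecord]:
    """Keep patients still at risk at the landmark and reset the clock.

    Patients with an event or censoring at or before the landmark are dropped,
    and exposure is defined only by therapy started on or before the landmark,
    so no one is credited with treatment they had not yet begun.
    """
    records = []
    for p in patients:
        if p.time <= landmark:
            continue  # no longer at risk when the landmark is reached
        exposed = p.exposure_start is not None and p.exposure_start <= landmark
        records.append(LandmarkRecord(p.time - landmark, p.event, exposed))
    return records
```

Calling `landmark_cohort(patients, lm)` for `lm` in `(6, 9, 12)` would give the three analysis sets; each could then be passed to a standard Kaplan–Meier or Cox routine.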

The PH assumption was evaluated using the Grambsch–Therneau test. A multivariable Royston-Parmar spline model was subsequently applied to characterise time-dependent hazard ratios. To further explore whether the waning effect impacted long-term outcomes, a mixture cure model was fitted using a Weibull distribution for the latency component and a logistic link for the cure fraction [7]. Exponentiated coefficients from the prevalence (cure) component are interpreted as odds ratios (OR). An OR > 1 denotes higher odds of belonging to the cured fraction. Exponentiated coefficients from the latency (susceptible) component are interpreted as time ratios (TR; acceleration factors). A TR < 1 indicates a shorter expected time to the event among uncured patients, i.e., worse prognosis.
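To make the interpretation of OR and TR concrete, the following Python sketch writes out a mixture cure model with a logistic link for the cure probability and a Weibull latency in accelerated-failure-time form. It is an illustration under stated assumptions, not the fitted model: the function names and all parameter values are hypothetical, and the Weibull is parameterised by shape and scale with covariates acting on the log of the scale.

```python
import math
from typing import Sequence


def _linear_predictor(intercept: float, coefs: Sequence[float], x: Sequence[float]) -> float:
    return intercept + sum(b * v for b, v in zip(coefs, x))


def cure_probability(intercept: float, coefs: Sequence[float], x: Sequence[float]) -> float:
    """Logistic link: probability of belonging to the cured fraction."""
    eta = _linear_predictor(intercept, coefs, x)
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p: float) -> float:
    return p / (1.0 - p)


def weibull_scale(log_scale_intercept: float, coefs: Sequence[float], z: Sequence[float]) -> float:
    """Covariates act on log(scale), so exp(coef) is a time ratio."""
    return math.exp(_linear_predictor(log_scale_intercept, coefs, z))


def uncured_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival among uncured (susceptible) patients."""
    return math.exp(-((t / scale) ** shape))


def uncured_median(shape: float, scale: float) -> float:
    """Median time to event among uncured patients."""
    return scale * math.log(2.0) ** (1.0 / shape)


def population_survival(t: float, pi: float, shape: float, scale: float) -> float:
    """Mixture: cured patients never fail; uncured patients follow the Weibull."""
    return pi + (1.0 - pi) * uncured_survival(t, shape, scale)
```

In this sketch, switching a binary covariate from 0 to 1 multiplies the odds of cure by `exp(coef)` in the cure component (the OR) and multiplies the uncured median time by `exp(coef)` in the latency component (the TR), which matches the reading of OR > 1 and TR < 1 given above.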

A non-technical overview of the modelling approaches, intended for a clinical audience, along with a comparison of model fit under time-varying hazard scenarios between the Cox and Royston-Parmar models, is provided in the Appendix.

Missing data were addressed by multiple imputation with predictive mean matching. No formal sample size calculation was conducted, as the registry’s sample was fixed by patient availability, so results should be interpreted with due regard for the width of CIs.

All analyses were performed in R (version 4.3.2) using the rms, rstpm2, flexsurvcure, and lme4 packages [5, 28, 29], with a two-sided p-value < 0.05 considered statistically significant. The R code used for all analyses presented in this study is available in the Supplementary Material.

Results

Baseline characteristics and treatment selection in the global cohort

The AGAMENON registry included 2403 patients with nonmetastatic esophagogastric cancer entered between April 11, 2017, and January 7, 2025. A total of 500 patients were included in the global cohort. Figure 1 details the patient selection process. Overall, 305 patients (61.0%) received perioperative CT, most commonly FLOT (71.5%) or FOLFOX (20.0%), while 195 patients (39.0%) underwent nCRT, primarily using the CROSS regimen (67.2%). Of these, 39 out of 195 patients (20%) received adjuvant nivolumab (Table 1). A Sankey plot illustrates the complete details and the flow of pre- and postoperative therapies (Appendix Fig. S1).

Fig. 1: Study flowchart.

CT chemotherapy, CRT chemoradiotherapy, GEJ gastroesophageal junction, ICI immune checkpoint inhibitor, nCRT neoadjuvant chemoradiotherapy.

Table 1 Baseline characteristics of the total cohort and by treatment group (nCRT vs. perioperative CT).

Baseline characteristics were generally well balanced between groups, except for tumour location, where, as expected based on real-world practice patterns, treatment selection clearly influenced the distribution (Siewert I/II/III: 19.8%/32.1%/48.1% with perioperative CT vs 46.2%/38.5%/15.4% with nCRT; p < 0.0001) (Table 1 and Appendix Table S1). Treatment selection showed significant variability across centres. A generalised linear mixed model confirmed that hospital centre—reflecting local treatment preferences—accounted for approximately 23% of the variability in treatment choice (Appendix Table S1), indicating a substantial centre-level effect beyond measured patient factors. Consistent with these findings, significant differences were observed in the clinician-reported rationale for treatment choice. Institutional protocols and clinician experience were the most frequently cited drivers (influencing 66.9% of perioperative CT choices [54.1% + 12.8%] and 73.9% of nCRT choices [63.6% + 10.3%]), while patient-specific factors such as clinical stage or comorbidities were cited less often (Appendix Fig. S2). Additionally, a significant temporal trend in treatment selection was observed, suggesting practice patterns evolved in response to emerging evidence from major pivotal trials (Appendix Fig. S3).
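For readers less familiar with the ICC in a logistic mixed model, the short Python sketch below shows one common way to compute it from the hospital random-intercept variance. The text does not state which ICC definition was used, so the latent-variable formulation (residual variance fixed at π²/3 on the logit scale) is an assumption here, and the function names are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# residual variance in the latent-variable formulation (an assumption here).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def icc_logistic(random_intercept_variance: float) -> float:
    """Share of latent-scale variance attributable to the clustering level."""
    return random_intercept_variance / (
        random_intercept_variance + LOGISTIC_RESIDUAL_VARIANCE
    )


def variance_for_icc(icc: float) -> float:
    """Inverse mapping: random-intercept variance implied by a given ICC."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)
```

Under this assumption, `variance_for_icc(0.23)` gives the between-hospital variance on the logit scale that would correspond to the roughly 23% reported above.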

Clinical outcomes in the global cohort

Surgery rates were comparable (perioperative CT 89% and nCRT 84%; p = 0.16), but reasons for omitting surgery differed (p = 0.008). Intraoperative unresectability was more frequent after CT (6.2% vs. 2.1%), while preoperative progression was higher after nCRT (5.1% vs. 1.0%). As expected, surgical approaches also varied significantly (p < 0.001), with total gastrectomy more common post-CT (28.0% vs. 3.7%) and transthoracic esophagectomy more common post-nCRT (51.2% vs. 27.7%); transhiatal approaches were similar (Table 2). Robotic assistance was utilised more frequently following nCRT (17.7% vs. 4.4%; p < 0.001). Although the median lymph node yield was higher following perioperative CT (23 vs. 19; p < 0.001), nCRT resulted in significantly higher R0 resection rates (92.7% vs 84.1%; p < 0.001) and superior pathologic response rates (ypT0: 18.9% vs. 7.8%; ypN0: 62.8% vs. 43.9%; both p < 0.001). Postoperative complication rates were similar between arms (Table 2).

Table 2 Treatment outcomes of the global cohort and by treatment group (nCRT vs. perioperative CT).

At a median follow-up of 41.8 months, 286 recurrences and 235 deaths were observed, with an estimated cure rate of 27.9% (23.3–33.6) by the end of follow-up. DFS was comparable between groups, while OS curves crossed at 36–42 months, indicating non-proportional hazards and a waning early benefit of perioperative chemotherapy over nCRT (Fig. 2; Appendix Tables S2 and S3). Prognostic factors are summarised in Appendix Tables S5 and S6. Treatment-related toxicity profiles differed significantly, consistent with the expected toxicity spectrum of multi-agent chemotherapy versus chemoradiation, with perioperative CT associated with higher rates of neutropenia, anaemia, diarrhoea, palmar-plantar erythrodysesthesia, peripheral neuropathy, alopecia, and elevated transaminases (Appendix Table S4 and Amit plots in Appendix Figs. S4 and S5).

Fig. 2: Disease-free survival (DFS) and overall survival (OS) across cohorts.

a, b DFS and OS, respectively, for the overall cohort. c, d The same endpoints for the NeoAEGIS-like cohort. e, f DFS and OS for the ESOPEC-like cohort. g, h DFS and OS for the hypothetical-strategy cohort (FLOT vs CRT excluding CROSS, or CROSS combined with immunotherapy). Kaplan–Meier curves compare DFS and OS between perioperative chemotherapy and neoadjuvant chemoradiotherapy in each cohort. Adjusted hazard ratios (HR) and p-values are provided for each comparison. CT chemotherapy, DFS disease-free survival, HR hazard ratio, ICI immune checkpoint inhibitor, nCRT neoadjuvant chemoradiotherapy, OS overall survival.

ESOPEC-like cohort: FLOT vs. CROSS

We next focused on patients who received standard regimens, specifically FLOT or CROSS. In this cohort, 218 patients (68%) received FLOT and 103 (32%) received CROSS. Baseline characteristics of this nested cohort were consistent with those of the overall population (Appendix Table S7). A Sankey plot illustrates the composition and derivation of this subset (Appendix Fig. S1). Survival outcomes achieved with these regimens in the registry are presented in Fig. 2 and Appendix Tables S2, S3 and S8, S9.

In this cohort, 82 patients (93%) completed the neoadjuvant CROSS protocol; treatment was discontinued in five patients (5%) due to toxicity and in one patient (1%) because of clinical deterioration. Radiotherapy was delivered over a mean duration of 1.2 months (standard deviation, 1.3) at a mean total dose of 45.1 Gy (range, 19.8–60 Gy). Of the 218 patients in the FLOT arm, 127 (58%) completed the FLOT regimen postoperatively (Sankey plot, Appendix Fig. S1).

In our cohort, treatment with FLOT was associated with longer median DFS (24.1 vs. 21.5 months) and median OS (43.9 vs. 28.6 months) relative to CROSS (DFS HR 0.60, 95% CI: 0.40–0.88, p = 0.01; OS HR 0.63, 95% CI: 0.43–0.91, p = 0.015). Landmark analysis within the FLOT arm suggested no significant difference in DFS between patients who completed postoperative CT and those who did not (Appendix Fig. S6).

Among patients with disease recurrence, first-relapse patterns did not differ statistically between the arms (p = 0.63). Numerically, a higher proportion of isolated locoregional recurrences was observed following FLOT (14.1% vs. 9.3%), while a higher proportion of distant metastases was observed following CROSS (90.7% vs. 85.9%).

Subgroup analysis for DFS showed significant interactions with clinical TNM stage (p = 0.0012) and histological grade (p = 0.0326), with the largest observed DFS advantage for FLOT in patients with stage III (HR 0.39, 95% CI: 0.25−0.62) and grade 3 tumours (HR 0.26, 0.13−0.55), while in stage II disease, FLOT was associated with a higher risk of relapse (HR 2.66, 1.02−6.93) compared to CROSS (Fig. 3a). For OS, a significant interaction between therapy and HER2 status (p = 0.0273) was observed, with greater apparent efficacy of FLOT in HER2-positive (IHC 3+) tumours (Appendix Fig. S7). Among 36 patients with IHC 3+ tumours, 8 (22%) received preoperative trastuzumab; no IHC 2+ patients did. Recurrences in IHC 3+ tumours were predominantly distant (88%), compared with 61% in HER2-negative and 33% in IHC 1-2+ tumours, p = 0.30.

Fig. 3: Forest plot of disease-free survival (DFS) by subgroup in the ESOPEC-like cohort.

a Shows a forest plot of subgroup (stratified) effects for DFS in the ESOPEC-like cohort, with HRs derived from multivariable Cox models including covariate-by-treatment interactions. b Displays the time-varying hazard ratio for the treatment effect on DFS, obtained from a multivariable spline-based dynamic model (Royston-Parmar), also in the ESOPEC-like cohort. CI confidence interval, CT chemotherapy, DFS disease-free survival, ECOG PS Eastern Cooperative Oncology Group Performance Status, FISH fluorescence in situ hybridisation, HR hazard ratio, nCRT neoadjuvant chemoradiotherapy, NLR neutrophil-to-lymphocyte ratio. Forest plot displays subgroup-specific Hazard Ratios (HRs) and 95% Confidence Intervals (CIs) comparing Perioperative CT versus nCRT for Disease-Free Survival (DFS) within the ESOPEC-like cohort. Interaction P-values test for heterogeneity of treatment effect across subgroup levels and were derived from multivariable Cox proportional hazards models, each including main effects plus a single treatment-by-covariate interaction term, tested one at a time. Hazard Ratios < 1.0 indicate a result favouring perioperative CT, while HRs > 1.0 indicate a result favouring nCRT, plotted on a logarithmic scale.

In the mixture cure model, there was a non-significant trend toward lower odds of cure with nCRT compared to perioperative FLOT (OR 0.59; 95% CI, 0.15–2.35; reference: FLOT); a forest plot of the interaction analysis for this endpoint is shown in Fig. 4, and the full multivariable model is detailed in Appendix Table S10.

Fig. 4: Forest plot of subgroup analysis for the odds of Cure with perioperative chemotherapy versus nCRT (ESOPEC-like cohort).

Subgroup odds ratios (ORs) for cure, comparing perioperative FLOT chemotherapy to nCRT, were estimated using a mixture cure model with a Weibull distribution for time to event and a logistic link for the cure fraction. Multiple imputation was performed using predictive mean matching, and the model was subsequently fitted to the imputed datasets. Covariates were specified to influence only the probability of cure. Odds ratios less than 1 indicate lower odds of cure with nCRT and, therefore, higher odds of cure with perioperative FLOT. Interaction p-values correspond to each subgroup analysis. All estimates are adjusted for the covariates listed. This plot displays results from the prevalence (cure) component only. CI confidence interval, CT chemotherapy, DFS disease-free survival, ECOG PS Eastern Cooperative Oncology Group Performance Status, FISH fluorescence in situ hybridisation, HR hazard ratio, nCRT neoadjuvant chemoradiotherapy, NLR neutrophil-to-lymphocyte ratio.

Among non-cured (susceptible) patients, no significant difference in time to relapse was observed between groups (TR, 0.87; 95% CI, 0.56–1.35; reference: FLOT).

To test the robustness of our primary estimate, we performed two complementary analyses. First, a sensitivity analysis excluding cT4 (stage IVA) tumours yielded an OR for cure of 0.49 (95% CI, 0.19–1.24), which was broadly consistent with the estimate obtained in the full cohort. Second, as a form of external validation, we applied our model to reconstructed data from the pivotal ESOPEC trial. This analysis yielded concordant results, showing lower odds of cure for nCRT compared to perioperative FLOT (OR 0.50; 95% CI, 0.36–0.86). Furthermore, among susceptible (non-cured) patients in the ESOPEC data, time to relapse was shorter with nCRT (TR, 0.62; 95% CI, 0.62–0.99).

We next explored conditional effects of key clinical variables on cure probability. Subgroup analyses from the AGAMENON-SEOM registry indicated reduced cure rates with nCRT in patients with stage III disease or elevated neutrophil-to-lymphocyte ratios (all p for interaction <0.05; Fig. 4). For example, among patients with stage III tumours, the odds of cure were lower with nCRT compared to perioperative FLOT (OR 0.23; 95% CI, 0.06–0.93).

Dynamic modelling using time-varying hazard ratios from the Royston-Parmar model showed that the protective effect of FLOT in stage III tumours remained stable over time, consistent with durable benefit following treatment discontinuation (Grambsch–Therneau test, p = 0.449; Fig. 3b). For stage II patients, the OR for cure was 3.94 (95% CI, 0.44–35.47; reference: FLOT). Consistently, the 5-year recurrence rate for stage I–II tumours treated with nCRT was 39.5% (95% CI, 8.5–59.9%), compared to 76.6% (95% CI, 61.4–94.0%) for stage III tumours treated with nCRT. In contrast, among patients with stage III tumours treated with perioperative FLOT, the 5-year recurrence rate was 64.1% (95% CI, 52.3–72.8%). For stage IVA tumours, reported globally without separation by treatment strategy, the 5-year recurrence rate was 80.7% (95% CI, 58.1–92.0%). A more detailed analysis of outcomes in this subgroup is reported in Appendix Table S11.

FLOT exhibited higher rates of overall and grade ≥3 toxicity (Appendix Amit plots Figs. S8 and S9). Significant differences in surgical and pathological outcomes between the FLOT and CROSS cohorts mirrored those seen in the global cohort (Appendix Table S12). Surgical resection rates were similar (FLOT 92.2% vs CROSS 85.4%; p = 0.09), but R0 resection was higher with CROSS (95.5% vs 82.6%; p = 0.03). CROSS also yielded more favourable pathological T and N stages, fewer positive lymph nodes, and a lower overall ypTNM stage, while pCR rates were comparable (19.51% vs. 11.54%, p = 0.06). Rates of postoperative complications and serious complications were generally similar, although pleural effusions were more frequent after FLOT (27.36% vs. 14.77%, p = 0.03). Thirty-day mortality was higher with CROSS (11.36% vs. 3.98%, p = 0.03).

Exploratory analyses of secondary real-world cohorts

In addition to our primary comparisons, we conducted two exploratory analyses on secondary cohorts to better understand real-world treatment patterns. These findings should be considered hypothesis-generating. First, we examined a ‘NeoAEGIS-like’ cohort, whose outcomes were consistent with our primary ESOPEC-like comparison, providing supplementary support for our main findings (full analysis in Appendix Tables S13–S16 and Figs. S10–S13).

Second, we descriptively compared FLOT (n = 218) to a heterogeneous cohort of 90 patients who received one of two alternative strategies: either various fluoropyrimidine-platinum-based neoadjuvant chemoradiotherapy regimens (FP-based nCRT) or CROSS nCRT followed by adjuvant nivolumab (baseline characteristics in Appendix Table S17). This analysis yielded inconclusive results and, unlike our primary comparison, did not show a clear benefit for FLOT. Although the nCRT ± ICI arm was associated with a significantly higher rate of pathologic complete response (21.7% vs. 11.5%; p = 0.024), this did not translate into a survival advantage.

We found no statistically significant differences in survival, with adjusted HR for DFS of 1.33 (95% CI, 0.81–2.17) and for OS of 1.24 (95% CI, 0.78–1.97) (Fig. 2; Appendix Tables S2–S3, S18–S19, and Figs. S14–S15). Similarly, estimates from the cure model were imprecise, showing no significant difference in cure rates (OR 0.95; 95% CI, 0.24–3.71) or time to relapse for non-cured patients (time ratio 1.23; 95% CI, 0.87–1.74) (Appendix Fig. S18). First-relapse patterns, surgical outcomes (Appendix Table S20), and toxicity profiles (Appendix Figs. S16–S17) were also broadly comparable.

Discussion

Although the ‘cure rate’ is the Holy Grail of therapy for localised cancer, its use as a standard primary endpoint in clinical trials has been hindered by methodological challenges, beginning with the fundamental limitation that cure is not a directly observable state. Consequently, the scientific community has historically relied on endpoints like DFS and OS. Mixture cure models are emerging as a valuable tool to overcome these limitations by estimating the fraction of cured patients, thus providing a deeper understanding of long-term therapeutic benefit [7]. However, their application in the literature has so far been mostly limited to secondary or exploratory analyses rather than for the primary validation of treatments [8, 30,31,32]. While mixture cure models have been used in oesophageal cancer, their application to neoadjuvant or perioperative therapies remains rare and largely unexplored [33, 34].

In this analysis of the AGAMENON/SEOM registry, we explored the outcomes of competing therapeutic strategies in a real-world cohort, with a primary focus on estimating cure rates. Our findings provide a real-world context for the recent conclusions of the ESOPEC trial [16], focusing on the cure rates of each therapeutic strategy and the potential influence of dynamic treatment factors. The estimated 3-year DFS rates in our study (47.4% with FLOT vs. 40.1% with nCRT) closely paralleled those reported in ESOPEC (51.6% vs. 35.0%).

Similarly, the ESOPEC trial reported a median OS of 66 months versus 39 months (HR 0.70; p = 0.01), and our findings in the ESOPEC-like cohort were consistent with this result, demonstrating a statistically significant survival benefit for FLOT in a real-world setting (median OS, 43.9 vs. 28.6 months; HR 0.63; p = 0.015) [16].

Importantly, we verified that the benefit of perioperative FLOT over CROSS-based nCRT was largely attributable to higher cure rates, consistent with a cytotoxic model that eradicates micrometastases under these patterns of use. Notably, this benefit was reflected in a higher probability of definitive cure—rather than merely a delay in recurrence—among aggressive subgroups, such as patients with stage III disease or an elevated neutrophil-to-lymphocyte ratio (NLR > 2.5) [16]. By contrast, in smaller, stage II tumours with lower metastatic potential, the poorer outcomes observed with FLOT may indicate a potential role for nCRT, particularly in scenarios where achieving an R0 resection is technically challenging due to tumour location and local control remains critical.

In this regard, the ESOPEC trial reported a neutral effect of nCRT in cN0 tumours (HR 0.98) [16]. Nevertheless, accumulating evidence questions the value of intensifying neoadjuvant therapy in favourable-risk settings, as studies in cT2N0 cohorts have shown no survival benefit from neoadjuvant chemotherapy compared to upfront surgery [35], and even possible disadvantages of nCRT in smaller tumours [35].

Beyond the clinical stage, our DFS analysis revealed a significant interaction by histological grade, with FLOT conferring a benefit in patients with grade 3 tumours, whereas nCRT was associated with a trend toward lower cure rates in this subgroup.

This suggests that the aggressive biology of high-grade tumours may render them more responsive to intensive systemic CT. Further study of recurrence patterns stratified by tumour grade and treatment modality is needed to clarify these findings.

Our findings on treatment completion in a real-world setting closely mirror the patterns reported in the pivotal ESOPEC trial, especially regarding the challenges of the FLOT regimen. In our cohort, only 58% of patients were able to complete the full perioperative course by receiving postoperative FLOT. This rate is strikingly similar to the 57% of patients in ESOPEC who received all four planned cycles of chemotherapy after surgery. Likewise, the high completion rate for neoadjuvant CROSS in our registry (93%) aligns with the feasibility observed in ESOPEC, where 93% of patients received at least four of the five planned chemotherapy cycles and 98% received the full radiation dose [16]. This consistency between a real-world cohort and a selected clinical trial population highlights a crucial point: while the neoadjuvant components of both FLOT and CROSS are highly feasible, completing the postoperative phase of FLOT is a significant challenge across different settings [1]. This observation lends further support to the hypothesis that the therapeutic benefit of FLOT may be primarily driven by its preoperative component.

In keeping with these patterns, our analyses suggest that the therapeutic advantage of FLOT may stem primarily from its preoperative component rather than from the postoperative phase, which is consistently harder to deliver across series. However, this finding contrasts with limited observational data suggesting a benefit for adjuvant completion [36], as well as with results from the phase II VESTIGE trial, in which standard adjuvant chemotherapy was not outperformed by adjuvant nivolumab/ipilimumab in patients with ypN+ and/or R1 gastroesophageal adenocarcinoma following neoadjuvant CT and surgery [37]. Both studies, therefore, support the continuation of adjuvant chemotherapy in this setting. Adding further complexity, the international SPACE-FLOT study concluded that the benefit of adjuvant FLOT was confined to partial pathological responders, with no survival advantage for complete or minimal responders [38].

Whether nCRT has become obsolete in the era of immuno-augmented perioperative chemotherapy remains an open question. The MATTERHORN trial, evaluating perioperative FLOT plus durvalumab, appears to have tipped the balance strongly toward perioperative chemotherapy [3]; however, there are no direct comparisons with nCRT-based strategies that incorporate modern radiotherapy techniques, concurrent FOLFOX, or immunotherapy. Thus, current evidence does not support definitive conclusions, as the landscape continues to evolve.

Indeed, our present analyses advise caution against broad generalisations regarding the superiority of FLOT over alternative nCRT regimens. In our exploratory analysis of the AGAMENON-SEOM registry, we observed no statistically significant advantage in cure rates for perioperative FLOT relative to these alternative nCRT strategies; in fact, the point estimate numerically favoured the nCRT cohort, which also presented a distinct toxicity profile. Although exploratory, these observations may reflect the ongoing evolution of multimodal approaches. For instance, while some studies suggest potential benefits for certain nCRT strategies, such as the use of induction FOLFOX with PET-guided selection in CALGB 80803, this evidence is not yet firm and is contrasted by discrepant data from other trials like NeoSCOPE [18, 19].

This complexity, along with the potential contribution of adjuvant ICIs [14], suggests that the superiority of perioperative FLOT may not be universal, and that certain integrated nCRT strategies remain a relevant therapeutic option with a potentially more favourable risk-benefit profile in selected clinical scenarios.

Prospective randomised evaluation will be necessary to clarify the optimal sequencing and integration of perioperative and neoadjuvant modalities—most notably, through direct comparison of FLOT plus anti–PD-1 versus nCRT with FOLFOX plus anti–PD-1, likely selecting immunotherapy candidates based on biomarker status. Meanwhile, adopting integrated nCRT strategies, particularly FOLFOX-based regimens, provides a tailored approach for patients who are not candidates for FLOT and highlights the increasing importance of individualised therapy in an era of expanding treatment options.

Our study has several limitations inherent to its registry-based, retrospective design. First, despite multivariable adjustment, our findings remain vulnerable to selection bias—stemming from centre-level differences in therapy selection—and to unmeasured confounding. While we attempted to mitigate this risk through instrumental variable selection criteria, these limitations cannot be entirely eliminated. However, it is important to underscore that our primary objective was not to replicate the causal inference of a randomised controlled trial. Rather, our intention was to illustrate the value of cure as a clinically meaningful endpoint and to provide insights into conditional treatment effects across patient subgroups. This approach moves beyond estimating average effects, instead drawing attention to which patients may achieve true long-term benefit. Second, real-world data inevitably capture heterogeneity in patient selection, CT regimens, treatment adherence, and radiotherapy techniques, precluding the type of standardised comparisons achievable in randomised trials and potentially limiting generalisability. At the same time, this diversity enables adaptive learning from routine practice and may reveal patterns not evident in controlled environments. Third, consistent with most studies in this field (including pivotal trials such as ESOPEC), therapies administered after disease recurrence were not systematically collected; variation in subsequent treatments may therefore have influenced OS independently of the initial neoadjuvant strategy. Additionally, our data do not permit biomarker-based analyses, limiting our ability to identify patient subgroups who may derive differential benefit from specific perioperative regimens. Translational studies from pivotal perioperative ± immunotherapy trials are needed to determine whether biomarkers such as PD-L1 have predictive value in the localised setting.

In conclusion, cure is an informative and clinically meaningful endpoint in localised oesophageal cancer and should be systematically reported. Specifically, within the AGAMENON-SEOM registry, perioperative FLOT was associated with higher cure rates than CROSS, with this benefit appearing most pronounced in selected high-risk subgroups. Our exploratory analyses also suggest that alternative neoadjuvant strategies, such as those incorporating FOLFOX-based chemoradiotherapy or adjuvant immunotherapy, warrant further investigation. While our findings regarding these regimens were inconclusive, they underscore the importance of prioritising research into integrated and novel multimodal approaches to improve curative outcomes for patients.