Introduction

Esophageal cancer (EC) is a major contributor to the global cancer burden due to its high incidence and mortality rates1,2. According to the guidelines of the Japanese Esophageal Society, endoscopic intervention is proposed as a definitive treatment for EC confined to the superficial layers, namely the epithelium (T1a-M1) or lamina propria (T1a-M2). Conversely, when EC invades deeper tissues such as the muscularis mucosae (T1a-M3) or submucosa (T1b), endoscopic therapy is not yet standardized due to concerns about an elevated risk of lymph node metastasis3,4. The risks associated with conventional esophagectomy, including complications, recurrence, and mortality, remain significant, and its impact on patients’ quality of life cannot be overlooked5,6,7. As an additional option for organ preservation, chemotherapy is associated with a significant local failure rate, dose-escalation-related side effects, and a notable risk of disease recurrence8,9,10,11. The most appropriate treatment for patients with T1a-M3/T1b EC has not been defined.

Given the severity of EC, a growing number of studies have investigated the prognosis associated with various treatment modalities. McCarty et al.12 performed a stratified analysis of the outcomes for stage I EC patients using data from the Surveillance, Epidemiology, and End Results (SEER) Database covering the period from 2004 to 2015. According to Min et al.13, for patients with esophageal squamous cell carcinoma (ESCC) lacking endoscopic evidence of obvious submucosal invasion, endoscopic submucosal dissection (ESD) can offer long-term outcomes similar to those of esophagectomy.

Numerous models have been developed for the prognosis or adjuvant treatment of EC. Jia et al.14 created a model for predicting survival in early-stage EC; however, that model lacks external validation, which limits its persuasiveness. Moreover, most published models are not specific to a particular cancer stage15,16,17,18. Other studies have focused on evaluating the risk of lymphatic metastasis and chemotherapy-related death19,20,21,22. At present, comprehensive models that systematically integrate clinical factors are lacking in this field.

In this study, the SEER database was used to assess the survival outcomes of patients with T1a-M3/T1b EC who received various treatments. A user-friendly web-based platform was then developed to predict the survival of patients diagnosed with stage T1a-M3/T1b EC.

Results

Patient characteristics

The study included 1199 patients diagnosed with T1a-M3/T1b EC. Patient characteristics are summarized by treatment method and extent of tumor invasion (Table 1 and Tables S1–S3). The observed variations were consistent with the distinct epidemiological features of the region.

Table 1 Characteristics of all esophageal cancer patients stratified by treatment

Comparative analysis of survival outcomes before adjustment

Participants were divided into five groups based on treatment method. Table S4 details the 5-year overall survival (OS) and cancer-specific survival (CSS) rates. Compared with chemoradiotherapy (CRT), endoscopic therapy, esophagectomy, esophagectomy with CRT, and endoscopic therapy with CRT all showed significant survival benefits: endoscopic therapy (HR: 1.61; 95% CI, 1.46–1.79; P < 0.0001 for CSS; HR: 1.33; 95% CI, 1.24–1.43; P < 0.0001 for OS), esophagectomy (HR: 2.20; 95% CI, 1.89–2.56; P < 0.0001 for CSS; HR: 2.11; 95% CI, 1.86–2.39; P < 0.0001 for OS), esophagectomy with CRT (HR: 3.61; 95% CI, 2.18–5.99; P < 0.0001 for CSS; HR: 3.17; 95% CI, 2.12–4.75; P < 0.0001 for OS), and endoscopic therapy with CRT (HR: 1.38; 95% CI, 1.12–1.70; P = 0.0016 for CSS; HR: 1.18; 95% CI, 1.02–1.37; P = 0.024 for OS). Notably, no significant difference in CSS was found between endoscopic therapy and esophagectomy (HR: 1.18; 95% CI, 0.99–1.39; P = 0.059), whereas OS was significantly lower with endoscopic therapy (HR: 0.81; 95% CI, 0.73–0.90; P < 0.0001).

Endoscopic therapy showed superior CSS compared with esophagectomy combined with CRT (HR: 1.24; 95% CI, 1.05 to 1.48; P = 0.012), while OS was similar (HR: 0.97; 95% CI, 0.86 to 1.10; P = 0.63). Endoscopic therapy combined with CRT showed lower long-term survival compared with endoscopic therapy (HR: 2.59; 95% CI, 1.36 to 4.94; P = 0.0028 for CSS; HR: 1.95; 95% CI, 1.29–2.96; P = 0.0014 for OS) and esophagectomy (HR: 0.53; 95% CI, 0.30 to 0.96; P = 0.032 for CSS; HR: 0.36; 95% CI, 0.24–0.53; P < 0.0001 for OS). There was no significant difference in CSS between endoscopic therapy combined with CRT and esophagectomy combined with CRT (HR: 0.81; 95% CI, 0.57 to 1.17; P = 0.27), but OS was significantly better for the former (HR: 0.68; 95% CI, 0.52 to 0.88; P = 0.0027). Lastly, no significant difference in CSS or OS was found between esophagectomy and esophagectomy combined with CRT. All Kaplan-Meier curves illustrating these CSS and OS results are shown in Figs. S1–2.

Comparative analysis of survival outcomes based on sIPTW adjustment

After adjustment by stabilized inverse probability of treatment weighting (sIPTW), the CSS and OS results for endoscopic therapy, esophagectomy, and esophagectomy with CRT, each compared with CRT, remained consistent with the unadjusted results. Notably, endoscopic therapy plus CRT showed a significant CSS benefit over CRT alone (HR: 2.96; 95% CI, 1.41–6.21; P = 0.0036), but OS was similar (HR: 1.54; 95% CI, 0.95–2.49; P = 0.80). Adjusted CSS results indicated that endoscopic therapy had a statistically significant survival advantage over esophagectomy and esophagectomy combined with CRT (HR: 0.60; 95% CI, 0.36–1.00; P = 0.038 and HR: 2.55; 95% CI, 1.31–5.00; P = 0.0068), while OS remained comparable (HR: 1.11; 95% CI, 0.82–1.50; P = 0.50 and HR: 1.07; 95% CI, 0.63–1.80; P = 0.80).

When comparing CSS between endoscopic therapy combined with CRT and esophagectomy, no significant differences were found (HR: 1.15; 95% CI, 0.44 to 3.00; P = 0.78). However, esophagectomy showed a significant advantage in OS (HR: 2.46; 95% CI, 1.37 to 4.42; P = 0.001). There were no notable differences in CSS or OS between endoscopic therapy alone vs. combined with CRT, or esophagectomy alone vs. combined with CRT. Comparing CSS and OS between endoscopic therapy combined with CRT and esophagectomy combined with CRT yielded the same results as before. These findings are summarized in Figs. 1 and 2, with covariate balance data in Tables S5–14.

Fig. 1: In the sIPTW-adjusted analysis, Kaplan-Meier curves were utilized to compare CSS outcomes for various treatments of T1a-M3/T1b esophageal cancer.

Survival outcomes of different treatments: endoscopic therapy and esophagectomy (a), endoscopic therapy and CRT (b), endoscopic therapy and endoscopic therapy combined with CRT (c), endoscopic therapy and esophagectomy combined with CRT (d), endoscopic therapy combined with CRT and esophagectomy (e), endoscopic therapy combined with CRT and esophagectomy combined with CRT (f), endoscopic therapy combined with CRT and CRT (g), esophagectomy and CRT (h), esophagectomy and esophagectomy combined with CRT (i), and esophagectomy combined with CRT and CRT (j). sIPTW stabilized inverse probability of treatment weighting, CSS cancer-specific survival, CRT chemoradiotherapy.

Fig. 2: In the analysis adjusted using sIPTW, Kaplan-Meier curves were utilized to compare OS outcomes among different treatments for T1a-M3/T1b esophageal cancer.

Survival outcomes of different treatments: endoscopic therapy and esophagectomy (a), endoscopic therapy and CRT (b), endoscopic therapy and endoscopic therapy combined with CRT (c), endoscopic therapy and esophagectomy combined with CRT (d), endoscopic therapy combined with CRT and esophagectomy (e), endoscopic therapy combined with CRT and esophagectomy combined with CRT (f), endoscopic therapy combined with CRT and CRT (g), esophagectomy and CRT (h), esophagectomy and esophagectomy combined with CRT (i), and esophagectomy combined with CRT and CRT (j). sIPTW stabilized inverse probability of treatment weighting, OS overall survival, CRT chemoradiotherapy.

Comparative analysis of survival outcomes based on PSM

In this study, propensity score matching (PSM) served as an auxiliary sensitivity analysis and effectively balanced covariates between groups (Tables S15–S24). The CSS results from the PSM analysis were largely in agreement with those obtained through sIPTW, reinforcing the reliability of our analytical methods. The Kaplan-Meier survival curves for both CSS and OS following PSM adjustment are shown in Figs. S3 and S4.

Comparative analysis of survival outcomes based on histological stratification

For the subgroups defined by esophageal adenocarcinoma (EA) and ESCC, the key pathological characteristics are summarized in Tables S25–26. Among 199 patients with squamous cell carcinoma, after sIPTW adjustment, no significant differences were observed in CSS or OS between endoscopic therapy and esophagectomy (HR: 0.67; 95% CI: 0.23 to 1.93; P = 0.43 and HR: 1.12; 95% CI: 0.58 to 2.17; P = 0.8). Endoscopic therapy showed a notable survival benefit in CSS compared to CRT (HR: 4.11; 95% CI: 1.22 to 13.84; P = 0.049), whereas OS remained comparable (HR: 1.74; 95% CI: 0.85 to 3.56; P = 0.19).

For the 999 patients with adenocarcinoma, both CSS and OS were significantly higher in the endoscopic therapy group compared to the CRT group (HR: 8.72; 95% CI: 4.90 to 15.52; P < 0.0001 for CSS and HR: 4.49; 95% CI: 2.86 to 7.05; P < 0.0001 for OS). The CSS analysis indicated that endoscopic therapy provided a survival advantage over esophagectomy (HR: 0.57; 95% CI: 0.36 to 0.91; P = 0.045), while OS did not differ significantly (HR: 1.14; 95% CI: 0.83 to 1.57; P = 0.42). All Kaplan-Meier survival curves and corresponding tables for CSS and OS are presented in Figs. S5–S8 and Tables S27–S46.

Comparative analysis of survival outcomes based on time and surgical procedures

Because treatment strategies for endoscopic resection (ER) evolved over the study period, we performed a subgroup analysis stratified into three intervals (2004–2008, 2009–2013, and 2014–2017). This analysis evaluated survival under the same treatment paradigm across successive eras and confirmed the advantages of endoscopic therapy (Table S47).

Because the risks of postoperative complications and mortality vary with the degree of organ preservation, we categorized esophagectomy into three subgroups: code A300 (partial esophagectomy) as the first category; code A400 (total esophagectomy) as the second; and codes A500–A550 along with code A800 (esophagectomy combined with resection of other organs) as the third. Subgroup analyses were performed to further clarify the consistency among the different esophagectomy approaches (Tables S48–49).

Comparative analysis of survival outcomes from an external database

We gathered data from 121 patients diagnosed with T1a-M3/T1b EC who were treated at our hospital between 2015 and 2019, forming an external validation cohort. The baseline characteristics of this external dataset are summarized in Tables S50–51. Endoscopic therapy demonstrated superior outcomes in both CSS and OS compared to CRT (HR: 2.55; 95% CI: 1.71 to 3.81; P < 0.0001 for CSS, and HR: 2.19; 95% CI: 1.59 to 3.00; P < 0.0001 for OS). In comparison with esophagectomy, endoscopic therapy provided a significant survival advantage in CSS (HR: 3.92; 95% CI: 0.97 to 15.79; P = 0.038), while no statistically significant difference was observed in OS (HR: 1.28; 95% CI: 0.46 to 3.55; P = 0.63). These findings are illustrated in Figs. S9–10.

Prediction model development

The total SEER data set was randomly divided into a training cohort (80%, 959 patients) and an internal validation cohort (20%, 240 patients). Validation cohort 2 comprised the 121 patients treated exclusively at our hospital. Together, these cohorts represented a diverse population of EC patients.
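The random 80/20 division can be illustrated with a minimal sketch. The function name, the use of integer patient indices, and the seed are assumptions made for illustration; the source does not describe the software or seed used for the split.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.8, seed=0):
    """Randomly split patient identifiers into training and validation lists.

    The seed is a hypothetical choice for reproducibility; it is not taken
    from the study.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    # Truncate so that 1199 patients give 959 for training and 240 for validation.
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, validation_ids = split_train_validation(range(1199))
print(len(train_ids), len(validation_ids))
```

With 1199 patients, truncating 80% gives the 959/240 division reported above.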

Univariate Cox regression analysis was used for variable screening (Table S52), and six relevant variables were selected: age, therapy, median household income (MHI), sex, grade, and histology. BSR was then used to maximize the R² value and determine the best combination of variables, resulting in the selection of age, therapy, grade, histology, MHI, the interval from diagnosis to treatment initiation (DTTT), and race (Fig. S11A). LASSO regression with cross-validation was applied to find the variable combination with the smallest mean squared error (MSE); as shown in Fig. S11B, C, three variables were retained: MHI, age, and histology (Table S53). The variables identified by each of these three methods were entered into multivariate Cox regression analyses to construct models. Final variable selection relied on backward stepwise elimination using the function “step”, which minimizes the Akaike information criterion (AIC). In the Cox group, excluding sex reduced the AIC from 6115.59 to 6113.69. In the BSR group, excluding DTTT and race reduced the AIC from 6122.13 to 6113.69. None of the variables selected by LASSO regression were excluded, and the AIC was 6162.74. We additionally applied the Boruta algorithm, which identified the candidate variables age, therapy, histology, grade, race, and DTTT. Following AIC comparison (6124.04 vs. 6116.87), the resulting model was refined to include age, therapy, histology, and grade (Fig. S11D).
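The backward stepwise idea behind AIC-based selection can be sketched as follows. This is an illustration only and not the study's code: the `aic_of` callback stands in for refitting a Cox model and reading its AIC, and `toy_aic` is a hypothetical scoring function that does not reproduce the AIC values reported above.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def backward_stepwise(
    variables: Iterable[str],
    aic_of: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Drop one variable at a time for as long as doing so lowers the AIC."""
    current = frozenset(variables)
    current_aic = aic_of(current)
    while len(current) > 1:
        # Score every model obtained by removing a single variable.
        candidates = [(aic_of(current - {v}), v) for v in sorted(current)]
        best_aic, dropped = min(candidates)
        if best_aic >= current_aic:
            break  # no single removal improves the AIC
        current, current_aic = current - {dropped}, best_aic
    return current, current_aic


def toy_aic(model: FrozenSet[str]) -> float:
    """Hypothetical AIC: 2 per parameter plus a large penalty per missing informative variable."""
    informative = {"age", "therapy", "grade", "histology", "MHI"}
    return 2 * len(model) + 100 * len(informative - model)


selected, final_aic = backward_stepwise(
    ["age", "therapy", "MHI", "sex", "grade", "histology"], toy_aic
)
print(sorted(selected), final_aic)
```

In this toy setting, removing the uninformative variable lowers the score and every further removal raises it, so the search stops after one step.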

Subsequently, we selected variables that were consistently identified across multiple screening approaches and that had clinical relevance with verified prognostic significance in the literature. As a result, sex and the treatment interval (DTTT) were included. Race was excluded from the final model because of its limited generalizability and because it could not be validated in external cohorts.

Consequently, a final prediction model with seven variables (age, grade, histology, MHI, therapy, sex, and DTTT) was constructed to forecast OS at 1-year, 3-year, and 5-year intervals, as depicted in Fig. S11E.

Predictive performance of the model

We incorporated the selected factors to construct our prognostic prediction model, which achieved AUC scores of 0.715, 0.726, and 0.746 for predicting outcomes at 1-year, 3-year, and 5-year intervals (Fig. 3a). The model was internally validated through 10-fold cross-validation repeated 200 times (Fig. S12). This validation yielded AUC scores of 0.704, 0.698, and 0.719 at 1-year, 3-year, and 5-year intervals, respectively, with C-index values of 0.690, 0.675, and 0.674 for these time points. In the internal validation group, the AUC values for 1-year, 3-year, and 5-year predictions were 0.730, 0.743, and 0.769, while in the external validation group, they were 0.765, 0.731, and 0.785 (Fig. 3b, c). The calibration curves for the 1-year, 3-year, and 5-year predictions showed minimal variation across all training and validation datasets (Fig. 3d–l), suggesting strong agreement between the predicted and observed outcomes.
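For readers unfamiliar with the C-index, the following minimal sketch computes Harrell's concordance index for right-censored data. The source does not state which C-index variant or software was used, so Harrell's definition is an assumption, and the function name and toy inputs are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter follow-up time than subject j. A higher risk score is expected
    to go with shorter survival; tied scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in years, event indicator (1 = death), model risk score.
c = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(round(c, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the values of roughly 0.67 to 0.69 reported above.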

Fig. 3: Evaluation of the predictive performances for OS in T1a-M3/T1b patients in the validation groups.

ROC curves showing the predictive performance of the model in the training group (a), internal validation group (b), and external validation group (c). Calibration curves for the 1-year, 3-year, and 5-year OS rates of T1a-M3/T1b esophageal cancer patients in the training group (d–f), validation 1 group (g–i), and validation 2 group (j–l). OS overall survival, ROC receiver operating characteristic.

Dynamic model website

The webpage functions as an advanced instrument for forecasting the long-term survival outcomes of patients with stage T1a-M3/T1b EC, utilizing a sophisticated prediction model. Users can access interactive visualizations at https://ziqichen1.shinyapps.io/dynnomapp/. Additionally, the algorithm underlying the model can be refined and updated dynamically in response to feedback gathered from its ongoing applications.

Discussion

Our study showed a significant survival advantage with endoscopic therapy compared to esophagectomy and CRT for T1a-M3/T1b EC patients after sIPTW adjustment. These results were confirmed in the PSM-adjusted cohort, systematic subgroup analysis and external validation, reinforcing our conclusions.

The incidence of lymph node metastasis and invasion in T1a-M3 cases observed in surgical specimens differs significantly from that detected in endoscopic specimens23,24, with markedly higher rates reported among surgically treated cases23,25. The role of endoscopy in estimating depth of invasion of T1-M3-SM1 or deeper remains uncertain26. This disparity likely stems from differences in histopathological diagnostic methods between surgical and endoscopic specimens. Surgical specimens, being larger, may lead to T1a-M3 cases being misdiagnosed as T1b4,27,28. Invasion beyond T1a-M3 has been identified as an important predictor of lymph node metastasis, with a risk equivalent to that observed in T1b-SM1 cases23,25,29,30, which supports combining both groups as our target population.

Our findings align with the majority of prior studies regarding the long-term survival outcomes of various treatment approaches31,32,33,34. Katy A Marino et al.35 found that postoperative survival was comparable between the endoscopic treatment group and the esophagectomy group in patients with T1a adenocarcinoma (P = 0.003). In a recent analysis, patients with T1a or T1b ESCC who underwent ESD showed no significant difference in OS, cancer recurrence, or metastasis compared to those treated with esophagectomy36. Our study included a larger cohort of patients undergoing various treatments than previous studies. By using sIPTW and PSM methods to assess datasets while considering EC’s relative indication range, our research provided more clinically practical and reliable results.

Considering the variation among subgroups, we performed systematic subgroup analyses with specific emphasis on histological stratification, time period, and surgical procedure. Results in the stratified cohorts adjusted by sIPTW indicated the stability of our findings. Furthermore, when applied across diverse regions, the model performed favorably in both the internal and external validation cohorts. These observations further emphasize the robustness and the broad potential applicability of our study.

In a prospective study conducted by Keiko et al.37 from 2006 to 2012, endoscopic therapy combined with selective CRT yielded results comparable to surgery in 176 patients with T1b (SM1–SM2) ESCC. Kenji Yamauchi et al.38 conducted a retrospective analysis and concluded that ESD alone or combined with chemotherapy and/or radiotherapy may be considered a viable treatment option for patients with M3 and SM1 ESCC. By comparing the CSS of endoscopic therapy with that of CRT or esophagectomy, our study further supports these earlier results and highlights the value of endoscopic treatment. Although the recent randomized controlled parallel-group ESORES trial verified the therapeutic efficacy of CRT, we note that heterogeneity in the applicable population39, potential micrometastases, and completion rates of the strategy might explain the discrepancy. Endoscopic resection (ER) is both an initial diagnostic tool and a potentially curative method, with relatively low costs and high accessibility across medical settings. Meanwhile, ESD is preferred for its superior curative resection and reduced local recurrence rate compared with CRT, especially for lesions >15 mm. Although it is widely accepted that ER alone is inadequate for patients with deep submucosal invasion, lymphovascular invasion, or poor differentiation, the depth of tumor invasion is difficult to evaluate before ER in clinical practice.

Interestingly, endoscopic therapy showed a significant advantage over esophagectomy in CSS but not in OS. This might be explained by competing mortality in patients with EC, especially those undergoing endoscopic therapy. Patients with superficial esophageal cancer, especially those suitable for endoscopic treatment, are typically elderly and often have multiple comorbidities. In this group, the risk of death from cardiovascular or respiratory disease, other cancers, or other causes unrelated to esophageal cancer may be as high as, or higher than, the risk from esophageal cancer itself. Endoscopic treatment, being minimally invasive, greatly reduces the risk of perioperative death and severe treatment-related complications, thereby lowering early cancer-specific mortality. This is the main reason for the significant improvement in CSS. In contrast, esophagectomy is a highly invasive procedure with significant perioperative mortality and morbidity. Most of these deaths were directly linked to cancer treatment, which greatly reduced early CSS in the esophagectomy group. In our analysis, the influence of treatment method on CSS was associated with an age threshold of 65 years (Table S54). Ultimately, younger age and fewer comorbidities were associated with better overall survival.

The variables commonly used as significant prognostic indicators in numerous studies include histology, grade, age, and therapy15,40,41,42. Calvin et al.43 found that lower median household income was associated with higher cancer-specific mortality and lower endoscopic resection rates in T1aN0M0 esophageal adenocarcinoma. Our article also suggested that MHI served as a significant predictor within the model. Specifically, we found that higher-income individuals were less likely to receive chemoradiotherapy in healthcare systems where patients bear the cost of treatment (Table S55). This finding indicates that economic factors represent a significant barrier to treatment selection within specific healthcare financing models.

Treatment guidelines from different countries are clearly inconsistent in their selection criteria for therapeutic approaches to T1a-M3/T1b EC. The National Comprehensive Cancer Network guidelines recommend endoscopic resection for superficial ESCC confined to the mucosa (T1a), and esophagectomy in cases with submucosal invasion44. In contrast, the Japan Esophageal Society defines mucosal cancer limited to the lamina propria as an absolute indication for endoscopic resection on account of its low potential for lymph node metastasis45. Similarly, the European Society of Gastrointestinal Endoscopy suggests lamina propria invasion without evidence of lymph node metastasis as an absolute indication for endoscopic treatment46. Based on our analyses, endoscopic treatment can reasonably be recommended as both an initial diagnostic modality and a potential therapeutic approach for patients with T1a-M3/T1b EC on account of the lower lymph node metastasis rate in this population, the current limitations of non-invasive preoperative staging, reduced procedural risk, fewer complications, broader availability across healthcare settings, and cost-effectiveness. In the United States, the utilization of endoscopic treatment for T1b EC increased significantly from 6.6% in 2004 to 20.9% in 201047. This upward trend reflects growing clinical acceptance driven by advances in endoscopic imaging and resection techniques, as well as the centralisation of endoscopic treatments, factors that collectively contributed to improved survival outcomes. Our study, validated through both internal and external cohorts, provides significant evidence for the selection of diagnostic and therapeutic strategies for superficial EC. Nevertheless, further multicenter, large-sample prospective studies are warranted to confirm these findings and refine strategic criteria.

Our study possesses several distinctive features: (1) We conducted a comprehensive analysis of treatment outcomes for the controversial stage T1a-M3/T1b, incorporating a detailed treatment classification that was not present in previous studies. (2) The collected clinical variables encompass not only conventional patient data but also socioeconomic factors. This integration enables our model to incorporate both clinical and social realities, thereby providing tailored prognoses for patients with diverse social circumstances. (3) For the first time, we have created a range of intuitive web-based prediction tools designed for endoscopic procedures, surgical applications, and comprehensive global patient care. Despite the patient data used for developing and validating the model being restricted to two specific regions, no significant regional variations were noted in the diagnostic and therapeutic approaches for EC across various locations. This suggests that our model exhibits wide-ranging applicability. Additionally, visual interfaces improve patients’ understanding of their condition’s severity and foster more effective doctor-patient communication and collaborative care.

This research acknowledges several inherent limitations. First, we must acknowledge that the SEER database has limitations, including gaps in sociodemographic information, data on metastasis, and data on lymphovascular invasion, which limit the scope of some analyses. Second, the relatively small size of the external validation cohort might affect the evaluation of model performance on independent datasets. Considering the variability in clinical data across different healthcare settings, developing a robust predictive model that can be effectively generalized to multiple centers poses a significant challenge.

In summary, this study developed a predictive model for T1a-M3/T1b EC patients, validated its predictive performance, and established a dedicated webpage. The results of our study indicate that endoscopic therapy exhibits superior long-term survival outcomes in comparison to CRT and is comparable to esophagectomy.

Methods

Patients

The SEER database, which gathers comprehensive registry data from 18 U.S. cancer registries, was the source of the data used in this study. The criteria for selecting participants were as follows: (1) patients with a diagnosis of T1a-M3/T1b EC, having no regional lymph node involvement or distant metastasis, between 2004 and 2017; (2) individuals who underwent endoscopic therapy, esophagectomy, or CRT, but did not receive preoperative neoadjuvant treatment; (3) availability of clinical details such as treatment type, tumor location, and survival duration; (4) existence of follow-up documentation; (5) diagnosis verified by autopsy or death certificate. The approval from the Ethics Committee (2024ZDSYLL281-P01) confirmed that this research follows the ethical standards set out in the Declaration of Helsinki and its subsequent revisions. As this was a retrospective study, the ethics committee waived the requirement for informed consent. We registered for and gained access to the SEER database, and no additional ethical statements were required for the use of publicly available data.

Variables

In the SEER database, the codes for the depth of invasion categorized as T1a-M3/T1b are 120, 160, and 165. These codes correspond to invasion of the muscularis mucosae, invasion of the submucosa, and T1b classification without further extension details, respectively. For endoscopic therapy, the surgical codes fall within the ranges A100–A140 and A200–A270. Codes A100 through A140 stand for local tumor destruction procedures in which no specimens were sent for pathological examination, such as photodynamic therapy, electrocautery, cryotherapy, and laser ablation. Codes A200 to A270 represent local tumor excision procedures in which specimens were sent for pathological examination. Codes A300 to A800 indicate esophagectomy procedures: A300 Partial esophagectomy; A400 Total esophagectomy, NOS; A500 Esophagectomy, NOS WITH laryngectomy and/or gastrectomy, NOS; A510 WITH laryngectomy; A520 WITH gastrectomy, NOS; A530 Partial gastrectomy; A540 Total gastrectomy; A550 Combination of A510 WITH any of A520–A540; A800 Esophagectomy, NOS.
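The code ranges above can be expressed as a small lookup, sketched here for illustration. The function name and the category labels are hypothetical, and the grouping of A500–A550 with A800 follows the three esophagectomy subgroups described in the Results rather than any official SEER grouping.

```python
def classify_surgery(code: str) -> str:
    """Map a surgery code such as 'A300' to the treatment category used in this study."""
    number = int(code.strip().upper().lstrip("A"))
    if 100 <= number <= 140:
        return "endoscopic: local tumor destruction"  # no specimen sent to pathology
    if 200 <= number <= 270:
        return "endoscopic: local tumor excision"  # specimen sent to pathology
    if number == 300:
        return "esophagectomy: partial"
    if number == 400:
        return "esophagectomy: total"
    if 500 <= number <= 550 or number == 800:
        return "esophagectomy: A500-A550 or A800"
    return "unclassified"


for example in ("A120", "A200", "A300", "A540", "A800"):
    print(example, "->", classify_surgery(example))
```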

The clinical factors obtained from the SEER database comprised the patient’s age at diagnosis, racial background, marital status, radiotherapy administration, chemotherapy administration, median household income (MHI), and the interval from diagnosis to treatment initiation (DTTT). Tumor features were categorized based on tumor size (>2 cm versus ≤2 cm) and tumor grade (including well-differentiated [Grade I], moderately differentiated [Grade II], poorly differentiated [Grade III], undifferentiated [Grade IV], and those with unknown differentiation). Utilizing the International Classification of Diseases for Oncology, Third Edition (ICD-O-3), histological types were classified into three groups: adenocarcinoma (EA), characterized by specific codes (8140–8145, 8210, 8211, 8255, 8260, 8261, 8263, 8310, 8480, 8481, 8490, and 8574); esophageal squamous cell carcinoma (ESCC), identified through codes (8050–8052, 8070–8076, 8083, and 8084); and other unspecified types. Primary tumor locations were divided into eight categories. Details regarding survival status, survival length, and causes of death were included in the follow-up data. The two main outcomes for the survival analysis were 5-year cancer-specific survival (CSS) and 5-year overall survival (OS).
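As an illustration of the histological grouping, the ICD-O-3 code lists given above can be encoded directly. The constant and function names are hypothetical; only the code values come from the text.

```python
# ICD-O-3 histology codes as listed in the text; ranges are inclusive.
EA_CODES = set(range(8140, 8146)) | {
    8210, 8211, 8255, 8260, 8261, 8263, 8310, 8480, 8481, 8490, 8574,
}
ESCC_CODES = set(range(8050, 8053)) | set(range(8070, 8077)) | {8083, 8084}


def classify_histology(icd_o_3_code: int) -> str:
    """Return 'EA', 'ESCC', or 'other' for an ICD-O-3 histology code."""
    if icd_o_3_code in EA_CODES:
        return "EA"
    if icd_o_3_code in ESCC_CODES:
        return "ESCC"
    return "other"


print(classify_histology(8140), classify_histology(8070), classify_histology(8000))
```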

Statistical analysis

Continuous variables are reported as mean and standard deviation or as median with interquartile range (IQR). Categorical variables are described using frequencies and percentages. The Mann-Whitney U test was used to analyze continuous data, and Fisher’s exact test was used for categorical data. Kaplan-Meier (K-M) survival curves were used to examine long-term effects on OS and CSS, and survival differences among groups were assessed with the log-rank test. This retrospective study used three analytical approaches for comparative analysis: unadjusted analysis, propensity score matching (PSM), and stabilized inverse probability of treatment weighting (sIPTW). Sensitivity analyses were carried out using PSM (1:1 ratio) and the hospital cohort. Given its capacity to control confounding factors, preserve the sample size, and minimize false positives, sIPTW was selected as the main analytical method.
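To make the weighting step concrete, the sketch below computes stabilized weights (the marginal probability of the received treatment divided by its estimated propensity) and a weighted Kaplan-Meier curve. It assumes the propensity scores have already been estimated by some model, which the source does not specify; all function names and toy values are hypothetical, and this is not the study's implementation.

```python
from collections import Counter


def stabilized_weights(treatments, propensity_of_received):
    """Stabilized IPTW: P(T = t) / P(T = t | covariates) for each patient."""
    n = len(treatments)
    marginal = {t: count / n for t, count in Counter(treatments).items()}
    weights = []
    for t, p in zip(treatments, propensity_of_received):
        if not 0.0 < p <= 1.0:
            raise ValueError("propensity scores must lie in (0, 1]")
        weights.append(marginal[t] / p)
    return weights


def weighted_kaplan_meier(times, events, weights):
    """Return [(time, survival)] at each distinct event time, using patient weights."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Weighted number still under observation at t, and weighted deaths at t.
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        died = sum(w for ti, e, w in zip(times, events, weights) if ti == t and e)
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve


w = stabilized_weights(["endo", "endo", "crt", "crt"], [0.5, 0.25, 0.5, 0.75])
print(w)
print(weighted_kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1], [1.0, 1.0, 1.0, 1.0]))
```

Because the numerator is the marginal treatment probability, the weights average close to 1 when the propensity model is well calibrated, so the weighted sample size stays near the original, which matches the rationale given above for preferring sIPTW.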

Univariate Cox regression, best subset selection (BSR), the Boruta algorithm, and cross-validated LASSO regression were used for the initial variable screening. Model selection was based on AUC evaluation. Multivariate Cox regression was then used to refine the prediction model according to Akaike information criterion (AIC) values. For internal validation, 10-fold cross-validation with 200 repetitions was used. The C-index, ROC curves, and calibration curves in both training and validation datasets were used to assess model performance at 1, 3, and 5 years. A two-sided significance level of 0.05 was set for all statistical analyses. The analysis flowchart is presented in Fig. 4.
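The resampling scheme of 10-fold cross-validation repeated 200 times can be sketched as an index generator. The function name and seed are assumptions; fitting and scoring a model on each split is left out because the source does not describe that code.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=200, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        rng.shuffle(indices)  # a fresh random partition for every repetition
        folds = [indices[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            train = [i for m, fold in enumerate(folds) if m != k for i in fold]
            yield train, folds[k]


# 959 training patients, 10 folds, 200 repetitions give 2000 train/test splits.
n_splits = sum(1 for _ in repeated_kfold(959))
print(n_splits)
```

Averaging a metric such as the AUC over all splits reduces the dependence of the estimate on any single random partition.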

Fig. 4: The overall flowchart of the study.

EC esophageal cancer, sIPTW stabilized inverse probability of treatment weighting, PSM propensity score matching, SEER the Surveillance, Epidemiology, and End Results, BSR best subset selection, LASSO least absolute shrinkage and selection operator, CRT chemoradiotherapy.