Introduction

Early gastric cancer (EGC) refers to gastric cancer that is limited to the mucosa or submucosa, regardless of lymph node metastasis, for which endoscopic submucosal dissection (ESD) is currently the optimal treatment1,2. Compared with traditional surgical intervention, ESD offers minimal trauma, shorter recovery time, fewer complications, and a lower economic burden, with a 5-year overall survival (OS) rate exceeding 90% for most cases3. Nevertheless, non-curative resection (NCR) remains a major limitation of ESD, affecting up to 30–40% of patients undergoing the procedure. NCR carries a risk of local recurrence and lymph node metastasis (LNM) that may subsequently require surgical resection4. Hence, accurate pre-procedural prediction of NCR could facilitate risk evaluation and optimal treatment selection for EGC patients.

The continuous advancement of endoscopic techniques has significantly enhanced the detection and diagnosis of gastric lesions. Currently, widely used techniques include white light imaging (WLI), magnifying endoscopy, and endoscopic ultrasound5. Magnifying endoscopy can determine whether a lesion is neoplastic, but it cannot assess the depth of invasion. The accuracy of endoscopic ultrasound is affected by factors such as location in the upper third of the stomach, excessively large lesion diameter, and low differentiation grade. WLI is currently the most prevalent procedure worldwide for detecting abnormalities and evaluating mucosal status in the gastrointestinal tract. It enables physicians to identify early signs of cancerous lesions, such as ulceration, redness, elevated margins, and fold convergence6,7, at a manageable cost for most patients. Endoscopic features, along with other parameters such as pathology type, depth of infiltration, and inflammation indices including the neutrophil–lymphocyte ratio (NLR) and systemic immune inflammation index (SII)8,9,10,11,12, have been used to construct prediction models for NCR. Unfortunately, these models either lack adequate validation in a different cohort or fall short of comprehensively including WLI features of EGC. Herein, we aim to describe the pre-procedural WLI features of EGC patients in a Chinese cohort and analyze their capacity for predicting NCR.

Methods

Patient selection

We retrospectively collected the clinicopathological data of EGC patients receiving ESD at the Affiliated Hospital of Qingdao University from March 2016 to March 2024. Patients were divided into two groups using March 2023 as the cutoff point: those undergoing ESD before March 2023 formed the training set, while those treated afterward formed the validation set. Inclusion criteria were as follows: (1) Gastric carcinoma confirmed by post-operative pathology3; (2) Primary lesion of the stomach; (3) Complete clinical and pathological data. Exclusion criteria included: (1) History of gastric surgery; (2) History of other malignant tumors and receipt of any cancer therapy; (3) Invalid or incomplete pathological or WLI results. Ultimately, 455 cases were used as the training set and 113 cases as the validation set.

Data collection

The following data were collected from all patients through our electronic medical record system: (1) Demographics (age, gender, body mass index (BMI), cigarette and alcohol consumption, Helicobacter pylori (Hp) infection, history of hypertension, diabetes, and preoperative serum carcinoembryonic antigen (CEA))13,14; (2) Endoscopic features (tumor size, location, morphology, ulceration, and peripheral mucosal status)15,16,17; (3) Pathological data (histological type, depth of invasion, vascular infiltration, vertical and horizontal margins). ESD procedures were performed by senior endoscopists with experience of over 100 independent ESD cases. White-light endoscopic findings were independently evaluated by two junior endoscopists, followed by cross-review. All endoscopic assessments were conducted under the supervision of senior endoscopists. Our study was approved by the Ethics Committees of the Affiliated Hospital of Qingdao University. All patient information was anonymized and de-identified before analysis.

Gross and pathologic evaluation

Postoperative pathological classification was based on the WHO classification standard18. The differentiated type includes malignant epithelial tumors of the general type, specifically papillary adenocarcinoma (pap) and tubular adenocarcinoma (tub1, tub2). Meanwhile, the undifferentiated type includes poorly differentiated adenocarcinoma (por1, por2) and signet ring cell carcinoma (sig). For early gastric cancer (EGC) having a combination of both types of components, its classification was decided by the predominant histological component (> 50%). The gross appearance follows the Paris classification (elevated, depressed, and flat)19,20.

Evaluation of resection efficacy

Resection efficacy was evaluated in accordance with the JGCA guideline4. En-bloc resection is defined as the removal of the lesion in a single piece without any fragments left. Complete resection is achieved when histopathological examination verifies that there is no horizontal or vertical margin invasion of the tumor after an en-bloc resection4,17,21. Curative resection is defined as a complete resection with no evidence of margin invasion or lymphatic/vascular involvement. The endoscopic curability of patients was subsequently categorized into three types: endoscopic curability A (eCura A), endoscopic curability B (eCura B), and endoscopic curability C (eCura C), which can be further divided into eCura C-1 and eCura C-2. The treatment strategy for eCura C-1 cases remains controversial, whereas eCura C-2 patients are more prone to lymph node metastasis, so additional gastrectomy is recommended for all eCura C-2 patients. In this study, eCura C-2 cases were accordingly classified as NCR.

WLI assessment

The size, location, and morphology were thoroughly documented according to the criteria of the Japanese Research Society for Gastric Cancer4. Remarkable redness was defined as a red mucosal surface of the lesion similar to that of regenerative epithelium. Whitish mucosal change was defined as a whitish or pale lesion compared with the surrounding mucosa. Ulceration was defined as a local depression or disruption of the mucosa, sometimes accompanied by exudate and bleeding on the surface. Scar was characterized as a fibrotic, thickened area on the mucosa. Spontaneous bleeding indicated bleeding, or an inclination towards bleeding, of the lesion without external physical contact or stimulation. Converging folds were described as thickened and merged gastric folds around the lesion. A granular surface denoted small, uneven granular or nodular elevations on the lesion surface, while elevated margin referred to protrusion at the edge of the tumor (Fig. 1). The determination of Hp positivity required simultaneous fulfillment of: (1) typical endoscopic manifestations according to the Kyoto classification criteria (diffuse redness, mucosal swelling), and (2) clearly documented histopathological examination or urea breath test positivity in medical records.

Fig. 1

Representative images of key endoscopic characteristics associated with early gastric cancer. (a) Ulceration—local mucosal disruption or depression, sometimes with exudate or bleeding. (b) Fold convergence—thickened and merging gastric folds surrounding the lesion. (c) Remarkable redness—red mucosal surface similar to regenerative epithelium. (d) Surface nodularity—small, uneven granular or nodular elevations. (e) Marginal elevation—protrusion at the tumor edge. (f) Scar—fibrotic, thickened mucosal area. (g) Spontaneous bleeding—lesion-associated bleeding without external physical contact. (h) Whitish mucosal change—pale or whitish lesion compared to the surrounding mucosa.

Statistical analysis

Continuous variables were expressed as means and standard deviations, and categorical variables as numbers with percentages. Univariate analysis was carried out using the chi-square test or Fisher’s exact test. Variables with significant differences in the univariate analysis were then entered into a multivariate logistic regression analysis. Independent risk factors identified in the multivariate analysis were used to construct a nomogram for predicting NCR of EGC. A calibration curve was generated using the bootstrap method with 1,000 resamples to evaluate the agreement between predicted and observed risk. A receiver operating characteristic (ROC) curve was generated and the area under the curve (AUC) was calculated to assess the discrimination ability of the nomogram. Decision curve analysis (DCA) was performed to evaluate the clinical practicability of the prediction models. A p-value < 0.05 was considered statistically significant. SPSS 27.0 was used for statistical analysis, and R software (version 4.3.2) was utilized to construct and validate the nomogram.
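
As an illustration of what the AUC measures, the following minimal Python sketch computes it as the probability that a randomly chosen NCR case receives a higher predicted risk than a randomly chosen curative case, with ties counted as one half. The function name `auc_from_scores` and the toy inputs are assumptions made for this illustration; this is not the R code used in the study.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC.

    scores: predicted NCR risks (any real numbers).
    labels: 1 for NCR, 0 for curative resection.
    Returns the fraction of (NCR, curative) pairs in which the NCR case
    has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three of the four NCR/curative pairs are ordered correctly.
toy_auc = auc_from_scores([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-sum formulation gives the same value more efficiently.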

Results

Baseline characteristics

Based on the above inclusion and exclusion criteria and related definitions, the patient grouping and flowchart are shown in Fig. 2. The baseline characteristics of the patients are presented in Table 1. There were 455 cases in the training set, among which 385 cases underwent curative resection (CR) and 70 cases underwent NCR. The cohort was predominantly composed of male patients over 60 years old. Patients with male gender, advanced age, a history of alcoholism, and obesity were more likely to develop EGC. No baseline factors were found to be significantly correlated with the incidence of NCR.

Fig. 2

Study design. A total of 568 patients were included in the study, comprising both the training and validation sets.

Table 1 Demographic characteristics of EGC patients in the training and validation sets.

White light imaging features of the training set

The white light imaging features of the training set were analyzed (Table 2). The consistency evaluation between two endoscopists in assessing WLI features of early gastric cancer demonstrated excellent agreement, with kappa values ranging from 0.822 to 0.910. Ulceration and nodular surface were the most common endoscopic features of EGC, with a prevalence of 13.4% and 18.5% of cases, respectively. The training set consisted predominantly of cases with tumors growing in the middle and lower 1/3 of the stomach (92.8%), and most lesions were differentiated carcinomas located in the mucosal layer, with a diameter of no more than 30 mm. Ulceration, fold convergence, significant redness, elevated margin, whitish mucosa, larger lesion size, Hp infection, histology and depth of invasion were significantly associated with NCR (P < 0.05) (Table 2). Given that the primary objective of this study was to investigate the risk factors of NCR under WLI, histology and depth of invasion were not included in the multivariate analysis.
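
For readers unfamiliar with the agreement statistic reported above, a minimal sketch of Cohen's kappa for two raters is given below. The function name `cohens_kappa` and the example ratings are hypothetical and serve only to show the calculation; they are not the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same lesions.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement),
    where chance agreement is computed from each rater's marginal frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Counter returns 0 for categories that the second rater never used.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of one feature (1 = present, 0 = absent) on ten lesions.
endoscopist_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
endoscopist_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(endoscopist_1, endoscopist_2)
```

In this toy example the raters agree on 8 of 10 lesions, yet kappa is only about 0.52 because much of that agreement is expected by chance when the feature is uncommon.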

Table 2 White light imaging and pathological features of the training set.

Detailed characterization of NCR

As shown in Table 3, we performed a stratified analysis of the NCR group. The data demonstrated that lesions with submucosal invasion ≥ 500 μm (T1b2), positive vertical margins, undifferentiated histology, ulceration, and larger tumor size accounted for a considerable proportion in our cohort. Although the validation set contained significantly more cases with larger tumors and undifferentiated histology (P < 0.05), the distribution of key prognostic factors (T1b2, lymphovascular invasion, positive vertical margin) remained balanced.

Table 3 Characteristics of NCR cases in both training and validation sets.

Multivariate analysis of NCR

The collinearity test was performed on the significant factors identified in the univariate analysis. The variance inflation factors were all less than 5. We further carried out multivariate analysis. The results showed that ulceration (odds ratio [OR] 3.779; 95% confidence interval [CI] 1.861–7.672), fold convergence (OR 3.934; 95% CI 1.468–10.544), significant redness (OR 3.453; 95% CI 1.556–7.664), elevated margin (OR 2.333; 95% CI 1.058–5.146), whitish mucosal change (OR 2.850; 95% CI 1.077–7.539), lesion size between 20 and 30 mm (OR 2.463; 95% CI 1.203–5.043), lesion size > 30 mm (OR 3.368; 95% CI 1.543–7.351), and tumor location were identified as independent risk factors for NCR (Table 4).
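
The reported odds ratios and confidence intervals can be related back to the underlying logistic regression coefficients. The sketch below assumes that the intervals are Wald intervals (coefficient ± 1.96 standard errors on the log-odds scale), which is the usual default in statistical software but is not stated explicitly in the text; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_or(odds_ratio, ci_low, ci_high):
    """Recover coefficient, standard error, z statistic and two-sided p-value
    from an odds ratio and its 95% CI, assuming a Wald interval."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p_value


def or_with_ci(beta, se):
    """Forward direction: odds ratio and 95% Wald CI from coefficient and SE."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


# Ulceration as reported above: OR 3.779, 95% CI 1.861-7.672.
beta_ulcer, se_ulcer, z_ulcer, p_ulcer = wald_from_or(3.779, 1.861, 7.672)
```

Under this assumption the odds ratio should lie close to the geometric mean of the interval bounds, which offers a quick consistency check on tabulated values.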

Table 4 Multivariate logistic regression analysis of the training set.

Establishment and validation of the nomogram

We then established a nomogram based on the independent risk factors obtained from multivariate analysis, which include ulceration, fold convergence, significant redness, elevated margin, whitish mucosa, lesion size, and tumor location (Fig. 3). Fold convergence had the greatest impact on the model. A receiver operating characteristic (ROC) curve was generated to evaluate the discrimination ability of the model in both the training and validation sets (Fig. 4). The area under the curve (AUC) was 0.8095 (95% CI: 0.7538–0.8651) for the training set and 0.7567 (95% CI: 0.6427–0.8707) for the validation set, indicating satisfactory predictive performance of the nomogram in both cohorts. We further assessed calibration in both sets using calibration curves (Fig. 4). The calibration curve (solid line) for the prediction of NCR showed good consistency between predicted and observed outcomes for both cohorts, with mean absolute errors of 0.023 and 0.026, respectively. The Hosmer-Lemeshow goodness-of-fit test yielded a p-value of 0.33 for the training set and 0.08 for the validation set, suggesting that the model exhibits satisfactory calibration.

Fig. 3

Nomogram for predicting non-curative resection in early gastric cancer patients. Each independent risk factor identified in multivariate analysis was assigned a corresponding point value based on its odds ratio. The total score, obtained by summing individual factor scores, is vertically aligned with the probability scale to estimate the patient’s risk of NCR. Abbreviations: EGC, early gastric cancer; u, upper third of the stomach; m, middle third of the stomach; l, lower third of the stomach.
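
How odds ratios translate into nomogram points can be sketched as follows. The sketch assumes the common convention that points are proportional to the regression coefficient (the natural log of the odds ratio) and are rescaled so that the largest effect receives 100 points. Tumor location is omitted because its odds ratios are not given in the text, and the model intercept is likewise unreported, so the published point scale may differ and no probabilities are computed here. The names `REPORTED_OR`, `nomogram_points` and `total_points` are hypothetical.

```python
import math

# Odds ratios reported in the multivariate analysis (tumor location not given in the text).
REPORTED_OR = {
    "ulceration": 3.779,
    "fold convergence": 3.934,
    "significant redness": 3.453,
    "elevated margin": 2.333,
    "whitish mucosal change": 2.850,
    "lesion size 20-30 mm": 2.463,
    "lesion size > 30 mm": 3.368,
}


def nomogram_points(odds_ratios, max_points=100.0):
    """Assign points proportional to log(OR); the largest effect gets max_points."""
    betas = {name: math.log(value) for name, value in odds_ratios.items()}
    largest = max(abs(b) for b in betas.values())
    return {name: max_points * (b / largest) for name, b in betas.items()}


def total_points(points, present):
    """Sum the points of the features observed in one lesion."""
    return sum(points[name] for name in present)


points = nomogram_points(REPORTED_OR)
# Hypothetical lesion: ulcerated, 20-30 mm in diameter.
example_total = total_points(points, ["ulceration", "lesion size 20-30 mm"])
```

Under this convention fold convergence receives the maximum score, in line with the statement that it had the greatest impact on the model, although the scale would change if tumor location carried a larger coefficient.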

Fig. 4

ROC curve and calibration curve of the training and validation sets. (a) ROC curve of the training set, showing the model’s ability to discriminate between curative and non-curative resection cases. (b) Calibration curve of the training set: The x-axis represents the predicted probability of NCR, while the y-axis represents the actual observed NCR occurrence. The diagonal dashed line indicates perfect prediction, and the solid line represents the nomogram’s performance. Closer alignment between the solid and dashed lines reflects better predictive accuracy. (c) ROC curve of the validation set, assessing the model’s discrimination performance in an independent cohort. (d) Calibration plot of the validation set, demonstrating the agreement between predicted and actual NCR rates. Abbreviations: AUC, area under the curve; ROC, receiver operating characteristic.

Clinical utility

Decision curve analysis (DCA) was also performed to evaluate the clinical practicality of this model (Fig. 5). When the threshold probability fell within 0.03 to 0.71 in the training set and within 0.04 to 0.70 in the validation set, deciding whether to take additional intervention based on the NCR risk calculated by this nomogram yielded a greater net benefit than either an all-intervention scheme or a no-intervention scheme. While DCA provides a theoretical decision framework, its clinical application requires integration with the endoscopist’s experience level, local healthcare resource availability and other practical clinical considerations.
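
The net benefit underlying a decision curve can be written as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability, TP and FP are the numbers of true and false positives at that threshold, and n is the cohort size. The following minimal sketch applies this standard definition; the function names and toy data are assumptions for illustration and do not reproduce the R analysis used in the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of intervening on every patient whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the all-intervention scheme; the no-intervention scheme is 0."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): one NCR case among four patients.
toy_labels = [1, 0, 0, 0]
toy_probs = [0.9, 0.1, 0.2, 0.6]
nb_model = net_benefit(toy_probs, toy_labels, 0.2)
nb_all = net_benefit_treat_all(toy_labels, 0.2)
```

One consequence of the formula is that the all-intervention line crosses zero where the threshold equals the NCR prevalence, which is 70/455 (about 15.4%) in the training set.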

Fig. 5

DCA curves of the training set and the validation set. (a) The DCA curve of the model in the training set, (b) the DCA curve of the model in the validation set. The X-axis represents the threshold probability, and the Y-axis represents the net benefit. The “All” line represents intervening on everyone, the “None” line represents intervening on no one, and the “Nomogram” line represents the net benefit of the predictive model across the range of threshold probabilities.

Discussion

ESD possesses multiple advantages compared to gastrectomy, including a shorter hospital stay, fewer complications, higher overall survival and decreased overall postoperative morbidity. However, additional surgery resulting from NCR still poses a significant burden on both clinicians and patients. Therefore, it is necessary to carefully evaluate the curability of ESD for each patient before the procedure and to recommend surgical removal for those who are considered highly susceptible to NCR at the beginning. The traditional endoscopic examination method, WLI, offers endoscopists prompt and effective assessment of EGC and surrounding mucosa at first glance22,23. Endoscopic features, including size, morphology, and ulceration, also aid in the estimation of tumor malignancy, whereby the risk of NCR after ESD can be deduced. We herein conducted a retrospective study that emphasizes the WLI features of EGC in the prediction of NCR.

In this study, we confirmed through multivariate analysis that WLI features, including remarkable redness, ulceration, fold convergence, marginal elevation, whitish mucosal change, larger lesions, and Hp infection, are independent risk factors for NCR. The associations between NCR and both ulceration and tumor size have been established in various studies and guidelines. Our results regarding tumor gross type were contradictory, as we did not observe the finding of Jeon MY et al.24 that 0-IIa + IIc or 0-IIc + IIa tumors were more susceptible to NCR. This discrepancy may require further investigation in a large-scale, multicenter study. Fold convergence had the largest odds ratio in our multivariate analysis, making it the most robust indicator of NCR. This could be owing to its close association with SM invasion and potential LNM25,26. Lesions with converging folds, therefore, are at greater risk of NCR during ESD. Remarkable redness is often associated with vascular proliferation and inflammatory responses in the tumor27,28, which may indicate increased invasiveness into the deeper mucosa, thereby increasing the risk of NCR. As a potential predictor of fibrotic or scarred tissue, whitish mucosal changes are more frequently associated with undifferentiated carcinomas, which typically present with higher malignancy and increased resection difficulty29. The association between elevated margin and NCR has not been identified in previous studies. According to Yamada et al.30, 10–20% of lesions with an elevated margin develop SM infiltration. However, the risk of SM invasion grows significantly when an elevated margin is accompanied by other risk factors such as converging folds, ulceration, or depressed morphology26,31. It would be preferable to combine these features when assessing the risk of NCR. We also, for the first time, found that Hp-infected mucosa exhibited an elevated risk of NCR. Hp infection manifests as mucosal changes such as atrophy, meandering, and thickened folds in the gastric corpus, xanthoma, or gooseflesh-like mucosa32,33,34. Although Hp infection does not directly cause NCR, it may contribute to the occurrence of NCR by promoting tumor heterogeneity and causing mucosal inflammation that impairs endoscopic visibility. Given the inherent limitations of retrospective studies, and to avoid confounding factors affecting the predictive model, we excluded cases with successful Hp eradication. Gastric cancer following eradication therapy will be an important focus of future research.

We then utilized the above-described WLI features to build a predictive nomogram for NCR before the ESD procedure. Unlike previous models that incorporated serological, pathological or radiological parameters29,35,36, our model relies entirely on readily available endoscopic features, making it more accessible for routine clinical use. The nomogram showed satisfactory discrimination and calibration in both the training and validation sets, with AUC values of 0.8095 and 0.7567, respectively. We also employed DCA to assess the clinical utility of the model in actual decision-making. The model yielded a greater net benefit than either a no-intervention or an all-intervention strategy at threshold probabilities ranging from 0.03 to 0.71 in the training set and from 0.04 to 0.70 in the validation set. To our knowledge, this is the first attempt to use only WLI features to build a prediction model for NCR.

Our in-depth analysis of NCR revealed statistically significant differences in tumor size and histological differentiation between the training and validation cohorts (P < 0.05). Importantly, these variations likely reflect the inherent biological heterogeneity of NCR encountered in clinical practice rather than selection bias, as demonstrated by the maintained equilibrium in critical prognostic factors (T1b2: 37.1% vs. 33.3%; LVI+: 8.6% vs. 4.8%; VM1: 30.0% vs. 23.8%; all P > 0.05). The consistent distribution of these fundamental risk parameters across both cohorts strongly supports the robustness and clinical applicability of our developed prediction model. We also found that while margin status remains important, our data suggest tumor biology frequently drives NCR decisions, particularly in lesions with undifferentiated histology or deep submucosal invasion.

The elevated NCR rates in both training and validation sets likely stem from intrinsic tumor heterogeneity and limitations in preoperative assessment. Critical pathological features, such as invasion depth and lymphovascular/neural invasion, frequently remained undetected during preoperative evaluation and were confirmed only postoperatively. The high NCR proportion may enhance the model’s ability to identify high-risk features while potentially reducing its discriminative efficacy for early-stage lesions (eCura A/B). This further underscores the necessity of precise preoperative assessment for early gastric cancer.

Our study has several limitations: First, this single-center investigation utilized training and validation datasets exclusively derived from one hospital’s patient population, which might affect the generalizability of the findings. Second, the retrospective design and the exclusion of surgically overtreated cases introduce potential selection and information biases. Third, inter-observer variability in WLI assessments among endoscopists could influence the model’s clinical utility. Importantly, the inherent constraints of retrospective studies currently prevent us from providing direct evidence that WLI-based scoring could reduce unnecessary ESD procedures or improve therapeutic outcomes. Additionally, false-negative NCR predictions might result in inappropriate ESD attempts, constituting a notable limitation of the current model.

To address these limitations, prospective studies are warranted to compare WLI-guided triage strategies with standard multimodal assessment approaches. Furthermore, subsequent research should prioritize clinical outcome tracking to validate real-world utility. Future investigations should also compare overtreatment rates between endoscopic and surgical management in eCura C-2 cases. Multicenter studies with rigorous internal-external validation frameworks should be implemented, utilizing independent test sets to comprehensively evaluate model generalizability. Besides, incorporating comparative evaluations between the nomogram and endoscopists at different experience levels, along with implementing standardized training protocols, may enhance diagnostic accuracy and improve the generalizability of study findings.

In summary, we have developed a nomogram based on WLI characteristics for EGC that demonstrated excellent predictive performance for NCR. On one hand, for novice endoscopists this nomogram can serve as a clinically accessible first-line risk stratification tool and a preliminary screening tool to identify high-risk lesions requiring further evaluation with magnifying endoscopy, narrow-band imaging, or endoscopic ultrasound, thereby optimizing resource allocation. On the other hand, for experienced practitioners, our model can enhance existing multimodal assessment strategies and avoid unnecessary overtreatment caused by the imprecise evaluation of NCR risk prior to ESD.