Artificial intelligence-driven prediction of lymph node metastasis in T1 esophageal squamous cell carcinoma using whole slide images

Ren, Li-Hua; Ding, Yuan; Zhang, Yue-Xin; Teng, Ke-Han; Wang, Lu; Zhang, Wan-Yue; Zhu, Ye; Xu, Jia-Jia; Wei, Xiao-Ying; Wang, Bin; Hu, Kai; Shi, Rui-Hua

doi:10.1038/s41698-025-01186-z

Published: 23 November 2025

npj Precision Oncology volume 9, Article number: 403 (2025)

Abstract

Accurate prediction of lymph node metastasis (LNM) in T1 esophageal squamous cell carcinoma is critical for guiding treatment decisions after endoscopic submucosal dissection (ESD). We developed a deep learning-based artificial intelligence model using whole slide images (WSIs) to predict LNM and reduce overtreatment. The model was trained, validated, and internally tested on 160 surgically resected cases (72 LNM+, 88 LNM–) selected from 374 patients without prior ESD, achieving an AUC of 0.949 (95% CI: 0.912–0.986) on the internal test set. Further validation was performed on an external ESD cohort comprising clinically high-risk cases with invasion depths from MM to SM2. On a per-slide basis, the model attained an accuracy of 90.1%, sensitivity of 81.8%, specificity of 91.4%, an F1-score of 69.2%, and a negative predictive value (NPV) of 96.9%. The high NPV and specificity underscore the model’s utility in minimizing overtreatment while preserving diagnostic accuracy in high-risk T1 esophageal cancer.

Introduction

Esophageal squamous cell carcinoma (ESCC) remains a substantial global health burden, with disproportionately high incidence and mortality rates in China^1,2. For early-stage lesions limited to the mucosa or superficial submucosa (T1a), endoscopic submucosal dissection (ESD) is established as the first-line curative treatment³. Post-ESD histopathological evaluation identifies high-risk features such as lymphovascular invasion (LVI) and tumor budding, which correlate with lymph node metastasis (LNM) and often prompt recommendations for supplemental esophagectomy⁴. However, the significant invasiveness and morbidity associated with esophagectomy, particularly in elderly patients or those with multiple comorbidities, raise concerns about overtreatment, as only ~10% of patients with LVI ultimately develop nodal metastases⁵. These findings underscore the critical need for refined risk stratification in post-ESD specimens to accurately identify occult LNM, thereby enabling personalized management and reducing unnecessary surgery in low-risk cohorts.

The conventional diagnostic paradigm for detecting tumor metastases, particularly micrometastases, relies on labor-intensive manual slide evaluation by pathologists, a process prone to diagnostic uncertainty due to subtle morphological features⁶. These challenges highlight the need for automated, objective tools to augment histopathological assessment. Over the past decade, artificial intelligence (AI) has emerged as a transformative tool in medical diagnostics, enabling automated or semi-automated analysis of complex imaging data⁷. Advances in computational pathology, fueled by high-throughput slide scanning, enhanced computing power, and scalable storage solutions, have further expanded AI’s capacity to mine microscopic lesions and interpret gigapixel-sized whole slide images (WSIs)⁸. While AI-driven prediction of LNM has been explored in multiple cancers^9,10, its application to ESCC remains unexplored, representing a critical gap in optimizing risk stratification for early-stage disease.

A cornerstone of AI implementation in WSI analysis involves segmenting high-resolution images into smaller, computationally manageable patches. Current methodologies predominantly employ supervised learning frameworks, utilizing dichotomized LNM status (positive/negative) as supervisory labels^11,12. This methodology offers distinct advantages for LNM prediction in cancer. First, supervised learning leverages histologically validated labels to establish a robust ground truth, enabling models to discern metastasis-associated features with high diagnostic accuracy¹³. Second, it explicitly models known clinicopathological features such as LVI or tumor budding that correlate strongly with metastatic risk, ensuring biologically relevant feature prioritization¹⁴. Third, supervised frameworks enhance interpretability by linking predictions to specific histopathological patterns, a prerequisite for clinical adoption where model transparency and reliability are paramount¹⁵. Finally, the flexibility of supervised learning supports integration with advanced architectures, including convolutional neural networks (CNNs) and graph neural networks (GNNs), which excel at capturing spatial and contextual dependencies within WSIs¹⁶.

To address the unmet need for precise LNM risk assessment in early-stage ESCC with invasion depths from MM to SM2, we developed an AI-driven GNN model using supervised learning in order to analyze WSIs from ESD specimens. This approach aims to reduce diagnostic subjectivity, improve detection of micrometastases, and ultimately guide personalized post-resection management.

Results

Study population and cohort characteristics

This study was conducted using two independent patient cohorts. The model was developed from a surgical cohort comprising 374 patients who underwent primary esophagectomy without prior ESD. Within this cohort, 72 patients were LNM+, and 302 were LNM–. To address the class imbalance and enhance model generalizability, a balanced subset was constructed, comprising 72 LNM+ and 88 randomly selected LNM– cases. The representativeness of this LNM– subset was confirmed, as no significant differences in key baseline characteristics were observed compared to the remaining 214 LNM– patients (Table S1).

This cohort of 160 patients was then randomly divided into a training/validation set (n = 112, 442 WSIs) and an internal test set (n = 48, 217 WSIs) in a 7:3 ratio. The distribution of critical prognostic factors, including LVI (61.6% vs. 64.6%) and actual LNM rate (46.4% vs. 41.6%), was well-balanced between these sets, with no statistically significant differences in age, sex, tumor size, clinical stage, tumor location, differentiation grade, lymphovascular or perineural invasion status, or lymph node yield (Table 1).
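A patient-level random split of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `split_patients`, the seed, and the placeholder patient IDs are assumptions, and the source does not state whether the split was stratified. Splitting by patient rather than by slide keeps all WSIs from one patient in the same set.

```python
import random

def split_patients(patient_ids, test_fraction=0.3, seed=42):
    """Randomly split unique patient IDs into (train_val, test) lists.

    Hypothetical helper: the seed and the unstratified shuffle are assumptions.
    """
    ids = sorted(set(patient_ids))      # deduplicate; sort so the seed is reproducible
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

# Placeholder IDs standing in for the 160 surgical-cohort patients.
patients = [f"P{i:03d}" for i in range(160)]
train_val, test = split_patients(patients)
print(len(train_val), len(test))  # 112 48
```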

Table 1 Baseline clinicopathological characteristics of the training and validation set and internal test set of surgical cohorts (n = 160)

For external validation, we utilized a separate cohort of 35 high-risk patients who had previously undergone ESD. This cohort comprised patients who subsequently received esophagectomy with systematic lymphadenectomy (n = 18, 85 WSIs) and those who were managed with surveillance alone (n = 17, 76 WSIs), resulting in a total of 161 WSIs for analysis. The final nodal status, confirmed by histology or follow-up, identified 4 patients as LNM+ and 31 as LNM–. The model’s performance was rigorously evaluated on this independent ESD cohort to assess its clinical applicability.

Validation performance of the AI model

As illustrated in Fig. 1, the optimal cutoff value for the model was determined from the internal test set. The model demonstrated robust discrimination in predicting LNM in ESCC, achieving an area under the ROC curve (AUC) of 0.949 (95% CI: 0.912–0.986) in the internal test set and 0.866 (95% CI: 0.768–0.964) in the external ESD validation cohort.

**Fig. 1: ROC curves of the training and testing set.**

Test performance and clinical utility

Table 2 summarizes the distribution of histopathological features, including submucosal invasion depth and tumor budding grade, within the external validation cohort, providing context for correlation analyses with model predictions.

Table 2 Clinicopathological characteristics and outcomes of the external ESD test cohort (n = 35)

On a per-slide basis within the external cohort, the AI model achieved an accuracy of 90.1%. Performance metrics included a sensitivity of 81.8%, specificity of 91.4%, an F1-score of 69.2%, and a negative predictive value (NPV) of 96.9% (Fig. 2). This high NPV suggests a potential to reduce unnecessary surgeries by correctly identifying a substantial proportion of non-metastatic cases, highlighting its utility for patient stratification towards non-surgical surveillance.

The corresponding confusion matrix is detailed in Table 3, which shows 18 true positives (TP), 4 false negatives (FN), 127 true negatives (TN), and 12 false positives (FP). These results underscore the model’s high accuracy and reliability, particularly in correctly classifying non-metastatic cases, thereby effectively minimizing the risk of false-positive predictions.
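The slide-level metrics reported above follow directly from these four counts. The short check below recomputes them; the function name `diagnostic_metrics` is illustrative and not taken from the study's code.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary diagnostic metrics, returned as fractions in [0, 1]."""
    return {
        "sensitivity": tp / (tp + fn),            # recall for LNM+ slides
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts from the external-cohort confusion matrix (161 slides in total).
metrics = diagnostic_metrics(tp=18, fn=4, tn=127, fp=12)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Run on the reported counts, this reproduces the published sensitivity, specificity, NPV, accuracy, and F1-score, and gives a positive predictive value of 60.0% (18/30).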

Table 3 The confusion matrix and performance metrics of AI in the external test set

Case-level diagnostic performance aligned with clinical practice

Reflecting real-world clinical decision-making, where a single positive slide typically defines a case as high-risk, we aggregated slide-level predictions to the case level using a max-pooling rule. In the external ESD cohort, the model achieved robust case-level performance, with a sensitivity of 100.0% (4/4), a specificity of 83.9% (26/31), and an overall accuracy of 85.7% (30/35). Notably, the NPV at the case level reached 100.0% (26/26) (Table 4). This exceptionally high NPV indicates the model’s high reliability in identifying patients who can safely avoid esophagectomy, while maintaining high sensitivity for the detection of true metastatic cases.
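The max-pooling rule can be sketched in a few lines. This is an illustration under stated assumptions: the helper name `aggregate_case_level`, the 0.5 cutoff, and the example probabilities are hypothetical and do not come from the study.

```python
from collections import defaultdict

def aggregate_case_level(slide_predictions, threshold=0.5):
    """Max-pool slide probabilities per case, then binarize.

    slide_predictions: iterable of (case_id, lnm_probability) pairs.
    threshold: hypothetical cutoff; the study's actual cutoff is not reproduced here.
    """
    case_max = defaultdict(float)
    for case_id, prob in slide_predictions:
        case_max[case_id] = max(case_max[case_id], prob)
    # A case is called positive as soon as any one slide reaches the cutoff.
    return {cid: (p, p >= threshold) for cid, p in case_max.items()}

# Made-up probabilities for two cases, for illustration only.
slides = [("case_A", 0.12), ("case_A", 0.81), ("case_A", 0.30),
          ("case_B", 0.05), ("case_B", 0.22)]
cases = aggregate_case_level(slides)
print(cases)
```

Because one positive slide is enough to flag a case, this rule can only raise case-level sensitivity relative to slide-level sensitivity, at the cost of some specificity, which is the pattern seen in the reported figures.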

Table 4 Case-level diagnostic performance of the AI model in the external ESD cohort

Discussion

The strategic management of T1 ESCC with submucosal invasion (MM-SM2) following ESD remains a considerable clinical challenge, primarily due to the substantial risk of LNM (approximately 15–30%)^17,18. Current clinical guidelines rely on conventional histopathological assessment, evaluating features such as depth of invasion, LVI, poor differentiation, and other high-risk histopathological features, to guide decisions regarding additional esophagectomy^19,20. Nevertheless, this approach is hampered by considerable interobserver variability and limited reproducibility in identifying features predictive of nodal involvement. The suboptimal discriminative capacity of these morphological criteria can lead to overtreatment of patients with minimal LNM risk, underscoring the urgent need for more precise and objective risk stratification tools^21,22.

To address this critical unmet need, we developed an AI-driven model for predicting LNM using computational pathology. Our model utilizes a hierarchical GNN architecture to autonomously learn multi-scale histopathological representations from WSIs, capturing intricate morphological patterns without relying on subjective human interpretation. Due to the scarcity of ESD specimens with surgically confirmed nodal status, model development incorporated surgically resected T1–T4 cases, while external validation was rigorously restricted to T1 ESD cases to ensure clinical relevance. This approach enables a fully automated, objective, and reproducible prediction of metastatic risk. The model demonstrated robust performance in internal testing (AUC: 0.949) and, crucially, in an external cohort of real-world MM-SM2 ESD cases, the most relevant subgroup for post-ESD decision-making. At the slide level, it achieved a sensitivity of 81.8% and a high NPV of 96.9%, with case-level max-pooling further enhancing its clinical utility (100% sensitivity, 100% NPV, 83.9% specificity). The consistently high NPV underscores the model’s capability to reliably identify patients at low risk of LNM, for whom conservative management may be appropriate, thereby potentially reducing unnecessary surgeries.

A key innovation of our framework is its ability to transcend the limitations of conventional region-of-interest (ROI) or patch-based analyses^10,23,24. By constructing a biologically interpretable k-nearest neighbor graph integrating multimodal features (including color histograms, spatial coordinates, and deep feature embeddings)^25,26,27, our GNN architecture effectively models local and global tissue architecture without manual annotation, overcoming the limitations of methods that introduce noise or fail to capture spatial dependencies. This end-to-end, supervised approach explicitly captures spatial relationships among histopathological patches, addressing the “needle-in-a-haystack” challenge inherent in WSI analysis and identifying subtle metastatic signatures potentially overlooked in conventional assessment^28,29.

Notably, our AI system autonomously learned prognostically relevant morphological patterns directly from WSIs, without explicit programming of established risk factors^30,31. It successfully identified a subset of low-risk patients, confirmed by postoperative histology, who might otherwise have been recommended for surgery under current guidelines^32,33. To enhance interpretability and mitigate the “black box” concern, we generated decision heatmaps that visualized model-prioritized regions. These heatmaps consistently highlighted areas concordant with established high-risk features, such as the invasive front and lymphocyte-rich stroma, a finding validated by independent expert pathologists, thereby lending biological plausibility to the model’s predictions. Nevertheless, it should be acknowledged that heatmaps remain indirect proxies of the underlying model reasoning.

Notwithstanding these promising performances, several limitations merit consideration. The single-center, retrospective design may affect generalizability, necessitating future multi-institutional prospective validation. The inclusion of multiple tumor slides per patient, while improving data utilization, introduces analytical complexity regarding intra-patient dependency. Furthermore, the incorporation of more advanced ESCC cases during training, necessitated by the limited availability of node-positive T1 ESD cases, creates a potential domain shift, a common compromise in computational pathology. In the external cohort, the inference of nodal status based on recurrence-free survival for non-surgical patients, while clinically accepted, represents an indirect method of outcome assessment. Future work should also systematically investigate case-level prediction integration, which may yield even higher diagnostic performance.

In conclusion, we developed and validated a pathologist-independent AI model that accurately predicts LNM risk in T1 ESCC from WSIs. This GNN-based framework provides a robust, automated decision-support tool to optimize post-ESD management pathways, facilitating personalized care and potentially improving quality of life. Future efforts should focus on external validation, real-world clinical integration, and the development of hybrid models combining AI predictions with molecular biomarkers for enhanced risk stratification.

Methods

Study design

This retrospective single-center study enrolled 374 patients with ESCC, stages T1–T4, who underwent primary esophagectomy with systematic lymphadenectomy without previous ESD at Zhongda Hospital Affiliated to Southeast University from January 2019 to December 2024 (Fig. 3). Among them, 72 were LNM+ and 302 were LNM–. To address class imbalance, 72 LNM+ and 88 randomly selected LNM– cases were included as the surgical cohort (n = 160) for model training, validation, and internal testing.

An independent external validation cohort comprised 35 patients with T1 ESCC (MM to SM2) who underwent ESD. This cohort included patients with LNM+ status confirmed by subsequent surgical resection, as well as LNM– patients defined by the absence of tumor recurrence during a 3-year follow-up period after ESD³⁴. This follow-up criterion is grounded in established oncological principles, where 3-year recurrence-free survival (RFS) serves as a clinically validated surrogate for confirming true nodal negativity in non-surgically managed patients³⁵.

This AI model employed a supervised GNN framework to analyze histopathological patterns in WSIs. Notably, no handcrafted histologic features (such as submucosal invasion depth, tumor budding, LVI, etc.) were manually extracted or explicitly incorporated as input variables. Instead, the model was trained directly on raw WSIs, allowing it to infer predictive patterns from the underlying morphology in a data-driven manner. The study protocol was approved by the institutional ethics review committee (No. 2024ZDSYLL385-P01).

Conventional histologic assessment

All specimens were immediately fixed in 10% neutral buffered formalin. They were then cut at the point where the deepest invasion area could be exposed on the cut surface. ESD specimens were cut into parallel 2–3 mm-thick sections, and esophagectomy specimens into 4–5 mm-thick sections, followed by hematoxylin and eosin (H&E) staining. All specimens were diagnosed on the basis of the 2019 World Health Organization Classification of Tumors, with lesions categorized as well differentiated, moderately differentiated, or poorly differentiated^3,5. Submucosal invasion depth was measured vertically from the muscularis mucosae, with cases stratified as SM1 (≤200 μm) or SM2 (>200 μm)¹⁷. LVI was assessed through combined immunohistochemical (D2-40) and histochemical methods (Victoria blue staining)¹⁷. Tumor budding, defined as isolated cancer cell clusters (≤5 cells) at the invasive margin, was graded as BD1 (0–4 buds/field), BD2 (5–9 buds/field), or BD3 (≥10 buds/field) under 200× magnification⁴. At our institution, additional surgery following ESD is recommended if any of the following features are present: (1) submucosal invasion depth >200 μm (SM2), (2) presence of LVI, (3) poorly differentiated histology, (4) positive vertical or horizontal resection margins, or (5) tumor budding grade ≥BD2. For surgically resected cases, both the number of metastatic lymph nodes and the total number of dissected lymph nodes were recorded from pathology reports. Lymph node yield was used to assess the adequacy of lymphadenectomy, with reference to guideline standards (≥15 nodes for accurate staging according to AJCC criteria)³⁶.

Data preparation and preprocessing

Among the 1284 WSIs obtained from the surgical cohort, slides without tumor tissue, slides of inadequate quality, or those containing only blank regions were excluded. As a result, 659 WSIs containing sufficient tumor regions were retained for model development. To provide a clinically interpretable workflow, the selected WSIs were then divided into small patches, morphological and spatial features were extracted, and graphs were constructed to represent the histological architecture. The proposed computational framework implements a unified analytical workflow for predicting LNM in T1-stage ESCC by systematically combining multimodal computational histopathological feature extraction with a hierarchical GNN architecture. Multimodal features (color histograms, spatial coordinates, ResNet-50 embeddings) were concatenated and normalized to a shared latent space via a fully connected layer (512 dimensions).

The overall framework of the proposed method is illustrated in Fig. 4. The ‘GNNClassifier’ uses two graph convolutional layers (GCNConv) with ReLU activation and dropout (p = 0.4) to propagate node features across the graph structure, then aggregates them into a slide-level representation via global mean pooling for classification. Results, including predicted probabilities and binary classifications, are logged in ‘prediction_results.txt’ for retrospective analysis. Auxiliary utilities check CUDA compatibility and GPU acceleration prerequisites, keeping the pipeline modular and reproducible.

**Fig. 4: A hierarchical GNN-based model was built for predicting lymph node metastasis.**

WSI acquisition and annotation

H&E-stained slides of all the tissue masses in each case were selected for further analysis. The slides were captured as WSIs at 40× magnification using NanoZoomer (Hamamatsu Photonics, Hamamatsu, Japan). Two experienced pathologists (T.K.H. and X.J.J.) used QuPath (https://qupath.github.io) to annotate cancerous regions. All annotations were double-reviewed, and discordant cases were resolved by discussion with an independent, blinded pathologist (W.X.Y.). The captured WSIs were partitioned into non-overlapping 224 × 224 pixel patches. Blank patches and patches without cancerous areas were excluded. Patches were assigned slide-level labels according to the LNM status of the corresponding patient, and patches in cases without LNM were defined as LNM-negative patches.
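The non-overlapping tiling described above can be sketched as a simple coordinate grid. This is an illustration only: the helper name `patch_grid` and the example slide size are hypothetical, and dropping partial tiles at the right and bottom edges is an assumption the source does not confirm.

```python
def patch_grid(width, height, patch=224):
    """Top-left (x, y) pixel coordinates of every full, non-overlapping tile.

    Assumption: tiles that would extend past the image edge are dropped.
    """
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]

# Hypothetical 1000 x 500 pixel region: 4 tiles across, 2 tiles down.
tiles = patch_grid(1000, 500)
print(len(tiles), tiles[0], tiles[-1])  # 8 (0, 0) (672, 224)
```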

Data preprocessing and feature extraction

To balance computational efficiency with tissue representation, a maximum of 1000 patches per WSI was retained. Data augmentation strategies included random horizontal/vertical flipping and 30° rotation to enhance rotational invariance, supplemented by a multi-scale sampling strategy (0.5–1.5× scaling) that randomly selected patches across different WSI pyramid levels to improve scale invariance. For feature extraction, a pretrained ResNet-50 architecture (with final classification layers removed) generated 2048-dimensional feature vectors. These features were subsequently reduced to 512 dimensions via a fully connected layer. To address illumination invariance, LAB color space-based histogram matching was applied for standardization, with additional random brightness/contrast perturbations (±20%) simulating tissue staining variations under diverse exposure conditions. Spatial coordinates were normalized to the [0,1] range, followed by construction of a 10-nearest neighbor graph (k = 10) using the ‘knn_graph’ function, establishing topological connections to model spatial relationships between adjacent tissue regions. The choice of k = 10 was validated empirically through ablation studies (k = 5, 10, 15, 20). Performance peaked at k = 10, which balances local context capture and computational efficiency.
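The coordinate normalization and k-nearest-neighbor graph construction can be sketched in plain Python. This is a simplified stand-in, not the study's implementation: it measures distance on spatial coordinates only (the source does not say which features enter the distance), uses a brute-force search, and the helper names `normalize_coords` and `knn_edges` are hypothetical.

```python
import math

def normalize_coords(coords):
    """Min-max scale (x, y) patch coordinates to the [0, 1] range per axis."""
    xs, ys = [x for x, _ in coords], [y for _, y in coords]

    def scale(v, lo, hi):
        return 0.0 if hi == lo else (v - lo) / (hi - lo)

    return [(scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
            for x, y in coords]

def knn_edges(points, k):
    """Directed edges (neighbor, node) linking each node to its k nearest neighbors."""
    edges = []
    for i, p in enumerate(points):
        # Brute-force distances to every other node; ties break on node index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        edges.extend((j, i) for _, j in dists[:k])
    return edges

# Toy 3 x 3 grid of patch origins; the paper uses k = 10 on up to 1000 patches.
coords = [(x * 224, y * 224) for y in range(3) for x in range(3)]
points = normalize_coords(coords)
edges = knn_edges(points, k=2)
print(len(edges))  # 18: two incoming edges for each of the 9 nodes
```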

Graph neural network architecture

The proposed hierarchical GNN model was trained on graph representations constructed from WSIs. First, each WSI was divided into non-overlapping 224 × 224 pixel patches. Patch-level feature vectors were extracted using a pretrained ResNet-50 backbone (2048 dimensions), followed by a linear compression layer that reduced the features to 512 dimensions. Spatial adjacency among patches was then used to construct a graph, where each patch served as a node and neighboring patches were connected via edges. This hierarchical design was chosen to capture both local tumor microenvironment features and global tissue architecture, which are both critical for predicting LNM. The detailed process of patch division, feature extraction, and graph construction is shown in Table 5.

Table 5 Architecture of the hierarchical graph neural network (GNN)

The resulting graph was processed through two graph convolutional layers (GCNConv) with 512 hidden units, ReLU activation, and dropout. A global mean pooling layer was applied to aggregate node-level information into a slide-level embedding. Finally, a fully connected classification head (512 → 2 units) with softmax activation outputs the predicted probability of LNM. The GCN layers allowed the model to capture spatial patterns within the tumor microenvironment, while global pooling enabled holistic WSI-level prediction based on local features.
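The forward pass described above (two graph convolutions, global mean pooling, a two-class head with softmax) can be sketched without any deep learning library. This is a toy, inference-only illustration under stated assumptions: it uses the symmetric degree normalization of the standard GCN formulation, omits bias terms and dropout, and uses small random weights and dimensions rather than the trained 512-unit model. All function names are hypothetical.

```python
import math
import random

def gcn_layer(h, edges, weight):
    """Graph convolution with self-loops and symmetric degree normalization, then ReLU.

    h: list of node feature vectors; edges: (source, target) pairs;
    weight: list of output-unit weight vectors, each of length len(h[0]).
    """
    n, dim = len(h), len(h[0])
    incoming = [[i] for i in range(n)]            # every node gets a self-loop
    for src, dst in edges:
        incoming[dst].append(src)
    deg = [len(nbrs) for nbrs in incoming]        # in-degree including the self-loop
    out = []
    for i in range(n):
        agg = [sum(h[j][d] / math.sqrt(deg[i] * deg[j]) for j in incoming[i])
               for d in range(dim)]
        out.append([max(0.0, sum(a * w for a, w in zip(agg, unit))) for unit in weight])
    return out

def global_mean_pool(h):
    """Average node embeddings into one slide-level vector."""
    return [sum(row[d] for row in h) / len(h) for d in range(len(h[0]))]

def softmax(z):
    m = max(z)                                    # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, n_out, n_in):
    """Placeholder weights; a real model would load trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, edges, w1, w2, w_head):
    h = gcn_layer(x, edges, w1)
    h = gcn_layer(h, edges, w2)
    pooled = global_mean_pool(h)
    logits = [sum(a * w for a, w in zip(pooled, unit)) for unit in w_head]
    return softmax(logits)                        # [P(LNM-), P(LNM+)]

rng = random.Random(0)
in_dim, hidden = 8, 4                             # toy sizes; the paper uses 512 hidden units
x = [[rng.random() for _ in range(in_dim)] for _ in range(5)]
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
probs = forward(x, edges,
                random_matrix(rng, hidden, in_dim),
                random_matrix(rng, hidden, hidden),
                random_matrix(rng, 2, hidden))
print(probs)
```

Mean pooling gives every retained patch equal weight in the slide-level embedding, so the prediction depends on how widespread a pattern is rather than on a single extreme patch.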

Supervised training protocol

The model was trained using PyTorch Lightning with class-weighted cross-entropy loss to address class imbalance. Optimization was performed via the Adam optimizer (initial learning rate = 1 × 10⁻⁴) paired with a ‘ReduceLROnPlateau’ scheduler (factor = 0.1, patience = 5 epochs). Early stopping (patience = 4000 epochs) monitored validation accuracy to mitigate overfitting, while mixed-precision training (16-bit) on NVIDIA A100 GPUs accelerated training. Early stopping at 4000 epochs was determined by plateau analysis of validation loss (no improvement for 50 epochs), preventing overfitting while ensuring convergence. Class weighting and early stopping were implemented to reduce bias from class imbalance and to prevent overfitting, thereby improving the generalizability of the model. Five-fold cross-validation demonstrated stable performance across partitions (accuracy: 88.7% ± 1.1%, F1-score: 0.85 ± 0.03). Three independent trials with randomized seeds yielded consistent results (accuracy: 89.2% ± 1.3%, F1-score: 0.87 ± 0.02), confirming low variance. Augmentation robustness tests with randomized parameters (rotation, flipping, multi-scale sampling) showed negligible performance degradation (accuracy drop <1.5%), underscoring feature invariance under diverse transformations. This multi-faceted validation framework was intended to ensure statistical reliability and minimize bias ahead of clinical deployment.
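Class-weighted cross-entropy can be illustrated with a small worked example. The source does not state how the class weights were chosen, so the inverse-frequency scheme below is an assumption, as are the helper names; the 88/72 label counts are taken from the surgical cohort purely for illustration.

```python
import math

def class_weights(labels, num_classes=2):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    n = len(labels)
    return [n / (num_classes * labels.count(c)) for c in range(num_classes)]

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the total weight."""
    numerator = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    denominator = sum(weights[y] for y in labels)
    return numerator / denominator

# 88 LNM- (class 0) and 72 LNM+ (class 1) labels.
labels = [0] * 88 + [1] * 72
weights = class_weights(labels)
print([round(w, 3) for w in weights])  # [0.909, 1.111]

# Toy predictions: each row is [P(LNM-), P(LNM+)].
toy_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
toy_labels = [0, 1, 1]
print(round(weighted_cross_entropy(toy_probs, toy_labels, weights), 4))
```

The minority class receives the larger weight, so an error on an LNM+ sample contributes more to the loss than the same error on an LNM– sample.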

Evaluation of the trained model

Model performance was evaluated using the area under the curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The optimal classification threshold was determined by maximizing Youden’s J statistic. Results were compared against the actual pathological condition. Threshold optimization prioritized NPV (maximizing Youden’s J with NPV > 95%), as false negatives (missed LNM) may lead to under-treatment, whereas false positives (unnecessary surgery) were deemed clinically tolerable.
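The constrained threshold search described above (maximize Youden's J among cutoffs whose NPV exceeds 95%) can be sketched as follows. The helper name `select_threshold`, the exhaustive scan over observed scores, and the toy data are assumptions for illustration; the sketch also assumes both classes are present.

```python
def select_threshold(scores, labels, min_npv=0.95):
    """Return (threshold, youden_j, npv) maximizing J subject to NPV > min_npv.

    A sample is called positive when score >= threshold. Returns None if no
    candidate threshold satisfies the NPV constraint.
    """
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if tn + fn == 0:
            continue                      # NPV undefined: nothing predicted negative
        npv = tn / (tn + fn)
        if npv <= min_npv:
            continue                      # violates the NPV constraint
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if best is None or j > best[1]:
            best = (t, j, npv)
    return best

# Toy scores and labels (1 = LNM+), for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(select_threshold(scores, labels))  # (0.4, 0.75, 1.0)
```

In this toy data, raising the cutoff to 0.6 would miss the positive sample scored 0.4 and drop NPV to 0.75, so the constraint keeps the cutoff at 0.4.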

Model interpretability and attention heatmap generation

To improve model interpretability, we computed node-level attention scores from the hierarchical GNN and generated attention heatmaps overlaid on the original WSIs. Each node corresponded to a histopathological patch, with color intensity reflecting its relative contribution to the final prediction. These heatmaps consistently highlighted histological regions of interest, such as invasive tumor fronts and areas rich in lymphoid tissue with clustered vessels, features that are known to be associated with LNM risk. All heatmaps were independently reviewed by two gastrointestinal pathologists (both with >10 years of diagnostic experience), who confirmed that the high-attention regions corresponded closely with established pathological risk areas. Representative examples are shown in Fig. 5.

**Fig. 5: Representative attention heatmaps generated from the hierarchical GNN model.**

Statistical analysis

The model was developed using Python 3.8 (Python Software Foundation) with PyTorch 1.12.0 and PyTorch Geometric 2.2.0 libraries. The architecture integrated a GNN with CNNs. Statistical analyses were conducted using SPSS 26.0 (IBM, Armonk, NY). Continuous variables were assessed for normality via the Shapiro–Wilk test and homogeneity of variance with Levene’s test. Normally distributed variables were compared using Student’s t-test, while non-parametric data were analyzed with the Mann–Whitney U-test. Categorical variables were evaluated by χ²-test or Fisher’s exact test when expected cell counts fell below 5. The discriminative performance of the predictive model was quantified by the area under the receiver operating characteristic curve (AUC), with 95% confidence intervals (CI) calculated through bootstrap resampling (1000 iterations) using the percentile method. For the primary endpoint of predicting LNM in patients with T1 ESCC undergoing ESD, model performance was evaluated using standard diagnostic metrics, including sensitivity, specificity, accuracy, PPV, NPV, F1-score, and ROC AUC. The results were summarized both in tables and in figures. All P-values were two-sided, and P < 0.05 was considered statistically significant. Predictions were saved to a file detailing filenames, LNM probabilities, and classifications for clinical review.
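The percentile bootstrap for the AUC confidence interval can be sketched with the standard library alone. This is an illustration, not the study's analysis code: the helper names, the seed, the rule of redrawing resamples that contain only one class, and the toy data are all assumptions.

```python
import random

def auc(scores, labels):
    """Mann-Whitney estimate of AUC: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus percentile-method CI from n_boot resamples."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                          # AUC needs both classes present
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)                 # 25 for 1000 resamples at 95%
    return auc(scores, labels), stats[lo_idx], stats[n_boot - 1 - lo_idx]

# Toy scores and labels (1 = LNM+), for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
point, lower, upper = bootstrap_auc_ci(scores, labels)
print(f"AUC = {point:.4f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

With only eight samples the interval is wide; the same procedure applied to hundreds of slides yields much tighter bounds. Resampling slides independently also ignores the intra-patient dependency noted in the limitations, so a patient-level resample would be a more conservative variant.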

Data availability

To protect patient privacy, the patient data cannot be made publicly available, but they can be obtained from the corresponding author on reasonable request approved by the institutional review board of all enrolled centers.

Code availability

The underlying code for this study is available on GitHub and can be accessed via this link: https://github.com/dingy97/WSI-main.

References

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).
PubMed Google Scholar
Yang, H. et al. Oesophageal cancer. Lancet 404, 1991–2005 (2024).
Article PubMed Google Scholar
Fleischmann, C., Probst, A. & Messmann, H. Indications for endoscopic treatment of adenocarcinoma and squamous cell cancer of the esophagus. Ann. Esophagus 6 (2023).
Fuchinoue, K. et al. Immunohistochemical analysis of tumor budding as predictor of lymph node metastasis from superficial esophageal squamous cell carcinoma. Esophagus 17, 168–174 (2020).
Article PubMed Google Scholar
Nagtegaal, I. D. et al. The 2019 WHO classification of tumours of the digestive system. Histopathology 76, 182–188 (2020).
Article PubMed Google Scholar
Weaver, D. L. et al. Pathologic analysis of sentinel and nonsentinel lymph nodes in breast carcinoma: a multicenter study. Cancer 88, 1099–1107 (2000).
Article PubMed Google Scholar
Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
Article PubMed PubMed Central Google Scholar
Komura, D. & Ishikawa, S. Machine learning methods for histopathological image analysis. Comput. Struct. Biotechnol. J. 16, 34–42 (2018).
Article PubMed PubMed Central Google Scholar
Ichimasa, K. et al. Efficacy of a whole slide image-based prediction model for lymph node metastasis in T1 colorectal cancer: a systematic review. J. Gastroenterol. Hepatol. 39, 2555–2560 (2024).
Article PubMed PubMed Central Google Scholar
Ehteshami Bejnordi, B. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017).
Article PubMed PubMed Central Google Scholar
Mi, S. et al. Development and validation of a machine-learning model to predict lymph node metastasis of intrahepatic cholangiocarcinoma: a retrospective cohort study. Biosci. Trends 18, 535–544 (2024).
Article PubMed Google Scholar
Ueki, H. et al. Utility of machine learning models to predict lymph node metastasis of Japanese localized prostate cancer. Cancers 16, 4073 (2024).
Article PubMed PubMed Central Google Scholar
Rani, V. et al. Self-supervised learning: a succinct review. Arch. Comput. Methods Eng. 30, 2761–2775 (2023).
Article PubMed PubMed Central Google Scholar
Cheplygina, V., de Bruijne, M. & Pluim, J. P. W. Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med. Image Anal. 54, 280–296 (2019).
Qi, G. J. & Luo, J. B. Small data challenges in big data era: a survey of recent progress on unsupervised and semi-supervised methods. IEEE Trans. Pattern Anal. Mach. Intell. 44, 2168–2187 (2022).
Ericsson, L. et al. Self-supervised representation learning: introduction, advances, and challenges. IEEE Signal Process. Mag. 39, 42–62 (2022).
Mine, S. et al. Japanese classification of esophageal cancer, 12th edition: part I. Esophagus 21, 179–215 (2024).
Rice, T. W., Patil, D. T. & Blackstone, E. H. 8th edition AJCC/UICC staging of cancers of the esophagus and esophagogastric junction: application to clinical practice. Ann. Cardiothorac. Surg. 6, 119–130 (2017).
Altorki, N. K. et al. Total number of resected lymph nodes predicts survival in esophageal cancer. Ann. Surg. 248, 221–226 (2008).
Ajani, J. A. et al. Esophageal and esophagogastric junction cancers, version 2.2023, NCCN clinical practice guidelines in oncology. J. Natl Compr. Cancer Netw. 21, 393–422 (2023).
Al-Haddad, M. A. et al. American Society for Gastrointestinal Endoscopy guideline on endoscopic submucosal dissection for the management of early esophageal and gastric cancers: methodology and review of evidence. Gastrointest. Endosc. 98, 285–305.e38 (2023).
van Hagen, P. et al. Preoperative chemoradiotherapy for esophageal or junctional cancer. N. Engl. J. Med. 366, 2074–2084 (2012).
Hou, L. et al. Patch-based convolutional neural network for whole slide tissue image classification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2424–2433 (2016).
Shafique, A. et al. Automatic multi-stain registration of whole slide images in histopathology. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society 3622–3625 (2021).
Wang, X. Y. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Wang, X. et al. Weakly supervised deep learning for whole slide lung cancer image analysis. IEEE Trans. Cybern. 50, 3950–3962 (2020).
Zhou, H. J. et al. ccRCC metastasis prediction via exploring high-order correlations on multiple WSIs. In 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Marrakesh, Morocco, 145–154 (2024).
Li, J. W. et al. Dynamic graph representation with knowledge-aware attention for histopathology whole slide image analysis. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, 11323–11332 (2024).
Ha, R. K. et al. Histopathologic risk factors for lymph node metastasis in patients with T1 colorectal cancer. Ann. Surg. Treat. Res. 93, 266–271 (2017).
Yasue, C. et al. Pathological risk factors and predictive endoscopic factors for lymph node metastasis of T1 colorectal cancer: a single-center study of 846 lesions. J. Gastroenterol. 54, 708–717 (2019).
Asano, M. Endoscopic submucosal dissection and surgical treatment for gastrointestinal cancer. World J. Gastrointest. Endosc. 4, 438–447 (2012).
Lee, H. Management strategy of non-curative ESD in gastric cancer: curative criteria, and the critical building block for determining beyond it. J. Gastric Cancer 25, 210–227 (2025).
Chang, C. et al. Monitoring for recurrence after esophagectomy. Ann. Thorac. Surg. 114, 211–217 (2022).
Tanaka, T. et al. Comparison of long-term outcomes between esophagectomy and chemoradiotherapy after endoscopic resection of submucosal esophageal squamous cell carcinoma. Dis. Esophagus 32, doz023 (2019).
Amin, M. B. et al. The eighth edition AJCC cancer staging manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J. Clin. 67, 93–99 (2017).

Acknowledgements

This study was supported by grants from the Zhongda Hospital Affiliated to Southeast University, Jiangsu Province High-Level Hospital Pairing Assistance Construction Funds (zdlyg12); the National Natural Science Foundation of China (82404088); the Jiangsu Provincial Basic Research Special Fund (Natural Science Foundation) Youth Fund (BK20241683); and the China Postdoctoral Science Foundation (2023M730587).

Author information

These authors contributed equally: Li-Hua Ren, Yuan Ding, Yue-Xin Zhang, Ke-Han Teng.

Authors and Affiliations

Department of Gastroenterology, Zhongda Hospital, Southeast University, Nanjing, China
Li-Hua Ren, Ye Zhu & Rui-Hua Shi
School of Medicine, Southeast University, Nanjing, China
Yuan Ding, Ke-Han Teng, Wan-Yue Zhang & Bin Wang
School of Automation, Nanjing University of Information Science & Technology, Nanjing, China
Yue-Xin Zhang & Kai Hu
Department of Pathology, Zhongda Hospital, Southeast University, Nanjing, China
Ke-Han Teng, Jia-Jia Xu & Xiao-Ying Wei
Department of Gastroenterology, The First People’s Hospital of Lianyungang, Lianyungang, China
Lu Wang
Department of Gastroenterology, Affiliated Changshu Hospital of Nantong University, Suzhou, China
Bin Wang

Contributions

L.R., Y.D., Y.Z., and K.T. contributed equally to the article. The four authors drafted the manuscript together. L.W., W.Z., Y.Z., and B.W. contributed to the data collection. J.X. and X.W. contributed to the pathological interpretation. L.R., K.H., and R.S. conceived and designed the project and are responsible for the overall content. All authors contributed significantly and agree with the content of the manuscript.

Corresponding authors

Correspondence to Li-Hua Ren, Kai Hu or Rui-Hua Shi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

supplementary information(1)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ren, LH., Ding, Y., Zhang, YX. et al. Artificial intelligence-driven prediction of lymph node metastasis in T1 esophageal squamous cell carcinoma using whole slide images. npj Precis. Onc. 9, 403 (2025). https://doi.org/10.1038/s41698-025-01186-z

Received: 22 May 2025
Accepted: 10 November 2025
Published: 23 November 2025
Version of record: 29 December 2025
DOI: https://doi.org/10.1038/s41698-025-01186-z
