Whole slide image based deep learning refines prognosis and therapeutic response evaluation in lung adenocarcinoma

Chen, Tao; Wen, Jialiang; Shen, Xinchen; Shen, Jiaqi; Deng, Jiajun; Zhao, Mengmeng; Xu, Long; Wu, Chunyan; Yu, Bentong; Yang, Minglei; Ma, Minjie; Wu, Junqi; She, Yunlang; Zhong, Yifan; Hou, Likun; Jin, Yanrui; Chen, Chang

doi:10.1038/s41746-025-01470-z

Article
Open access
Published: 29 January 2025

Tao Chen¹^na1,
Jialiang Wen¹^na1,
Xinchen Shen¹^na1,
Jiaqi Shen²,
Jiajun Deng¹,
Mengmeng Zhao¹,
Long Xu¹,
Chunyan Wu³,
Bentong Yu⁴,
Minglei Yang⁵,
Minjie Ma⁶,
Junqi Wu¹,
Yunlang She ORCID: orcid.org/0000-0003-2841-1250¹,
Yifan Zhong¹,
Likun Hou³,
Yanrui Jin ORCID: orcid.org/0000-0001-9489-5447^7,8 &
…
Chang Chen ORCID: orcid.org/0000-0002-9981-3110¹

npj Digital Medicine volume 8, Article number: 69 (2025)

Abstract

Existing prognostic models are useful for estimating the prognosis of lung adenocarcinoma patients, but there remains room for improvement. In the current study, we developed a deep learning model based on histopathological images to predict the recurrence risk of lung adenocarcinoma patients. The performance of the model was then evaluated in independent multicenter cohorts. The model-defined high- and low-risk groups successfully stratified the prognosis of the entire cohort. Moreover, multivariable Cox analysis identified the model-defined risk groups as an independent predictor of disease-free survival. Importantly, combining TNM stage with the established model helped to distinguish subgroups of patients with high-risk stage II and stage III disease who are highly likely to benefit from adjuvant chemotherapy. Overall, our study highlights the significant value of the constructed model as a complementary biomarker for survival stratification and adjuvant therapy selection for lung adenocarcinoma patients after resection.

Introduction

Estimating prognosis is essential for adjuvant treatment decision making and follow-up strategy selection for lung adenocarcinoma patients after surgery¹. Some pathological factors, such as visceral pleural invasion (VPI)², spread through the air space (STAS)³, and lymphovascular invasion (LVI)⁴, have been reported to be associated with patient outcomes. Compared to the factors mentioned above, the International Association for the Study of Lung Cancer (IASLC)-proposed grading system has been shown to be more efficient and robust for patient stratification^5,6. However, although these factors may affect the prognosis of patients with stage I tumors, their effect on those with stage II or III tumors requires further investigation. Furthermore, the TNM staging system can be used to categorize patients into several groups with distinct survival outcomes⁷. Nevertheless, patient outcomes often vary even among those at a specific TNM stage. Overall, there remains room for improvement in precise risk stratification to improve patient management and disease outcomes.

Recent advances in artificial intelligence (AI) have enabled the use of quantitative data derived from whole slide images (WSIs) to predict patient outcomes directly^8,9,10. Histopathology images contain prognostically important information such as tumor-infiltrating lymphocytes^11,12, and proportions of tissue types¹³, each of which can be quantified by specific digital pathology approaches. The hidden information in routine haematoxylin and eosin (H&E)-stained images may help to stratify prognosis from a different dimension, and may serve as a complementary biomarker to the current clinical variables.

In the present study, we developed a WSI-based deep learning model to predict the recurrence risk of resected lung adenocarcinoma without any annotations from pathologists. We then assessed the ability of our model to stratify patients according to prognosis and investigated whether it could help refine the populations of patients likely to benefit from adjuvant chemotherapy. Finally, we obtained WSI heatmaps to explore the pathological features that may contribute to the predictive value of the model, and we also explored its underlying biological basis.

Results

Patient characteristics

With the constructed model, a WSI-based score was calculated for each patient in the two validation sets (Fig. 1). Patients in both validation sets were divided into low- and high-risk groups according to the median WSI-based score. In validation cohort 1, more patients in the high-risk group had STAS (36.3% vs. 29.3%, p = 0.048), IASLC grade III tumors (61.5% vs. 45.5%, p < 0.001), and adjuvant chemotherapy (39.7% vs. 30.4%, p = 0.010). In addition, more patients in the high-risk group had VPI (26.6% vs. 21.3%, p = 0.098), LVI (15.9% vs. 12.2%, p = 0.163), and TNM stage III tumors (17.3% vs. 12.8%, p = 0.234), although these differences were not statistically significant (Table 1). In validation cohort 2, the proportions of patients with VPI (35.6% vs. 23.9%, p = 0.089), STAS (39.1% vs. 34.1%, p = 0.493), LVI (21.8% vs. 12.5%, p = 0.101), IASLC grade III tumors (57.5% vs. 53.4%, p = 0.826), TNM stage III tumors (20.7% vs. 14.8%, p = 0.382), and adjuvant chemotherapy (34.5% vs. 29.5%, p = 0.484) were higher in the high-risk group, but none of the differences were statistically significant (Table 1).

**Fig. 1: Schematic illustration of the overall study design.**

Table 1 Baseline characteristics of patients in validation cohort 1 and validation cohort 2

Survival analysis of DFS

In validation cohort 1, the model successfully stratified patients into high- and low-risk groups according to prognosis in the entire cohort (hazard ratio [HR] 1.95, 95% confidence interval [CI] 1.46–2.62, p < 0.001) and in most of the prespecified subgroups (Fig. 2a). Similar results were found in the analysis of validation cohort 2 (Fig. 2b).

**Fig. 2: Analysis of DFS in specified subgroups.**

The incremental prognostic value of the WSI-based score

Univariable analysis of validation cohort 1 revealed that patient outcomes were strongly associated with pathological factors, including VPI (p < 0.001), STAS (p < 0.001), LVI status (p < 0.001), IASLC grade (p < 0.001), and TNM stage (p < 0.001), as well as with the WSI-based score (p < 0.001) (Table 2). Before the WSI-based score was incorporated into the multivariable Cox model, the IASLC grade (grade II vs. grade I, HR 13.17, 95% CI 3.16–54.96, p < 0.001; grade III vs. grade I, HR 26.99, 95% CI 6.50–112.13, p < 0.001) and TNM stage (stage II vs. stage I, HR 1.59, 95% CI 0.98–2.57, p = 0.061; stage III vs. stage I, HR 4.04, 95% CI 2.83–5.76, p < 0.001) were identified as independent predictors of DFS. After the WSI-based score was incorporated, the IASLC grade (grade II vs. grade I, HR 13.54, 95% CI 3.24–56.52, p < 0.001; grade III vs. grade I, HR 25.69, 95% CI 6.19–106.63, p < 0.001), TNM stage (stage II vs. stage I, HR 1.67, 95% CI 1.03–2.71, p = 0.037; stage III vs. stage I, HR 4.22, 95% CI 2.95–6.05, p < 0.001), and the constructed model (HR 1.82, 95% CI 1.35–2.44, p < 0.001) were all independent predictors of DFS (Table 2). Similar results were found in the analysis of validation cohort 2 (Table 3).

Table 2 Multivariable Cox analysis of disease-free survival in validation cohort 1

Table 3 Multivariable Cox analysis of disease-free survival in validation cohort 2

We then used the C-index to compare the performance of each variable for predicting DFS. For variables significantly associated with DFS according to univariable Cox regression analysis, the WSI-based score did not outperform several pathological factors in either validation cohort 1 (C-index _{[WSI-based score]} = 0.586; C-index _{[IASLC grade]} = 0.674; C-index _{[TNM stage]} = 0.665) or validation cohort 2 (C-index _{[WSI-based score]} = 0.643; C-index _{[VPI]} = 0.654; C-index _{[STAS]} = 0.651; C-index _{[IASLC grade]} = 0.718; C-index _{[TNM stage]} = 0.699) (Supplementary Table 1; Supplementary Fig. 1). Regarding the 3-year and 5-year AUCs for predicting DFS, our model did not show an advantage over some other pathological factors (Supplementary Table 1; Supplementary Fig. 1).

For variables independently predicting DFS in the multivariable Cox regression analysis, we compared the predictive performance of their combinations. The results showed that the combination of IASLC grade, TNM stage and WSI-based score (C-index _{[WSI-based score & IASLC grade & TNM stage]} = 0.753) outperformed any combination of two variables in validation cohort 1 (C-index _{[IASLC grade & TNM stage]} = 0.737, p < 0.001; C-index _{[WSI-based score & IASLC grade]} = 0.708, p < 0.001; C-index _{[WSI-based score & TNM stage]} = 0.706, p < 0.001; Fig. 3a; Table 4). In validation cohort 2, the combination of three variables (C-index _{[WSI-based score & IASLC grade & TNM stage]} = 0.811) also outperformed any combination of two variables (C-index _{[IASLC grade & TNM stage]} = 0.777, p < 0.001; C-index _{[WSI-based score & IASLC grade]} = 0.786, p < 0.001; C-index _{[WSI-based score & TNM stage]} = 0.763, p < 0.001; Fig. 3d; Table 4). The combined model also showed advantages with respect to the 3-year and 5-year AUCs for predicting DFS (Fig. 3b, c, e, f; Table 4). These results collectively demonstrated the added value of the constructed model to the existing clinical models.

**Fig. 3: C-index and time-dependent ROC curves at 3 and 5 years for the combination of variables independently predicting DFS according to multivariable Cox regression analysis.**

Table 4 Performance of the constructed model combined with existing pathological variables according to multivariable analysis

The WSI-based score refines patient selection for adjuvant chemotherapy

We then investigated whether our model could help refine the subgroups of patients who could benefit most from adjuvant chemotherapy. In validation cohort 1, patients with stage IB (p = 0.551), stage II (p = 0.116), or stage III disease (p = 0.068) did not benefit significantly from adjuvant chemotherapy (Supplementary Fig. 2). Further analysis incorporating the constructed model revealed no survival benefit in the low-risk groups of patients with stage IB (p = 0.974, Fig. 4a), stage II (p = 0.800, Fig. 4b), or stage III (p = 0.464, Fig. 4c) disease. In the high-risk groups, a survival advantage was observed for patients with stage III disease (p = 0.030, Fig. 4f) and a potential advantage for patients with stage II disease (p = 0.077, Fig. 4e), but no survival benefit was observed for patients with stage IB disease (p = 0.367, Fig. 4d). Similar results were obtained for validation cohort 2 (Supplementary Fig. 2; Fig. 4g–l).

**Fig. 4: The combination of TNM stage and the constructed model helps to identify patient subgroups who may particularly benefit from postoperative chemotherapy.**

Interpretation of the deep learning model

To better understand the pathological mechanism underlying the predictions, we used heatmap visualization to explore the pathomorphological features captured by our model. As illustrated in Fig. 1d, micropapillary components were identified in the ‘high-risk’ region of a patient with a stage IA tumor. Moreover, acinar patterns with tertiary lymphoid structures were identified in the ‘low-risk’ region of a patient with a stage III tumor. This reflects the substantial associations of the constructed model with the current pathological factors and its ability to serve as a complementary biomarker.

Patients in different risk groups presented significant heterogeneity in gene expression patterns (Fig. 5a). In GO analyses (Fig. 5b), the model-defined high-risk group was associated with pathways representing tumor metabolism and proliferation, such as cellular metabolic process, protein metabolic process, cellular component organization, and cellular component organization or biogenesis. Furthermore, as shown in Fig. 5c, tumors in the two groups were characterized by different immune infiltration patterns. According to the ssGSEA results (Fig. 5d), the model-defined high-risk group showed significantly less infiltration of activated CD4 T cells, activated dendritic cells, central memory CD4 T cells, central memory CD8 T cells, effector memory CD4 T cells, immature B cells, immature dendritic cells, macrophages, MDSCs, natural killer T cells, and T follicular helper cells.

**Fig. 5: Biological basis of the deep learning model.**

Discussion

Recently, the development of digital pathology has provided important information for precise risk stratification and treatment planning. However, predicting prognosis (time-to-event) is considered a more sophisticated problem than a conventional regression task, mainly because some patients have not experienced the outcome of interest (death, recurrence, etc.) by the end of follow-up. For this reason, the number of studies using WSIs for predicting prognosis is relatively small^{9,10,14,15,16,17}. From a technical perspective, these studies have mostly used convolutional neural networks^{9,10,15,16,17} to extract features automatically. However, WSIs have many pixels that carry a large amount of invalid information, which seriously affects the accuracy and efficiency of convolutional neural networks. Lee et al.¹⁴ used an aggregation algorithm to obtain a superpatch graph before applying a GNN, which greatly affects the working efficiency of the prediction system. Further, Mobadersany et al.¹⁶ used regions of interest from WSIs as input, which increased the workload of clinicians. Compared with the methods mentioned above, the proposed model converts WSIs into graph-based data and introduces an attention mechanism to assign different weights to different nodes, which effectively reduces the computational complexity and improves the prediction accuracy. Moreover, the proposed model was established without the assistance of pathologists, overcoming, to a certain extent, the shortcoming that deep learning algorithms rely on manual annotations and the expertise of pathologists^18,19, which may help to improve the generalizability of the model.

On the one hand, although the predictive performance for prognosis of our model did not outperform some of the pathological factors, it remained statistically significant in the multivariable analysis when it was combined with TNM stage and IASLC grade, indicating the added value of the model and its ability to serve as a complementary biomarker for survival stratification. On the other hand, the presence of high-risk pathological factors, including VPI, STAS, LVI, and high-grade tumors, was more common in the model-defined high-risk groups, suggesting substantial associations between the constructed model and the current well-defined pathological factors.

A large meta-analysis revealed that adjuvant chemotherapy could yield an overall survival (OS) benefit of 5% at 5 years; however, statistical significance was not reached (HR 0.87, p = 0.08)²⁰. Following this study, randomized trials evaluating the efficacy of adjuvant chemotherapy were conducted for a decade. Some of the large trials successfully demonstrated an OS benefit^21,22, while others failed²³. Afterwards, the Lung Adjuvant Cisplatin Evaluation (LACE) study further confirmed the effect of adjuvant chemotherapy on both OS and DFS²⁴. Nevertheless, we must be clear that the overall benefit from adjuvant chemotherapy is limited: stage II-III patients may benefit most, stage IB patients may show only a trend toward benefit, and stage IA patients may experience a deleterious effect. We therefore need to identify subgroups of patients who may particularly benefit from adjuvant chemotherapy. In the current study, no significant survival benefit from adjuvant chemotherapy was observed across the overall population of patients with stage IB, stage II, or stage III disease. However, combining TNM stage with our constructed model helped to distinguish a survival advantage for high-risk stage III patients and a potential survival advantage for high-risk stage II patients (statistical significance was not reached for this group, perhaps because of the relatively small sample size). According to our results, we advocate adjuvant chemotherapy for high-risk stage II-III patients and the avoidance of unnecessary chemotherapy for other patients.

The results demonstrated that our established model exhibits significant biological relevance. The model outputs are likely associated with genes and molecular pathways that promote tumor proliferation, and high-risk patients show significantly lower levels of immune cell infiltration. This partly explains the model’s predictive capability for prognosis and adjuvant chemotherapy decision-making in lung cancer patients.

Despite the promising results obtained in the present study, several limitations should be acknowledged. First, the retrospective nature of the study may limit the statistical power and hinder the generalization of the results to other centers and regions, especially the results regarding adjuvant therapy; prospective validation with a larger sample size is warranted. Second, although our model could be used as a complement to the existing prognostic models of lung adenocarcinoma, there remains much room for improvement in its ability to predict prognosis. Multiomics data integrating radiology, pathology, molecular, and other modalities are needed to establish more efficient and robust models in the future.

In summary, our constructed model can predict the recurrence risk of resected lung adenocarcinoma without the need for annotations from pathologists, and can thereby complement the current prognostic models. Moreover, the model-defined high- and low-risk groups may help to guide adjuvant therapy strategies in clinical practice.

Methods

Participants and study design

This multicenter study was approved by the Ethics Committee and Institutional Review Board of Shanghai Pulmonary Hospital (No. K23-292), the First Affiliated Hospital of Nanchang University, Ningbo Hwamei Hospital, the First Affiliated Hospital of Lanzhou University, and followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement²⁵ (Supplementary Note 1). The informed consent was waived as this was a retrospective study.

We used 3712 H&E-stained, formalin-fixed and paraffin-embedded (FFPE) tumor tissue sections from 1705 patients with surgically resected lung adenocarcinoma. Patients with stage I-III disease and available clinicopathological data and follow-up information were included. Patients with stage IV disease, a history of neoadjuvant therapy, or no available follow-up information or tumor tissue sections were excluded. For each patient in the training and validation sets, digital WSIs were scanned from the corresponding H&E-stained tumor tissue sections.

To train the model, we used a dataset of 1889 sections from 825 patients who underwent surgery at Shanghai Pulmonary Hospital between January 2012 and December 2012 (Supplementary Table 2). We applied deep learning techniques to develop a histopathological model, the patient-level WSI-based score, to predict the disease-free survival (DFS) of patients with resected lung adenocarcinoma. We then evaluated the capability of our model for survival stratification and investigated whether it could help refine the populations of patients likely to benefit from adjuvant chemotherapy in two separate validation sets. Finally, WSI heatmaps were obtained to explore the pathological features underlying the predictions (see Fig. 1 for the study design). In addition, the underlying biological basis of the model was also explored to enhance the interpretability of the model. The validation cohort 1 included 1516 sections from 705 patients who underwent surgery at Shanghai Pulmonary Hospital between January 2015 and June 2015. The validation cohort 2 included 307 sections from 175 patients between January 2015 and December 2015 from three departments of thoracic surgery: the First Affiliated Hospital of Nanchang University, Ningbo Hwamei Hospital, and the First Affiliated Hospital of Lanzhou University.

Clinical data, including age, sex, smoking history, tumor location, surgery type, and TNM stage, were available for both the training and validation sets. To compare the performance of the constructed model in prognosis prediction with that of the current clinical models, the VPI, STAS, and LVI status and IASLC tumor grade²⁶ were re-evaluated by two of our experienced pathologists (C.W., L.H.) for patients in the validation sets.

Four 21-day cycles of intravenous chemotherapy with cisplatin 75 mg/m² or carboplatin AUC 5 on day 1 plus pemetrexed 500 mg/m² on day 1 were administered after thorough evaluation of the patients’ conditions and discussion among a group of surgeons and oncologists at our centers.

WSI-based score for recurrence risk prediction

The patient-level survival prediction model in this paper is a multiple-classification model based on variable-length input. Since the number of WSIs obtained for each patient varied, and the effective area of different WSIs also varied considerably, we needed a model that can handle inputs of variable length. In addition, since WSIs contain hundreds of millions of pixels, efficient compression of the input data was also considered crucial. Thus, this paper presents a graph-attention-based multiple-instance neural network (GAMINN) for processing a variable number of WSIs for survival prediction. The complete analysis system contains the following modules, which are detailed below.

Data preprocessing, tissue segmentation and feature compression

Because each WSI contains a large number of pixels, it is difficult for a deep learning model to process the WSI directly and obtain good prediction results. Moreover, each WSI contains a large amount of invalid interference information, which not only degrades the subsequent analysis but also consumes computing resources. Therefore, in this paper, the CLAM model²⁷ was used to classify the tissue regions of pathological images; it can efficiently and accurately classify the regions with high diagnostic value at a low computational burden. Figure 1a shows the tissue profile extracted by the CLAM model, which was used to segment the tissue region effectively and eliminate interference from the invalid region. Thereafter, we partitioned the extracted regions into patches, each with a size of 256 × 256 pixels. We then used the ResNet50 model pretrained on the ImageNet2012 dataset to process the extracted patches and extract the morphological features of each patch, where the morphological feature dimension of each patch was 1024.
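As a rough illustration of the tiling step, the sketch below lists the top-left corners of non-overlapping 256 × 256 patches inside a rectangular region. The function name `tile_origins` and the rectangular-region simplification are assumptions made for this example only; the study relied on the CLAM pipeline for segmentation and patching, and that pipeline is not reproduced here.

```python
def tile_origins(width, height, patch=256):
    """Top-left corners of non-overlapping patch x patch tiles that fit
    entirely inside a width x height region (hypothetical helper)."""
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


# A hypothetical 1000 x 600 pixel tissue region: partial tiles at the
# right and bottom edges are dropped.
origins = tile_origins(1000, 600)
```

Each origin would then be passed to the feature extractor, so that a WSI is reduced to one 1024-dimensional vector per retained patch.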

WSI graph construction

For each patch, we saved its position coordinates in the pathological image from the tissue segmentation and built an adjacency matrix A_j using fast approximate KNN (k = 8). The adjacency matrix A_j modeled a 3 × 3 image receptive field in the pathological image. Finally, we combined the feature matrix X_j and adjacency matrix A_j to construct the subgraph G_j = (X_j, A_j), and the subgraphs of all WSIs from the same patient were collected as the set G = {G_j}, with j indexing that patient's WSIs. Figure 1a shows the process of WSI graph construction.
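To make the link between k = 8 and a 3 × 3 receptive field concrete, the following sketch builds a directed nearest-neighbour adjacency matrix from patch coordinates. It uses an exact brute-force search in place of the fast approximate KNN mentioned in the text, and the function name `knn_adjacency` and the 5 × 5 toy grid are assumptions for illustration.

```python
import math


def knn_adjacency(coords, k=8):
    """Directed 0/1 adjacency: row i marks the k patches nearest to patch i.
    Exact brute-force search, used here instead of an approximate KNN."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        # Rank every other patch by Euclidean distance to patch i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi),
        )
        for j in others[:k]:
            adj[i][j] = 1
    return adj


# Hypothetical 5 x 5 grid of patch origins spaced 256 pixels apart.
coords = [(x * 256, y * 256) for y in range(5) for x in range(5)]
adj = knn_adjacency(coords, k=8)

# Neighbours of the central patch (index 12).
centre_neighbours = {j for j, flag in enumerate(adj[12]) if flag}
```

For an interior patch on a regular grid, the eight nearest patches are exactly the surrounding ring, which is why k = 8 corresponds to a 3 × 3 window. Patches on the tissue border reach further into the slide to fill their eight slots.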

Feature process module

We combined a graph convolutional neural network and a self-attention mechanism to process the feature input of the graph structure, effectively grasp the implicit relationships between slices, and obtain an effective representation for prognostic risk assessment. Moreover, according to the different numbers of patient-level input WSIs, a multi-instance pooling method was designed to effectively obtain the features of different WSIs of the same patient, thereby improving the final prognosis prediction performance.

Graph-attention-based network

To better handle the graph-structured input data, we used a graph neural network to extract features automatically. Similar to convolutional neural networks, graph convolutional neural networks (GCNs) have powerful feature learning capabilities, in which the convolution at a given node can be viewed as a weighted sum over the neighbors of that node. However, the GCN treats all neighboring nodes equally during convolution and cannot assign different weights according to the importance of the nodes. In practice, for a graph with N nodes, different adjacent nodes should contribute differently to the target node. To better distribute weights among different nodes, we used an attention mechanism to normalize the correlations calculated between the target node and all its neighbors.

$$a_{i,j}=\frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[WX_{i}\parallel WX_{j}\right]\right)\right)}{\sum_{k\in N_{i}}\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[WX_{i}\parallel WX_{k}\right]\right)\right)}$$

(1)

Here, || is the concatenation operation, W is the linear transformation matrix, a is a learnable parameter vector, N_i is the set of neighbors of node i, and a_{i, j} is the degree of connection of node j to node i. By combining the GCN with the attention mechanism, we constructed a graph-attention-based (GAT) layer for subsequent analysis.
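Equation (1) can be traced by hand on a tiny graph. The sketch below is a plain-Python illustration, not the authors' PyTorch implementation; the helper names, the toy values of X, W and a, and the LeakyReLU negative slope of 0.2 are all assumptions chosen for the example.

```python
import math


def leaky_relu(x, slope=0.2):
    # The slope value is an assumption; the text does not specify it.
    return x if x > 0 else slope * x


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def attention_weights(i, neighbours, X, W, a):
    """Softmax-normalised attention of node i over its neighbours, Eq. (1)."""
    h_i = matvec(W, X[i])
    scores = []
    for j in neighbours:
        h_j = matvec(W, X[j])
        concat = h_i + h_j  # list concatenation plays the role of [W X_i || W X_j]
        scores.append(leaky_relu(sum(p * q for p, q in zip(a, concat))))
    top = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: three nodes with 2-dimensional features, identity transform.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.25]
weights = attention_weights(0, [1, 2], X, W, a)
```

Here the raw scores for neighbours 1 and 2 are 0.75 and 1.75, so the second neighbour receives the larger share of the attention, and the weights sum to one by construction.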

Learning global features

We built an end-to-end differentiable function F_GAT, using a GAT layer to mine the node features of each neighbor in the space. To further learn the global morphological features of pathological images and avoid vanishing gradients in the network, we drew on the idea of residual learning and used F_GAT^l as a residual map, which allows multiple F_GAT^l layers to be stacked, with the output of each F_GAT^l added to its input.

$${G}^{l+1}={F}_{GAT}^{l}({G}^{l};{\Phi }^{l},{\rho }^{l},{\zeta }^{l})+{G}^{l}$$

(2)

where Φ^l is a message construction function that calculates the association characteristics between node v and its neighbor node u, ρ^l is an aggregation function that aggregates all the features passed to v into the message m_v^l, and ζ^l is an update function that combines the existing node features at node v with the aggregated message to give the updated features X_v^{l+1}.

$${m}_{v}^{l}={\rho }^{l}({\Phi }^{l}({X}_{v}^{l},{X}_{u}^{l}))$$

(3)

$${X}_{v}^{l+1}={\zeta }^{l}({X}_{v}^{l},{m}_{v}^{l})$$

(4)

We implemented the main model structure of GAMINN using a 3-layer residual GAT model. In addition, we output the last GAT layer to the fully connected layer and aggregated the different WSI features in the same patient to achieve better patient-level feature expression (Fig. 1b).
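The residual update of Eqs. (2) to (4) can be sketched with a deliberately simplified layer. In this toy version Φ copies the neighbour features, ρ takes their unweighted mean, and ζ averages the node's own features with the message; the real GAT layer would use the learned attention weights instead. All function names are hypothetical.

```python
def message_passing(X, neighbours):
    """One simplified layer F(G): mean aggregation instead of learned attention."""
    out = []
    for v, x_v in enumerate(X):
        nbrs = neighbours[v]
        # rho(phi(...)): message m_v is the mean of the neighbour features, Eq. (3)
        m_v = [sum(X[u][d] for u in nbrs) / len(nbrs) for d in range(len(x_v))]
        # zeta: equal-weight mix of own features and message, Eq. (4)
        out.append([0.5 * (own + msg) for own, msg in zip(x_v, m_v)])
    return out


def residual_stack(X, neighbours, num_layers=3):
    """Apply G^{l+1} = F(G^l) + G^l for num_layers layers, Eq. (2)."""
    for _ in range(num_layers):
        F = message_passing(X, neighbours)
        X = [[f + x for f, x in zip(f_row, x_row)] for f_row, x_row in zip(F, X)]
    return X


# Two nodes joined by one edge, each with a 1-dimensional feature.
features = [[1.0], [3.0]]
edges = {0: [1], 1: [0]}
stacked = residual_stack(features, edges, num_layers=3)
```

Because the input is added back after every layer, each node keeps a direct path to its original features however many layers are stacked, which is the property the residual design is meant to provide.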

Details on network training

We used an NVIDIA GeForce RTX 3070 GPU with 16 GB of memory to train the model, and the PyTorch library (version 1.12.1) for training and evaluation. The Adam optimizer was selected, with an initial weight of 0.0002, and each batch contained multiple pathological images of patients. The model was trained for 100 epochs, with the negative Cox partial log-likelihood as the loss function:

$$Loss=-\sum_{i}\left(h_{\theta}(x_{i})-\log\sum_{j\in R(T_{i})}e^{h_{\theta}(x_{j})}\right)$$

(5)

Here, x_i represents the ith case, h_θ(·) denotes the risk score output by the proposed model, and R(T_i) is the risk set at time T_i, that is, the patients whose survival time is not shorter than that of the ith patient.
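A minimal plain-Python sketch of the loss in Eq. (5) is given below. It assumes the standard Cox convention: the risk set for patient i contains every patient whose follow-up time is at least T_i, and only patients with an observed recurrence contribute a term to the outer sum. Both conventions, and the function name, are assumptions of this sketch rather than details confirmed by the text, and a training implementation would operate on PyTorch tensors instead of lists.

```python
import math


def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood for predicted risk scores.

    risk  -- model outputs h_theta(x_i)
    time  -- follow-up times T_i
    event -- 1 if recurrence was observed, 0 if the patient was censored
    """
    loss = 0.0
    n = len(risk)
    for i in range(n):
        if not event[i]:
            continue  # censored patients appear only inside risk sets
        # Risk set R(T_i): patients still under follow-up at time T_i.
        denominator = sum(math.exp(risk[j]) for j in range(n) if time[j] >= time[i])
        loss -= risk[i] - math.log(denominator)
    return loss


# Three hypothetical patients, all with observed recurrence.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
good = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
bad = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
```

Assigning the highest score to the patient who recurs first gives the lower loss (`good < bad`), which is the ordering behaviour the training objective rewards.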

Biological basis of deep learning model

RNA sequencing was performed for 112 patients in validation cohort 1. The TruSeq RNA Access Library Prep Kit (Illumina) was used to generate libraries, and paired-end sequencing was subsequently performed on an Illumina Novaseq™ 6000. Of these patients, 63 were classified as low-risk and 49 as high-risk. We used the edgeR package to identify differentially expressed genes between the two groups, using thresholds of a log fold change greater than 1 and an adjusted p value less than 0.05. Subsequently, Gene Ontology (GO) pathway analysis was performed to determine pathways related to the model-defined risk groups. Additionally, single-sample gene set enrichment analysis (ssGSEA) was conducted with the GSVA package to quantify the relative infiltration of immune cell types in the tumor microenvironment.

Statistical analysis

DFS was defined as the time from surgery to the first-confirmed event of lung cancer recurrence. The Kaplan–Meier method and log-rank test were used to compare survival outcomes between groups. Cox regression analysis was performed to identify independent predictors of survival. The predictive performance of each model was assessed via the Harrell concordance index (C-index), time-dependent receiver operating characteristic (ROC) curves, and area under the curve (AUC) values at 3 and 5 years. The missing information was dealt with using the single imputation method. Statistical analysis was performed with R software (version 4.3.1). A two-sided p value less than 0.05 was considered to indicate statistical significance.
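For readers unfamiliar with the C-index, the sketch below shows one common way to compute Harrell's concordance index for right-censored data. The analyses in this study were run in R; this Python function, its name, and its handling of tied scores (counted as half-concordant) are assumptions for illustration only.

```python
def harrell_c_index(time, event, score):
    """Harrell's C: share of comparable pairs in which the patient who
    recurred earlier also has the higher predicted risk score."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if event[i] and time[i] < time[j]:
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Four hypothetical patients; the third is censored at time 3.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
c_perfect = harrell_c_index(times, events, [4, 3, 2, 1])
c_random = harrell_c_index(times, events, [5, 5, 5, 5])
```

A value of 0.5 corresponds to uninformative scores and 1.0 to perfect ranking, which is the scale on which the C-index values reported in the Results should be read.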

Data availability

The datasets analyzed in the current study are not publicly available due to patient privacy purposes, but are available upon reasonable request to the corresponding author. Access to the data will be restricted to non-commercial research.

Code availability

The source codes of this study are available on reasonable request from the corresponding author. The source codes for visualization can be accessed via the following link: https://github.com/Kim12312/WSI-based-Evaluation.

References

Chaft, J. E., Shyr, Y., Sepesi, B. & Forde, P. M. Preoperative and postoperative systemic therapy for operable non-small-cell lung cancer. J. Clin. Oncol. 40, 546–555 (2022).
Article CAS PubMed PubMed Central Google Scholar
Huang, H., Wang, T., Hu, B. & Pan, C. Visceral pleural invasion remains a size-independent prognostic factor in stage I non-small cell lung cancer. Ann. Thorac. Surg. 99, 1130–1139 (2015).
Article PubMed Google Scholar
Zhong, Y. et al. Prognostic impact of tumour spread through air space in radiological subsolid and pure solid lung adenocarcinoma. Eur. J. Cardiothorac. Surg. 59, 624–632 (2021).
Article PubMed Google Scholar
Okiror, L. et al. Prognostic factors including lymphovascular invasion on survival for resected non-small cell lung cancer. J. Thorac. Cardiovasc. Surg. 156, 785–793 (2018).
Article PubMed Google Scholar
Fujikawa, R. et al. Clinicopathologic and genotypic features of lung adenocarcinoma characterized by the International Association for the Study of Lung Cancer Grading System. J. Thorac. Oncol. 17, 700–707 (2022).
Article CAS PubMed Google Scholar
Hou, L. et al. Prognostic and predictive value of the newly proposed grading system of invasive pulmonary adenocarcinoma in Chinese patients: a retrospective multicohort study. Mod. Pathol. 35, 749–756 (2022).
Article PubMed Google Scholar
Detterbeck, F. C., Boffa, D. J., Kim, A. W. & Tanoue, L. T. The eighth edition lung cancer stage classification. Chest 151, 193–203 (2017).
Article PubMed Google Scholar
Wulczyn, E. et al. Interpretable survival prediction for colorectal cancer using deep learning. NPJ Digit Med. 4. https://doi.org/10.1038/s41746-021-00427-2 (2021)
Qaiser, T. et al. Usability of deep learning and H&E images predict disease outcome-emerging tool to optimize clinical trials. NPJ Precis. Oncol. 6. https://doi.org/10.1038/s41698-022-00275-7 (2022)
Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25, 1519–1525 (2019).
Ding, R. et al. Image analysis reveals molecularly distinct patterns of TILs in NSCLC associated with treatment outcome. NPJ Precis. Oncol. 6, https://doi.org/10.1038/s41698-022-00277-5 (2022).
Rakaee, M. et al. Machine learning-based immune phenotypes correlate with STK11/KEAP1 co-mutations and prognosis in resectable NSCLC: a sub-study of the TNM-I trial. Ann. Oncol. 34, 578–588 (2023).
Shi, J.-Y. et al. Exploring prognostic indicators in the pathological images of hepatocellular carcinoma based on deep learning. Gut 70, 951–961 (2021).
Lee, Y. et al. Derivation of prognostic contextual histopathological features from whole-slide images of tumours via graph deep learning. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-022-00923-0 (2022).
Jiang, X. et al. End-to-end prognostication in colorectal cancer by deep learning: a retrospective, multicentre study. Lancet Digit Health 6, e33–e43 (2024).
Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).
Saillard, C. et al. Predicting survival after hepatocellular carcinoma resection using deep learning on histological slides. Hepatology 72, 2000–2013 (2020).
Zhang, Y. et al. Histopathology images-based deep learning prediction of prognosis and therapeutic response in small cell lung cancer. NPJ Digit Med. 7, 15 (2024).
Nagpal, K. et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. NPJ Digit Med. 2, 48 (2019).
Alberti, W. et al. Chemotherapy in non-small cell lung cancer: a meta-analysis using updated data on individual patients from 52 randomised clinical trials. Non-small Cell Lung Cancer Collaborative Group. BMJ 311, 899–909 (1995).
Arriagada, R. et al. Cisplatin-based adjuvant chemotherapy in patients with completely resected non-small-cell lung cancer. N. Engl. J. Med. 350, 351–360 (2004).
Douillard, J. Y. et al. Adjuvant vinorelbine plus cisplatin versus observation in patients with completely resected stage IB-IIIA non-small-cell lung cancer (Adjuvant Navelbine International Trialist Association [ANITA]): a randomised controlled trial. Lancet Oncol. 7, 719–727 (2006).
Waller, D. et al. Chemotherapy for patients with non-small cell lung cancer: the surgical setting of the Big Lung Trial. Eur. J. Cardiothorac. Surg. 26, 173–182 (2004).
Pignon, J. P. et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE Collaborative Group. J. Clin. Oncol. 26, 3552–3559 (2008).
Collins, G. S. et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 385, e078378 (2024).
Moreira, A. L. et al. A grading system for invasive pulmonary adenocarcinoma: a proposal from the International Association for the Study of Lung Cancer Pathology Committee. J. Thorac. Oncol. 15, 1599–1610 (2020).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).

Acknowledgements

This study was supported by the National Natural Science Foundation of China (92259205, 92474114, 82402371, 82102126, 82272943, 62406190, 824B2055); the National Key Research and Development Program of China (2022YFC2407401, 2023YFC2508600); the National Postdoctoral Program for Innovative Talents (No. BX20230215); the Science and Technology Commission of Shanghai Municipality (21YF1438200); the Shanghai Municipal Health Commission (No. 20234Z0001); the Clinical Research Foundation of Shanghai Pulmonary Hospital (SKPY2021008); the Investigator-Initiated Trial of Shanghai Pulmonary Hospital (2021LY1144, 2023LY0310); the Ningbo Top Medical and Health Research Program (2022030208); and the Medicine and Public Health Scientific Projects in Zhejiang Province (2020KY270).

Author information

These authors contributed equally: Tao Chen, Jialiang Wen, Xinchen Shen.

Authors and Affiliations

Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, China
Tao Chen, Jialiang Wen, Xinchen Shen, Jiajun Deng, Mengmeng Zhao, Long Xu, Junqi Wu, Yunlang She, Yifan Zhong & Chang Chen
School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
Jiaqi Shen
Department of Pathology, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, China
Chunyan Wu & Likun Hou
Department of Thoracic Surgery, The First Affiliated Hospital of Nanchang University, Jiangxi, China
Bentong Yu
Department of Thoracic Surgery, Ningbo HwaMei Hospital, Chinese Academy of Science, Zhejiang, China
Minglei Yang
Department of Thoracic Surgery, The First Affiliated Hospital of Lanzhou University, Gansu, China
Minjie Ma
State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
Yanrui Jin
MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Yanrui Jin

Contributions

T.C., C.C., L.H. designed this study. T.C., Y.J., J.W., X.S. analyzed the data and wrote the first version of the manuscript. Y.J., J.S. built the deep learning model. J.W., J.D., Y.Z., M.Z., L.X., Y.S. collected the clinicopathological data. L.H., B.Y., M.Y., M.M. provided the H&E slides. L.H., C.W. reviewed the H&E slides in two validation sets. C.C., Y.J., L.H., Y.Z. conceived the project and edited the paper. T.C., Y.J., Y.Z. mainly revised the manuscript. All authors read and approved the final manuscript for submission.

Corresponding authors

Correspondence to Yifan Zhong, Likun Hou, Yanrui Jin or Chang Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, T., Wen, J., Shen, X. et al. Whole slide image based deep learning refines prognosis and therapeutic response evaluation in lung adenocarcinoma. npj Digit. Med. 8, 69 (2025). https://doi.org/10.1038/s41746-025-01470-z

Received: 30 March 2024
Accepted: 19 January 2025
Published: 29 January 2025
Version of record: 29 January 2025
DOI: https://doi.org/10.1038/s41746-025-01470-z
