An interpretable deep learning model for predicting endometrial cancer molecular subtypes from H&E-stained slides

Guo, Qinhao; Cui, Haoyu; Zhang, Yangyang; Tang, Shaoxian; Yan, Weicheng; Zhou, Xiaoyan; Ding, Hongmei; Zhou, Jinhua; Ju, Xingzhu; Feng, Zheng; Zhu, Jun; Bai, Fang; Zhong, Yanping; Li, Haiming; Xu, Jun; Wu, Xiaohua; Wang, Xiangxue; Wen, Hao

doi:10.1038/s41698-026-01280-w

Published: 21 January 2026

npj Precision Oncology volume 10, Article number: 71 (2026)


Abstract

The molecular subtype of endometrial cancer is important for predicting prognosis and treatment effectiveness. This study aimed to develop an interpretable deep learning model based on H&E-stained whole slide images (WSIs) to predict the molecular subtype of endometrial cancer. Data from the Fudan cohort (n = 364) were used to train an end-to-end prediction network for identifying four molecular subtypes. Two external cohorts—the TCGA (n = 296) and Suzhou (n = 36)—were used to validate model generalizability and potential clinical applicability. We further assessed the correlation between histological and molecular features at both the macro- (WSI) and micro- (patch) levels. The network achieved a macro-average area under the receiver operating characteristic curve (AUROC) of 0.867 (95% CI: 0.823–0.911) in 5-fold cross-validation. The class-wise AUROCs were 0.846 (95% CI: 0.798–0.894) for the microsatellite instability–high (MSI-H) subtype, 0.876 (95% CI: 0.831–0.921) for the nonspecific molecular profile (NSMP) subtype, 0.910 (95% CI: 0.818–1.000) for the p53-abnormal (p53abn) subtype, and 0.835 (95% CI: 0.784–0.886) for the POLE-mutated (POLEmut) subtype. Morphological analysis revealed that MSI-H-subtype tumors exhibited increased stromal lymphocytic infiltration; POLEmut tumors showed higher heterogeneity, solid growth patterns, and elevated tumor grade; p53abn tumors were characterized by papillary growth and serous-like features; while NSMP tumors demonstrated high stromal cellularity. This method provides an accurate and interpretable tool for molecular subtype prediction, offering a theoretical basis for future individualized treatment strategies in endometrial cancer.

Introduction

Endometrial cancer, a malignancy of the uterine lining, is the sixth most commonly diagnosed cancer in women, with an estimated 420,000 new cases reported globally in 2022¹. Its age-adjusted mortality has increased steadily, rising from 74,000 deaths in 2008 to nearly 97,000 by 2022^1,2,3. Although early-stage disease is associated with favorable outcomes, patients with advanced or recurrent disease fare poorly, with a 5-year overall survival rate of only 20–25%^4,5.

Histological assessment remains central to endometrial cancer (EC) diagnosis when biopsy is feasible, guiding treatment and prognosis⁶. Endometrioid carcinoma, the most common histological type, typically arises from endometrial hyperplasia and is often low-grade with favorable outcomes. In contrast, serous carcinoma arises from atrophic endometrium and is associated with a high risk of extrauterine spread and poor prognosis⁷. Given the limited reproducibility and prognostic precision of histology alone, molecular subtyping now complements traditional classification and enhances risk stratification⁸. According to the National Comprehensive Cancer Network (NCCN) guidelines, molecular testing for POLE-mutated (POLEmut), mismatch repair (MMR)/microsatellite instability (MSI), and p53 abnormalities is recommended to complement histological tumor classification⁹. According to the FIGO 2023 staging system, molecular subtype information (particularly POLEmut and p53abn status) can modify staging in FIGO stage I and II endometrial cancers¹⁰. For advanced-stage disease (FIGO III and IV), molecular classification provides important prognostic and therapeutic information but does not alter the anatomical stage^4,11,12,13. Despite its clinical value, comprehensive molecular classification remains inaccessible in many settings due to cost, technical requirements, and limited availability of sequencing, particularly for detecting the POLEmut subtype, which requires targeted sequencing methods not routinely available in low-resource laboratories¹⁴.

To address this, digital pathology and deep learning have emerged as promising tools for predicting molecular features directly from whole slide images (WSIs)¹⁵. High-resolution WSIs capture morphological cues associated with genomic alterations and the tumor microenvironment, and deep learning models have successfully identified mutation profiles in various cancers^16,17,18. However, in the field of endometrial cancer, only a few studies have used deep learning to predict molecular characteristics from histological images. Wang et al. proposed a model based on ResNet-18 and XGBoost to predict the microsatellite status of endometrial cancer patients, but interpretability analysis was not performed¹⁹. Hong et al. proposed a multiresolution classification model based on the Inception network capable of predicting the four subtypes defined by TCGA and the mutation status of 18 genes⁶. However, the authors treated the classification task as multiple binary classification subtasks and neglected to assess the correlations between different subtypes. Additionally, the models did not exclude nontumor tissues such as the myometrium and were more likely to yield ambiguous prediction scores for nontumor tissue tiles, potentially affecting the overall prediction performance. Fremond et al. proposed a workflow based on attention-based multi-instance learning (AB-MIL) called im4MEC, which achieved competitive performance in a classification task and demonstrated reproducibility¹¹. Building upon this, further improvements could focus on deepening cellular-level insights and enhancing prediction for challenging subtypes such as POLEmut.

In this study, we trained an interpretable deep learning model for predicting the molecular subtype of endometrial cancer with 364 WSIs from 324 patients admitted to Fudan University Shanghai Cancer Center. We subsequently validated this model using 296 WSIs from 274 patients in the TCGA cohort and 36 WSIs from 33 patients from the First Affiliated Hospital of Soochow University. This model predicted the molecular subtypes of endometrial cancer directly without gene sequencing by extracting key features from WSIs. We developed an end-to-end subtype prediction network, significantly reducing computational costs and improving model interpretability. We then assessed current mainstream subtyping networks with 5-fold cross-validation in the Fudan cohort and compared their performance with that of our model. Additionally, we explored the morphological features potentially related to the molecular subtypes of endometrial cancer. Our study offers a proof-of-concept for morphology-based molecular subtyping, with potential implications for future clinical decision-making in endometrial cancer.

Results

Model performance and cohort overview

After excluding patients with incomplete information, images containing scanning errors and other data issues, the Fudan cohort included 364 slides from 324 patients, including 76 patients with MSI-H, 145 with NSMP, 56 with p53abn, and 47 with POLEmut (Supplementary Fig. S1, Supplementary Table S1). Within the Fudan cohort, Supplementary Table S2 presents the cross-tabulation of age group, histological type, FIGO stage, grade, and recurrence by molecular subtype. Similarly, after screening, the TCGA cohort included 296 slides from 274 patients, including 80 patients with MSI-H, 113 with NSMP, 60 with p53abn, and 21 with POLEmut. The Suzhou cohort ultimately included 36 slides from 33 patients, including 4 patients with MSI-H, 23 with NSMP, and 6 with p53abn; no patients with POLEmut were identified in the cohort (Supplementary Table S1). Notably, the Fudan cohort exhibited a higher proportion of POLEmut cases than the other cohorts, which may reflect institutional referral patterns, regional genetic variation, or minor differences in sequencing pipelines⁴.

In 5-fold cross-validation in the Fudan cohort, the proposed model achieved an area under the ROC curve (AUROC) of 0.846 (95% CI: 0.798–0.894) for MSI-H, 0.876 (95% CI: 0.831–0.921) for NSMP, 0.910 (95% CI: 0.818–1.000) for p53abn, and 0.835 (95% CI: 0.784–0.886) for POLEmut, with a macro-average AUROC of 0.867 (95% CI: 0.823–0.911) (Fig. 1A, Supplementary Tables S10–S11). In the external validation in the TCGA cohort, the proposed model achieved an AUROC of 0.775 for MSI-H, 0.844 for NSMP, 0.95 for p53abn, and 0.798 for POLEmut, with a macro-average AUROC of 0.844 (Fig. 1C, Supplementary Tables S7–S9). In the external validation in the Suzhou cohort, the proposed model achieved an AUROC of 0.761 for MSI-H, 0.873 for NSMP, and 0.862 for p53abn, with a macro-average of 0.847 (Fig. 1B, Supplementary Tables S12–S13). Compared with the results from the 5-fold cross-validation in the Fudan cohort, the overall performance of the model showed only slight worsening, except in the prediction of p53abn. We report the accuracy, precision, recall, F1 score, sensitivity, specificity, and negative predictive value (NPV) of the model in the different cohorts in Supplementary Table S3.
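
The class-wise and macro-average AUROCs reported above can be illustrated with a minimal sketch. It assumes a one-vs-rest formulation and an unweighted mean across classes, neither of which is stated explicitly in the text; the function names and the toy data are hypothetical and are not taken from the study's code.

```python
def auroc(labels, scores):
    """Binary AUROC via the pairwise (Mann-Whitney) identity; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(true_classes, prob_rows, classes):
    """One-vs-rest AUROC per class, plus their unweighted mean (assumed definition)."""
    per_class = {}
    for k, c in enumerate(classes):
        labels = [1 if t == c else 0 for t in true_classes]
        scores = [row[k] for row in prob_rows]
        per_class[c] = auroc(labels, scores)
    return per_class, sum(per_class.values()) / len(per_class)


# Toy two-class example (hypothetical slide-level probabilities).
truth = ["A", "A", "B", "B"]
probs = [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.7, 0.3]]
per_class, macro = macro_auroc(truth, probs, ["A", "B"])
```

Under the unweighted-mean assumption, the four Fudan class-wise values average to (0.846 + 0.876 + 0.910 + 0.835) / 4 ≈ 0.867.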

**Fig. 1: Model performance in predicting the molecular subtype in endometrial cancer.**

To further assess the performance of the network proposed in this study, we conducted 5-fold cross-validation within the Fudan cohort and compared it with representative weakly supervised frameworks, including TransMIL, AB-MIL, Max-Pooling, Mean-Pooling, CLAM-SB, and CLAM-MB¹¹. All models were built upon the UNI pathology foundation encoder for feature extraction, ensuring domain-specific and consistent representation across methods^11,20,21. After calculating comparative evaluation metrics including the AUROC, sensitivity, and specificity, the results showed that the proposed network outperformed the six comparison frameworks (Fig. 1D, Supplementary Table S4).

Patch-level visualization revealed the morpho-molecular correlates

Grad-CAM was used to generate heatmaps based on the input patches. We selected representative patches with high confidence from different WSIs and assessed and compared the histological features of the high-activation regions. Finally, we visualized the morphological features associated with the molecular subtypes (Fig. 2). Our results suggested that the majority of MSI-H and NSMP tumors were low-grade endometrioid carcinomas, with MSI-H having a greater density of stromal lymphocytes than NSMP. POLEmut endometrial cancer more frequently exhibited solid growth with more evident cellular atypia than the MSI-H and NSMP subtypes. Finally, p53abn endometrial cancer often displayed papillary growth and characteristics of serous carcinoma, such as high cellular atypia, pleomorphism, and a high nucleus-to-cytoplasm ratio (Fig. 2).

**Fig. 2: Patch-level visualization of the morpho-molecular correlates.**

Extraction and analysis of single-cell nuclear features

We used Hover-Net to perform cell segmentation on the regions of interest (ROIs) of the 364 WSIs from the Fudan cohort. On average, each WSI contained approximately 50,000 inflammatory cells, 75,000 stromal cells, and 550,000 tumor cells. By analyzing the cell masks obtained from segmentation with PyPathomics, 84 features were extracted for each segmented cell, including shape features, texture features, and spatial distribution features (Supplementary Fig. S3, Supplementary Table S5). In total, 245 million cells were analyzed (Fig. 3A).

**Fig. 3: Correlations between single-cell nuclear features and molecular subtypes.**

We calculated the mean and standard deviation of the single-cell nuclear feature data collected in the previous phase to reveal the overall trends of the cell populations within the regions of interest at the WSI level. Spearman correlation analysis was conducted between these extracted features and the four subtypes. The correlation heatmaps and box plots are shown in Fig. 3B.
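
As a rough illustration of this step, the sketch below computes a Spearman correlation between a WSI-level feature and a binary subtype indicator. Encoding each subtype as a 0/1 indicator is an assumption made here for illustration, and the feature values are invented; the study's own analysis code may differ.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5  # undefined if either input is constant


# Hypothetical per-WSI stromal-cell proportions and an NSMP indicator.
stromal_proportion = [0.10, 0.30, 0.20, 0.40]
is_nsmp = [0, 1, 0, 1]
rho = spearman(stromal_proportion, is_nsmp)
```

A p-value for each correlation would additionally require a significance test, which is omitted from this sketch.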

For the MSI-H subtype, no significant correlations were found with any of the nuclear features. For the NSMP subtype, we identified a positive correlation with the proportion of stromal cells (p < 0.0001) and negative correlations with the Delaunay triangle perimeter of stromal cells (p = 0.00104) and the area of immune cells (p < 0.0001). These findings indicate that in patients with NSMP, the tumor region appears to consist of many densely packed stromal cells and small volumes of inflammatory cells. For the p53abn subtype, we identified positive correlations with the mean areas of both tumor cells and inflammatory cells (both p < 0.0001) and their standard deviations (both p < 0.0001). This finding suggests that in patients with p53abn, tumor cells and inflammatory cells are not only of large volumes but also show high variability in their sizes. For the POLEmut subtype, we found positive correlations with the mean difference between the maximum and minimum angles of Delaunay triangles for stromal cells (p < 0.0005) and the area of Delaunay triangles (p = 0.0112). These findings suggest that stromal cells are arranged tightly and exhibit uneven heterogeneity in POLEmut endometrial cancer. This observation was further supported by the heatmap for the POLEmut subtype generated by Grad-CAM, wherein the high-activation regions were dispersed across the patches (Fig. 3C).

Discussion

Currently, the molecular classification of endometrial cancer relies primarily on immunohistochemistry and gene sequencing²². The interpretation of immunohistochemistry results often depends on the pathologist’s experience and subjective judgment, leading to poor reproducibility²³. Furthermore, the expression levels of certain protein markers may overlap among different molecular subtypes, resulting in insufficient specificity and sensitivity. The specificity and quality of antibodies also affect the accuracy of the immunohistochemistry results. Although gene sequencing technology has advantages in precisely determining the subtype, it is time-consuming, costly, technically complex, and requires challenging data analysis, significantly reducing accessibility in economically underdeveloped regions²⁴. The deep learning model in this study infers likely molecular subtype directly from H&E whole-slide images. We do not propose it as a replacement for guideline molecular testing; rather, it is positioned as an H&E-first, human-in-the-loop triage tool. Slide-level probabilities and patch-level heatmaps can help pathologists prioritize confirmatory assays (e.g., MSI testing, POLE sequencing, p53 IHC) and guide selective reflex testing when tissue is limited—for example, flagging cases for MMR IHC/MSI workup when the model assigns a high MSI-H probability. By ordering and accelerating confirmatory testing, this workflow may reduce turnaround time and cost, particularly in resource-limited settings.

We trained, tested, and validated the model with 364 WSIs from 324 patients in the Fudan cohort and performed external validation with 296 WSIs from 274 patients in the TCGA cohort and 36 WSIs from 33 patients in the Suzhou cohort. Using an end-to-end predictive network characterized by low computational overhead and good interpretability, we developed a model that achieved improved performance compared with representative MIL and pooling frameworks (TransMIL, AB-MIL, CLAM-SB, CLAM-MB, Max-Pooling, and Mean-Pooling), all trained with features extracted from the UNI pathology foundation model for consistent histopathologic representation.

We developed a replicable and efficient method for molecular subtype classification by formulating the problem as a soft voting task across tumor-region patches. Instead of using more complex or data-intensive models, we adopted EfficientNetV2, a state-of-the-art convolutional neural network (CNN) backbone, for patch-level classification. This choice leveraged the inherent strengths of CNNs, including rapid convergence, strong local feature extraction, and the capacity to process high-resolution inputs efficiently²⁵. These characteristics are particularly advantageous in medical imaging tasks involving limited datasets and large histopathological slides. Compared to vision transformer-based architectures such as ViT, which often require large-scale data and operate on fixed-size patch tokenization, our approach is better suited for capturing both fine-grained morphological and global contextual patterns. This design also allowed efficient extraction of informative regions and facilitated incremental learning, avoiding the need for retraining the entire feature extractor. As an end-to-end network was used in our study, we directly captured the most advantageous regions for subtyping from the gradients of the CNN by using Grad-CAM. This facilitated the exploration of subtype-specific morphological characteristics, offering finer-grained interpretability compared with conventional MIL attention maps. Additionally, a tumor segmentation network was used in the preprocessing stage to ensure that the included tiles belonged to tumor areas, minimizing the impact of nontumor areas on the predictive performance of the model. In addition to those of endometrioid carcinoma, the most common endometrial cancer, we also included samples of other histological subtypes, increasing the representativeness of the samples and interpretability of the model.

By analyzing the approximately 245 million cells segmented from the 364 WSIs in the dataset, we confirmed the findings of Fremond et al. obtained at the WSI level, that NSMP tiles contained a relatively high proportion of stromal cells and that p53abn tiles displayed relatively large tumor cells¹¹. Furthermore, we revealed correlations between the four molecular subtypes and certain morphological features (e.g., surface curvature) as well as distribution characteristics (e.g., density, cell arrangement) of tumor cells, stromal cells, and inflammatory cells through a larger WSI-level coverage area and richer feature analysis. These correlation analyses allowed us to explore the potential associations between pathogenic mechanisms and prognosis corresponding to the four subtypes in the tumor microenvironment.

This study has several limitations that should be noted. First, although the proposed model achieved high AUROC, accuracy, specificity, and negative predictive value (NPV) in predicting the POLEmut subtype, its performance in terms of precision, recall, F1 score, and sensitivity was relatively poor. This may be attributed to the low prevalence of the POLEmut subtype in the general population, as well as the limited number of such cases in our dataset. Second, the relatively high POLEmut frequency in our cohort may reflect referral bias and enriched molecular testing in a tertiary care setting, and thus may not fully represent population-level distributions. Additionally, given the relatively short follow-up period for the patients included in this study and their generally favorable outcomes, no correlation was established between the molecular subtype and prognostic information.

To enhance the clinical utility of the model, future work should address several key areas. First, improving the identification of the POLEmut subtype will require the expansion of training datasets to include a larger and more representative sample of such cases, thereby mitigating class imbalance and improving statistical power. Second, given the morphological overlap between POLEmut and MSI-H tumors, additional strategies such as feature disentanglement may help the network better distinguish these subtypes. Third, while the current model focuses on four-class molecular subtype prediction, future efforts could explore clinically meaningful binary tasks—such as distinguishing immune-active versus immune-inactive tumors—with particular relevance to the MSI-H subgroup, which is closely associated with immunotherapy response⁵. Moreover, large-scale clinical validation across diverse cohorts will be essential before the model can be translated into real-world clinical practice.

In summary, this proof-of-concept study introduces a morphology-based framework for molecular subtyping of endometrial cancer. Rather than replacing genomic assays, it provides a scalable and interpretable basis for future clinical translation, especially in settings where molecular testing is limited or unavailable.

Methods

Data preparation and molecular classification

We used diagnostic formalin-fixed paraffin-embedded (FFPE) and H&E-stained tumor slides and complete molecular and clinicopathological data from three cohorts to train, test and validate our models (Fig. 4). In both the Fudan and Suzhou cohorts, molecular classification was performed using a hierarchical algorithm based on targeted next-generation sequencing (NGS) with the ColonCore 46-gene panel (Burning Rock Biotech, Guangzhou, China)⁴. This panel was designed to simultaneously assess microsatellite instability (MSI) status via a read-based algorithm and to detect mutations associated with Lynch syndrome (LS) (e.g., MLH1, MSH2, MSH6, PMS2), POLE, TP53, and other genes frequently altered in endometrial cancer (e.g., PTEN, CTNNB1, ARID1A, PIK3CA), as well as genes relevant to hereditary cancer syndromes and targeted therapy (Supplementary Table S6). For MSI calling, loci were deemed unstable when the coverage ratio of pre-specified repeat-length bins fell below locus-specific thresholds; samples were labeled MSI-H if >30% of marker loci were unstable (otherwise MSS). Sequencing was performed on the Illumina NextSeq platform (Illumina Inc., San Diego, CA) at a mean depth of 1000× following the manufacturer’s instructions. Genomic loci meeting the depth threshold (≥100×) were identified, followed by variant calling using the VarScan pipeline²⁶. Base calling in tissue samples required ≥8 supporting reads for single-nucleotide variants (SNVs), and ≥2 and ≥5 supporting reads for insertion and deletion variants, respectively. Common polymorphisms (population frequency >0.1% in ExAC/1000 Genomes/dbSNP/ESP6500SI-V2) were excluded. Variants that passed these thresholds were retained for downstream molecular classification. Molecular subtypes were assigned following the mutually exclusive algorithm consistent with The Cancer Genome Atlas (TCGA) framework (Supplementary Fig. S1). Tumors were assigned to the four WHO molecular classes using a fixed hierarchy. 
First, cases with pathogenic/likely pathogenic POLE exonuclease-domain mutations were classified as POLE ultramutated (POLEmut). Among POLE-wild-type tumors, those identified as high microsatellite instability (MSI-H) by our NGS-based algorithm were labeled MSI-H and treated as a distinct group throughout the study. Next, among microsatellite-stable, POLE-wild-type tumors, those harboring pathogenic/likely pathogenic, protein-altering TP53 variants were classified as p53-abnormal (p53abn). The remaining tumors were assigned to no specific molecular profile (NSMP). As an orthogonal check, IHC for MLH1, MSH2, MSH6, and PMS2 was performed; NGS–IHC concordance for MMR status was 91.8% (previously reported by our group⁴), and IHC findings did not alter the final molecular labels used for analysis. Representative IHC results are shown in Supplementary Fig. S2. The TCGA cohort was classified according to published molecular subtype annotations based on multi-omics profiling²⁷.
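
The fixed hierarchy and the MSI-H threshold described above can be sketched as two small functions. This is an illustrative reading of the text rather than the authors' pipeline: the function names and boolean inputs are hypothetical, and the upstream variant calling and pathogenicity annotation are assumed to have been done already.

```python
def call_msi(unstable_loci, total_loci, threshold=0.30):
    """Label a sample MSI-H if more than 30% of marker loci are unstable, else MSS."""
    if total_loci <= 0:
        raise ValueError("total_loci must be positive")
    return "MSI-H" if unstable_loci / total_loci > threshold else "MSS"


def assign_subtype(pole_edm_pathogenic, msi_status, tp53_pathogenic):
    """Apply the fixed hierarchy: POLEmut, then MSI-H, then p53abn, then NSMP."""
    if pole_edm_pathogenic:
        return "POLEmut"
    if msi_status == "MSI-H":
        return "MSI-H"
    if tp53_pathogenic:
        return "p53abn"
    return "NSMP"


# A tumor carrying both a pathogenic POLE variant and a TP53 variant is
# still POLEmut, because POLE is checked first.
label = assign_subtype(True, call_msi(1, 10), True)
```

Because each step only considers tumors not captured by an earlier step, every case receives exactly one label.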

**Fig. 4: Schematic diagram of data preparation.**

Deep learning pipeline

To effectively process the hundred-million-pixel WSIs, we first cropped the WSIs obtained from scanning the H&E-stained slides into small nonoverlapping patches. The pixel size of these patches was set to 0.549 microns per pixel, and the patch size was set to 1024 × 1024 pixels to ensure that the patches contained rich details of local cellular structures while retaining broader tissue structure information. Next, to remove regions lacking biological content, we applied a brightness-based filter: patches with a mean RGB intensity below 100 or above 200 were excluded, as they typically represented blank slide borders or overstained background. This step focused computation on informative areas, improving both efficiency and accuracy. We then performed stain normalization using the Vahadane method, applied independently to each patch using a single reference slide as the template. Following this, color mean normalization was performed by subtracting the channel-wise mean RGB values of the ImageNet dataset (\(\mathrm{mean}_{(r,g,b)} = (0.485, 0.456, 0.406)\), \(\mathrm{std}_{(r,g,b)} = (0.229, 0.224, 0.225)\), assuming input pixel values are scaled to the [0,1] range) from each patch after stain normalization.
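
A minimal sketch of the brightness filter and the ImageNet-based normalization follows. It assumes 8-bit RGB pixels, takes the mean intensity over all channels and pixels, and divides by the listed standard deviation after subtracting the mean; the division is an assumption suggested by the quoted std values, not something the text states. Stain normalization (Vahadane) is not shown, and the function names are hypothetical.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def keep_patch(pixels, low=100, high=200):
    """Keep a patch whose mean RGB intensity (0-255 scale) lies within [low, high]."""
    total = 0
    count = 0
    for r, g, b in pixels:
        total += r + g + b
        count += 3
    return low <= total / count <= high


def normalize_pixel(rgb):
    """Scale an 8-bit pixel to [0, 1], subtract the mean, divide by std (assumed)."""
    return tuple((v / 255.0 - m) / s
                 for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


blank = [(250, 250, 250)] * 4   # bright background: excluded
tissue = [(150, 120, 160)] * 4  # mid-range intensity: kept
```

In practice these operations would be vectorized over whole 1024 × 1024 patches rather than applied per pixel.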

Next, to enrich the dataset with tumor-relevant regions, we applied a pretrained tumor segmentation model based on DeepLab-v3²⁸. A patch was considered “tumor” if >50% of its pixels were labeled as tumor; only tumor patches were retained for all experiments. During 5-fold cross-validation (Fudan cohort), to mitigate class imbalance we performed subtype-stratified retention after tumor filtering, retaining 50% of tumor patches from MSI-H, 30% from NSMP, and 100% from p53abn and POLEmut (MSI-H:NSMP:p53abn:POLEmut = 0.5:0.3:1:1), with a fixed random seed per fold. For external evaluation cohorts, we did not use stratified retention; instead, to standardize slide contribution and control compute, we capped each WSI at 200 tumor patches (keeping all when ≤200, otherwise uniform sampling without replacement to 200). All training/validation/test patches originated from tumor regions; normal-only or stroma-only patches were excluded.
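
The tumor-patch rule, the subtype-stratified retention, and the 200-patch cap can be sketched as follows. Whether retention was applied per slide or over a pooled set of patches is not specified, so this sketch simply subsamples whatever list it is given; the function names and seed handling are hypothetical.

```python
import random

RETENTION = {"MSI-H": 0.5, "NSMP": 0.3, "p53abn": 1.0, "POLEmut": 1.0}


def is_tumor_patch(tumor_pixels, total_pixels):
    """A patch counts as tumor if more than 50% of its pixels are labeled tumor."""
    return tumor_pixels / total_pixels > 0.5


def retain_for_training(patches, subtype, seed):
    """Randomly keep the subtype-specific fraction of tumor patches (fixed seed)."""
    k = round(len(patches) * RETENTION[subtype])
    return random.Random(seed).sample(patches, k)


def cap_for_external(patches, seed, cap=200):
    """Keep all patches when at most `cap`; otherwise sample `cap` without replacement."""
    if len(patches) <= cap:
        return list(patches)
    return random.Random(seed).sample(patches, cap)


nsmp_kept = retain_for_training(list(range(100)), "NSMP", seed=0)
capped = cap_for_external(list(range(500)), seed=0)
```

Using a seeded `random.Random` instance keeps the subsample reproducible without touching the global random state.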

The filtered and normalized patches were then fed into a classical weakly supervised learning framework. In this end-to-end paradigm, the feature extractor and classifier were trained jointly using WSI-level molecular subtype labels. Specifically, each patch was assigned the ground truth label of its parent WSI, and the model was optimized using a categorical cross-entropy loss computed at the patch level. No attention-based weighting or instance selection was applied—each patch contributed equally to the final prediction.
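
To make the training objective concrete, the sketch below computes a patch-level categorical cross-entropy in which every patch inherits the label of its parent WSI and contributes equally. The logits are invented, and the averaging over patches is an assumption about how the loss was reduced.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one patch's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def patch_cross_entropy(patch_logits, wsi_label_index):
    """Mean cross-entropy over patches, each labeled with the parent WSI's subtype."""
    losses = [-math.log(softmax(z)[wsi_label_index]) for z in patch_logits]
    return sum(losses) / len(losses)


# Three hypothetical patches from one WSI whose slide-level label is class 2.
logits = [[0.1, 0.2, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0], [1.5, 0.1, 0.3, 0.2]]
loss = patch_cross_entropy(logits, wsi_label_index=2)
```

A patch with uniform logits contributes log 4 ≈ 1.386, the loss of an uninformative four-class prediction.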

The backbone architecture was EfficientNetV2²⁹, initialized with ImageNet-pretrained weights. The first five blocks (shallowest layers) were frozen during training, while the remaining layers were fine-tuned on our histopathology data. This configuration preserved generic low-level feature extraction while allowing deeper layers to adapt to the histopathological domain. Training was conducted using the AdamW optimizer (learning rate = 2 × 10⁻⁵, weight decay = 1 × 10⁻³, batch size = 10, dropout = 0.25, epochs = 80). Hyperparameters were selected based on 5-fold cross-validation within the Fudan training cohort; no external cohort (TCGA, Suzhou) was used in hyperparameter tuning.

For inference, the model output softmax probabilities across the four molecular subtypes for each patch. Slide-level classification was performed using soft voting: all patch-level probabilities from a given WSI were averaged, and the subtype with the highest mean probability was assigned using argmax. This straightforward aggregation strategy enabled interpretable and reproducible decision-making.
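
The soft-voting rule can be written in a few lines. The class ordering in `SUBTYPES` and the example probabilities are assumptions made for illustration.

```python
SUBTYPES = ("MSI-H", "NSMP", "p53abn", "POLEmut")  # assumed output order


def soft_vote(patch_probs):
    """Average patch-level softmax vectors and return the argmax subtype."""
    n = len(patch_probs)
    if n == 0:
        raise ValueError("a WSI needs at least one tumor patch")
    mean = [sum(row[k] for row in patch_probs) / n for k in range(len(SUBTYPES))]
    best = max(range(len(SUBTYPES)), key=lambda k: mean[k])
    return SUBTYPES[best], mean


# Three hypothetical patches from one slide.
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
subtype, mean_probs = soft_vote(probs)
```

Here the second patch favors NSMP, but the slide is still called MSI-H because the mean MSI-H probability is highest.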

We evaluated six representative weakly supervised frameworks under a unified protocol, including TransMIL³⁰, AB-MIL³¹, Max-Pooling, Mean-Pooling, CLAM-SB, and CLAM-MB²⁰. For fair comparison, all models used patch-level features extracted from the UNI pathology foundation model, which provides pathology-specific representations with improved consistency across cohorts²¹. The extracted features were kept frozen, and only the respective aggregation or attention heads were trained on the Fudan cohort with slide-level supervision using cross-entropy loss. Preprocessing, tiling, stain normalization, tumor-region filtering (DeepLab-v3), data partitioning, and evaluation metrics were identical across all models (Supplementary Table S5).

In contrast to these pooling- and MIL-based baselines that operate on frozen patch embeddings, our model fine-tunes the feature encoder jointly with the classifier under WSI-level supervision. This joint learning strategy has been described as “classical weakly supervised learning” by Laleh et al.³² and has been shown to yield biologically relevant predictions even without explicit patch-level annotations. Moreover, our approach enables downstream interpretability through Grad-CAM visualization, highlighting tissue regions most influential for the final subtype prediction and providing finer, pathology-aligned insights beyond those offered by decoupled MIL frameworks.

By leveraging both microscopic cellular morphology and macroscopic tissue patterns in a unified pipeline, our model captures multi-scale histopathological features critical for endometrial cancer molecular subtyping. This interpretable design supports integration into future workflows where explainability and traceability are paramount (Fig. 5).

Patch-level visualization of the morpho-molecular correlates

To visualize the morphological features related to the molecular subtype, we used gradient-weighted class activation mapping (Grad-CAM), a visualization technique that uses gradient information to weight the activation values of the convolutional feature maps and generates corresponding heatmaps for the input patches. These heatmaps show the areas of the patch that the CNN focused on when predicting the molecular subtype, with red indicating areas of high activation and blue indicating areas of low activation. Representative patches with high confidence from different WSIs were selected to assess and compare histological features corresponding to high activation areas, establishing the connection between the molecular subtype and morphological features of endometrial cancer.
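
For orientation, the core Grad-CAM computation can be sketched on tiny nested lists: each feature map is weighted by the spatial mean of its gradient, the weighted maps are summed, and a ReLU is applied. The inputs here are invented, the rescaling to [0, 1] is a display convention assumed for this sketch, and obtaining activations and gradients from a real network is not shown.

```python
def grad_cam(activations, gradients):
    """activations, gradients: [channel][row][col] lists of equal shape.

    Returns a heatmap of shape [row][col], rescaled so its peak is 1.
    """
    n_ch = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of the class-score gradient per feature map.
    alphas = [sum(sum(row) for row in gradients[k]) / (h * w) for k in range(n_ch)]
    # Weighted sum over channels followed by ReLU.
    cam = [[max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(n_ch)))
            for j in range(w)] for i in range(h)]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```

The resulting low-resolution map would then be upsampled to the patch size and rendered with a color map, with high values shown in red.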

Correlations between single-cell nuclear features and molecular subtypes

In addition to these patch-level heatmaps for analyzing the histological interpretability of H&E-stained slides, Hover-Net was used to classify and segment the nuclei of all cells (including tumor cells, inflammatory cells, and stromal cells) within the tumor cell-enriched areas of each WSI. Unlike Fremond et al.¹¹, who focused on nuclear features from the top 20 most-attended patches, our approach analyzed all cells—including tumor, stromal, and inflammatory cells—across the entire tumor compartment, thereby encompassing a much larger and more representative cellular population per case. From the segmented nuclei, we extracted a total of 84 interpretable features per cell, which were organized into three categories: morphological, texture, and spatial distribution metrics (Supplementary Fig. S1). Morphological features included area, perimeter, eccentricity, elongation, and boundary curvature, which capture aspects of cellular growth, shape regularity, and surface smoothness. Texture features such as angular second moment (ASM), contrast, entropy, and correlation quantified intra-nuclear homogeneity or heterogeneity. To assess spatial architecture, we employed Delaunay triangulation across nuclear centroids, calculating metrics such as triangle area, perimeter, and angular variance to reflect cell packing density and spatial uniformity (Supplementary Table S5).
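
The per-triangle spatial metrics mentioned above (area, perimeter, and angular spread) can be illustrated for a single Delaunay triangle. The triangulation itself is assumed to come from an existing library and is not shown; the function name, the returned keys, and the use of the max-minus-min angle as the angular measure are assumptions for illustration.

```python
import math


def triangle_features(a, b, c):
    """Area, perimeter, and max-min interior angle (degrees) of triangle abc.

    a, b, c are (x, y) nuclear centroids; degenerate triangles are not handled.
    """
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2

    def angle(opposite, s1, s2):
        # Law of cosines, clamped to guard against rounding outside [-1, 1].
        cos = (s1 * s1 + s2 * s2 - opposite * opposite) / (2 * s1 * s2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

    angles = [angle(bc, ab, ca), angle(ca, ab, bc), angle(ab, bc, ca)]
    return {
        "area": area,
        "perimeter": ab + bc + ca,
        "angle_range": max(angles) - min(angles),
    }


feats = triangle_features((0, 0), (3, 0), (0, 4))
```

Averaging such values over all triangles of one cell type in a WSI would give slide-level summaries comparable to those used in the correlation analysis.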

We computed molecular subtype-specific correlations between these cell-level features and the predicted labels across the entire training dataset. For clarity and conciseness in the main text, we reported the top 20 features showing the strongest correlations with specific molecular subtypes. However, the full feature matrix across all cases is available in the supplementary appendix, ensuring transparency and enabling downstream reproducibility.
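The ranking step could look like the sketch below. The text does not state which correlation coefficient was used or how per-cell features were aggregated per case, so this example assumes one aggregated value per case for each feature and a Pearson correlation against a one-vs-rest indicator of the predicted subtype. The function and feature names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / math.sqrt(var_x * var_y)


def top_features(
    features: Dict[str, Sequence[float]],
    predicted_labels: Sequence[str],
    subtype: str,
    k: int = 20,
) -> List[Tuple[str, float]]:
    """Rank features by |correlation| with a one-vs-rest subtype indicator."""
    indicator = [1.0 if label == subtype else 0.0 for label in predicted_labels]
    scored = [(name, pearson(values, indicator)) for name, values in features.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:k]


# Toy data: four cases, three case-level features (values are made up).
toy_features = {
    "area_mean": [1.0, 1.0, 3.0, 3.0],
    "entropy_mean": [2.0, 2.0, 2.0, 2.0],
    "contrast_mean": [1.0, 2.0, 1.0, 2.0],
}
toy_labels = ["MSI-H", "MSI-H", "NSMP", "NSMP"]
ranking = top_features(toy_features, toy_labels, subtype="MSI-H", k=1)
```

Ranking by absolute value keeps both positively and negatively associated features; the sign of each coefficient is preserved in the output so the direction of the association remains visible.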

For interpretability, we considered correctly classified WSIs and computed a patch score as the softmax probability for the WSI’s ground-truth subtype. Within each subtype, patches with score ≥0.95 were retained; if fewer were available, additional patches were added in descending score order to reach the target, enforcing ≤2 patches per WSI. Two board-certified gynecologic pathologists (≥5 years’ experience) then independently reviewed the candidates (blinded; structured checklist with optional free-text) and selected 20 representative patches per subtype for qualitative illustration. These qualitative evaluations were used to cross-check whether the model-prioritized features aligned with human-observed subtype distinctions.
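The candidate-selection rule described above can be sketched as follows. The target pool size is not given in the text, so `target` is a hypothetical parameter, and applying the per-WSI cap to patches above the threshold as well as to fill-in patches is an assumption of this sketch.

```python
from collections import Counter
from typing import List, NamedTuple, Sequence


class Patch(NamedTuple):
    wsi_id: str
    patch_id: str
    score: float  # softmax probability for the WSI's ground-truth subtype


def select_candidates(
    patches: Sequence[Patch],
    target: int,
    threshold: float = 0.95,
    max_per_wsi: int = 2,
) -> List[Patch]:
    """Pick candidate patches for one subtype.

    Patches scoring >= threshold are kept; if fewer than `target` qualify,
    the pool is topped up in descending score order. No WSI contributes
    more than `max_per_wsi` patches.
    """
    ranked = sorted(patches, key=lambda p: p.score, reverse=True)
    per_wsi: Counter = Counter()
    selected: List[Patch] = []
    for patch in ranked:
        if per_wsi[patch.wsi_id] >= max_per_wsi:
            continue  # this slide has already reached its cap
        if patch.score >= threshold or len(selected) < target:
            selected.append(patch)
            per_wsi[patch.wsi_id] += 1
    return selected


# Toy pool for one subtype (identifiers and scores are made up).
pool = [
    Patch("A", "a1", 0.99), Patch("A", "a2", 0.98), Patch("A", "a3", 0.97),
    Patch("B", "b1", 0.96),
    Patch("C", "c1", 0.90), Patch("C", "c2", 0.80),
    Patch("D", "d1", 0.70),
]
candidates = select_candidates(pool, target=5)
```

In this toy pool, `a3` is dropped by the per-WSI cap even though it exceeds the threshold, and `c1` and `c2` are added only to fill the pool up to the target size.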

Together, this whole-slide, single-cell feature analysis provides a reproducible framework for exploring morpho-molecular correlates at scale. While individual correlations may remain modest, the breadth and transparency of the extracted features enable future meta-analyses and facilitate integration with prospective biological validation efforts.

Data availability

The TCGA diagnostic whole-slide images and corresponding molecular subtype labels are publicly available through the NIH Genomic Data Commons (https://portal.gdc.cancer.gov/). The datasets generated and analyzed from the Fudan and Suzhou cohorts during the current study are not publicly available due to institutional data-sharing policies but are available from the corresponding author upon reasonable request. The source code used in this study is available on GitHub at https://github.com/HaoyuCui/EC-Subtype-Prediction and https://github.com/HaoyuCui/PyPathomics, and has been archived on Zenodo (https://doi.org/10.5281/zenodo.14628906), together with two representative per-cell feature matrices from our cell-segmentation pipeline (https://doi.org/10.5281/zenodo.17019320).

References

1. Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).
2. Gu, B. et al. Variations in incidence and mortality rates of endometrial cancer at the global, regional, and national levels, 1990-2019. Gynecol. Oncol. 161, 573–580 (2021).
3. Ferlay, J. et al. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int. J. Cancer 127, 2893–2917 (2010).
4. Oaknin, A. et al. Endometrial cancer: ESMO Clinical Practice Guideline for diagnosis, treatment and follow-up. Ann. Oncol. 33, 860–877 (2022).
5. Grau Bejar, J. F. et al. Immune predictors of response to immune checkpoint inhibitors in mismatch repair-deficient endometrial cancer. J. Immunother. Cancer 12, e009143 (2024).
6. Hong, R., Liu, W., DeLair, D., Razavian, N. & Fenyö, D. Predicting endometrial cancer subtypes and molecular features from histopathology images using multi-resolution deep learning models. Cell Rep. Med. 2, 100400 (2021).
7. Crosbie, E. J. et al. Endometrial cancer. Lancet 399, 1412–1428 (2022).
8. Urick, M. E. & Bell, D. W. Clinical actionability of molecular targets in endometrial cancer. Nat. Rev. Cancer 19, 510–521 (2019).
9. Abu-Rustum, N. et al. Uterine Neoplasms, Version 1.2023, NCCN Clinical Practice Guidelines in Oncology. J. Natl. Compr. Canc Netw. 21, 181–209 (2023).
10. Berek, J. S. et al. FIGO staging of endometrial cancer: 2023. Int. J. Gynaecol. Obstet. 162, 383–394 (2023).
11. Fremond, S. et al. Interpretable deep learning model to predict the molecular classification of endometrial cancer from haematoxylin and eosin-stained whole-slide images: a combined analysis of the PORTEC randomised trials and clinical cohorts. Lancet Digit Health 5, e71–e82 (2023).
12. Jamieson, A. & McAlpine, J. N. Molecular profiling of endometrial cancer from TCGA to clinical practice. J. Natl. Compr. Canc Netw. 21, 210–216 (2023).
13. Kobayashi-Kato, M. et al. Utility of the revised FIGO2023 staging with molecular classification in endometrial cancer. Gynecol. Oncol. 178, 36–43 (2023).
14. Guo, Q. et al. Identification of molecular subtypes for endometrial carcinoma using a 46-gene next-generation sequencing panel: a retrospective study on a consecutive cohort. ESMO Open 9, 103710 (2024).
15. Hanna, M. G. et al. Integrating digital pathology into clinical practice. Mod. Pathol. 35, 152–164 (2022).
16. Tsai, P. C. et al. Histopathology images predict multi-omics aberrations and prognoses in colorectal cancer patients. Nat. Commun. 14, 2102 (2023).
17. Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
18. Guo, Q. et al. Predicting lymph node metastasis from primary cervical squamous cell carcinoma based on deep learning in histopathologic images. Mod. Pathol. 36, 100316 (2023).
19. Wang, T. et al. in 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI) 1289–1292 (IEEE, 2020).
20. Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
21. Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
22. Talhouk, A. et al. Confirmation of ProMisE: a simple, genomics-based clinical classifier for endometrial cancer. Cancer 123, 802–813 (2017).
23. Thompson, E. F. et al. Variability in endometrial carcinoma pathology practice: opportunities for improvement with molecular classification. Mod. Pathol. 35, 1974–1982 (2022).
24. Plotkin, A. et al. Budget impact analysis of molecular subtype profiling in endometrial cancer. Gynecol. Oncol. 178, 54–59 (2023).
25. Zhang, Y. et al. Development of a PANoptosis-Related Pathomics Prognostic Model in Ovarian Cancer: A Multi-Omics Study. J. Cell. Mol. Med. 29, e70958 (2025).
26. Zhu, L. et al. A novel and reliable method to detect microsatellite instability in colorectal cancer by next-generation sequencing. J. Mol. Diagn. 20, 225–231 (2018).
27. Bell, D. W. & Ellenson, L. H. Molecular genetics of endometrial carcinoma. Annu. Rev. Pathol. 14, 339–367 (2019).
28. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV) 801–818 (2018).
29. Tan, M. & Le, Q. Efficientnetv2: Smaller models and faster training. In International conference on machine learning 10096–10106 (PMLR, 2021).
30. Shao, Z. et al. TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification. In Neural Information Processing Systems (2021).
31. Ilse, M., Tomczak, J. M. & Welling, M. Attention-based Deep Multiple Instance Learning. In International Conference on Machine Learning (2018).
32. Ghaffari Laleh, N. et al. Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology. Med. Image Anal. 79, 102474 (2022).

Acknowledgements

We would like to thank all doctors, nurses, patients, and their family members for their kindness in supporting our study. This work was supported by the National Natural Science Foundation of China (Nos. 82272898 and 82203723), the Three-Year Action Plan to Promote Clinical Skills and Clinical Innovation Capacity of Municipal Hospitals by the Shanghai Shenkang Hospital Development Center (SHDC2020CR5003-001), and the Science and Technology Commission of Shanghai Municipality (21ZR1415000).

Author information

These authors contributed equally: Qinhao Guo, Haoyu Cui, Yangyang Zhang, Shaoxian Tang.

Authors and Affiliations

Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
Qinhao Guo, Yangyang Zhang, Xingzhu Ju, Zheng Feng, Jun Zhu, Fang Bai, Yanping Zhong, Xiaohua Wu & Hao Wen
Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
Qinhao Guo, Yangyang Zhang, Shaoxian Tang, Xiaoyan Zhou, Xingzhu Ju, Zheng Feng, Jun Zhu, Fang Bai, Yanping Zhong, Haiming Li, Xiaohua Wu & Hao Wen
Jiangsu Key Laboratory of Intelligent Medical Image Computing, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, China
Haoyu Cui, Weicheng Yan, Jun Xu & Xiangxue Wang
Department of Pathology, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
Shaoxian Tang & Xiaoyan Zhou
Department of Obstetrics and Gynecology, The First Affiliated Hospital of Soochow University, Suzhou, China
Hongmei Ding & Jinhua Zhou
Department of Radiology, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
Haiming Li

Authors

Qinhao Guo
Haoyu Cui
Yangyang Zhang
Shaoxian Tang
Weicheng Yan
Xiaoyan Zhou
Hongmei Ding
Jinhua Zhou
Xingzhu Ju
Zheng Feng
Jun Zhu
Fang Bai
Yanping Zhong
Haiming Li
Jun Xu
Xiaohua Wu
Xiangxue Wang
Hao Wen

Contributions

Conceptualization: G.Q.H., W.H., and T.S.X. Methodology: C.H.Y., Z.Y.Y., X.J., L.H.M., W.X.X., and Y.W.C. Investigation: D.H.M., Z.J.H., Z.Y.Y., J.X.Z., F.Z., Z.J., B.F., and W.X.H. Visualization: L.Y.Z., Z.Y.P., L.H.M., and Z.Y.Y. Funding acquisition: G.Q.H. and W.X.H. Project administration: G.Q.H., C.H.Y., Z.Y.Y., and T.S.X. Supervision: W.X.H., W.X.X., and W.H. Writing—original draft: G.Q.H., C.H.Y., L.Y.Z., and Z.Y.Y. Writing—review and editing: T.S.X., W.H., W.X.H., W.X.X., and X.J.

Corresponding authors

Correspondence to Jun Xu, Xiaohua Wu, Xiangxue Wang or Hao Wen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table (XLSX)

Supplementary Information (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Guo, Q., Cui, H., Zhang, Y. et al. An interpretable deep learning model for predicting endometrial cancer molecular subtypes from H&E-stained slides. npj Precis. Onc. 10, 71 (2026). https://doi.org/10.1038/s41698-026-01280-w

Received: 13 March 2025
Accepted: 11 January 2026
Published: 21 January 2026
Version of record: 19 February 2026
DOI: https://doi.org/10.1038/s41698-026-01280-w