Abstract
Weakly supervised segmentation of cancerous regions in whole-slide images (WSIs) is a crucial task in computational pathology, but fully supervised alternatives are severely hampered by the need for expensive pixel-level annotations. Existing Multiple Instance Learning (MIL) frameworks, while popular, typically fail to produce accurate segmentation masks because they treat a WSI as an unordered ‘bag of patches’, ignoring the tissue topology and architectural patterns that define malignancy. In this paper, we address this limitation by proposing Geometric Multi-Instance Learning (Geo-MIL), a graph-based framework that explicitly models the spatial relationships between tissue patches. At the core of our method is a topological attention mechanism that operates on the WSI graph and learns to prioritize entire diagnostically relevant tissue structures over isolated patch features. Through extensive experiments on three public gastric cancer datasets, we demonstrate that Geo-MIL significantly outperforms a wide array of state-of-the-art baselines in both segmentation accuracy and classification performance. Our work is a step towards bridging the gap between weak slide-level labels and precise pixel-level predictions, paving the way for scalable and accurate quantitative analysis in digital pathology.
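To make the contrast between an unordered bag of patches and a spatial patch graph concrete, the following is a minimal sketch and not the authors' Geo-MIL implementation. It assumes a k-nearest-neighbour graph over patch coordinates and a simple neighbourhood averaging of per-patch scores before a softmax; the function names (knn_graph, neighborhood_attention, pool) and the averaging rule are hypothetical choices made only for illustration.

```python
import math

def knn_graph(coords, k):
    """For each patch, list the indices of its k nearest patches (Euclidean)."""
    neighbors = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def neighborhood_attention(scores, neighbors):
    """Average each patch score with its graph neighbours, then apply a softmax.

    Assumption: this averaging stands in for a learned, graph-aware attention;
    it only illustrates that weights depend on spatial context.
    """
    smoothed = []
    for i, nbrs in enumerate(neighbors):
        group = [scores[i]] + [scores[j] for j in nbrs]
        smoothed.append(sum(group) / len(group))
    top = max(smoothed)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in smoothed]
    total = sum(exps)
    return [e / total for e in exps]

def pool(features, weights):
    """Attention-weighted sum of patch features, giving one slide-level vector."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

# Toy slide: three adjacent patches and one distant patch (hypothetical data).
coords = [(0, 0), (1, 0), (0, 1), (5, 5)]
scores = [2.0, 2.0, 2.0, 3.0]
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

graph = knn_graph(coords, k=2)
weights = neighborhood_attention(scores, graph)
slide_vector = pool(features, weights)
```

In a plain bag-of-patches MIL model the weights would depend on each patch alone; here the score of the distant patch is tempered by its neighbours, which is the kind of spatial dependence a graph formulation makes available.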
Data availability
This study used publicly available gastric cancer histopathology slide datasets: TCGA-STAD (421 WSIs, 375 patients, sourced from the GDC portal) and GasHisSDB (522 WSIs, 522 patients, obtained by KFBIO KF-PRO-120 scanning). All training was conducted using slide-level labels. Segmentation performance was evaluated on the test set against gold-standard masks annotated pixel by pixel by two pathologists and confirmed by a third expert.
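As an illustration of how a predicted mask can be scored against a pixel-level gold-standard mask, the sketch below computes the Dice coefficient and intersection over union (IoU). The text above does not state which segmentation metrics were used, so the choice of Dice and IoU, the function name dice_and_iou, and the toy masks are assumptions.

```python
def dice_and_iou(pred, gold):
    """Return (Dice, IoU) for two binary masks given as equal-sized nested lists."""
    tp = fp = fn = 0
    for pred_row, gold_row in zip(pred, gold):
        for p, g in zip(pred_row, gold_row):
            if p and g:
                tp += 1      # tumour pixel correctly predicted
            elif p:
                fp += 1      # predicted tumour, gold says background
            elif g:
                fn += 1      # missed tumour pixel
    if tp + fp + fn == 0:
        return 1.0, 1.0      # both masks empty: treat as perfect agreement
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou

# Toy 2x3 masks (hypothetical data): 1 = tumour, 0 = background.
pred = [[1, 1, 0],
        [0, 1, 0]]
gold = [[1, 0, 0],
        [0, 1, 1]]

dice, iou = dice_and_iou(pred, gold)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```

Dice weights true positives twice, so it is always at least as large as IoU for the same pair of masks.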
Code availability
The code of this project will be made available to readers upon reasonable request, subject to the approval of the corresponding author.
Acknowledgements
This study was supported by the Joint Funds for the Innovation of Science and Technology, Fujian Province (Grant number: 2023Y9299, to Chenshen Huang).
Author information
Contributions
Conceptualization, C.H., H.X., and X.X.; Methodology, C.H., X.X., and Y.J.; Literature Research, C.H., H.C., Y.L., Z.N., and N.W.; Data Acquisition, C.H., X.X., H.C., and T.W.; Data Analysis & Interpretation, X.X., Y.J., and T.W.; Visualization, C.H., H.X., and X.X.; Writing—Original Draft, H.X. and X.X.; Writing—Review & Editing, C.H., X.X., T.W., N.W., and Q.H.; Funding Acquisition, C.H. All authors read and approved the submitted version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Huang, C., Xia, H., Xiao, X. et al. Geometric multi-instance learning for weakly supervised gastric cancer segmentation. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-025-02287-6
DOI: https://doi.org/10.1038/s41746-025-02287-6


