Abstract
Accurate segmentation of polyp tissues in colonoscopic images is crucial for early colorectal cancer detection. Existing CNN-based approaches effectively capture local dependencies but struggle with long-range relations, while transformer-based methods excel in global context modeling yet often overlook fine contextual details. Hybrid CNN–transformer models attempt to combine both, but typically overfit to convolutional features, weakening attention mechanisms. To address these limitations, we propose a Hierarchical Contextual Information Aggregation Network (HCIA) for polyp segmentation. HCIA introduces an Interconnected Attention Module (IAM) that applies global attention to single-level features, enabling comprehensive cross-hierarchy information exchange. In parallel, a Hierarchical Aggregation Module (HAM) fuses adjacent feature levels to enhance local contextual representation. This dual refinement allows HCIA to jointly capture global and local dependencies, yielding more precise tissue boundaries. Extensive experiments across multiple polyp segmentation benchmarks demonstrate that HCIA achieves superior generalization and state-of-the-art accuracy, highlighting its potential for clinical applications.
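The two ideas named in the abstract, global attention within one feature level and fusion of adjacent feature levels, can be illustrated with a minimal pure-Python sketch. It is not the authors' IAM or HAM. The function names `global_attention`, `upsample2x` and `fuse_adjacent` are hypothetical, the attention uses the raw features as queries, keys and values with no learned projections, and the fusion is assumed to be nearest-neighbour upsampling followed by element-wise addition.

```python
import math


def global_attention(feat):
    """Scaled dot-product self-attention over all positions of one level.

    feat: list of N position vectors, each a list of C floats.
    Assumption: queries, keys and values are the raw features.
    """
    c = len(feat[0])
    scale = 1.0 / math.sqrt(c)
    out = []
    for q in feat:
        # Similarity of this position to every position in the map.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in feat]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted average of all position vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, feat)) / total
            for j in range(c)
        ])
    return out


def upsample2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D grid given as a list of rows."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_adjacent(fine, coarse):
    """Fuse a fine level with the next coarser level (assumed: upsample + add)."""
    up = upsample2x(coarse)
    return [[f + u for f, u in zip(f_row, u_row)]
            for f_row, u_row in zip(fine, up)]


if __name__ == "__main__":
    fine = [[1.0, 2.0], [3.0, 4.0]]   # 2x2 single-channel map
    coarse = [[10.0]]                 # 1x1 map from the next coarser level
    print(fuse_adjacent(fine, coarse))
    print(global_attention([[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch every output position of `global_attention` is a weighted average over the whole map, which is what gives attention its long-range reach, while `fuse_adjacent` only mixes each position with its spatially aligned neighbour one level up, which keeps the operation local.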
Data availability
The Kvasir-SEG dataset analysed during the current study is available in the Kvasir-SEG repository, https://datasets.simula.no/kvasir-seg/, and the CVC-ClinicDB dataset analysed during the current study is available in the CVC-ClinicDB repository, https://polyp.grand-challenge.org/CVCClinicDB/.
Funding
This work received no funding.
Author information
Contributions
Li Lin and He Xue conceived the experiments; Yang Haicheng and Zhang Jialin conducted the experiments; Zheng Cuijuan and Chen Xiaoyu analysed the results. All authors wrote and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, L., Yang, H., Zhang, J. et al. Hierarchical contextual information aggregation for polyp segmentation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35703-2
DOI: https://doi.org/10.1038/s41598-026-35703-2