CGDFNet: a dual-branch real-time semantic segmentation network with context-guided detail fusion

  • Shan Zhao1,
  • Wenjing Fu2,
  • Jiajia Gao1,
  • Fukai Zhang1 &
  • Zhanqiang Huo1

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Engineering
  • Mathematics and computing

Abstract

Dual-branch networks play a crucial role in real-time semantic segmentation. During feature extraction, sequential downsampling frequently results in the loss of fine details, while existing methods often underutilize contextual information. Traditional spatial-domain fusion approaches cannot fully integrate local and global information, limiting the network’s expressive capability. To address these challenges, a Context-Guided Detail Fusion Network (CGDFNet) is developed on top of existing dual-branch frameworks to enhance feature representation while preserving image details. Specifically, a Semantic Refinement Module (SRM) is introduced in the context branch, where global semantic information is captured through adaptive pooling and local and global features are processed in parallel. In the detail branch, high-frequency detail features are guided and reinforced by a Context-Guided Detail Module (CGDM), which leverages semantic information and applies detail-enhanced convolution. Additionally, a Fourier-Domain Adaptive Fusion Module (FDAFM) is developed to fuse contextual and detail features efficiently: it extracts global frequency information through a Fourier transform and dynamically combines features from the two branches via an adaptive gating mechanism. CGDFNet achieves 77.8% mIoU at 87.6 FPS on the Cityscapes test set and 77.9% mIoU at 128.7 FPS on the CamVid test set. Experimental evaluations indicate that CGDFNet balances segmentation quality with real-time inference speed.
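The abstract gives no implementation details, but the FDAFM description (a Fourier transform to capture global frequency information, followed by an adaptive gate that mixes the context and detail branches) maps naturally onto a short PyTorch sketch. The module name `FourierAdaptiveFusion`, the channel widths, and the exact gating layout below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class FourierAdaptiveFusion(nn.Module):
    """Minimal sketch of a Fourier-domain adaptive fusion block.

    Hypothetical reconstruction from the abstract only: normalization,
    channel widths, and the precise gating design of the paper's FDAFM
    are not specified there and are assumed here.
    """

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolution applied to the real/imaginary parts in the
        # frequency domain to mix global spectral information.
        self.freq_mix = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)
        # Gating branch: predicts per-pixel, per-channel fusion weights
        # from the concatenated context and detail features.
        self.gate = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.out_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, context: torch.Tensor, detail: torch.Tensor) -> torch.Tensor:
        # Global frequency information from the context branch via a 2-D FFT.
        freq = torch.fft.rfft2(context, norm="ortho")
        freq = torch.cat([freq.real, freq.imag], dim=1)
        freq = self.freq_mix(freq)
        real, imag = freq.chunk(2, dim=1)
        global_ctx = torch.fft.irfft2(
            torch.complex(real, imag), s=context.shape[-2:], norm="ortho"
        )

        # Adaptive gate decides, per location, how much of each branch to keep.
        g = self.gate(torch.cat([global_ctx, detail], dim=1))
        fused = g * global_ctx + (1.0 - g) * detail
        return self.out_conv(fused)


if __name__ == "__main__":
    ctx = torch.randn(1, 64, 64, 128)  # context-branch features
    det = torch.randn(1, 64, 64, 128)  # detail-branch features, same size
    print(FourierAdaptiveFusion(64)(ctx, det).shape)  # torch.Size([1, 64, 64, 128])
```

Because the FFT has a global receptive field, the gated mixture lets low-frequency context modulate where high-frequency detail is preserved, which is the behavior the abstract attributes to the FDAFM.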


Data availability

All datasets used in this study are publicly available. The Cityscapes dataset can be accessed at https://www.cityscapes-dataset.com/. The CamVid dataset is available at https://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/. No datasets were generated during the current study.


Funding

This research was funded by the National Natural Science Foundation of China (No. 62472145) and the Henan Provincial Science and Technology Research Project (No. 252102211015).

Author information

Author notes
  1. Shan Zhao and Wenjing Fu contributed equally to this work.

Authors and Affiliations

  1. School of Software, Henan Polytechnic University, 2001 Century Avenue, Jiaozuo, 454000, China

    Shan Zhao, Jiajia Gao, Fukai Zhang & Zhanqiang Huo

  2. School of Computer Science and Technology, Henan Polytechnic University, 2001 Century Avenue, Jiaozuo, 454000, China

    Wenjing Fu


Contributions

All authors participated in the conception and design of the study. Shan Zhao, Wenjing Fu, and Jiajia Gao were responsible for material preparation, data collection, and analysis. Fukai Zhang and Zhanqiang Huo handled software development and project management. The initial manuscript draft was prepared by Wenjing Fu, with all authors providing feedback on earlier versions. Every author reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Wenjing Fu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Zhao, S., Fu, W., Gao, J. et al. CGDFNet: a dual-branch real-time semantic segmentation network with context-guided detail fusion. Sci Rep (2026). https://doi.org/10.1038/s41598-026-39370-1


  • Received: 08 November 2025

  • Accepted: 04 February 2026

  • Published: 16 February 2026

  • DOI: https://doi.org/10.1038/s41598-026-39370-1


Keywords

  • Dual-branch network
  • Real-time semantic segmentation
  • Context-guided detail
  • Feature fusion