A diffusion model-based image generation framework for underwater object detection

  • Article
  • Open access
  • Published: 29 December 2025

  • Yaoming Zhuang (ORCID: orcid.org/0000-0001-8815-0801)1,
  • Longyu Ma1,2,
  • Jiaming Liu1,
  • Yonghao Xian1,
  • Baoquan Chen1,
  • Li Li3,
  • Chengdong Wu1,
  • Wei Cui4 &
  • Zhanlin Liu5

Communications Engineering, Article number: (2025)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computer science
  • Ocean sciences

Abstract

Underwater object detection plays a crucial role in applications such as marine ecological monitoring and underwater rescue operations. However, challenges such as limited underwater data availability and low scene diversity hinder detection accuracy. In this paper, we propose the Underwater Layout-Guided Diffusion Framework (ULGF), a diffusion model-based framework designed to augment underwater detection datasets. Unlike conventional methods that generate underwater images by integrating in-air information, ULGF operates exclusively on a small set of underwater images and their corresponding labels, requiring no external data. We have publicly released the ULGF source code and the generated dataset for further research. Our approach enables the generation of high-fidelity, diverse, and theoretically infinite underwater images, substantially enhancing object detection performance in real-world underwater scenarios. Furthermore, we evaluate the quality of the generated underwater images, demonstrating that ULGF produces images with a smaller domain gap to real underwater imagery than conventional synthesis methods.
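To make the layout-guided idea concrete, the sketch below is a generic illustration, not the authors' ULGF implementation: it renders bounding-box detection labels into a color-coded layout map and conditions an off-the-shelf diffusion pipeline on it (Hugging Face diffusers with a segmentation ControlNet as a stand-in for a purpose-trained underwater layout encoder). The model IDs, class palette, box coordinates, and prompt are all illustrative assumptions.

```python
# Generic layout-to-image sketch (NOT the authors' ULGF code): render
# bounding-box labels as a color-coded layout map, then condition a
# Stable Diffusion + ControlNet pipeline on it. Model IDs, palette,
# and prompt are illustrative assumptions.
import torch
from PIL import Image, ImageDraw
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Hypothetical class palette: one color per detection category.
PALETTE = {"holothurian": (200, 60, 60), "echinus": (60, 200, 60),
           "scallop": (60, 60, 200), "starfish": (200, 200, 60)}

def boxes_to_layout(boxes, size=(512, 512)):
    """Render (class_name, x0, y0, x1, y1) boxes as a filled layout map."""
    layout = Image.new("RGB", size, (0, 0, 0))
    draw = ImageDraw.Draw(layout)
    for cls, x0, y0, x1, y1 in boxes:
        draw.rectangle([x0, y0, x1, y1], fill=PALETTE[cls])
    return layout

# Off-the-shelf segmentation ControlNet, standing in for a layout encoder
# trained on underwater data.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

layout = boxes_to_layout([("echinus", 100, 150, 220, 260),
                          ("scallop", 300, 320, 400, 410)])
image = pipe("an underwater seabed photo, turbid water, marine life",
             image=layout, num_inference_steps=30).images[0]
image.save("synthetic_underwater.png")
```

The appeal of this family of methods is that the boxes which drove the layout map double as ground-truth labels, so every generated image arrives pre-annotated for detector training.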


Data availability

The datasets and models generated or used in this study are available as follows. The RUOD dataset can be accessed at https://github.com/dlut-dimt/RUOD. The UDD dataset is available at https://github.com/chongweiliu/udd_official. The underwater image generation dataset, the pre-trained detection model, and the weights related to ULGF are available from the corresponding author on reasonable request.
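As a quick-start aid, the snippet below shows one way to inspect such a detection dataset after download, assuming COCO-style JSON annotations; the file path and name are hypothetical, and the repository README documents the actual layout.

```python
# Inspect a downloaded detection dataset, assuming COCO-style JSON
# annotations. Path and file name are hypothetical; consult the
# dataset repository README for the real layout.
import json
from collections import Counter

with open("RUOD/annotations/instances_train.json") as f:  # hypothetical path
    coco = json.load(f)

id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])

print(f"{len(coco['images'])} images, {len(coco['annotations'])} boxes")
for name, n in counts.most_common():
    print(f"  {name}: {n}")
```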

Code availability

The source code for ULGF is available at https://github.com/maxiaoha666/ULGF.


Acknowledgements

This research was supported in part by the National Natural Science Foundation of China (62403108, 42301256), the Liaoning Provincial Natural Science Foundation Joint Fund (2023-MSBA-075), the Aeronautical Science Foundation of China (20240001042002), the Scientific Research Foundation of Liaoning Provincial Education Department (LJKQR20222509), the Fundamental Research Funds for the Central Universities (N2426005), and the Science and Technology Planning Project of Liaoning Province (2023JH1/11200011 and 2024JH2/10240049).

Author information

Authors and Affiliations

  1. Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China

    Yaoming Zhuang, Longyu Ma, Jiaming Liu, Yonghao Xian, Baoquan Chen & Chengdong Wu

  2. College of Information Science and Engineering, Northeastern University, Shenyang, China

    Longyu Ma

  3. JangHo School of Architecture, Northeastern University, Shenyang, China

    Li Li

  4. Institute for Infocomm Research, A*STAR, Singapore

    Wei Cui

  5. ProRata.ai, Bellevue, WA, USA

    Zhanlin Liu


Contributions

Y.Z.: Data management, annotation of training datasets, data analysis, and interpretation. L.M.: Data management and preparation, algorithm development and validation, comparative experiments, data analysis and interpretation, and manuscript drafting. J.L.: Comparative experiments and manuscript drafting. Y.X.: Algorithm design and implementation, experimental validation, and data visualization. B.C.: Method design and supervision of the research process. L.L.: Project management, funding acquisition, resource provision, and manuscript review. C.W.: Validation of experimental results and critical feedback on the manuscript. W.C.: Data curation and model performance evaluation. Z.L.: Investigation and assistance with manuscript revision.

Corresponding author

Correspondence to Yaoming Zhuang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Engineering thanks Fenglei Han, Ajisha Mathias and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Philip Coatsworth.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Zhuang, Y., Ma, L., Liu, J. et al. A diffusion model-based image generation framework for underwater object detection. Commun Eng (2025). https://doi.org/10.1038/s44172-025-00579-z


  • Received: 22 March 2025

  • Accepted: 18 December 2025

  • Published: 29 December 2025

  • DOI: https://doi.org/10.1038/s44172-025-00579-z


Associated content

Collection

Uncrewed Underwater Vehicles
