Active guidance in ultrasound bladder scanning using reinforcement learning
  • Article
  • Open access
  • Published: 15 January 2026

  • Hao-Lun Hsu1,
  • Mohsen Zahiri2,
  • Gary Y. Li2,
  • Rashid Al Mukaddim2,
  • HyeonWoo Lee2,
  • Martha Grewe Wilson2,
  • Joyce Grube2,
  • Stephen Schmidt2,
  • Goutam Ghoshal2 &
  • Balasundar Raju2

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Biomedical engineering
  • Ultrasonography

Abstract

Accurate measurement of bladder volume is essential for diagnosing urinary retention and voiding dysfunction. However, finding the optimal view can be challenging for less experienced operators, potentially leading to suboptimal imaging and misdiagnosis. This study proposes an intelligent guidance system that leverages reinforcement learning (RL) to improve image acquisition during ultrasound bladder scanning. We introduce a novel pipeline that incorporates a practical variant of Deep Q-Networks (DQN), known as Adam LMCDQN, which is theoretically validated in linear Markov Decision Processes. Our system aims to offer real-time, adaptive feedback to operators, improving image quality and consistency. We also present a novel reward design that incorporates domain knowledge to enhance performance. Our results demonstrate a promising 81% success rate in reaching target points along the transverse direction and 67% along the longitudinal direction, significantly outperforming supervised deep learning models, which achieved 58% and 32%, respectively. This work is among the first to apply RL to ultrasound guidance for bladder assessment, demonstrating the technical feasibility of optimal-view localization in a simulated environment and investigating exploration strategies and reward formulations relevant to the guidance task.
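For context on the algorithm named in the abstract: Adam LMCDQN explores by performing noisy Langevin Monte Carlo gradient updates on the Q-network's parameters, so exploration arises from randomizing the value function rather than from epsilon-greedy action dithering. The PyTorch sketch below shows a plain Langevin Q-update to illustrate that idea only; the network shape, the hyperparameters lr, beta, and gamma, and the omission of Adam-style preconditioning (which the full algorithm applies to both the gradient and the noise) are illustrative assumptions, not details taken from this paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of a Langevin Monte Carlo Q-update, the exploration idea
# behind Adam LMCDQN. State dimension (8), action count (4), and all
# hyperparameters are illustrative assumptions.
q_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
target_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
target_net.load_state_dict(q_net.state_dict())

lr, beta, gamma = 1e-3, 1e4, 0.99  # step size, inverse temperature, discount

def lmc_q_update(batch):
    s, a, r, s_next, done = batch
    with torch.no_grad():
        # Standard one-step TD target from the (frozen) target network.
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, target)

    q_net.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in q_net.parameters():
            # Gradient step plus injected Gaussian noise: this parameter
            # noise, not epsilon-greedy dithering, drives exploration.
            noise = torch.randn_like(p) * (2 * lr / beta) ** 0.5
            p.add_(-lr * p.grad + noise)

# Example call with a dummy transition batch of 32 transitions.
s = torch.randn(32, 8)
a = torch.randint(0, 4, (32,))
r = torch.randn(32)
s_next = torch.randn(32, 8)
done = torch.zeros(32)
lmc_q_update((s, a, r, s_next, done))
```

Acting greedily with respect to this perturbed Q-network yields randomized, Thompson-sampling-like behavior, which is why Langevin-style updates can explore more systematically than epsilon-greedy in sparse-reward guidance tasks.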


Data availability

The data supporting the findings of this study are available from the corresponding author, Mohsen Zahiri, upon reasonable request.

Funding

The study described in this article was funded in part with federal funds from the U.S. Department of Health and Human Services (HHS); Administration for Strategic Preparedness and Response (ASPR); Biomedical Advanced Research and Development Authority (BARDA), under contract number 75A50120C00097. The contract and federal funding are not an endorsement of the study results, product, or company.

Author information

Author notes
  1. These authors contributed equally to this work: Hao-Lun Hsu and Mohsen Zahiri.

Authors and Affiliations

  1. Department of Computer Science, Duke University, Durham, NC, USA

    Hao-Lun Hsu

  2. Philips North America, Cambridge, MA, USA

    Mohsen Zahiri, Gary Y. Li, Rashid Al Mukaddim, HyeonWoo Lee, Martha Grewe Wilson, Joyce Grube, Stephen Schmidt, Goutam Ghoshal & Balasundar Raju

Contributions

H.H., M.Z., G.G., R.A.M., H.L., G.Y.L., and B.R. conceived the project idea. M.Z., G.G., and J.G. designed the data collection protocol. J.G. and M.G.W. recruited the subjects for the study and managed the IRB process. M.Z., G.G., J.G., and G.Y.L. performed the ultrasound data acquisition. M.Z., H.H., and S.S. prepared the data for reinforcement learning (RL) and supervised learning training. H.H. and M.Z. designed, developed, and trained the RL models, and supervised algorithm development and data analysis. H.H. wrote the initial draft of the manuscript, and M.Z., G.Y.L., and R.A.M. critically revised it. All authors reviewed and approved the final manuscript before submission.

Corresponding author

Correspondence to Mohsen Zahiri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Hsu, HL., Zahiri, M., Li, G. et al. Active guidance in ultrasound bladder scanning using reinforcement learning. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35285-z

  • Received: 05 June 2025

  • Accepted: 05 January 2026

  • Published: 15 January 2026

  • DOI: https://doi.org/10.1038/s41598-026-35285-z
