Lightweight SwiM-UNet with multi-dimensional adaptor for efficient on-device medical image segmentation

  • Article
  • Open access
  • Published: 20 January 2026

  • Yeonwoo Noh1,
  • Seongwook Lee2,
  • Seyong Jin2,
  • Yunyoung Chang3,
  • Dong-Ok Won4,7,8,
  • Minwoo Lee6,8 &
  • Wonjong Noh5,8 

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note that errors affecting the content may be present, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Engineering
  • Mathematics and computing

Abstract

For medical image segmentation, transformer-based models have demonstrated superior performance. However, their high computational complexity remains a significant challenge. In contrast, Mamba provides a more computationally efficient alternative, though its segmentation performance is generally inferior to that of transformers. This study proposes a novel lightweight hybrid model based on U-Net, named SwiM-UNet, which represents the first Mamba–transformer hybrid model specifically designed for processing three-dimensional data. Specifically, efficient TSMamba (eTSMamba) blocks are incorporated in the early stages of the U-Net architecture to effectively manage computational overhead, while efficient Swin transformer (eSwin) blocks are employed in the later stages to capture long-range dependencies and local contextual information. Additionally, the model strategically integrates both the Mamba and Swin transformer architectures through a Mamba–Swin adapter (MS-adapter). The proposed MS-adapter comprises three sub-adapters that emphasize local information along the \(x\)-, \(y\)-, and \(z\)-axes, as well as channel-wise features between the eTSMamba and eSwin modules, and includes gating mechanisms to balance the contributions of the sub-adapters. Moreover, a low-rank MLP is utilized in the encoder, and channel reduction is applied in the decoder to further enhance computational efficiency. Performance evaluations conducted on the publicly available BraTS2023 and BraTS2024 datasets demonstrate that the proposed model surpasses state-of-the-art benchmark models while maintaining low computational complexity.
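To make the adapter design concrete, the following is a minimal PyTorch sketch of one plausible realization of the MS-adapter: three depthwise 1-D convolutions emphasizing local context along the x-, y-, and z-axes, a bottleneck of 1×1×1 convolutions for the channel-wise sub-adapter, and sigmoid gates balancing the four branches. The class name MSAdapter, the kernel sizes, and the reduction ratio are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class MSAdapter(nn.Module):
    """Illustrative Mamba–Swin adapter: three axis-wise sub-adapters and a
    channel-wise sub-adapter, blended by learned gates (hypothetical design)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Depthwise 1-D convolutions capture local context along one axis each;
        # the input layout is assumed to be (B, C, D, H, W), i.e. (z, y, x).
        self.adapt_x = nn.Conv3d(channels, channels, (1, 1, 3), padding=(0, 0, 1), groups=channels)
        self.adapt_y = nn.Conv3d(channels, channels, (1, 3, 1), padding=(0, 1, 0), groups=channels)
        self.adapt_z = nn.Conv3d(channels, channels, (3, 1, 1), padding=(1, 0, 0), groups=channels)
        # Channel sub-adapter: low-rank bottleneck built from 1x1x1 convolutions.
        self.adapt_c = nn.Sequential(
            nn.Conv3d(channels, hidden, 1),
            nn.GELU(),
            nn.Conv3d(hidden, channels, 1),
        )
        # One learnable gate per sub-adapter, squashed to (0, 1) in forward().
        self.gate = nn.Parameter(torch.zeros(4))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: features handed from an eTSMamba block to an eSwin block.
        g = torch.sigmoid(self.gate)
        out = (g[0] * self.adapt_x(x) + g[1] * self.adapt_y(x)
               + g[2] * self.adapt_z(x) + g[3] * self.adapt_c(x))
        return x + out  # residual path preserves the incoming features
```

For instance, `MSAdapter(channels=96)` could sit between stages operating on `(B, 96, D, H, W)` feature volumes; the residual connection lets the gates suppress any sub-adapter without blocking the main signal path.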

Data availability

The BraTS 2023 and 2024 datasets used in this study are publicly available. The BraTS 2023 data were accessed through the official challenge page on the Synapse platform under Synapse ID syn51156910. The BraTS 2024 data were accessed via the Synapse portal under Synapse ID syn53708249. Additionally, the Beyond the Cranial Vault (BTCV) multi-organ abdominal CT segmentation data were accessed via the official challenge page on the Synapse platform under Synapse ID syn3193805.
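For readers who want to locate the datasets programmatically, the official Synapse Python client (synapseclient) can resolve these IDs directly. The sketch below only fetches entity metadata, since downloading the underlying files requires an approved data-use agreement and the specific file or folder IDs within each challenge project; the access token is a placeholder.

```python
# pip install synapseclient
import synapseclient

# Log in with a Synapse personal access token (placeholder value).
syn = synapseclient.Synapse()
syn.login(authToken="YOUR_SYNAPSE_TOKEN")

# Synapse IDs from the data availability statement above.
datasets = {
    "BraTS2023": "syn51156910",
    "BraTS2024": "syn53708249",
    "BTCV": "syn3193805",
}

for name, syn_id in datasets.items():
    # downloadFile=False retrieves metadata only; file downloads require the
    # per-challenge data-use agreement to be approved first.
    entity = syn.get(syn_id, downloadFile=False)
    print(f"{name}: {entity.name} ({syn_id})")
```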


Funding

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00223501, No. 2022R1A5A8019303), and partly supported by Hallym University MHC (Mighty Hallym 4.0 Campus) project, 2025 (MHC-MHC-202502-002).

Author information

Authors and Affiliations

  1. College of Medicine, Gachon University, Incheon, 21936, South Korea

    Yeonwoo Noh

  2. Department of Artificial Intelligence, Sejong University, Seoul, 05006, South Korea

    Seongwook Lee & Seyong Jin

  3. School of Computing, Gachon University, Seongnam, 13120, South Korea

    Yunyoung Chang

  4. Department of AI Convergence, College of Information Science, Hallym University, Chuncheon, 24252, South Korea

    Dong-Ok Won

  5. School of Software, College of Information Science, Hallym University, Chuncheon, 24252, South Korea

    Wonjong Noh

  6. Neurology, Hallym University Sacred Heart Hospital, Anyang, 14068, South Korea

    Minwoo Lee

  7. Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School, Worcester, MA, 01655, USA

    Dong-Ok Won

  8. Department of Neurology, College of Medicine, Hallym University, Chuncheon, 24252, South Korea

    Dong-Ok Won, Minwoo Lee & Wonjong Noh


Contributions

Y. N. (Yeonwoo Noh) conceived and designed the study, developed the methodology, implemented the software, and performed the experiments. S. L. (Seongwook Lee) and S. J. (Seyong Jin) supported the data curation and validation process. Y. C. (Yunyoung Chang) contributed to visualization of the results. D. W. (Dong-Ok Won) contributed to analyzing real-world on-device and edge deployment scenarios. Y. N. and W. N. (Wonjong Noh) wrote the original draft of the manuscript. W. N., M. L. (Minwoo Lee), and D. W. contributed to supervision, critical review of the manuscript, and funding acquisition. W. N. was responsible for project administration. All authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Wonjong Noh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article


Cite this article

Noh, Y., Lee, S., Jin, S. et al. Lightweight SwiM-UNet with multi-dimensional adaptor for efficient on-device medical image segmentation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35771-4


  • Received: 02 October 2025

  • Accepted: 08 January 2026

  • Published: 20 January 2026

  • DOI: https://doi.org/10.1038/s41598-026-35771-4


Keywords

  • Adaptor
  • Brain tumor segmentation
  • Hybrid model
  • Lightweight model
  • Mamba
  • Swin transformer