CA-3DTransUNet with dynamic cross-scale fusion for pulmonary nodule segmentation

Zhang, Kaikai; Lan, Xiaowen; Wang, Yanhui; Wang, Lixin; Liu, Yuhan; Guo, Feng

doi:10.1038/s41598-026-47436-3

Article
Open access
Published: 03 April 2026

Kaikai Zhang¹, Xiaowen Lan¹, Yanhui Wang², Lixin Wang¹, Yuhan Liu¹ & Feng Guo¹

Scientific Reports, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Precise segmentation of pulmonary nodules in low-dose computed tomography is challenged by nodule heterogeneity, low contrast, and spatial overlap with adjacent anatomical structures. To address these issues, we propose CA-3DTransUNet, a segmentation framework based on the 3D-nnUNet architecture. The network incorporates a Transformer 3D module in the bottleneck to model global volumetric dependencies and a CrossEMA3D module in the decoder to dynamically refine spatial features. In addition, a wavelet transform is applied during data preprocessing to enhance edge details in the input. The model was evaluated on the LIDC-IDRI and LUNA16 datasets and on a private BT dataset. On the LIDC-IDRI dataset, it achieved a Dice Similarity Coefficient of 91.85 ± 0.43% [95% CI: 91.32–92.38], a Precision of 90.53 ± 0.51%, and a Sensitivity of 93.12 ± 0.42%. These results surpassed those of the hybrid architecture nnFormer, which attained a Dice score of 89.48 ± 0.52% (p = 0.014). These findings suggest that CA-3DTransUNet holds potential for the computer-aided analysis of pulmonary nodules.
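The abstract reports a Dice Similarity Coefficient, Precision, and Sensitivity without defining them in this excerpt. The sketch below uses the standard voxel-wise definitions for binary masks (Dice = 2TP / (2TP + FP + FN), Precision = TP / (TP + FP), Sensitivity = TP / (TP + FN)). It is an illustration only and not the authors' evaluation code: the names `segmentation_scores` and `SegmentationScores` are hypothetical, and returning 0.0 when a denominator is empty is an assumed convention.

```python
from typing import Iterable, NamedTuple


class SegmentationScores(NamedTuple):
    """Hypothetical container for the three overlap metrics."""
    dice: float
    precision: float
    sensitivity: float


def segmentation_scores(pred: Iterable[int], truth: Iterable[int]) -> SegmentationScores:
    """Voxel-wise scores for two flattened binary masks of equal length."""
    pred, truth = list(pred), list(truth)
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    def ratio(num: int, den: int) -> float:
        # Assumed convention: an empty denominator scores 0.0.
        return num / den if den else 0.0

    return SegmentationScores(
        dice=ratio(2 * tp, 2 * tp + fp + fn),
        precision=ratio(tp, tp + fp),
        sensitivity=ratio(tp, tp + fn),
    )


# Toy example with six voxels: TP = 2, FP = 2, FN = 1.
scores = segmentation_scores([1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0])
print(round(scores.dice, 4))   # 0.5714
print(scores.precision)        # 0.5
```

A 3D volume can be scored by flattening the predicted and reference masks to one-dimensional sequences first. Per-case scores computed this way are typically averaged across the test set to obtain figures of the kind quoted in the abstract.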

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Funding

This research was supported by the Natural Science Foundation of Inner Mongolia Autonomous Region under the project "Research on intelligent recognition, segmentation and 3D reconstruction algorithm of lung nodules in CT images" (2025LHMS06016). No additional external funding was received for this study.

Author information

Authors and Affiliations

School of Digital and Intelligence Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, Inner Mongolia, China
Kaikai Zhang, Xiaowen Lan, Lixin Wang, Yuhan Liu & Feng Guo
Department of Gastroenterology, The First Affiliated Hospital of Baotou Medical College, Baotou, 014010, Inner Mongolia, China
Yanhui Wang

Contributions

Conceptualization, K.Z. and X.L.; methodology, K.Z. and L.W.; validation, K.Z., X.L., and F.G.; formal analysis, K.Z., X.L., and Y.L.; investigation, Y.W. and F.G.; data curation, K.Z. and L.W.; writing (original draft preparation), K.Z.; writing (review and editing), X.L., L.W., F.G., and Y.W.; supervision, X.L. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Xiaowen Lan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhang, K., Lan, X., Wang, Y. et al. CA-3DTransUNet with dynamic cross-scale fusion for pulmonary nodule segmentation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47436-3

Received: 21 October 2025
Accepted: 31 March 2026
Published: 03 April 2026
DOI: https://doi.org/10.1038/s41598-026-47436-3
