Abstract
Precise segmentation of pulmonary nodules in low-dose computed tomography is challenged by nodule heterogeneity, low contrast, and spatial overlap with adjacent anatomical structures. To address these issues, we propose CA-3DTransUNet, a segmentation framework based on the 3D-nnUNet architecture. The network incorporates a Transformer 3D module in the bottleneck to model global volumetric dependencies and a CrossEMA3D module in the decoder to dynamically refine spatial features. In addition, a wavelet transform is applied during data preprocessing to enhance edge details in the input. The model was evaluated on the LIDC-IDRI and LUNA16 datasets and on a private BT dataset. On LIDC-IDRI, it achieved a Dice Similarity Coefficient of 91.85 ± 0.43% (95% CI: 91.32–92.38), a Precision of 90.53 ± 0.51%, and a Sensitivity of 93.12 ± 0.42%. This Dice score exceeds that of the hybrid architecture nnFormer, which attained 89.48 ± 0.52% (p = 0.014). These findings suggest that CA-3DTransUNet holds potential for the computer-aided analysis of pulmonary nodules.
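For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how Dice, Precision, and Sensitivity are commonly computed from a pair of binary 3D masks. It is not the authors' evaluation code: the function name `overlap_metrics`, the toy volumes, and the convention of returning 1.0 when a denominator is zero are all assumptions made for illustration.

```python
import numpy as np


def overlap_metrics(pred, target):
    """Return (dice, precision, sensitivity) for two binary masks of equal shape.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.count_nonzero(pred & target)   # voxels predicted and truly nodule
    fp = np.count_nonzero(pred & ~target)  # voxels predicted but not nodule
    fn = np.count_nonzero(~pred & target)  # nodule voxels that were missed

    # Assumed convention: an empty denominator counts as a perfect score.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, sensitivity


# Toy 4x4x4 volume: the reference nodule is a 2x2x2 cube (8 voxels).
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True

# The prediction covers the cube plus one extra slab (12 voxels, 8 overlapping).
pred = np.zeros_like(target)
pred[1:3, 1:3, 1:4] = True

dice, precision, sensitivity = overlap_metrics(pred, target)
print(f"Dice={dice:.3f} Precision={precision:.3f} Sensitivity={sensitivity:.3f}")
```

In this toy case the prediction over-segments, so Sensitivity is perfect while Precision drops. Dice combines the two, which is why it is usually reported as the headline overlap score.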
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This research was supported by the Natural Science Foundation of Inner Mongolia Autonomous Region under the project "Research on intelligent recognition, segmentation and 3D reconstruction algorithm of lung nodules in CT images" (2025LHMS06016). No additional external funding was received for this study.
Author information
Contributions
Conceptualization, K.Z. and X.L.; methodology, K.Z. and L.W.; validation, K.Z., X.L., and F.G.; formal analysis, K.Z., X.L., and Y.L.; investigation, Y.W. and G.F.; data curation, K.Z. and L.W.; writing (original draft preparation), K.Z.; writing (review and editing), X.L., L.W., F.G., and Y.W.; supervision, X.L. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, K., Lan, X., Wang, Y. et al. CA-3DTransUNet with dynamic cross-scale fusion for pulmonary nodule segmentation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47436-3