Abstract
Colposcopy is essential for the early detection of cervical cancer; however, its accuracy depends heavily on clinician experience and is often limited in low-resource settings. After acetic acid application, most high-grade lesions maintain acetowhitening for 180 seconds, whereas nearly all low-grade or benign areas fade more rapidly. Leveraging this dynamic contrast, we propose TLS-Net, a deep network that processes time-series images captured at 60, 90, 150, and 180 seconds post-application. First, a Swin Transformer encoder extracts rich spatial features to localize lesion candidates. Next, a temporal attention module incorporating a Convolutional Block Attention Module (CBAM) fuses information across time points to distinguish persistent acetowhite regions. Finally, a segmentation head delineates high-grade squamous intraepithelial lesions or worse (HSIL+) within the detected regions. Trained and validated on 1,152 images from 288 patients, TLS-Net achieved a mean Dice score of 85.55% ± 1.33%, mean pixel accuracy of 85.61% ± 2.30%, and mean intersection-over-union of 76.65% ± 1.72% on the validation set, outperforming single-frame approaches. These results demonstrate the promise of AI-assisted colposcopy in clinical practice.
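The temporal-fusion idea described above, weighting frames so that persistently acetowhite regions dominate the fused representation, can be sketched at a high level. The following NumPy snippet is a hypothetical simplification for illustration only, not the authors' implementation: the Swin Transformer encoder and the CBAM spatial branch are omitted, and each frame's attention weight is derived from a simple global-average descriptor.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_fuse(feats):
    """Fuse per-frame feature maps of shape (T, C, H, W) into one (C, H, W) map.

    Hypothetical sketch of a temporal attention step: each frame receives a
    scalar weight computed from its global-average descriptor, so frames with
    stronger (i.e. persistent) acetowhite activation contribute more to the
    fused features passed to the segmentation head.
    """
    desc = feats.mean(axis=(1, 2, 3))       # (T,) one global descriptor per frame
    w = softmax(desc)                       # temporal attention weights, sum to 1
    fused = np.tensordot(w, feats, axes=1)  # weighted sum over time -> (C, H, W)
    return fused, w

# Toy example: 4 time points (60/90/150/180 s) with rising mean activation,
# mimicking a lesion whose acetowhitening persists and strengthens.
feats = np.stack([np.full((8, 16, 16), v) for v in [0.1, 0.2, 0.4, 0.8]])
fused, w = temporal_attention_fuse(feats)
```

In this toy run the later frames receive larger weights than the earlier ones, which is the behavior the temporal attention module is meant to exploit: regions that fade quickly are down-weighted, while persistent acetowhite regions are emphasized.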
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to patient privacy considerations and institutional policy, but are available from the corresponding author on reasonable request.
Acknowledgements
The authors thank all colleagues and staff who provided valuable assistance and technical support during this study.
Funding
This work was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LTGY23H180016).
Author information
Contributions
LY conceived the study, designed the experiments, and drafted the manuscript. LY and ZW contributed equally to the conception and performed the main experiments and data collection. XS supervised the project, guided the methodology, provided critical revisions, and approved the final version of the manuscript. JY participated in the interpretation of clinical data and critical revision of the manuscript. WZ assisted in data preprocessing and statistical analysis. YG contributed to the acquisition and interpretation of clinical data. TZ and QW participated in algorithm development, software implementation, and validation. XM provided clinical insights, research resources, and helped refine the study design. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This study was approved by the Ethics Review Committee of Sir Run Run Shaw Hospital, Zhejiang University School of Medicine (Approval No. 2024-2288-01) and the Ethics Committee of the Qiaosi Branch of the First People’s Hospital of Linping District (Approval No. 2025-013). All procedures were performed in accordance with the Declaration of Helsinki and relevant institutional guidelines and regulations. Written informed consent to participate was obtained from all participants.
Consent for publication
Written informed consent for publication of clinical details and images was obtained from all participants.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yan, L., Wang, Z., Shen, X. et al. Time-lapsed colposcopy image-based segmentation of cervical lesion areas. Sci Rep (2026). https://doi.org/10.1038/s41598-026-43146-y